JVM memory structure
JVM memory consists of a heap, a virtual machine stack, a native method stack, and a method area, as shown below:
JVM memory reclamation
The Sun JVM's generational collection works on this principle: objects are divided into the young generation (Young), the old generation (Tenured), and the permanent generation (Perm), and objects with different life cycles are collected with different algorithms, based on an analysis of object lifecycles.
1. Young
The young generation is divided into three zones: one Eden zone and two Survivor zones. Most objects are created in Eden. When Eden is full, the surviving objects are copied to one of the two Survivor zones. When that Survivor zone is full, its surviving objects are copied to the other Survivor zone. When that zone fills up in turn, the objects that were copied from the first Survivor zone and are still alive are copied to the old generation (Tenured). Note that the two Survivor zones are symmetric and have no fixed order, so the same zone may hold both objects copied from Eden and objects copied from the other Survivor zone, while only objects that came from the first Survivor zone are copied to the old generation. In addition, one of the Survivor zones is always empty.
2. Tenured
The old generation holds objects that have survived collection in the young generation. In general, it stores long-lived objects.
3. Perm (permanent generation)
Used to store static data such as Java classes and methods. The permanent generation has no significant impact on garbage collection, but some applications, such as those using Hibernate, generate or load classes dynamically. In such cases, a larger permanent generation needs to be configured to hold the classes added at run time. The permanent generation size can be set with the -XX:MaxPermSize= option.
For example, when a program creates an object, an ordinary object is allocated in the young generation, while an object that is too large may be allocated directly in the old generation (in one observed program, a 10 MB buffer created for sending and receiving messages was allocated directly in the old generation). When the young generation runs out of space, a memory reclamation is triggered. Most of the memory is reclaimed, and the surviving objects are copied to the Survivor FROM region. After several reclamations, when the FROM region is also full, another reclamation occurs and the remaining objects are copied to the TO region. When the TO region is full as well, memory is reclaimed again and the surviving objects are copied to the old generation.
Only the contents of the heap are allocated dynamically, so the young and old generations described above both belong to the JVM heap, while the permanent generation is the method area mentioned earlier and is not part of the heap.
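To see how a running JVM actually splits its memory, the standard java.lang.management API can list the memory pools and whether each one counts as heap or non-heap. This is only a minimal sketch: the class name MemoryPools is made up for illustration, and the pool names printed depend on the JVM version and the collector in use (newer HotSpot versions report Metaspace rather than a permanent generation).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Returns one line per memory pool: "HEAP : <name>" or "NON_HEAP : <name>".
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType().name() + " : " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Pool names vary between JVMs, so no fixed output is shown here.
        describePools().forEach(System.out::println);
    }
}
```

Pools reported as HEAP correspond to the young and old generations discussed above; the method area shows up under NON_HEAP.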
Some suggestions for JVM memory management
- Manually set useless objects and intermediate objects to null once they are no longer needed, to speed up memory reclamation.
- Use object pooling. If the objects being created are reusable and differ only in their properties, consider an object pool to reduce object creation. When idle objects are available, they are taken from the pool and reused instead of new objects being created, which greatly improves the object reuse rate.
- Tune the JVM. Configuring JVM parameters can speed up garbage collection. If there are no memory leaks and the two approaches above still cannot ensure adequate memory reclamation, consider JVM tuning, but only after long-running tests on a physical machine, because different parameters can have different effects. One example is the -Xnoclassgc parameter.
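The object pooling suggestion can be sketched roughly as follows. SimpleObjectPool is a hypothetical class written for this illustration, not a library API; a production pool would also need size limits and a way to reset object state.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

class SimpleObjectPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;

    SimpleObjectPool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuse an idle instance if one exists; otherwise create a new one.
    synchronized T acquire() {
        T obj = idle.pollFirst();
        return obj != null ? obj : factory.get();
    }

    // Hand an instance back so that a later acquire() can reuse it.
    synchronized void release(T obj) {
        idle.addFirst(obj);
    }

    public static void main(String[] args) {
        SimpleObjectPool<StringBuilder> pool = new SimpleObjectPool<>(StringBuilder::new);
        StringBuilder first = pool.acquire();
        first.append("message 1");
        first.setLength(0);          // reset state before returning it to the pool
        pool.release(first);
        StringBuilder second = pool.acquire();
        System.out.println(first == second); // prints "true": the instance was reused
    }
}
```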
Determination of garbage objects
The Java heap stores almost all object instances. Before the garbage collector reclaims objects in the heap, it must first determine whether they are still in use. The following algorithms are used to decide whether an object is garbage:
Reference counting algorithm
A reference counter is added to each object. The counter is incremented every time a reference to the object is created and decremented every time a reference is released. An object whose counter is zero can no longer be used.
The reference counting algorithm is easy to implement and efficient at making this determination, and it is a good choice in many cases. Java does not use it for garbage collection, however, mainly because it cannot easily handle circular references between objects.
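The circular reference problem can be illustrated with a toy model. The RefCounted class below is hypothetical and keeps its counter by hand; it is not how any real JVM tracks objects, only a sketch of why two objects that point at each other never reach a count of zero.

```java
class RefCounted {
    final String name;
    int refCount = 0;       // number of references currently pointing at this object
    RefCounted field;       // a single reference field, enough to build a cycle

    RefCounted(String name) {
        this.name = name;
    }

    // A new reference to this object is created.
    void retain() {
        refCount++;
    }

    // A reference to this object is broken.
    void release() {
        refCount--;
    }

    // Store a reference to target inside this object.
    void pointTo(RefCounted target) {
        field = target;
        target.retain();
    }

    // Builds a <-> b, drops both outside references, returns the final counters.
    static int[] cycleDemo() {
        RefCounted a = new RefCounted("a");
        RefCounted b = new RefCounted("b");
        a.retain();          // referenced by a local variable
        b.retain();          // referenced by a local variable
        a.pointTo(b);
        b.pointTo(a);
        a.release();         // the local variables go away...
        b.release();
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleDemo();
        // Neither counter is zero, although nothing outside the cycle refers to a or b.
        System.out.println(counts[0] + " " + counts[1]); // prints "1 1"
    }
}
```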
Root search algorithm
**Java and C#** both use the root search algorithm to determine whether an object is alive. The basic idea is to start from a set of objects called “GC Roots” and search downward. The path traversed by the search is called a reference chain. When no reference chain connects an object to any GC Root, the object is proven to be unreachable. In Java, the objects that can serve as GC Roots include the following:
- The object referenced in the virtual machine stack (the local variable table in the stack frame).
- The object referenced by the class static property in the method area.
- The object referenced by a constant in the method area.
- The object referenced by JNI (native methods) in the native method stack.
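The search from GC Roots can be sketched as a graph traversal over a toy heap. In this illustration the heap is just a map from object names to the names they reference; the class RootSearchDemo and the object names are assumptions made for the example, not JVM internals.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class RootSearchDemo {
    // Follows reference chains from the roots and returns every object reached.
    static Set<String> reachable(Map<String, List<String>> heap, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // first visit: follow its outgoing references
                pending.addAll(heap.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    // Toy heap: A -> B is reachable from the root A; C <-> D is an unreachable cycle.
    static Set<String> demo() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("A", Arrays.asList("B"));
        heap.put("B", Collections.<String>emptyList());
        heap.put("C", Arrays.asList("D"));
        heap.put("D", Arrays.asList("C"));
        return reachable(heap, Arrays.asList("A"));
    }

    public static void main(String[] args) {
        System.out.println(new TreeSet<>(demo())); // prints "[A, B]"
    }
}
```

Unlike reference counting, the cycle between C and D is not a problem here: neither object is connected to a root, so neither is marked.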
In fact, under the root search algorithm an object must go through at least two marking passes before it is truly declared dead. If the root search finds no reference chain connecting the object to GC Roots, the object is marked for the first time and then screened. The screening criterion is whether the object needs to have its finalize() method executed. If the object does not override finalize(), or its finalize() method has already been called by the virtual machine, the virtual machine treats execution as unnecessary in both cases. If the object is judged to need finalize(), it is placed in a queue named F-Queue, and finalize() is executed later by a low-priority Finalizer thread that the virtual machine creates automatically. The finalize() method is the object's last chance to escape death, because the system calls an object's finalize() method automatically at most once. Later, the GC performs a second, smaller marking pass over the objects in F-Queue. To save itself inside finalize(), an object only needs to re-establish a link to any object on a reference chain. If the object is still not linked to any object on a reference chain at that point, it is reclaimed.
Garbage collection algorithm
Once garbage objects have been identified, garbage collection can begin. Several garbage collection algorithms are introduced below. Because implementing these algorithms involves many low-level details, this article describes the idea behind each algorithm rather than its detailed implementation.
Mark-sweep algorithm
The mark-sweep algorithm is the most basic collection algorithm. It has two phases, “mark” and “sweep”: first, the objects to be reclaimed are marked; after marking is complete, all marked objects are reclaimed together. The marking process is the same one the root search algorithm uses to identify garbage objects. The execution of the mark-sweep algorithm is shown in the following figure:
The algorithm has the following disadvantages:
- Both the marking and the sweeping phases are inefficient.
- Sweeping leaves a large number of non-contiguous memory fragments. With too much fragmentation, the program may later fail to find enough contiguous memory when it needs to allocate a large object, and be forced to trigger another garbage collection early.
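The fragmentation problem can be made concrete with a toy heap of fixed-size slots. This is a sketch under simplifying assumptions: MarkSweepDemo is a made-up class, every object occupies exactly one slot, and the set of live objects is passed in rather than computed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepDemo {
    // Sweep phase: free every occupied slot whose object is not in the live set.
    static void sweep(String[] heap, Set<String> live) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !live.contains(heap[i])) {
                heap[i] = null;
            }
        }
    }

    // Total number of free slots.
    static int freeSlots(String[] heap) {
        int free = 0;
        for (String slot : heap) {
            if (slot == null) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    // Returns {free slots, largest contiguous free run} after sweeping.
    static int[] demo() {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        Set<String> live = new HashSet<>(Arrays.asList("A", "C", "E"));
        sweep(heap, live);
        return new int[] { freeSlots(heap), largestFreeRun(heap) };
    }

    public static void main(String[] args) {
        int[] result = demo();
        // Three slots are free, but no two of them are adjacent.
        System.out.println("free=" + result[0] + " largestRun=" + result[1]); // prints "free=3 largestRun=1"
    }
}
```

In this toy heap half of the memory is free after the sweep, yet an object needing two adjacent slots still cannot be allocated.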
Copying algorithm
The copying algorithm is suited to the new generation. It improves on the mark-sweep algorithm to address that algorithm's disadvantages: the available memory is divided into two blocks of equal size, and only one block is used at a time. When that block runs out, the objects that are still alive are copied to the other block, and the used block is then cleaned up in a single pass. The copying algorithm has the following advantages:
- Only one block of memory is reclaimed at a time, which runs efficiently.
- Allocation only requires moving the heap-top pointer and allocating memory sequentially.
- There is no need to consider memory fragmentation when reclaiming memory.
The downside is that the usable memory is cut in half. The execution of the copying algorithm is shown in the figure below:
In practice, however, the memory does not have to be split 1:1; it can instead be divided into one large Eden space and two small Survivor spaces.
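A minimal sketch of the copying (replication) idea, using two equally sized arrays as the two memory blocks. CopyingDemo is a hypothetical class; as a simplification, each object fills one slot and the set of live objects is given.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingDemo {
    // Copies live objects from fromSpace into toSpace, packed from index 0,
    // then clears fromSpace. Returns the next free index in toSpace,
    // which plays the role of the heap-top pointer for later allocations.
    static int copyLive(String[] fromSpace, Set<String> live, String[] toSpace) {
        int top = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            String obj = fromSpace[i];
            if (obj != null && live.contains(obj)) {
                toSpace[top++] = obj;
            }
            fromSpace[i] = null; // the whole used block is reclaimed in one pass
        }
        return top;
    }

    static String demo() {
        String[] from = { "A", "B", "C", "D" };
        String[] to = new String[from.length];
        Set<String> live = new HashSet<>(Arrays.asList("A", "C"));
        int top = copyLive(from, live, to);
        return Arrays.toString(to) + " top=" + top;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[A, C, null, null] top=2"
    }
}
```

The survivors end up contiguous, so there is no fragmentation, and a new object can be placed simply at index top.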
Mark-compact algorithm
In the old generation, the survival rate of objects is relatively high. A copying collector would have to perform many copy operations there, which lowers efficiency, so the old generation generally uses other algorithms such as mark-compact. Its marking phase is the same as in the mark-sweep algorithm, but garbage objects are handled differently after marking. Instead of reclaiming the dead objects in place, it moves all live objects toward one end and then frees all memory beyond the end boundary. The reclamation process of the mark-compact algorithm is as follows:
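The compaction step can be sketched on a toy heap in the same way. MarkCompactDemo is a made-up class for illustration; it assumes one slot per object and a precomputed set of live objects, and it slides survivors within a single array instead of using a second block.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    // Slides live objects toward index 0 and frees everything past the boundary.
    // Returns the boundary index (the first free slot).
    static int compact(String[] heap, Set<String> live) {
        int top = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            heap[i] = null;                 // clear the old position first
            if (obj != null && live.contains(obj)) {
                heap[top++] = obj;          // top <= i, so no live object is overwritten
            }
        }
        return top;
    }

    static String demo() {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        Set<String> live = new HashSet<>(Arrays.asList("A", "C", "E"));
        int boundary = compact(heap, live);
        return Arrays.toString(heap) + " boundary=" + boundary;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[A, C, E, null, null, null] boundary=3"
    }
}
```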
Generational collection
Current commercial virtual machines use generational collection for garbage collection, dividing memory into several blocks according to object life cycles. Typically, the Java heap is divided into the new generation and the old generation. In the new generation, each garbage collection finds that a large number of objects have died and only a few survive, so the copying algorithm is a good fit. In the old generation, object survival rates are high and there is no extra space to act as an allocation guarantee, so the mark-sweep or mark-compact algorithm must be used.
Each object has an age counter. If an object is created in Eden and survives a Minor GC, it is moved to a Survivor zone and its age is set to 1. After that, its age increases by 1 for each Minor GC it survives in the Survivor zones. When the age reaches a certain threshold (15 by default), the object is promoted to the old generation.
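The age counter can be modeled with a few lines of code. This is a toy model that follows the description above literally; AgeCounterDemo and its fields are hypothetical, and a real collector may take other factors into account when deciding promotion.

```java
class AgeCounterDemo {
    // Threshold quoted in the text as the default.
    static final int TENURING_THRESHOLD = 15;

    int age = 0;
    boolean inOldGeneration = false;

    // Called once for every Minor GC that this object survives.
    void surviveMinorGc() {
        if (inOldGeneration) {
            return;                      // already promoted, age no longer matters
        }
        age++;                           // the first survival moves it out of Eden with age 1
        if (age >= TENURING_THRESHOLD) {
            inOldGeneration = true;      // promoted to the old generation
        }
    }

    // Counts how many Minor GCs a fresh object survives before promotion.
    static int minorGcsUntilPromotion() {
        AgeCounterDemo obj = new AgeCounterDemo();
        int gcs = 0;
        while (!obj.inOldGeneration) {
            obj.surviveMinorGc();
            gcs++;
        }
        return gcs;
    }

    public static void main(String[] args) {
        System.out.println(minorGcsUntilPromotion()); // prints "15"
    }
}
```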
Garbage collector
The garbage collector is the concrete implementation of a memory reclamation algorithm. The Java Virtual Machine specification does not say how a garbage collector should be implemented, so the collectors provided by different vendors and different virtual machine versions can differ greatly. The Sun HotSpot virtual machine version 1.6 includes the following collectors: Serial, ParNew, Parallel Scavenge, CMS, Serial Old, and Parallel Old. These collectors work together in different combinations to collect garbage in the different generations.
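Which collectors a particular JVM is actually running can be checked through the standard GarbageCollectorMXBean interface. The sketch below only reads what the JVM reports; the class name CollectorList is an assumption for this example, and the collector names differ between JVM versions and startup options.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorList {
    // Names of the collectors active in this JVM, as reported by the JVM itself.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time (ms) since the JVM started.
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```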
Garbage recovery analysis
Before analyzing the code, we define the following three memory allocation strategies:
- Objects are allocated in Eden first. When Eden does not have enough space for an allocation, a Minor GC is triggered.
- Large objects (Java objects that need a lot of contiguous space, such as long strings and arrays) go straight to the old generation. Because the new generation reclaims memory with the copying algorithm, this avoids a large amount of copying between Eden and the two Survivor zones.
- Long-lived objects are promoted to the old generation.
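The first two strategies can be written down as a small decision function. This is a sketch of the rules as stated, not of any real allocator: AllocationPolicy, its Decision values, and the largeObjectBytes parameter are all hypothetical names chosen for the example.

```java
class AllocationPolicy {
    enum Decision { ALLOCATE_IN_EDEN, MINOR_GC_THEN_EDEN, ALLOCATE_IN_OLD }

    // sizeBytes:        size of the object being allocated
    // edenFreeBytes:    free space currently left in Eden
    // largeObjectBytes: assumed size limit above which an object counts as "large"
    static Decision place(long sizeBytes, long edenFreeBytes, long largeObjectBytes) {
        if (sizeBytes >= largeObjectBytes) {
            return Decision.ALLOCATE_IN_OLD;      // large objects skip the new generation
        }
        if (sizeBytes > edenFreeBytes) {
            return Decision.MINOR_GC_THEN_EDEN;   // Eden is too full: collect first
        }
        return Decision.ALLOCATE_IN_EDEN;         // the normal case
    }

    public static void main(String[] args) {
        long limit = 3L * 1024 * 1024;            // illustrative 3 MB limit
        System.out.println(place(1024, 4096, limit));             // ALLOCATE_IN_EDEN
        System.out.println(place(8192, 4096, limit));             // MINOR_GC_THEN_EDEN
        System.out.println(place(10L * 1024 * 1024, 4096, limit)); // ALLOCATE_IN_OLD
    }
}
```

The third strategy, promotion of long-lived objects, depends on object age rather than on size.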
Two terms used in the garbage collection strategy are explained below:
- Minor GC: garbage collection in the new generation. Because most Java objects are short-lived, Minor GC is very frequent and generally fast.
- Old-generation GC (Major GC/Full GC): garbage collection in the old generation. A Major GC is often accompanied by at least one Minor GC. Because objects in the old generation live long, Major GC is infrequent, and a Full GC is typically performed only when the old generation is full. It is typically about 10 times slower than Minor GC. In addition, if Direct Memory has been allocated, garbage objects in Direct Memory are cleaned up during a Full GC of the old generation.
The Dalvik virtual machine uses the Mark-sweep algorithm for garbage collection. As the name suggests, Mark-sweep collects garbage in two phases, Mark and Sweep. The Mark phase starts from the Root Set and recursively marks all objects that are currently referenced, while the Sweep phase reclaims the objects that were not marked. Before looking at the Mark-sweep algorithm used by the Dalvik virtual machine, let’s take a look at what happens when GC is triggered.