Introduction
The JVM's memory areas were discussed in detail in the previous article, and as mentioned there, most objects created during Java program execution are allocated in heap space. In this article, from the perspective of an object instance, we will explain the life and death of a Java object, the layout of Java objects in memory, and object reference types.
1. Layout of Java objects in memory
We all know that object instances created with the `new` keyword in Java source code are allocated in memory at runtime. But is allocation simply a matter of "digging" a pit of the corresponding size in memory and throwing the object instance into it? In fact, the layout of a Java object in memory usually consists of three parts: the object header, instance data, and alignment padding, as follows:
In the HotSpot source code, the /src/share/vm/oops/ directory contains several C++ files (instanceOop, instanceKlass, oop, etc.) that describe the definition of an object. Interested readers can study them on their own; how to obtain the HotSpot source was covered in the introduction.
1.1 Object Header
A Java object header is a complex thing and usually consists of multiple parts, including the MarkWord and a type pointer (ClassMetadataAddress/KlassWord). If the object is an array, there is also an array length field. As follows:
If the object is an array, the header must also store the array length, so an array object's header takes three words instead of two. On a 32-bit VM a word is 4 bytes, so a plain object header is 8 bytes and an array object header is 12 bytes; on a 64-bit VM a word is 8 bytes. If pointer compression is enabled on a 64-bit VM (-XX:+UseCompressedOops), the MarkWord remains 8 bytes while the KlassWord is compressed to 4 bytes.
There is a lot of ambiguity around this area, and most materials describe it only for 32-bit virtual machines. Therefore, both the 32-bit and 64-bit object header layouts are listed below. The structure and storage size of the object header are as follows:
VM bitness | Object header field | Description | Size |
---|---|---|---|
32-bit | MarkWord | Stores the object's HashCode, generational age, biased-lock flag, and lock flag bits | 4 bytes / 32 bits |
32-bit | ClassMetadataAddress/KlassWord | Type pointer to the object's class metadata; the JVM uses it to determine which class the object is an instance of | 4 bytes / 32 bits |
32-bit | ArrayLength | Stores the array length for array objects; absent for non-array objects | 4 bytes / 32 bits |
VM bitness | Object header field | Description | Size |
---|---|---|---|
64-bit | MarkWord | Unused bits, HashCode, generational age, biased-lock flag, and lock flag bits | 8 bytes / 64 bits |
64-bit | ClassMetadataAddress/KlassWord | Type pointer to the object's class metadata; the JVM uses it to determine which class the object is an instance of | 8 bytes / 64 bits |
64-bit | ArrayLength | Stores the array length for array objects; absent for non-array objects | 4 bytes / 32 bits |
By default, the MarkWord in a 32-bit JVM's object header stores the object's HashCode, generational age, biased-lock flag, lock flag bits and other information. In a 64-bit JVM, the MarkWord by default stores the HashCode, generational age, biased-lock flag, lock flag bits, and an unused area:
VM bitness | Lock state | Hash code | Generational age | Biased lock | Lock flag |
---|---|---|---|---|---|
32-bit | Lock-free (default) | 25 bit | 4 bit | 1 bit | 2 bit |
VM bitness | Lock state | Hash code | Generational age | Biased lock | Lock flag | Unused |
---|---|---|---|---|---|---|
64-bit | Lock-free (default) | 31 bit | 4 bit | 1 bit | 2 bit | 26 bit |
Because the information in the object header is additional storage cost unrelated to the object's own member data, the MarkWord is designed as a non-fixed data structure for the sake of space efficiency, so that it can reuse its own storage space to hold more useful data depending on the object's state. Besides the default MarkWord layouts listed above, the following layouts may appear as the object's state changes:
The meanings of the MarkWord fields:

- `unused`: unused area.
- `identity_hashcode`: the object's identity hash value; it does not change even if `hashCode()` is overridden.
- `age`: the object's generational age.
- `biased_lock`: whether the lock is biased.
- `lock`: the lock flag bits.
- `ThreadID`: the ID of the thread holding the lock resource.
- `epoch`: the biased-lock timestamp.
- `ptr_to_lock_record`: a pointer to the `lock_record` in the thread stack.
- `ptr_to_heavyweight_monitor`: a pointer to the heavyweight `monitor` object in the heap.
Lightweight-lock note: a LockRecord exists in the thread stack; the object header's MarkWord is copied into the thread stack, the copy is called the Displaced Mark Word, and a pointer points back to the object.
The MarkWord area is mostly used by Synchronized locks. If you are interested in this area, check out the previous article, "In-depth Understanding of Java Concurrent Programming: Analysis of the Synchronized Keyword Implementation Principle", which details how this area changes during lock inflation/lock upgrading at runtime.
To summarize: the object header consists of the MarkWord, the KlassWord, and possibly the array length. The MarkWord stores information about the object and its lock, the KlassWord stores a pointer to the class metadata in the metaspace, and if the current object is an array, its length is also stored in the object header.
1.2 Instance Data
Instance data refers to all the scalars that make up the aggregate, that is, the object's own member fields together with those inherited from its parent classes. Here's an example:
```java
public class A {
    int ia = 0;
    int ib = 1;
    long l = 8L;

    public static void main(String[] args) {
        A a = new A();
    }
}
```
In the above case, class A has three fields, ia, ib, and l — two of type int and one of type long. Therefore the instance data size of an A object is 4 + 4 + 8 = 16 bytes.
Now try adding some spice to the case, as follows:
```java
public class A {
    int ia = 0;
    int ib = 1;
    long l = 8L;
    B b = new B();

    public static void main(String[] args) {
        A a = new A();
    }

    public static class B {
        Object obj = new Object();
    }
}
```
How should the instance data size of an A object be calculated now? Do we need to include the member data of class B? No: if a member field is a reference type, only the reference pointer is stored, and the size of a reference pointer is one word — 32 bits on a 32-bit VM and 64 bits on a 64-bit VM. Therefore the instance data size of an A object is 4 + 4 + 8 + 8 = 24 bytes (this holds when pointer compression is disabled; with pointer compression enabled it differs — more on that later).
1.3 Alignment Padding
Alignment padding may or may not exist in an object. On 64-bit virtual machines, the virtual machine requires that the total size of a Java object be a multiple of 8 in order to simplify memory reading, addressing, and allocation. So when an object's header plus instance data is not a multiple of 8, an alignment-padding section appears to round the object size up to a multiple of 8.
For example, if an object's header plus instance data totals 28 bytes, 4 bytes of alignment padding appear, and the JVM pads the object up to the next multiple of 8: 32 bytes.
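A minimal sketch of the rounding rule just described (the 8-byte multiple comes from the text above; the helper name is only for illustration):

```java
public class AlignmentPadding {
    // Rounds an object size up to the next multiple of 8 bytes.
    static long alignTo8(long rawSize) {
        return (rawSize + 7) & ~7L;
    }

    public static void main(String[] args) {
        System.out.println(alignTo8(28)); // 32 -> 4 bytes of padding added
        System.out.println(alignTo8(32)); // 32 -> already aligned, no padding
    }
}
```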
1.4 Pointer Compression
Pointer compression is a JVM optimization: on the one hand it saves a great deal of memory, and on the other it makes JVM addressing easier (more on that later). Pointer compression was introduced to improve memory utilization on 64-bit virtual machines. It compresses all reference pointers in a Java program (type pointers in object headers, reference variables in stack frames, reference pointers in the heap, etc.) to half their size. A pointer in Java is one word wide, and on a 64-bit VM a word is 64 bits, so pointer compression shrinks pointers from 64 bits to 32 bits; it is enabled by default after JDK 1.7.
Some readers might think that a 32-bit pointer only saves a tiny amount of space, but that would be wrong. While a Java program runs, most of what exists internally is neither constants nor objects but pointers: references in stack frames, pointers within the heap, class pointers in object headers, pointers inside references... Pointers are the most numerous things in a running JVM, so when every pointer can be halved, the program as a whole saves a great deal of space.
Pointer compression failure: the benefit of pointer compression is undeniable — it saves Java programs a great deal of memory. Roughly speaking, objects that need 14GB of memory without compression can almost fit into 10GB once pointer compression is enabled. In Java, a 32-bit compressed pointer can address at most 32GB, which means that if you hit an OOM problem with a 32GB heap, you may still hit OOM after expanding the heap to 48GB: once the heap exceeds 32GB, 32-bit pointers can no longer cover the address space, all compressed pointers become invalid, pointers "inflate" again, and every pointer reverts from the compressed 32-bit size to the uncompressed 64-bit size.
A 32-bit pointer normally supports addressing 4GB (2^32) of memory, so why can a 32-bit compressed pointer in Java address 32GB? This is closely related to the alignment padding discussed above. As mentioned earlier, on a 64-bit VM an object's size must be a multiple of 8, and alignment padding is added whenever it is not. It follows that an address that is not a multiple of 8 can never be the start of an object; only addresses that are multiples of 8 can be object start addresses. The compressed pointer can therefore address in units of 8 bytes rather than 1 byte, so its 4G distinct values cover 4G × 8 = 32GB. Here's an analogy:
Imagine a person who can only take four steps and covers one meter per step: they can walk at most four meters. Another person can also take only four steps, but covers eight meters per step, so they can walk at most 32 meters.
After pointer compression is enabled in the JVM, there are three ways to evaluate the location of an object:
- ① If the high address of the heap is less than
32GB
, indicating that no base address is requiredbase
I can locate any object in the heap, and this pattern is calledZero-based Compressed Oops Mode
, the calculation formula is as follows:- Formula: Add =0+offset∗ 8Add =0+offset * 8 Add =0+offset * 8
- The highheap<32GBhigh_{heap} < 32GBhighheap<32GB
- ② If the stack height is greater than or equal to
32GB
, stating the needbase
Base address if the heap space is less than4GB
, indicating that the base address + offset can locate any object in the heap, as follows:- Add = Base + offsetAdd = Base +offset Add = Base +offset
- Prerequisites: SizeHeap <4GBsize_{heap} < 4GBsizeheap<4GB
- ③ If the heap size is in
4GB
with32GB
This can only be scaled by base address + offset Xscale
To locate any object in the heap, as follows:- Add = Base +offset∗ 8Add = Base +offset * 8 Add = Base +offset∗8
- Prerequisites: 4GB<=sizeheap<32GB4GB <= size_{heap} <32GB4GB <=sizeheap<32GB
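The three modes above boil down to one decoding formula: address = base + (compressedOop << shift), where base is 0 in zero-based mode and shift is 3 (×8) when scaling is used. A hedged sketch of that arithmetic (illustrative names, not HotSpot code):

```java
public class CompressedOopDecode {
    // Decodes a 32-bit compressed oop into a 64-bit native address.
    // base = 0 and shift = 3 corresponds to Zero-based Compressed Oops mode.
    static long decode(int compressedOop, long base, int shift) {
        long offset = Integer.toUnsignedLong(compressedOop);
        return base + (offset << shift);
    }

    public static void main(String[] args) {
        // With a 3-bit shift, 2^32 distinct offsets cover 2^32 * 8 = 32GB of heap.
        long maxAddressable = (1L << 32) << 3;
        System.out.println(maxAddressable / (1L << 30) + " GB"); // 32 GB
        System.out.println(decode(1, 0L, 3)); // 8 -> objects start on 8-byte boundaries
    }
}
```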
1.5 Hands-on: Calculating Object Size with JOL
To observe the memory layout of objects, first import a tool provided by the OpenJDK organization: JOL. The Maven dependency is as follows:
```xml
<!-- https://mvnrepository.com/artifact/org.openjdk.jol/jol-core -->
<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.9</version>
</dependency>
```
Two APIs are provided by the tool:

- `GraphLayout.parseInstance(obj).toPrintable()`: prints the object's external information, including referenced objects.
- `GraphLayout.parseInstance(obj).totalSize()`: prints the total space occupied by the object.
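A small usage sketch of the two GraphLayout calls listed above (the dependency is the jol-core artifact shown earlier; the exact output varies by JVM and flags):

```java
import org.openjdk.jol.info.GraphLayout;

public class JolGraphDemo {
    public static void main(String[] args) {
        Object obj = new java.util.ArrayList<String>();
        // Prints the object graph reachable from obj, including referenced objects.
        System.out.println(GraphLayout.parseInstance(obj).toPrintable());
        // Prints the total footprint of obj plus everything it references.
        System.out.println(GraphLayout.parseInstance(obj).totalSize() + " bytes");
    }
}
```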
Let's start with a classic interview question: how much memory does an `Object` created in Java occupy?
Based on the explanation above, we can make a preliminary calculation: the object header should theoretically be MarkWord + KlassWord = 16 bytes = 128 bits, and the Object class defines no fields, so there is no instance data. If pointer compression is enabled, however, the header is only 12 bytes, because the class pointer in the object header is compressed by half; 4 bytes of alignment padding are then added, so the final size should be 16 bytes whether or not pointer compression is enabled.
```java
public static void main(String[] args) {
    Object obj = new Object();
    System.out.println(ClassLayout.parseInstance(obj).toPrintable());
}
```
The output is as follows:

```
java.lang.Object object internals:
 OFFSET  SIZE  TYPE DESCRIPTION                    VALUE
      0     4       (object header)                ......
      4     4       (object header)                ......
      8     4       (object header)                ......
     12     4       (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
```
From the result it is clear that bytes 0–12 are the object header, bytes 12–16 are alignment padding, and the final size is 16 bytes — consistent with the prediction above: with pointer compression enabled, 4 bytes of alignment padding appear.
1.5.1 Array object size calculation
After the simple analysis of the Object size above, let's look at another example:
```java
public static void main(String[] args) {
    Object obj = new int[9];
    System.out.println(ClassLayout.parseInstance(obj).toPrintable());
}
```
What is the size now? Since this is an int array and an int is 32 bits / 4 bytes, the theoretical size should be 12 bytes of object header + 9 × 4 = 36 bytes of array data = 48 bytes, right? Take a look at the result:
```
[I object internals:
 OFFSET  SIZE  TYPE DESCRIPTION                    VALUE
      0     4       (object header)                .....
      4     4       (object header)                .....
      8     4       (object header)                .....
     12     4       (object header)                .....
     16    36   int [I.<elements>                  N/A
     52     4       (loss due to the next object alignment)
Instance size: 56 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
```
The result shows a final size of 56 bytes, clearly different from the earlier inference. Why? Because when an object is an array, its object header uses an extra 4 bytes to store the array length. So the header of obj is 16 bytes, of which bytes 12–16 store the array length. Adding the nine ints at 36 bytes gives 52 bytes, and since 52 is not a multiple of 8, the JVM adds 4 bytes of alignment padding, producing the 56 bytes seen in the output above.
When we use the array.length property, where does the length come from? Now you know the answer: from the object header. And what is the maximum length of an array in Java, ignoring memory limits? The maximum value an int can express, since only 4 bytes in the object header are used to store the array length. Interesting, isn't it? Many puzzles from day-to-day development resolve themselves once you understand the underlying concepts.
1.5.2 Instance object size calculation
After analyzing array objects, let’s take a look at instance objects that are often defined during development, as follows:
```java
import org.openjdk.jol.info.ClassLayout;

public class ObjectSizeTest {
    public static class A {
        int i = 0;
        long l = 0L;
        Object obj = new Object();
    }

    public static void main(String[] args) {
        A a = new A();
        System.out.println(ClassLayout.parseInstance(a).toPrintable());
    }
}
```

The result is as follows:
```
java.lang.Object object internals:
 OFFSET  SIZE              TYPE DESCRIPTION        VALUE
      0     4                   (object header)    ......
      4     4                   (object header)    ......
      8     4                   (object header)    ......
     12     4               int A.i                0
     16     8              long A.l                0
     24     4  java.lang.Object A.obj              (object)
     28     4                   (loss due to the next object alignment)
Instance size: 32 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
```
No surprises in the result; readers who have mastered the earlier material can calculate it independently. The only thing worth mentioning is that the 4 bytes at offsets 24–28 store the pointer to the heap object referenced by obj; because pointer compression is enabled, it is 32 bits / 4 bytes in size.
Now that we have explained how Java objects are laid out in memory and how their size is calculated, let's look at how Java objects are allocated.
2. Java object allocation process in detail
There are many ways to create objects in Java. The most common is the `new` keyword, but there are several others besides it (a brief sketch of items ①–④ follows the list):

- ① By calling the `newInstance` method of the `Class` class.
- ② By calling the `newInstance` method of the `Constructor` class via reflection.
- ③ By implementing the `Cloneable` interface and creating a copy via the `clone` method.
- ④ By reading binary stream data from a local file or the network and creating the object through deserialization.
- ⑤ By using the third-party library `Objenesis` to create the object.
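A brief, hedged sketch of items ①–④ above (the `User` class and stream plumbing are illustrative only; Objenesis is omitted since it is a third-party library):

```java
import java.io.*;
import java.lang.reflect.Constructor;

public class CreationWays {
    // A simple class used for the demonstrations below.
    public static class User implements Cloneable, Serializable {
        @Override
        public User clone() throws CloneNotSupportedException {
            return (User) super.clone();
        }
    }

    public static void main(String[] args) throws Exception {
        // ① Class.newInstance (deprecated since JDK 9, but still illustrative).
        User u1 = User.class.newInstance();

        // ② Constructor.newInstance via reflection.
        Constructor<User> ctor = User.class.getDeclaredConstructor();
        User u2 = ctor.newInstance();

        // ③ Cloning an existing instance.
        User u3 = u2.clone();

        // ④ Deserialization from a byte stream.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(u1);
        oos.flush();
        User u4 = (User) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();

        System.out.println(u1 + " " + u2 + " " + u3 + " " + u4);
    }
}
```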
But no matter how an object is created, the virtual machine breaks the creation process into the same steps: class-loading check, memory allocation, memory initialization, and object-header setup.
2.1 Class loading detection
When the virtual machine encounters a creation instruction, it first checks whether the instruction's parameter can locate a symbolic reference to the class in the constant pool, and whether the class represented by that symbolic reference has already been loaded, parsed, and initialized. If not, then under the parent-delegation model, the current class loader searches for the corresponding .class file using the fully qualified name of the class being created as the key. If no file is found, a ClassNotFoundException is thrown; otherwise the class is loaded, and memory allocation for the object begins.
2.2. Memory allocation
After the object's class has been loaded, the JVM calculates the amount of memory the object needs, using the layout rules analyzed in the first section. Once the size is known, the allocation process begins: memory allocation means carving out a block of memory equal to the object's size and placing the object into it. It is important to note, however, that Java objects are not necessarily allocated directly on the heap from the start, as follows:
2.2.1 Stack allocation
Stack allocation is an aggressive optimization performed by the C2 compiler (if you are unfamiliar with C2's aggressive optimizations, refer to the third article, "A Comprehensive Explanation of the Execution Engine Subsystem and JIT Compilation Principles"). Built on escape analysis, it uses scalar replacement to disassemble the aggregate: the object is replaced by its basic scalar fields, which are then allocated in the local variable table of the virtual machine stack. This reduces the number of object instances created, heap memory usage, and the number of GCs.
Escape analysis: escape analysis is performed per method. If an object is created in a method body and never leaves the method's scope before the method ends, the object is considered not to escape. Conversely, if the object is returned from the method or assigned to an external member inside the method body, it escapes. Scalar replacement: on the basis of escape analysis, the aggregate (the object) is replaced by fundamental scalars. Scalars are values that cannot be broken down further; the eight primitive data types are typical scalars.
If an object is allocated on the stack, there is no need for GC to reclaim it: it is reclaimed automatically when the method's stack frame is destroyed. However, if the object's size exceeds the available stack space (total stack size minus used space), stack allocation is not attempted.
Stack allocation depends on escape analysis, so objects that can be allocated on the stack are by definition only useful within the stack frame. Objects allocated on the stack therefore have no GC age; they are created and destroyed as the frame is pushed and popped.
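A hedged sketch of a method whose temporary object does not escape; with the HotSpot flags noted in the comments (all real flags), the allocation is a candidate for elimination via scalar replacement, so no heap object needs to be created on the hot path. The class and method names are illustrative only:

```java
public class EscapeDemo {
    static class Point {
        int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // The Point created here never leaves the method: it is a candidate for
    // scalar replacement, i.e. its fields x and y can live in the stack frame.
    static int sumOfPoint(int a, int b) {
        Point p = new Point(a, b);   // does not escape
        return p.x + p.y;
    }

    public static void main(String[] args) {
        long total = 0;
        for (int i = 0; i < 50_000_000; i++) {
            total += sumOfPoint(i, i + 1);
        }
        System.out.println(total);
        // Compare heap usage / GC counts when running with:
        //   -XX:+DoEscapeAnalysis -XX:+EliminateAllocations   (the defaults)
        //   -XX:-EliminateAllocations                          (scalar replacement off)
    }
}
```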
2.2.2 TLAB allocation
TLAB (Thread Local Allocation Buffer) is a private buffer that the JVM carves out of Eden for each thread. The previous article on JVM memory regions showed that most Java objects are allocated on the heap, but also that the heap is shared by all threads, which creates a problem: when the JVM is running, if two threads try to allocate objects in the same memory region at the same time, contention is inevitable, and allocation slows down.
Background story — building houses in the Tang dynasty: Zhang San's and Li Si's sons have both grown up (in ancient times men moved out when they came of age), so Zhang San and Li Si each plan to pay the government for a plot of land and build a house for their son. It turns out both have their eye on the same plot, and neither will back down. What now? There is bound to be a dispute over who gets the land, and once the two sides clash — quarrels, fights, lawsuits, mediation... — a long time passes, and in the end the house-building is delayed.
The story shows that "many people wanting the same piece of land" badly hurts efficiency. How can this kind of problem be solved?
For the government, an occasional dispute like Zhang San and Li Si's is a small matter, but when such disputes pile up every few days and the local officials keep reporting them to the court, the court — to cure the problem at its root — introduces a "land privatization" system: every household is assigned a few acres of land in advance. If a family wants to build a house for a son, there is no need to pay the government for public land; they simply build on their own land, and the problem disappears.
The JVM faces a similar headache: when allocating memory for objects, multiple threads often compete for the same memory region. To cure this, the virtual machine adopts a means similar to the story's "land grant": it carves out a dedicated memory region for each thread, called the TLAB. When a thread tries to allocate memory for an object and TLAB allocation is enabled, it first tries to allocate in its own TLAB. (TLAB allocation can be toggled at startup with the -XX:+UseTLAB parameter.)
Note that the TLAB is not a separate region outside heap space; it is carved directly out of Eden by the JVM for each thread. By default a TLAB occupies only 1% of Eden, and the proportion of Eden that a TLAB occupies can be set with the -XX:TLABWasteTargetPercent parameter.
In general, the JVM allocates memory in the TLAB first (except when C2's aggressive optimization performs stack allocation) and only attempts heap allocation when TLAB allocation fails.
TLAB allocation process
When an object is created and aggressive optimization is enabled, stack allocation is tried first; if it fails, TLAB allocation is performed. The space the object needs is compared with the remaining space in the TLAB: if the object fits, it is allocated directly in the TLAB. If the remaining TLAB space is not enough, the JVM checks whether that remaining space is larger than the specified maximum amount of space wasted. If it is larger, the object is allocated directly on the heap; if not, the memory gap is first filled with a dummy object, the current TLAB is returned to heap space, a new TLAB is requested based on the expected value, and allocation is attempted again. As follows:
The TLAB allocation process above mentions the terms maximum space wasted, memory gap, and expected value:

- Maximum space wasted: as the name suggests, the maximum amount of memory the JVM allows a TLAB to leave unused; this value is generally dynamic.
- Memory gap: when the current TLAB cannot satisfy an allocation and its remaining space is below the maximum-space-wasted limit, the TLAB is returned to Eden and a new one is requested. After it is returned, the unused remainder becomes a gap. If these gaps were left alone, extra checks would be needed during a GC scan (only the owning thread knows what was allocated inside its TLAB), which would hurt GC scan efficiency. Therefore, when a TLAB is returned to Eden, the remaining space is filled with a dummy object; since the filler is already known to be reclaimable, the GC simply marks it and skips that memory, improving scan efficiency.
- Expected value: expected values are common in the JVM — both JIT and GC use them as the basis for aggressive optimizations. An expected value is computed from "historical data" gathered while the JVM runs, i.e. each new sample updates the expectation derived from the historical samples.
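A pseudocode-style sketch (written in Java, with illustrative names only — not HotSpot code) of the decision flow just described:

```java
public class TlabSketch {
    // Illustrative model of one thread's TLAB.
    static class Tlab {
        long top;              // next free position inside the TLAB
        long end;              // end of the TLAB
        long refillWasteLimit; // the "maximum space wasted" threshold

        long remaining() { return end - top; }
    }

    // Models the allocation decision described in the text above.
    static long allocate(Tlab tlab, long objectSize) {
        if (objectSize <= tlab.remaining()) {
            long addr = tlab.top;          // fast path: bump the TLAB pointer
            tlab.top += objectSize;
            return addr;
        }
        if (tlab.remaining() > tlab.refillWasteLimit) {
            // Too much usable space left to throw away: keep this TLAB and
            // allocate the object directly in shared Eden instead.
            return allocateInEden(objectSize);
        }
        // Otherwise retire the TLAB: fill the leftover gap with a dummy object so
        // GC can skip it, request a new TLAB sized from the expected value, retry.
        fillGapWithDummyObject(tlab);
        Tlab fresh = requestNewTlab(objectSize);
        return allocate(fresh, objectSize);
    }

    // Placeholders standing in for the JVM's real behaviour.
    static long allocateInEden(long size) { return -1; }
    static void fillGapWithDummyObject(Tlab t) { t.top = t.end; }
    static Tlab requestNewTlab(long minSize) {
        Tlab t = new Tlab();
        t.end = Math.max(minSize, 1024);   // in reality sized by the expected value
        t.refillWasteLimit = t.end / 64;
        return t;
    }

    public static void main(String[] args) {
        Tlab tlab = requestNewTlab(0);
        System.out.println(allocate(tlab, 16));   // 0  -> fast path inside the TLAB
        System.out.println(allocate(tlab, 2048)); // -1 -> too big, modeled Eden allocation
    }
}
```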
The expected-value algorithm commonly used for TLAB: EMA (Exponential Moving Average)
The core of the EMA (Exponential Moving Average) algorithm is choosing an appropriate minimum weight: the larger the minimum weight, the faster the average changes and the less it is affected by historical data. Setting an appropriate minimum weight for your application makes the expected value work better. For details, refer to an encyclopedia entry on the topic.
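A hedged sketch of the exponential moving average idea (the weight and names are illustrative, not HotSpot's exact implementation):

```java
public class EmaSketch {
    // newAvg = weight * sample + (1 - weight) * oldAvg
    // A larger weight reacts faster to new samples and is influenced less by history.
    static double ema(double oldAvg, double sample, double weight) {
        return weight * sample + (1 - weight) * oldAvg;
    }

    public static void main(String[] args) {
        double expectedTlabSize = 1024;                  // initial expectation
        double[] samples = {2048, 2048, 512, 4096};      // observed allocation demand
        for (double s : samples) {
            expectedTlabSize = ema(expectedTlabSize, s, 0.35);
            System.out.printf("expected = %.1f%n", expectedTlabSize);
        }
    }
}
```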
Note: when a TLAB is returned to heap space, do the objects stored in it need to be moved to the new TLAB?
The answer is no. The TLAB itself uses Eden memory, so the gap is simply filled with a dummy object and the region is handed back to the heap. The objects already allocated there do not need to move to the newly requested TLAB; they can still be reached through their original reference pointers. The only change needed is to point the thread's TLAB at the newly requested memory region.
2.2.3 Old generation allocation
If TLAB allocation fails, the JVM checks whether the object meets the criteria for old-generation allocation; if so, it is allocated directly in old-generation space. Some readers may wonder: isn't an object supposed to be allocated in the new generation first and only later enter the old generation? That is actually a misconception: at its first allocation, an object is first checked against the old-generation allocation criteria, and if it qualifies, it goes straight into the old generation.
Old generation allocation conditions
At the first allocation, large objects go directly into the old generation. (In general, objects enter the old generation in only three cases: large objects, long-lived objects, and objects that qualify under dynamic age determination.) At JVM startup the -XX:PretenureSizeThreshold parameter specifies the large-object threshold; if an object exceeds this size when it is allocated, it goes straight into the old generation.
The advantage is that it avoids a large object bouncing back and forth between the two Survivor regions. Each young-generation GC moves surviving objects from one Survivor region to the other, and large objects generally do not belong in Survivor space; allocating them in the new generation means they would most likely be copied back and forth between the two Survivor regions. Moving large objects is a heavy burden for the JVM — the memory allocation and data copying cost time and resources — and it also lengthens GC pauses, because migrating large objects is time-consuming.
So for large objects, it is appropriate to go straight to the old generation, which is also a detail optimization for the JVM.
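A small sketch using the flag mentioned above; note that -XX:PretenureSizeThreshold only takes effect with the Serial/ParNew young collectors, and the exact behavior depends on the collector and JDK in use, so treat the flags below as an example rather than a recipe:

```java
public class PretenureDemo {
    public static void main(String[] args) {
        // Run with, for example (JDK 8 style flags):
        //   -XX:+UseSerialGC -XX:PretenureSizeThreshold=1048576 -XX:+PrintGCDetails -Xmx64m
        // Objects larger than 1MB (here a ~4MB array) are then allocated
        // directly in the old generation instead of Eden.
        byte[] big = new byte[4 * 1024 * 1024];
        System.out.println("allocated " + big.length + " bytes");
    }
}
```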
The paragraph above assumes a generational GC. In fact, different GCs treat large objects differently; in particular, the newer non-generational, region-based GCs do not send large objects to an old generation but keep special regions for them, such as the Humongous regions in G1 and ShenandoahGC, and the Large pages in ZGC. The large-object criteria for these GCs are covered in the heap-space section of "In-depth Understanding of the Virtual Machine Runtime Data Areas and the Anatomy of Memory Overflow and Memory Leaks".
2.2.4 New generation allocation
If stack allocation, TLAB allocation, and old-generation allocation all fail, allocation is attempted in the new generation's Eden area. There are two ways to allocate in the new generation:

- ① Pointer collision: pointer collision is a way of allocating heap memory for objects; it generally applies to collectors such as Serial and ParNew that do not produce memory fragmentation, i.e. the heap is tidy.
  - Allocation process: used memory and free memory sit on opposite sides of the heap, separated by a pointer at the dividing point. When the JVM allocates memory for a new object, it simply moves the pointer toward the free side by a distance equal to the object's size.
- ② Free list: like pointer collision, the free list is a way of allocating heap memory for new objects; it generally applies to collectors such as CMS that produce memory fragmentation, i.e. the heap is not contiguous.
  - Allocation process: used and free memory in the heap are interleaved. The JVM keeps track of the available free blocks by maintaining a list; when a new object needs memory, the JVM finds a block in the list large enough for the object instance, allocates it, and updates the list accordingly. Reclaimed memory is also recorded back into the list.

In both cases, pointer collision suits a tidy heap and the free list suits a fragmented heap. In general, the JVM decides which allocation method to use based on the garbage collector currently in use.
When allocating in Eden, because it is a shared region, multiple threads may operate on it at the same time. To avoid thread-safety problems, memory allocation in Eden must be synchronized; HotSpot VM uses CAS with retry on failure to guarantee atomicity.
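A minimal sketch of the pointer-collision ("bump the pointer") idea combined with CAS retry, as described above; this is a model with illustrative names, not HotSpot's actual allocator:

```java
import java.util.concurrent.atomic.AtomicLong;

public class BumpPointerSketch {
    // The "dividing point" between used and free memory in a contiguous region.
    private final AtomicLong top = new AtomicLong(0);
    private final long end = 1024 * 1024;   // size of the modeled region

    // Reserves 'size' bytes by bumping the pointer; CAS + retry handles races
    // between threads allocating from the same shared region.
    long allocate(long size) {
        while (true) {
            long oldTop = top.get();
            long newTop = oldTop + size;
            if (newTop > end) {
                return -1;                    // region full: would trigger GC / fallback
            }
            if (top.compareAndSet(oldTop, newTop)) {
                return oldTop;                // start address of the new object
            }
            // CAS failed: another thread won the race, retry with the fresh top.
        }
    }

    public static void main(String[] args) {
        BumpPointerSketch region = new BumpPointerSketch();
        System.out.println(region.allocate(16)); // 0
        System.out.println(region.allocate(24)); // 16
    }
}
```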
2.2.5 Summary of memory allocation
At this point the memory-allocation phase for a Java object is complete. In short, if the JVM is warmed up and the C2 compiler has kicked in, stack allocation is attempted first; if stack allocation fails, TLAB allocation is tried; if TLAB allocation fails, the JVM checks whether the object meets the old-generation allocation criteria — if so it is allocated directly in the old generation, otherwise it is allocated in the new generation's Eden area.
If the JVM is still in a cold state and the C2 compiler is not yet working, TLAB allocation is simply the preferred path for object allocation.
2.3. Initialize memory
After the memory-allocation step, the newly created object has been assigned a region of memory, and that region is then initialized: the JVM initializes the allocated memory (excluding the object header) to zero values. This ensures that an object's instance fields can be used in Java code without explicit assignment — the program reads the zero value corresponding to the field's data type — avoiding errors caused by accessing fields that were never assigned.
If the object is allocated on the stack, its data lives in the local variable table of the stack frame. If the object is TLAB-allocated, this initialization step is moved forward into the allocation phase.
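A tiny example of the consequence described above: instance fields can be read without explicit assignment and yield their type's zero value (the class and field names are only for illustration):

```java
public class ZeroValueDemo {
    static class Holder {
        int count;       // no explicit assignment anywhere
        long total;
        boolean ready;
        Object payload;
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        // Fields read back the zero value set during memory initialization.
        System.out.println(h.count);   // 0
        System.out.println(h.total);   // 0
        System.out.println(h.ready);   // false
        System.out.println(h.payload); // null
    }
}
```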
2.4. Set the object header
Once zero-value initialization is done, the object header is set immediately: the KlassWord, which points to the metadata of the current object's class, is written into the header, and if the object is an array, the ArrayLength specified in the code is written in as well. Finally, when all the header data has been assembled, the header is stored in the memory region allocated to the object.
2.5 Executing the `<init>` method

When all the steps above are complete, the `<init>` function — the constructor — is finally executed, mainly to perform explicit assignment of fields. From the Java level, this is the initialization the developer actually intends, and only after it runs do we have a truly usable object.
3. An object's journey from birth to death
After the allocation process, a Java object really exists in memory. The object ultimately lives in Eden (TLAB allocation is also in Eden; stack allocation aside), and a reference to it appears in the thread stack. When the object is needed later, its data in that block of memory is accessed through the direct address or the handle stored in the reference.
3.1 Access mode of objects
In Java, all objects are accessed through references (reference). There are two main access modes: handle access and direct pointer access.
3.1.1 Handle access

With handle access, an area of the Java heap is set aside as a handle pool to store the handles of all references. The reference stores the object's handle address, and the handle contains the addresses of the object's instance data and type data, as follows:
When the object needs to be used, the handle address stored in the reference is read first, and then the actual memory address stored in the handle is used to locate and access the object's data in memory.
3.1.2 Direct pointer access

With direct pointer access, the reference stores the object's heap address directly, and the type pointer lives in the object header, as follows:
In this mode, when the object needs to be used, it is accessed directly through the heap memory address stored in the reference.
3.1.3 Summary of access methods
The biggest benefit of handle access is that the reference stores a stable handle address: when the object is moved (which happens during GC), only the instance-data pointer inside the handle changes, and the reference itself does not need to be modified. On the other hand, every object access requires an extra indirection, which is noticeably slower than direct pointer access.
The biggest advantage of direct pointer access is speed — it saves the cost of one pointer indirection. Since object accesses are extremely frequent in Java, the savings add up and significantly reduce overall execution cost. However, when an object is moved by GC, the address stored in every reference to the moved object must be updated synchronously.
The HotSpot virtual machine locates and accesses object data via direct pointers (although with the Shenandoah collector there is an extra forwarding-pointer indirection).
3.2 Object movement and object promotion during GC
In HotSpot, objects are accessed through direct pointers: at runtime the reference lives in the thread stack and the instance data lives in the heap. When a thread finishes executing a method, the method's stack frame is destroyed, its local variable table goes with it, and the references in the local variable table are reclaimed too. At that point the object in the heap becomes a "garbage" object with no pointer referring to it, and if no new pointer refers to it by the time the next GC occurs, it will be reclaimed (this process is described in the GC section).
Objects that still have references at the time of GC are moved from Eden to a Survivor region, and the pointer in the reference must then be updated to the new memory address.
The new generation has two Survivor regions, S0/S1, also called the From/To regions. At any moment one of the two is always empty, serving as the new "shelter" for surviving objects at the next GC. However, From and To are not fixed names for particular regions; they are dynamic: the Survivor region currently holding objects is called From, and the empty one is called To.
Each time an object is moved, the age stored in the MarkWord of its object header increases by 1 (a newly created object has age 0). In most generational GCs, the default promotion threshold is 15 (6 for CMS); once an object has survived enough collections to reach that age, it is moved into old-generation storage. The age threshold can be changed with -XX:MaxTenuringThreshold.
3.2.1 Dynamic object age determination
Normally an object must reach the specified age threshold to be promoted to the old generation, but to adapt better to different programs' memory conditions, the JVM does not always insist on the threshold. If the total size of all objects of the same age in a Survivor region exceeds half of the Survivor space, all objects of that age or older can enter the old generation directly, without waiting to meet the threshold. This is the JVM's dynamic object age determination.
3.2.2 Guarantee mechanism for space allocation
Allocation guarantee means the old generation acts as a guarantor for the new generation; it can be switched on or off with the HandlePromotionFailure parameter (effectively always enabled after JDK 1.6). When a GC occurs, if one Survivor region cannot hold the surviving objects from Eden and the other Survivor region, those objects are moved directly into the old generation — that is the space allocation guarantee. Before a Minor GC, the JVM checks whether the old generation's contiguous free space is larger than the total size of new-generation objects, or larger than the average size of previous promotions; if so, the Minor GC is considered safe and is performed, otherwise a Full GC is performed.
Purpose of the allocation guarantee: if a large number of objects survive a new-generation GC (in the most extreme case every object in the new generation survives) and Survivor space is relatively small, the old generation needs to provide the guarantee and take in the objects Survivor cannot hold. The precondition is that the old generation has enough space for those objects, but the number of objects that will survive a collection cannot be known in advance, so the average size promoted to the old generation in previous collections is used as a reference. This average is compared with the old generation's remaining space to decide whether a Full GC should be performed first to free up more old-generation space.
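A pseudocode-style sketch (illustrative names, not HotSpot code) of the check described above before a Minor GC:

```java
public class PromotionGuaranteeSketch {
    // Returns true if a Minor GC is considered safe, per the rule above:
    // the old generation's contiguous free space must exceed either the total size
    // of new-generation objects or the average size promoted in previous GCs.
    static boolean minorGcIsSafe(long oldGenFree,
                                 long youngGenUsed,
                                 long avgPromotedSize,
                                 boolean handlePromotionFailure) {
        if (oldGenFree > youngGenUsed) {
            return true;
        }
        return handlePromotionFailure && oldGenFree > avgPromotedSize;
    }

    public static void main(String[] args) {
        // Safe: the old gen can hold the worst case (every young object survives).
        System.out.println(minorGcIsSafe(512, 256, 64, true));  // true
        // Risky but allowed: guarantee based on the historical promotion average.
        System.out.println(minorGcIsSafe(128, 256, 64, true));  // true
        // Not safe: a Full GC would be performed instead.
        System.out.println(minorGcIsSafe(32, 256, 64, true));   // false
    }
}
```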
3.3 Summary
After an object is created, its instance data is stored in the heap, and at runtime threads access it through a pointer in the stack frame. When the method finishes executing, the corresponding pointer is destroyed, and the heap object is reclaimed at the next GC; an object that survives a GC has its age incremented by 1. When an object's age reaches the specified threshold, or it qualifies under dynamic object age determination, it is moved from the new generation to the old generation.
4. Object reference types: strong, soft, weak, and phantom
In JDK 1.2, Java expanded the concept of references. Since then, Java has provided four levels of reference, in descending order of strength: strong reference (StrongReference), soft reference (SoftReference), weak reference (WeakReference), and phantom/virtual reference (PhantomReference). Except for strong references, the other three reference types have corresponding classes in the java.lang.ref package, so they can be manipulated directly during development.
4.1 StrongReference
Strong references are the most common reference type in a running Java program. Objects created with new are strongly referenced by default: a variable in the stack holds a direct reference to the object in the heap. As follows:
Object obj = new Object();
In the above code, the Object instance created by the new instruction is allocated and stored in the heap, and the variable obj is stored in the local variable table of the current method's stack frame. At runtime the heap instance can be manipulated directly through obj, so obj is a strong reference to the Object instance.
As is well known, if heap memory runs low while a Java program is running, the GC mechanism is triggered and the collector starts detecting and reclaiming "garbage" objects. But when the GC encounters a strongly referenced object, it never forcibly reclaims it, because a strongly referenced object is judged to be "alive". Even after several rounds of scanning, if the object still has a strong reference in the heap, the GC would rather throw an OOM than reclaim it. Since objects holding strong references are not collected by GC, if you determine an object will no longer be used you can clear the reference explicitly, e.g. obj = null; this makes it easy for the GC mechanism to find and mark the object when looking for garbage.
4.2 SoftReference type
A soft reference is an object wrapped with the java.lang.ref.SoftReference type. When an object is reachable only through soft references, it will be reclaimed by the GC when heap memory is insufficient; as long as the heap is not under pressure, objects at this reference level are not reclaimed. So if you need a simple JVM-level cache, you can use this reference type. The usage is as follows:
```java
SoftReference<HashMap> cacheSoftRef =
        new SoftReference<HashMap>(new HashMap<Object, Object>());
cacheSoftRef.get().put("Bamboo", "Panda");
System.out.println(cacheSoftRef.get().get("Bamboo"));
```
In the example above, a simple cache is implemented using soft reference types.
4.3. WeakReference Type
A weak reference is an object wrapped with the java.lang.ref.WeakReference type. The difference from a soft reference is that a weakly referenced object has a shorter lifetime: as long as the GC discovers it, it is reclaimed regardless of whether heap memory is tight. However, because the GC thread has lower priority than user threads, weakly referenced objects are usually not discovered immediately, so in practice they can still live for quite a while.
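A brief example of java.lang.ref.WeakReference; whether the referent is actually collected after System.gc() depends on the JVM, so the output may vary:

```java
import java.lang.ref.WeakReference;

public class WeakReferenceDemo {
    public static void main(String[] args) {
        Object value = new Object();
        WeakReference<Object> weak = new WeakReference<>(value);

        System.out.println(weak.get());  // the object is still strongly reachable

        value = null;                    // drop the only strong reference
        System.gc();                     // request a GC (not guaranteed to run)

        // Once only the weak reference remains, the next GC clears it.
        System.out.println(weak.get());  // typically null after the GC above
    }
}
```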
Given their characteristics, soft and weak references are both suitable for implementing simple cache mechanisms that hold non-essential cached data: when memory is sufficient, they can slightly improve the program's efficiency, and when memory is tight they are reclaimed instead of causing an OOM.
4.4 Virtual Reference Type
A virtual reference, also known in some places as a phantom reference, is an object wrapped with the java.lang.ref.PhantomReference type; unlike the other reference types, a phantom reference must be used together with a ReferenceQueue. A phantom reference does not affect the GC's decision to reclaim an object: if an object has only phantom references, the GC treats it as if it had no references at all and may reclaim it at any time. Its extra use is tracking garbage collection — because a phantom reference lets you observe when the object is reclaimed, resource-release operations and logging can be performed at that point.
When the GC mechanism is about to reclaim an object and finds it still has a phantom reference, it adds the phantom reference to the associated reference queue before reclaiming the object. The program can tell whether the referenced object is about to be reclaimed by checking whether the phantom reference has been enqueued, and can then take corresponding handling measures, much as one might otherwise do in a finalize method.
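A brief sketch of java.lang.ref.PhantomReference with its required ReferenceQueue; the timing of enqueueing depends on the GC, so the code below simply waits on the queue with a timeout:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomReferenceDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object value = new Object();
        PhantomReference<Object> phantom = new PhantomReference<>(value, queue);

        System.out.println(phantom.get());   // always null for phantom references

        value = null;                        // make the referent unreachable
        System.gc();

        // When the GC decides to reclaim the referent, the phantom reference is
        // enqueued; that is the signal to run cleanup / resource-release logic.
        Reference<?> enqueued = queue.remove(1000);  // wait up to 1 second
        System.out.println(enqueued == phantom);     // typically true after a GC
    }
}
```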
5. Java object summary
This concludes the introduction to Java objects, with a thorough analysis of their memory layout, allocation process, promotion, movement, access modes, and reference types. The next chapter will take a thorough look at Java's GC mechanism.