This article looks at what the objects we work with every day in development actually look like inside the Java virtual machine. It covers three aspects: how an object is laid out in memory, how it is created, and how it is accessed.

  1. Object memory structure
  2. Object creation
  3. Object access

Still having trouble finding an object? Fear not: by the end of this article you will have achieved object freedom in the virtual machine world.

1. Memory structure of the object

The in-memory structure of an object consists of three parts: Header, Instance Data, and Padding.

1.1 Object Headers

The object header holds two pieces of information: the Mark Word and a type pointer.

  • Mark Word: Stores the object's own runtime data, such as its hash code, GC generational age, lock state flag, the lock held by a thread, the biased thread ID, and the bias timestamp. This part is 32 bits long in a 32-bit VM and 64 bits long in a 64-bit VM. In a 32-bit VM, when the object is unlocked, the 32 bits of the Mark Word are divided into 25 bits for the hash code, 4 bits for the generational age, 1 bit fixed at 0 to indicate that the lock is not biased, and 2 bits for the lock flag. The storage structure of the lock state in the Mark Word is shown as follows:

In the Mark Word, different lock states correspond to different lock levels on the object. Knowing these lock types is very important when analyzing how the JVM uses locks to implement thread safety. The lock types themselves (biased locks, lightweight locks, heavyweight locks, and the spin locks associated with lightweight locking) are not discussed here.

  • Klass Pointer: A pointer to the object's type data. The virtual machine uses it to determine which class the object is an instance of. Note that virtual machine implementations differ here: the pointer does not necessarily point directly at the type data. It may instead be the address of a handle in a handle pool, and the handle in turn points to the type data. Part 3 covers this in detail under object access.

When the object is an array, the header must also record the array's length. The type data tells the virtual machine how much memory one element of that type needs, but not how many elements the array holds, so the length is stored in the header and used to calculate the memory the array requires (element size x array length).
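The 32-bit unlocked layout described above can be made concrete with a little bit arithmetic. The following is only a sketch: the class and method names are hypothetical, the fields are assumed to be packed in the order given in the text from the high bits down, and the lock flag value 01 for the unlocked state is an assumption borrowed from HotSpot's convention. A real VM keeps this word inside the object header, not in a Java int.

```java
final class MarkWordSketch {
    // Assumed 32-bit unlocked layout, high bits to low bits:
    // [ hash code: 25 | GC age: 4 | biased: 1 | lock flag: 2 ]
    static final int LOCK_BITS = 2;
    static final int BIASED_BITS = 1;
    static final int AGE_BITS = 4;
    static final int AGE_SHIFT = LOCK_BITS + BIASED_BITS;   // 3
    static final int HASH_SHIFT = AGE_SHIFT + AGE_BITS;     // 7

    /** Packs the four fields into one 32-bit word. */
    static int pack(int hash25, int age, int biased, int lock) {
        return (hash25 << HASH_SHIFT) | (age << AGE_SHIFT) | (biased << LOCK_BITS) | lock;
    }

    // Unsigned shifts so a hash with its top bit set is recovered correctly.
    static int hash(int mark)   { return mark >>> HASH_SHIFT; }
    static int age(int mark)    { return (mark >>> AGE_SHIFT) & 0xF; }
    static int biased(int mark) { return (mark >>> LOCK_BITS) & 0x1; }
    static int lock(int mark)   { return mark & 0x3; }

    public static void main(String[] args) {
        int mark = pack(0x1ABCDEF, 3, 0, 0b01);
        System.out.println("hash   = 0x" + Integer.toHexString(hash(mark)));
        System.out.println("age    = " + age(mark));
        System.out.println("biased = " + biased(mark));
        System.out.println("lock   = " + lock(mark));
    }
}
```

Four bits of age is also why the generational age in this layout can never exceed 15.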

1.2 Instance Data

Instance data is the information the object actually stores, that is, the fields of various types defined in the program code. Fields inherited from parent classes and fields defined in the subclass itself are all recorded. The storage order of this part is influenced by the virtual machine's allocation policy parameter (FieldsAllocationStyle) and by the order in which the fields are defined in the Java source code. The following principles generally hold: fields of the parent class are usually placed before fields of the child class, and fields of the same width are allocated together.

1.3 Alignment Padding

Alignment padding serves only as a placeholder and is not always present. The virtual machine's automatic memory management system requires that an object's start address be a multiple of 8 bytes, which means the size of every object must also be a multiple of 8 bytes. The object header is always a multiple of 8 bytes (1x or 2x), so whenever the instance data does not end on an 8-byte boundary, padding is added to make up the difference.
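The 8-byte rule is simple rounding arithmetic. The sketch below works it through under stated assumptions: an 8-byte header (a 4-byte Mark Word plus a 4-byte type pointer, as on a 32-bit VM) and a 4-byte array length field. The class and method names are hypothetical, and real header sizes depend on the VM and its settings.

```java
final class ObjectSizeSketch {
    static final long ALIGNMENT = 8;

    /** Rounds size up to the next multiple of 8 bytes. */
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Number of padding bytes needed after the instance data. */
    static long padding(long size) {
        return alignUp(size) - size;
    }

    /** Array size: header + length field + (element size x array length), aligned. */
    static long arraySize(long headerSize, long elementSize, long length) {
        long lengthField = 4; // assumed size of the length slot in the header
        return alignUp(headerSize + lengthField + elementSize * length);
    }

    public static void main(String[] args) {
        long header = 8;     // assumed: 4-byte Mark Word + 4-byte type pointer
        long fields = 4 + 1; // one int and one byte
        long raw = header + fields;
        System.out.println("raw size     = " + raw);           // 13
        System.out.println("aligned size = " + alignUp(raw));  // 16
        System.out.println("padding      = " + padding(raw));  // 3
        System.out.println("int[4] size  = " + arraySize(header, 4, 4)); // 32
    }
}
```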

2. Object creation

In Java programs, objects can come into being in several ways, such as cloning, deserialization, and direct creation, but the most fundamental is the new keyword. So what does the virtual machine do when it encounters new?

When the virtual machine encounters a new instruction, it first checks whether the instruction's operand can locate a symbolic reference to a class in the runtime constant pool. It then checks whether the class that the symbolic reference represents has already been loaded, resolved, and initialized. If not, the class loading process is performed first. Once the class-loading check passes, the virtual machine allocates memory for the new object. The size of the object is fully determined once the class is loaded, so a block of that size is carved out of the Java heap.

Object memory allocation has to address two problems: how the memory is divided up, and how to keep allocation safe when several threads allocate at once.

1. Memory can be divided in two ways: bump the pointer and free list

  • 1.1 Bump the Pointer: Requires the Java heap to be perfectly regular. All used memory sits on one side and all free memory on the other, with a pointer between them marking the dividing point. Allocating memory simply moves the pointer toward the free side by the size of the object. GC algorithms that leave the heap in this state include mark-compact and copying.
  • 1.2 Free List: Used when the Java heap is not regular and used and free memory are interleaved. The virtual machine maintains a list of free memory blocks and, on allocation, picks a block from the list that is large enough. GC algorithms that leave the heap in this state include mark-sweep.
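The two strategies can be sketched in a few lines each. This is an illustration only: addresses are plain long values, the class names are hypothetical, and the free list uses a simple first-fit search, which is an assumption rather than something the text specifies.

```java
import java.util.ArrayList;
import java.util.List;

final class AllocationSketch {
    /** Bump the pointer: used memory below top, free memory above it. */
    static final class BumpAllocator {
        private final long end;
        private long top; // the dividing point between used and free memory

        BumpAllocator(long start, long size) {
            this.top = start;
            this.end = start + size;
        }

        /** Returns the address of the new block, or -1 if the region is full. */
        long allocate(long size) {
            if (top + size > end) return -1;
            long address = top;
            top += size; // "bump" the pointer past the new object
            return address;
        }
    }

    /** Free list: the allocator tracks scattered free blocks (first fit here). */
    static final class FreeListAllocator {
        private static final class Block {
            long start;
            long size;
            Block(long start, long size) { this.start = start; this.size = size; }
        }

        private final List<Block> free = new ArrayList<>();

        void addFreeBlock(long start, long size) {
            free.add(new Block(start, size));
        }

        /** Returns the address of the new block, or -1 if no free block is big enough. */
        long allocate(long size) {
            for (int i = 0; i < free.size(); i++) {
                Block block = free.get(i);
                if (block.size >= size) {
                    long address = block.start;
                    block.start += size; // shrink the block from the front
                    block.size -= size;
                    if (block.size == 0) free.remove(i); // fully consumed
                    return address;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(0, 32);
        System.out.println(bump.allocate(16) + ", " + bump.allocate(16) + ", " + bump.allocate(8));

        FreeListAllocator list = new FreeListAllocator();
        list.addFreeBlock(0, 8);
        list.addFreeBlock(32, 24);
        System.out.println(list.allocate(16) + ", " + list.allocate(8));
    }
}
```

Bump allocation is a single comparison and addition, while the free list has to search, which is one reason compacting collectors make allocation cheap.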

2. Thread safety of memory allocation

Object instances are allocated on the Java heap, which, as described in the earlier article on Java heap space, is an area shared by all threads. So how can object memory be allocated safely on the Java heap when multiple threads create objects at the same time? There are two main approaches: CAS with retry, and thread-local allocation buffers.

  • 2.1 CAS with retry: The virtual machine synchronizes the allocation itself, using CAS to make the pointer update atomic. If the update fails because another thread got there first, the allocation is retried.
  • 2.2 Thread Local Allocation Buffer (TLAB): Each thread is given a small block of heap memory in advance and allocates objects inside its own block, so threads do not contend with each other. Synchronization is needed only when a TLAB is used up and a new one has to be allocated. The -XX:+/-UseTLAB parameter controls whether the virtual machine uses TLABs.
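Both approaches can be sketched on top of a bump pointer. The code below is a simplified model, not HotSpot's implementation: the class names are hypothetical, addresses are plain long values, and the TLAB refill reuses the CAS path as its one synchronized step instead of taking a lock.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ConcurrentAllocationSketch {
    /** Shared region: the bump pointer is advanced with CAS and retried on failure. */
    static final class SharedEden {
        private final AtomicLong top;
        private final long end;

        SharedEden(long start, long size) {
            this.top = new AtomicLong(start);
            this.end = start + size;
        }

        long allocate(long size) {
            while (true) {
                long oldTop = top.get();
                long newTop = oldTop + size;
                if (newTop > end) return -1; // region exhausted
                if (top.compareAndSet(oldTop, newTop)) return oldTop;
                // Another thread moved the pointer first: loop and try again.
            }
        }

        long top() { return top.get(); }
    }

    /** Thread-local buffer: a private chunk carved out of the shared region. */
    static final class Tlab {
        private final SharedEden eden;
        private final long chunkSize;
        private long top; // touched only by the owning thread, so no CAS needed
        private long end;

        Tlab(SharedEden eden, long chunkSize) {
            this.eden = eden;
            this.chunkSize = chunkSize;
        }

        long allocate(long size) {
            if (size > chunkSize) return eden.allocate(size); // too big for a buffer
            if (top + size > end) {                     // buffer missing or used up
                long chunk = eden.allocate(chunkSize);  // the only synchronized step
                if (chunk < 0) return -1;
                top = chunk;
                end = chunk + chunkSize;
            }
            long address = top;
            top += size;
            return address;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedEden eden = new SharedEden(0, 1 << 20);
        Runnable worker = () -> {
            Tlab tlab = new Tlab(eden, 1024);
            for (int i = 0; i < 1000; i++) tlab.allocate(16);
        };
        Thread t1 = new Thread(worker);
        Thread t2 = new Thread(worker);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("shared pointer ended at " + eden.top());
    }
}
```

With a buffer, most allocations touch only thread-private fields, and the shared pointer is contended once per chunk instead of once per object.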

After allocating the memory, the virtual machine initializes the allocated space (excluding the object header) to zero. This step guarantees that instance fields can be used in Java code without being assigned an initial value: they simply read as the zero value of their type. Next, the virtual machine sets up the object header: which class the object is an instance of (how to find the class metadata), the object's hash code, its GC generational age, and so on. After that comes the final step, the initialization of the object instance, which is the part written in the constructor in Java code. When all of this is done, the creation of the object is complete.
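The order of these steps can be observed from plain Java. The sketch below is only a demonstration: the class names are hypothetical, and it deliberately calls an overridable method from a superclass constructor (not something to do in production code) so that the subclass fields can be read after the memory has been zeroed but before the subclass constructor body has run.

```java
final class CreationOrderSketch {
    static class Base {
        final String snapshot;

        Base() {
            // Runs before the subclass constructor body, so the subclass
            // fields still hold the zero values written at allocation time.
            snapshot = describe();
        }

        String describe() { return ""; }
    }

    static final class Obj extends Base {
        int a;
        float b;
        Object ref;

        Obj() {
            // The implicit super() call has already finished at this point.
            a = 10;
            b = 1.0f;
            ref = "set";
        }

        @Override
        String describe() {
            return "a=" + a + ", b=" + b + ", ref=" + ref;
        }
    }

    public static void main(String[] args) {
        Obj obj = new Obj();
        System.out.println("before constructor body: " + obj.snapshot);   // a=0, b=0.0, ref=null
        System.out.println("after constructor body:  " + obj.describe()); // a=10, b=1.0, ref=set
    }
}
```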

3. Object access

Once an object is created, it lives in the Java heap. How does a Java method access an object instance in the heap? From the runtime data areas article earlier in this JVM series, we know that each Java thread has its own private virtual machine stack, and each method executing on that stack has its own stack frame. Besides values of the eight Java primitive types, a stack frame also stores values of the returnAddress and reference types. A reference value is an address pointer: it records address information related to an object in the Java heap. This address information comes in two kinds: the address of an indirect pointer (also known as a handle) and the address of a direct pointer. Which one is used depends on the virtual machine implementation.

3.1 Handle pointer Access

With handle access, each handle in the handle pool contains a pointer to the object's instance data and a pointer to the object's type data. The reference stores the address of the object's handle. Handle access is shown below:

Features:

  1. The reference stores a stable handle address. When an object is moved, only the instance data pointer inside the handle needs to be changed. The address in the reference does not change.
  2. Access is slower than with direct pointers, because reaching the real data takes two lookups.
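A toy model makes both features visible. In the sketch below the heap is an int array, a reference is an index into a handle pool, and all names are hypothetical. It models only the indirection, not how any real VM stores handles.

```java
final class HandleAccessSketch {
    /** One handle: where the instance data lives now, plus the object's type data. */
    static final class Handle {
        int instanceAddress;
        final String typeData;

        Handle(int instanceAddress, String typeData) {
            this.instanceAddress = instanceAddress;
            this.typeData = typeData;
        }
    }

    private final int[] heap = new int[16];            // simulated instance data
    private final Handle[] handlePool = new Handle[4]; // simulated handle pool
    private int nextHandle = 0;

    /** Stores value at address and returns a reference, which is a handle index. */
    int newObject(int address, int value, String typeData) {
        heap[address] = value;
        handlePool[nextHandle] = new Handle(address, typeData);
        return nextHandle++;
    }

    int read(int reference) {
        Handle handle = handlePool[reference]; // lookup 1: reference -> handle
        return heap[handle.instanceAddress];   // lookup 2: handle -> instance data
    }

    String typeOf(int reference) {
        return handlePool[reference].typeData;
    }

    /** Moves the object, as a GC might; only the handle is updated. */
    void move(int reference, int newAddress) {
        Handle handle = handlePool[reference];
        heap[newAddress] = heap[handle.instanceAddress];
        heap[handle.instanceAddress] = 0;
        handle.instanceAddress = newAddress;
    }

    public static void main(String[] args) {
        HandleAccessSketch vm = new HandleAccessSketch();
        int objA = vm.newObject(3, 10, "Obj");
        vm.move(objA, 9); // objA itself is untouched
        System.out.println(vm.typeOf(objA) + " value = " + vm.read(objA));
    }
}
```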

3.2 Direct pointer access

With direct pointer access, the reference directly holds the address of the object's instance data in the Java heap. Direct pointer access is shown in the figure below:

Features:

  1. Access is faster, because reaching the instance data takes only one lookup.
  2. When the GC moves an object, the address held in every reference to that object has to be updated.
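The same kind of toy model shows the trade-off for direct pointers. Here a reference is simply the object's address in a simulated int-array heap. All names are hypothetical, and a real collector finds the references to fix by scanning roots and objects, not by being handed an array of them.

```java
final class DirectAccessSketch {
    private final int[] heap = new int[16]; // simulated instance data

    /** Stores value at address; the reference returned is the address itself. */
    int newObject(int address, int value) {
        heap[address] = value;
        return address;
    }

    int read(int reference) {
        return heap[reference]; // single lookup: reference -> instance data
    }

    /**
     * Moves the object, as a GC might, and rewrites every reference that
     * pointed at the old address.
     */
    void move(int oldAddress, int newAddress, int[] references) {
        heap[newAddress] = heap[oldAddress];
        heap[oldAddress] = 0;
        for (int i = 0; i < references.length; i++) {
            if (references[i] == oldAddress) references[i] = newAddress;
        }
    }

    public static void main(String[] args) {
        DirectAccessSketch vm = new DirectAccessSketch();
        int objA = vm.newObject(3, 10);
        int[] stackSlots = { objA, objA }; // two variables referring to the same object
        vm.move(3, 9, stackSlots);
        System.out.println("slots now hold " + stackSlots[0] + " and " + stackSlots[1]
                + ", value = " + vm.read(stackSlots[0]));
    }
}
```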

3.3 Direct pointer access demo

3.3.1 Direct access overview

The HotSpot virtual machine uses direct pointer access. The memory relationship of Java objects across the JVM data areas is shown in the following figure:

3.3.2 Demo source code and decompile bytecode

Demo source JvmTest.java

    package com.jvm.test;

    public class JvmTest {
        public static void main(String[] args) {
            Obj objA = new Obj();
            objA.a = 10;
            objA.b = 1.0f;
            int result = objA.testAB();
            System.out.println("result = " + result);
        }

        private static class Obj {
            private int a;
            private float b;

            public int testAB() {
                return (int) (a + b);
            }
        }
    }

Running the javap -verbose JvmTest.class command disassembles the JvmTest.class file into the following bytecode:

    public class com.jvm.test.JvmTest
      minor version: 0
      major version: 52
      flags: ACC_PUBLIC, ACC_SUPER
    Constant pool:
       #2 = Class              #29            // com/jvm/test/JvmTest$Obj
       #3 = Methodref          #2.#30         // com/jvm/test/JvmTest$Obj."<init>":(Lcom/jvm/test/JvmTest$1;)V
       #4 = Methodref          #2.#31         // com/jvm/test/JvmTest$Obj.access$102:(Lcom/jvm/test/JvmTest$Obj;I)I
       #5 = Methodref          #2.#32         // com/jvm/test/JvmTest$Obj.access$202:(Lcom/jvm/test/JvmTest$Obj;F)F
       #6 = Methodref          #2.#33         // com/jvm/test/JvmTest$Obj.testAB:()I
       ...
      #28 = NameAndType        #20:#21        // "<init>":()V
      #29 = Utf8               com/jvm/test/JvmTest$Obj
      #30 = NameAndType        #20:#46        // "<init>":(Lcom/jvm/test/JvmTest$1;)V
      #31 = NameAndType        #47:#48        // access$102:(Lcom/jvm/test/JvmTest$Obj;I)I
      #32 = NameAndType        #49:#50        // access$202:(Lcom/jvm/test/JvmTest$Obj;F)F
      #33 = NameAndType        #51:#52        // testAB:()I
    {
      public com.jvm.test.JvmTest();
        descriptor: ()V
        flags: ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 3: 0

      public static void main(java.lang.String[]);
        descriptor: ([Ljava/lang/String;)V
        flags: ACC_PUBLIC, ACC_STATIC
        Code:
          stack=3, locals=3, args_size=1
             0: new           #2                  // class com/jvm/test/JvmTest$Obj
             3: dup
             4: aconst_null
             5: invokespecial #3                  // Method com/jvm/test/JvmTest$Obj."<init>":(Lcom/jvm/test/JvmTest$1;)V
             8: astore_1
             9: aload_1
            10: bipush        10
            12: invokestatic  #4                  // Method com/jvm/test/JvmTest$Obj.access$102:(Lcom/jvm/test/JvmTest$Obj;I)I
            15: pop
            16: aload_1
            17: fconst_1
            18: invokestatic  #5                  // Method com/jvm/test/JvmTest$Obj.access$202:(Lcom/jvm/test/JvmTest$Obj;F)F
            21: pop
            22: aload_1
            23: invokevirtual #6                  // Method com/jvm/test/JvmTest$Obj.testAB:()I
            26: istore_2
            ...
            52: return
          LineNumberTable:
            line 6: 9
            line 7: 16
            line 8: 22
            line 9: 27
            line 10: 52
    }

3.3.3 Bytecode execution

The program counter records the address of the bytecode instruction currently being executed, the local variable table holds the method's local variables, and the operand stack holds the values that the current instruction operates on. The data areas change as follows:

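    1. Execute the instruction at offset 0 (new #2): memory for a JvmTest$Obj is allocated on the heap and a reference to it is pushed onto the operand stack.

    2. Execute the instruction at offset 3 (dup): the reference on top of the operand stack is duplicated.

    3. Execute the instruction at offset 5 (invokespecial #3): the Obj constructor is called, consuming one copy of the reference together with the null pushed by aconst_null at offset 4.

    4. Execute the instruction at offset 8 (astore_1): the remaining reference is popped and stored in slot 1 of the local variable table (objA).

    5. Execute the instruction at offset 9 (aload_1): the reference in slot 1 is pushed back onto the operand stack.

    6. Execute the instruction at offset 10 (bipush 10): the int constant 10 is pushed onto the operand stack.

    7. Execute the instruction at offset 12 (invokestatic #4): access$102 is called. It pops the reference and the value 10, assigns the field a, and pushes the int it returns.

    8. Execute the instruction at offset 15 (pop): the unused return value is discarded, leaving the operand stack empty.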
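The steps above can be replayed with a tiny hand-written model of a stack frame. This is a sketch under assumptions, not an interpreter: the class and method names are hypothetical, each bytecode instruction is mimicked by one Java statement, `new Obj()` stands in for both the allocation at offset 0 and the constructor call at offset 5, and access$102 is assumed to return the value it assigns (its descriptor only shows that it returns an int).

```java
import java.util.ArrayList;
import java.util.List;

final class StackFrameSketch {
    static final class Obj {
        int a;
        float b;
    }

    final Object[] locals = new Object[3];        // local variable table (locals=3)
    final List<Object> stack = new ArrayList<>(); // operand stack; a list so null can be pushed

    void push(Object value) { stack.add(value); }

    Object pop() { return stack.remove(stack.size() - 1); }

    Object peek() { return stack.get(stack.size() - 1); }

    /** Replays offsets 0 through 15 of main, one statement group per instruction. */
    void run() {
        push(new Obj());             //  0: new           push a reference to a fresh Obj
        push(peek());                //  3: dup           duplicate the reference
        push(null);                  //  4: aconst_null   argument for the synthetic constructor
        pop();                       //  5: invokespecial consumes the null argument...
        pop();                       //                   ...and one copy of the reference
        locals[1] = pop();           //  8: astore_1      objA = reference
        push(locals[1]);             //  9: aload_1       push objA
        push(10);                    // 10: bipush 10     push the constant 10
        int value = (Integer) pop(); // 12: invokestatic  access$102(objA, 10)
        Obj target = (Obj) pop();
        target.a = value;
        push(value);                 //                   assumed return value
        pop();                       // 15: pop           discard the return value
    }

    public static void main(String[] args) {
        StackFrameSketch frame = new StackFrameSketch();
        frame.run();
        Obj objA = (Obj) frame.locals[1];
        System.out.println("objA.a = " + objA.a + ", operand stack depth = " + frame.stack.size());
    }
}
```

The deepest point is three entries (reference, reference, null), which matches the stack=3 that javap reports for main.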

Conclusion: This article explained how Java objects are structured, created, and accessed in the virtual machine, which should give us a clear picture of the form objects take in memory. By working through the demo and following how the virtual machine executes its bytecode instructions, we also gained a deeper understanding of how objects are accessed and used. Mastering this knowledge is a great help in day-to-day development, and I hope this article has unlocked some of the mysteries of the virtual machine world for you.