In the HotSpot virtual machine, a Java object in memory is divided into three parts: the object header, the instance data, and alignment padding. (The Java Virtual Machine specification itself does not prescribe an object layout.)

Example of the size of a Hello object

// Mark Word: 8 bytes.
// Class pointer: 4 bytes when pointer compression is enabled (8 bytes otherwise).
// Instance data: a single field a of type long, 8 bytes in total.
// The object size must be a multiple of 8, so with pointer compression
// 8 + 4 + 8 = 20 bytes is padded by 4 bytes to 24 bytes.
public class Hello {
    private long a;
}
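The arithmetic in the comments above can be sketched as a small helper. This is only an illustration under the assumptions stated in this article (8-byte Mark Word, 4- or 8-byte class pointer, 8-byte alignment); the class and method names are made up for the example and are not part of any JVM API.

```java
// Hypothetical helper: estimates an object's size from the three parts described above.
final class ObjectSizeSketch {
    static final int MARK_WORD_BYTES = 8; // Mark Word on a 64-bit JVM
    static final int ALIGNMENT = 8;       // object sizes are multiples of 8 bytes

    // Rounds size up to the next multiple of alignment.
    static long alignUp(long size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Header (Mark Word + class pointer) + instance data, padded to 8 bytes.
    static long objectSize(long instanceDataBytes, boolean compressedClassPointer) {
        int classPointerBytes = compressedClassPointer ? 4 : 8;
        return alignUp(MARK_WORD_BYTES + classPointerBytes + instanceDataBytes, ALIGNMENT);
    }

    public static void main(String[] args) {
        // Hello has one long field (8 bytes of instance data).
        System.out.println(objectSize(8, true));  // 8 + 4 + 8 = 20, padded to 24
        System.out.println(objectSize(8, false)); // 8 + 8 + 8 = 24, no padding needed
    }
}
```

To check the real layout on a given JVM, a measuring tool such as OpenJDK's JOL is the better source; the sketch only mirrors the calculation in the text.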

Next, we go through the parts of an object one by one.

1. mark word

This is probably the most important and most complex part of the whole object. Many readers first meet the Mark Word, and the object header in general, when they learn the synchronized keyword, so we also introduce the Mark Word through the different lock states.

state              biased_lock   lock
Unlocked           0             01
Biased lock        1             01
Lightweight lock   0             00
Heavyweight lock   0             10
Mark Word layout on a 64-bit virtual machine:

|-----------------------------------------------------------------------------------------------|
| unused:25 | identity_hashcode:31 | unused:1 | age:4 | biased_lock:0 | 01 | Normal             |
|-----------------------------------------------------------------------------------------------|
| thread:54 | epoch:2              | unused:1 | age:4 | biased_lock:1 | 01 | Biased             |
|-----------------------------------------------------------------------------------------------|
| ptr_to_lock_record:62                                               | 00 | Lightweight Locked |
|-----------------------------------------------------------------------------------------------|
| ptr_to_heavyweight_monitor:62                                       | 10 | Heavyweight Locked |
|-----------------------------------------------------------------------------------------------|
|                                                                     | 11 | Marked for GC      |
|-----------------------------------------------------------------------------------------------|

biased_lock: whether the object is in the biased state.
age: the 4-bit GC age of the Java object.
identity_hashcode: the 31-bit identity hash code.
thread: the ID of the thread holding the biased lock.
epoch: the timestamp (epoch) of the biased lock.
ptr_to_lock_record: in the lightweight lock state, a pointer to the lock record in the thread's stack.
ptr_to_heavyweight_monitor: in the heavyweight lock state, a pointer to the object's monitor.
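To make the bit layout above concrete, here is a minimal sketch that pulls the fields out of a 64-bit value with shifts and masks. It assumes the layout exactly as listed in the table (lock bits lowest, then biased_lock, age, one unused bit, then the hash); the class name and the normalMark helper are invented for this example, and the code works on plain long values rather than on real object headers.

```java
// Hypothetical decoder for a 64-bit Mark Word value laid out as in the table above.
final class MarkWordSketch {
    static int lockBits(long mark)      { return (int) (mark & 0b11); }               // bits 0-1
    static int biasedLockBit(long mark) { return (int) ((mark >>> 2) & 0b1); }        // bit 2
    static int age(long mark)           { return (int) ((mark >>> 3) & 0xF); }        // bits 3-6
    static int identityHash(long mark)  { return (int) ((mark >>> 8) & 0x7FFFFFFF); } // bits 8-38

    // Maps the low bits to the state names used in the table.
    static String state(long mark) {
        switch (lockBits(mark)) {
            case 0b00: return "Lightweight Locked";
            case 0b10: return "Heavyweight Locked";
            case 0b11: return "Marked for GC";
            default:   return biasedLockBit(mark) == 1 ? "Biased" : "Normal"; // lock bits 01
        }
    }

    // Builds an unlocked ("Normal") Mark Word value from a hash and a GC age.
    static long normalMark(int hash, int age) {
        return ((long) (hash & 0x7FFFFFFF) << 8) | ((long) (age & 0xF) << 3) | 0b01;
    }

    public static void main(String[] args) {
        long mark = normalMark(0x12345678, 5);
        System.out.println(state(mark));                            // Normal
        System.out.println(age(mark));                              // 5
        System.out.println(Integer.toHexString(identityHash(mark))); // 12345678
    }
}
```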

Notice that in the unlocked state, the Mark Word stores the object's identity hash code.

  • When the object's hashCode() method (the default one, not a user-defined override) is called for the first time, the JVM generates the identity hash code and stores it in the Mark Word. On later calls to hashCode(), the JVM does not compute the value again but reads it directly from the Mark Word. Only in this way is the identity hash code guaranteed to be the same every time it is obtained.

  • In JDK 8, for example, the JVM by default derives the identity hash code from a random number, so the underlying JVM must compute an object's identity hash code only once.

  • When the lock is upgraded to a lightweight lock, the object's hash code and GC age are no longer held in the Mark Word itself, but they are not lost: when acquiring the lock, the JVM copies the object's Mark Word into the Lock Record in the thread's stack frame, and when the thread releases the lock, it copies the Mark Word back to the object.

  • The ObjectMonitor class also has a field that records the Mark Word of the unlocked state, which can hold the identity hash code. Therefore, a heavyweight lock can also coexist with the identity hash code.

  • For biased locks, however, the thread ID and epoch overwrite the bits that hold the identity hash code when a thread obtains the biased lock. If an object's hashCode() method has already been called once, the object can no longer be biased. If it could, the identity hash code in the Mark Word would be overwritten by the ID of the biased thread, and two calls to hashCode() on the same object would return different values.

  1. Once an object has computed its identity hash code, it can no longer enter the biased lock state.
  2. When an object is currently in the biased lock state and its identity hash code needs to be computed, the biased lock is revoked and the lock inflates to a lightweight or heavyweight lock.
  3. In the lightweight lock implementation, the copied Mark Word is stored in the lock record of the thread's stack frame. In the heavyweight lock implementation, the ObjectMonitor class has a field that records the Mark Word of the unlocked state, which can hold the identity hash code.
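The observable consequence of these rules is that the identity hash code never changes, whatever lock state the object passes through. The following sketch checks that from plain Java using System.identityHashCode; it does not inspect the Mark Word, and the class and method names are made up for the example.

```java
// Hypothetical check: the identity hash code stays the same before, during, and after locking.
final class IdentityHashSketch {
    static boolean stableAcrossLocking() {
        Object o = new Object();
        int before = System.identityHashCode(o); // first call: the JVM generates and stores the value
        int inside;
        synchronized (o) {                       // the object is locked here
            inside = System.identityHashCode(o);
        }
        int after = System.identityHashCode(o);  // lock released again
        return before == inside && inside == after;
    }

    public static void main(String[] args) {
        System.out.println(stableAcrossLocking()); // true
    }
}
```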

2. class pointer

A pointer to the class metadata in the method area. The virtual machine uses it to determine which class the object is an instance of.

Pointer compression is supported on 64-bit JVMs:

  • When pointer compression is not enabled, the class pointer occupies 8 bytes (64 bits)
  • With pointer compression enabled, the class pointer occupies 4 bytes (32 bits)

Without compression:

  • Each object carries an additional 4 bytes, which requires more heap space and increases GC overhead
  • Because objects are larger on a 64-bit JVM, fewer oops fit in the CPU cache, which reduces CPU cache efficiency

Pointer compression principle:

  • On a 32-bit JVM, as shown in the first figure, the class pointer is 4 bytes, that is, 4 × 8 = 32 bits, which can express at most 2^32 addresses. Since the smallest unit the CPU addresses is a byte, that is 2^32 bytes = 4 GB.
  • Once the vast majority of servers had more than 4 GB of memory, 32-bit JVMs were no longer sufficient, and the era of 64-bit JVMs began.
  • Back to pointers: on a 64-bit JVM a pointer is 64 bits, or 8 bytes, so how can it be compressed into 4 bytes? If a byte were still the smallest unit of addressing, a 4-byte pointer could cover at most 4 GB.
  • The JVM designers therefore took advantage of Java's 8-byte alignment padding. Because every object starts on an 8-byte boundary, a step of 1 in the compressed pointer no longer stands for 1 byte but for 8 bytes, so the memory that can be addressed becomes 2^32 × 8 bytes = 4 GB × 8 = 32 GB, enough for the memory size of most servers.
  • Therefore, when the heap is larger than 32 GB, the parameter for enabling pointer compression no longer takes effect.
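The principle above comes down to a shift by 3 bits (log2 of the 8-byte alignment). Here is a minimal sketch of that encoding, assuming a hypothetical heapBase parameter for where the addressable region starts; the names are invented for the example, and this is not how you would read or write real compressed pointers from Java code.

```java
// Hypothetical model of a 32-bit compressed pointer scaled by the 8-byte object alignment.
final class CompressedPointerSketch {
    static final int SHIFT = 3; // log2(8): one step of the compressed value = 8 bytes

    // Compresses an 8-byte-aligned address into 32 bits, relative to heapBase.
    static int encode(long heapBase, long address) {
        return (int) ((address - heapBase) >>> SHIFT);
    }

    // Expands the 32-bit value back to a full 64-bit address.
    static long decode(long heapBase, int compressed) {
        return heapBase + ((compressed & 0xFFFFFFFFL) << SHIFT); // treat the int as unsigned
    }

    // 2^32 distinct values, each standing for 8 bytes.
    static long maxAddressableBytes() {
        return (1L << 32) << SHIFT;
    }

    public static void main(String[] args) {
        System.out.println(maxAddressableBytes() / (1L << 30)); // 32 (GB)
        long base = 0x100000000L;                               // assumed base address
        long address = base + 24;                               // some 8-byte-aligned address
        System.out.println(decode(base, encode(base, address)) == address); // true
    }
}
```

An address that is not a multiple of 8 would lose its low bits in encode, which is why the scheme depends on alignment padding.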

3. instance data

  • The instance data is the information the object actually stores, that is, the contents of the fields of various types defined in the program code. Fields inherited from a parent class and fields defined in the subclass both need to be recorded.
  • The storage order of this part is affected by the VM's allocation policy parameter (FieldsAllocationStyle) and by the order in which the fields are defined in the Java source code.
  • The default allocation policy of the HotSpot virtual machine is longs/doubles, ints, shorts/chars, bytes/booleans, and then oops (Ordinary Object Pointers). Fields of the same width are always allocated together. Subject to this rule, fields defined in the parent class appear before those of the subclass. If the CompactFields parameter is true (the default), narrower fields of the subclass may also be inserted into gaps between the parent class's fields.
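As a rough illustration of the default ordering, the sketch below sorts a list of declared field types into the groups named above while keeping the declaration order inside each group. It is a simplified model only: it ignores inheritance and CompactFields, it assumes that float is grouped with int because they have the same width, and the class and method names are hypothetical.

```java
// Hypothetical model of the default field ordering: wider primitives first, oops last.
final class FieldLayoutSketch {
    // Lower rank is placed earlier in the instance data.
    static int rank(String type) {
        switch (type) {
            case "long": case "double":  return 0;
            case "int": case "float":    return 1; // float grouped with int (same width, assumption)
            case "short": case "char":   return 2;
            case "byte": case "boolean": return 3;
            default:                     return 4; // any reference type (oop)
        }
    }

    // Stable insertion sort by rank, so fields of the same group keep their source order.
    static String[] order(String[] declaredTypes) {
        String[] out = declaredTypes.clone();
        for (int i = 1; i < out.length; i++) {
            String current = out[i];
            int j = i - 1;
            while (j >= 0 && rank(out[j]) > rank(current)) {
                out[j + 1] = out[j];
                j--;
            }
            out[j + 1] = current;
        }
        return out;
    }

    public static void main(String[] args) {
        String[] declared = {"byte", "String", "long", "int", "char"};
        System.out.println(String.join(",", order(declared))); // long,int,char,byte,String
    }
}
```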