Preface
The previous article described how Java classes and objects are represented inside the JVM. If you have not read it, you can click here:
This article analyzes Java objects in detail. Building on the previous article, it explains the memory layout of Java objects and some of the underlying mechanisms, which I believe will be of great help for JVM tuning later on.
Object memory layout
As we mentioned in the previous article, objects are described in the JVM by an Oop. As a reminder, an Oop consists of the object header (_mark, _metadata) and the instance data area. The _metadata field in the object header holds a pointer to the metadata of the class, as shown in the following figure:
The underlying memory layout of objects that we are discussing today comes directly from this diagram.
Those of you who have studied object composition know that an object is made up of three parts: object header, instance data, and alignment padding. The object header and the instance data correspond to the two main parts of an Oop, while alignment padding is a part that exists only logically.
-
Object header
Let's take a closer look at each of these parts, starting with the object header:
The object header is divided into the MarkWord and the type pointer. The MarkWord is the _mark field of the Oop. It stores the object's own runtime data, such as the hash code, GC generational age, lock status flags, the thread holding the lock, the biased thread ID, and the bias timestamp.
This is a memory layout of the object header that I found online (64-bit operating system, no pointer compression):
The object header takes up 128 bits (16 bytes): 8 bytes for the MarkWord and 8 bytes for the Klass pointer. The MarkWord stores basic information about the object. The GC generational age, for example, lets the JVM decide whether the object should be promoted to the old generation. The lock status flags tell the JVM which level of locking currently guards the object, so that synchronized operations can be optimized. The other fields are probably familiar, so I will not list them one by one here. Object headers will come up again later under the topic of concurrency.
The other 8 bytes of the object header are the Klass pointer, the type pointer mentioned in the Oop model in the previous article. It points to a Klass object, which is used at run time to obtain the metadata of the class the object belongs to.
-
Instance data
What is instance data? As the name implies, it is the fields of the object. More precisely, the non-static fields of a class become the instance data once an object is created, and the size of the instance data area is the sum of the space actually occupied by those fields. Take the following class:
public class Test {
    private int a;      // 4 bytes
    private double b;   // 8 bytes
    private boolean c;  // 1 byte
}
After new Test(), the instance data area of this object takes up 4 + 8 + 1 = 13 bytes, and so on.
In Java, basic data types have sizes:
boolean — 1B
byte — 1B
short — 2B
char — 2B
int — 4B
float — 4B
double — 8B
long — 8B
In addition to the eight basic data types listed above, a class can also contain fields of reference types. How is that part calculated?
There are several cases to distinguish. We have not covered pointer compression yet, so for now just note the following:
On a 32-bit machine, a reference takes up 4 bytes.
On a 64-bit machine, a reference takes up 8 bytes.
On a 64-bit machine with pointer compression turned on, a reference takes up 4 bytes.
If the instance data area of an object contains a field of a reference type, the object only holds the address of the referenced object. Once you understand this, the three cases above make sense.
Why do references take up 4 bytes on a 32-bit machine but 8 bytes on a 64-bit machine?
This comes down to addressing. A memory address is stored so that the location can be found again later. On a 32-bit machine an address is made up of 32 bits, so it takes 4 bytes to record one; likewise, on a 64-bit machine it takes 8 bytes.
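The size rules above can be written down as a small lookup. The following is only a minimal sketch, not how the JVM itself computes field sizes: the class `FieldSizeSketch` and its methods are hypothetical names made up for illustration, it covers only the two 64-bit cases, and it simply sums the field sizes the way this article does, ignoring any field reordering or internal gaps a real JVM may introduce.

```java
import java.util.Map;

// Hypothetical helper: sums field sizes using the table from the article.
final class FieldSizeSketch {
    // Sizes in bytes of the eight basic data types, as listed above.
    private static final Map<Class<?>, Integer> PRIMITIVE_BYTES = Map.of(
            boolean.class, 1,
            byte.class, Byte.BYTES,
            short.class, Short.BYTES,
            char.class, Character.BYTES,
            int.class, Integer.BYTES,
            float.class, Float.BYTES,
            double.class, Double.BYTES,
            long.class, Long.BYTES);

    // On a 64-bit JVM a reference occupies 4 bytes with pointer compression, 8 bytes without.
    static int fieldBytes(Class<?> type, boolean compressedPointers) {
        Integer primitive = PRIMITIVE_BYTES.get(type);
        if (primitive != null) {
            return primitive;
        }
        return compressedPointers ? 4 : 8;
    }

    // Plain sum of the field sizes (assumption: no gaps between fields).
    static int instanceDataBytes(boolean compressedPointers, Class<?>... fieldTypes) {
        int total = 0;
        for (Class<?> type : fieldTypes) {
            total += fieldBytes(type, compressedPointers);
        }
        return total;
    }

    public static void main(String[] args) {
        // The Test class above: int + double + boolean
        System.out.println(instanceDataBytes(true, int.class, double.class, boolean.class));
        // A single reference field, without and with pointer compression
        System.out.println(instanceDataBytes(false, Object.class));
        System.out.println(instanceDataBytes(true, Object.class));
    }
}
```

Passing the field types explicitly keeps the sketch independent of any particular class; the pointer compression setting only matters for reference fields.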
-
Alignment padding
We said that an object is made up of three parts but have only covered two so far. The remaining part is alignment padding, a special part that exists only logically. Some background is needed here: objects in the JVM are 8-byte aligned, which means that the size of an object is always an integer multiple of 8 bytes. If the size of an object is not a multiple of 8, it is padded up to the next one. You may be wondering: if the content of an object is only 20 bytes, does 8-byte alignment make it a 24-byte object? Isn't that a waste of space? The answer is yes. A 20-byte object is padded to 24 bytes, and the extra 4 bytes are what we call alignment padding.
For example, leaving pointer compression aside, the object header takes up 16 bytes. If the instance data area holds a single int (4 bytes), the total is 20 bytes, and 8-byte alignment pads the object to 24 bytes.
So why is it designed this way? At first I had the same doubt. Such a design wastes a fair amount of space, since the padded bytes carry no meaning. From a designer's point of view, however, it is the easiest design to maintain. Without 8-byte alignment, objects of arbitrary sizes would be scattered irregularly through memory. That would complicate the designer's code: without knowing how large an object is, there is no simple way to read out exactly one whole object, and the uncertainty could lead to touching neighbouring objects and corrupting the system.
Of course, you might argue that this problem could be overcome in the design and is not reason enough to waste memory. That brings me to the second reason as I understand it: performance. If objects had arbitrary lengths, then reading a complete object would mean reading byte by byte until the end. With 8-byte alignment, an object can be read in 8-byte units, which is faster. This is a design that trades space for time.
If 8-byte alignment improves performance, why not align to 16 bytes? The answer is that it is not necessary, for two reasons. First, the object header is at most 16 bytes and the largest data type in the instance data area is 8 bytes. With 16-byte alignment, an 18-byte object would have to be padded to 32 bytes, whereas with 8-byte alignment it only needs to be padded to 24 bytes, which wastes less space. The second reason I will hold back for now; it is explained in detail later, in the section on pointer compression.
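The rounding described in this section can be sketched in a few lines. This is only an illustration of the arithmetic, assuming the alignment is a power of two; `AlignmentSketch`, `alignUp` and `padding` are hypothetical names, not JVM APIs.

```java
// Hypothetical helper illustrating alignment padding.
final class AlignmentSketch {
    // Rounds size up to the next multiple of alignment (alignment must be a power of two).
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    // Number of padding bytes that the rounding adds.
    static long padding(long size, long alignment) {
        return alignUp(size, alignment) - size;
    }

    public static void main(String[] args) {
        // A 20-byte object under 8-byte alignment, and the padding it needs
        System.out.println(alignUp(20, 8));
        System.out.println(padding(20, 8));
        // An 18-byte object under 8-byte and under 16-byte alignment
        System.out.println(alignUp(18, 8));
        System.out.println(alignUp(18, 16));
    }
}
```

Comparing the last two calls shows the space argument from the text: the same 18-byte object needs 6 bytes of padding with 8-byte alignment and 14 bytes with 16-byte alignment.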
-
Proof of object memory layout
There are two ways to prove this. One is to use code; the other is to use HSDB, mentioned in the previous article, to inspect the composition of an object directly. Since HSDB was covered last time, I will only show the first way here.
First, we need to add a Maven dependency:
<!-- https://mvnrepository.com/artifact/org.openjdk.jol/jol-core -->
<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.10</version>
</dependency>
With this dependency in place, we can print the memory layout of an object to the console as follows:
import org.openjdk.jol.info.ClassLayout;

public class Blog {
    public static void main(String[] args) {
        Blog blog = new Blog();
        // Print the header, fields, and padding of this instance
        System.out.println(ClassLayout.parseInstance(blog).toPrintable());
    }
}
First, the case where pointer compression is turned off: the alignment padding is 0 bytes and the object size is 16 bytes:
Then, the case where pointer compression is turned on: the alignment padding is 4 bytes and the object size remains 16 bytes:
Here is why both cases come to 16 bytes:
With pointer compression on: object size (16 bytes) = MarkWord (8 bytes) + KlassPointer (4 bytes) + array length (0 bytes) + instance data (0 bytes) + alignment padding (4 bytes)
With pointer compression off: object size (16 bytes) = MarkWord (8 bytes) + KlassPointer (8 bytes) + array length (0 bytes) + instance data (0 bytes) + alignment padding (0 bytes)
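The two runs above differ only in whether pointer compression is enabled. On HotSpot this is usually switched with the JVM options -XX:+UseCompressedOops and -XX:-UseCompressedOops; depending on the JDK version, the 4-byte Klass pointer may be governed separately by UseCompressedClassPointers. The following is a minimal sketch of how one might check the current settings at run time. It assumes a HotSpot-based JDK that provides `com.sun.management.HotSpotDiagnosticMXBean`; the class name `CompressedPointerCheck` is made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads HotSpot options related to pointer compression.
final class CompressedPointerCheck {
    // Returns the current value of a HotSpot option, or "unavailable" if this JVM does not expose it.
    static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown option name, or a JVM without this diagnostic bean
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + vmOption("UseCompressedOops"));
        System.out.println("UseCompressedClassPointers = " + vmOption("UseCompressedClassPointers"));
    }
}
```

Printing these values next to the JOL output makes it easier to tell which of the two layouts a given run corresponds to.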
How do I calculate the memory footprint of an object
In the first section we explained the layout of objects in memory in detail (object header, instance data, and alignment padding) and proved it. In this section we calculate the memory footprint of an object.
From the memory layout discussion, many of you probably already have a rough idea of how to calculate the memory footprint of an object. It is not difficult: it is simply the sum of the three areas. So far, however, we have only looked at simple cases, so this section focuses on the cases we have not considered yet, discussing and verifying each one.
-
There are only basic data types in the object
import org.openjdk.jol.info.ClassLayout;

public class Blog {
    private int a = 10;
    private long b = 20;
    private double c = 0.0;
    private float d = 0.0f;

    public static void main(String[] args) {
        Blog blog = new Blog();
        System.out.println(ClassLayout.parseInstance(blog).toPrintable());
    }
}
This is the simplest case apart from an empty object. Suppose all the fields of an object are of one or more of the eight basic Java types. How do you calculate the size of the object?
Here are the results:
In this case we simply add up object header + instance data + alignment padding. The object has four fields, int (4 bytes) + long (8 bytes) + double (8 bytes) + float (4 bytes), which gives 24 bytes of instance data. The object header is 12 bytes (pointer compression is turned on), so the total is 36 bytes. Since Java objects must be 8-byte aligned, 4 bytes of alignment padding are added, so the entire object is:
Object header (12 bytes)+ instance data (24 bytes)+ Alignment padding (4 bytes) = 40 bytes
-
Object has a reference type (with pointer compression turned off)
How do we calculate the size if the object contains a reference type? There are two cases: pointer compression turned on and pointer compression turned off. Let's look first at the case where pointer compression is turned off.
import java.util.HashMap;
import java.util.Map;
import org.openjdk.jol.info.ClassLayout;

public class Blog {
    Map<String, Object> objMap = new HashMap<>(16);

    public static void main(String[] args) {
        Blog blog = new Blog();
        System.out.println(ClassLayout.parseInstance(blog).toPrintable());
    }
}
Again, the results first:
The object’s instance data area contains a reference type attribute. As described in section 1, the object only holds a pointer to this attribute. With pointer compression turned off, this pointer takes up 8 bytes.
Object header (16 bytes with pointer compression turned off)+ instance data (8 bytes for 1 object pointer)+ Alignment padding (no padding required)=24 bytes
-
Object has a reference type (pointer compression enabled)
What if pointer compression is turned on?
If pointer compression is enabled, both the type pointer and the instance data area pointer occupy only 4 bytes, so the memory size is:
MarkWord(8B)+KlassPointer(4B)+ instance data area (4B)+ Alignment fill (0B) = 16B
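The cases for ordinary objects covered so far all follow the same formula, which can be sketched as below. This is a simplified model of the arithmetic in this article rather than the actual JVM layout algorithm; `ObjectSizeSketch` and its methods are hypothetical names.

```java
// Hypothetical calculator for ordinary (non-array) objects, following the article's formula.
final class ObjectSizeSketch {
    static final int MARK_WORD_BYTES = 8;
    static final int ALIGNMENT = 8;

    // MarkWord plus Klass pointer: 12 bytes with pointer compression, 16 bytes without.
    static int headerBytes(boolean compressedPointers) {
        return MARK_WORD_BYTES + (compressedPointers ? 4 : 8);
    }

    // Header + instance data, rounded up to the next multiple of 8 bytes.
    static int objectBytes(int instanceDataBytes, boolean compressedPointers) {
        int unpadded = headerBytes(compressedPointers) + instanceDataBytes;
        return (unpadded + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        // Empty object, with and without pointer compression
        System.out.println(objectBytes(0, true));
        System.out.println(objectBytes(0, false));
        // int + long + double + float = 24 bytes of instance data, compression on
        System.out.println(objectBytes(24, true));
        // One reference field: 8 bytes with compression off, 4 bytes with compression on
        System.out.println(objectBytes(8, false));
        System.out.println(objectBytes(4, true));
    }
}
```

The instance data size is passed in as a plain number so that the sketch stays independent of how the field sizes were obtained.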
-
Array types (with pointer compression turned off)
What if it is an array object? Having followed this article so far, you may already be reaching for the same formula to calculate the size of an array object. Arrays, however, are more complicated than ordinary objects, and some of what you see may surprise you.
Here we look at three cases:
import org.openjdk.jol.info.ClassLayout;

public class Blog {
    private int a = 10;
    private int b = 10;

    public static void main(String[] args) {
        // Array of objects that have no fields
        Object[] objArray = new Object[3];
        // Array of objects that have two int fields
        Blog[] blogArray = new Blog[3];
        // Array of a primitive type
        int[] intArray = new int[1];
        System.out.println(ClassLayout.parseInstance(blogArray).toPrintable());
        System.out.println(ClassLayout.parseInstance(objArray).toPrintable());
        System.out.println(ClassLayout.parseInstance(intArray).toPrintable());
    }
}
Again, the results first:
Let's start with the first case, an array of objects without fields:
Besides the MarkWord, the Klass pointer, the instance data, and the alignment padding, the printed layout contains a new area. The formula we used for ordinary objects no longer applies to arrays, because an array contains something we have not mentioned so far: the third part of the object header, the array length.
What exactly is the array length?
If the object is an array, its header contains an array length field that records the number of elements. The field is 32 bits, that is, 4 bytes. This also answers a basic question: what is the maximum length of an array in Java? Because the length is stored as a 4-byte signed int, an array can hold at most 2^31 - 1 elements.
Now look at the instance data area. The array stores three objects, and as in the case of an object that contains a reference type, only pointers to the objects are stored. Since pointer compression is turned off here, each pointer takes up 8 bytes, for a total of 24 bytes.
Back to the diagram: in the previous cases the alignment padding came after the instance data area, but here the alignment padding is the fourth part of the object header. The memory layout of an array object therefore looks like this:
We can test this idea in two other cases:
Object containing an array of two attributes of type int:
Array of primitive data types:
We can see that an array of objects with two int fields still stores only pointers to the objects, so each element still takes up 8 bytes. In an array of a primitive type, the elements themselves make up the instance data. An int is 4 bytes, so an array of length 3 would have 12 bytes of instance data, and so on. Since the array in our code has length 1, the size of the object here is:
MarkWord(8B)+KlassPointer(8B)+ array length (4B)+ first paragraph alignment (4B)+ instance data area (4B)+ second paragraph alignment (4B) = 32B
-
Array type (pointer compression enabled)
What happens if pointer compression is turned on? With the groundwork above, you can think it through for a moment; I will go straight to the result.
Array of a primitive type with length 1:
As described in the section on objects with a reference type (pointer compression enabled), the type pointer takes up 4 bytes when pointer compression is on. Since this is an array, the object header still has an extra field to hold the array length, but the alignment padding inside the object header has disappeared, so the size is:
MarkWord(8B)+KlassPointer(4B)+ array length (4B)+ instance data area (4B)+ Alignment fill (4B) = 24B
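The array cases with pointer compression off and on can be reproduced with a small calculator. This is a sketch of the formula used in this article only: it assumes that the element data starts at an 8-byte boundary after the header, which matches the layouts shown here but may differ on other JVM versions. `ArraySizeSketch` and its methods are hypothetical names.

```java
// Hypothetical calculator for array objects, following the article's formula.
final class ArraySizeSketch {
    // Rounds a size up to the next multiple of 8 bytes.
    static int alignUp(int size) {
        return (size + 7) / 8 * 8;
    }

    static int arrayBytes(int length, int elementBytes, boolean compressedPointers) {
        // MarkWord (8) + Klass pointer (4 or 8) + array length (4)
        int header = 8 + (compressedPointers ? 4 : 8) + 4;
        // First padding: the element data starts at an 8-byte boundary (assumption)
        int dataStart = alignUp(header);
        // Second padding: the whole object is rounded up to a multiple of 8
        return alignUp(dataStart + length * elementBytes);
    }

    public static void main(String[] args) {
        // int[1] with pointer compression off and on
        System.out.println(arrayBytes(1, 4, false));
        System.out.println(arrayBytes(1, 4, true));
        // Object[3] with pointer compression off (8-byte references)
        System.out.println(arrayBytes(3, 8, false));
    }
}
```

For reference arrays the element size is the pointer size, so the caller passes 8 with pointer compression off and 4 with it on.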
-
Only static variables exist
The last case assumes that the class contains only a static variable (with pointer compression turned on):
import java.util.HashMap;
import java.util.Map;
import org.openjdk.jol.info.ClassLayout;

public class Blog {
    private static Map<String, Object> mapObj = new HashMap<>(16);

    public static void main(String[] args) {
        Blog blog = new Blog();
        int[] intArray = new int[1];
        System.out.println(ClassLayout.parseInstance(blog).toPrintable());
    }
}
You can see that there is no instance data in the object. The reason is simple: as we said earlier, only the non-static fields of a class become instance data once the object is created. Static variables are not counted.
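The distinction between static and non-static fields can also be checked with plain reflection, without JOL. The sketch below lists only the fields that would count as instance data; the nested classes `OnlyStatic` and `Mixed` are made-up examples, and the helper looks at the fields a class declares itself, not at inherited ones.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: lists the fields of a class that belong to the instance data.
final class InstanceFieldSketch {
    // Example class with only a static variable, as in the case above.
    static class OnlyStatic {
        private static Map<String, Object> mapObj = new HashMap<>(16);
    }

    // Example class with one static and one non-static field.
    static class Mixed {
        private static int counter;
        private int a;
    }

    // Static fields belong to the class, not to the object, so they are skipped.
    static List<String> instanceFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers()) && !field.isSynthetic()) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(instanceFieldNames(OnlyStatic.class));
        System.out.println(instanceFieldNames(Mixed.class));
    }
}
```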
-
Conclusion
Calculating the size of an object is fairly simple. First check whether pointer compression is enabled, then whether it is an ordinary object or an array object. To summarize:
If it is an ordinary object, calculate: MarkWord (8B) + KlassPointer (4B with pointer compression, 8B without) + instance data + alignment padding.
If it is an array object, calculate: MarkWord (8B) + KlassPointer (4B with pointer compression, 8B without) + array length (4B) + first alignment padding + instance data + second alignment padding.
If the object contains reference type data, only a pointer to that data is stored, which is 4 bytes with pointer compression enabled and 8 bytes with it disabled.
If the object contains basic data types, the values themselves are stored, and their size is calculated according to which of the eight basic data types they are.
Pointer compression
We have dealt with pointer compression a lot in this article, so what exactly is it?
Put simply, pointer compression is a memory-saving technique that can also improve the efficiency of memory addressing. Pointers in objects take up 8 bytes (64 bits) on 64-bit systems, so how much memory can an 8-byte pointer address?
2^64 bytes = 18446744073709551616 bytes = 17179869184GB
Looking at memory first: it is practically impossible to reach this amount of memory with current hardware. In addition, 64-bit object references take up more heap space and leave less room for other data, which makes GC run more often. From the point of view of the CPU, larger object references mean that fewer objects fit in the CPU cache, so the CPU has to fetch objects from main memory more often, which reduces its efficiency. This is why pointer compression was introduced.
-
Principle of pointer compression
We all know that pointer compression takes an 8-byte pointer and compresses it down to 4 bytes. So how much memory can a 4-byte pointer address?
2^32 bytes = 4GB
That is not enough for most production environments on today's 64-bit machines, which need a larger addressing range. Yet we have just seen that after compression an object pointer is only 4 bytes. So what we need to understand is how the JVM extends the addressing range while pointer compression is on.
Note that a 32-bit operating system can address at most 4GB of memory, for which 4-byte pointers already suffice. 32-bit operating systems are therefore outside this discussion; only 64-bit operating systems are considered here.
First, let's look at the pattern in the memory addresses of objects under pointer compression:
Suppose we have three objects: A with 8 bytes, B with 16 bytes, and C with 24 bytes.
Then their memory addresses (assuming they start from 00000000) are:
A: 00000000 00000000 00000000 00000000 0x00000000
B: 00000000 00000000 00000000 00001000 0x00000008
C: 00000000 00000000 00000000 00011000 0x00000018
Because Java objects are 8-byte aligned, the lowest three bits of the memory address of every object are always 0. This is where the design of the JVM gets clever.
When the JVM stores the address of an object, it drops the three trailing zeros (shifts the address right by three bits); when it uses the address, it restores them (shifts it left by three bits).
Following this logic, suppose the memory address of an object has reached 8GB, which exceeds 4GB. Its address is: 00000010 00000000 00000000 00000000 00000000
This is clearly beyond the range of 32 bits (4 bytes). So the JVM shifts the address three bits to the right when storing it, which gives 01000000 00000000 00000000 00000000 and fits in 32 bits, and shifts it three bits to the left again when using it. We are then back at the original address, 00000010 00000000 00000000 00000000 00000000, and can find the object in memory and load it into a register for use.
Because of 8-byte alignment, the lowest three bits of a memory address are always zero. I was impressed by the ingenuity of this: the JVM turns a 32-bit object pointer into an effective 35-bit memory address, which can address up to 32GB of memory.
Of course, this only describes how the JVM addresses memory with pointer compression enabled; 64-bit operating systems themselves can address far more. If the JVM is given more than 32GB of memory, it automatically turns pointer compression off and uses 8-byte pointers for addressing.
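The shift-based encoding described above can be sketched as follows. This is a deliberately simplified model that assumes addresses start at zero; a real JVM may also add a heap base address, which is left out here. `PointerShiftSketch`, `compress` and `decompress` are hypothetical names.

```java
// Hypothetical model of pointer compression by shifting, assuming a zero heap base.
final class PointerShiftSketch {
    static final int SHIFT = 3; // log2 of the 8-byte object alignment

    // Stores an 8-byte-aligned address in 32 bits by dropping the three zero bits.
    static int compress(long address) {
        if ((address & 7) != 0 || address >>> (32 + SHIFT) != 0) {
            throw new IllegalArgumentException("address must be 8-byte aligned and below 32GB");
        }
        return (int) (address >>> SHIFT);
    }

    // Restores the address: treat the 32 bits as unsigned, then shift left by three.
    static long decompress(int narrow) {
        return Integer.toUnsignedLong(narrow) << SHIFT;
    }

    public static void main(String[] args) {
        long eightGb = 8L << 30;                  // the 8GB address from the text
        int narrow = compress(eightGb);
        System.out.println(Integer.toHexString(narrow));
        System.out.println(decompress(narrow) == eightGb);
        System.out.println(((1L << 35) >> 30) + "GB addressable with 35 bits");
    }
}
```

Treating the stored value as unsigned matters: the highest compressed values have the top bit set, which a Java int would otherwise interpret as negative.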
-
Answering the legacy question: Why not use 16-byte alignment
The second reason for not using 16-byte alignment, left open in the first section, is clear now that we understand pointer compression. With 8-byte alignment and pointer compression enabled, the addressable range is already 32GB. If more than 32GB is needed, pointer compression is turned off and the full 64-bit addressing capability becomes available.
Of course, if the JVM could not turn pointer compression off and object pointers were fixed at 4 bytes, then addressing more than 32GB would require 16-byte alignment. The principle is the same as above: with 16-byte alignment the lowest 4 bits of the memory address of an object are 0, so the address is shifted right by 4 bits when stored and left by 4 bits when read. A 32-bit pointer then gives 36 bits of addressing capability, which is 64GB.
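The relationship between alignment and addressable range is simple arithmetic, sketched below under the same simplified model as the rest of this section. `AddressRangeSketch` and its methods are hypothetical names.

```java
// Hypothetical helper: addressable range of a 32-bit compressed pointer for a given alignment.
final class AddressRangeSketch {
    // A 32-bit pointer distinguishes 2^32 objects; each one starts on a multiple of the alignment.
    static long addressableBytes(int alignment) {
        return (1L << 32) * alignment;
    }

    static long toGb(long bytes) {
        return bytes >> 30;
    }

    public static void main(String[] args) {
        System.out.println(toGb(addressableBytes(8)) + "GB with 8-byte alignment");
        System.out.println(toGb(addressableBytes(16)) + "GB with 16-byte alignment");
    }
}
```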
Conclusion
This article is the second in the JVM series. Building on the previous article, "Oop-Klass model", it takes Java objects apart: it describes the memory layout of Java objects case by case, verifies each case with code, and finally discusses in simple terms the use cases and implementation principles of pointer compression.
What the JVM looks like at a macro level, what regions it consists of, and how it schedules these objects at runtime will be explained in the next article.
Welcome to visit my personal Blog: Object’s Blog