Hello, everyone, I’m the teacher nicknamed Doug Tooth.

Recently I have been reading the HotSpot source and got a bit hooked. HotSpot is a treasure trove with plenty to explore. Every time I reach a new level and look back, I see it differently. When I get hooked, I share. Sharing earns me praise, and that makes me happy. ^_^

Recently, some students enrolled in my JVM class have asked me a lot about how properties (fields) are handled during class loading. The questions are not limited to the loading phase:

  1. How are properties stored during loading?
  2. What are the details of assigning initial values to properties during preparation?
  3. What are the details of assigning static properties during initialization?
  4. What are the details of assigning non-static properties when an object is created?
  5. What are the details of accessing properties?

There are too many details to cover in one article, so this will be a whole series. This is the first article in the series on oop properties. Let's get started…

Problem analysis

In the computer world, there is always more than one solution to a problem, but once you have to choose, only one is the most suitable. How many solutions you can come up with depends on how well you understand the problem, and how well you understand a problem is closely tied to your technical vision. Whether you can build what you have thought of comes down to your technical strength. This is true not just in the computer world, but everywhere!

Suppose you had to implement the OOP mechanism yourself. Leave the whole thing aside and focus on property inheritance: how might you implement it? Regular readers may wonder why I always ask this type of question. It is because studying the bottom layer is not the same as studying the application layer. You can study the bottom layer the way you study the application layer, but it will not work very well. When I study the bottom layer, the first thing I do is put myself in the designer’s shoes: think like a designer, understand the ideas, then read the code. That is what I mean every time I ask this question. Thinking like a designer means trying to understand; thinking like a learner means thinking critically.

Anyway, I can guess how you would do it, right?

To explain this code: an oop is how HotSpot represents every Java object, and a klass is how HotSpot represents every Java class. Since instance data has to be stored in the object, you clearly need a container, and a Map is clearly the most suitable one. If you implement it in Java, that really is the only way to do it. Java is an application-layer language; apart from Unsafe, which offers simple ways to manipulate memory, Java has no ability to manage memory directly. Java programs are written to focus on business logic. How well your code performs, how much memory it uses, and how secure it is largely depend on how good the JVM you chose is.
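The code this paragraph explains did not survive in this copy of the article. As a rough stand-in, here is a minimal Java sketch of the naive Map-based design described above. The names NaiveKlass and NaiveOop are hypothetical and invented for illustration; they are not HotSpot types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a class: it only records its name and field names.
final class NaiveKlass {
    final String name;
    final String[] fieldNames;

    NaiveKlass(String name, String... fieldNames) {
        this.name = name;
        this.fieldNames = fieldNames;
    }
}

// Hypothetical stand-in for an object: instance data lives in a Map.
final class NaiveOop {
    final NaiveKlass klass;
    private final Map<String, Object> fields = new HashMap<>();

    NaiveOop(NaiveKlass klass) {
        this.klass = klass;
        for (String f : klass.fieldNames) {
            fields.put(f, null); // every field costs a map entry, later a boxed value too
        }
    }

    void put(String field, Object value) {
        if (!fields.containsKey(field)) {
            throw new IllegalArgumentException("no such field: " + field);
        }
        fields.put(field, value);
    }

    Object get(String field) {
        if (!fields.containsKey(field)) {
            throw new IllegalArgumentException("no such field: " + field);
        }
        return fields.get(field);
    }

    public static void main(String[] args) {
        NaiveOop obj = new NaiveOop(new NaiveKlass("Test", "c1", "c2"));
        obj.put("c2", 'b');
        System.out.println(obj.get("c2"));
    }
}
```

Each two-byte char here ends up as a map entry, a key reference, and a boxed Character, which is the kind of overhead the next paragraph calls memory waste.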

If HotSpot implemented it that way, it would work too. It just would not be good enough, and you might hear disdain from other programming languages. What makes a good program? One that, within what is achievable today, strikes the best balance between time and space. Implemented this way, there would be a serious memory waste problem. I will not explain that here, since I covered it in a previous post.

So how does HotSpot do it? Memory weaving. That is, a block of pre-allocated memory is divided into slots sized according to property type, and each property is woven into its slot. My earlier article did not go into this; this article expands on it in detail.

Here are some questions we need to answer:

  1. How much memory should be pre-allocated?
  2. Properties can be bool, char, short, int, long, or oop. How are they woven so that memory is both saved and aligned?
  3. A property cannot be found directly through the oop. How is it located when accessed?
  4. Do objects created by memory weaving really waste no memory?

How much memory to allocate

It is like buying shoes for a boyfriend or girlfriend you have never met. Who knows what size they wear? Do you guess? Guess right and people say you have a keen eye; guess wrong and they say you do not care. It is hard, but let’s do it anyway.

So how do we do that? You need a picture in your head, right? Actually, two pictures. Which pictures? The object memory layout diagrams.

That answers the fourth question: memory is still wasted. The gray padding area is filled in for alignment, and it is wasted memory. But the waste is very small and acceptable; it is the best that can be done under present conditions.

The details of how much memory each area occupies are as follows:

  1. Mark Word: 4B on 32-bit machines, 8B on 64-bit machines. This article is about 64-bit machines.
  2. Klass pointer: 4B with pointer compression on, 8B with it off. Pointer compression is enabled by default. With pointer compression on, the valid data occupies 4B, but the area still occupies 8B in memory. This area is important for two reasons. First, it is related to the answer to question 3, which we will come to later. Second, whether pointer compression is on or off affects the memory layout diagram: compare the two diagrams and you will see that one has an extra padding area. More on that later.
  3. Array length: 4B for an array object. For a non-array object it is 0B, meaning it does not appear.
  4. Instance data: this is the area that matters most. More on that later.
  5. Alignment padding: by convention, every oop must be 8B aligned. If an oop is only 12B, for example a newly created object of 12B, its size is not divisible by 8, so 4B of zeros are appended to the end of the oop.
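The header sizes and the 8B alignment rule in the list above can be sketched in a few lines of Java. This is only an illustration of the arithmetic for a 64-bit machine, not HotSpot code; the class and method names are hypothetical. The klass pointer is counted by its valid bytes (4B when compressed), so the leftover bytes show up as padding.

```java
final class ObjectHeaderSketch {
    static final int OBJECT_ALIGNMENT = 8; // every oop is 8B aligned

    // Round a size up to the next multiple of 8.
    static int alignUp(int size) {
        return (size + OBJECT_ALIGNMENT - 1) & -OBJECT_ALIGNMENT;
    }

    // Valid header bytes on a 64-bit machine:
    // mark word + klass pointer (+ array length for arrays).
    static int headerBytes(boolean compressedKlass, boolean isArray) {
        int markWord = 8;
        int klassPointer = compressedKlass ? 4 : 8;
        int arrayLength = isArray ? 4 : 0;
        return markWord + klassPointer + arrayLength;
    }

    // Zero bytes appended at the end so the whole object is 8B aligned.
    static int paddingBytes(int unalignedSize) {
        return alignUp(unalignedSize) - unalignedSize;
    }

    public static void main(String[] args) {
        int size = headerBytes(true, false); // an object with no fields
        // prints "12B + 4B padding"
        System.out.println(size + "B + " + paddingBytes(size) + "B padding");
    }
}
```

The 12B case from item 5 falls out directly: 12 is not a multiple of 8, so 4B of padding brings the object to 16B.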

Take the most common case: a 64-bit machine, pointer compression enabled, a non-array object. If memory is to be allocated in advance, the only area whose size is still unknown is the instance data, and it is also the hardest to pin down. So what does HotSpot do? It counts the properties of each data type and then computes the total in one pass. In the code:

parse_fields parses the property information in the bytecode file. To make memory weaving possible, it must also count the number of properties of each type while parsing. The collected counts are stored in a FieldAllocationCount object. The calculation details are shown below:

Just so you know, the source comments describe Boolean, byte, and char as counted like a C++ byte, that is, as 1B. Read the comments to see which C++ types the other Java types map to. Static and non-static properties are counted separately. Why? Because they are stored in different places: static properties live in the oop of the corresponding Class object, and non-static properties live in the oop produced by new.

There is a small detail here: char is 2B in Java, so is it really fine to treat it as 1B? No. HotSpot handles this at the bottom layer; exactly how is covered later.
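To make the counting step concrete, here is a minimal Java sketch that tallies fields into size buckets, with static and non-static fields counted separately. It is not the FieldAllocationCount code; the class, enum, and method names are hypothetical. One assumption is made explicit: char is placed in the 2B bucket, matching its Java size and the short/char grouping used in the weaving order later in this article.

```java
import java.util.EnumMap;
import java.util.Map;

final class FieldCountSketch {
    // Size buckets; the names are hypothetical.
    enum Bucket { OOP, BYTE, SHORT, WORD, DOUBLE }

    private final Map<Bucket, Integer> staticCounts = new EnumMap<>(Bucket.class);
    private final Map<Bucket, Integer> instanceCounts = new EnumMap<>(Bucket.class);

    // Map the first character of a JVM field descriptor to a bucket.
    static Bucket bucketOf(String descriptor) {
        switch (descriptor.charAt(0)) {
            case 'Z': case 'B': return Bucket.BYTE;   // boolean, byte: 1B
            case 'C': case 'S': return Bucket.SHORT;  // assumption: char counted as 2B
            case 'I': case 'F': return Bucket.WORD;   // int, float: 4B
            case 'J': case 'D': return Bucket.DOUBLE; // long, double: 8B
            case 'L': case '[': return Bucket.OOP;    // object or array reference
            default: throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
    }

    // Record one parsed field in the static or non-static tally.
    void record(String descriptor, boolean isStatic) {
        Map<Bucket, Integer> counts = isStatic ? staticCounts : instanceCounts;
        counts.merge(bucketOf(descriptor), 1, Integer::sum);
    }

    int count(Bucket bucket, boolean isStatic) {
        return (isStatic ? staticCounts : instanceCounts).getOrDefault(bucket, 0);
    }

    public static void main(String[] args) {
        FieldCountSketch counts = new FieldCountSketch();
        counts.record("C", false);                  // char c1;
        counts.record("C", false);                  // char c2;
        counts.record("Ljava/lang/String;", true);  // static String name;
        System.out.println(counts.count(Bucket.SHORT, false));
    }
}
```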

After counting, you know how many bytes the object to be created will occupy, using the following pseudocode:

8 + 4 + oop * count + byte * count + ......
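The pseudocode can be expanded into a small Java sketch. This is an estimate under stated assumptions rather than HotSpot's calculation: it sums the header and every size bucket, then rounds up to 8B, and it ignores any alignment gaps inside the object. The class, method, and parameter names are hypothetical.

```java
final class InstanceSizeSketch {
    // Round up to the next multiple of 8.
    static int alignUp8(int size) {
        return (size + 7) & ~7;
    }

    // headerBytes: 12 with pointer compression on, 16 with it off (non-array).
    // oopSize:     4 with pointer compression on, 8 with it off.
    // The remaining parameters are non-static field counts per size bucket.
    static int instanceSize(int headerBytes, int oopSize,
                            int oops, int doubles, int words, int shorts, int bytes) {
        int data = oops * oopSize + doubles * 8 + words * 4 + shorts * 2 + bytes;
        return alignUp8(headerBytes + data);
    }

    public static void main(String[] args) {
        // 8 (mark word) + 4 (klass pointer) + one compressed oop + one int
        System.out.println(instanceSize(12, 4, 1, 0, 1, 0, 0));
    }
}
```

For one compressed oop and one int the sum is 12 + 4 + 4 = 20B, which rounds up to 24B.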

You cannot weave without a container. Now that we have the container, how are the properties woven in?

Weaving details

Again on a 64-bit machine. Let’s start with the details of weaving with pointer compression turned off; the case with it turned on is a little special.

HotSpot supports three weaving rules:

  1. allocation_style=0: properties are woven in descending order of size, with oops first. The weaving order is oop, long/double, int, short/char, byte. If the object size is not 8B aligned after all properties are woven, a padding area is added at the end. How many padding bytes? I will leave that to you smart readers.
  2. allocation_style=1: properties are still woven in descending order of size, but oops are woven last. Again, a size that is not 8B aligned requires padding.
  3. allocation_style=2: this rule places the oops of a subclass together with the oops of its parent class.
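The first two rules can be illustrated with a small Java sketch that computes where each group of properties starts. It assumes pointer compression is off (16B header, 8B oops) and a class with no parent; the class and method names are hypothetical, and rule 2 is left out because it needs parent-class information.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WeaveOrderSketch {
    // Returns group name -> starting offset, in weaving order.
    // Assumes pointer compression is off: header = 16B, oop = 8B.
    static Map<String, Integer> layout(int style, int oops, int doubles,
                                       int words, int shorts, int bytes) {
        if (style != 0 && style != 1) {
            throw new IllegalArgumentException("only styles 0 and 1 are modelled");
        }
        Map<String, Integer> starts = new LinkedHashMap<>();
        int offset = 16; // first byte after the header
        if (style == 0) {            // style 0: oops first
            starts.put("oop", offset);
            offset += oops * 8;
        }
        starts.put("long/double", offset);
        offset += doubles * 8;
        starts.put("int", offset);
        offset += words * 4;
        starts.put("short/char", offset);
        offset += shorts * 2;
        starts.put("byte", offset);
        offset += bytes;
        if (style == 1) {            // style 1: oops last
            offset = (offset + 7) & ~7; // an 8B oop must start on an 8B boundary
            starts.put("oop", offset);
            offset += oops * 8;
        }
        starts.put("end", (offset + 7) & ~7); // object size after tail padding
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(layout(0, 1, 1, 1, 1, 1));
        System.out.println(layout(1, 1, 1, 1, 1, 1));
    }
}
```

Because each group is at least as wide as the one after it, every group starts on a boundary suited to its own size without extra gaps, which is one way to see why the order runs from large to small.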

You may be wondering: why not weave from smallest to largest? It is not that HotSpot chooses not to; it really cannot. See for yourself.

Earlier, one sentence was marked in red: “With pointer compression enabled, the valid data is 4B, but this area still occupies 8B of memory.” This is the core difference between turning pointer compression on and off.

Whether pointer compression is on or off, this area eats 8B of memory. With pointer compression turned on, 4B of that is wasted. Can you put up with that? I can’t. HotSpot gives you a choice. With -XX:+/-CompactFields you can choose whether HotSpot weaves properties into the gap. It is enabled by default. You can test it with the following code
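The test code referred to here did not survive in this copy of the article. In its place, here is a simplified Java model of the gap-filling idea, not HotSpot's implementation. It assumes pointer compression is on, so valid header data ends at offset 12, and that the 4B gap (offsets 12 to 15) only arises when an 8B property must start at offset 16. The class and method names are hypothetical, and the case where an oop fills the gap is left out.

```java
final class GapFillSketch {
    // Returns how many of the 4 gap bytes (offsets 12..15) stay unused.
    static int unusedGapBytes(boolean compactFields, int doubles,
                              int words, int shorts, int bytes) {
        if (doubles == 0) {
            return 0; // no 8B property: smaller properties start right at offset 12
        }
        if (!compactFields) {
            return 4; // the gap is left empty
        }
        if (words > 0) {
            return 0; // one int fills the whole gap
        }
        int offset = 12;
        while (shorts > 0 && 16 - offset >= 2) { // then as many shorts/chars as fit
            offset += 2;
            shorts--;
        }
        while (bytes > 0 && 16 - offset >= 1) {  // then bytes/booleans
            offset += 1;
            bytes--;
        }
        return 16 - offset;
    }

    public static void main(String[] args) {
        // class with: long a; int b;
        System.out.println(unusedGapBytes(true, 1, 1, 0, 0));
        System.out.println(unusedGapBytes(false, 1, 1, 0, 0));
    }
}
```

In this model a class with one long and one int wastes nothing when compaction is on and 4B when it is off, which is the saving the flag is meant to deliver.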

HotSpot uses these three weaving rules to achieve both memory saving and memory alignment. The default is allocation_style=1.

How to access properties

This leaves one last question: how properties are accessed. My earlier understanding of this had a problem; reading the source today, I found it was not quite right. The access details look like this:

oop -> type pointer -> fields -> offset

Because an oop is just a chunk of memory, you cannot tell which part of it stores which property. So HotSpot’s approach is to follow the type pointer to the klass of this oop, which stores all the property information in an array. The property name plus signature is used to find all the information about the property being accessed, including the offset. This offset is not yet an offset within the object; it is an index. To actually find where the property sits in the oop, you take the index and compute the address from it. That is a little abstract, so let me give an example.

For example, a Java class has two chars, c1 and c2, woven in code order. If I want to access c2:

  1. Get the klass of the Test class through the oop
  2. Call findField with the property name + signature to get the complete information for c2, including the offset
  3. Then call char_offset_addr(offset) to compute the memory address of c2 in the oop
  4. Access the data of c2
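The four steps above can be modelled in Java with a ByteBuffer standing in for the oop's raw memory. This is a sketch under assumptions, not HotSpot code: Klass, Oop, findField, getChar, and putChar here are hypothetical, and the klass stores the byte offset directly rather than an index that is converted later.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

final class FieldAccessSketch {
    // Hypothetical klass: maps "name:signature" to a byte offset inside the oop.
    static final class Klass {
        private final Map<String, Integer> fieldOffsets = new HashMap<>();

        void addField(String name, String signature, int offset) {
            fieldOffsets.put(name + ":" + signature, offset);
        }

        int findField(String name, String signature) {
            Integer offset = fieldOffsets.get(name + ":" + signature);
            if (offset == null) {
                throw new IllegalArgumentException("no such field: " + name);
            }
            return offset;
        }
    }

    // Hypothetical oop: a raw block of bytes plus a pointer to its klass.
    static final class Oop {
        final Klass klass;
        final ByteBuffer memory;

        Oop(Klass klass, int sizeInBytes) {
            this.klass = klass;
            this.memory = ByteBuffer.allocate(sizeInBytes);
        }
    }

    static char getChar(Oop oop, String name) {
        int offset = oop.klass.findField(name, "C"); // steps 1 and 2: klass, then offset
        return oop.memory.getChar(offset);           // steps 3 and 4: address, then read
    }

    static void putChar(Oop oop, String name, char value) {
        oop.memory.putChar(oop.klass.findField(name, "C"), value);
    }

    public static void main(String[] args) {
        Klass test = new Klass();    // class Test { char c1; char c2; }
        test.addField("c1", "C", 12);
        test.addField("c2", "C", 14);
        Oop obj = new Oop(test, 16);
        putChar(obj, "c2", 'b');
        System.out.println(getChar(obj, "c2"));
    }
}
```

The object itself carries no names at all; everything needed to interpret its bytes comes from the klass, which is why the type pointer in the header matters for question 3.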

When is field.offset calculated

Conclusion

I am Teacher Ziya. I like to dig into the bottom layer and have studied Windows, the Linux kernel, and the JVM in depth. I like to share core knowledge. If you also like studying the bottom layer and core knowledge, you can follow my public account: hard nuclear teeth.