Last review

1. What is the implementation idea of reference counting algorithm and reachability algorithm?

2. What is the recycling effect of the method area? Can you give a brief explanation of the causes?

3. Give four simple examples of the conditions for being GC root

In this chapter the feed

The main content of this chapter is the implementation details of the garbage collection algorithm and Hotspot algorithm. Algorithms should be clear, the main focus of this section is on the implementation details of Hotspot algorithm.

Three common garbage collection algorithms

In terms of performance, these three common garbage collection algorithms are actually very easy to understand. I personally remember them through these three pictures. In fact, if you understand the extra long paragraphs of text, it will be confused. Take a look at the image from Inside the Home Java Virtual Machine

1. Mark – clear

The characteristics of: Simple implementation, unstable efficiency (depending on how many objects need to be marked), easy to cause space fragmentation, resulting in large objects cannot allocate space.

2. Mark-copy

The characteristics of: Split memory in half, move live objects, recycle garbage objects. The obvious problem is that space waste is relatively serious, but the implementation is simple, high efficiency (provided that most objects need to be recycled).

3. Mark and tidy

The characteristics of: Compared with The mark-clear algorithm, The mark-clean algorithm takes one more step to move The living object. In this way, The space waste caused by space fragmentation can be avoided, but it also increases The load of virtual machine and The time of “Stop The World”

Second, the implementation details of HotSpot algorithm

This is where we need to go in detail in this section, and understanding this section will give you an insight into the whole garbage collection process. Let’s cut to the chase

1. Enumeration of root nodes

As we mentioned in the previous chapter, Hotspot’s garbage collection algorithm uses reachability analysis to determine whether a reference to the root node is reachability. Although the idea is correct, our GC memory is so large, contains countless objects and references, and the relationship between them is so complicated that it would be inefficient to scan everything every time.

The current mainstream Java virtual machines use exact garbage collection, as opposed to exact conservative and semi-conservative garbage collection.

conservative

The JVM does not record the type of a piece of data at a memory location, and we cannot determine whether we are gc using a reference or another type. In this case, conservative gc is implemented. Microsoft’s JScript and early versions of VBScript also used conservative GC; So is Microsoft’s JVM.

Conservative implementation of the method is simple, the defect is clear, not accurately recycling has been the object of death, only as long as object, a pointer to the object can escape the GC, and because there may be a pointer to the object, cause objects move at the same time I also have to modify the pointer pointing to, it could not be achieved. So the object can’t move.

A conservative

Semi-conservative solves the problem of object movement by adding a layer of “handles” in the middle.

Accurate garbage collection

The type of data that the JVM records anywhere in memory to determine whether the data is a pointer or not.

There are three ways to achieve accurate garbage collection:

1. Tag the data itself and judge by obtaining the tag content in the data

2. Have the compiler generate special scan code for each method.

3. Maintain records between objects and types by maintaining a mapping table.

Our hotspot is implemented in the third way, the corresponding mapping table in hotspot we call OopMap.

There are generally two ways to use such a mapping table

1. Traverse the original mapping table each time, scanning each offset in the cycle; This usage is also called “interpretive” and is how the hotspot virtual machine is used.

2. Generate a custom scan code for each mapping table (imagine the loop of scanning the mapping table being expanded), and execute the generated scan code directly each time the mapping table is used; This usage is also called “compiled”.

Each variable and each object in the code has its own type. The code will divide the instruction set of the code into several sections according to SafePoint, which is the safety point to be discussed later. Each section of the instruction set will record an OopMap.

Each JIT-compiled method also records an OopMap at a number of locations, where references are placed on the stack and in registers when an instruction to that method is executed. The GC will then look up these OOPmaps as it scans the stack to know where the references are. These specific locations are mainly in:

1. The end of the loop

2. Before the method returns/after the call instruction of the method is called

3. Possible position of throwing exceptions

2. Safety points

As we said earlier, the instruction set of code is divided into several sections by safe points, so what are safe points?

Safe choice

The safe point is selected according to the criterion of “whether the program has the characteristics of long execution”, that is, some instructions can keep the program running for a long time, which may not be able to suspend immediately, and other threads have paused, which will cause new waiting problems.

Therefore, the choice of safe points is generally based on the reuse of instruction sequence, that is to say, method call, circular jump, exception jump and so on all belong to the reuse of instruction sequence, so only instructions with these functions will produce safe points.

A way to get to safety

Now that The safe point selection is done, The next question is, I’m going to GC, how do I make sure that all The threads reach The safe point and Stop at The same time, and that my root selection is done smoothly, since root selection will “Stop The World”. There are two options:

1. Preemptive interrupt

Preemptive interrupt does not require the execution code of the thread to cooperate actively. When garbage collection occurs, the system first interrupts all user threads. If it is found that the user thread is interrupted where it is not at the safe point, it will resume the execution of the thread and let it re-interrupt until it runs to the safe point. Few virtual machine implementations now use preemptive interrupts to suspend threads in response to GC events.

2. Active interrupt

The idea of active interrupt is that when the garbage collection needs to interrupt the thread, it does not operate directly on the thread, but simply sets a flag bit. Each thread will take the initiative to poll this flag during the execution process. Once it finds that the interrupt flag is true, it voluntarily interrupts and suspends at the nearest safe point. Polling flags overlap with safe points, plus all created objects and other places where memory needs to be allocated on the Java heap to check if garbage collection is imminent and there is not enough memory to allocate new objects.

But this frequent access is similar to writing a while(! Sign is an infinite loop that doesn’t jump out until our flag is true, which is a performance drain, so to keep things efficient HotSpot uses a memory-protected trap to simplify polling down to a single assembly instruction. That is, within the JVM this implementation is really just an instruction.

3. Secure areas

The function is similar to the safety point, which can be regarded as the extension of the safety point, the first process from point to line. I believe students can understand this meaning.

The main solution is that when the program is in the Sleep or Blocked state and cannot receive the virtual machine interrupt request, the period of time in which the program is in is the safe zone time. The safe zone can be GC at any time, but the threads are not allowed to leave the safe zone during GC, which means that the thread that is Sleep cannot be activated and the thread that is Blocked cannot be reawakened during GC.

4. Memory sets and card tables

The memory set

In order to solve the problem of cross-generation references, the concept of memory set is introduced. In general terms, we need to record some Pointers to the required GC region that are not involved in the GC region. When a partial collection occurs, we can easily solve the problem of cross-generation references by scanning the table of the memory set.

The memory set provides only an idea, which can be understood as an interface, and the card table is a concrete implementation of the memory set.

Card table

The precision of card table records is card precision, that is, each record is accurate to a memory region, and objects in this memory region have cross-generation Pointers, and this memory region can be regarded as a card page.

The simplest form of a card table can be just a byte array, and the HotSpot VIRTUAL machine does exactly that. The following line of code is HotSpot’s default card table markup logic.

CARD_TABLE [this address >> 9] = 0;
Copy the code

CARD_TABLE is our array of card tables, the elements of which are the card pages defined by each memory area of a specific size.

The card page size is 2 to the NTH power of bytes. As shown in the above code, the card page size used in HotSpot is 2 to the 9th power of 512 bytes. That is to say, if the memory start address of the card table is 0x0000, The memory addresses of each card page are 0x0000-0x01FF, 0x0200-0x03FF, and 0x0400-0x05FF 、、、、 respectively

The memory of a card page usually contains more than one object. As long as there is a cross-generation pointer in the field of one (or more) object in the card page, the value of the array element of the corresponding card table is marked as 1, which is called the element becomes Dirty. If there is no such element, it is marked as 0. When garbage collection occurs, it is easy to determine which card page memory blocks contain cross-generation Pointers by screening out the dirty elements in the card table and adding them to GC Roots for scanning.

5. Write barriers

Following the logic above, if there are cross-generation references, then I maintain a card table page by page to keep track of whether there are Pointers to other areas in each block of memory. But how to maintain this card page??

At this time the students must feel infinite set baby, Russian infinite set baby ~ ~ ~

Why every time is to tell me: you want to do so, understand, very simple, this evening should be able to change it. But every time I use the time just discover this pit after all how deep ~ ~ ~ but why did I believe every time

Once the mask of pain is on, no one loves you anymore

All right, play essence cell close, close ~


The write barrier here you need to distinguish it from the memory barrier we talked about in Volitile to keep order, two is not one thing, two is not one thing!!

There are two types of write barriers:

1. The write barrier before assignment is called the pre-write barrier

2. The one after assignment is called the post-write barrier

Before the G1 collector, other collectors used post-write barriers

The main logic for writing back barriers is this

Void oop_field_store(oop* field, oop new_value) {// Reference field assignment *field = new_value; Post_write_barrier (field, new_value); }Copy the code

Once the write barrier is applied, the VIRTUAL machine generates instructions for all assignment operations. Once the collector adds an update card table operation to the write barrier, it incurs extra overhead every time the reference is updated, regardless of whether the reference is old or new. However, this overhead is still much lower than the cost of scanning the entire old age in Minor GC.

So you can think about why do you have to update every time? How do I implement this without every update?

In addition to the additional overhead, write barriers also face the problem of pseudo-sharing in high-concurrency scenarios in order to make them more efficient.

Pseudo-sharing is actually caused by a cache line designed to increase CPU read and write efficiency.

In computer theory, A cache row may contain multiple card tables, each with multiple card pages. So now WE have CPU1 and CPU2 reading card tables A and B from the same cache row. CPU1 changes the values of card tables A and B and synchronizes them to the cache row. However, since the value of card table A is invalid, it is necessary to re-read card table A and card table B from main memory. In fact, CPU2 only focuses on card table B, because card table A and card table B are in the same cache row, which leads to A performance problem.

To solve this problem, JDK7 adds a new parameter -xx: +UseCondCardMark, which determines whether to enable the conditional judgment for card table updates. This avoids the problem of pseudo-sharing, but it also incurs the overhead of an extra judgment, so the culprit is the cache row design.

Of course, any design is not perfect, there must be a trade-off, since the concept of cache line still exists, then its advantages must outweigh its disadvantages.

6. Black, white and white markings

In The algorithm of reachaeability analysis, we know that during The root election, all threads reach The safe point and Stop The World until The root election, but what happens after The root election?

In many cases the Minor GC is insensitive, so it’s a safe guess that the thread is definitely not in a suspended state when it comes to reference judgments after the root. I don’t know if you would agree with that.

Objects disappear

Therefore, in the case of multi-threaded concurrency, once the judgment of references between different objects is misjudged, two situations will occur:

1. Mismarking dead objects as alive is not a good thing, but in fact it is tolerable. It just creates a bit of floating garbage that escapes this collection and can be cleaned up next time.

2. Mismarking a previously alive object as dead is a fatal consequence, and the program is bound to fail.

And the scariest part is number two, which is basically what happens when I call the cops and I kill myself

Let’s use the black, white and white spheres from Inside the Home Java Virtual Machine to describe this process

Black ball: Indicates that the object has been accessed by the garbage collector and that all references to the object have been scanned. The black object generation table has been scanned, it is safe to live, if there are other object reference to the black object, do not need to scan again. A black object cannot point directly to a white object (without passing the gray object).

Greyball: Indicates that an object has been accessed by the garbage collector, but at least one reference to the object has not yet been scanned.

White ball: Indicates that the object has not been accessed by the garbage collector. Obviously, at the beginning of the reachability analysis, all objects are white. If at the end of the analysis, all objects are still white, that is, they are unreachable.

The necessary condition for an object to disappear

Wilson theoretically proved in 1994 that the problem of “object disappearance” will occur if and only if the following two conditions are met simultaneously:

1.· The assignor inserts one or more new references from the black object to the white object

2. The assignor removes all direct or indirect references from the gray object to the white object

Therefore, to solve the object disappearance problem, you only need to break any of the conditions, which leads to two solutions:

1. Incremental update is to destroy the first condition. When a black object inserts a new reference to a white object, the newly inserted reference will be recorded. This can be simplified to say that a black object becomes a gray object as soon as a new reference to a white object is inserted.

2. The original snapshot destroys the second condition. When a gray object wants to delete a reference to a white object, the reference to be deleted is recorded. This can also be simplified to mean that the search is based on a snapshot of the object graph at the moment when the scan begins, whether the reference relationship is deleted or not.

No matter the above reference records are inserted or deleted, the recording operations of the virtual machine are implemented through write barriers. In the HotSpot VIRTUAL machine, both incremental update and raw snapshot solutions have practical applications. For example, CMS is based on incremental updates for concurrent marking, G1 and Shenandoah are based on raw snapshots.


The content of this chapter is relatively dry. I hope you can read it again and again to deepen your impression. Don’t forget to like 👍+ follow oh ~ ~