Abstract

An earlier article gave a first introduction to the safepoint: its concept, definition, and related features. It did not explain the safe region. This article revisits the safepoint and goes deeper into the principles behind both the safepoint and the safe region.

Safe Point

A running Java program cannot pause for GC at an arbitrary instruction. It can pause only at specific locations, called “safepoints”.

  • From a thread's point of view, a safepoint is a special location in the code where the thread (the mutator) can pause and stay blocked until the GC ends.

  • A stop-the-world GC needs every active mutator thread suspended, so each thread suspends itself when it reaches a safepoint.

  • At this location the thread's context is fully described. Logging on safepoint entry shows what each thread was doing at that moment.

Choosing safepoints

  • The choice of safepoints matters: too few make the GC wait a long time for threads to stop, and too many add overhead at run time. Most instructions execute very quickly, so safepoints are chosen by whether the code “has the characteristic of keeping the program running for a long time”.

  • In practice, safepoints are placed at instructions that can keep execution going for a long time, such as method calls, loop jumps, and exception jumps.

Thread suspension types

When a GC occurs, how does the VM make every thread run to the nearest safepoint and stop there?

  • Preemptive suspension (almost no virtual machine uses it today): the VM first interrupts all threads. If a thread turns out not to be at a safepoint, it is resumed and allowed to run on to one.

  • Voluntary suspension: the VM sets a flag. Each thread polls the flag when it reaches a safepoint and suspends itself if the flag is set.
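The voluntary scheme can be sketched in plain Java. This is only an illustration of the idea, not how HotSpot implements it: the flag, the lock, the `poll()` and `runAtSafepoint()` methods, and the class name are all hypothetical stand-ins for VM-internal machinery. Three "mutator" threads poll a flag on every loop iteration, and the "VM" side waits until all of them have parked before it runs its operation.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLongArray;

public class SafepointPollSketch {
    // Hypothetical stand-ins for VM-internal state (not HotSpot's real structures).
    private static final Object LOCK = new Object();
    private static volatile boolean safepointRequested = false;
    private static volatile boolean running = true;
    private static int parked = 0; // guarded by LOCK

    // Mutator side: called at a poll location such as a loop back-edge.
    public static void poll() throws InterruptedException {
        if (!safepointRequested) {
            return; // fast path: nothing requested
        }
        synchronized (LOCK) {
            parked++;
            LOCK.notifyAll(); // tell the VM side that this mutator has arrived
            try {
                while (safepointRequested) {
                    LOCK.wait(); // suspended until the operation is over
                }
            } finally {
                parked--;
            }
        }
    }

    // VM side: raise the flag, wait for every mutator to park, run the operation, release.
    public static <T> T runAtSafepoint(int mutators, Callable<T> operation) throws Exception {
        synchronized (LOCK) {
            safepointRequested = true;
            try {
                while (parked < mutators) {
                    LOCK.wait();
                }
                return operation.call();
            } finally {
                safepointRequested = false;
                LOCK.notifyAll();
            }
        }
    }

    private static long[] snapshot(AtomicLongArray counters) {
        long[] copy = new long[counters.length()];
        for (int i = 0; i < copy.length; i++) {
            copy[i] = counters.get(i);
        }
        return copy;
    }

    // Returns true if no mutator made progress while the operation was running.
    public static boolean demo() {
        final int mutators = 3;
        final AtomicLongArray counters = new AtomicLongArray(mutators);
        Thread[] threads = new Thread[mutators];
        for (int i = 0; i < mutators; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                try {
                    while (running) {
                        counters.incrementAndGet(id); // "application work"
                        poll();                       // poll on every iteration
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].setDaemon(true);
            threads[i].start();
        }
        Callable<Boolean> operation = () -> {
            long[] before = snapshot(counters);
            Thread.sleep(50); // pretend to do global work
            return Arrays.equals(before, snapshot(counters));
        };
        try {
            boolean frozen = runAtSafepoint(mutators, operation);
            running = false;
            for (Thread t : threads) {
                t.join();
            }
            return frozen;
        } catch (Exception e) {
            running = false;
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("mutators frozen during operation: " + demo());
    }
}
```

The design point to notice is the fast path in `poll()`: when no request is pending, a poll costs a single flag read, which is why a VM can afford to place polls in many locations.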

The definition “A point in program where the state of execution is known by the VM” means that at a safepoint the VM knows the exact execution state of the code.

Types of safepoints

  • GC safepoint: before a GC can be triggered, all threads in the JVM must reach a GC safepoint.

  • Deoptimization safepoint: before a deoptimization can start, the thread being deoptimized must reach a deoptimization safepoint.

In HotSpot the two are implemented together, although conceptually they are not directly related and they require different data.

Safe Region

The safepoint mechanism ensures that a running program reaches a safepoint, ready for GC, within a short time.

The problem is not only long execution but also long idle time. A thread that is sleeping, blocked, or executing a native function is outside the JVM's control and therefore cannot respond to a GC request.

How to solve the sleep/block problem

The answer is the safe region. A safe region is a stretch of code in which no references are mutated, so the roots can be enumerated safely at any point inside it.

On entering a safe region, the mutator sets a ready flag. Before leaving, it checks whether the GC has finished its collection. If not, the mutator pauses; if so, it leaves the safe region without being suspended.

  • What about when the program “is not executing”? A thread in the sleeping or blocked state cannot respond to the JVM's interrupt request and “walk” to a safepoint to suspend itself, and the JVM clearly cannot wait for the thread to wake up. This case requires a safe region.

  • A safe region is a piece of code in which the reference relationships between objects do not change, so it is safe to start a GC anywhere in the region. A safe region can be thought of as a safepoint stretched out.

What happens at run time

  1. When a user thread runs into safe region code, it first marks itself as being in a safe region. If a GC starts during this period, the JVM ignores the threads marked as being in a safe region and does not wait for them before stopping the world.

  2. When the user thread is about to leave the safe region, it checks whether the JVM has completed the GC. If so, the thread continues running; otherwise it must wait until it receives a signal that it is safe to leave.
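The two steps above can be sketched as a small protocol in Java. Everything here is hypothetical and simplified: the class, the `enterSafeRegion()`/`leaveSafeRegion()` methods, and the single-mutator setup are assumptions made for illustration, and a latch stands in for the sleep or blocking call. The sketch records the order of events so the behavior is visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class SafeRegionSketch {
    private final Object lock = new Object();
    private boolean gcInProgress = false; // guarded by lock
    private boolean inSafeRegion = false; // guarded by lock
    private final List<String> events = Collections.synchronizedList(new ArrayList<>());

    // Step 1: the mutator marks itself before it sleeps, blocks, or calls native code.
    public void enterSafeRegion() {
        synchronized (lock) {
            inSafeRegion = true;
        }
    }

    // Step 2: on the way out, the mutator may not proceed while a GC is running.
    public void leaveSafeRegion() throws InterruptedException {
        synchronized (lock) {
            while (gcInProgress) {
                lock.wait();
            }
            inSafeRegion = false;
        }
        events.add("mutator-left");
    }

    // GC side: a mutator that is inside a safe region does not have to be stopped first.
    public void beginGc() {
        synchronized (lock) {
            if (!inSafeRegion) {
                throw new IllegalStateException("sketch only covers a mutator already in a safe region");
            }
            gcInProgress = true;
            events.add("gc-start");
        }
    }

    public void endGc() {
        synchronized (lock) {
            events.add("gc-end");
            gcInProgress = false;
            lock.notifyAll(); // the signal that it is safe to leave
        }
    }

    public static List<String> demo() {
        SafeRegionSketch vm = new SafeRegionSketch();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch wakeUp = new CountDownLatch(1);
        Thread mutator = new Thread(() -> {
            try {
                vm.enterSafeRegion();
                entered.countDown();
                wakeUp.await();       // stands in for sleep / blocking I/O / native call
                vm.leaveSafeRegion(); // blocks here if the GC is still running
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        mutator.start();
        try {
            entered.await();
            vm.beginGc();        // starts without waiting for the mutator
            wakeUp.countDown();  // the mutator "wakes up" in the middle of the GC
            Thread.sleep(50);    // pretend to collect
            vm.endGc();
            mutator.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(vm.events);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [gc-start, gc-end, mutator-left]
    }
}
```

The mutator wakes up while the GC is still running, yet it can only record "mutator-left" after "gc-end", which is exactly the check described in step 2.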

Reachability analysis

How does the GC find objects that are no longer usable? A person reading the code may be able to tell that an object is no longer used, but a program needs a defined method to determine this, such as compile-time analysis, reference counting, or reachability analysis.

What “mutator” means

  • After the GC finishes, the application resumes normally. In GC literature, the application is called the mutator.

  • An object is alive as long as it can be reached by the mutator.

  • If slots in the mutator's thread stack contain references to objects, those objects are directly reachable.

  • Any object reachable from a directly reachable object is itself reachable, so reachability analysis only needs to find the directly reachable references and trace from them.
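The last bullet is the whole algorithm in one sentence: start from the directly reachable references and follow every reference from there. A minimal sketch, assuming a toy heap modeled as a map from an object name to the names it references (the class and method names are made up for this example):

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReachabilitySketch {
    // heap: object name -> names of the objects it references (toy model, not a real heap).
    public static Set<String> reachable(Map<String, List<String>> heap, Collection<String> roots) {
        Set<String> marked = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            String obj = worklist.pop();
            if (marked.add(obj)) { // first visit: trace its outgoing references
                for (String ref : heap.getOrDefault(obj, Collections.<String>emptyList())) {
                    worklist.push(ref);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("A", Arrays.asList("B"));
        heap.put("B", Arrays.asList("C"));
        heap.put("D", Arrays.asList("A")); // D points at a live object but nothing points at D
        heap.put("E", Arrays.asList("F")); // E and F reference each other...
        heap.put("F", Arrays.asList("E")); // ...but no root reaches them
        System.out.println(reachable(heap, Arrays.asList("A"))); // [A, B, C]
    }
}
```

Note that the E/F cycle is not reachable from any root, so it is not marked. This is the case that plain reference counting cannot handle.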

Root references

A directly reachable reference is a root reference, and the set of all root references is the root set.

  • The mutator's context contains the directly reachable data, so obtaining the root set means finding the object references in that context: its stack, its register file, and some thread-specific data.

  • Global data is also directly reachable.

  • To decide correctly whether an object is alive, the GC takes a consistent snapshot of the mutator context and enumerates the GC roots for every mutator thread's stack.

How to take a consistent snapshot of the mutator context

  • Consistency means that the snapshot looks as if it were taken at a single point in time, so that no live object is lost.

  • An easy way to achieve this is to suspend all threads during root enumeration. When a mutator is suspended, its root set can be enumerated only if all reference information is recorded in its context, which means it must be possible to tell which stack slots and which registers hold references.

  • If the GC can obtain this reference information precisely, the process is called exact root set enumeration.

How to obtain the reference information for exact enumeration

  • For Java, the JIT knows all the stack frame information and register contents. When the JIT compiles a method, it can save the root reference information for each instruction. Saving it takes extra storage, and storing it for every instruction would be too expensive.

  • In addition, only a few instructions will ever be pause points at run time, so the JIT only needs to store the information for those instructions. The places that have a real chance of becoming pause points are called safepoints: points at which it is safe to pause for root set enumeration.
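To make the storage trade-off concrete, here is a minimal sketch of a per-safepoint reference map. HotSpot calls its version of this structure an OopMap; the class below is not that structure, only a toy with hypothetical names in which a "frame" is an `Object[]` and a "pc" is an `int`.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StackMapSketch {
    // safepoint pc -> bit set of frame slots that hold references at that pc
    private final Map<Integer, BitSet> mapsBySafepointPc = new HashMap<>();

    // Compile time: the "JIT" records reference slots only for safepoint instructions.
    public void recordSafepoint(int pc, int... referenceSlots) {
        BitSet slots = new BitSet();
        for (int slot : referenceSlots) {
            slots.set(slot);
        }
        mapsBySafepointPc.put(pc, slots);
    }

    // GC time: exact root enumeration is possible only if the thread stopped at a recorded pc.
    public List<Object> enumerateRoots(int pc, Object[] frameSlots) {
        BitSet slots = mapsBySafepointPc.get(pc);
        if (slots == null) {
            throw new IllegalStateException("pc " + pc + " is not a safepoint: no reference map");
        }
        List<Object> roots = new ArrayList<>();
        for (int i = slots.nextSetBit(0); i >= 0; i = slots.nextSetBit(i + 1)) {
            roots.add(frameSlots[i]);
        }
        return roots;
    }

    public static void main(String[] args) {
        StackMapSketch maps = new StackMapSketch();
        maps.recordSafepoint(12, 0, 2);        // at pc 12, slots 0 and 2 hold references
        Object[] frame = {"objA", 42, "objB"}; // slot 1 is a plain int value
        System.out.println(maps.enumerateRoots(12, frame)); // [objA, objB]
    }
}
```

A thread stopped at pc 13 in this sketch has no map, so its roots cannot be enumerated exactly. That is the reason a thread has to keep running until it reaches a safepoint.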

How to guarantee that the mutator pauses at a safepoint

When the GC wants to trigger a collection, it sets a flag. The mutator polls the flag periodically and pauses immediately if it is set. Poll points are therefore also safepoints. Placing the poll points is the JIT's responsibility (similar to inserting write barriers).

Which places are good for checking the GC flag?

The main principles of polling point insertion are as follows:

  • There should be enough poll points that the GC never waits long for one mutator to pause while the other mutators wait for the GC to free space. There should not be too many either, because every poll has a cost. A poll is mandatory only where memory is allocated, because allocating space can trigger a collection.

  • Long execution generally means loops and method calls, so method calls and loop back-edges are the best places for poll points.

  • Different JVMs choose different locations for safepoints.

Conclusion

Whether the JVM is about to run a GC, a deoptimization, or any similar operation, it has to know where every thread is and stop the threads at places where the operation can run without harming the application itself. Those places are the safepoint and the safe region.

When does the JVM reclaim class metadata in the method area?

All three conditions must hold:

  1. All instances of the class (in the heap) have been reclaimed.
  2. The ClassLoader that loaded the class has been reclaimed.
  3. The Class object corresponding to the class is not referenced anywhere.

After a method finishes and its stack frame is popped, is the variable data in the frame reclaimed immediately, or does it wait for the garbage collector? Why?

  • Variables of primitive types are allocated on the stack, so they are destroyed when the frame is popped. Objects in the heap that were referenced from the frame have to wait for the next young GC.

Is “all instances reclaimed” the same as “the Class object is not referenced”?

  • No. The Class object represents the class itself, and if any variable still references the Class object, it counts as referenced.

Why is the young generation divided into three regions {Eden, From, To} rather than two?

  • With three regions, only one survivor space (From or To) is idle at any time. With two regions, half of the young generation would have to sit idle.
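The difference is easy to put in numbers. The sketch below assumes the layout Eden : From : To = r : 1 : 1, where r corresponds to HotSpot's -XX:SurvivorRatio (commonly 8 by default); that ratio is an assumption brought in for the example, not something stated above, and the class name is made up.

```java
public class YoungGenLayoutSketch {
    // Three regions, Eden : From : To = ratio : 1 : 1. One survivor space is always idle,
    // so the usable share of the young generation is (ratio + 1) / (ratio + 2).
    public static double usableFractionThreeRegions(int survivorRatio) {
        return (survivorRatio + 1) / (double) (survivorRatio + 2);
    }

    // Two equal semispaces: one half is always idle.
    public static double usableFractionTwoRegions() {
        return 0.5;
    }

    public static void main(String[] args) {
        System.out.println(usableFractionThreeRegions(8)); // 0.9
        System.out.println(usableFractionTwoRegions());    // 0.5
    }
}
```

With a ratio of 8, only 10% of the young generation is idle instead of 50%. The split works because most objects die young, so a small survivor space is usually enough to hold what is left after a young GC.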

How should the impact of STW on the system be understood? How is a tuning strategy developed?

  • Our target has been to keep young GC below 50 ms and old GC below 300 ms. GC execution inevitably involves STW, and the essence of generational collection in the JVM is that an object should be collected by the nearest GC after its life cycle ends. The person tuning the system therefore needs to estimate object lifetimes.

With the ParNew + CMS collectors, how can the system be kept to young GC only?

  • Observe how many new objects are created per second, how often young GC is triggered, how many objects survive an average young GC, and whether the survivor space can hold them (taking the dynamic age rule into account). From these numbers, work out the ratio of survivor space to Eden so that the dynamic age rule does not promote objects into the old generation early.

How is the number of parallel threads set for the ParNew collector?

  • By default the thread count follows the number of CPU cores on the application server. If necessary, set it explicitly with -XX:ParallelGCThreads.
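As a rough guide to what "follows the number of CPU cores" means, the sketch below reproduces the heuristic commonly documented for HotSpot's default: one thread per core up to 8 cores, then 5/8 of a thread for each additional core. Treat the formula as an assumption, since it can differ between JDK versions and collectors; the value actually in effect can be checked with -XX:+PrintFlagsFinal. The class and method names are made up.

```java
public class ParallelGcThreadsSketch {
    // Commonly documented HotSpot heuristic (an assumption here, verify on your JDK):
    // cores <= 8 -> cores; otherwise 8 + (cores - 8) * 5 / 8, using integer arithmetic.
    public static int defaultThreads(int cpuCores) {
        if (cpuCores <= 8) {
            return cpuCores;
        }
        return 8 + (cpuCores - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(cores + " cores -> about " + defaultThreads(cores) + " GC threads");
    }
}
```

Under this heuristic a 16-core server gets 13 threads and a 32-core server gets 23, so on large machines the default is noticeably lower than the core count.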

Should the system be started in server mode or client mode? What is the impact on ParNew?

  • Choose server mode if the system is deployed on Linux and client mode if it is deployed on Windows. Web projects are typically deployed on multi-core Linux servers, where ParNew can take full advantage of the cores. Client mode is typical of programs installed on Windows, such as QQ and WX. Using ParNew there would make the CPU run multiple GC threads and only add overhead, so client mode usually uses the Serial collector.

When a GC occurs, the world is truly stopped only after every thread has entered a safepoint and suspended. The stopped time in the log is this GC STW time, and it is printed when the JVM is started with the -XX:+PrintGCApplicationStoppedTime parameter.

What is STW?

STW (stop-the-world) is the act of waiting for all user threads to enter a safepoint and block, and then performing some global operation.

When will STW happen?

  • Garbage collection pauses
  • JIT-related operations, such as code deoptimization and flushing the code cache
  • Class redefinition (e.g. javaagent, AOP bytecode instrumentation)
  • Biased lock revocation
  • Various debug operations (e.g. thread dump or deadlock check)

STW details

With the parameters -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 configured, the virtual machine prints safepoint statistics to the log:

  • -XX:+PrintSafepointStatistics turns on printing of safepoint statistics;
  • -XX:PrintSafepointStatisticsCount=n sets the number of safepoint statistics entries to print.
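A launch command with these parameters would look like java -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 -XX:+PrintGCApplicationStoppedTime MyApp, where MyApp is a placeholder. These are the JDK 8 era spellings used in this article; newer JDKs moved this output to unified logging (-Xlog:safepoint), so check the documentation for your JDK. To confirm from inside a running program which of the flags it was actually started with, something like the following sketch can help (the class and the hasFlag helper are made up for this example; the RuntimeMXBean call is standard):

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;

public class SafepointFlagCheck {
    // True if the exact flag string appears among the JVM input arguments.
    public static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not the arguments of main).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String flag : Arrays.asList(
                "-XX:+PrintSafepointStatistics",
                "-XX:+PrintGCApplicationStoppedTime")) {
            System.out.println(flag + " set: " + hasFlag(jvmArgs, flag));
        }
    }
}
```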

To be continued.