Abstract: This project is the data development (DLF) of DAYU platform. An important function of data development is ETL (data cleaning). ETL is from the source end to the destination end, and the business logic in the middle is generally implemented by SQL templates written by users. Velocity is one of the templating languages involved.

Velocity of OOM

Basic use of Velocity

The basic code for using the Velocity template language is as follows:

1. Initialize the template engine

2. Obtain the template file

3. Set variables

4. The output

In ETL business, the output of a Velocity template is the user’s SET of ETL SQL statements, equivalent to a.SQL file. The official API here requires passing in an object of java.io.Writer class to store the set of SQL statements generated by the template. These sets of statements are then split into SQL statements based on our business and executed one by one.

The java.io.Writer class is an abstract class that has multiple implementations in JDK1.8, including but not limited to the following:

Because the cloud environment has strict security requirements on user permissions such as file reading, writing and creation, we use java.io.StringWriter, whose underlying layer is a StringBuffer object and the underlying layer is a char array.

Simple template HelloVelocity.vm:

#set(iAMVariable = ‘good! ‘) #set(person.password = ‘123’) Welcome nametovelocity.comtodayis{name} to velocity.com today is nametovelocity.comtodayis{date} #foreach( oneinone in oneinlist) one #end Name: {person.name} Password: ${person.password}

HelloVelocity.java

package com.xlf;

import org.apache.velocity.Template; import org.apache.velocity.VelocityContext; import org.apache.velocity.app.VelocityEngine; import org.apache.velocity.runtime.RuntimeConstants; import org.apache.velocity.runtime.resource.loader.ClasspathResourceLoader;

import java.io.StringWriter; import java.util.ArrayList; import java.util.Date; import java.util.List; Public class HelloVelocity {public static void main(String[] args) {// Initialize the template engine VelocityEngine(); ve.setProperty(RuntimeConstants.RESOURCE_LOADER, “classpath”); ve.setProperty(“classpath.resource.loader.class”, ClasspathResourceLoader.class.getName()); ve.init(); Template Template = ve.getTemplate(” helloVelocity.vm “); VelocityContext ctx = new VelocityContext(); // Set the variable ctx.put(“name”, “velocity”); ctx.put(“date”, (new Date()));

List temp = new ArrayList(); temp.add("Hey"); temp.add("Volecity!" ); ctx.put("list", temp); Person person = new Person(); ctx.put("person", person); StringWriter sw = new StringWriter(); template.merge(ctx, sw); System.out.println(sw.toString()); }Copy the code

}

Console output

OOM reproduce

The large template file bigvelocity.template.vm

(The number of words in the file exceeds the blog limit, which will be given in the attachment later.)

The template file itself is 379KB, but the key is that it defines a String array of over 90,000 elements, each element of which is a “1”, and then writes 79 nested loops, each of which iterates through the String array. The innermost loop is called once:

show table;

This means that this template will generate 96372 to the power of 79 SQL statements, each of which is:

show table;

Filling such a large number of characters into a StringWriter object would require at least 10 to 380 square gigabytes of memory, which is hardly realistic. Therefore, OOM overflow is inevitable.

BigVelocity.java

package com.xlf;

import org.apache.velocity.Template; import org.apache.velocity.VelocityContext; import org.apache.velocity.app.VelocityEngine; import org.apache.velocity.runtime.RuntimeConstants; import org.apache.velocity.runtime.resource.loader.ClasspathResourceLoader;

import java.io.StringWriter; Public class BigVelocity {public static void main(String[] args) {// Initializing the VelocityEngine ve = new VelocityEngine(); ve.setProperty(RuntimeConstants.RESOURCE_LOADER, “classpath”); ve.setProperty(“classpath.resource.loader.class”, ClasspathResourceLoader.class.getName()); ve.init(); Template Template = ve.getTemplate(” bigvelocity.template.vm “); VelocityContext ctx = new VelocityContext(); StringWriter sw = new StringWriter(); template.merge(ctx, sw); }}

Console output

OOM Cause Analysis

The result generated by the Velocity template is written to a StringWriter object, whose underlying layer, as discussed earlier, is a char array. The code that generates OOM directly is in the java.util.array.copyof () function:

StringWriter Tests the capacity limit of the underlying char array

StringWriterOOMTest.java

package com.xlf;

import java.io.StringWriter; public class StringWriterOOMTest { public static void main(String[] args) { System.out.println(“The maximum value of Integer is: ” + Integer.MAX_VALUE); StringWriter sw = new StringWriter(); int count = 0; for (int i = 0; i < 100000; i++) { for (int j = 0; j < 100000; j++) { sw.write(“This will cause OOMn”); System.out.println(“sw.getBuffer().length(): ” + sw.getBuffer().length() + “, count: ” + (++count)); }}}}

Jvm parameter Settings (see hardware configuration)

Environment: JDK8 + Windows10 desktop + 32GB memory + 1TB SSD + I7-8700

If your hardware is not fully configured, do not try!

The test results

The StringWriterOOMTest runtime stops when the total process memory size in Windows Task Manager reaches 10300 + MB.

Console output

Analysis of test results

A char array cannot exceed Integer.MAX_VALUE, which is a very close value. I’m off by more than 20. A search on the Internet, more reliable statement is: It is indeed smaller than integer.max_value and cannot be equal to integer.max_value, because char[] has other space, such as the object header, that cannot add up to more than integer.max_value. If you are interested, you can explore the number of elements in other types of arrays. Here’s my humble opinion, too.

OOM Solutions

The reason summary

Through the above series of recurrence and analysis, we know that the root cause of OOM is that the StringWriter object rendered by the template file is too large. Specific performance:

  1. If the system does not have enough memory space to allocate to the JVM, it will result in OOM because this memory is not useless and cannot be reclaimed by the JVM
  2. An OOM error is raised if the number of elements in a char array approaches MAX_VALUE if the system has enough memory to allocate to the JVM.

The solution

As discussed earlier, for security reasons, we can only use the StringWriter object to receive the output of the template rendering result. Files cannot be used. So StringWriter itself has to be improved:

Inherit the StringWriter class and override its write method as follows:

StringWriter sw = new StringWriter() { public void write(String str) { int length = this.getBuffer().length() + str.length(); If (length > 10 * 1024 * 1024) {this.getBuffer().delete(0, this.getbuffer ().length()); throw new RuntimeException(“Velocity template size exceeds limit!” ); } this.getBuffer().append(str); }};

The rest of the code remains the same

BigVelocitySolution.java

package com.xlf;

import org.apache.velocity.Template; import org.apache.velocity.VelocityContext; import org.apache.velocity.app.VelocityEngine; import org.apache.velocity.runtime.RuntimeConstants; import org.apache.velocity.runtime.resource.loader.ClasspathResourceLoader;

import java.io.StringWriter; Public class BigVelocitySolution {public static void main(String[] args) {// Initialize template engine ve = new VelocityEngine(); ve.setProperty(RuntimeConstants.RESOURCE_LOADER, “classpath”); ve.setProperty(“classpath.resource.loader.class”, ClasspathResourceLoader.class.getName()); ve.init(); Template Template = ve.getTemplate(” bigvelocity.template.vm “); VelocityContext ctx = new VelocityContext(); StringWriter sw = new StringWriter() { public void write(String str) { int length = this.getBuffer().length() + str.length(); If (length > 10 * 1024 * 1024) {this.getBuffer().delete(0, this.getbuffer ().length()); throw new RuntimeException(“Velocity template size exceeds limit!” ); } this.getBuffer().append(str); }}; template.merge(ctx, sw); }}

Console output

If the size of the VELOCITY template rendered SQL statement set is within the allowable range, the statement set will be split according to our business and executed sentence by sentence.

How to gracefully terminate a thread

In the subsequent process of executing SQL statements one by one, each SQL statement is executed by calling peripheral services (DLI, OBS, MySql, etc.), and the results will be returned to our job development scheduling Service (DLF) background every time. Our DLF platform supports the ability to stop a job just in time, which means that if the job is scheduled to execute 10000 SQL, I have to stop in the middle and not execute subsequent SQL — this is supported.

After correcting the OOM bug mentioned above and passing the test, the test student found that our homework could not be stopped. In other words, the Java thread where our homework was located could not be stopped.

Thread stop failed again

After some debug and code reading, it was found that the corresponding interrupt method thread.interrupt() was called in our project. To terminate the thread.

So why does calling interrupt still fail to terminate a thread?

TestForInterruptedException.java

package com.xlf; public class TestForInterruptedException { public static void main(String[] args) { StringBuilder sb = new StringBuilder(); for (int i = 0; i < 10; i++) { sb.append(“show tables; n”); } int i = 0; for (String str : sb.toString().split(“n”)) { if (i > 4) { Thread.currentThread().interrupt(); System.out.println(i + ” after interrupt”); } System.out.println(str); System.out.println(i++); }

}
Copy the code

}

Console output

Analysis of test results

TestForInterruptedException. The main function of doing simple enough to produce a larger string, split into 10 small string, in the for loop is presented.the print small string; It attempts to terminate the thread starting at paragraph 5 (the initial paragraph is 0). Turns out the thread didn’t terminate!

What’s going on here? Why does calling the interrupt method on a thread not terminate the thread? Or is it because the JVM takes a little time to respond to this method? This is not the case. If you are interested, you can increase the number of loops even more. Interrupt the first few times in the loop and you will find the same result.

After some exploration, there are only two ways to terminate a thread:

  • Using the stop() method on the Thread object stops the Thread immediately, but this method is too violent to actually be used. See JDK1.8 for more information:

  • Deprecated. This method is inherently unsafe. Stopping a thread with Thread.stop causes it to unlock all of the monitors that it has locked (as a natural consequence of the unchecked ThreadDeath exception propagating up the stack). If any of the objects previously protected by these monitors were in an inconsistent state, the damaged objects become visible to other threads, potentially resulting in arbitrary behavior. Many uses of stop should be replaced by code that simply modifies some variable to indicate that the target thread should stop running. The target thread should check this variable regularly, and return from its run method in an orderly fashion if the variable indicates that it is to stop running. If the target thread waits for long periods (on a condition variable, for example), The interrupt method should be used to interrupt the wait…

  • The second method is to set the flag bit as mentioned in the JDK comments above. There are two ways of doing this, either of which requires the terminated thread itself to “actively” determine the state of the flag bit:

  1. Set a normal flag bit, such as true/ false for Boolean variables, to determine whether the thread should continue running based on the state of the variable — the code actively determines the state of the variable. This is commonly used in loops where a break, return, or throw Exception is detected.

  2. Use Thead instance method Interrupt to terminate the thread represented by the Thread object. But the interrupt method also essentially sets an interrupt flag bit, and once that flag bit is caught (read), it is reset (invalidated) “most of the time.” Therefore, it does not guarantee that the thread will stop, and it does not guarantee that it will stop immediately, as follows:

  3. If a wait, join, or sleep of the Object class is performed in subsequent program execution logic after the interrupt bit is set by the interrupt method, all three methods catch the interrupt bit, reset it, and throw InterruptedException.

  4. Similar to the previous point, the InterruptibleChannel class in the Java.nio.Channels package actively captures the interrupt flag, which is then reset if the thread is in an InterruptibleChannel I/O block. And then close the channel, throw the Java. Nio. Channels. ClosedByInterruptException; The same example with Java. Nio. Channels. The Selector, see the JavaDoc

  5. The Thread class instance method isInterrupted() also captures the interrupt flag and resets it. This method is used to determine where the program has terminated and can be understood as actively and explicitly capturing the interrupt flag.

  6. It is important to note that throwing and catching InterruptedException do not involve capturing or resetting thread identity bits

  7. What about my earlier statement that once the interrupt identifier bit is captured, it is reset “most of the time”? The Thread class has private native Boolean isInterrupted(Boolean ClearInterrupted); When the pass parameter is false, the interrupt identifier bit is captured without being reset. Generally, however, it’s only used in two places

  8. Static method of Thread class: this resets the interrupt flag bit and does not specify a Thread object, only the current Thread

  1. Instance method of the Thread class: This method is also commonly used to determine the Thread interrupt identifier without resetting the identifier bit.

summary

To terminate a thread, currently available in the JDK:

  1. You set variables to indicate whether a thread has been interrupted
  2. Proper use of the JDK’s thread interrupt identifier to determine whether a thread is interrupted

Both of these things need to be done later to break the loop, return the method, throw the exception, etc.

When does a thread terminate?

There are generally two reasons for thread termination:

  1. The thread completes its normal code logic and terminates naturally.
  2. When a Throwable object is thrown during thread execution and not explicitly captured, the JVM terminates the thread. It is well known that the Throwable class is the parent of Exception and Error!

Thread abort ExplicitlyCatchExceptionAndDoNotThrow. Java

package com.xlf; public class ExplicitlyCatchExceptionAndDoNotThrow { public static void main(String[] args) throws Exception { boolean flag = true; System. Out.println (” Main started!” ); try { throw new InterruptedException(); } catch (InterruptedException exception) { System.out.println(“InterruptedException is caught!” ); } system.out. println(“Main doesn’t stop! “) ); try { throw new Throwable(); } catch (Throwable throwable) { System.out.println(“Throwable is caught!” ); } system.out. println(“Main is still here!” ); if (flag) { throw new Exception(“Main is dead!” ); } System.out.println(“You’ll never see this!” ); }}

Console output

Analysis of test results

This test validates the previous conclusion about abnormal thread terminations:

When a Throwable object is thrown during thread execution and not explicitly captured, the JVM terminates the thread.

Gracefully manually terminate threads

Need to manually terminate threads to execute, the best approach is to set identifier bit (can be interrupt can also define your own), then capture timely identification and throw an exception, at the end of the business logic to catch exceptions and do some finishing cleaning action: statistical failed to perform tasks such as the ratio of success, or closing some flow and so on. In this way, the execution of the program takes into account both normal and abnormal conditions and is handled elegantly.

TerminateThreadGracefully.java

package com.xlf; public class TerminateThreadGracefully { public static void main(String[] args) { StringBuilder sb = new StringBuilder(); for (int i = 0; i < 10; i++) { sb.append(“show tables; n”); } int i = 0; try { for (String str : sb.toString().split(“n”)) { if (i > 4) { Thread.currentThread().interrupt(); if (Thread.currentThread().isInterrupted()) { throw new InterruptedException(); } System.out.println(i + ” after interrupt”); } System.out.println(str); System.out.println(i++); } catch (InterruptedException exception) {// TODO: System.out.println(thread.currentThread ().isinterrupted ())); } System.out.println(“Thread main stops normally!” ); }}

Console output

Why did the thread in the project fail to terminate?

Our project does call the corresponding thread object’s interrupt method thread.interrupt(); To terminate the thread.

So why can’t the thread interrupt the identifier bit and terminate accordingly?

Back to the business logic of our project:

The job is divided into three stages: template reading, rendering, and SQL execution. Generally speaking, the first two stages are relatively fast. In the subsequent process of executing SQL statements one by one, each SQL statement is executed by calling peripheral services (DLI, OBS, MySql, etc.), and the results will be returned to our job development scheduling Service (DLF) background every time. Our DLF platform supports the ability to stop a job just in time, which means that if the job is scheduled to execute 10000 SQL, I have to stop in the middle and not execute subsequent SQL — this is supported.

So the problem lies in the process of SQL execution. After multiple debugging, it is found that logs must be written to OBS (a third-party package developed by Huawei) during SQL execution. This process cannot be omitted. The interrupt method on the thread object is called thread.interrupt(). The interrupt bit is first captured by the await() method of the java.util.concurrent. CountDownLatch class used by OBS. Reset the flag bit and throw an exception, which is then changed to another type of exception as it is thrown up one layer at a time, and there is no way to determine whether it was caused by manually terminating the job based on the type of exception finally thrown.

There is nothing wrong with the third-party package OBS handling exceptions thrown by CountDownLatch based on its own underlying logic. But our program won’t stop! To achieve thread termination, I added a custom flag variable to set the state of the variable when thread.interrupt() is called, and to determine the state of my custom flag bit after a few key points like OBS write log. Throw a RuntimeException if the state changes (it can go uncaught and minimize code changes). And to reuse thread objects in the thread pool, reset the custom identity bit at the start of each job. The job is terminated manually and gracefully.

The source code for this section will not be posted for project details, but the related logic has been shown in the previous code.

The system memory usage is high and inaccurate

Ordinary local variables defined during a run in a thread, of non-threadLocal type, are generally recycled when the thread terminates. The problem I encountered was that the thread was stopped after the bug was resolved, but running the top command on Linux was still using a lot of memory.

  1. First I did a forced GC on the JVM using the jmap-histo: Alive PID command and found that the heap was pretty much empty at this point (around 1% in both the old and young bands). But the top command sees around 18% * 7GB (total Linux memory).
  2. Second, I used the JCMD command to analyze external memory, excluding the problem of out-of-heap memory leaks
  3. The next step is to use the jStack command to see what the state of each thread of the JVM process is. The threads associated with the job are all waiting on condition (the thread is finished and the pool suspends them).
  4. So, a preliminary conclusion is that both heap and off-heap memory used by this JVM process is very small (relative to the 18% * 8GB footprint of the top command explicitly). So is it safe to assume that the JVM simply appropriated so much memory from the operating system that it didn’t give it back until the next time there was a new task in the thread pool? The last part of this paper focuses on one point.

Is back

In the following experiment

Set the JVM parameters to:

-Xms100m -Xmx200m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

Its significance lies in:

Limit initial JVM memory to 100M and maximum heap memory to 200M. It also prints detailed GC information and time stamps in time when garbage collection occurs in the JVM. All I have to do in my code is repeat that the JVM is out of memory and garbage collection has to happen. At the same time, observe the memory usage of the Java process at the operating system level.

SystemMemoryOccupiedAndReleaseTest.java

package com.xlf;

import java.util.concurrent.SynchronousQueue; import java.util.concurrent.ThreadFactory; import java.util.concurrent.ThreadPoolExecutor; import java.util.concurrent.TimeUnit; public class SystemMemoryOccupiedAndReleaseTest { public static void main(String[] args) { try { System.out.println(“start”); Thread.sleep(5000); } catch (InterruptedException e) { e.printStackTrace(); }

ThreadPoolExecutor executor = new ThreadPoolExecutor(3, 3, 30, TimeUnit.SECONDS, new SynchronousQueue<Runnable>(), new ThreadFactory() { public Thread newThread(Runnable r) { return new Thread(r); } }, new ThreadPoolExecutor.AbortPolicy()); Try {system.out.println ("(executor has been initialized):"); Thread.sleep(1000); } catch (InterruptedException e) { e.printStackTrace(); } Thread t1 = new Thread(new Runnable() {{system.out.println ("t1 is initialized "); } @Override public void run() { byte[] b = new byte[100 * 1024 * 1024]; System.out.println(" T1 allocates 100M space to array "); try { Thread.sleep(5000); } catch (InterruptedException e) { e.printStackTrace(); throw new RuntimeException("t1 stop"); } System.out.println("t1 stop"); } }, "t1"); try { Thread.sleep(1000); } catch (InterruptedException e) { e.printStackTrace(); } Thread t2 = new Thread(new Runnable() {{system.out.println ("t2 is initialized "); } @Override public void run() { byte[] b = new byte[100 * 1024 * 1024]; System.out.println(" T2 allocates 100M space to array "); try { Thread.sleep(5000); } catch (InterruptedException e) { e.printStackTrace(); throw new RuntimeException("t2 stop"); } System.out.println("t2 stop"); } }, "t2"); try { Thread.sleep(1000); } catch (InterruptedException e) { e.printStackTrace(); } Thread t3 = new Thread(new Runnable() {{system.out.println ("t3 is initialized "); } @Override public void run() { byte[] b = new byte[100 * 1024 * 1024]; System.out.println(" T3 allocates 100M space to array "); try { Thread.sleep(5000); } catch (InterruptedException e) { e.printStackTrace(); throw new RuntimeException("t3 stop"); } System.out.println("t3 stop"); } }, "t3"); try { Thread.sleep(1000); } catch (InterruptedException e) { e.printStackTrace(); } executor.execute(t1); System.out.println("t1 executed!" ); try { Thread.sleep(10000); } catch (InterruptedException e) { e.printStackTrace(); } executor.execute(t2); System.out.println("t2 executed!" ); try { Thread.sleep(10000); } catch (InterruptedException e) { e.printStackTrace(); } executor.execute(t3); System.out.println("t3 executed!" ); try { Thread.sleep(10000); } catch (InterruptedException e) { e.printStackTrace(); } System.out.println("jmap -histo:live pid by cmd:"); try { Thread.sleep(20000); } catch (InterruptedException e) { e.printStackTrace(); } System.out.println("After jmap!" ); // You may run jmap -heap pid using cmd here // executor.shutdown();Copy the code

}}

In the code above, I defined three Thread objects, all of which were allocated 100M char[] in the run() method, and then the Thread slept for 5 seconds. Then a new thread pool is created and the three thread objects are handed to the pool in turn to execute. The thread pool is separated by 10 seconds every time the previous thread runs out and the JVM recollects the memory, so that GC information can be printed for observation. Finally, wait until all three threads finish executing and sleep for a while (about 20 seconds), which gives me time to manually execute jmap-histo live PID in CMD, which will force FullGC. After jmap you can also try jmap-heap PID, which will not trigger GC. But take a look at the overall JVM heap usage details.

Console output

Process memory footprint on operating system before JMP-histo :live execution:

After executing JMP-histo :live

Jmap-heap PID

Win10 task manager is not accurate

After t1 allocates 100M space to the array, T2 ends:

Memory footprint: 107042K, total available heap size: 166400K

Failed to allocate 100M to T2, triggering FullGC:

103650K->1036K(98304K)

After t2 allocates 100M space to the array, t2 ends:

Memory footprint: 104461K, total available heap size: 166400K

Unable to allocate 100M to T3, trigger FullGC:

103532K->1037K(123904K)

After T3 allocates 100M space to the array, T3 terminates.

jmap -histo:live pid by cmd:

103565K->997K(123904K)

The final jMAP-heap PID result also shows a heap size of 123M.

During this process, the memory footprint of the JVM process at the operating system level will not exceed 122M. Jmap-histo :live PID will remain around 87M after FullGC is triggered (this result is repeated several times).

So why is the stack information size of the JVM inconsistent with that of the resource manager?

I have searched the Internet for this question, and the conclusion is as follows:

Committed memory refers to the minimum size that the program requires the system to run for the program, and if it is not met, it will be indicated that it is out of memory.

Working set memory is what the program actually consumes, and working set memory = shareable memory + dedicated memory

Can be Shared memory is useful when you open more larger software, or for memory consolidation, this part will be distributed to other software, so it is a piece of memory dedicated memory reserved for the program is run down, a dedicated memory refers to the current program run exclusive memory, a block, and that can be Shared memory is not the same, no matter how nervous system memory now, This dedicated memory does not voluntarily make room for other programs

So to summarize, the memory that the task manager displays is actually dedicated memory for the program that’s displayed and the memory that the program actually uses is working set memory

The two pictures above match:

The following two pictures “barely” match:

However, there is still a little gap with the heap memory 123904K after JMAP triggers GC. The gap is not big, so I can’t find a reliable answer on the Internet for the time being. I guess that this part may use the shareable memory of another process. I tried it on Linux and had this memory miscalculation problem. This question will be filled in next time

conclusion

  1. A thread terminates normally or by throwing an uncatched Throwable object
  2. After a thread terminates, the memory occupied by the thread is reclaimed as needed by the JVM, including objects allocated on the heap whose unique reference exists only in that thread
  3. After FullGC, the JVM does not just request XMS size memory from the operating system, although the heap footprint is small. This seemingly large amount of available memory is actually utilized when new thread tasks are assigned
  4. JVM heap memory usage is slightly higher than that of the operating system. This may be due to shared memory, which will be filled in next time!

Write in the last

Attach all the code described in this article and corresponding resource files for your reference! Comments and questions are welcome!

VelocityExperiment. Zip 19.40 KB

This article is shared by Huawei Cloud community “A magical bug: OOM? Gracefully terminating threads? High system memory usage?” A stop at work.

Click to follow, the first time to learn about Huawei cloud fresh technology ~