Abstract: As container technology matures, more and more enterprise customers are choosing Docker and Kubernetes as the foundation of their application platforms. In practice, however, many concrete problems arise. This article analyzes and addresses a common problem with heap size settings for Java applications running in containers.
As container technology matures, more and more enterprise customers are choosing Docker and Kubernetes as the foundation of their application platforms. In practice, however, many concrete problems arise. This series of articles documents some of the Alibaba Cloud container service team's insights and best practices from supporting customers. You are also welcome to contact us by email or in our discussion group to share your thoughts and problems.
The problem
A common complaint goes like this: "I set resource limits on the container, but my Java application container still gets mysteriously killed by the OOM Killer at runtime."
A very common cause is that the container's resource limits and the corresponding JVM heap size are not configured consistently.
Taking a Tomcat application as an example, the sample code and Kubernetes deployment file are available on GitHub:
```
git clone https://github.com/denverdino/system-info
cd system-info
```
Kubernetes’ Pod definition is as follows:
1. The "app" container in the Pod is an init container that copies a JSP application into the "webapps" directory of the Tomcat container. Note: the JSP page index.jsp in the image displays JVM and system resource information.
2. The Tomcat container keeps running, and we limit the container's maximum memory usage to 256MB.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  initContainers:
  - image: registry.cn-hangzhou.aliyuncs.com/denverdino/system-info
    name: app
    imagePullPolicy: IfNotPresent
    command:
    - "cp"
    - "-r"
    - "/system-info"
    - "/app"
    volumeMounts:
    - mountPath: /app
      name: app-volume
  containers:
  - image: tomcat:9-jre8
    name: tomcat
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - mountPath: /usr/local/tomcat/webapps
      name: app-volume
    ports:
    - containerPort: 8080
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
  volumes:
  - name: app-volume
    emptyDir: {}
```
We execute the following commands to deploy and test the application:
```
$ kubectl create -f test.yaml
pod "test" created

$ kubectl get pods test
NAME      READY     STATUS    RESTARTS   AGE
test      1/1       Running   0          28s

$ kubectl exec test curl http://localhost:8080/system-info/
...
```
The response shows the system CPU and memory information in HTML format; we can also use the html2text command to convert it to plain text.
Note: this test was performed on a node with 2 CPU cores and 4GB of memory; results may vary in other environments.
```
$ kubectl exec test curl http://localhost:8080/system-info/ | html2text

Java version        Oracle Corporation 1.8.0_162
Operating system    Linux 4.9.64
Server              Apache Tomcat/9.0.6
Memory              Used 29 of 57 MB, Max 878 MB
Physical Memory     3951 MB
CPU Cores           2

**** Memory MXBean ****
Heap Memory Usage      init = 65011712(63488K) used = 19873704(19407K) committed = 65536000(64000K) max = 921174016(899584K)
Non-Heap Memory Usage  init = 2555904(2496K) used = 32944912(32172K) committed = 33882112(33088K) max = -1(-1K)
```
We can see that the system memory visible inside the container is 3951MB, while the JVM's maximum heap size is 878MB. Wait, didn't we limit the container's memory capacity to 256MB? With this configuration, once the application's memory usage exceeds 256MB (before the JVM ever triggers a GC), the JVM process will be killed by the OOM Killer.
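The same behavior is easy to reproduce outside Kubernetes with a plain Docker run (a minimal sketch, assuming a local Docker environment; the image tag matches the example above). Even with a 256MB memory limit, the JVM's default MaxHeapSize is derived from the host's memory:

```
docker run -m 256m --rm tomcat:9-jre8 \
    java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
```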
The root of the problem is:
- For the JVM, if the heap size is not set explicitly, the maximum heap size defaults to a value derived from the memory size of the host environment.
- Docker containers use cgroups to limit the resources a process can use, but a JVM inside the container still sees the host's memory size and CPU cores by default, which leads to an incorrect calculation of the JVM heap.
Similarly, the JVM's default number of GC and JIT compiler threads depends on the number of host CPU cores. If we run multiple Java applications on one node, then even with CPU limits set, the GC threads of different applications may still preempt each other and degrade performance.
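The CPU side of the problem can be inspected the same way (again a sketch against a local Docker daemon; on this JDK version the printed thread count follows the host's core count, not the --cpus limit):

```
docker run --cpus 0.5 --rm tomcat:9-jre8 \
    java -XX:+PrintFlagsFinal -version | grep ParallelGCThreads
```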
Knowing the root cause, the problem is simple to solve.
Solution
Enable CGroup resource awareness
The Java community is aware of this issue and added automatic awareness of container resource limits in Java SE 8u131+ and JDK 9 (see blogs.oracle.com/java-platfo…).
It is enabled by adding the following parameters:
```
java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap ...
```
We add these parameters through the "JAVA_OPTS" environment variable on the Tomcat container from the previous example:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cgrouptest
spec:
  initContainers:
  - image: registry.cn-hangzhou.aliyuncs.com/denverdino/system-info
    name: app
    imagePullPolicy: IfNotPresent
    command:
    - "cp"
    - "-r"
    - "/system-info"
    - "/app"
    volumeMounts:
    - mountPath: /app
      name: app-volume
  containers:
  - image: tomcat:9-jre8
    name: tomcat
    imagePullPolicy: IfNotPresent
    env:
    - name: JAVA_OPTS
      value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
    volumeMounts:
    - mountPath: /usr/local/tomcat/webapps
      name: app-volume
    ports:
    - containerPort: 8080
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
  volumes:
  - name: app-volume
    emptyDir: {}
```
We deploy the new Pod and repeat the test:
```
$ kubectl create -f cgroup_test.yaml
pod "cgrouptest" created

$ kubectl exec cgrouptest curl http://localhost:8080/system-info/ | html2text

Java version        Oracle Corporation 1.8.0_162
Operating system    Linux 4.9.64
Server              Apache Tomcat/9.0.6
Memory              Used 23 of 44 MB, Max 112 MB
Physical Memory     3951 MB
CPU Cores           2

**** Memory MXBean ****
Heap Memory Usage      init = 8388608(8192K) used = 25280928(24688K) committed = 46661632(45568K) max = 117440512(114688K)
Non-Heap Memory Usage  init = 2555904(2496K) used = 31970840(31221K) committed = 32768000(32000K) max = -1(-1K)
```
We see that the JVM's maximum heap size has changed to 112MB, which is good news: it ensures the application won't easily be OOM-killed. This raises a question: if we set the container's maximum memory limit to 256MB, why does the JVM cap its heap at only 112MB?
This gets into the details of JVM memory management. Memory consumption in the JVM includes both heap and non-heap memory. Non-heap memory includes class metadata, JIT-compiled code, thread stacks, GC working memory, and so on. Based on the cgroup resource limit, the JVM therefore reserves part of the memory for non-heap use to ensure system stability. (In the example above, we can see that non-heap memory occupies nearly 32MB after Tomcat starts.)
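We can cross-check the heap size from inside the Pod: -XshowSettings:vm prints the JVM's estimated maximum heap (a quick sketch reusing the Pod above; the exact figure may differ slightly from the MXBean values):

```
kubectl exec cgrouptest java -XX:+UnlockExperimentalVMOptions \
    -XX:+UseCGroupMemoryLimitForHeap -XshowSettings:vm -version
```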
In the latest JDK 10, running JVMs in containers has been further optimized and enhanced.
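For example, JDK 10 detects container limits by default via -XX:+UseContainerSupport, and the heap can be sized as a percentage of the container's memory limit (a brief sketch; -XX:MaxRAMPercentage was later also backported to JDK 8u191+):

```
java -XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 -version
```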
Reading cgroup resource limits inside the container
If you can't take advantage of the new JDK 8/9 features, for example if an older application still runs on JDK 6, you can also use a script inside the container to read the container's cgroup resource limits and set the JVM heap size accordingly, as shown in the sketch below.
Starting with Docker 1.7, cgroup information is mounted into the container, so an application in the container can read its memory, CPU, and other limits from files such as /sys/fs/cgroup/memory/memory.limit_in_bytes, and then set parameters such as -Xmx and -XX:ParallelGCThreads correctly in its startup command based on the cgroup configuration.
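A minimal entrypoint sketch of this approach (assuming cgroup v1 paths; the 50% heap heuristic and the /app/app.jar path are illustrative assumptions, not part of the original example):

```sh
#!/bin/sh
# Read the container's memory limit from the cgroup filesystem.
LIMIT_BYTES=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)

# Give the heap half of the container memory, leaving room for non-heap usage.
HEAP_MB=$(( LIMIT_BYTES / 1024 / 1024 / 2 ))

# Derive the effective CPU count from the CFS quota; fall back to nproc
# when no quota is set (cpu.cfs_quota_us is -1 in that case).
CFS_QUOTA=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)
CFS_PERIOD=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us)
if [ "$CFS_QUOTA" -gt 0 ]; then
    CORES=$(( (CFS_QUOTA + CFS_PERIOD - 1) / CFS_PERIOD ))
else
    CORES=$(nproc)
fi

exec java -Xmx${HEAP_MB}m -XX:ParallelGCThreads=${CORES} -jar /app/app.jar
```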
The article at yq.aliyun.com/articles/18… provides the corresponding examples and code, so they are not repeated here.
Conclusion
This article examined a common problem with heap size settings for Java applications running in containers. Unlike virtual machines, containers limit resources through cgroups. If the processes inside a container are not aware of the cgroup limits, their memory and CPU allocations can lead to resource conflicts and other problems.
Using the JVM's new flags or a custom script to set resource limits properly is very simple and solves most of these resource constraint problems.
Another resource-limit issue for containerized applications is that some older monitoring tools and system commands such as free/top, when run inside a container, still read CPU and memory figures from the host, which prevents those tools from calculating resource consumption correctly. A common practice in the community is to use LXCFS to make resource visibility inside the container behave consistently with a virtual machine; its use on Kubernetes will be described in a future article.
Alibaba Cloud Kubernetes Service has passed the world's first Kubernetes conformance certification. It simplifies Kubernetes cluster lifecycle management, offers built-in integration with Alibaba Cloud products, and further streamlines the Kubernetes developer experience, helping users focus on the value and innovation of cloud applications.