Let’s start with a well-worn limitation: when running Java applications in Docker, commands like `jmap` often fail with:

```
Can't attach to the process: ptrace(PTRACE_ATTACH, ..)
```
This is mainly because tools like jstack and jmap are implemented on top of two mechanisms:

- The attach mechanism (i.e. `VirtualMachine.attach()`), which talks to the Attach Listener thread of the target JVM over a socket. For details, see the article “JVM source code analysis of Attach mechanism to achieve full interpretation”.
- The Serviceability Agent (SA), which is essentially an attach that relies on the `ptrace` system call on Linux.
Since Docker 1.10, the default seccomp profile disables `ptrace`, so SA-based operations such as `jmap -heap` fail. Docker officially provides two workarounds:
- Explicitly grant the ptrace capability with `--cap-add=SYS_PTRACE`:

```
docker run --cap-add=SYS_PTRACE ...
```

- Turn off seccomp entirely (or add ptrace to the allowed list in a custom profile):

```
docker run --security-opt seccomp=unconfined ...
```
Beyond this limitation, a while ago I was browsing the JDK Bug System and stumbled upon JDK-8140793: “getAvailableProcessors may incorrectly report the number of cpus in Docker container”. The bug describes how Java running in a Docker container may observe an incorrect number of CPUs.
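Before digging into the JVM source, it is easy to observe the value in question from Java itself. A minimal check (not part of the bug report, just a way to see what the JVM reports) is:

```java
public class CpuCheck {
    public static void main(String[] args) {
        // On affected JDK 8 builds inside a CPU-limited container, this
        // prints the host's CPU count rather than the cgroup-limited count.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("JVM sees " + cpus + " available processors");
    }
}
```

Running this on the host and inside a CPU-limited container makes the discrepancy immediately visible.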
Docker is known to rely on cgroups and namespaces; cgroups are a Linux kernel feature that limits and isolates the resource usage of processes (CPU, memory, disk I/O, network, etc.). So my guess was that the JVM was not reading the cgroup limits that Docker applies at runtime.
Following the bug, I saw it was marked RESOLVED, and digging further I found this post on the official blog: Java SE Support for Docker CPU and Memory Limits (CPU limits in JDK-8140793, the memory limits enhancement in JDK-8170888, and container detection and resource configuration in JDK-8146115).
In Java SE 8u121 and earlier, the CPU count and memory size the JVM reads are not constrained by cgroups. When we don’t explicitly specify certain parameters, the JVM uses the values it reads to compute defaults. For example, if `-XX:ParallelGCThreads` and `-XX:CICompilerCount` are not set explicitly, the JVM derives them from the CPU count it observes; `runtime/vm_version.cpp` shows how the number of Parallel GC worker threads is calculated:
```cpp
if (FLAG_IS_DEFAULT(ParallelGCThreads)) {
  assert(ParallelGCThreads == 0, "Default ParallelGCThreads is not 0");
  // For very large machines, there are diminishing returns
  // for large numbers of worker threads.  Instead of
  // hogging the whole system, use a fraction of the workers for every
  // processor after the first 8.  For example, on a 72 cpu machine
  // and a chosen fraction of 5/8
  // use 8 + (72 - 8) * (5/8) == 48 worker threads.
  unsigned int ncpus = (unsigned int) os::active_processor_count();
  return (ncpus <= switch_pt) ? ncpus :
         (switch_pt + ((ncpus - switch_pt) * num) / den);
} else {
  return ParallelGCThreads;
}
```
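The formula above (with `switch_pt` = 8 and a 5/8 fraction in this version) can be sketched as a few lines of Java to see how the default scales with the observed CPU count:

```java
public class GcThreadCalc {
    // Mirrors the HotSpot default shown above: all CPUs up to 8,
    // then 5/8 of every processor after the first 8.
    static int parallelGcThreads(int ncpus) {
        final int switchPt = 8, num = 5, den = 8;
        return (ncpus <= switchPt) ? ncpus
                                   : switchPt + ((ncpus - switchPt) * num) / den;
    }

    public static void main(String[] args) {
        // The 72-CPU example from the HotSpot comment: prints 48
        System.out.println(parallelGcThreads(72));
    }
}
```

So a container pinned to 2 CPUs on a 72-CPU host still ends up with 48 GC worker threads on an affected JDK 8 build.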
The CPU count itself comes from `os::active_processor_count()`:
```cpp
int os::active_processor_count() {
  // Linux doesn't yet have a (official) notion of processor sets,
  // so just return the number of online processors.
  int online_cpus = ::sysconf(_SC_NPROCESSORS_ONLN);
  assert(online_cpus > 0 && online_cpus <= processor_count(), "sanity check");
  return online_cpus;
}
```
As we can see, `::sysconf(_SC_NPROCESSORS_ONLN)` reads the physical machine’s online CPU count, so the GC thread calculation goes wrong in a container, and the JIT compiler thread count has the same problem.
Similarly, if we do not explicitly specify parameters such as `-Xmx` (MaxHeapSize) and `-Xms` (InitialHeapSize), the JVM derives defaults from the machine memory it reads:
```cpp
void Arguments::set_heap_size() {
  if (!FLAG_IS_DEFAULT(DefaultMaxRAMFraction)) {
    // Deprecated flag
    FLAG_SET_CMDLINE(uintx, MaxRAMFraction, DefaultMaxRAMFraction);
  }
  const julong phys_mem =
    FLAG_IS_DEFAULT(MaxRAM) ? MIN2(os::physical_memory(), (julong)MaxRAM)
                            : (julong)MaxRAM;

  // If the maximum heap size has not been set with -Xmx,
  // then set it as fraction of the size of physical memory,
  // respecting the maximum and minimum sizes of the heap.
  if (FLAG_IS_DEFAULT(MaxHeapSize)) {
    julong reasonable_max = phys_mem / MaxRAMFraction;

    if (phys_mem <= MaxHeapSize * MinRAMFraction) {
      // Small physical memory, so use a minimum fraction of it for the heap
      reasonable_max = phys_mem / MinRAMFraction;
    }
    ...
  }
}
```
Here `os::physical_memory()` reads the physical machine’s memory, so running under a container memory limit can lead to problems such as the process being killed by the OOM killer.
As you can see, with older JDK 8 builds we may run into some weird issues if we don’t explicitly specify these parameters. From JDK-8146115 I found that this Docker enhancement is implemented in JDK 10: container support can be turned on with `-XX:+UseContainerSupport`, and the enhancement has been backported into newer JDK 8 builds (after JDK 8u131).
I downloaded a newer OpenJDK 8 build and checked the source; Oracle did indeed add the corresponding handling. `os::active_processor_count()` now reads:
```cpp
// Determine the active processor count from one of
// three different sources:
//
// 1. User option -XX:ActiveProcessorCount
// 2. kernel os calls (sched_getaffinity or sysconf(_SC_NPROCESSORS_ONLN)
// 3. extracted from cgroup cpu subsystem (shares and quotas)
//
// Option 1, if specified, will always override.
// If the cgroup subsystem is active and configured, we
// will return the min of the cgroup and option 2 results.
// This is required since tools, such as numactl, that
// alter cpu affinity do not update cgroup subsystem
// cpuset configuration files.
int os::active_processor_count() {
  // User has overridden the number of active processors
  if (ActiveProcessorCount > 0) {
    if (PrintActiveCpus) {
      tty->print_cr("active_processor_count: "
                    "active processor count set by user : %d",
                    ActiveProcessorCount);
    }
    return ActiveProcessorCount;
  }

  int active_cpus;
  if (OSContainer::is_containerized()) {
    active_cpus = OSContainer::active_processor_count();
    if (PrintActiveCpus) {
      tty->print_cr("active_processor_count: determined by OSContainer: %d",
                    active_cpus);
    }
  } else {
    active_cpus = os::Linux::active_processor_count();
  }
  return active_cpus;
}
```
The JVM first consults `OSContainer::is_containerized()`:
```cpp
inline bool OSContainer::is_containerized() {
  assert(_is_initialized, "OSContainer not initialized");
  return _is_containerized;
}
```
`_is_containerized` is set when `Threads::create_vm` calls `OSContainer::init()`, which determines whether the JVM is running inside a container (the method is long, so only an excerpt is shown):
```cpp
/* init
 *
 * Initialize the container support and determine if
 * we are running under cgroup control.
 */
void OSContainer::init() {
  int mountid;
  int parentid;
  int major;
  int minor;
  FILE *mntinfo = NULL;
  FILE *cgroup = NULL;
  char buf[MAXPATHLEN+1];
  char tmproot[MAXPATHLEN+1];
  char tmpmount[MAXPATHLEN+1];
  char tmpbase[MAXPATHLEN+1];
  char *p;
  jlong mem_limit;

  assert(!_is_initialized, "Initializing OSContainer more than once");

  _is_initialized = true;
  _is_containerized = false;

  _unlimited_memory = (LONG_MAX / os::vm_page_size()) * os::vm_page_size();

  if (PrintContainerInfo) {
    tty->print_cr("OSContainer::init: Initializing Container Support");
  }
  if (!UseContainerSupport) {
    if (PrintContainerInfo) {
      tty->print_cr("Container Support not enabled");
    }
    return;
  }
  ...
  _is_containerized = true;
}
```
It checks whether `UseContainerSupport` is enabled and whether `/proc/self/mountinfo` and `/proc/self/cgroup` are readable, among other things. If the JVM is running in a container, `OSContainer::active_processor_count()` is called to obtain the CPU count the container is limited to:
```cpp
/* active_processor_count
 *
 * Calculate an appropriate number of active processors for the
 * VM to use based on these three inputs.
 *
 * cpu affinity
 * cgroup cpu quota & cpu period
 * cgroup cpu shares
 *
 * Algorithm:
 *
 * Determine the number of available CPUs from sched_getaffinity
 *
 * If user specified a quota (quota != -1), calculate the number of
 * required CPUs by dividing quota by period.
 *
 * If shares are in effect (shares != -1), calculate the number
 * of CPUs required for the shares by dividing the share value
 * by PER_CPU_SHARES.
 *
 * All results of division are rounded up to the next whole number.
 *
 * If neither shares or quotas have been specified, return the
 * number of active processors in the system.
 *
 * If both shares and quotas have been specified, the results are
 * based on the flag PreferContainerQuotaForCPUCount.  If true,
 * return the quota value.  If false return the smallest value
 * between shares or quotas.
 *
 * If shares and/or quotas have been specified, the resulting number
 * returned will never exceed the number of active processors.
 *
 * return:
 *    number of CPUs
 */
int OSContainer::active_processor_count() {
  int quota_count = 0, share_count = 0;
  int cpu_count, limit_count;
  int result;

  cpu_count = limit_count = os::Linux::active_processor_count();
  int quota  = cpu_quota();
  int period = cpu_period();
  int share  = cpu_shares();
  ...
}
```
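The algorithm described in that comment is straightforward to restate. Here is a rough Java rendering of it (`PER_CPU_SHARES` is 1024 in HotSpot; -1 means “not set”, as in the source; this is a sketch of the logic, not the actual implementation):

```java
public class ContainerCpus {
    static final int PER_CPU_SHARES = 1024;

    // Rough restatement of OSContainer::active_processor_count():
    // quota/period and shares/1024, each rounded up, capped at the host count.
    static int activeProcessorCount(int hostCpus, long quota, long period,
                                    long shares, boolean preferQuota) {
        long quotaCount = 0, shareCount = 0;
        if (quota > -1 && period > 0) {
            quotaCount = (quota + period - 1) / period;            // round up
        }
        if (shares > -1) {
            shareCount = (shares + PER_CPU_SHARES - 1) / PER_CPU_SHARES;
        }
        long limit;
        if (quotaCount != 0 && shareCount != 0) {
            limit = preferQuota ? quotaCount : Math.min(quotaCount, shareCount);
        } else if (quotaCount != 0) {
            limit = quotaCount;
        } else if (shareCount != 0) {
            limit = shareCount;
        } else {
            return hostCpus; // no cgroup CPU limits configured
        }
        return (int) Math.min(limit, hostCpus);
    }

    public static void main(String[] args) {
        // e.g. docker run --cpu-quota=200000 --cpu-period=100000 on an 8-CPU host
        System.out.println(activeProcessorCount(8, 200000, 100000, -1, true));
    }
}
```

So `--cpu-quota=200000 --cpu-period=100000` yields 2 CPUs, and `--cpu-shares=512` alone yields 1.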
The cgroup CPU quota & period and the cgroup CPU shares can be set in Docker via `--cpu-quota`, `--cpu-period`, `--cpu-shares`, and so on.
Similarly for memory: if `-Xmx` is not specified, the JVM can be started with `-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap`, which lets it determine the maximum Java heap size from the Linux cgroup configuration. The `Arguments::set_heap_size()` method becomes:
```cpp
void Arguments::set_heap_size() {
  if (!FLAG_IS_DEFAULT(DefaultMaxRAMFraction)) {
    // Deprecated flag
    FLAG_SET_CMDLINE(uintx, MaxRAMFraction, DefaultMaxRAMFraction);
  }

  julong phys_mem =
    FLAG_IS_DEFAULT(MaxRAM) ? MIN2(os::physical_memory(), (julong)MaxRAM)
                            : (julong)MaxRAM;

  // Experimental support for CGroup memory limits
  if (UseCGroupMemoryLimitForHeap) {
    // This is a rough indicator that a CGroup limit may be in force
    // for this process
    const char* lim_file = "/sys/fs/cgroup/memory/memory.limit_in_bytes";
    FILE *fp = fopen(lim_file, "r");
    if (fp != NULL) {
      julong cgroup_max = 0;
      int ret = fscanf(fp, JULONG_FORMAT, &cgroup_max);
      if (ret == 1 && cgroup_max > 0) {
        // If unlimited, cgroup_max will be a very large, but unspecified
        // value, so use initial phys_mem as a limit
        if (PrintGCDetails && Verbose) {
          // Cannot use gclog_or_tty yet.
          tty->print_cr("Setting phys_mem to the min of cgroup limit ("
                        JULONG_FORMAT "MB) and initial phys_mem ("
                        JULONG_FORMAT "MB)", cgroup_max/M, phys_mem/M);
        }
        phys_mem = MIN2(cgroup_max, phys_mem);
      } else {
        warning("Unable to read/parse cgroup memory limit from %s: %s",
                lim_file, errno != 0 ? strerror(errno) : "unknown error");
      }
      fclose(fp);
    } else {
      warning("Unable to open cgroup memory limit file %s (%s)",
              lim_file, strerror(errno));
    }
  }
  ...
}
```
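With JDK 8 defaults (`MaxRAMFraction` = 4), the net effect is roughly “default max heap = min(cgroup limit, physical memory) / 4”. A simplified sketch of that arithmetic, ignoring the small-memory branch and the other clamps in the real code:

```java
public class DefaultHeap {
    static final long M = 1024 * 1024;
    static final long MAX_RAM_FRACTION = 4; // JDK 8 default

    // Simplified: when -Xmx is absent and UseCGroupMemoryLimitForHeap is on,
    // the default heap is derived from the smaller of the cgroup limit and
    // physical memory, divided by MaxRAMFraction.
    static long defaultMaxHeap(long physMem, long cgroupLimit) {
        long effective = Math.min(physMem, cgroupLimit);
        return effective / MAX_RAM_FRACTION;
    }

    public static void main(String[] args) {
        // 64 GB host, container limited to 1 GB -> prints "256 MB"
        System.out.println(defaultMaxHeap(64L * 1024 * M, 1024 * M) / M + " MB");
    }
}
```

Without the cgroup-aware fix, the same container would instead get a default heap of 64 GB / 4 = 16 GB and be killed long before reaching it.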
Instead of relying on `os::physical_memory()` alone, the JVM now clamps `phys_mem` with the `memory.limit_in_bytes` value from the cgroup filesystem. Note, though, the “Experimental support” in the comments.
In practice, we’d better set the JVM parameters explicitly according to the Docker configuration, which avoids most of these problems. If problems remain, consider upgrading to a newer JDK 8u release; if the upgrade cost is too high and you do not want to upgrade, you can refer to the scheme of loading external libraries to intercept and adjust these values.
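Whichever route you take, it is worth verifying inside the container what the JVM actually settled on. A minimal check, run with the same flags as the application:

```java
public class ContainerLimitsCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // On a container-aware JDK these reflect the cgroup limits;
        // on an affected build they reflect the host.
        System.out.println("availableProcessors = " + rt.availableProcessors());
        System.out.println("maxMemory (MB)      = " + rt.maxMemory() / (1024 * 1024));
    }
}
```

Comparing this output against the container’s `--cpus`/`-m` settings tells you immediately whether the limits are being honored.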
Author: bean warrior Reference: club.perfma.com/article/215…