java.lang.OutOfMemoryError

To analyze OOM problems, we first need to understand which scenarios lead to OOM. The causes of OOM on Android fall into the following categories (figure borrowed from Meituan):

How does Android throw OutOfMemoryError? The following is based on a search of the Android 9.0 source code.

We focus on two places:

    1. Failed to allocate heap memory: /art/runtime/gc/heap.cc
    2. Failed to create a thread: /art/runtime/thread.cc

Let's go through them in turn.

1. The heap memory allocation fails

void Heap::ThrowOutOfMemoryError(Thread* self, size_t byte_count, AllocatorType allocator_type) throws the error with the following message:

    oss << "Failed to allocate a " << byte_count << " byte allocation with " << total_bytes_free
        << " free bytes and " << PrettySize(GetFreeMemoryUntilOOME()) << " until OOM";

This is an OOM error thrown when allocating heap memory, which can be broken down into two different types:

  • 1. The per-process memory ceiling is reached while allocating memory for an object. Runtime.getRuntime().maxMemory() returns the memory limit that Android assigns to each process; when the process footprint reaches this ceiling, OOM occurs. This is also the most common type of OOM on Android.

Below we use a demo to test what happens when the virtual machine runs out of heap memory.

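    // Needs: java.util.ArrayList, java.util.List, android.util.Log
    List<byte[]> bytesList = new ArrayList<>();
    int i = 0;

    private void testCreatHeap() {
        while (true) {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            Log.e("oom", "i..." + i++);
            // Keep a reference to every 10 MB array so the GC can never reclaim it.
            bytesList.add(new byte[1024 * 1024 * 10]);
        }
    }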

Test device: Huawei Mate 20 Pro, HarmonyOS 2.0

Notes on the code and figures above:

  • 1. We allocate 10 MB of memory at a time.

  • 2. As Figure 1 shows, there were 51 allocations in total, with a heap footprint of approximately 535 MB, reaching or exceeding the 512 MB limit (maxMemory), yet the GC did not reclaim any memory at this point.

  • 3. Finally, ThrowOutOfMemoryError is called because the heap is exhausted, producing the error in Figure 2.
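
As a rough illustration of the ceiling described above, the sketch below reads the limit with Runtime.getRuntime().maxMemory() and estimates how much room is left before it is reached. The class name HeapHeadroom and the helper headroomBytes are made up for this example. On a desktop JVM the numbers reflect the JVM's own heap settings rather than an Android per-process limit.

```java
final class HeapHeadroom {
    /**
     * Bytes the heap could still grow by before hitting the maxMemory() ceiling.
     * used = totalMemory - freeMemory, headroom = maxMemory - used.
     */
    static long headroomBytes(long maxMemory, long totalMemory, long freeMemory) {
        long used = totalMemory - freeMemory;
        return maxMemory - used;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        long max = rt.maxMemory();
        long total = rt.totalMemory();
        long free = rt.freeMemory();
        System.out.println("maxMemory = " + max / mb + " MB");
        System.out.println("used      = " + (total - free) / mb + " MB");
        System.out.println("headroom  = " + headroomBytes(max, total, free) / mb + " MB");
    }
}
```

Logging these three values next to each allocation in the demo would make it easier to see the footprint approach the ceiling just before the error is thrown.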

  • 2. There is no contiguous address space of sufficient size. This is usually caused by heavy memory fragmentation within the process. Compared with the first kind of OOM, the error message carries one extra segment:

    "failed due to fragmentation (required continguous free " << required_bytes << " bytes for a new buffer where largest contiguous free " << largest_continuous_free_pages << " bytes)";

    The detailed code is in art/runtime/gc/allocator/rosalloc.cc, shown in the following figure. This case is harder to reproduce, so no demo is given here.

2. Failed to create a thread

Let’s take a look at the code that creates the thread

    // /art/runtime/thread.cc
    void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon) {
      CHECK(java_peer != nullptr);
      Thread* self = static_cast<JNIEnvExt*>(env)->GetSelf();

      if (VLOG_IS_ON(threads)) {
        ScopedObjectAccess soa(env);

        ArtField* f = jni::DecodeArtField(WellKnownClasses::java_lang_Thread_name);
        ObjPtr<mirror::String> java_name =
            f->GetObject(soa.Decode<mirror::Object>(java_peer))->AsString();
        std::string thread_name;
        if (java_name != nullptr) {
          thread_name = java_name->ToModifiedUtf8();
        } else {
          thread_name = "(Unnamed)";
        }

        VLOG(threads) << "Creating native thread for " << thread_name;
        self->Dump(LOG_STREAM(INFO));
      }

      Runtime* runtime = Runtime::Current();

      // Atomically start the birth of the thread ensuring the runtime isn't shutting down.
      bool thread_start_during_shutdown = false;
      {
        MutexLock mu(self, *Locks::runtime_shutdown_lock_);
        if (runtime->IsShuttingDownLocked()) {
          thread_start_during_shutdown = true;
        } else {
          runtime->StartThreadBirth();
        }
      }
      if (thread_start_during_shutdown) {
        ScopedLocalRef<jclass> error_class(env, env->FindClass("java/lang/InternalError"));
        env->ThrowNew(error_class.get(), "Thread starting during runtime shutdown");
        return;
      }

      Thread* child_thread = new Thread(is_daemon);
      // Use global JNI ref to hold peer live while child thread starts.
      child_thread->tlsPtr_.jpeer = env->NewGlobalRef(java_peer);
      stack_size = FixStackSize(stack_size);

      // Thread.start is synchronized, so we know that nativePeer is 0, and know that we're not racing
      // to assign it.
      env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer,
                        reinterpret_cast<jlong>(child_thread));

      // Try to allocate a JNIEnvExt for the thread. We do this here as we might be out of memory and
      // do not have a good way to report this on the child's side.
      std::string error_msg;
      std::unique_ptr<JNIEnvExt> child_jni_env_ext(
          JNIEnvExt::Create(child_thread, Runtime::Current()->GetJavaVM(), &error_msg));

      int pthread_create_result = 0;
      if (child_jni_env_ext.get() != nullptr) {
        pthread_t new_pthread;
        pthread_attr_t attr;
        child_thread->tlsPtr_.tmp_jni_env = child_jni_env_ext.get();
        CHECK_PTHREAD_CALL(pthread_attr_init, (&attr), "new thread");
        CHECK_PTHREAD_CALL(pthread_attr_setdetachstate, (&attr, PTHREAD_CREATE_DETACHED),
                           "PTHREAD_CREATE_DETACHED");
        CHECK_PTHREAD_CALL(pthread_attr_setstacksize, (&attr, stack_size), stack_size);
        pthread_create_result = pthread_create(&new_pthread, &attr, Thread::CreateCallback,
                                               child_thread);
        CHECK_PTHREAD_CALL(pthread_attr_destroy, (&attr), "new thread");

        if (pthread_create_result == 0) {
          // pthread_create started the new thread. The child is now responsible for managing the
          // JNIEnvExt we created.
          // Note: we can't check for tmp_jni_env == nullptr, as that would require synchronization
          // between the threads.
          child_jni_env_ext.release();
          return;
        }
      }

      // Either JNIEnvExt::Create or pthread_create(3) failed, so clean up.
      {
        MutexLock mu(self, *Locks::runtime_shutdown_lock_);
        runtime->EndThreadBirth();
      }
      // Manually delete the global reference since Thread::Init will not have been run.
      env->DeleteGlobalRef(child_thread->tlsPtr_.jpeer);
      child_thread->tlsPtr_.jpeer = nullptr;
      delete child_thread;
      child_thread = nullptr;
      // TODO: remove from thread group?
      env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer, 0);
      {
        std::string msg(child_jni_env_ext.get() == nullptr ?
            StringPrintf("Could not allocate JNI Env: %s", error_msg.c_str()) :
            StringPrintf("pthread_create (%s stack) failed: %s",
                         PrettySize(stack_size).c_str(), strerror(pthread_create_result)));
        ScopedObjectAccess soa(env);
        soa.Self()->ThrowOutOfMemoryError(msg.c_str());
      }
    }

To summarize, creating a Thread in the native layer has two main steps:

  • 1. Create the JNIEnvExt.
  • 2. Create the thread.

The whole flow is laid out in the figure below (yes, again borrowed from Meituan), with the places that may result in OOM marked.

1. Failed to create JNIEnv

Creating a JNIEnv can be broken into two steps:

  • 1. Allocate 4 KB (one page) of kernel memory through Android's anonymous shared memory (ashmem).
  • 2. Map it into the user-mode virtual address space through the Linux mmap call.

Step 1: Creating anonymous shared memory requires opening the /dev/ashmem file, so it needs an FD (file descriptor). If the number of open FDs has reached the upper limit, JNIEnv creation fails and the following error is thrown:

E/art: ashmem_create_region failed for 'indirect ref table': Too many open files
java.lang.OutOfMemoryError: Could not allocate JNI Env
        at java.lang.Thread.nativeCreate(Native Method)
        at java.lang.Thread.start(Thread.java:730)
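
Since this failure depends on how many file descriptors the process already holds, it can help to log that number. The sketch below counts the entries of /proc/self/fd, which on Linux lists one entry per open descriptor. The class and method names are invented for this example, and it assumes the process is allowed to list that directory.

```java
import java.io.File;

final class OpenFdCounter {
    /** Counts the entries of a directory such as /proc/self/fd; returns -1 if it cannot be listed. */
    static int countEntries(File dir) {
        String[] names = dir.list();
        return names == null ? -1 : names.length;
    }

    public static void main(String[] args) {
        int open = countEntries(new File("/proc/self/fd"));
        if (open < 0) {
            System.out.println("/proc/self/fd is not readable on this system");
        } else {
            System.out.println("open file descriptors: " + open);
        }
    }
}
```

Sampling this value periodically and comparing it against the process's FD limit gives an early warning before the "Too many open files" error appears.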

Step 2: If the process runs out of virtual address space during the mmap call, JNIEnv creation also fails, with an error like this:

E/art: Failed anonymous mmap(0x0, 8192, 0x3, 0x2, 116, 0): Operation not permitted. See process maps in the log.
java.lang.OutOfMemoryError: Could not allocate JNI Env
        at java.lang.Thread.nativeCreate(Native Method)
        at java.lang.Thread.start(Thread.java:1063)

2. Failed to create a thread

Creating a thread can also be broken into two steps:

    1. Call mmap to allocate the stack memory. MAP_ANONYMOUS is specified in the mmap flags, which is a common way to allocate large blocks of memory on Linux. It allocates virtual memory only; physical memory for the corresponding pages is not allocated immediately. When a page is first touched, the kernel takes a page fault and the fault handler allocates the physical memory.
    2. Call clone to create the thread.

The first step can fail when the process runs out of virtual memory, in which case an error like the following may be thrown:

W/libc: pthread_create failed: couldn't allocate 1073152-bytes mapped space: Out of memory
W/tch.crowdsourc: Throwing OutOfMemoryError with VmSize  4191668 kB "pthread_create (1040KB stack) failed: Try again"
java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
        at java.lang.Thread.nativeCreate(Native Method)
        at java.lang.Thread.start(Thread.java:753)

Here’s a test:

    int i = 0;

    private void testCreatThread() {
        while (true) {
            Log.e("oom", "i..." + i++);
            // Each thread sleeps for 10 s, so threads pile up faster than they exit.
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        Thread.sleep(10000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }, "i..." + i).start();
        }
    }

Test device: Huawei Mate 20 Pro, HarmonyOS 2.0

Notes on the code and figures above:

  • 1. Figure 1 shows the per-process limits on the Huawei Mate 20 Pro (HarmonyOS 2.0).

  • 2. We created 2900+ threads and ended up running out of virtual memory, which is explained separately below.

The second step, clone, can fail because the number of threads exceeds the limit. The error message generally looks like the following; it is not demonstrated here:

W/libc: pthread_create failed: clone failed: Out of memory
W/art: Throwing OutOfMemoryError "pthread_create (1040KB stack) failed: Out of memory"
java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Out of memory
  at java.lang.Thread.nativeCreate(Native Method)
  at java.lang.Thread.start(Thread.java:1078)
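
The error messages quoted so far each point at a different failure site, so a crash report can be bucketed by its message text. The following is only a sketch: the class OomClassifier and its Cause enum are invented for this example, and the matching relies solely on the message fragments quoted in this article, which may differ on other Android versions.

```java
final class OomClassifier {
    enum Cause { HEAP_LIMIT, HEAP_FRAGMENTATION, JNI_ENV, THREAD_CREATE, UNKNOWN }

    /** Maps an OutOfMemoryError message to the failure site it most likely came from. */
    static Cause classify(String message) {
        if (message == null) {
            return Cause.UNKNOWN;
        }
        if (message.contains("Could not allocate JNI Env")) {
            return Cause.JNI_ENV;            // ashmem or mmap failed while creating the JNIEnv
        }
        if (message.contains("pthread_create") && message.contains("failed")) {
            return Cause.THREAD_CREATE;      // stack mmap or clone failed
        }
        if (message.startsWith("Failed to allocate a ")) {
            return message.contains("due to fragmentation")
                    ? Cause.HEAP_FRAGMENTATION
                    : Cause.HEAP_LIMIT;
        }
        return Cause.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify("pthread_create (1040KB stack) failed: Try again"));
        System.out.println(classify("Could not allocate JNI Env"));
    }
}
```

Grouping reports this way separates heap problems, which call for memory analysis, from thread and FD problems, which call for counting threads and descriptors.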

3. The number of FD exceeds the limit

Since Android is based on Linux, you can use /proc/pid/limits to check the limits that the system imposes on a given process. The FD limit is generally not reached unless the application is an extreme case or misuses FDs. The FD limits of a Huawei device running Android 7.0 are as follows:
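
To check the limit on your own device, the limits file can also be read from inside the process. The sketch below extracts the soft "Max open files" value from text in the /proc/pid/limits layout (limit name, soft limit, hard limit, units). The class ProcLimits and its method are hypothetical helpers, and the sample values in the usage note are made up, not measurements from any device.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class ProcLimits {
    /**
     * Returns the soft "Max open files" limit found in the given limits text,
     * Long.MAX_VALUE for "unlimited", or -1 if the row is missing or malformed.
     */
    static long softMaxOpenFiles(String limitsText) {
        final String key = "Max open files";
        for (String line : limitsText.split("\n")) {
            if (!line.startsWith(key)) {
                continue;
            }
            // Remaining columns: soft limit, hard limit, units.
            String[] cols = line.substring(key.length()).trim().split("\\s+");
            if ("unlimited".equals(cols[0])) {
                return Long.MAX_VALUE;
            }
            try {
                return Long.parseLong(cols[0]);
            } catch (NumberFormatException e) {
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        try {
            String text = new String(
                    Files.readAllBytes(Paths.get("/proc/self/limits")), StandardCharsets.UTF_8);
            System.out.println("soft Max open files = " + softMaxOpenFiles(text));
        } catch (IOException e) {
            System.out.println("cannot read /proc/self/limits: " + e);
        }
    }
}
```

For example, a row such as "Max open files  1024  4096  files" yields 1024, the soft limit that the process hits first.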

4. The number of threads exceeds the upper limit

As with FDs, the limit can be viewed through /proc/sys/kernel/threads-max. So far, OOMs caused by exceeding the thread limit have only been observed on Huawei phones, because the system there changed the limit on the number of threads in a single process. On other phones, OOM is more likely to come from something other than the thread limit, especially the virtual memory problems of 32-bit apps.

Creating threads in a loop until creation fails gives very different results on the following two phones:

Huawei Mate 20 Pro (HarmonyOS 2.0): 2900+ threads can be created.

Huawei device running Android 7.0: once the number of threads reaches 400+, the app is certain to OOM.
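
To see how close a process is to such a limit, one option is to log the live thread count alongside the kernel-wide value. This is a sketch under assumptions: the class ThreadBudget is invented here, Thread.getAllStackTraces() only counts threads known to the VM (not purely native ones), and /proc/sys/kernel/threads-max may not be readable by an app on every device, in which case the helper returns -1.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class ThreadBudget {
    /** Number of live threads known to the VM, including the calling thread. */
    static int liveJavaThreads() {
        return Thread.getAllStackTraces().size();
    }

    /** Reads a file that holds a single integer; returns -1 when it is missing or unreadable. */
    static long readLongOrMinusOne(String path) {
        try {
            String text = new String(
                    Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8).trim();
            return Long.parseLong(text);
        } catch (IOException | NumberFormatException | SecurityException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("live Java threads: " + liveJavaThreads());
        System.out.println("threads-max:       "
                + readLongOrMinusOne("/proc/sys/kernel/threads-max"));
    }
}
```

Note that threads-max is a system-wide value; a vendor-specific per-process cap like the one described for Huawei devices would not necessarily be visible through it.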

5. The virtual memory is insufficient

You probably already know about Android virtual memory; if not, check out this article, Android Memory. To summarize: when an application allocates memory, it gets virtual memory, and only when the block is actually written does a page fault occur and physical memory get allocated. The size of virtual memory is limited by the CPU architecture and the kernel.

The 32-bit CPU architecture (ARMv7) has a maximum address space of 4 GB. The kernel takes up part of the high addresses, so the maximum address space available to an application is 3 GB. In practice, many applications today are 32-bit apps running on 64-bit CPU architectures (ARM64). In that case the application gets the entire low 4 GB of address space to itself, while the kernel uses a separate 512 GB range of high addresses. This is why virtual memory shortages are concentrated in 32-bit applications.
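
A quick back-of-the-envelope calculation shows why thread stacks alone can exhaust these address spaces. The sketch below divides the address space by the per-thread mapping size. The 1073152-byte figure comes from the libc log line quoted earlier; treating the whole address space as available for stacks is a simplifying assumption, so the real ceiling is lower. The class name StackBudget is invented for this example.

```java
final class StackBudget {
    /** Upper bound on the number of thread stacks that fit in an address space. */
    static long maxStacks(long addressSpaceBytes, long bytesPerStack) {
        if (bytesPerStack <= 0) {
            throw new IllegalArgumentException("bytesPerStack must be positive");
        }
        return addressSpaceBytes / bytesPerStack;
    }

    public static void main(String[] args) {
        long perStack = 1_073_152L;              // mapped space per thread, from the libc log
        long gb = 1024L * 1024L * 1024L;
        System.out.println("3 GB: " + maxStacks(3 * gb, perStack));  // prints "3 GB: 3001"
        System.out.println("4 GB: " + maxStacks(4 * gb, perStack));  // prints "4 GB: 4002"
    }
}
```

Because the heap, loaded code, and other mappings share the same address space, a process runs out well before these bounds, which is consistent with thread creation failing at a few thousand threads.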

Here is how the following code runs out of virtual memory on a vivo Y66, a 32-bit device running Android 6.0.1:

    private void testCreatThread() {
        while (true) {
            Log.e("oom", "i..." + i++);
            try {
                if (i == 1801) {
                    Thread.sleep(200);
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        if (i == 1800) {
                            // Dump the contents of /proc/<pid>/status
                            Log.e("oom", "i..." + DeviceUtil.getProcData());
                        }
                        Thread.sleep(100000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }, "i..." + i).start();
        }
    }

Notes on the code and figures above:

  • 1. We keep creating threads. As shown in Figure 1, more than 1800 threads were created, so the thread-count limit here is not as strict as on Huawei devices.
  • 2. Figure 2 shows the virtual memory usage (the VmSize value): the crash is caused by insufficient virtual memory, with VmSize at about 3 GB. Also note that it shows up as a native crash, Fatal signal 11 (SIGSEGV), rather than a Java-layer OOM.
  • 3. The best way to solve virtual memory exhaustion is to upgrade to a 64-bit application.
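
The demo reads /proc/pid/status through DeviceUtil.getProcData(), whose implementation is not shown in this article. As a stand-in, the sketch below pulls the VmSize and Threads fields out of status-formatted text. ProcStatus and readField are hypothetical names, and the sample values in the usage note are illustrative rather than measured.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class ProcStatus {
    /**
     * Returns the number on a "Key:   12345 kB" style line of /proc/<pid>/status,
     * or -1 if the key is missing or its value is not numeric.
     */
    static long readField(String statusText, String key) {
        String prefix = key + ":";
        for (String line : statusText.split("\n")) {
            if (!line.startsWith(prefix)) {
                continue;
            }
            String[] cols = line.substring(prefix.length()).trim().split("\\s+");
            try {
                return Long.parseLong(cols[0]);
            } catch (NumberFormatException e) {
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        try {
            String text = new String(
                    Files.readAllBytes(Paths.get("/proc/self/status")), StandardCharsets.UTF_8);
            System.out.println("VmSize (kB): " + readField(text, "VmSize"));
            System.out.println("Threads:     " + readField(text, "Threads"));
        } catch (IOException e) {
            System.out.println("cannot read /proc/self/status: " + e);
        }
    }
}
```

On a 32-bit process, a VmSize approaching 3 GB or 4 GB (depending on the kernel) together with a high Threads value points to the address-space exhaustion described above.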

Reference links:

Probe: an online OOM problem-locating component for Android

Advanced Android performance tuning: the incredible OOM