OOM types


1.1 Java Heap Limit Exceeded + Not Enough Contiguous Memory

This is the traditional Java heap limit: the requested allocation would push the heap past Runtime.getRuntime().maxMemory().
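
A minimal sketch of that check, assuming the remaining headroom can be approximated from the three Runtime counters. The class and method names are hypothetical, and ART's own accounting (GetFreeMemoryUntilOOME) is more involved than this:

```java
// Sketch: estimate how far the Java heap is from its runtime limit.
final class HeapHeadroom {
    /** Bytes the heap can still grow by before reaching maxMemory. */
    static long freeUntilLimit(long maxMemory, long totalMemory, long freeMemory) {
        long used = totalMemory - freeMemory; // bytes currently occupied by objects
        return maxMemory - used;
    }

    /** True if a request of byteCount bytes cannot fit under the limit. */
    static boolean wouldExceedLimit(long byteCount, long maxMemory,
                                    long totalMemory, long freeMemory) {
        return byteCount > freeUntilLimit(maxMemory, totalMemory, freeMemory);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long headroom = freeUntilLimit(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
        System.out.println("max heap: " + rt.maxMemory() + " bytes, headroom: " + headroom + " bytes");
    }
}
```

Logging the headroom before a large allocation (a big Bitmap, for instance) gives a rough idea of whether the request is likely to hit this limit.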

The log is produced by the following ART function:

void Heap::ThrowOutOfMemoryError(Thread* self, size_t byte_count, AllocatorType allocator_type)

The error message it builds when throwing:

oss << "Failed to allocate a " << byte_count << " byte allocation with " << total_bytes_free << " free bytes and " << PrettySize(GetFreeMemoryUntilOOME()) << " until OOM";

If the process heap is heavily fragmented, so that there is no contiguous block large enough for the allocation, an additional message is appended to the log:

"failed due to fragmentation (required continguous free " << required_bytes << " bytes for a new buffer where largest contiguous free " << largest_continuous_free_pages << " bytes)";

The detailed code is in art/runtime/gc/allocator/rosalloc.cc.
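
To make the fragmentation case concrete, here is a toy model (not ART code). It assumes free space is tracked as a per-page in-use map, and all names are hypothetical. An allocation "fails due to fragmentation" when enough pages are free in total but no single run of free pages is long enough:

```java
// Toy model of heap fragmentation: true = page in use, false = page free.
final class FragmentationDemo {
    static int totalFreePages(boolean[] pageInUse) {
        int free = 0;
        for (boolean used : pageInUse) {
            if (!used) free++;
        }
        return free;
    }

    static int largestContiguousFreePages(boolean[] pageInUse) {
        int best = 0;
        int run = 0;
        for (boolean used : pageInUse) {
            run = used ? 0 : run + 1; // a used page ends the current free run
            best = Math.max(best, run);
        }
        return best;
    }

    /** Enough free pages overall, but no contiguous run of the required length. */
    static boolean failsDueToFragmentation(boolean[] pageInUse, int requiredPages) {
        return totalFreePages(pageInUse) >= requiredPages
                && largestContiguousFreePages(pageInUse) < requiredPages;
    }
}
```

With six free pages scattered in runs of 2, 1 and 3, a request for 4 contiguous pages fails even though the total free space would be sufficient.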

Cases

  • Android OOM Problem Analysis

This case covers an OOM caused by improper use of Handler.

  • Case study of Meituan OOM

An OOM caused by improper use of sCompatVectorFromResourcesEnabled, which creates multiple Resources instances and so defeats resource reuse.

1.2 The Number of Threads Exceeds the Upper Limit + Insufficient Virtual Memory

The thread count is over the limit when the number of threads in the process (the Threads field in /proc/pid/status) exceeds the maximum specified in /proc/sys/kernel/threads-max.

Possible scenarios include:

  • Unreasonable use of threads in the app, such as multiple OkHttpClient instances that do not share a thread pool.
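
The comparison between the Threads field and threads-max can be sketched as follows. The class and helper names are hypothetical. The parser works on the text of /proc/pid/status, and main assumes a Linux system where both files are readable by the current process:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Sketch: read the current thread count and the system-wide thread limit.
final class ThreadCountCheck {
    /** Extracts the "Threads:" value from /proc/<pid>/status text, or -1 if absent. */
    static int parseThreadCount(String statusText) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith("Threads:")) {
                return Integer.parseInt(line.substring("Threads:".length()).trim());
            }
        }
        return -1;
    }

    static String readFile(String path) throws IOException {
        return new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        int threads = parseThreadCount(readFile("/proc/self/status"));
        long max = Long.parseLong(readFile("/proc/sys/kernel/threads-max").trim());
        System.out.println("threads: " + threads + " / threads-max: " + max);
    }
}
```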

The log is produced by the following ART function:

void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon)

Error message when thrown:

"Could not allocate JNI Env"

or

StringPrintf("pthread_create (%s stack) failed: %s", PrettySize(stack_size).c_str(), strerror(pthread_create_result));

Creating a thread takes two steps:

  1. Call mmap to allocate the stack memory. The mmap flags include MAP_ANONYMOUS, which is a common way to allocate large blocks of memory on Linux. Only virtual memory is reserved at this point; physical memory for the pages is not allocated immediately. Instead, when a page is first needed, the kernel raises a page fault and the fault handler allocates the physical memory.
  2. Call clone to create the thread.

The first failure occurs in step 1, when the process is running out of virtual memory:

W/libc: pthread_create failed: couldn't allocate 1073152-bytes mapped space: Out of memory

W/tch.crowdsourc: Throwing OutOfMemoryError with VmSize 4191668 kB "pthread_create (1040KB stack) failed: Try again"

java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again

at java.lang.Thread.nativeCreate(Native Method)

at java.lang.Thread.start(Thread.java:753)

The second failure occurs in step 2, when the number of threads exceeds the limit and clone fails:

W/libc: pthread_create failed: clone failed: Out of memory

W/art: Throwing OutOfMemoryError "pthread_create (1040KB stack) failed: Out of memory"

java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Out of memory

at java.lang.Thread.nativeCreate(Native Method)

at java.lang.Thread.start(Thread.java:1078)
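
The messages quoted so far can be told apart mechanically, which helps when bucketing crash reports. The sketch below is a hypothetical helper that only encodes the mapping given in this article (heap message, "Try again", "Out of memory"). Message text may differ across Android versions and vendors, so treat the result as a hint:

```java
// Sketch: map an OutOfMemoryError message to the cause described in this article.
final class OomClassifier {
    enum Cause { JAVA_HEAP, VIRTUAL_MEMORY, THREAD_LIMIT, UNKNOWN }

    static Cause classify(String message) {
        if (message == null) {
            return Cause.UNKNOWN;
        }
        if (message.startsWith("Failed to allocate a ")) {
            return Cause.JAVA_HEAP; // thrown by Heap::ThrowOutOfMemoryError
        }
        if (message.startsWith("pthread_create")) {
            if (message.endsWith("failed: Try again")) {
                return Cause.VIRTUAL_MEMORY; // stack mmap failed
            }
            if (message.endsWith("failed: Out of memory")) {
                return Cause.THREAD_LIMIT; // clone failed
            }
        }
        return Cause.UNKNOWN;
    }
}
```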

Cases

  • The incredible OOM

This case walks through the analysis and verification of an OOM that occurs on some Huawei phones once the number of threads exceeds 500.

1.3 Number of File Descriptors (FD) Exceeds the Upper Limit

The FD count is over the limit when the number of entries in /proc/pid/fd exceeds the limit set in /proc/pid/limits. Possible scenarios include a surge of socket FD requests in a short period and a large number of files being opened repeatedly.
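
A small sketch of reading that limit, assuming the usual layout of /proc/pid/limits, where the "Max open files" row lists the soft limit followed by the hard limit. The class name, method names and the warning ratio are hypothetical:

```java
// Sketch: extract the soft FD limit from /proc/<pid>/limits text.
final class FdLimitCheck {
    /** Returns the soft "Max open files" limit, Long.MAX_VALUE if unlimited, -1 if absent. */
    static long parseSoftFdLimit(String limitsText) {
        for (String line : limitsText.split("\n")) {
            if (line.startsWith("Max open files")) {
                // Tokens: "Max", "open", "files", <soft>, <hard>, "files"
                String[] tokens = line.trim().split("\\s+");
                String soft = tokens[3];
                return "unlimited".equals(soft) ? Long.MAX_VALUE : Long.parseLong(soft);
            }
        }
        return -1;
    }

    /** True once the open FD count reaches the given fraction of the soft limit. */
    static boolean isNearLimit(int openFds, long softLimit, double ratio) {
        return softLimit > 0 && openFds >= softLimit * ratio;
    }
}
```

Comparing the number of entries in /proc/pid/fd against this value gives an early warning well before the hard failure.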

How to locate an OOM

First, we need the steps that reproduce the OOM. If it was found through random testing, we still need to work out a reliable set of reproduction steps. Capture an .hprof dump before the operation and another one after it. If memory grows slowly, repeat the operation 3 or 4 times. Then open each dump in MAT's Histogram view and compare how the Retained Size of each entry in the object list changes.

  • ShallowSize: The memory occupied by the object itself, not including the objects it references.
  • RetainedSize: The object's own ShallowSize plus the ShallowSize of every object it dominates (objects reachable only through it, directly or indirectly), i.e. the total memory that can be reclaimed once the object is garbage collected. For example, in the figure above, D's RetainedSize is the sum of the ShallowSize of D, H and I.
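
The D, H, I example can be written out as a short computation. This is only a sketch over a hand-built dominator tree. The Node type is hypothetical and the byte sizes are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: retained size = own shallow size + shallow sizes of all dominated objects.
final class RetainedSizeDemo {
    static final class Node {
        final String name;
        final long shallowSize;
        final List<Node> dominated = new ArrayList<Node>();

        Node(String name, long shallowSize) {
            this.name = name;
            this.shallowSize = shallowSize;
        }

        Node dominate(Node child) {
            dominated.add(child);
            return this;
        }
    }

    static long retainedSize(Node node) {
        long total = node.shallowSize;
        for (Node child : node.dominated) {
            total += retainedSize(child); // children are freed together with the parent
        }
        return total;
    }

    public static void main(String[] args) {
        Node d = new Node("D", 16).dominate(new Node("H", 24)).dominate(new Node("I", 8));
        System.out.println("retained(D) = " + retainedSize(d)); // 16 + 24 + 8
    }
}
```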

Objects with a larger RetainedSize have a greater impact on memory, so instances with a large RetainedSize are the most likely causes of an OOM.

Note: a large object != a leaked object. An object at the top of the list may simply be a normal heavy memory user; the one to look at is the object that suddenly jumps up the list.
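
One way to spot the entries that "jump up the list" is to diff the two histograms by class. This sketch assumes the per-class retained sizes have already been exported from MAT into maps. The class name and the figures in the usage example are made up:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: rank classes by how much their retained size grew between two dumps.
final class HistogramDiff {
    static List<Map.Entry<String, Long>> growthByClass(Map<String, Long> before,
                                                       Map<String, Long> after) {
        List<Map.Entry<String, Long>> growth = new ArrayList<Map.Entry<String, Long>>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long old = before.containsKey(e.getKey()) ? before.get(e.getKey()) : 0L;
            growth.add(new AbstractMap.SimpleEntry<String, Long>(e.getKey(), e.getValue() - old));
        }
        // Largest growth first; absolute size is deliberately ignored.
        growth.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return growth;
    }

    public static void main(String[] args) {
        Map<String, Long> before = new HashMap<String, Long>();
        before.put("byte[]", 5000L);
        before.put("android.graphics.Bitmap", 1000L);
        Map<String, Long> after = new HashMap<String, Long>();
        after.put("byte[]", 5200L);
        after.put("android.graphics.Bitmap", 9000L);
        System.out.println(growthByClass(before, after));
    }
}
```

In the usage example byte[] is the larger entry in the first dump, but Bitmap is the one that grew, so it is ranked first.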

OOM monitoring measures

3.1 Thread / FD overload

This can be monitored using Linux’s inotify mechanism:

  • Watch /proc/pid/fd to monitor the files the app opens.
  • Watch /proc/pid/task to monitor thread usage.
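
As a simpler stand-in for the inotify approach (this sketch polls instead of using inotify), the entries under the two directories can be counted periodically. The class name, thresholds and polling period below are hypothetical:

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: poll a /proc directory and report when its entry count passes a threshold.
final class ProcDirPoller {
    /** Number of entries in a directory, or -1 if it cannot be listed. */
    static int countEntries(File dir) {
        String[] names = dir.list();
        return names == null ? -1 : names.length;
    }

    static ScheduledExecutorService start(final File dir, final int threshold,
                                          final Runnable onExceeded, long periodMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                if (countEntries(dir) > threshold) {
                    onExceeded.run();
                }
            }
        }, 0, periodMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        // Thresholds are illustrative; pick values below the device's real limits.
        ScheduledExecutorService fds = start(new File("/proc/self/fd"), 800,
                () -> System.out.println("fd count high"), 1000);
        ScheduledExecutorService threads = start(new File("/proc/self/task"), 400,
                () -> System.out.println("thread count high"), 1000);
        Thread.sleep(3000);
        fds.shutdownNow();
        threads.shutdownNow();
    }
}
```

Polling costs one extra thread per watched directory, so in a real app both checks would more likely share a single scheduler.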

3.2 Overflow of heap memory

Start a separate thread as a fallback (safety net), as shown in the figure:
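
Since the figure is not reproduced here, the following is only a guess at what such a fallback thread might look like: a watchdog that samples heap usage and invokes a callback when usage passes a threshold. The class name, threshold and callback are assumptions, not the original design:

```java
// Sketch: a watchdog thread that reports when heap usage passes a threshold.
final class HeapWatchdog implements Runnable {
    private final double threshold;   // e.g. 0.85 means 85% of max heap
    private final long periodMs;      // sampling interval
    private final Runnable onPressure; // e.g. trim caches or dump an .hprof

    HeapWatchdog(double threshold, long periodMs, Runnable onPressure) {
        this.threshold = threshold;
        this.periodMs = periodMs;
        this.onPressure = onPressure;
    }

    static double usedRatio(long maxMemory, long totalMemory, long freeMemory) {
        return (double) (totalMemory - freeMemory) / maxMemory;
    }

    @Override
    public void run() {
        Runtime rt = Runtime.getRuntime();
        while (!Thread.currentThread().isInterrupted()) {
            if (usedRatio(rt.maxMemory(), rt.totalMemory(), rt.freeMemory()) > threshold) {
                onPressure.run();
            }
            try {
                Thread.sleep(periodMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag so the loop exits
            }
        }
    }
}
```

It would be started once, for example with new Thread(new HeapWatchdog(0.85, 5000, callback)).start(), and stopped by interrupting that thread.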


3.3 Global: Probe

Probe is an OOM detection tool from Meituan. It can effectively locate OOM problems in production, such as insufficient Java heap memory, FD leaks, and thread overflow. It is worth a look if you are interested.

How to avoid OOM (Memory optimization experience)

  • Use more lightweight data structures

Replace traditional data structures such as HashMap with ArrayMap / SparseArray. ArrayMap is a container written specifically for Android and, in most cases, takes up less memory than HashMap. SparseArray is more efficient because it uses primitive int keys and so avoids auto-boxing them (variants such as SparseIntArray also avoid boxing the values), along with the unboxing that follows.
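
To show why int keys save memory, here is a stripped-down illustration of the idea: sorted primitive keys in an int[] with binary search, so no Integer objects are created. It is not the android.util.SparseArray source, and the class name is hypothetical:

```java
import java.util.Arrays;

// Sketch: an int-keyed map backed by parallel arrays, with no boxed keys.
final class IntKeyMap<V> {
    private int[] keys = new int[8];      // kept sorted in ascending order
    private Object[] values = new Object[8];
    private int size;

    @SuppressWarnings("unchecked")
    V get(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return i >= 0 ? (V) values[i] : null;
    }

    void put(int key, V value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {
            values[i] = value; // key already present: overwrite
            return;
        }
        int insertAt = -(i + 1); // binarySearch returns -(insertion point) - 1
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        // Shift the tail right by one slot to keep the keys sorted.
        System.arraycopy(keys, insertAt, keys, insertAt + 1, size - insertAt);
        System.arraycopy(values, insertAt, values, insertAt + 1, size - insertAt);
        keys[insertAt] = key;
        values[insertAt] = value;
        size++;
    }

    int size() {
        return size;
    }
}
```

The trade-off is that inserts shift array elements, so this layout suits small to medium maps rather than very large ones.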

  • Avoid using enums in Android
  • Reduce the memory footprint of Bitmap objects
  • Use smaller images
  • Avoid creating objects in the onDraw method
  • Watch for object leaks in cache containers

If the container is static or global, remove objects from it once they are no longer needed.
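
A minimal sketch of that rule, using a hypothetical static registry: whatever is put into a static container stays reachable for the life of the process unless it is explicitly removed, so every register call needs a matching unregister:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: a static container must offer, and callers must use, a removal path.
final class ListenerRegistry {
    // Lives as long as the class is loaded, and so does everything it references.
    private static final Map<String, Object> LISTENERS = new ConcurrentHashMap<String, Object>();

    static void register(String id, Object listener) {
        LISTENERS.put(id, listener);
    }

    /** Call when the owner goes away (e.g. from onDestroy) so the entry can be collected. */
    static void unregister(String id) {
        LISTENERS.remove(id);
    }

    static int size() {
        return LISTENERS.size();
    }
}
```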

  • Check for memory leaks, including common Context leaks, singleton leaks, EditText TextWatcher leaks, and so on, then find and fix them
  • In the Activity's onDestroy, traverse the View tree and clear backgrounds, Drawables, EditText TextWatchers, and so on
  • Fresco optimization: when an RN Activity that loads images with Fresco is destroyed, Fresco's default cache is cleared, but the GIF cache is not

References

  • How to analyze Android OOM, and Java static code analysis tools
  • Android: Eliminating OOM in practice (1.5 per thousand -> 0.2 per thousand)
  • Android thread creation source code and OOM analysis
  • Probe: a component for locating OOM problems in production Android apps