Preface

I recently ran into an FD-related crash in a project. The log showed the following:

2021-12-13 14:33:47.302   878  1017 F libc    : FORTIFY: FD_SET: file descriptor >= FD_SETSIZE
2021-12-13 14:33:47.302   878  1017 F libc    : Fatal signal 6 (SIGABRT), code -6 in tid 1017 (pool-2-thread-1)

After some struggle I finally solved the problem, and then I dug into this topic, which I had never dealt with before. FD problems turn out to be important and common, yet hard to spot.

What is an FD

A file descriptor (FD) is a non-negative integer. It is an index into the table of open files that the kernel maintains for each process. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process. In Linux, all devices are treated as files, so file descriptors provide a unified interface for device-related programming.

An FD acts as a handle to an open resource: an open file, an open socket, a pipe, another resource such as a memory block, or the standard input/output streams (in/out/error).

You can run ls -l /proc/$pid/fd to view the file descriptors currently used by a process.

root@generic_x86:/ # ls -l /proc/2479/fd
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 0 -> /dev/null
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 1 -> /dev/null
l-wx------ u0_a55   u0_a55            2022-01-21 15:42 10 -> /dev/cpuctl/tasks
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 11 -> anon_inode:[eventfd]
l-wx------ u0_a55   u0_a55            2022-01-21 15:42 12 -> /dev/cpuctl/bg_non_interactive/tasks
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 13 -> anon_inode:[eventpoll]
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 14 -> socket:[10778]
lr-x------ u0_a55   u0_a55            2022-01-21 15:42 15 -> pipe:[10779]
l-wx------ u0_a55   u0_a55            2022-01-21 15:42 16 -> pipe:[10779]
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 17 -> socket:[10783]
lr-x------ u0_a55   u0_a55            2022-01-21 15:42 18 -> /data/app/com.example.kotlintest-1/base.apk
lrwx------ u0_a55   u0_a55            2022-01-21 15:20 19 -> anon_inode:[eventfd]
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 2 -> /dev/null
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 20 -> socket:[9794]
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 21 -> anon_inode:[eventpoll]
lrwx------ u0_a55   u0_a55            2022-01-21 15:20 22 -> /dev/goldfish_pipe
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 23 -> socket:[10790]
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 24 -> /dev/goldfish_pipe
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 25 -> /dev/goldfish_pipe
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 26 -> socket:[10795]
lrwx------ u0_a55   u0_a55            2022-01-21 15:42 27 -> /dev/goldfish_sync

In the output above, the number before the arrow is the FD value and the part after the arrow is what the FD refers to, for example 20 -> socket:[9794].

The common FD types are listed below.

FD type                   Description
socket                    Related to network requests
anon_inode:[eventpoll]    HandlerThread, or Thread + Looper
anon_inode:[eventfd]      HandlerThread, or Thread + Looper
anon_inode:[timerfd]      System-level descriptor type, usually unrelated to application code
anon_inode:[dmabuf]       Grows noticeably when InputChannel leaks
/vendor/                  Generally used by the system
/dev/ashmem               Related to database operations
pipe:                     Generally used by the system
/sys/                     Generally used by the system
/data/data/               Related to files opened by the app
/data/app/                Related to files opened by the app
/storage/emulated/0/      Related to files opened by the app

Android limits the number of file descriptors that each process can open. You can run ulimit -n to check the limit. The traditional Linux/Android default is 1024; most newer Android devices use a larger value.

root@generic_x86:/ # ulimit -n
1024
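
The same limit can also be read from inside the process. The following is a minimal sketch, assuming a Linux-style /proc/self/limits file that the process is allowed to read; the function names parseMaxOpenFiles and readSoftFdLimit are illustrative and do not come from the original project.

```kotlin
import java.io.File

// Parses the soft limit from the "Max open files" row of /proc/<pid>/limits text.
// Returns null if the row is missing or the value is not a number (e.g. "unlimited").
fun parseMaxOpenFiles(limitsText: String): Int? =
    limitsText.lineSequence()
        .firstOrNull { it.startsWith("Max open files") }
        ?.removePrefix("Max open files")
        ?.trim()
        ?.split(Regex("\\s+"))
        ?.firstOrNull()
        ?.toIntOrNull()

// Assumption: /proc/self/limits exists and is readable by the current process.
fun readSoftFdLimit(): Int? =
    runCatching { File("/proc/self/limits").readText() }
        .getOrNull()
        ?.let(::parseMaxOpenFiles)
```

Keeping the parsing separate from the file access makes it possible to check the parser against a captured limits file without a device.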

FD leak

Unlike traditional memory leaks, FD leaks usually do not exhaust memory, so the problem is more subtle when it occurs. Because memory is not under pressure while FDs leak, the GC is not triggered, and the process can end up recovering only by crashing. In many cases, even a GC run cannot reclaim file handles that have already been created.

The following Java-layer error messages suggest an FD leak:

  1. “Too many open files”
  2. “Could not allocate JNI Env”
  3. “Could not allocate dup blob fd”
  4. “Could not read input channel file descriptors from parcel”
  5. “pthread_create *”
  6. “InputChannel is not initialized”
  7. “Could not open input channel pair”
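
In a crash handler, these messages can be used as a rough signal for tagging a report as a possible FD leak. The sketch below only does substring matching along the cause chain; fdLeakHints and looksLikeFdLeak are hypothetical names, and a match is a hint, not proof.

```kotlin
// Message fragments taken from the list above.
val fdLeakHints = listOf(
    "Too many open files",
    "Could not allocate JNI Env",
    "Could not allocate dup blob fd",
    "Could not read input channel file descriptors from parcel",
    "pthread_create",
    "InputChannel is not initialized",
    "Could not open input channel pair"
)

// Walks the exception and its causes and checks each message for a known fragment.
fun looksLikeFdLeak(t: Throwable): Boolean =
    generateSequence(t) { it.cause }
        .mapNotNull { it.message }
        .any { msg -> fdLeakHints.any { hint -> msg.contains(hint) } }
```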

FD leak scenarios

Input and output

Input and output streams are used frequently in any program. If streams such as FileInputStream, FileOutputStream, FileReader and FileWriter are created over and over but not closed in time, they can leak memory and also exhaust FDs. Each time a FileInputStream or FileOutputStream is created, the process gets a new FD pointing to the opened file. If the following code runs repeatedly, the number of FDs keeps growing until it exceeds 1024.

val file = File(cacheDir, "testFdFile")
file.createNewFile()
val out = FileOutputStream(file)

Run ls -l in /proc/${process id}/fd/ to see that the newly added FDs point to the created file. The files do not need to be different: even for the same file, each open stream gets its own FD.

l-wx------ u0_a55   u0_a55   2022-01-24 11:26 30 -> /data/data/com.example.kotlintest/cache/testFdFile
l-wx------ u0_a55   u0_a55   2022-01-24 11:26 31 -> /data/data/com.example.kotlintest/cache/testFdFile
l-wx------ u0_a55   u0_a55   2022-01-24 11:26 32 -> /data/data/com.example.kotlintest/cache/testFdFile
l-wx------ u0_a55   u0_a55   2022-01-24 11:26 33 -> /data/data/com.example.kotlintest/cache/testFdFile
l-wx------ u0_a55   u0_a55   2022-01-24 11:26 34 -> /data/data/com.example.kotlintest/cache/testFdFile
l-wx------ u0_a55   u0_a55   2022-01-24 11:26 35 -> /data/data/com.example.kotlintest/cache/testFdFile
l-wx------ u0_a55   u0_a55   2022-01-24 11:26 38 -> /data/data/com.example.kotlintest/cache/testFdFile

The right approach is to close the stream in a finally block, so that it is closed whether or not an exception interrupts the code along the way.

out.close()
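
In Kotlin, the same guarantee is usually written with the standard library's use extension instead of an explicit finally block. A minimal sketch; writeAndClose and its parameters are illustrative names:

```kotlin
import java.io.File
import java.io.FileOutputStream

// Writes text to a file. use {} closes the stream, and with it the FD,
// whether the block returns normally or throws.
fun writeAndClose(file: File, text: String) {
    FileOutputStream(file).use { out ->
        out.write(text.toByteArray())
    }
}
```

Because the close call is tied to the scope of the block, an early return or an exception inside it cannot skip it.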

Thread, HandlerThread

Be particularly careful when using threads on Android, HandlerThread in particular. Make sure the function that creates the HandlerThread is not called repeatedly, otherwise a new thread is created on every call.

//1.HandlerThread
val handlerThread = HandlerThread("test")
handlerThread.start()

//2.Thread+Looper
Thread {
    Looper.prepare()
    Looper.loop()
}.start()

Initializing a Looper with Looper.prepare() requires FD resources: a single HandlerThread consumes a pair of FDs (an eventfd and an epoll fd), which are used for communication between threads.

When the thread's Looper is no longer needed, call handlerThread.quitSafely() or handlerThread.quit() to destroy it and release the handle resources, as follows:

// 1
handlerThread.quitSafely()

// 2
Looper.myLooper()?.quit()
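
One way to keep a HandlerThread from being created repeatedly is to let a single owner object create it lazily and quit it explicitly. The sketch below assumes the Android SDK; BackgroundWorker and the thread name are hypothetical, and this is one possible structure, not the fix used in the original project.

```kotlin
import android.os.Handler
import android.os.HandlerThread

// Hypothetical holder that owns at most one HandlerThread at a time.
class BackgroundWorker {
    private var thread: HandlerThread? = null
    private var handler: Handler? = null

    // Starts the thread on first use; later calls reuse it instead of creating a new one.
    @Synchronized
    fun post(task: Runnable) {
        val h = handler ?: run {
            val t = HandlerThread("background-worker").also { it.start() }
            thread = t
            Handler(t.looper).also { handler = it }
        }
        h.post(task)
    }

    // Quits the Looper so that the FDs it holds can be released.
    @Synchronized
    fun shutdown() {
        thread?.quitSafely()
        thread = null
        handler = null
    }
}
```

Calling shutdown() from the owner's teardown path (for example onDestroy) pairs every started thread with exactly one quitSafely() call.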

Cursor

In day-to-day development, if SQLite is used to manage local data, a query cursor must also be closed with close() once you are done with it; otherwise memory and file descriptors may leak.

db = ordersDBHelper.getReadableDatabase();
Cursor cursor = db.query(...);
while (cursor.moveToNext()) {
  // ...
}
if (flag) {
   // Returns early for some reason
   return;
}
// If close() is not called, the FD leaks
cursor.close();
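
The early return in the snippet above is exactly the kind of path that skips close(). A hedged Kotlin sketch of the same flow using use, assuming the Android SDK; the function name readOrders and the table name "orders" are hypothetical:

```kotlin
import android.database.sqlite.SQLiteDatabase

// Same flow as above, but the cursor is closed on every exit path.
fun readOrders(db: SQLiteDatabase, flag: Boolean) {
    db.query("orders", null, null, null, null, null, null).use { cursor ->
        while (cursor.moveToNext()) {
            // ... read columns ...
        }
        if (flag) {
            return // early return: use {} still closes the cursor
        }
        // ... further work that needs the cursor ...
    }
}
```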

InputChannel

Repeatedly adding views through WindowManager.addView also makes the file descriptor count grow; calling removeView releases the FDs created earlier. Showing an AlertDialog creates a window, and with it FDs. If AlertDialogs keep being created without being dismissed, FDs leak, as follows:

for (index in 1 until 1024) {
    AlertDialog.Builder(this).show()
}

Running this loop produces the following crash:

E/AndroidRuntime: FATAL EXCEPTION: main
    Process: com.example.kotlintest, PID: 4333
    java.lang.RuntimeException: Could not read input channel file descriptors from parcel.
        at android.view.InputChannel.nativeReadFromParcel(Native Method)
        at android.view.InputChannel.readFromParcel(InputChannel.java:148)
        at android.view.IWindowSession$Stub$Proxy.addToDisplay(IWindowSession.java:759)
        at android.view.ViewRootImpl.setView(ViewRootImpl.java:531)
        at android.view.WindowManagerGlobal.addView(WindowManagerGlobal.java:310)
        at android.view.WindowManagerImpl.addView(WindowManagerImpl.java:85)
        at android.app.Dialog.show(Dialog.java:319)
        at android.support.v7.app.AlertDialog$Builder.show(AlertDialog.java:1007)
        at com.example.kotlintest.FDActivity$onCreate$4.onClick(FDActivity.kt:36)
        at android.view.View.performClick(View.java:5198)
        at android.view.View$PerformClick.run(View.java:21147)
        at android.os.Handler.handleCallback(Handler.java:739)
        at android.os.Handler.dispatchMessage(Handler.java:95)
        at android.os.Looper.loop(Looper.java:148)
        at android.app.ActivityThread.main(ActivityThread.java:5417)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)

Not only did the demo app crash: system_server also crashed abnormally and the phone rebooted. This shows how serious an FD leak can be, and how an app-level fault can affect the stability of system_server.

InputChannel also needs FD resources. An application's input events are managed by WindowManagerService (WMS). WMS creates an InputManager internally, and communication between the two goes through InputChannels. WMS registers a pair of InputChannels: the server-side InputChannel is registered with the InputManager (in system_server), and the client-side one is registered in the application's main thread. InputChannel uses Ashmem (anonymous shared memory) to pass data, referenced by file descriptors, one FD for reading and one for writing. When a new Task is created, both the server (system_server) and the client (app) create FDs. addWindow initializes the InputChannel so that the window can communicate with InputManagerService across processes and receive input events, which essentially means initializing a pair of socket files for communication.

Because the socket pair is created for inter-process communication, one file descriptor is created in the server process and one in the client process. In addition, too many file descriptors in a system process can in theory crash the system.

How to solve FD leak problems

StrictMode

Use the StrictMode framework to locate the code that holds on to FDs; search the log for the StrictMode tag to find the faulty code.

StrictMode.setThreadPolicy(new StrictMode.ThreadPolicy.Builder()
    .detectDiskReads()
    .detectDiskWrites()
    .detectNetwork() // or .detectAll() for all detectable problems
    .penaltyLog()
    .build());
StrictMode.setVmPolicy(new StrictMode.VmPolicy.Builder()
    .detectLeakedSqlLiteObjects()
    .detectLeakedClosableObjects()
    .penaltyLog()
    .penaltyDeath()
    .build());

However, StrictMode does not catch every problem. I have seen cases where the problem persisted even with StrictMode enabled, so other methods are needed.

Print the current FD information

If the FD leak can be reproduced, reproduce it first and then run ls -la /proc/$pid/fd to check the process's file descriptor usage. The file descriptors of a typical Android app fall into a few categories; seeing which category is abnormally large narrows down the problem.

FD type                                        What to check
socket                                         Network requests
anon_inode                                     HandlerThread, Thread + Looper, InputChannel
/dev/ashmem                                    Database operations
/data/data/ /data/app/ /storage/emulated/0/    Code that opens the corresponding files
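
The comparison by category can be automated once the link targets have been collected. A minimal sketch, assuming the targets are already available as strings (for example from the ls output); the category names and the functions fdCategory and countByCategory are illustrative choices:

```kotlin
// Maps a /proc/<pid>/fd link target to a coarse category from the table above.
fun fdCategory(linkTarget: String): String = when {
    linkTarget.startsWith("socket:") -> "socket"
    linkTarget.startsWith("anon_inode:") -> "anon_inode"
    linkTarget.startsWith("pipe:") -> "pipe"
    linkTarget.startsWith("/dev/ashmem") -> "ashmem"
    linkTarget.startsWith("/data/") || linkTarget.startsWith("/storage/") -> "file"
    else -> "other"
}

// Counts descriptors per category, largest group first.
fun countByCategory(linkTargets: List<String>): Map<String, Int> =
    linkTargets.groupingBy(::fdCategory).eachCount()
        .entries.sortedByDescending { it.value }
        .associate { it.key to it.value }
```

The category at the top of the result is the first place to look.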

Dump system information

Run dumpsys window to check whether abnormal windows exist; this helps with InputChannel-related leaks. In the output below there are many Window{... u0 com.example.kotlintest/com.example.kotlintest.FDActivity} entries, so you can start looking for the fault in that Activity.

D:\>adb shell dumpsys window

Window #38 Window{6541819 u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{3c2a817 com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #37 Window{20c9363 u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{3301496 com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #36 Window{c67271d u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{5bb77b1 com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #35 Window{5a1a1c7 u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{3365a58 com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #34 Window{9508de1 u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{f70133b com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #33 Window{af9d1eb u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{4f965ca com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #32 Window{4465065 u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{e473d35 com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #31 Window{f3287cf u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{50f736c com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #30 Window{66632a9 u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{b90541f com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #29 Window{f9ae773 u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:
  mOwnerUid=10055 mShowToOwnerOnly=true package=com.example.kotlintest appop=NONE
  WindowStateAnimator{6084bbe com.example.kotlintest/com.example.kotlintest.FDActivity}:
Window #28 Window{7b9b8ad u0 com.example.kotlintest/com.example.kotlintest.FDActivity}:

Online monitoring

If the problem cannot be reproduced locally, you can add online monitoring code that periodically polls the number of FDs used by the current process. When the number reaches a threshold, read the current FD information and upload it to the backend for analysis. The code for obtaining the FD information is as follows.

val fdFile = File("/proc/" + android.os.Process.myPid() + "/fd/")
val files = fdFile.listFiles() // One entry per open FD of this process
val length = files?.size // The number of FDs in the process
Log.d(TAG, "listFd = " + android.os.Process.myPid() + "=" + length)
// Log each FD together with the file it points to
files?.forEach { file ->
    try {
        val linkTarget = Os.readlink(file.absolutePath)
        Log.d(TAG, "$file ====> $linkTarget")
    } catch (e: Exception) {
        Log.d(TAG, "$file ====> error")
    }
}
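
The periodic polling itself could look roughly like the sketch below. It uses a plain JVM scheduled executor and /proc/self/fd; countOpenFds, startFdMonitor, the threshold and the callback are all hypothetical, and the upload step is left to the caller.

```kotlin
import java.io.File
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledExecutorService
import java.util.concurrent.TimeUnit

// Number of open FDs of the current process, or -1 if /proc is not readable.
fun countOpenFds(): Int = File("/proc/self/fd").listFiles()?.size ?: -1

// Polls the FD count and invokes onExceeded whenever it is at or above the threshold.
// periodSeconds must be greater than 0. The caller should shut down the returned
// executor when monitoring is no longer needed.
fun startFdMonitor(
    threshold: Int,
    periodSeconds: Long,
    onExceeded: (Int) -> Unit
): ScheduledExecutorService {
    val executor = Executors.newSingleThreadScheduledExecutor()
    executor.scheduleAtFixedRate({
        val count = countOpenFds()
        if (count >= threshold) onExceeded(count)
    }, 0, periodSeconds, TimeUnit.SECONDS)
    return executor
}
```

A call such as startFdMonitor(800, 60) { count -> ... } would check once a minute; the callback is where the FD list would be collected and uploaded. If the callback throws, the executor suppresses later runs, so it should catch its own exceptions.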

Check the logs that are periodically printed

Check whether certain messages show up frequently in logcat, for example failed to create socket.
