Photo by Yang Shuo on Unsplash

Some time ago, in the implementation of “quick open folder”, I did not think to use the mass scheme (File#listFiles List of all files into List

for RecyclerView use, Info encapsulates the path, file size, modification time and other information), and easily into the pit 😓. It took more than 2 minutes until I tested the super large folder (50000 files), which was far from fast. After a period of time, I found out the reason and wrote this article.

Source code analysis

File#listFilesAnalysis of the

The execution logic was analyzed and no significant time consuming operations were found, so the list method was further analyzed.

  • File#listFilescalllistScan out all file names
  • Then create the corresponding File object and save it.
public File[] listFiles() {
    String[] ss = list();
    if (ss == null) return null;
    int n = ss.length;
    File[] fs = new File[n];
    for (int i = 0; i < n; i++) {
        fs[i] = new File(ss[i], this);
    }
    return fs;
}
Copy the code

File#listAnalysis of the

Java layer logic

Android 7.0 is a watershed, before 7.0 directly calling listImpl() methods, into the native layer; Instead of calling listImpl() directly, a subclass of FileSystem performs the list operation into the native layer.

public String[] list() {
    return listImpl(path);
}
private static native String[] listImpl(String path);
--------------------------------------------------7.0--------------------------------------------------
// The source code is in File
public String[] list() {
    SecurityManager security = System.getSecurityManager();
    if(security ! =null) {
        security.checkRead(path);
    }
    if (isInvalid()) {
        return null;
    }
    return fs.list(this);
}

// the source code is in UnixFileSystem#list
public String[] list(File f) {
    BlockGuard.getThreadPolicy().onReadFromDisk();
    BlockGuard.getVmPolicy().onPathAccess(f.getPath());
    return list0(f);
}
private native String[] list0(File f);
Copy the code

Native layer analysis

ListIml (), located in the/libcore/luni/SRC/main/native/java_io_File CPP (jump), realize the logic is as follows:

  • Constantly callingreaddir(Older versions use readdir_r,See here for reasons) reads the directory until it reaches the end of the directory, which returns null and stops.
  • std::vector<std::string> DirEntriesTo save the file name and expand to the data structurevector. (The time is mainly due to data copy during vector expansion.)
  • Finally, convert the string in the vector to string [] and return it.
// Iterates over the filenames in the given directory.
class ScopedReaddir {
 public:
  ScopedReaddir(const char* path) {
    mDirStream = opendir(path);
    mIsBad = (mDirStream == NULL);
  }
  ~ScopedReaddir() {
    if(mDirStream ! =NULL) { closedir(mDirStream); }}// Returns the next filename, or NULL.
  const char* next(a) {
    if (mIsBad) {
      return NULL;
    }
    errno = 0;
    dirent* result = readdir(mDirStream);
    if(result ! =NULL) {
      return result->d_name;
    }
    if(errno ! =0) {
      mIsBad = true;
    }
    return NULL;
  }
  // Has an error occurred on this stream?
  bool isBad(a) const {
    return mIsBad;
  }
 private:
  DIR* mDirStream;
  bool mIsBad;
  // Disallow copy and assignment.
  ScopedReaddir(const ScopedReaddir&);
  void operator= (const ScopedReaddir&);
};
typedef std: :vector<std: :string> DirEntries;
// Reads the directory referred to by 'pathBytes', adding each directory entry
// to 'entries'.
static bool readDirectory(JNIEnv* env, jstring javaPath, DirEntries& entries) {
  ScopedUtfChars path(env, javaPath);
  if (path.c_str() == NULL) {
    return false;
  }
  ScopedReaddir dir(path.c_str());
  const char* filename;
  while((filename = dir.next()) ! =NULL) {
    if (strcmp(filename, ".") != 0 && strcmp(filename, "..") != 0) {
      // TODO: this hides allocation failures from us. Push directory iteration up into Java?entries.push_back(filename); }}return! dir.isBad(); }static jobjectArray File_listImpl(JNIEnv* env, jclass, jstring javaPath) {
  // Read the directory entries into an intermediate form.
  DirEntries entries;
  if(! readDirectory(env, javaPath, entries)) {return NULL;
  }
  // Translate the intermediate form into a Java String[].
  return toStringArray(env, entries);
}
Copy the code

List0 (), located in libcore ojluni/SRC/main/native/UnixFileSystem_md c (jump). The implementation logic is as follows:

  • Constantly callingreaddir64Read the directory until the end of the directory is read, which returns null and stops
  • Initialize an array of 16 strings (where Java data types are stored), determine whether the number of files being read exceeds maxLen, and if so create a new array of the size of (maxlen <<= 1), and then useJNU_CopyObjectArrayMethod to copy the old data into the new array and delete the old data. No more than maxlen to store data. (This time is mainly due toC calls Java methods to copy data)
  • Finally, create a new String array of length len, and copy the original data into the new array, reducing memory.
// Android-changed: Name changed because of added thread policy check
JNIEXPORT jobjectArray JNICALL
Java_java_io_UnixFileSystem_list0(JNIEnv *env, jobject this,
                                  jobject file)
{
    DIR *dir = NULL;
    struct dirent64 *ptr;
    // Android-removed: Integrate OpenJDK 12 commit to use readdir, not readdir_r. b/64362645
    // struct dirent64 *result;
    int len, maxlen;
    jobjectArray rv, old;
    jclass str_class;

    str_class = JNU_ClassString(env);
    CHECK_NULL_RETURN(str_class, NULL);


    WITH_FIELD_PLATFORM_STRING(env, file, ids.path, path) {
        dir = opendir(path);
    } END_PLATFORM_STRING(env, path);
    if (dir == NULL) return NULL;

    // Android-removed: Integrate OpenJDK 12 commit to use readdir, not readdir_r. b/64362645
    /* ptr = malloc(sizeof(struct dirent64) + (PATH_MAX + 1)); if (ptr == NULL) { JNU_ThrowOutOfMemoryError(env, "heap allocation failed"); closedir(dir); return NULL; } * /

    /* Allocate an initial String array */
    len = 0;
    maxlen = 16;
    rv = (*env)->NewObjectArray(env, maxlen, str_class, NULL);
    if (rv == NULL) goto error;

    /* Scan the directory */
    // Android-changed: Integrate OpenJDK 12 commit to use readdir, not readdir_r. b/64362645
    // while ((readdir64_r(dir, ptr, &result) == 0)  && (result != NULL)) {
    while((ptr = readdir64(dir)) ! =NULL) {
        jstring name;
        if (!strcmp(ptr->d_name, ".") || !strcmp(ptr->d_name, ".."))
            continue;
        if (len == maxlen) {
            old = rv;
            rv = (*env)->NewObjectArray(env, maxlen <<= 1, str_class, NULL);
            if (rv == NULL) goto error;
            if (JNU_CopyObjectArray(env, rv, old, len) < 0) goto error;
            (*env)->DeleteLocalRef(env, old);
        }
#ifdef MACOSX
        name = newStringPlatform(env, ptr->d_name);
#else
        name = JNU_NewStringPlatform(env, ptr->d_name);
#endif
        if (name == NULL) goto error;
        (*env)->SetObjectArrayElement(env, rv, len++, name);
        (*env)->DeleteLocalRef(env, name);
    }
    closedir(dir);
    // Android-removed: Integrate OpenJDK 12 commit to use readdir, not readdir_r. b/64362645
    // free(ptr);

    /* Copy the final results into an appropriately-sized array */
    old = rv;
    rv = (*env)->NewObjectArray(env, len, str_class, NULL);
    if (rv == NULL) {
        return NULL;
    }
    if (JNU_CopyObjectArray(env, rv, old, len) < 0) {
        return NULL;
    }
    return rv;

 error:
    closedir(dir);
    // Android-removed: Integrate OpenJDK 12 commit to use readdir, not readdir_r. b/64362645
    // free(ptr);
    return NULL;
}
Copy the code

Optimize the source

After source code analysis, it is found that both versions have been expanded + copy operation, so the use of linked list for optimization, after testing, although the time does reduce, did not achieve the ideal effect.

using InfoList = struct Info {
    char *d_name = nullptr;
    Info *next = nullptr;
};

static JNIEXPORT jobjectArray JNICALL
nativeList(JNIEnv *env, jclass clazz, jstring jpath) {
    const char *path = env->GetStringUTFChars(jpath, nullptr);
    DIR *dir;
    struct dirent *ent;
    InfoList *head = nullptr;
    InfoList *temp;
    InfoList *current = nullptr;
    auto size = 0;
    if ((dir = opendir(path)) ! =nullptr) {
        while ((ent = readdir(dir)) ! =nullptr) {
            int length = ent->d_reclen;
            if (strcmp(ent->d_name, ".") = =0 || strcmp(ent->d_name, "..") = =0) {
                continue;
            }
            if (head == nullptr) {
                head = static_cast<InfoList *>(malloc(sizeof(InfoList)));
                head->d_name = static_cast<char* > (malloc((length + 1) * sizeof(char)));
                memcpy(head->d_name, ent->d_name, length);
                head->d_name[length] = '\ 0';
                current = head;
            } else {
                temp = static_cast<InfoList *>(malloc(sizeof(InfoList)));
                temp->d_name = static_cast<char* > (malloc((length + 1) * sizeof(char)));
                memcpy(temp->d_name, ent->d_name, length);
                temp->d_name[length] = '\ 0';
                current->next = temp;
                current = temp;
            }
            size++;
        }
        closedir(dir);
        if (size > 0) {
            jobjectArray result = env->NewObjectArray(size, env->FindClass("java/lang/String"),
                                                      nullptr);
            for (int i = 0; i < size; i++) {
                jstring tempStr = env->NewStringUTF(head->d_name);
                env->SetObjectArrayElement(result, i, tempStr);
                env->DeleteLocalRef(tempStr);
                temp = head;
                head = head->next;
                temp->next = nullptr;
                if (temp->d_name) {
                    free(temp->d_name);
                    temp->d_name = nullptr;
                }
                free(temp);
                temp = nullptr;
            }
            returnresult; }}return nullptr;
}
Copy the code

conclusion

After a period of bizarre operations, the culprit was found to be stat()/stat64() (a method used to get file attributes). When only the list operation is performed on the file, it only takes about ten seconds to complete for 50,000 files. When the file attributes such as file size and modification time are obtained and encapsulated into Info, I cannot bear to look at it. Finally, the scheme is to only encapsulate the path information in Info, when RecyclerView$Adapter#onBindViewHolder, and then convert the path into File, refresh the UI; You can also optimize the nativeList method to support paging file names, further reducing the time needed to scan directories.

legacy

The File#list method was optimized after 6.0. Why is it optimized after 7.0? Obviously, the speed and copy efficiency of data expansion in native layer is higher than that of Java layer calling data expansion and copy in native layer. Therefore, the similar implementation method in the earlier version is compared with File#list in the higher version of mobile phone. The result is that the efficiency of the new version is higher, which may also be due to the problem of insufficient data. Here’s the implementation code:

static JNIEXPORT jobjectArray JNICALL
nativeList64(JNIEnv *env, jclass clazz, jstring jpath) {
    const char *path = env->GetStringUTFChars(jpath, nullptr);
    DIR *dir = nullptr;
    struct dirent64 *ent;
    std::vector<std::string> data;
    if ((dir = opendir(path)) ! =nullptr) {
        while ((ent = readdir64(dir)) ! =nullptr) {
            if (strcmp(ent->d_name, ".") = =0 || strcmp(ent->d_name, "..") = =0) {
                continue;
            }
            data.emplace_back(ent->d_name);
        }
        closedir(dir);
        int size = data.size(a);if (size > 0) {
            jobjectArray result = env->NewObjectArray(size, env->FindClass("java/lang/String"),
                                                      nullptr);
            for (int i = 0; i < size; i++) {
                jstring temp = env->NewStringUTF(data.at(i).c_str());
                env->SetObjectArrayElement(result, i, temp);
                env->DeleteLocalRef(temp);
            }
            std::vector<std::string>().swap(data);
            returnresult; }}return nullptr;
}
Copy the code

By the way, I hope that good people can help to push it (epidemic + entrepreneurship resulted in wasted two years, 😓), [email protected], thanks!

Refer to the content

  • readdir vs readdir64
  • Readdir & Readdir64 in Android implementation
  • Obtain file attributes – stat, lstat, and fstat