This article is based on:

  • Understanding the Linux Kernel, Third Edition, chapter 12.1

Introduction to the VFS

Part of the reason for Linux’s success is that it coexists well with other systems: you can easily and transparently mount file systems used by Windows, by other Unix systems, and even by systems with a small market share such as the Amiga. This is done through the Virtual Filesystem (VFS).

The idea behind the VFS is that the kernel sees every file system through one common abstraction, while each file system supplies its own implementation of the operations behind it. When a system call such as read() or write() occurs, the kernel invokes the corresponding function of the file system involved, for example a native Linux file system or NTFS (Windows NT).

Example: cp

Consider the following cp command:

cp /floppy/TEST /tmp/test

Here /floppy is the mount point of an MS-DOS file system and /tmp is on ext2. The VFS is an abstraction layer between the application and the underlying file system implementations. cp does not need to know the file system types of /floppy/TEST and /tmp/test. It only invokes standard system calls such as read() and write(), and leaves the complexity of the underlying file systems to the kernel.

The code for this example is as follows:

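/* Fragment: declarations and error handling are omitted. */
inf = open("/floppy/TEST", O_RDONLY, 0);   /* source: MS-DOS file system */
outf = open("/tmp/test",
       O_WRONLY|O_CREAT|O_TRUNC, 0600);    /* destination: ext2 */
do {
    i = read(inf, buf, 4096);              /* returns 0 at end of file */
    write(outf, buf, i);
} while (i);
close(outf);
close(inf);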

Schematic diagram:

The code and figure are from Understanding the Linux Kernel, Third Edition, p. 457.
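
The copy loop shown earlier leaves out declarations and error handling. A self-contained sketch of the same idea might look like the following. The function name copy_file and the 4096-byte buffer are assumptions made for illustration, and the code does not try to reproduce what the real cp does. Nothing in it depends on the file system type of either path.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst using only generic system calls.
 * Returns 0 on success, -1 on any error. */
int copy_file(const char *src, const char *dst)
{
    char buf[4096];
    ssize_t n;
    int ret = 0;

    int inf = open(src, O_RDONLY);
    if (inf < 0)
        return -1;

    int outf = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (outf < 0) {
        close(inf);
        return -1;
    }

    /* read() returns 0 at end of file and -1 on error. */
    while ((n = read(inf, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        /* write() may accept fewer bytes than requested, so loop. */
        while (off < n) {
            ssize_t w = write(outf, buf + off, (size_t)(n - off));
            if (w < 0) {
                ret = -1;
                goto out;
            }
            off += w;
        }
    }
    if (n < 0)
        ret = -1;
out:
    if (close(outf) < 0)
        ret = -1;
    close(inf);
    return ret;
}
```

Calling copy_file("/floppy/TEST", "/tmp/test") would perform the copy from the example, provided both paths exist on the machine.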

File systems supported by the VFS

VFS supports the following three main categories of file systems:

  1. Disk-based FS

File systems that manage space on a local disk, including:

  • ext2, ext3, ext4
  • Unix family, such as the System V file system, UFS (BSD, Solaris), and MINIX
  • Microsoft file systems, such as MS-DOS and NTFS
  • ISO9660 CD-ROM (formerly the High Sierra file system)
  • Other, less popular file systems

  2. Network FS

File systems that provide access to files on remote machines, such as NFS, Coda, and AFS.

  3. Special FS

File systems that do not manage disk space, such as the /proc virtual file system.

Generally speaking, the root directory lives in a native Linux file system such as ext2, ext3, or ext4. Other file systems are mounted on subdirectories of the root file system.
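
As a rough illustration of the three classes, the sketch below asks the kernel which file system a path lives on, using the Linux-specific statfs() call, and maps a few magic numbers to a class. The helper names fs_class and fs_class_of_path are assumptions made for this example, and the table of magic numbers is far from complete.

```c
#include <stddef.h>
#include <sys/vfs.h>   /* statfs(), Linux-specific */

/* Magic numbers as defined in the Linux header <linux/magic.h>. */
#define EXT_MAGIC   0xEF53UL  /* shared by ext2, ext3, and ext4 */
#define MSDOS_MAGIC 0x4d44UL
#define ISOFS_MAGIC 0x9660UL
#define NFS_MAGIC   0x6969UL
#define PROC_MAGIC  0x9fa0UL

/* Hypothetical helper: map a magic number to one of the three classes. */
const char *fs_class(unsigned long magic)
{
    switch (magic) {
    case EXT_MAGIC:
    case MSDOS_MAGIC:
    case ISOFS_MAGIC:
        return "disk-based";
    case NFS_MAGIC:
        return "network";
    case PROC_MAGIC:
        return "special";
    default:
        return "unknown";   /* not in this small table */
    }
}

/* Hypothetical helper: classify the file system that holds path.
 * Returns NULL if statfs() fails. */
const char *fs_class_of_path(const char *path)
{
    struct statfs s;

    if (statfs(path, &s) != 0)
        return NULL;
    return fs_class((unsigned long)s.f_type);
}
```

On a system where /proc is mounted, fs_class_of_path("/proc") would be expected to return "special". Many file systems in common use are missing from the table and are reported as "unknown".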

Common File Model

The core idea behind the VFS is to represent every supported FS with a Common File Model. This model strictly mirrors the file model of the traditional Unix FS, and each particular FS must translate its own physical organization into the Common File Model.

In the Common File Model, for example, a directory is treated as a file that contains a list of files and other directories. However, some non-Unix FS use a File Allocation Table (FAT), and in those FS a directory is not a file. To follow the rules of the Common File Model, the Linux implementation of a FAT-based FS must be able to construct, when needed, the files that correspond to its directories.

More specifically, the Linux kernel cannot hardcode a particular function to handle a system call such as read() or ioctl(). Instead, it uses a pointer for each operation, and the pointer is set to the proper function for the file system being accessed.

Let’s take a look at how the kernel carries out the cp operation mentioned above.

When the application calls read(), the kernel invokes the sys_read() service routine, as it does for any system call. The file on the MS-DOS FS is represented in kernel memory by a file data structure containing an f_op field, which holds pointers to the functions specific to MS-DOS files, including the one that reads a file. sys_read() finds the pointer to this function and calls it. The whole process can be seen as:

read() -> sys_read() -> file data structure -> f_op -> read_for_msdos()

The same goes for the call to write(), which triggers the write function of the ext2 FS associated with the output file.

Briefly, for each file object created by open(), the kernel is responsible for assigning the right set of pointers to the file object, so that they point to the functions of the file system holding the file, and then for calling those functions through the pointers.
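
The pointer-based dispatch described above can be imitated in user space. The following is a minimal sketch, not kernel code: the structures are heavily simplified stand-ins, the in-memory data array replaces a real disk, and the names sketch_read, sketch_write, write_for_ext2, msdos_fops, and ext2_fops are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

struct file;

/* Simplified stand-in for a table of per-file-system operations. */
struct file_operations {
    long (*read)(struct file *f, char *buf, size_t len);
    long (*write)(struct file *f, const char *buf, size_t len);
};

/* Simplified stand-in for a file object. */
struct file {
    const struct file_operations *f_op; /* assigned when the file is opened */
    char data[64];                      /* pretend on-disk contents */
    size_t size;                        /* bytes of data in use */
    size_t f_pos;                       /* current file position */
};

/* "MS-DOS" read: copy out of the fake contents and advance the position. */
long read_for_msdos(struct file *f, char *buf, size_t len)
{
    size_t left = f->size - f->f_pos;
    if (len > left)
        len = left;
    memcpy(buf, f->data + f->f_pos, len);
    f->f_pos += len;
    return (long)len;
}

/* "ext2" write: copy into the fake contents and advance the position. */
long write_for_ext2(struct file *f, const char *buf, size_t len)
{
    size_t room = sizeof f->data - f->f_pos;
    if (len > room)
        len = room;
    memcpy(f->data + f->f_pos, buf, len);
    f->f_pos += len;
    if (f->f_pos > f->size)
        f->size = f->f_pos;
    return (long)len;
}

const struct file_operations msdos_fops = { .read = read_for_msdos };
const struct file_operations ext2_fops = { .write = write_for_ext2 };

/* Generic entry points: they never name a file system, only follow f_op. */
long sketch_read(struct file *f, char *buf, size_t len)
{
    if (f->f_op == NULL || f->f_op->read == NULL)
        return -1;
    return f->f_op->read(f, buf, len);
}

long sketch_write(struct file *f, const char *buf, size_t len)
{
    if (f->f_op == NULL || f->f_op->write == NULL)
        return -1;
    return f->f_op->write(f, buf, len);
}
```

The point of the design is that sketch_read() and sketch_write() stay unchanged when a new file system is added. Only a new table of function pointers has to be provided.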

Object types in the Common File Model

  • Superblock object: information about a mounted FS. For disk-based FS, this object typically corresponds to a file system control block stored on disk.
  • Inode object: general information about a specific file. For disk-based FS, this object typically corresponds to a file control block stored on disk. Each inode object has an inode number that uniquely identifies the file within the FS.
  • File object: information about the interaction between an open file and a process. This object exists only in kernel memory, and only while the process has the file open.
  • Dentry object: the link between a directory entry (a particular name of a file) and the corresponding file. Each disk-based FS stores this information on disk in its own way.
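
One way to picture how the four object types relate is as a chain of references from an open file down to its mounted FS. The sketch below is a user-space model under strong simplifications: the field names loosely follow those used in the kernel, but the types, the s_fs_name field, and the helpers file_fs_name and same_file are assumptions made for illustration.

```c
#include <stddef.h>

/* One per mounted file system. */
struct super_block {
    const char *s_fs_name;      /* hypothetical: name of the FS type */
};

/* One per file, whatever its names are. */
struct inode {
    unsigned long i_ino;        /* inode number, unique within the FS */
    struct super_block *i_sb;   /* the FS this file belongs to */
};

/* One per name: links a directory entry to its inode. */
struct dentry {
    const char *d_name;
    struct inode *d_inode;
};

/* One per open(): state shared between a process and an open file. */
struct file {
    struct dentry *f_dentry;
    long f_pos;                 /* current position in the file */
};

/* Follow file -> dentry -> inode -> superblock to find the FS name. */
const char *file_fs_name(const struct file *f)
{
    return f->f_dentry->d_inode->i_sb->s_fs_name;
}

/* Two file objects refer to the same file if they reach the same inode. */
int same_file(const struct file *a, const struct file *b)
{
    return a->f_dentry->d_inode == b->f_dentry->d_inode;
}
```

In this model, two hard links to one file are two dentry objects pointing at a single inode, and two open() calls on that file are two file objects, each with its own f_pos.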

To summarize

The VFS is a layer of abstraction between an application and a particular file system, and some operations can be done directly at the VFS layer without involving the underlying file system. For example, when a process closes a file, the file on disk usually does not need to be touched, so the VFS simply releases the corresponding file object. Similarly, the lseek() system call modifies the file pointer, which is an attribute of the file object in memory, so the VFS only has to update that object without involving the underlying file system.
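
The fact that the file pointer belongs to the file object can be observed from user space. In the sketch below, the same file is opened twice, which gives the process two independent file objects; seeking through one descriptor leaves the position of the other untouched. The helper name seek_is_per_open is an assumption made for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open path twice, seek on the first descriptor only,
 * and report the position of both. Returns 0 on success, -1 on error. */
int seek_is_per_open(const char *path, off_t target,
                     off_t *pos_a, off_t *pos_b)
{
    int a = open(path, O_RDONLY);
    if (a < 0)
        return -1;

    int b = open(path, O_RDONLY);   /* second, independent file object */
    if (b < 0) {
        close(a);
        return -1;
    }

    int ret = 0;
    if (lseek(a, target, SEEK_SET) == (off_t)-1) {
        ret = -1;
    } else {
        /* lseek(fd, 0, SEEK_CUR) reports the position without moving it. */
        *pos_a = lseek(a, 0, SEEK_CUR);
        *pos_b = lseek(b, 0, SEEK_CUR);
    }

    close(a);
    close(b);
    return ret;
}
```

After a successful call with target set to 5, pos_a holds 5 while pos_b still holds 0, and the contents of the file are unchanged.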