Link: PJMike’s blog

Preface

This article explores file systems, a topic every student of operating systems should understand. Personally, I think the file system is an important part of the operating system. As back-end developers, we inevitably deal with files at some point, reading and writing them through file I/O functions. Learning how the file system works lets us write that code knowing what happens underneath and understanding the logic behind it.

Disk

A disk is a block device used for storage. A block device is a piece of hardware that allows random (not necessarily sequential) access to fixed-size chunks of data.

Image: time.geekbang.org/column/arti…

The left side of the figure shows the disk, the circle in the middle is one of its platters, and the diagram on the right is an abstraction. Each platter (layer) is divided into multiple tracks, each track is divided into multiple sectors, and each sector is traditionally 512 bytes.

Disk partition

The space on a large hard disk can be divided into multiple partitions. For example, on Windows a disk can be divided into C, D, and E drives. This is called partitioning.

Partition information is stored in a partition table on the disk. This table lists the start and end of each partition and its type. There are two main partition table formats: MBR and GPT. The former is the traditional format and has good compatibility; the latter is more modern and more capable.
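To make the idea of a partition table concrete, here is a minimal sketch that decodes the four primary entries of a classic MBR sector. The byte offsets (table at byte 446, 16-byte entries, 0x55AA signature at byte 510) follow the traditional MBR layout; the function names and the sample sector are made up for illustration, and GPT and extended partitions are not handled.

```python
import struct

SECTOR_SIZE = 512    # bytes in the first sector of the disk
TABLE_OFFSET = 446   # the MBR partition table starts at byte 446
ENTRY_SIZE = 16      # four 16-byte entries follow


def parse_mbr(sector: bytes):
    """Return (type, first_sector, last_sector) for each used MBR entry."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # status, CHS start, type, CHS end, first LBA sector, sector count
        _status, _chs_a, ptype, _chs_b, first, count = struct.unpack(
            "<B3sB3sII", entry
        )
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append((ptype, first, first + count - 1))
    return partitions


def make_sample_mbr() -> bytes:
    """Build a made-up MBR with one bootable Linux (type 0x83) partition."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack("<B3sB3sII", 0x80, b"\0\0\0", 0x83, b"\0\0\0", 2048, 204800)
    sector[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = entry
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)
```

Note that an MBR entry actually stores the first sector and a sector count; the end point is derived from the two.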

The file system

A file system is a way of organizing and storing files on a disk. After the disk is divided into multiple partitions, each partition can have its own file system. Creating a file system on a partition is called “formatting”: a command lays out the partition in the on-disk format of a particular file system. On Windows the usual format is NTFS, while on Linux it is commonly ext2, ext3, or ext4.

After partitioning and formatting the hard disk, you need to mount the partition on a directory so that it can be accessed as part of the ordinary file tree. The mount point can be the root directory or a directory at any lower level. Any directory can serve as a mount point, but because the root directory is where all Linux files and directories live, it must always have a file system mounted on it.
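On Linux, the kernel lists what is mounted where in /proc/mounts, one line per mount in the form "device mount_point fs_type options dump pass". The sketch below parses text in that format; the function names and the sample text are made up for illustration, and escaped characters in mount points (such as spaces) are not decoded.

```python
def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mount_point, fs_type) tuples."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:  # skip blank or malformed lines
            mounts.append((fields[0], fields[1], fields[2]))
    return mounts


def current_mounts(path="/proc/mounts"):
    """Read the live mount table (Linux only)."""
    with open(path) as f:
        return parse_mounts(f.read())


# Made-up sample: two ext4 partitions and the proc pseudo file system.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sda2 /home ext4 rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
"""
```

Calling parse_mounts(SAMPLE) shows one partition mounted on the root directory and another on /home, each with its own file system type.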

File systems and blocks

Reading the disk one sector at a time would be inefficient, so the operating system reads several consecutive sectors at once, as one “block”. The block size is a power-of-two multiple of the sector size, typically 4 KB (eight 512-byte sectors).

A file system is created on disk by dividing the disk space into blocks. In other words, the block is the smallest addressable unit of the file system; blocks are sometimes called “file blocks” or “I/O blocks”.
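Because the block is the smallest unit the file system hands out, even a one-byte file occupies a whole block. The arithmetic can be sketched as follows, assuming 512-byte sectors and 4 KB blocks; the helper names are made up, and real file systems may deviate (for example with sparse files or data stored inline in the inode).

```python
SECTOR_SIZE = 512   # bytes per sector (assumed)
BLOCK_SIZE = 4096   # bytes per block (assumed, a common default)


def sectors_per_block(block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """How many consecutive sectors make up one block."""
    return block_size // sector_size


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of blocks required to hold file_size bytes (ceiling division)."""
    return (file_size + block_size - 1) // block_size


def slack_bytes(file_size, block_size=BLOCK_SIZE):
    """Bytes allocated to the file but not used by its contents."""
    return blocks_needed(file_size, block_size) * block_size - file_size
```

For instance, a 4097-byte file needs two blocks, leaving 4095 bytes of the second block unused.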

File systems and files

In a file system, a file can consist roughly of directory entries, inodes, and data blocks:

  • Inode: index node that stores file metadata and pointers to data blocks
  • Directory entry: contains the file name and the inode number
  • Data block: contains the actual file contents

Another structure to be aware of is the superblock, which records information about the file system as a whole: the total number of inodes and blocks, how many are used and how many remain, and the file system’s format and related details.
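Several of these file-system-wide counters can be observed from user space. The sketch below uses os.statvfs (Unix only), which reports what the kernel exposes for a mounted file system rather than reading the on-disk superblock directly; the function name and dictionary keys are made up for illustration.

```python
import os


def filesystem_summary(path="/"):
    """Return block and inode counters for the file system containing path."""
    st = os.statvfs(path)
    return {
        "block_size": st.f_frsize,    # unit in which the block counts are expressed
        "total_blocks": st.f_blocks,
        "free_blocks": st.f_bfree,
        "total_inodes": st.f_files,
        "free_inodes": st.f_ffree,
    }
```

Some file systems allocate inodes dynamically and may report zero for the inode counters, so treat those fields as informational.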

inode

Inodes record file metadata, such as the inode number, file size, access permissions, modification time, and the index of data block addresses. That index is how the file system locates the blocks holding the file’s data.
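Most of this metadata is visible from user space through the stat system call. Here is a small sketch using os.stat; the function names and dictionary keys are made up, and the data block addresses themselves are not exposed this way.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return a few of the metadata fields stored in the file's inode."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                         # inode number
        "size": st.st_size,                         # file size in bytes
        "permissions": stat.filemode(st.st_mode),   # e.g. '-rw-r--r--'
        "modified": st.st_mtime,                    # last modification time
        "links": st.st_nlink,                       # number of names pointing here
    }


def demo():
    """Create a throwaway 5-byte file and describe its inode."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sample.txt")
        with open(path, "wb") as f:
            f.write(b"hello")
        return describe_inode(path)
```

Notice that the file name is not among these fields: names live in directory entries, not in the inode.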

In ext3, an inode has 15 block pointers. The first 12 record data block addresses directly. The 13th is a single indirect pointer: the block it points to holds not file data but a table of data block addresses. The 14th is a double indirect pointer and the 15th is a triple indirect pointer, adding one and two further levels of index blocks respectively.
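The 12 + 1 + 1 + 1 scheme determines how large a file the pointers can address. The sketch below works through the arithmetic, assuming 4 KB blocks and 4-byte block addresses (so 1024 addresses fit in one index block); the function names are made up. This is only the addressing limit: the real ext3 maximum file size is lower because of other constraints.

```python
def pointers_per_block(block_size=4096, pointer_size=4):
    """How many block addresses fit in one index block."""
    return block_size // pointer_size


def max_addressable_bytes(block_size=4096, pointer_size=4, direct=12):
    """Bytes reachable through direct, single, double and triple indirect pointers."""
    n = pointers_per_block(block_size, pointer_size)
    blocks = direct + n + n**2 + n**3
    return blocks * block_size


def pointer_level(block_index, block_size=4096, pointer_size=4, direct=12):
    """Which kind of pointer reaches the given logical block of a file (0-based)."""
    n = pointers_per_block(block_size, pointer_size)
    if block_index < direct:
        return "direct"
    block_index -= direct
    if block_index < n:
        return "single indirect"
    block_index -= n
    if block_index < n**2:
        return "double indirect"
    block_index -= n**2
    if block_index < n**3:
        return "triple indirect"
    raise ValueError("block index beyond what the inode can address")
```

With these assumptions the direct pointers cover the first 48 KB of a file, so small files never need an index block at all, which is the point of the design.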

Image: time.geekbang.org/column/arti…

Another point to note is that inodes correspond one-to-one with files. Like file contents, they must be persisted to disk and therefore take up disk space.

Directory entry

In Linux, a directory is also a file: opening a directory opens a directory file. The structure of a directory file is very simple: it is a list of directory entries (dirents). Each directory entry consists of two parts: the name of a file in the directory and the inode number corresponding to that name. Here’s the schematic:

Image source: www.pc-freak.net/blog/find-f…

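Unlike inodes, the directory entry objects (dentries) that the kernel works with during path lookup are in-memory data structures maintained by the kernel, built from the contents of directory files, which is why they are often referred to as the directory entry cache.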
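The name-to-inode mapping held in a directory can be observed with os.scandir. In this sketch (function names made up for illustration) a hard link adds a second name to the same directory, and both entries report the same inode number.

```python
import os
import tempfile


def list_directory(path):
    """Return sorted (name, inode number) pairs for the entries in a directory."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


def demo():
    """Create one file with two names and list the directory."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "a.txt")
        with open(original, "w") as f:
            f.write("hello")
        # A hard link is just another directory entry for the same inode.
        os.link(original, os.path.join(d, "b.txt"))
        return list_directory(d)
```

Here demo() returns two entries, a.txt and b.txt, that differ only in name, which shows that a file name belongs to the directory entry rather than to the file itself.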

Virtual file system (VFS)

A disk can be divided into multiple partitions, and each partition can hold a different file system. In other words, the operating system may need to read and write several different file systems at once. How does it manage them all?

Linux provides the virtual file system (VFS), a kernel subsystem that exposes file and file-system interfaces to user-space programs. Upward, it offers a standard file manipulation interface to the application layer; downward, it defines a standard interface that each concrete file system implements.

To support multiple file systems, an abstraction layer sits between applications and the concrete file systems. This layer, the VFS, hides the differences between them, so users can call functions such as open(), read(), and write() without regard to the underlying file system or physical medium, as shown below:

When the system starts up, each file system implementation needs to register with the VFS. On registration, the file system supplies a table of function addresses that tells the VFS how to perform operations, such as reading data blocks, on that file system. The VFS, meanwhile, defines a superblock object that every file system must implement; it stores information about a particular file system and usually corresponds to the file system superblock stored in a particular sector of the disk.
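The dispatch idea can be illustrated with a toy model. In the kernel the "list of function addresses" is a set of C structures of function pointers; the Python classes below (FileSystem, InMemoryFS, UpperCaseFS, ToyVFS) are entirely hypothetical and only mimic the shape of the design: callers use one interface, and the layer in the middle routes each call to whichever file system owns the path.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """The interface every toy file system must implement."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> int: ...


class InMemoryFS(FileSystem):
    """Stores file contents in a dictionary."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = bytes(data)
        return len(data)


class UpperCaseFS(InMemoryFS):
    """A second 'file system' with different behavior: it upper-cases data."""

    def write(self, path, data):
        return super().write(path, bytes(data).upper())


class ToyVFS:
    """Routes read/write calls to the file system mounted at the longest prefix."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point, fs):
        self._mounts[mount_point.rstrip("/") or "/"] = fs

    def _resolve(self, path):
        best = None
        for mp in self._mounts:
            prefix = mp if mp.endswith("/") else mp + "/"
            if path == mp or path.startswith(prefix):
                if best is None or len(mp) > len(best):
                    best = mp
        if best is None:
            raise FileNotFoundError(path)
        # Path as seen from inside the mounted file system.
        relative = path if best == "/" else (path[len(best):] or "/")
        return self._mounts[best], relative

    def read(self, path):
        fs, relative = self._resolve(path)
        return fs.read(relative)

    def write(self, path, data):
        fs, relative = self._resolve(path)
        return fs.write(relative, data)
```

A caller mounts InMemoryFS on "/" and UpperCaseFS on, say, "/mnt/shout", then uses the same read and write calls for both; only the routing layer knows which implementation handles a given path.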

Architecture of file system components

Next, let’s take a look at the architecture, from the application layer to the VFS to the disk, to get a clearer picture of the file system, as shown in the following figure:

File I/O can be divided into direct I/O and indirect I/O, depending on whether the operating system’s Page Cache is utilized or not.

  • Direct I/O: accesses disk data directly, bypassing the page cache
  • Indirect I/O: reads and writes go through the page cache. On a read, data already in the cache is returned directly; otherwise it is read from the disk

The Page Cache exists to reduce disk I/O. By caching disk data in physical memory, it turns disk accesses into memory accesses.
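One practical consequence of going through the page cache is that a successful write does not by itself mean the data is on disk yet. A minimal sketch of the usual remedy, using os.write followed by os.fsync, is shown below; the function names are made up for illustration, and direct I/O is not demonstrated here because it imposes alignment requirements and is not supported by every file system.

```python
import os
import tempfile


def write_durably(path, data: bytes) -> int:
    """Write through the page cache, then ask the kernel to flush to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Normally returns as soon as the data sits in the page cache.
        # (A single os.write may write fewer bytes for very large buffers.)
        written = os.write(fd, data)
        # Blocks until the kernel has pushed this file's cached data out.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


def demo():
    """Write a small file durably and read it back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "note.txt")
        written = write_durably(path, b"page cache")
        with open(path, "rb") as f:
            return written, f.read()
```

The read in demo() will most likely be served from the page cache rather than the disk, since the data was written a moment earlier.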

There is also a Buffer Cache. The Buffer Cache caches disk blocks for block I/O, while the Page Cache caches file data. Before the Linux 2.4 kernel, the Buffer Cache and the Page Cache were independent. They have since been unified: buffers are blocks mapped onto pages, so they actually live in the Page Cache.

References & acknowledgements

  • www.pc-freak.net/blog/find-f…
  • www.ibm.com/developerwo…
  • book.douban.com/subject/609…
  • www.ruanyifeng.com/blog/2018/1…
  • time.geekbang.org/column/arti…
  • time.geekbang.org/column/arti…
  • time.geekbang.org/column/arti…