Sima Niu said anxiously, “Other men all have brothers; I alone have none.” Zi Xia said, “I have heard it said: life and death are decreed by fate, and wealth and honour rest with Heaven. If a gentleman is reverent and makes no mistakes, and treats others with respect and propriety, then all within the four seas are his brothers. Why should a gentleman worry about having no brothers?” The Analects of Confucius: Yan Yuan

This post is part of the Hundred Blogs series:

v62.xx HongMeng kernel source code analysis (file concept) | Why everything is a file

This article opens the file system part of the series. The file system is one of the five major modules of the kernel, and Linux even has the design philosophy that “everything is a file”, so its importance is self-evident. If you cannot figure out the file system, the kernel as a whole will stay confusing. There are many file-system-related concepts, and each will be explained in detail alongside the kernel source code. This article starts with the concept at the source of them all: the file.

The file-system-related posts in the series are as follows:

  • v62.xx HongMeng kernel source code analysis (file concept) | Why everything is a file
  • v63.xx HongMeng kernel source code analysis (file system) | Explaining file systems by way of library management
  • v64.xx HongMeng kernel source code analysis (inode) | Which concept matters most in a file system
  • v65.xx HongMeng kernel source code analysis (mount directory) | Why file systems need to be mounted
  • v66.xx HongMeng kernel source code analysis (root file system) | The file system mounted on / first
  • v67.xx HongMeng kernel source code analysis (character device) | Devices read and written in units of bytes
  • v68.xx HongMeng kernel source code analysis (virtual file system, VFS) | The foundation of harmonious coexistence
  • v69.xx HongMeng kernel source code analysis (file handle) | Why is it called a handle?
  • v70.xx HongMeng kernel source code analysis (pipe file) | How to reduce the cost of data flow

What is a file

  • You cannot talk about file systems without first saying what a file is, let alone how the kernel manages files and why it manages them that way.
  • Modern operating systems introduced files so that information can be stored independently of processes for long periods. The file has since been abstracted into a broad concept: documents, directories (folders), keyboards, monitors, hard disks, removable media devices, printers, modems, virtual terminals, and input/output resources such as inter-process communication (IPC) and network communication are all treated as files and operated on in a unified way.
  • These resources all share one characteristic: they can be read and written. Once that commonality is recognized, an ideal model can be abstracted from it, which makes the design work simple and orderly and simplifies the API. Users access any resource in one common way and process it as a uniform byte stream, while the differences are handled by the corresponding middle layer that adapts to what lies underneath.
  • An inaccurate but vivid example: a Linux system maps a hardware device to a file, for example a camera to /dev/video, which can then be operated with the basic functions. Connect to the device with open(), read an image with read(), and finally save the image with write(). For a sound card device, read() becomes the recording function and write() becomes the playback function.
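The uniform byte-stream idea above can be sketched in a few lines of Python. This is only an illustration on a POSIX-like system, not HongMeng kernel code: the helper names copy_bytes and demo are made up for the example, and a pipe plus a temporary regular file stand in for the camera and sound card devices mentioned in the text.

```python
import os
import tempfile


def copy_bytes(src_fd: int, dst_fd: int, chunk: int = 4096) -> int:
    """Copy everything from src_fd to dst_fd using only read() and write()."""
    total = 0
    while True:
        data = os.read(src_fd, chunk)
        if not data:            # an empty read means end of data
            break
        view = memoryview(data)
        while view:             # write() may accept fewer bytes than offered
            written = os.write(dst_fd, view)
            view = view[written:]
        total += len(data)
    return total


def demo() -> bytes:
    """Push bytes through a pipe into a regular file, then read them back."""
    payload = b"everything is a file\n"

    read_end, write_end = os.pipe()      # a pipe: an IPC object seen as a file
    os.write(write_end, payload)
    os.close(write_end)                  # lets the reader see end of data

    fd, path = tempfile.mkstemp()        # a regular file on disk
    try:
        copy_bytes(read_end, fd)         # the same two calls serve both objects
        os.close(read_end)
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

copy_bytes never asks what kind of object sits behind a descriptor, which is the point of the abstraction: the difference between a pipe and a disk file is handled below the read/write interface.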

File types

There are seven types of files from a kernel perspective:

  • Regular file (-, regular file)

    The files of everyday understanding fall into this category (images, videos, MP3s, PPTs, zip archives, and so on). They are also called ordinary files and are, of course, everywhere.

    turing@ubuntu:/home/tools$ ls -hil
    total 12M
    1083954 -rwxrwxr-x 1 turing turing 2.3M Feb 18 18:55 gn
    1083803 -rw-r--r-- 1 root   root   9.4M Nov 25  2020 hapSignToolv2.jar
    1083802 -rw-r--r-- 1 root   root    58K Nov 25  2020 hMOS_APP_packing_tool
  • Directory file (d, directory)

    A directory, or folder, which can be entered with the cd command. These are everywhere as well.

    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ls -lhi
    total 68K
    1202976 drwxr-xr-x  3 turing turing 4.0K Jun 21 02:38 applications
    1173738 drwxr-xr-x 10 turing turing 4.0K Jun 21 02:38 base
    1106153 drwxr-xr-x  3 turing turing 4.0K Jun 21 02:38 build
  • Block device file (b, block device)

    An interface device that stores data for the system to access; put simply, a hard disk. For example, disk 1 is identified as /dev/hda1. The type attribute is [b], for block device, and such files are usually found in the /dev directory.

    turing@ubuntu:/dev$ ls -lhi
    total 0
    210 brwxr-xr-x  2 root   root         420 Jul 23 18:59 block
    337 brwxr-xr-x  2 root   root          80 Jul 23 18:05 bsg
  • Character device file (c, character device)

    Devices with a serial-port-style interface, such as keyboards and mice. These files are also usually found in the /dev directory.

    turing@ubuntu:/dev$ ls -lhi
    total 0
    124 crw-------  1 root   root     10, 175 Jul 23 18:05 agpgart
    373 crw-r--r--  1 root   root     10, 235 Jul 23 18:05 autofs
  • Socket file (s, socket)

    Such files are commonly used for network data connections. You can start a program that listens for client requests, and the client can then exchange data with it over the socket. This file type is most commonly seen in the /var/run directory.

    turing@ubuntu:/var/run$ ls -lhi
    690 srw-rw-rw- 1 root root 0 Jul 23 18:05 snapd-snap.socket
    689 srw-rw-rw- 1 root root 0 Jul 23 18:05 snapd.socket
  • Pipe file (p, pipe)

    Pipe files are mainly used for inter-process communication. For example, the mkfifo command creates a FIFO file; process A can then be started to read from the FIFO file while process B writes into it. A FIFO is first in, first out: data is read in the order it was written.

    turing@ubuntu:/var/run$ ls -lhi
    269 prw-------  1 root              root     0 Jul 23 18:05 initctl
  • Symbolic link file (l, symbolic link)

    Links here means soft links, similar to shortcuts under Windows. The /usr/bin directory holds more files of this type than anywhere else.

    turing@ubuntu:/bin$ ls -lhi
    143828 lrwxrwxrwx 1 root root 29 Jul 14 21:51 rmiregistry -> /etc/alternatives/rmiregistry
    132128 lrwxrwxrwx 1 root root  4 Jul 14 19:10 rnano -> nano
    132131 lrwxrwxrwx 1 root root 29 Jul 14 19:10 rrsync -> ../share/rsync/scripts/rrsync
    132132 lrwxrwxrwx 1 root root 21 Jul 14 19:10 rsh -> /etc/alternatives/rsh
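The type letter that ls prints in the first column can be recovered from a file's mode bits. The sketch below uses Python's standard stat module on a POSIX system; type_letter and demo are hypothetical helper names, and the temporary files exist only for the illustration. Block devices, character devices, and sockets are handled by the same function but are not created here, since that usually needs special privileges or a running server.

```python
import os
import stat
import tempfile

# (predicate, letter) pairs for the seven types listed above.
_TYPE_CHECKS = [
    (stat.S_ISREG, "-"),   # regular file
    (stat.S_ISDIR, "d"),   # directory
    (stat.S_ISBLK, "b"),   # block device
    (stat.S_ISCHR, "c"),   # character device
    (stat.S_ISSOCK, "s"),  # socket
    (stat.S_ISFIFO, "p"),  # pipe (FIFO)
    (stat.S_ISLNK, "l"),   # symbolic link
]


def type_letter(path: str) -> str:
    """Return the one-character file type that `ls -l` would show for path."""
    mode = os.lstat(path).st_mode      # lstat: do not follow symbolic links
    for predicate, letter in _TYPE_CHECKS:
        if predicate(mode):
            return letter
    return "?"


def demo() -> dict:
    """Create one file of several types in a temp directory and classify them."""
    with tempfile.TemporaryDirectory() as tmp:
        regular = os.path.join(tmp, "regular.txt")
        subdir = os.path.join(tmp, "subdir")
        fifo = os.path.join(tmp, "fifo")
        link = os.path.join(tmp, "link")

        with open(regular, "w") as handle:
            handle.write("hello")
        os.mkdir(subdir)
        os.mkfifo(fifo)                # POSIX only
        os.symlink(regular, link)

        paths = {"regular": regular, "dir": subdir, "fifo": fifo, "link": link}
        return {name: type_letter(p) for name, p in paths.items()}


if __name__ == "__main__":
    print(demo())
```

Using lstat rather than stat matters for the last type: stat would follow the link and report the type of its target instead.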

File attributes

In a nutshell, a file has a few attributes:

  • Permissions
  • Owner
  • Owning group

    1173738 drwxr-xr-- 10 turing turing 4.0K Jun 21 02:38 base
    [0]     [1]        [2][3]    [4]    [5]  [6]          [7]
    [0] vnode number, [1] permissions, [2] hard links, [3] owner, [4] group, [5] file size, [6] modified date, [7] file name

[vnode number] The vnode is a very important concept in the file system, and later posts in the series cover it at length with the source code. Every file has exactly one such number, much like an ID card number. There may be a million people in the country named Li Wei, and in everyday conversation everyone just calls them Li Wei without reciting an ID number, and communication works fine. The public security bureau, however, only goes by the ID number, and with it they can catch anyone who dares to commit a crime. Different perspectives have different concerns. File management works in exactly the same way: ordinary users only need to remember that the HD movie lives at C:\xx\xx\xx\xx\xxx\xxx\xx.avi, and however deeply it is buried it can be dug out; they never need to know what a vnode id is. At the kernel level, though, everything goes by vnode id.

[Permissions] A multi-user, multi-group system has to control who is permitted to do what with a file. This field can be split into the following four groups:

    d, rwx, r-x, r--

  • The first character, d, stands on its own and gives the file type; here it is a directory file (d).
  • The remaining three groups are each made up of [rwx]: r is read, w is write, and x is execute. [-] is a placeholder meaning the permission is absent.
    • The second group is the file owner’s permissions; rwx means the owner can read, write, and execute the file.
    • The third group is the owning group’s permissions; r-x means members of the file’s group can read and execute it but cannot write it.
    • The fourth group is the permissions of everyone outside the group; r-- means others can only read.
  • Besides letters, permissions can also be expressed as numbers:

    r = 100 = 4, w = 010 = 2, x = 001 = 1, - = 0
    rwxr-xr-- can be expressed as 111 101 100 = 754
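The letter-to-number arithmetic above is easy to check mechanically. Here is a small sketch; perm_to_octal is a name invented for this example, and it handles only the plain r, w, x and - characters, not special bits such as setuid or the sticky bit.

```python
def perm_to_octal(perm: str) -> str:
    """Convert a nine-character permission string such as 'rwxr-xr--' to '754'."""
    if len(perm) != 9:
        raise ValueError("expected nine characters, e.g. 'rwxr-xr--'")
    weights = {"r": 4, "w": 2, "x": 1, "-": 0}
    digits = []
    for start in range(0, 9, 3):            # owner, group, others
        group = perm[start:start + 3]
        digits.append(str(sum(weights[ch] for ch in group)))
    return "".join(digits)


if __name__ == "__main__":
    for text in ("rwxr-xr--", "rwxrwxrwx", "rw-r--r--"):
        print(text, "=", perm_to_octal(text))
```

Each group of three letters is simply a three-bit binary number, which is why one octal digit per group is enough.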

chmod [-R] xxx file-or-directory: changes the permissions of a file. There are two ways to specify the new permissions.

  • Numbers: chmod -R 777 ohos_config.json

    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ls -hli
    1103292 -rw-r--r-- 1 turing root 350 Jul 21 00:17 ohos_config.json
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ chmod -R 777 ohos_config.json
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ls -hli
    1103292 -rwxrwxrwx 2 turing root 350 Jul 21 00:17 ohos_config.json

    777 = (111)(111)(111) = (rwx)(rwx)(rwx)

  • Letters:

            u   + (add)     r
    chmod   g   - (remove)  w   file-or-directory
            o   = (set)     x
            a
    (u)user (g)group (o)others (a)all
    chmod u=rwx,go=rx ohos_config.json    result: rwxr-xr-x
    chmod a+w ohos_config.json            result: rwxrwxrwx
    chmod u-r+wx ohos_config.json
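The same change can be made programmatically. The sketch below assumes a POSIX system and uses Python's os.chmod with the permission constants from the stat module; chmod_and_show is a hypothetical helper, and the temporary file stands in for ohos_config.json.

```python
import os
import stat
import tempfile


def chmod_and_show(mode: int) -> str:
    """Apply mode to a scratch file and return the string `ls -l` would print."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.chmod(path, mode)
        return stat.filemode(os.stat(path).st_mode)
    finally:
        os.remove(path)


# Numeric form, like `chmod 754 file`.
NUMERIC = 0o754

# Symbolic form, like `chmod u=rwx,go=rx file`, built from named bits.
SYMBOLIC = (
    stat.S_IRWXU                      # u=rwx
    | stat.S_IRGRP | stat.S_IXGRP     # g=rx
    | stat.S_IROTH | stat.S_IXOTH     # o=rx
)

if __name__ == "__main__":
    print(chmod_and_show(NUMERIC))
    print(chmod_and_show(SYMBOLIC))
```

Both forms end up as the same nine bits in the file's mode; the letters are a convenience for people, the number is what gets stored.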

[Hard links] This column is the number of hard links. Where there are hard links there are also soft links. What is the difference, and why have links at all? The reason is that the same file often needs to be used by one user in several places, or by several users at once. Good things are meant to be shared. A small experiment shows the difference.

    # create a hard link and a soft link to ohos_config.json
    # command to create a hard link
    ln ohos_config.json hard_link
    # command to create a soft link
    ln -s ohos_config.json soft_link
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ls -hli
    1103292 -rw-r--r-- 1 turing root 350 Jul 21 00:17 ohos_config.json
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ln ohos_config.json hard_link
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ln -s ohos_config.json soft_link
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ls -hli
    1103292 -rw-r--r-- 2 turing root 350 Jul 21 00:17 hard_link
    1103292 -rw-r--r-- 2 turing root 350 Jul 21 00:17 ohos_config.json
    1086100 lrwxrwxrwx 1 turing root  16 Jul 29 01:06 soft_link -> ohos_config.json
  • Hard links: the count records how many names share the file. hard_link and ohos_config.json have the same vnode_id, 1103292, and exactly the same content. What changed is the link count, or reference count, which went from 1 to 2; the extra reference is the result of creating hard_link.
  • Soft links: a separate file is created with its own vnode_id, but its content merely points to ohos_config.json. What good is that? An example makes it clear.
  • A beautiful woman is staying in room 301 of a hotel, and you need a key to enter the room. A hard link is simply another key to room 301 cut for you. And a soft link? That is room 302 next door, which contains nothing but a note saying “Knock on room 301, you know.” Is that clear? It is a detour, but in exchange the arrangement is very flexible. What if the police come? Just change the note to “Knock on room 404.” Room 404 is entirely above board, so there is no problem.
  • At the application layer, soft links are used heavily, for example for switching versions; upgrading through a soft link is very convenient.
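The experiment above can be repeated from code. This is a sketch for a POSIX system using Python's os.link and os.symlink; link_demo is a made-up name, the inode number reported by the host system plays the role of the vnode_id in the text, and the last two checks (what happens after the original name is removed) go slightly beyond what the shell session shows.

```python
import os
import tempfile


def link_demo() -> dict:
    """Create a hard link and a soft link to one file and compare them."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "ohos_config.json")
        hard = os.path.join(tmp, "hard_link")
        soft = os.path.join(tmp, "soft_link")

        with open(target, "w") as handle:
            handle.write("{}")
        nlink_before = os.stat(target).st_nlink

        os.link(target, hard)                   # like: ln ohos_config.json hard_link
        os.symlink("ohos_config.json", soft)    # like: ln -s ohos_config.json soft_link

        t, h, s = os.stat(target), os.stat(hard), os.lstat(soft)
        result = {
            "nlink_before": nlink_before,
            "nlink_after": t.st_nlink,
            "hard_same_inode": t.st_ino == h.st_ino,   # a second key to the same room
            "soft_same_inode": t.st_ino == s.st_ino,   # the note lives in its own room
            "soft_points_to": os.readlink(soft),
        }

        os.remove(target)                       # remove the original name
        result["hard_still_readable"] = os.path.exists(hard)
        result["soft_now_dangling"] = not os.path.exists(soft)
        return result


if __name__ == "__main__":
    for key, value in link_demo().items():
        print(key, value)
```

The hard link keeps the data alive because the link count only drops to 1, while the soft link is left holding a note that points at a name that no longer exists.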

[Owner] chown: changes the file owner. Usage: chown [-R] account-name file-or-directory, or chown [-R] account-name:group-name file-or-directory

    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ll
    -rw-r--r-- 2 root root 350 Jul 21 00:17 ohos_config.json
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ sudo chown -R turing:turing ohos_config.json
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ll
    -rw-r--r-- 2 turing turing 350 Jul 21 00:17 ohos_config.json

[Group] chgrp: changes the owning group. Usage: chgrp [-R] group-name file-or-directory

    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ll
    -rw-r--r-- 2 root root 350 Jul 21 00:17 ohos_config.json
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ sudo chgrp -R turing ohos_config.json
    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ ll
    -rw-r--r-- 2 root turing 350 Jul 21 00:17 ohos_config.json

[Modified date] Run the stat command to view detailed information about a file.

    turing@ubuntu:/home/openharmony/code-v1.1.1-LTS$ stat ohos_config.json
      File: ohos_config.json
      Size: 350        Blocks: 8        IO Block: 4096   regular file
    Device: 805h/2053d Inode: 1103292   Links: 2
    Access: (0644/-rw-r--r--)  Uid: (1000/ turing)  Gid: (0/ root)
    Access: 2021-07-24 02:07:21.683190622 -0700
    Modify: 2021-07-21 00:17:34.733766830 -0700
    Change: 2021-07-29 01:20:14.314343117 -0700
     Birth: -
  • mtime (modify time): the time the file content was last modified, for example by saving the file in vim. This is the time that ls -l lists.
  • atime (access time): updated when the file content is read, for example by commands such as more and cat. The ls and stat commands do not change atime.
  • ctime (change time): the time the file’s status last changed, that is, when its vnode was last modified. Changing file attributes with chmod or chown updates this time.
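The difference between mtime and ctime can be observed directly. The following sketch assumes a POSIX system (on Windows, st_ctime means something else) and uses Python's os.stat; times_around_chmod is an invented helper name, and the short sleep is only there so that the two timestamps have a chance to differ.

```python
import os
import tempfile
import time


def times_around_chmod() -> dict:
    """Record mtime and ctime (in nanoseconds) before and after a chmod."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"x")               # content write: sets mtime
        before = os.stat(path)

        time.sleep(0.05)
        os.chmod(path, 0o644)                # attribute change only
        after = os.stat(path)

        return {
            "mtime_before": before.st_mtime_ns,
            "mtime_after": after.st_mtime_ns,
            "ctime_before": before.st_ctime_ns,
            "ctime_after": after.st_ctime_ns,
        }
    finally:
        os.remove(path)


if __name__ == "__main__":
    t = times_around_chmod()
    print("mtime changed:", t["mtime_before"] != t["mtime_after"])
    print("ctime changed:", t["ctime_before"] != t["ctime_after"])
```

chmod touches only the file's metadata, so the content timestamp stays put while the status-change timestamp moves.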

With files understood, the next step is the file system.

The file system

What is a file system? Here is Wikipedia’s explanation:

  • A computer’s file system is a method of storing and organizing computer data that makes the data easy to access and find. File systems use the abstract logical concepts of files and tree-structured directories in place of the data blocks used by physical devices such as hard disks and optical discs. When saving data through a file system, users do not need to care which data blocks on the hard disk (or optical disc) the data actually lands in; they only need to remember the directory and name of the file. Before writing new data, users do not need to care which block addresses on the disk are unused; managing storage space on the disk (allocation and release) is done automatically by the file system. Users only need to remember which file the data was written to.
  • File systems typically use storage devices such as hard disks and CD-ROMs and maintain the physical location of files on those devices. In reality, however, a file system may be nothing more than an interface for accessing data, where the actual data is provided over a network protocol (such as NFS, SMB, or 9P) or held in memory, or there may be no corresponding file at all (as with the proc file system).
  • Strictly speaking, a file system is a set of abstract data types that implement data storage, hierarchical organization, access, and retrieval operations.

Simply put, a file system is a subsystem in an operating system that manages persistent data. The basic data unit is a file. It organizes and manages files on disks.

A computer’s file system is much like the smart library management system at our university. When you go to the library to borrow books, you only select the titles on a screen and submit the list, and the books are fetched automatically and delivered in front of you. You do not need to know how a book is located, or on which row of which shelf it actually sits. Each university has its own way of managing books: some shelve by category, some by subject, some by region, some by date. Even among those that shelve by category, the classification schemes differ, and a scheme that is efficient for 10,000 books may not be efficient for 10 million. Still, any such system can be measured by the following factors:

  • Fast create, delete, update, and query operations.
  • Low storage overhead and a good space reclamation algorithm.
  • A security mechanism: who has what permission to do what to a book.
  • Operation records: when a book was shelved, when it was last borrowed, when it was last revised, and so on.

Because of technical progress, the protection of corporate interests, and a number of other reasons, computer file systems have also blossomed into many varieties, just like programming languages. The vast majority of languages were invented only to solve the problem in front of one laboratory or one company, and many never looked far ahead. Standards and specifications come later, and they depend on the size of your market and the financial backer behind you. Unification is good, but it is really hard. Esperanto has existed for many years, yet few people learn it. The UN has existed for many years, yet some members ignore it and do not pay their dues, and what can you do? The economic base determines the superstructure. That sentence has been recited since junior high school politics class; I did not understand it then, but I understand it thoroughly now.

So do not wonder why there are so many languages to learn and so many front-end and back-end frameworks to wrestle with. The technology map is still in an age of warring powers and giants, and the programmers caught in it are meat for the grinder. In the PC era, Windows dominated the world; in the mobile era, Apple and Android contended like Chu and Han; in the Internet of Everything era, HongMeng, Apple, and Fuchsia will probably divide the world in three. A new generation is rising, and the old aristocracy is no longer at the card table.

File systems can be classified into the following four types:

  • Disk file system: a file system designed to hold computer files on a data storage device. The most common data storage devices are disk drives, which can be connected to a computer directly or indirectly. Examples include FAT, exFAT, NTFS, HFS, HFS+, ext2, ext3, ext4, ODS-5, Btrfs, XFS, UFS, and ZFS. Some of these are journaling file systems (also known as logging file systems) or tracking file systems.

  • Flash file system: a file system designed to store files on flash memory. With the proliferation of mobile devices and the growth of flash capacity, these file systems are becoming more popular. Although disk file systems can also be used on flash, flash file systems are preferred for flash devices for the following reasons:

    • Erase blocks: a block of flash memory must be erased before it can be rewritten, and erasing can take considerable time. Erasing unused blocks while the device is idle therefore helps speed, and already-erased blocks can be used first when writing data.
    • Random access: because seeking on a disk has high latency, disk file systems are optimized to avoid seeks as far as possible. Flash memory has no seek latency.
    • Wear leveling: blocks in flash that are written frequently tend to wear out. Flash file systems are designed so that writes are spread evenly across the whole device.

    Log-structured file systems have the properties a flash file system needs; JFFS2 and YAFFS are of this kind. There are also non-journaling file systems, such as exFAT, which avoid the frequent journal writes that would shorten flash memory life.

    JFFS2 (Journalling Flash File System version 2), the successor to JFFS, is a flash file system developed by Red Hat. It originally supported only NOR flash; NAND flash has been supported since version 2.6. It is well suited to embedded systems.

    YAFFS (Yet Another Flash File System) is an embedded file system for NAND flash developed by Aleph One.

  • Pseudo file system: a file system generated dynamically at startup that exposes a great deal of information, configuration, and logs about the running kernel. Because its contents live in volatile memory, they are available only at run time and vanish at shutdown. Pseudo file systems are usually mounted at the following directories: sysfs (/sys), procfs (/proc), debugfs (/sys/kernel/debug), configfs (/sys/kernel/config), tracefs (/sys/kernel/tracing), tmpfs (/dev/shm, /run, /sys/fs/cgroup, /tmp/, /var/volatile, /run/user/), devtmpfs (/dev)

    • procfs is short for process file system and is used to access process information through the kernel. It is normally mounted at the /proc directory. Because /proc is not a real file system, it takes up no storage space, only a limited amount of memory.
    • tmpfs (temporary file system) is the common name for temporary file storage on Unix-like systems. It is usually presented as a mounted file system and keeps its data in volatile memory rather than on a permanent storage device. Everything stored on tmpfs is in principle temporary, which means no files are created on the hard disk. After a restart, all data in tmpfs is gone.
    • sysfs is a virtual file system provided by the Linux 2.6 kernel. It not only exports information about devices and drivers from the kernel to user space but can also be used to configure devices and drivers. The aim of sysfs is to take the device-related parts that used to live in procfs and present them separately in the form of a “device tree”.
    • devtmpfs builds a preliminary /dev early in Linux kernel startup, so that the normal boot process does not have to wait for udev, which shortens GNU/Linux boot time. Devices are again seen as files, which highlights the hallmark of Linux file systems: everything is a file or a directory.
  • Network file system: NFS (Network File System) is a mechanism for mounting partitions (directories) of a remote host onto the local system over the network. NFS is a distributed file system that lets client hosts access server-side files in the same way as they access local storage. It was developed by Sun and released in 1984. It combines the network with the file, once again embodying the idea that everything is a file.
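On a Linux machine the pseudo file systems above are themselves visible through a pseudo file: /proc/mounts lists every mounted file system, one per line, as device, mount point, type, options, and two numeric fields. The sketch below counts mounts by type. It is Linux-specific and unrelated to how the HongMeng kernel exposes mounts; count_fs_types is an invented name, and SAMPLE is made-up text used only when /proc/mounts cannot be read.

```python
from collections import Counter


def count_fs_types(mounts_text: str) -> Counter:
    """Count mounted file systems by type from /proc/mounts-style text."""
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:          # device, mount point, type, ...
            counts[fields[2]] += 1
    return counts


# Hypothetical sample in /proc/mounts format, for systems without /proc.
SAMPLE = """\
proc /proc proc rw,nosuid 0 0
sysfs /sys sysfs rw,nosuid 0 0
tmpfs /run tmpfs rw,nosuid 0 0
tmpfs /dev/shm tmpfs rw,nosuid 0 0
/dev/sda5 / ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    try:
        with open("/proc/mounts") as handle:   # a file with no disk blocks behind it
            text = handle.read()
    except OSError:
        text = SAMPLE
    for fs_type, count in count_fs_types(text).most_common():
        print(fs_type, count)
```

Reading /proc/mounts uses the ordinary open and read calls, even though the kernel generates the text on demand, which is the pseudo file system idea in miniature.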

For the HongMeng kernel, JFFS2, YAFFS, tmpfs, procfs, FAT, NTFS, and ZFS will be the focus of subsequent chapters.

Intensive reading of the kernel source code

The annotated kernel source code is kept in sync across four code repositories; see the Gitee repository.

A hundred blog posts of analysis, digging deep into the kernel

These articles were put together while annotating the HongMeng kernel source code. The content is grounded in the source code and often uses analogies from everyday life, placing kernel knowledge into a familiar scene wherever possible so that it is vivid, easy to understand, and easy to remember. Speaking in a way others can understand matters a great deal! The hundred blogs are by no means a pile of obscure concepts copied from Baidu and passed around; that would be no fun. The hope is rather to bring the kernel to life and make it feel approachable. It is hard, really hard, but there is no turning back. 😛 Just as code bugs need constant debugging, the articles and annotations are bound to contain mistakes and omissions. Please bear with them; they will be corrected repeatedly and updated continuously. The xx in each version number stands for the number of revisions, in the pursuit of refined, concise, comprehensive, high-quality content.

  • Compile and build: compile environment, build process, environment scripts, build tools, gn application, ninja
  • Fundamental tools: doubly linked list, bitmap management, stack mode, timer, atomic operations, time management
  • Loading and running: ELF format, ELF parsing, static linking, relocation, process image
  • Process management: process management, process concept, fork, special processes, process recycling, signal production, signal consumption, shell editor, shell parsing
  • Process communication: spinlocks, mutex, process communication, semaphore, event control, message queue
  • Memory management: memory allocation, memory management, memory assembly, memory mapping, memory rules, physical memory
  • Ins and outs: master directory, scheduling story, memory master and slave, source code comments, source structure, static site
  • Task management: clock task, task scheduling, task management, scheduling queue, scheduling mechanism, thread concept, concurrency and parallelism, system calls, task switching
  • File system: file concept, file system, index node, mount directory, root file system, character device, VFS, file handle, pipe file
  • Hardware architecture: assembly basics, assembly parameter passing, working mode, registers, exception takeover, assembly summary, interrupt switching, interrupt concept, interrupt management

HongMeng station | A little progress every day. Original work is not easy; reprints are welcome, but please credit the source.