The file system

  • Public id: Rand_cs

Continuing with the file system part of xv6XV6xv6, xv6XV6xv6 divides the file system design into seven layers: Disk → Cache → Log →inode→ Directory → Path → File system call disk \ RightarRow cache \ Rightarrow log \ Rightarrow inode \ Rightarrow directory \rightarrow path File system call disk → cache → Log →inode→ directory → path → file system call

For this part of the theory, see the previous section: Think through the file system. This article looks directly at how the xv6 file system section is implemented.

File system Layout

Xv6 file system layout

This diagram is somewhat different from the layout shown in the XV6XV6xv6 documentation, mainly because the location of the log area has changed. The xv6XV6xv6 document gives the log area at the end of the file system, but according to the source code the log area should be behind the superblock. The picture in the xv6xv6xv6 document used directly above should be wrong, I’m really sorry. I have looked at several versions of the xv6XV6xv6 source code and documentation. The source code shows that the log area is placed at the end of the super block, while the layout of the documentation shows that the log area is placed at the end. But that’s not the point, it doesn’t affect our understanding, no matter where it is, you can modify it in the superblock.

Boot block, superblock

0 is the boot block, which contain the initiator namely bootblockbootblockbootblock, see above: under multiprocessor computer startup

The first block is the super block, which holds meta-information about the file system. The related structures are defined as follows:

struct superblock {
  uint size;         // Size of file system image (blocks) Specifies the Size of the file system
  uint nblocks;      // Number of data blocks Number of data blocks
  uint ninodes;      // Number of inodes. // Number of inodes
  uint nlog;         // Number of log blocks // Number of log blocks
  uint logstart;     // Block number of first log Block // First log Block number
  uint inodestart;   // Block number of first inode Block // Block number of the first I node
  uint bmapstart;    // Block number of first free map Block // First bitmap Block number
};
Copy the code

You can see that the super block is really a collection of information about the layout of the file system. In mkfs.cmkfs.cmkfs.c we can know:

#define NINODES 200
#define MAXOPBLOCKS  10 
#define LOGSIZE      (MAXOPBLOCKS*3) 
#define FSSIZE       1000
#define IPB           (BSIZE / sizeof(struct dinode))


int nbitmap = FSSIZE/(BSIZE*8) + 1;  
int ninodeblocks = NINODES / IPB + 1;   
int nlog = LOGSIZE;   

int nmeta = 2 + nlog + ninodeblocks + nbitmap;
int nblocks = FSSIZE - nmeta;

int logstart = 2;
int inodestart = 2 + nlog;
int bmapstart = 2 + nlog + ninodeblocks;
Copy the code

As you can see from the code above, you can calculate exactly where the parts of the file system start and end, so you can get the size information from the superblock no matter where the log area is placed.

Data area

The next area of the superblock is the inodeinodeinode, but the inodeinodeinode is a bit more complicated. We will talk about it later. First, we will look at the organization and management of data blocks in the data area.

The allocation and release of data blocks is managed by bitmaps, but the area managed by bitmaps is not just data areas, but the entire file system. Bitmap macros are defined as follows:

// The number of bits per block
#define BPB           (BSIZE*8)
// Block of free map containing bit for Block b
#define BBLOCK(b, sb) (b/BPB + sb.bmapstart)
Copy the code

Distribution of recycling

static uint balloc(uint dev)
{
  int b, bi, m;
  struct buf *bp;

  bp = 0;
  for(b = 0; b < sb.size; b += BPB){
    bp = bread(dev, BBLOCK(b, sb));       // Read bitmap information
    for(bi = 0; bi < BPB && b + bi < sb.size; bi++){
      m = 1 << (bi % 8);   
      if((bp->data[bi/8] & m) == 0) {// Is block free? If the block is free
        bp->data[bi/8] |= m;  // Mark block in use
        log_write(bp);      
        brelse(bp);           / / releases the lock
        bzero(dev, b + bi);   // Set the block to 0
        return b + bi;     // return the block number
      }
    }
    brelse(bp);    / / releases the lock
  }
  panic("balloc: out of blocks");
}

static void bfree(int dev, uint b) // Release a block of data, the corresponding bitmap is cleared
{
  struct buf *bp;
  int bi, m;

  bp = bread(dev, BBLOCK(b, sb));
  bi = b % BPB;
  m = 1 << (bi % 8);
  if((bp->data[bi/8] & m) == 0)
    panic("freeing free block");
  bp->data[bi/8] &= ~m;
  log_write(bp);
  brelse(bp);
}
Copy the code

Each bit in a bitmap block represents a block, with position 1 indicating that the corresponding block is in use and position 0 indicating that the corresponding block is free. To allocate a block of data is to find free bits in the bitmap and set them to 1 to represent allocation.

The above code involves relatively simple bit operations, with detailed comments, it is not clear, the operation of releasing a data block is the inverse operation of allocation, and will not be described again.

inode

Dinode on disk

The region immediately following the superblock is the inodesinodesinodes region, with 200 inodeinodeinodes defined by xv6xv6xv6. The dinodedinodedinode on each disk is defined as follows:

struct dinode {
  short type;           // File type 
  short major;          // Major device number (T_DEV only)
  short minor;          // Minor device number (T_DEV only)
  short nlink;          // Number of links to inode in file system
  uint size;            // Size of file (bytes)
  uint addrs[NDIRECT+1];   // Data block addresses
};
#define NDIRECT 12
Copy the code

Typetypetype indicates the type of file the inodeinodeinode points to. In xv6xv6xv6, there are only three types: normal file, directory file, and device file

Nlinknlinknlink is the number of links in this file. There are hard links and soft links, and the one that’s directly related to the number of links is the hard link, and we’ll see later when we implement file system calls, The sys_link()sys\_link() system call creates a new directory entry and increases the number of links. Sys_unlink ()sys\_unlink() the sys_unlink() system call subtracts the number of links by one and deletes the file if it has zero references and links in memory. We’ll talk about this later, but I’ll just mention it in this article.

Sizesizesize Specifies the file size, in bytes.

In LinuxLinuxLinux, use major NumberMajor \ NumberMajor Number and Minor NumberMinor \ numberMinor Number to distinguish between devices, A device of the same class has the same major NumberMajor \ NumberMinor number, which in turn indicates a specific device of the same class. We will see later that if the file type is device, then the device’s reader is called according to Major NumberMajor \ NumberMajor Number.

Each inodeinodeinode has 11 first-level Pointers, an indirect pointer, to a block of data. These can be changed, and xv6xv6xv6 has an implementation that supports large files by adding an indirect index.

The inode cache

The inodeinodeinode on disk also has a corresponding cache in memory:

struct {
  struct spinlock lock;
  struct inode inode[NINODE];
} icache;         // Memory cache of node I on disk
Copy the code

An in-memory inodeinodeinode is defined as follows:

struct inode {
  uint dev;           // Device number Device number
  uint inum;          // Inode number Indicates the Inode number
  int ref;            // Reference count Number of references
  struct sleeplock lock; // protects everything below here
  int valid;          // inode has been read from disk? The validity of

  short type;         // copy of disk inode
  short major;
  short minor;
  short nlink;
  uint size;
  uint addrs[NDIRECT+1];
};
Copy the code

The inodeinodeinode in memory has several more attributes than the inodeinodeinode on disk. The first one is the device number. LinuxLinuxLinux everything is a file, and the device is also a file, so the device number is a device. However, xv6 is less complicated, and the main purpose here is to distinguish between the primary and secondary disks. Dev =1dev =1dev =1 is the secondary disk, and dev=0dev =0dev =0 is the primary disk. This value is used when reading and writing disks. It is used to set the device register of the disk to specify the primary and secondary disks. See: Take you to understand disk drivers

Linklink refers to the relationship between files on disk. Refrefref refers to the number of references to files in memory. For example close()close()close() close()close()close() close the file to reduce the number of references by one. More on this in file system calls.

Validvalidvalid Indicates whether the data in the cache block on the disk is valid. If the data in the cache is read from the disk, the value is valid. Usually invalid because the inodeinodeinode has just been allocated, so the data inside is invalid.

The entire inodeinodeinode cache has a spin lock, and each inodeinodeinode cache has a sleep lock, which is the same as the disk and cache block. First, they are all common resources and require locks to avoid race conditions. In addition, icacheicacheicache is used to organize and manage inodeinodeinode, like an allocator. Access to icacheicacheicache’s critical sections is very short, using spin locks. A process may use an inodeinodeinode for a long time, so it is better to use sleep locks. If other processes also want to use the inodeinodeinode, sleep CPUCPUCPU to improve performance.

Assign inodes

For dinodedinodedinodes on disk, xv6xv6xv6 has no organizational structure, and the way to allocate free dinodedinodedinodes is simply to find free dinodedinodedinodes in a rough loop from beginning to end.

struct inode*
ialloc(uint dev, short type)
{
  int inum;
  struct buf *bp;
  struct dinode *dip;

  for(inum = 1; inum < sb.ninodes; inum++){  
    bp = bread(dev, IBLOCK(inum, sb));      // Read the position of node I in the inum
    dip = (struct dinode*)bp->data + inum%IPB;    // Address of node I
    if(dip->type == 0) {// a free inode finds a free I node
      memset(dip, 0.sizeof(*dip));
      dip->type = type;
      log_write(bp);   // mark it allocated on the disk
      brelse(bp);
      return iget(dev, inum);   // returns as an I node in memory
    }
    brelse(bp);
  }
  panic("ialloc: no inodes");
}

#define IBLOCK(i, sb)     ((i) / IPB + sb.inodestart)
#define IPB           (BSIZE / sizeof(struct dinode))
Copy the code

IBLOCK(I,sb)IBLOCK(I,sb)IBLOCK(I,sb)IBLOCK(I,sb)IBLOCK(I,sb)IBLOCK(I,sb)IBLOCK(I,sb)IBLOCK(I,sb)

Dinodedinodedinode has no organization mechanism, so it directly queries dinodedinodedinode usage from beginning to end. If the dinodedinodedinode’s TypeTypeType property is 0, the dinodedinodedinode is idle and allocatable; otherwise, the dinodedinodedinode is unallocatable.

After allocation, initialize the dinodedinodedinode, set all its data to 0, and assign the Dinodedinodedinode to a typeTypeType file.

The dinodedinodedinode needs to be marked as allocated on disk as well because it is currently operating in memory and needs to be synchronized to disk. So call log_write()log\_write()log_write() to synchronize the cache block where the dinodedinodedinode resides to disk. Of course, it does not actually write directly to disk, but just marks the cached data as dirty. The operation of logging, reading and writing to disk is not described in this article, but can refer to the previous section: Show you the disk driver

So if we go back to the function that allocates dinodedinodedinode, dinodedinodedinode on disk is allocated, we get inodeinodeinode number, But the file system actually works using an in-memory inodeinodeinode cache, So call iget()iget()iget() to allocate (retrieve) an in-memory inodeinodeinode to cache dinodedinodedinode data:

static struct inode* iget(uint dev, uint inum)
{
  struct inode *ip, *empty;

  acquire(&icache.lock);

  // Is the inode already cached? If the dinode is cached in memory
  empty = 0;
  for(ip = &icache.inode[0]; ip < &icache.inode[NINODE]; ip++){
    if(ip->ref > 0 && ip->dev == dev && ip->inum == inum){    // Find the I node in the cache
      ip->ref++;                   // Add 1 to the reference
      release(&icache.lock);
      return ip;
    }
    if(empty == 0 && ip->ref == 0)    // Remember empty slot. Record the empty inodes in the icache
      empty = ip;
  }

  // Recycle an inode cache entry.
  if(empty == 0)
    panic("iget: no inodes");
  // The dinode has no cache in memory and an empty inode is allocated
  // Initialize the idle inode according to the argument. No data has been read in yet
  ip = empty;
  ip->dev = dev;
  ip->inum = inum;
  ip->ref = 1;
  ip->valid = 0;
  release(&icache.lock);

  return ip;
}
Copy the code

If dinodedinodedinode is cached in icacheicacheicache, add one to the number of references to the inodeinodeinode and return the inodeinodeinode. If there is no cache, allocate a free inodeinodeinode, initialize the inodeinodeinode according to the argument, because no dinodedinodedinode data is actually read, Therefore, if inodeinodeinode’s validValidValid attribute is set to 0, it is invalid.

Use modified inodes

Use inodeinodeinode before you need to lock:

void ilock(struct inode *ip)
{
  struct buf *bp;
  struct dinode *dip;

  if(ip == 0 || ip->ref < 1)    // A null pointer reference less than 1 is an error
    panic("ilock");

  acquiresleep(&ip->lock);   / / lock

  if(ip->valid == 0) {// The significant bit is 0 to read data from the disk
    bp = bread(ip->dev, IBLOCK(ip->inum, sb));
    dip = (struct dinode*)bp->data + ip->inum%IPB;
    ip->type = dip->type;
    ip->major = dip->major;
    ip->minor = dip->minor;
    ip->nlink = dip->nlink;
    ip->size = dip->size;
    memmove(ip->addrs, dip->addrs, sizeof(ip->addrs));
    brelse(bp);
    ip->valid = 1;
    if(ip->type == 0)
      panic("ilock: no type"); }}Copy the code

Distribution inodeinodeinode when not dinodedinodedinode read in data from disk, only will inodeinodeinode validvalidvalid 0 means data is invalid, The official reading of dinodinodinode data takes place while this is locked.

Changes to inodeinodeinode in the cache need to be synchronized to dinodedinodedinode on disk:

void iupdate(struct inode *ip)
{
  struct buf *bp;
  struct dinode *dip;

  bp = bread(ip->dev, IBLOCK(ip->inum, sb));       // Read node I on disk
  dip = (struct dinode*)bp->data + ip->inum%IPB;
  dip->type = ip->type;                            //update
  dip->major = ip->major;
  dip->minor = ip->minor;
  dip->nlink = ip->nlink;
  dip->size = ip->size;
  memmove(dip->addrs, ip->addrs, sizeof(ip->addrs));
  log_write(bp);
  brelse(bp);
}
Copy the code

When you run out of inodeinodeinode, you need to “drop” it:

void iput(struct inode *ip)
{
  acquiresleep(&ip->lock);      / / lock
  if(ip->valid && ip->nlink == 0) {// Get the number of references to this node
    acquire(&icache.lock);
    int r = ip->ref;
    release(&icache.lock);

    if(r == 1) {// inode has no links and no other references: truncate and free.
      itrunc(ip);
      ip->type = 0;
      iupdate(ip);
      ip->valid = 0;
    }
  }
  releasesleep(&ip->lock);

  acquire(&icache.lock);
  ip->ref--;
  release(&icache.lock);
}
Copy the code

This function subtracts the number of references to the inodeinodeinode by 1. If it was the last reference to the inodeinodeinode and the link count is 0, Call itrunc()itrunc()itrunc() to free all blocks that the inodeinodeinode points to, and delete the file that the inodeinodeinode points to.

IP →type= 0IP \rightarrow type= 0IP →type=0 Indicates that the inodeinodeinode is released and recycled to icacheicacheicache. After updating inodeinodeinode information to dinodedinodedinode on disk, set ValidValidValid to 0. Table data is no longer valid.

The index

The index part of the inodeinodeinode is used to point to the data block

static uint bmap(struct inode *ip, uint bn)  // allocate a block to the bn index of node I
{
  uint addr, *a;
  struct buf *bp;
  
  //bn is a direct index
  if(bn < NDIRECT){    
    // If the block to which the bn index points has not been allocated, then allocate, otherwise return the block number
    if((addr = ip->addrs[bn]) == 0)     
      ip->addrs[bn] = addr = balloc(ip->dev);
    return addr;
  }

  bn -= NDIRECT;     //bn is an indirect index

  if(bn < NINDIRECT){
    // Load indirect block, allocating if necessary.
    if((addr = ip->addrs[NDIRECT]) == 0)          // If the block where the indirect index resides has not been allocated, allocate
      ip->addrs[NDIRECT] = addr = balloc(ip->dev);
    bp = bread(ip->dev, addr);           // Read the first level index block
    a = (uint*)bp->data;
    if((addr = a[bn]) == 0) {// If the block to which the index points has not been allocated, allocate
      a[bn] = addr = balloc(ip->dev);
      log_write(bp);
    }
    brelse(bp);
    return addr;         // returns the block number to which index bn points
  }

  panic("bmap: out of range");
}
Copy the code

Bmap (struct inode∗ IP,uintbn)bmap(struct inode * IP,uint BN)bmap(struct inode∗ IP,uintbn) The block number is returned after allocation.

static void itrunc(struct inode *ip)
{
  int i, j;
  struct buf *bp;
  uint *a;

  for(i = 0; i < NDIRECT; i++){         // Release the data block pointed to by the direct index
    if(ip->addrs[i]){
      bfree(ip->dev, ip->addrs[i]);   
      ip->addrs[i] = 0; }}if(ip->addrs[NDIRECT]){
    bp = bread(ip->dev, ip->addrs[NDIRECT]);     // Read the first level index block
    a = (uint*)bp->data;     
    for(j = 0; j < NINDIRECT; j++){     // Release the block pointed to by the primary index
      if(a[j])
        bfree(ip->dev, a[j]);              
    }
    brelse(bp);
    bfree(ip->dev, ip->addrs[NDIRECT]);    // Release the index block
    ip->addrs[NDIRECT] = 0;
  }

  ip->size = 0;
  iupdate(ip);
}
Copy the code

Truncate inodetruncate\ inodetruncate inodetruncate inode, truncate inode, truncate inode, truncate inode, truncate inode, truncate inode, truncate inode, truncate inode, truncate inode, truncate inode It’s like deleting the file. This works by checking whether the index is valid one by one, calling bfree()bfree()bfree() and setting the index to invalid. It basically does the opposite of bmap(), bmap(), bmap().

Read and write data

int readi(struct inode *ip, char *dst, uint off, uint n) // Read data from inode
{
  uint tot, m;
  struct buf *bp;

  if(ip->type == T_DEV){  // If the inode points to a device file
    if(ip->major < 0|| ip->major >= NDEV || ! devsw[ip->major].read)return - 1;
    return devsw[ip->major].read(ip, dst, n);  // Use device-specific read methods
  }

  if(off > ip->size || off + n < off)  // If the start reading position is past the end of the file, if the number of bytes read is negative
    return - 1;
  if(off + n > ip->size)   // If n bytes from offset exceed the end of the file
    n = ip->size - off;    // Only so many bytes can be read

  for(tot=0; tot<n; tot+=m, off+=m, dst+=m){  //tol: total number of bytes read so far, n: number of bytes to read,off: read from here, DST: destination
    bp = bread(ip->dev, bmap(ip, off/BSIZE)); // Read off into the cache block
    m = min(n - tot, BSIZE - off%BSIZE);  // Read up to m bytes at a time
    memmove(dst, bp->data + off%BSIZE, m);  // Assign data to DST
    brelse(bp);  // Release the cache block
  }
  return n;
}

int writei(struct inode *ip, char *src, uint off, uint n)
{
  uint tot, m;
  struct buf *bp;

  if(ip->type == T_DEV){
    if(ip->major < 0|| ip->major >= NDEV || ! devsw[ip->major].write)return - 1;
    return devsw[ip->major].write(ip, src, n);
  }

  if(off > ip->size || off + n < off)
    return - 1;
  if(off + n > MAXFILE*BSIZE)
    return - 1;

  for(tot=0; tot<n; tot+=m, off+=m, src+=m){
    bp = bread(ip->dev, bmap(ip, off/BSIZE));
    m = min(n - tot, BSIZE - off%BSIZE);
    memmove(bp->data + off%BSIZE, src, m);
    log_write(bp);
    brelse(bp);
  }

Copy the code

Readi (struct inode∗ IP,char∗ DST,uint off,uint n)readi(struct inode * IP,char * DST,uint off, Uint \ n) The readi(struct inode∗ IP,char∗ DST,uint off,uint n) function is used to read data, starting offoffoff from the file pointed to by IP, and reading NNN bytes into DSTDSTDST.

If inodeinodeinode refers to a device file type, then the data will be read using a special method. For example, the console has its own special method of reading data. These device read methods are collected in the devSwDevSwDevsw array. We’ll talk more about this in a later article, but only about reading and writing files on disk.

For example, the number of bytes read should not exceed the end of the file. NNN should not be negative. If off+n> IP →sizeoff +n> IP \rightarrow sizeoff+n> IP →size exceeds the end of the file, Then IP →size− offIP \rightarrow size – offip→size−off is the maximum number of bytes that can be read.

File data might be more than, off/BSIZEoff/BSIZEoff/BSIZE said offoffoff blocks, the blocks is relative to the file at the beginning of block number, called bmap bmap () () bmap offoffoff’s absolute block number (). Bread ()bread()bread() bread()bread()

The data is ready in memory to be assigned to move the data. However, the maximum number of bytes that can be read at a time is n− Totn-Totn −tot, which is the number of bytes that must be read. The maximum number of bytes that can be read at a time is BSIZE− OFF %BSIZEBSIZE-off\%BSIZEBSIZE− OFF %BSIZE. You can’t go beyond these two, so you take the smaller of the two.

Writei (struct inode∗ IP,char∗ SRC,uint off,uint n)writei(struct inode * IP,char * SRC,uint off, Uint \ n) Writei (struct inode∗ IP,char∗ SRC,uint off,uint n), write NNN bytes from data in the data source SRCSRCSRC into files pointed to by iPIpip, Start at offoffoff.

It’s basically the opposite of readi()readi()readi(), except that the inodeinodeinode information needs to be updated at the end, because like writing data to a file, the file gets bigger. Inodeinodeinode is a representation of the file and corresponds to the file. The inodeinodeinode size attribute needs to be updated, and then iupdate()iupdate()iupdate() synchronizes the inodeinodeinode to dinodedinoode on disk.

directory

A directory is also a file, but its data is a little special. Its data is a directory entry, and its main properties are file names.
i n o d e inode
Serial number.

An inodeinodeinode represents a file. A file corresponds to an inodeinodeinode, but the inodeinodeinode attribute does not contain a file name. The file name attribute in the directory entry indicates that the main function of the directory entry is to associate inodeinodeinode with the file name.

Structure definition

struct dirent {            // Directory entry structure
  ushort inum;
  char name[DIRSIZ];
};

#define DIRSIZ 14
Copy the code

The directory entry xv6xv6xv6 consists of only two items, the file name and the inodeinodeinode number.

Find a directory entry

struct inode* dirlookup(struct inode *dp, char *name, uint *poff)
{
  uint off, inum;
  struct dirent de;

  if(dp->type ! = T_DIR)// If the file is not a directory file
    panic("dirlookup not DIR");

  for(off = 0; off < dp->size; off += sizeof(de)){
    if(readi(dp, (char*)&de, off, sizeof(de)) ! =sizeof(de))   // Read the directory file pointed to by dp, one directory entry at a time
      panic("dirlookup read");
    if(de.inum == 0)  // If the root directory is??????????????
      continue;
    if(namecmp(name, de.name) == 0) {// Find the file by name
      // entry matches path element
      if(poff)              // Records the offset of the directory entry in the directory
        *poff = off;
      inum = de.inum;       //name Indicates the inode number of the file
      return iget(dp->dev, inum);    // take the inode}}return 0;
}
Copy the code

This function is used to find the namenamename directory entry in the directory file pointed to by inodedPinode dpinodedp. The offset of this directory entry is recorded in poffPoffpoff. Finally return the inodeinodeinode of the file named namenamename.

Therefore, searching for files according to the file name refers to the process of searching for directory items in the directory file. The specific search method is to compare whether the name of the directory item and the file name to be searched are the same. If they are the same, they will be found; otherwise, there is no file to be searched in the directory.

Adding a directory entry

int dirlink(struct inode *dp, char *name, uint inum)
{
  int off;
  struct dirent de;
  struct inode *ip;

  // Check that name is not present.
  if((ip = dirlookup(dp, name, 0)) != 0){   
    iput(ip);       //dirlookup calls iget to increase the number of references by 1, so call iput to decrease the number of references by 1
    return - 1;      //name The directory entry already exists. -1 is returned
  }

  // Look for an empty dirent.
  for(off = 0; off < dp->size; off += sizeof(de)){
    if(readi(dp, (char*)&de, off, sizeof(de)) ! =sizeof(de))
      panic("dirlink read");
    if(de.inum == 0)   // Find a free directory entry
      break;
  }

  strncpy(de.name, name, DIRSIZ);     // Set the file name for the directory entry
  de.inum = inum;                     // Set the id of node I of the directory entry
  if(writei(dp, (char*)&de, off, sizeof(de)) ! =sizeof(de))   // Write the directory entry to the directory file pointed to by dp
    panic("dirlink");

  return 0;
}
Copy the code

This function is used to add a directory entry to the directory file pointed to by inodedPinode dpinodedp, usually when a new file is created and information about the new file needs to be added to the directory.

First check whether the directory entry exists. If not, find a free directory entry and write the inodeinodeinode and filename of the new file.

The path

Paths, such as/A /b/c/a/b/c/a/b/ C /a/b/ C, are actually composed of file names. The file may be a directory file or a common file. Usually the last item is the ordinary file name, and the middle item is the directory file name.

Also like a/a/b/c/a/b/c/a/b/c this path by said absolute path beginning with a ‘/’, a/b/ca/b/ca/b/c said that don’t begin with ‘/’ relative path. These two paths should be very familiar, no more explanation, or you can refer to the article a file system, which detailed the theoretical knowledge of the file system.

Either way, a path resolution function is required to extract the file names from each of them:

static char* skipelem(char *path, char *name)
{
  char *s;
  int len;

  while(*path == '/')   / / skip '/'
    path++;
  if(*path == 0)    / / path is empty
    return 0;
  s = path;
  while(*path ! ='/'&& *path ! =0)   // Move path further back, stripping out the first directory name
    path++;
  len = path - s;        // Record the length of the directory name
  if(len >= DIRSIZ)
    memmove(name, s, DIRSIZ);    // Copy the directory name to name
  else {
    memmove(name, s, len);
    name[len] = 0;
  }
  while(*path == '/')   // continue to skip '/'
    path++;
  return path;    // Return the remaining path
}
Copy the code

Skipelem () Skipelem () calls resolve the file name one header at a time in namenamename, returning the rest of the path.

To illustrate, use the example in the source code comment:

skipelem("a/bb/c", name) = "bb/c", setting name = "a"
skipelem("///a//bb", name) = "bb", setting name = "a"
skipelem("a", name) = "", setting name = "a"
skipelem("", name) = skipelem("/ / / /", name) = 0
Copy the code
static struct inode*
namex(char *path, int nameiparent, char *name)
{
  struct inode *ip, *next;

  if(*path == '/')   // Absolute path
    ip = iget(ROOTDEV, ROOTINO);    // Read the root directory
  else               // Relative path
    ip = idup(myproc()->cwd);       // Read the current working directory

  while((path = skipelem(path, name)) ! =0){
    ilock(ip);
    if(ip->type ! = T_DIR){ iunlockput(ip);return 0;
    }
    if(nameiparent && *path == '\ 0') {// If the parent is returned and the rest of the path is empty, the current node is I
      // Stop one level early.
      iunlock(ip);
      return ip;
    }
    if((next = dirlookup(ip, name, 0)) = =0) {// Query the next directory
      iunlockput(ip);
      return 0;
    }
    iunlockput(ip);
    ip = next;    // The current directory points to the next one, and then the while loop until the end of the parsing
  }
  if(nameiparent){
    iput(ip);
    return 0;
  }
  return ip;
}
Copy the code

This function according to the path to return to inodeinodeinode, such as path/a/b/c/a/b/c/a/b/c, if nameiparentnameiparentnameiparent effectively, it returns the file BBB inodeinodeinode, If nameiparentnameiparentnameiparent invalid, return the file inodeinodeinode CCC.

This function is mainly the use of path parsing function skipelem()skipelem()skipelem() and search directory item function dirlookup()dirlookup()dirlookup() function, according to the path to find the file steps are roughly as follows:

  • Gets the inodeinodeinode of the current directory
  • Obtain directory files according to inodeinodeinode
  • Find files/directories under the directory file by filename
  • Repeat the above process until the file is found

Namex ()namex()namex() function is the implementation of the above process, as for the specific implementation of this function, it is not detailed, you can take an example according to the code simulation process to understand. In the process of simulation, several conditions are mainly paid attention to:

  • (path = skipelem(path, name)) ! = 0when
    p a t h path
    Zero is returned if it is an empty string, so to speakskipelem("", name) = 0. Path points to an empty string, not that path itself is empty
  • if(nameiparent && *path == '\0').
    p a t h path
    When it’s an empty string, that’s when it’s “”*path = '\0'.

In addition, if it is a relative path, the current directory must be obtained from the PWDPWD attribute in the PCBPCBPCB process, which will be described later.

This paper mainly introduces the xv6xv6xv6 file system inodeinodeinode, directory, path three levels of design and implementation, should have a deeper understanding of these three concepts. Inodeinodeinode is the representative of a file, corresponding to a file. A directory is also a file, but its data is the information of other files, namely a directory entry. A directory entry is a combination of the names of other files and the corresponding inodeinodeinode number. And the path? It’s a string of file names combined together. You can use the path resolution function to parse out file names one by one and then look for files from top to bottom based on the directory entries in each directory layer.

Well this article is here, there is any mistake also please criticize Pointers, also welcome to discuss with me exchange study progress together.