By QingStor Huang Meng

Why do we need the Network File Protocol?

Storing files is one of the most common needs in daily work. As the number of files and the storage space they occupy grow, and as the need to share files within a group arises, it becomes natural to move file storage off individual computers and onto a dedicated service (or dedicated hardware) that provides larger capacity and can be shared by multiple terminals over the network.

Such a storage device is commonly referred to as NAS (Network Attached Storage).

The shared access to the NAS by terminals over the network requires standard protocols. NFS is one of the most important and widely used protocols (other popular network file protocols include SMB[1], which will be covered in a future article). Today we’re going to talk about the NFS protocol.

An ever-evolving NFS

1. The birth of a simple, easy-to-use stateless protocol

NFS first appeared on the scene in 1985, when NFS version 2 was released as a component of SunOS 2.0. So back then it might more accurately have been called the "Sun Network File System." And yes, you read that correctly: the first release was V2 (V1 was never released, which is why NFS V1 is never discussed).

However, NFS was not the earliest network file system. For example, the Remote File System (RFS)[2] in UNIX SVR3 had already introduced the concept of Remote Procedure Call (RPC) and served as a model for NFS V2. But RFS had obvious problems of its own: for instance, it recorded the open-file state of every client (a stateful protocol), which made it hard to cope with a server crash or restart. NFS V2 was designed as a completely stateless protocol precisely to address RFS's shortcomings at the design level.

In 1995, NFS V3 was released. By this time the development of the NFS protocol no longer depended entirely on Sun; a number of companies jointly drove it to completion. NFS V3 contains many optimizations, but most of them can be considered performance-level improvements. Overall, NFS V3 still follows the stateless protocol design.

A stateless protocol design naturally reduces the difficulty of handling server downtime and restarts, but it is not easy to escape stateful information entirely. For example, a network file protocol needs to support file locking, yet lock information is inherently "state." So in NFS V2/V3 environments, NFS "outsources" this burden to the Network Lock Manager (NLM): when an NFS client receives a file-lock request, it issues an NLM RPC call instead of an NFS RPC call. But NLM is then a stateful protocol, so it must handle fault recovery after server crashes, client crashes, and network partitions. In general, NFS V2/V3 and NLM do not cooperate well (for example, NLM tracks which process requests and holds each lock, but the NFS server cannot tell which remote process a read or write request comes from), which makes airtight locking logic difficult to implement.
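From an application's point of view the NLM detour is invisible: an ordinary POSIX byte-range lock on an NFS v2/v3 mount is silently translated into NLM RPCs by the kernel client. A minimal sketch (the path below is illustrative; the same code works on any file, local or NFS-mounted):

```python
import fcntl
import os
import tempfile

def locked_append(path: str, data: str) -> None:
    """Append under an exclusive byte-range lock. On an NFS v2/v3
    mount the kernel client turns these fcntl locks into NLM
    LOCK/UNLOCK RPCs rather than NFS RPCs."""
    with open(path, "a") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)   # -> NLM LOCK on NFSv3
        f.write(data)
        f.flush()
        fcntl.lockf(f, fcntl.LOCK_UN)   # -> NLM UNLOCK

# Illustrative path; on a real NFS mount this would exercise NLM.
demo = os.path.join(tempfile.gettempdir(), "nlm-demo.txt")
locked_append(demo, "first\n")
locked_append(demo, "second\n")
os.remove(demo)
```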

Even leaving aside the issue of file locking, the stateless design brings new problems of its own. As a stateless service, the NFS server cannot record which files its clients have open. Therefore, there is no simple, direct way for a client to determine whether a file's content has been modified by another client, that is, whether its cache is still valid. In NFS V2/V3, an NFS client typically records the file's modification time and size alongside the cached data and periodically revalidates the cache: it fetches the current file attributes and compares them with the recorded modification time and size. If they match, the file is assumed unmodified and the cache still valid; if not, the file is considered changed and the cache invalid. This approach is obviously inefficient. Worse, because many file systems store timestamps at a coarse granularity (seconds), an NFS client cannot detect successive changes within one time unit: for a same-size overwrite that happens within a second of cache validation, the modification time and file size give no hint that the cache needs updating. In that case, unless the cache entry is evicted by the LRU or the file is modified again, the NFS client never sees the latest changes.
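The periodic revalidation described above can be sketched as follows. This is a simplified local model of the client's GETATTR comparison; the timestamp-granularity blind spot is reproduced by pinning the modification time:

```python
import os
from dataclasses import dataclass

@dataclass
class CacheEntry:
    data: bytes
    mtime_ns: int   # modification time captured when the data was cached
    size: int       # file size captured when the data was cached

def read_and_cache(path: str) -> CacheEntry:
    st = os.stat(path)
    with open(path, "rb") as f:
        return CacheEntry(f.read(), st.st_mtime_ns, st.st_size)

def cache_still_valid(path: str, entry: CacheEntry) -> bool:
    """Revalidate the way an NFS v2/v3 client does: fetch the current
    attributes (a GETATTR on the wire) and compare them with the
    recorded mtime and size. A match is taken to mean 'unchanged'."""
    st = os.stat(path)
    return st.st_mtime_ns == entry.mtime_ns and st.st_size == entry.size

# Demonstrate the blind spot: a same-size overwrite whose timestamp is
# indistinguishable from the cached one passes revalidation.
with open("demo.txt", "wb") as f:
    f.write(b"AAAA")
entry = read_and_cache("demo.txt")
old = os.stat("demo.txt")
with open("demo.txt", "wb") as f:
    f.write(b"BBBB")                    # same size, new content
os.utime("demo.txt", ns=(old.st_atime_ns, old.st_mtime_ns))
print(cache_still_valid("demo.txt", entry))   # True: stale cache looks valid
os.remove("demo.txt")
```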

2. Stateless to stateful: evolving into a mature single-server network file system

In 2002, NFS V4.0 was released. By this time the development of the NFS protocol was led entirely by the IETF, and the biggest change was that the protocol design went from stateless to stateful.

Turning from a stateless protocol back into a stateful one was not a pursuit of old fashion. Rather, engineering capabilities had by then matured enough to design and build mechanisms that could cope with the complexity a stateful protocol brings (and, of course, the motivation came from years of suffering under the shortcomings of the stateless design).

The stateful design of NFS V4.0 is mainly reflected in the following aspects:

A. File locking is built into the protocol itself, and state such as lock information is maintained without the assistance of NLM.

B. NFSv4 supports delegation for cache consistency. Because multiple clients can mount the same file system, NFSv4 relies on delegation to keep files synchronized. When client A opens a file, the NFS server grants client A a delegation; as long as client A holds it, it can assume its view is consistent with the server's and safely cache data on the client side. If another client B accesses the same file, the server suspends (that is, temporarily blocks) client B's request and sends a RECALL to client A. Upon receiving the RECALL, client A flushes its local cache to the server and returns the delegation, after which the server resumes processing client B's request.
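The grant/recall flow can be modeled with a toy server and clients. All class and method names here are invented for illustration; a real NFSv4 server issues CB_RECALL callbacks over a back channel:

```python
class ToyClient:
    """Holds a delegation and a local write-back cache."""
    def __init__(self, name: str):
        self.name = name
        self.dirty = {}          # filename -> locally cached writes

    def handle_recall(self, server: "ToyServer", filename: str) -> None:
        # On recall: flush cached writes, then give the delegation back.
        server.commit(filename, self.dirty.pop(filename, None))

class ToyServer:
    def __init__(self):
        self.files = {}          # filename -> contents
        self.delegations = {}    # filename -> ToyClient holding delegation

    def commit(self, filename, data):
        if data is not None:
            self.files[filename] = data

    def open(self, client: ToyClient, filename: str) -> bytes:
        holder = self.delegations.get(filename)
        if holder is not None and holder is not client:
            holder.handle_recall(self, filename)   # RECALL to old holder
            del self.delegations[filename]
        self.delegations.setdefault(filename, client)
        return self.files.get(filename, b"")

srv = ToyServer()
a, b = ToyClient("A"), ToyClient("B")
srv.open(a, "report")
a.dirty["report"] = b"draft v2"          # A caches a write locally
print(srv.open(b, "report"))             # b'draft v2': A was recalled first
```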

Of course, the delegation mechanism should be understood as a more aggressive way of handling reads and writes when cache consistency allows it, that is, as a performance optimization rather than a complete solution to the cache-consistency problem. When the NFS server detects that multiple clients are competing for the same file, it recalls the delegations it has issued and falls back to a V2/V3-style mechanism for judging cache validity.

In addition to the two points above, V4.0 brings further improvements over earlier versions:

A. NFSv4 adds a security design and begins to support RPCSEC_GSS[3] authentication.

B. NFSv4 provides only two procedures, NULL and COMPOUND. All operations are folded into COMPOUND, and clients can pack multiple operations into a single COMPOUND request as needed.

C. NFSv4 revises the representation of file attributes, significantly improving compatibility with Windows systems. Around the same time, Microsoft began remodeling the SMB protocol into the Common Internet File System (CIFS); this is no coincidence.

With these improvements, NFS evolved into a mature, efficient single-server network file system. But the software world gradually came to demand more scalability and enterprise-grade features from file systems than NFS V4.0 could answer.
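The COMPOUND idea can be illustrated with a hypothetical, un-encoded representation (a real client XDR-encodes each operation per the NFSv4 RFCs; the argument names here are illustrative):

```python
# Hypothetical in-memory representation of an NFSv4 COMPOUND request.
# The point is that one round trip carries the whole operation sequence.
def compound(*ops):
    return {"tag": "", "minorversion": 0, "ops": list(ops)}

# Resolve /etc/hosts and read its first 4 KiB in a single round trip:
req = compound(
    ("PUTROOTFH", {}),                        # start at the root filehandle
    ("LOOKUP", {"name": "etc"}),
    ("LOOKUP", {"name": "hosts"}),
    ("GETFH", {}),                            # return the resulting handle
    ("READ", {"offset": 0, "count": 4096}),
)
print(len(req["ops"]))   # 5 operations, one request
```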

3. Stronger scalability: the prototype of an enterprise-grade clustered file system emerges

In 2010 and 2016, the NFS V4 revisions V4.1 and V4.2 were released.

In 2010, NFS V4.1 took an important step toward a clustered file system by introducing Parallel NFS (pNFS): metadata and data are separated at the protocol level, creating distinct roles for metadata nodes and data nodes, which gives data access a degree of scalability. Parallel data access is also designed to push overall throughput to new heights, much like the thinking behind many modern distributed file systems.
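The separation can be sketched as a toy flow (all names are invented for illustration): the metadata server hands the client a "layout" that maps file stripes to data servers, and the client then fetches the stripes in parallel:

```python
from concurrent.futures import ThreadPoolExecutor

class DataServer:
    def __init__(self, chunks):
        self.chunks = chunks            # stripe slot -> bytes

    def read(self, slot):
        return self.chunks[slot]

class MetadataServer:
    def __init__(self, layouts):
        self.layouts = layouts          # filename -> [(server, slot), ...]

    def get_layout(self, filename):
        # Roughly what pNFS's LAYOUTGET operation provides.
        return self.layouts[filename]

def pnfs_read(mds, filename):
    layout = mds.get_layout(filename)
    # Fetch every stripe from its data server concurrently.
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda loc: loc[0].read(loc[1]), layout)
    return b"".join(parts)

# File "f" striped round-robin across two data servers:
ds0 = DataServer({0: b"AAAA", 1: b"CCCC"})
ds1 = DataServer({0: b"BBBB", 1: b"DDDD"})
mds = MetadataServer({"f": [(ds0, 0), (ds1, 0), (ds0, 1), (ds1, 1)]})
print(pnfs_read(mds, "f"))   # b'AAAABBBBCCCCDDDD'
```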

Note, however, that this design still does not address the scalability of metadata processing. Moreover, for a stateful protocol in an active/standby high-availability architecture, failover is difficult to make smooth, because the standby node does not hold the state maintained by the active node.

In addition, the NFS protocol began to gain more enterprise-grade, data-center-oriented features:

NFSv4.1 added support for Remote Direct Memory Access (RDMA)[4], and NFS V4.2 added support for sparse files and server-side copy.
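Sparse-file support is visible from ordinary file APIs: seeking past the written data and writing again leaves a hole. A sketch on a local file (on an NFS v4.2 mount the client can preserve the hole over the wire instead of transferring zeros; whether blocks are actually saved depends on the underlying file system):

```python
import os

path = "sparse-demo.bin"
with open(path, "wb") as f:
    f.write(b"\x01" * 4096)      # 4 KiB of data
    f.seek(1024 * 1024)          # jump ~1 MiB forward: a hole, nothing written
    f.write(b"\x02" * 4096)      # 4 KiB more after the hole

st = os.stat(path)
print(st.st_size)                # logical size: 1052672 bytes
print(st.st_blocks * 512)        # allocated bytes: far less if the fs keeps holes
os.remove(path)
```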

This helps the NFS protocol support more serious data center/enterprise applications.

4. Continued evolution and engineering exploration: kernel mode vs. user mode

As the NFS protocol continues to develop, it not only grows more complex but also keeps exploring cluster scalability and availability, which brings new challenges at the engineering level. In the Linux world, the most commonly used NFSD service runs in kernel mode, but as the mechanisms and architecture grow more complex, implementing the NFS service in user mode looks like the more practical engineering choice. Samba[5], NFS's counterpart in the Linux world, has built a relatively complete clustering scheme on a user-mode foundation.

The open-source community has explored this problem as well. The most influential and widely used project is nfs-ganesha[6], currently maintained by Red Hat, which implements a complete NFS service in user mode.

As the nfs-ganesha architecture diagram shows, the project itself focuses on protocol-processing logic (supporting all NFS protocol versions) while designing an independent State Abstraction Layer (SAL) and File System Abstraction Layer (FSAL). The former helps extend cluster logic when handling the "state" of a stateful protocol; the latter eases integration with local VFS file systems and a variety of open-source and commercial distributed storage systems. Such a design makes it easier to build cluster logic with the help of external middleware and services, leaves room for more open designs, and keeps the community vital while broadening the project's reach by connecting more back-end storage systems.
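For a flavor of how the FSAL choice surfaces in practice, an illustrative nfs-ganesha export block might look like the following. This is a sketch only; consult the project's sample configurations for the authoritative option list:

```
EXPORT
{
    Export_Id = 1;               # unique id for this export
    Path = /export/data;         # backing path handed to the FSAL
    Pseudo = /data;              # where it appears in the NFSv4 pseudo-fs
    Access_Type = RW;
    Protocols = 3, 4;            # serve both NFSv3 and NFSv4 clients

    FSAL {
        Name = VFS;              # swap in another FSAL for other backends
    }
}
```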

The NFS protocol will continue to play an important role for a long time to come, and if NFS is to evolve to meet ever-expanding and increasingly complex needs, its engineering practice will keep being explored in user mode.

Conclusion

The NFS protocol is not only a driving force behind file sharing but also supports core businesses in industries such as supercomputing and broadcasting. As a network file protocol with decades of history, NFS has its historical limitations, and each iteration carries its own heavy historical baggage, but it can still be studied as a classic example of a file system. It is fair to say that understanding the NFS protocol step by step is also a process of re-understanding modern distributed file systems.

Resources

1. Why NFS Sucks (Olaf Kirch, SUSE/Novell, Inc.)

www.kernel.org/doc/ols/200…

2. Review of "Why NFS Sucks" Paper from the 2006 Linux Symposium

nfsworld.blogspot.com/2006/10/rev…

3. NFS and file locking

docstore.mik.ua/orelly/netw…

4. Comparisons between NFS versions (ycnian)

blog.csdn.net/ycnian/arti…

5. NFS Ganesha Architecture github.com/nfs-ganesha…

Reference links

[1] SMB en.wikipedia.org/wiki/Server…

[2] RFS (Remote File System) en.wikipedia.org/wiki/Remote…

[3] RPCSEC_GSS en.wikipedia.org/wiki/RPC_SE…

[4] RDMA en.wikipedia.org/wiki/Remote…

[5] Samba en.wikipedia.org/wiki/Samba_…

[6] nfs-ganesha github.com/nfs-ganesha…
