The Sun Network Filesystem: Design, Implementation and Experience

Introduction

The Sun Network Filesystem (NFS) provides transparent, remote access to filesystems. Unlike many
other remote filesystem implementations under UNIX, NFS is designed to be easily portable to other
operating systems and machine architectures. It uses an External Data Representation (XDR)
specification to describe protocols in a machine and system independent way. NFS is implemented on top
of a Remote Procedure Call package (RPC) to help simplify protocol definition, implementation, and
maintenance.
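
To make the idea of a machine-independent representation concrete, here is a minimal sketch of XDR-style encoding. It assumes the standard XDR rules that integers are 4 bytes in big-endian order and that variable-length data is a length word followed by the bytes, zero-padded to a 4-byte boundary. The helper names (xdr_uint, xdr_opaque, xdr_string) are made up for this illustration and are not part of Sun's library.

```python
import struct


def xdr_uint(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def xdr_opaque(data: bytes) -> bytes:
    """Encode variable-length data: length word, bytes, zero padding to 4 bytes."""
    padding = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + b"\x00" * padding


def xdr_string(text: str) -> bytes:
    """Encode an ASCII string using the variable-length opaque layout."""
    return xdr_opaque(text.encode("ascii"))


# The same byte sequence is produced on any host, whatever its native byte order.
encoded = xdr_uint(7) + xdr_string("hello")
```

Because every item occupies a multiple of 4 bytes, a receiver can decode a message field by field without knowing anything about the sender's architecture.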
In order to build NFS into the UNIX kernel in a way that is transparent to applications, we decided to add
a new interface to the kernel which separates generic filesystem operations from specific filesystem
implementations. The “filesystem interface” consists of two parts: the Virtual File System (VFS)
interface defines the operations that can be done on a filesystem, while the virtual node (vnode) interface
defines the operations that can be done on a file within that filesystem. This new interface allows us to
implement and install new filesystems in much the same way as new device drivers are added to the
kernel.
In this paper we discuss the design and implementation of the filesystem interface in the UNIX kernel and
the NFS virtual filesystem. We compare NFS to other remote filesystem implementations, and describe
some interesting NFS ports that have been done, including the IBM PC implementation under MS/DOS
and the VMS server implementation. We also describe the user-level NFS server implementation which
allows simple server ports without modification to the underlying operating system. We conclude with
some ideas for future enhancements.
In this paper we use the term server to refer to a machine that provides resources to the network; a client
is a machine that accesses resources over the network; a user is a person “logged in” at a client; an
application is a program that executes on a client; and a workstation is a client machine that typically
supports one user at a time.

NFS Protocol

The NFS protocol uses the Sun Remote Procedure Call (RPC) mechanism [1]. For the same reasons that
procedure calls simplify programs, RPC helps simplify the definition, organization, and implementation
of remote services. The NFS protocol is defined in terms of a set of procedures, their arguments and
results, and their effects. Remote procedure calls are synchronous, that is, the client application blocks
until the server has completed the call and returned the results. This makes RPC very easy to use and
understand because it behaves like a local procedure call.
NFS uses a stateless protocol. The parameters to each procedure call contain all of the information
necessary to complete the call, and the server does not keep track of any past requests. This makes crash
recovery very easy; when a server crashes, the client resends NFS requests until a response is received,
and the server does no crash recovery at all. When a client crashes, no recovery is necessary for either the
client or the server.
If state is maintained on the server, on the other hand, recovery is much harder. Both client and server need
to reliably detect crashes. The server needs to detect client crashes so that it can discard any state it is
holding for the client, and the client must detect server crashes so that it can rebuild the server’s state.
A stateless protocol avoids complex crash recovery. If a client just resends requests until a response is
received, data will never be lost due to a server crash. In fact, the client cannot tell the difference between
a server that has crashed and recovered, and a server that is slow.
Sun’s RPC package is designed to be transport independent. New transport protocols, such as ISO and
XNS, can be “plugged in” to the RPC implementation without affecting the higher level protocol code
(see appendix 3). NFS currently uses the DARPA User Datagram Protocol (UDP) and Internet Protocol
(IP) for its transport level. Since UDP is an unreliable datagram protocol, packets can get lost, but because
the NFS protocol is stateless and NFS requests are idempotent, the client can recover by retrying the call
until the packet gets through.
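
The retry behavior described above can be sketched as a small simulation. Everything here is hypothetical: StatelessServer, lossy_call, and call_with_retry are invented names, the request is a plain tuple rather than an RPC message, and a lost packet is modeled by returning None instead of by a timeout. The point is only that a request carrying an absolute offset can be executed more than once without changing the outcome.

```python
import random


class StatelessServer:
    """Hypothetical server: holds file data but no per-client or per-request state."""

    def __init__(self):
        self.files = {b"fh-1": bytearray(b"hello world")}

    def handle(self, request):
        # Each request carries everything needed to carry it out.
        op, fhandle, offset, arg = request
        data = self.files[fhandle]
        if op == "read":       # arg is the byte count
            return bytes(data[offset:offset + arg])
        if op == "write":      # arg is the data to store at the given offset
            data[offset:offset + len(arg)] = arg
            return len(arg)
        raise ValueError(f"unknown procedure {op!r}")


def lossy_call(server, request, rng, loss_rate=0.5):
    """Simulate an unreliable datagram exchange; None means nothing came back."""
    if rng.random() < loss_rate:
        return None            # request lost before reaching the server
    reply = server.handle(request)
    if rng.random() < loss_rate:
        return None            # reply lost: the call ran, but the client cannot know
    return reply


def call_with_retry(server, request, rng, max_tries=1000):
    """Resend the identical request until some reply arrives."""
    for _ in range(max_tries):
        reply = lossy_call(server, request, rng)
        if reply is not None:
            return reply
    raise TimeoutError("server not responding")


rng = random.Random(0)
server = StatelessServer()
written = call_with_retry(server, ("write", b"fh-1", 6, b"there"), rng)
contents = call_with_retry(server, ("read", b"fh-1", 0, 11), rng)
```

However many times the write is delivered, the file ends up with the same contents, which is why the client needs no logic beyond resending.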
The most common NFS procedure parameter is a structure called a file handle (fhandle or fh) which is
provided by the server and used by the client to reference a file. The fhandle is opaque, that is, the client
never looks at the contents of the fhandle, but uses it when operations are done on that file.
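
As an illustration of what "opaque" means in practice, the sketch below has a server pack some identifying fields into a byte string and a client that stores and returns those bytes without ever unpacking them. The layout used here (filesystem id, inode number, generation number) and all class and function names are assumptions made for the example, not the layout defined by the protocol.

```python
import struct

FSID = 1  # hypothetical filesystem id


def make_fhandle(inode: int, generation: int) -> bytes:
    """Server-side only: pack the fields that identify a file."""
    return struct.pack(">III", FSID, inode, generation)


class Server:
    def __init__(self):
        self.names = {"motd": 12}                               # name -> inode number
        self.inodes = {12: (3, b"message of the day\n")}        # inode -> (generation, data)

    def lookup(self, name: str) -> bytes:
        inode = self.names[name]
        generation, _ = self.inodes[inode]
        return make_fhandle(inode, generation)

    def read(self, fhandle: bytes, offset: int, count: int) -> bytes:
        # Only the server interprets the handle's contents.
        fsid, inode, generation = struct.unpack(">III", fhandle)
        current_generation, data = self.inodes[inode]
        if fsid != FSID or generation != current_generation:
            raise LookupError("stale file handle")
        return data[offset:offset + count]


class Client:
    def __init__(self, server: Server):
        self.server = server
        self.handles = {}

    def open(self, name: str) -> None:
        # The handle is kept as raw bytes and never parsed on the client.
        self.handles[name] = self.server.lookup(name)

    def read(self, name: str, offset: int, count: int) -> bytes:
        return self.server.read(self.handles[name], offset, count)


client = Client(Server())
client.open("motd")
first_word = client.read("motd", 0, 7)
```

Since the client treats the handle as a token, the server is free to change what it stores inside without any change to client code.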

The Filesystem Interface

The VFS interface is implemented using a structure that contains the operations that can be done on a
filesystem. Likewise, the vnode interface is a structure that contains the operations that can be done on
a node (file or directory) within a filesystem. There is one VFS structure per mounted filesystem in the
kernel and one vnode structure for each active node. Using this abstract data type implementation allows
the kernel to treat all filesystems and nodes in the same way without knowing which underlying filesystem
implementation it is using.
Each vnode contains a pointer to its parent VFS and a pointer to a mounted-on VFS. This means that any
node in a filesystem tree can be a mount point for another filesystem. A root operation is provided in the
VFS to return the root vnode of a mounted filesystem. This is used by the pathname traversal routines in
the kernel to bridge mount points. The root operation is used instead of keeping a pointer so that the root
vnode for each mounted filesystem can be released. The VFS of a mounted filesystem also contains a
pointer back to the vnode on which it is mounted so that pathnames that include “..” can also be traversed
across mount points.
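
The pointer arrangement described above can be sketched in a few lines. This is a simplified model, not the kernel's data structures: the attribute names (vfs_mounted_here, covered), the helper functions mkdir, mount, and traverse, and the in-memory directory tree are all assumptions made for illustration. For brevity the sketch keeps each root vnode alive permanently, whereas the text explains that the root operation exists so that the root vnode can be released.

```python
class Vnode:
    def __init__(self, vfs, name, parent=None):
        self.vfs = vfs                   # the VFS this node belongs to
        self.vfs_mounted_here = None     # VFS mounted on this node, if any
        self.name = name
        self.parent = parent             # None at the root of a filesystem
        self.children = {}

    def lookup(self, component):
        return self.children[component]


class Vfs:
    def __init__(self, fstype):
        self.fstype = fstype
        self.covered = None              # vnode this filesystem is mounted on
        self._root = Vnode(self, "/")

    def root(self):
        """Return the root vnode of this filesystem."""
        return self._root


def mkdir(dvp, name):
    vp = Vnode(dvp.vfs, name, parent=dvp)
    dvp.children[name] = vp
    return vp


def mount(vfs, mount_point):
    mount_point.vfs_mounted_here = vfs
    vfs.covered = mount_point


def traverse(start, path):
    """Walk a pathname, bridging mount points in both directions."""
    vp = start
    for component in path.split("/"):
        if component in ("", "."):
            continue
        if component == "..":
            # At the root of a mounted filesystem, step back to the covered vnode.
            while vp.parent is None and vp.vfs.covered is not None:
                vp = vp.vfs.covered
            if vp.parent is not None:
                vp = vp.parent
            continue
        vp = vp.lookup(component)
        # If something is mounted here, continue from its root vnode.
        while vp.vfs_mounted_here is not None:
            vp = vp.vfs_mounted_here.root()
    return vp


local = Vfs("local")
usr = mkdir(local.root(), "usr")
remote = Vfs("nfs")
src = mkdir(remote.root(), "src")
mount(remote, usr)
```

With this setup, traversing "usr/src" from the local root ends on a vnode owned by the "nfs" VFS, and following ".." twice from there leads back to the local root, without the traversal code knowing which filesystem type it is in.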
In addition to the VFS and vnode operations, each filesystem type must provide mount and mount_root
operations to mount normal and root filesystems. The operations defined for the filesystem interface are
given below. In the arguments and results, vp is a pointer to a vnode, dvp is a pointer to a directory vnode
and devvp is a pointer to a device vnode.

Concurrent Access and File Locking

NFS does not support remote file locking. We purposely did not include this as part of the protocol
because we could not find a set of file locking facilities that everyone agrees is correct. Instead we have a
separate, RPC based file locking facility. Because file locking is an inherently stateful service, the lock
service depends on yet another RPC based service called the status monitor [6]. The status monitor keeps
track of the state of the machines on a network so that the lock server can free the locked resources of a
crashed machine. The status monitor is important to stateful services because it provides a common view
of the state of the network.
Related to the problem of file locking is concurrent access to remote files by multiple clients. In the local
filesystem, file modifications are locked at the inode level. This prevents two processes writing to the
same file from intermixing data on a single write. Since the NFS server maintains no locks between
requests, and a write may span several RPC requests, two clients writing to the same remote file may get
intermixed data on long writes.
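
As a rough illustration of how a write that spans several RPC requests can interleave with another client's write, the sketch below splits each write into fixed-size pieces and applies them in one possible arrival order. The 4-byte limit, the names MAX_RPC_DATA and split_write, and the chosen ordering are all assumptions picked to keep the example small.

```python
MAX_RPC_DATA = 4  # hypothetical per-request data limit, tiny so the effect is visible


def split_write(offset, data):
    """Break one application write into a list of (offset, chunk) requests."""
    return [
        (offset + i, data[i:i + MAX_RPC_DATA])
        for i in range(0, len(data), MAX_RPC_DATA)
    ]


class Server:
    """Hypothetical server that applies each request independently, with no locks."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, chunk):
        self.data[offset:offset + len(chunk)] = chunk


server = Server(8)
a = split_write(0, b"AAAAAAAA")   # client A's single 8-byte write
b = split_write(0, b"BBBBBBBB")   # client B's single 8-byte write

# One possible arrival order: A's first half, all of B, then A's second half.
for offset, chunk in (a[0], b[0], b[1], a[1]):
    server.write(offset, chunk)

result = bytes(server.data)
```

Neither client's write survives intact here: the file holds the first half of B's data followed by the second half of A's.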

UNIX Open File Semantics

We tried very hard to make the NFS client obey UNIX filesystem semantics without modifying the server
or the protocol. In some cases this was hard to do. For example, UNIX allows removal of open files. A
process can open a file, then remove the directory entry for the file so that it has no name anywhere in the
filesystem, and still read and write the file. This is a disgusting bit of UNIX trivia and at first we were just
not going to support it, but it turns out that all of the programs that we didn’t want to have to fix (csh,
sendmail, etc.) use this for temporary files.
To make open file removal work on remote files, we check in the client VFS remove operation whether
the file is open and, if so, rename it instead of removing it. This makes it (sort of) invisible
to the client and still allows reading and writing. The client kernel then removes the new name when the
vnode becomes inactive. We call this the 3/4 solution because if the client crashes between the rename
and the remove, a garbage file is left on the server. An entry to cron can be added to clean up on the server,
but, in practice, this has never been necessary.
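
The rename-instead-of-remove approach can be sketched as follows. This is a toy model under several assumptions: the server is a single flat directory addressed by name (a real client would use file handles), the hidden-name scheme is invented, and the class and attribute names are hypothetical.

```python
import itertools


class Server:
    """Hypothetical server with one flat directory: name -> file data."""

    def __init__(self, files):
        self.dir = dict(files)

    def read(self, name):
        return self.dir[name]

    def rename(self, old, new):
        self.dir[new] = self.dir.pop(old)

    def remove(self, name):
        del self.dir[name]


class ClientVnode:
    def __init__(self, name):
        self.name = name
        self.open_count = 0
        self.remove_when_inactive = False


class Client:
    def __init__(self, server):
        self.server = server
        self._serial = itertools.count(1)

    def open(self, name):
        vp = ClientVnode(name)
        vp.open_count += 1
        return vp

    def read(self, vp):
        return self.server.read(vp.name)

    def remove(self, vp):
        if vp.open_count > 0:
            # File is still open: hide it under a new name instead of removing it.
            hidden = f".hidden{next(self._serial):04d}"   # made-up naming scheme
            self.server.rename(vp.name, hidden)
            vp.name = hidden
            vp.remove_when_inactive = True
        else:
            self.server.remove(vp.name)

    def close(self, vp):
        vp.open_count -= 1
        if vp.open_count == 0 and vp.remove_when_inactive:
            # The vnode is now inactive, so the hidden name can go away.
            self.server.remove(vp.name)


server = Server({"scratch": b"temporary data"})
client = Client(server)
vp = client.open("scratch")
client.remove(vp)
names_while_open = sorted(server.dir)
data_after_remove = client.read(vp)
client.close(vp)
names_after_close = sorted(server.dir)
```

If the client stopped running between remove and close, the hidden file would stay in the server's directory, which is the leftover garbage file the text mentions.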
Another problem associated with remote, open files is that access permission on the file can change while
the file is open. In the local case the access permission is only checked when the file is opened, but in the
remote case permission is checked on every NFS call. This means that if a client program opens a file,
then changes the permission bits so that it no longer has read permission, a subsequent read request will
fail. To get around this problem we save the client credentials in the file table at open time, and use them
in later file access requests.

Conclusions

We think that the NFS protocols, along with RPC and XDR, provide the most flexible method of remote
file access available today. To encourage others to use NFS, Sun has made public all of the protocols
associated with NFS. In addition, we have published the source code for the user level implementation of
the RPC and XDR libraries.