05-02-2013, 02:18 PM
Distributed File Systems (DFS)
Distributed File.ppt (Size: 528.5 KB / Downloads: 41)
Learning objectives
Understand the requirements that affect the design of distributed services
NFS: understand how a relatively simple, widely-used service is designed
Obtain a knowledge of file systems, both local and networked
Caching as an essential design technique
Remote interfaces are not the same as APIs
Security requires special consideration
Recent advances: appreciate the ongoing research that often leads to major advances
Storage systems and their properties
In the first generation of distributed systems (1974-95), file systems (e.g. NFS) were the only networked storage systems.
With the advent of distributed object systems (CORBA, Java) and the web, the picture has become more complex.
What is a file system?
Persistent stored data sets
Hierarchic name space visible to all processes
API with the following characteristics:
access and update operations on persistently stored data sets
Sequential access model (with additional random facilities)
Sharing of data between users, with access control
Concurrent access:
certainly for read-only access
what about updates?
Other features:
mountable file stores
more? ...
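The sequential access model with additional random facilities can be illustrated with an ordinary local file: reads advance an implicit position pointer, while seek() gives explicit random access. A minimal sketch (the file name and contents are invented for the example):

```python
# Sketch of the sequential-access model with a random-access facility
# (seek), using an ordinary local file as the stand-in for the data set.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Persistent data set: the bytes survive closing the file object.
with open(path, "w") as f:
    f.write("line one\nline two\n")

with open(path, "r") as f:
    first = f.readline()   # sequential: each read advances the position
    f.seek(0)              # random access: reposition explicitly
    again = f.readline()   # re-reads the same record

assert first == again == "line one\n"
```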
File Service Architecture
A clear separation of the main concerns in providing access to files is obtained by structuring the file service as three components:
A flat file service
A directory service
A client module.
The relevant modules and their relationships are shown next.
The client module implements the interfaces exported by the flat file and directory services on the server side.
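The three-component split can be sketched in a few lines of in-memory Python. The class and method names below are illustrative stand-ins for the abstract interfaces, not the real NFS RPC operations: the flat file service works only on unique file identifiers (UFIDs), the directory service only maps names to UFIDs, and the client module combines the two.

```python
# Illustrative sketch of the flat-file / directory / client split.
# Names and signatures are invented for the example.

class FlatFileService:
    """Operates on file contents, identified only by UFIDs."""
    def __init__(self):
        self._files, self._next = {}, 0
    def create(self):
        self._next += 1
        self._files[self._next] = bytearray()
        return self._next                       # a fresh UFID
    def write(self, ufid, pos, data):
        buf = self._files[ufid]
        buf[pos:pos + len(data)] = data
    def read(self, ufid, pos, n):
        return bytes(self._files[ufid][pos:pos + n])

class DirectoryService:
    """Maps text names to UFIDs; knows nothing about file contents."""
    def __init__(self):
        self._names = {}
    def add_name(self, name, ufid):
        self._names[name] = ufid
    def lookup(self, name):
        return self._names[name]

# The client module composes the two services into a conventional API.
ffs, ds = FlatFileService(), DirectoryService()
ufid = ffs.create()
ds.add_name("notes.txt", ufid)
ffs.write(ufid, 0, b"hello")
data = ffs.read(ds.lookup("notes.txt"), 0, 5)
```

Keeping naming out of the flat file service is what makes the separation clean: either service can be replaced or distributed independently.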
Case Study: Sun NFS
An industry standard for file sharing on local networks since the 1980s
An open standard with clear and simple interfaces
Closely follows the abstract file service model defined above
Supports many of the design requirements already mentioned:
transparency
heterogeneity
efficiency
fault tolerance
Limited achievement of:
concurrency
replication
consistency
security
Kerberized NFS
The Kerberos protocol is too costly to apply to each file access request
Kerberos is used in the mount service:
to authenticate the user's identity
User's UserID and GroupID are stored at the server with the client's IP address
For each file request:
The UserID and GroupID sent must match those stored at the server
IP addresses must also match
This approach has some problems
can't accommodate multiple users sharing the same client computer
all remote filestores must be mounted each time a user logs in
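The per-request check described above can be sketched as a table keyed by client IP address. This is a hedged illustration, not the actual NFS server data structure; the Kerberos exchange itself is elided.

```python
# Sketch of the Kerberized-NFS mount-time record and per-request check.
# The table keyed by client IP is an assumption made for illustration.

class MountTable:
    def __init__(self):
        self._by_addr = {}                 # client IP -> (uid, gid)

    def mount(self, client_ip, uid, gid):
        # In Kerberized NFS this is where the (costly) Kerberos
        # authentication happens, once per mount; elided here.
        self._by_addr[client_ip] = (uid, gid)

    def check(self, client_ip, uid, gid):
        # Every file request must match both the stored IDs and the IP.
        return self._by_addr.get(client_ip) == (uid, gid)

table = MountTable()
table.mount("10.0.0.5", uid=1001, gid=100)

ok = table.check("10.0.0.5", uid=1001, gid=100)   # matching request
spoofed = table.check("10.0.0.5", uid=0, gid=0)   # mismatched credentials
```

The sketch also makes the first problem visible: with one (uid, gid) record per client address, a second user on the same machine cannot be distinguished.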
NFS performance
Early measurements (1987) established that:
write() operations are responsible for only 5% of server calls in typical UNIX environments
hence write-through at server is acceptable
lookup() accounts for 50% of operations, due to the step-by-step pathname resolution necessitated by the naming and mounting semantics.
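The cost of step-by-step resolution is easy to see in a toy model: each pathname component requires its own lookup() against the directory holding it, i.e. one server round trip per component. The directory tree below is invented for the example.

```python
# Why lookup() dominates: a pathname is resolved one component at a
# time, one lookup() (one server round trip) per component.
# The tree contents here are made up for illustration.

tree = {
    "/":              {"usr": "/usr"},
    "/usr":           {"local": "/usr/local"},
    "/usr/local":     {"bin": "/usr/local/bin"},
    "/usr/local/bin": {"tool": "/usr/local/bin/tool"},
}

lookups = 0

def lookup(dir_handle, name):
    global lookups
    lookups += 1                  # count each simulated round trip
    return tree[dir_handle][name]

handle = "/"
for component in "usr/local/bin/tool".split("/"):
    handle = lookup(handle, component)

# 4 components -> 4 lookup() calls
```

Client-side caching of lookup results is the standard mitigation, which is one reason caching appears among the essential design techniques above.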
More recent measurements (1993) show high performance:
1 x 450 MHz Pentium III: > 5000 server ops/sec, < 4 millisec. average latency
24 x 450 MHz IBM RS64: > 29,000 server ops/sec, < 4 millisec. average latency
see www.spec.org for more recent measurements
Provides a good solution for many environments including:
large networks of UNIX and PC clients
multiple web server installations sharing a single file store