A distributed file system (DFS) allows users of distributed computers to share data and storage resources.
Its goal is to make the distribution transparent to both the user and the system.
Features of a good distributed file system
Transparency:
Network transparency: Clients use the same operations to access local and remote files. This is also known as access transparency.
Location transparency: A consistent name space exists for local and remote files. The name of a file does not reveal its location.
Replication transparency: To support scalability and fault tolerance, files may be replicated across multiple servers without clients having to be aware of the replicas.
Migration transparency: Files should be able to move from one machine to another without their names changing, and the DFS should support fine-grained distribution of data as required.
Concurrency transparency: All clients have the same view of the file system.
Failure transparency: The DFS should be fault tolerant, that is, client programs should continue to operate correctly after server failures, communication faults, and storage device crashes.
User mobility:
• The user should be able to access files from any machine.
• The DFS should facilitate this by bringing the user's environment (e.g. the home directory) to whichever machine the user logs in from.
Performance measurement: The time to service a file access request in a DFS should be comparable to that of a conventional file system.
Heterogeneity: File services should be provided across different operating systems and hardware platforms.
Scalability: The DFS design should be scalable so that it can withstand high service load, accommodate growth, and allow simple integration of added resources.
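As an illustration of the location, access, and migration transparency described above, here is a minimal Python sketch. The names FileServer and NameSpace and the example path are hypothetical, chosen only for this illustration, and the servers are plain in-memory objects rather than real machines.

```python
class FileServer:
    """Stands in for the storage of one machine (hypothetical, in-memory)."""
    def __init__(self, name):
        self.name = name
        self.files = {}  # path -> contents


class NameSpace:
    """Maps location-independent path names to the server holding the file."""
    def __init__(self):
        self._location = {}  # path -> FileServer

    def create(self, path, server, data=b""):
        server.files[path] = data
        self._location[path] = server

    def read(self, path):
        # Same call whether the file is local or remote (access transparency).
        return self._location[path].files[path]

    def write(self, path, data):
        self._location[path].files[path] = data

    def migrate(self, path, target):
        # Move the file to another server; the client-visible name is unchanged.
        source = self._location[path]
        target.files[path] = source.files.pop(path)
        self._location[path] = target


ns = NameSpace()
local, remote = FileServer("local"), FileServer("remote")
ns.create("/home/alice/notes.txt", local, b"hello")
ns.migrate("/home/alice/notes.txt", remote)
data = ns.read("/home/alice/notes.txt")  # same path before and after migration
```

The client only ever deals with the path; which server holds the data is an internal detail of the name space.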
File Sharing Semantics
A client uses the open() and close() operations to delimit a file session, within which read() and write() operations access the file.
How these read and write operations behave is defined by the file access semantics.
In a file sharing environment such as a DFS, concurrent access to a file must also be designed for, and this is the role of file sharing semantics.
Unix sequential semantics
Writes to an open file are visible immediately to all clients that have this file open.
This is implemented with sequential semantics, where a file has a single physical image and all accesses are served sequentially.
When clients cache the file, these semantics require a guarantee that at any given time only one client is allowed to write to a cached copy anywhere.
This requirement in turn calls for a distributed conflict resolution scheme.
Once a cached copy is modified, the update must be propagated immediately to all other cached copies.
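The rules above can be sketched in a few lines of Python. This is a single-process toy under stated assumptions, not a real distributed protocol: the class UnixSemanticsFile, the "write token" used to enforce the single-writer rule, and the client identifiers are all hypothetical names introduced here.

```python
class UnixSemanticsFile:
    """One file image plus client caches kept identical by immediate propagation."""
    def __init__(self, data=""):
        self.image = data    # the single physical image
        self.caches = {}     # client id -> cached copy
        self.writer = None   # client currently holding the write token

    def open(self, client):
        self.caches[client] = self.image

    def acquire_write(self, client):
        # Conflict resolution: only one client may write at any given time.
        if self.writer not in (None, client):
            raise PermissionError(f"{self.writer} holds the write token")
        self.writer = client

    def release_write(self, client):
        if self.writer == client:
            self.writer = None

    def write(self, client, data):
        if self.writer != client:
            raise PermissionError("write token required")
        self.image = data
        # Propagate the update immediately to every cached copy.
        for c in self.caches:
            self.caches[c] = data

    def read(self, client):
        return self.caches[client]


f = UnixSemanticsFile("v1")
f.open("A")
f.open("B")
f.acquire_write("A")
f.write("A", "v2")
seen_by_b = f.read("B")  # B sees A's write without reopening the file
```

In a real DFS the propagation step is a network message to every caching client, which is why these semantics are costly to provide.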
Session Semantics
Under session semantics, writes to an open file are visible immediately to local clients, but not to remote clients that have the same file open simultaneously.
The changes are propagated when the file is closed and become visible to remote clients in their next session, that is, the next time they open the file.
Session semantics are well suited to caching entire files.
Reads and writes within a session (between open and close) can easily be served from the cached copy.
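A minimal Python sketch of this behaviour, assuming whole-file caching and a server modelled as an in-memory dictionary. The classes Server and Session are hypothetical names used only for illustration.

```python
class Server:
    """Holds the authoritative copy of each file (in-memory stand-in)."""
    def __init__(self):
        self.files = {}  # path -> contents


class Session:
    """An open..close session working on a private cached copy of the whole file."""
    def __init__(self, server, path):
        self.server = server
        self.path = path
        self.cache = server.files.get(path, "")  # whole file fetched at open
        self.dirty = False

    def read(self):
        return self.cache  # served from the cached copy

    def write(self, data):
        self.cache = data  # visible only inside this session for now
        self.dirty = True

    def close(self):
        # Changes are propagated to the server only when the file is closed.
        if self.dirty:
            self.server.files[self.path] = self.cache


server = Server()
server.files["f"] = "old"
a = Session(server, "f")
b = Session(server, "f")
a.write("new")
during = b.read()   # b still sees "old" while a's session is open
a.close()
after = b.read()    # b's current session keeps its own copy
c = Session(server, "f")
fresh = c.read()    # a new session sees the propagated change
```

If two sessions both write and close, the last close wins in this sketch; real systems have to decide explicitly how to handle that case.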
Immutable shared file semantics
An immutable file is a file that cannot be modified once it has been created.
It can only be opened and read.
Modifying it requires creating a new copy.
Such a file has two properties: its name may not be reused and its contents may not be altered.
Sharing is allowed only in read mode, so these semantics are very easy to implement.
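The two properties can be illustrated with a small Python sketch. ImmutableStore and the file names below are hypothetical, and the store is an in-memory dictionary.

```python
class ImmutableStore:
    """Files can be created once and then only read."""
    def __init__(self):
        self._files = {}  # name -> contents

    def create(self, name, data):
        # A name may never be reused.
        if name in self._files:
            raise FileExistsError(name)
        self._files[name] = bytes(data)  # bytes objects cannot be altered in place

    def read(self, name):
        return self._files[name]


store = ImmutableStore()
store.create("report.v1", b"draft")
# "Modifying" the file means creating a new file under a new name.
store.create("report.v2", store.read("report.v1") + b" final")
```

Because no file ever changes, any number of clients can cache and share it without a consistency protocol.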
Transaction-like semantics
A process gains access to a file or a group of files by performing a “begin transaction” operation, which signals that all subsequent operations are to be executed atomically.
When the transaction finishes, an “end transaction” primitive is executed.
Under concurrent access, the transactions are serialized.
This is implemented using locks.
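A minimal Python sketch of lock-based transactions, assuming a single process, one coarse lock over all files, and updates staged on a private copy. TransactionalFiles is a hypothetical name; threading.Lock and contextlib.contextmanager are from the standard library.

```python
import threading
from contextlib import contextmanager


class TransactionalFiles:
    """All-or-nothing updates to a group of files, serialized by one lock."""
    def __init__(self):
        self.files = {}                # name -> contents
        self._lock = threading.Lock()  # coarse lock: one transaction at a time

    @contextmanager
    def transaction(self):
        with self._lock:               # "begin transaction"
            staged = dict(self.files)  # operate on a private copy
            yield staged
            # "end transaction": reached only if the body raised no exception,
            # so either all staged changes are committed or none are.
            self.files = staged


fs = TransactionalFiles()
with fs.transaction() as t:
    t["a"] = "1"
    t["b"] = "2"

try:
    with fs.transaction() as t:
        t["a"] = "changed"
        raise RuntimeError("abort")    # the staged change is discarded
except RuntimeError:
    pass
```

A single global lock is the simplest way to obtain serialization; per-file locks would allow more concurrency at the cost of having to avoid deadlocks.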
Designing a DFS involves a compromise between performance goals such as increasing client performance, reducing network traffic, and minimizing the load on the server.
Thus, the semantics should be chosen with the usage pattern of the file system in mind.