written 6.8 years ago | modified 2.8 years ago
Subject: Parallel and Distributed System
Topic: Distributed File System
Difficulty: Medium
The Andrew File System (AFS) is a DFS that came out of the Andrew research project at Carnegie Mellon University (CMU). Its goal was to develop a DFS that would scale to all computers on the university’s campus. It was later developed into a commercial product, and an open-source branch was released under the name “OpenAFS”.
AFS offers the same API as Unix, implements Unix semantics for processes on the same machine, but implements write-on-close semantics globally. All data in AFS is mounted in the /afs directory and organised in cells (e.g. /afs/cs.cmu.edu).
Cells are administrative units that manage users and servers. Files and directories are stored on a collection of trusted servers called Vice. Client processes accessing AFS are redirected by the file system layer to a local user-level process called Venus (the AFS daemon), which then connects to the servers.
The servers serve whole files, which are cached as a whole on the clients’ local disks. For cached files a callback is installed on the corresponding server. After a process finishes modifying a file by closing it, the changes are written back to the server. The server then uses the callbacks to invalidate the file in other clients’ caches.
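The whole-file caching and callback mechanism above can be sketched as follows. This is a minimal illustrative model, not OpenAFS code: the `Server` and `Client` classes and their method names are assumptions made for the example.

```python
class Server:
    """Toy Vice server: serves whole files and tracks callbacks."""
    def __init__(self):
        self.files = {}       # filename -> contents
        self.callbacks = {}   # filename -> set of clients caching the file

    def fetch(self, client, name):
        # Serve the whole file and install a callback for this client.
        self.callbacks.setdefault(name, set()).add(client)
        return self.files.get(name, "")

    def store(self, writer, name, data):
        # Invoked when a client closes a modified file (write-on-close).
        self.files[name] = data
        # Break callbacks: invalidate the file in every other client's cache.
        for client in self.callbacks.get(name, set()) - {writer}:
            client.invalidate(name)
        self.callbacks[name] = {writer}


class Client:
    """Toy Venus daemon: caches whole files on the local disk."""
    def __init__(self, server):
        self.server = server
        self.cache = {}       # local whole-file cache

    def open(self, name):
        if name not in self.cache:               # no valid cached copy
            self.cache[name] = self.server.fetch(self, name)
        return self.cache[name]                  # otherwise: no server traffic

    def close(self, name, data):
        self.cache[name] = data
        self.server.store(self, name, data)      # write back on close

    def invalidate(self, name):
        self.cache.pop(name, None)               # callback broken by server
```

Note that a client holding a callback can serve repeated opens purely from its local cache, which is why AFS generates so little validation traffic.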
As a result, clients do not have to validate cached files on access (except after a reboot) and hence there is only very little cache validation traffic. Data is stored on flexible volumes, which can be resized and moved between the servers of a cell. Volumes can be marked as read only, e.g. for software installations.
AFS does not trust Unix user IDs and instead uses its own IDs, which are managed at the cell level. Users have to authenticate with Kerberos by using the klog command. On successful authentication, a token is installed in the client’s cache manager.
When a process tries to access a file, the cache manager checks whether there is a valid token and enforces the access rights. Tokens carry a timestamp and expire, so users have to renew their token from time to time. Authorisation is implemented by directory-based ACLs, which allow finer-grained access rights than Unix permissions.