CITI Experience with Directory Delegations
NB: this is a rough work in progress and will be fleshed out over the next few days; thank you.
Directory Delegations background
NFSv4.1 introduces read-only directory delegations, a protocol addition intended to let clients cache directory contents and metadata more aggressively. The following subsections quote section 11 of the NFSv4.1 minor version draft:
NFSv4 client caching behavior
"Directory caching for the NFS version 4 protocol is similar to previous versions. Clients typically cache directory information for a duration determined by the client. At the end of a predefined timeout, the client will query the server to see if the directory has been updated. By caching attributes, clients reduce the number of GETATTR calls made to the server to validate attributes. Furthermore, frequently accessed files and directories, such as the current working directory, have their attributes cached on the client so that some NFS operations can be performed without having to make an RPC call. By caching name and inode information about most recently looked up entries in DNLC (Directory Name Lookup Cache), clients do not need to send LOOKUP calls to the server every time these files are accessed."
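The timeout-based revalidation described in the quoted passage can be sketched roughly as follows. This is a toy model, not real NFS client code: the class name AttrCache, the getattr_rpc callable, and the 3-second timeout are all assumptions made for illustration.

```python
import time


class AttrCache:
    """Toy timeout-based attribute cache (illustrative only)."""

    def __init__(self, getattr_rpc, timeout=3.0, clock=time.monotonic):
        # getattr_rpc is a hypothetical stub standing in for a GETATTR call:
        # it takes a path and returns that path's attributes.
        self._getattr_rpc = getattr_rpc
        self._timeout = timeout
        self._clock = clock
        self._entries = {}  # path -> (attrs, time the attrs were fetched)
        self.rpc_count = 0  # number of simulated GETATTR calls sent

    def getattr(self, path):
        now = self._clock()
        cached = self._entries.get(path)
        if cached is not None and now - cached[1] < self._timeout:
            # Entry is still fresh: answer locally, no RPC.
            return cached[0]
        # Missing or expired entry: revalidate against the server.
        self.rpc_count += 1
        attrs = self._getattr_rpc(path)
        self._entries[path] = (attrs, now)
        return attrs
```

In this model, repeated calls within the timeout cost a single simulated GETATTR, and the first call after the timeout expires costs another one whether or not the file actually changed.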
NFSv4.1 delegations extensions
"[The NFSv4] caching approach works reasonably well at reducing network traffic in many environments. However, it does not address environments where there are numerous queries for files that do not exist. In these cases of "misses", the client must make RPC calls to the server in order to provide reasonable application semantics and promptly detect the creation of new directory entries. Examples of high miss activity are compilation in software development environments. The current behavior of NFS limits its potential scalability and wide-area sharing effectiveness in these types of environments."
Furthermore, analysis of NFSv3 network traces (NFSv4 mirrors NFSv3's client cache semantics) by Brian Wickman at the University of Michigan (FIXME: need link to a copy of his prelim) shows that a significant share of NFS traffic consists of the periodic GETATTRs that clients send when a timeout triggers a cache revalidation.
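To make the "misses" problem from the quoted passage concrete, here is a deliberately simplified sketch of a name cache that remembers successful lookups but must ask the server about every name it has not seen. The class NameCache and the lookup_rpc callable are hypothetical, and real clients are more elaborate than this.

```python
class NameCache:
    """Toy DNLC-style cache that stores only successful lookups."""

    def __init__(self, lookup_rpc):
        # lookup_rpc is a hypothetical stub standing in for a LOOKUP call:
        # it takes a name and returns a file handle, or None if absent.
        self._lookup_rpc = lookup_rpc
        self._names = {}    # name -> file handle
        self.rpc_count = 0  # number of simulated LOOKUP calls sent

    def lookup(self, name):
        if name in self._names:
            # Hit on a previously looked-up entry: no RPC needed.
            return self._names[name]
        # Unknown name: only the server can say whether it exists now.
        self.rpc_count += 1
        handle = self._lookup_rpc(name)
        if handle is not None:
            self._names[name] = handle
        return handle


# A compiler probing an include directory for headers (made-up names).
server_dir = {"stdio.h": 1}
cache = NameCache(server_dir.get)
for _ in range(3):
    cache.lookup("stdio.h")    # one RPC, then served from the cache
for _ in range(3):
    cache.lookup("missing.h")  # one RPC per attempt
```

In this model the three lookups of the existing name cost one RPC in total, while each lookup of the nonexistent name costs one. A directory delegation is meant to let the client trust its cached view of the directory, so that lookups like the last three could in principle be answered locally.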