CITI ASC status
Revision as of 18:00, 12 October 2006
I started with the May 2006 report, which we can bring up to date for the October 2006 report.
University of Michigan/CITI NFSv4 ASC alliance
Status of May 2006
Task 1. Demonstration of pNFS with multiple back end methods (PVFS and File) including layout recall — LANL will replicate this demonstration at LANL working with CITI remotely.
Development.
We've updated the pNFS client and server to the 2.6.17 kernel level, and will rebase again for 2.6.19. We've updated the pNFS codebase to draft-ietf-nfsv4-minorversion1-05. Through testing we've identified and fixed multiple bugs.
We rewrote the Linux pNFS client to use its own set of RPC operations, cleanly separating the common NFS v2/3/4/4.1 code from the pNFS-specific code. We now have four client layout modules under development: the file layout driver, developed jointly by CITI, Network Appliance, and IBM Almaden; the PVFS2 layout driver, developed at CITI by Dean Hildebrand; the object layout driver, developed by Panasas; and the block layout driver, developed at CITI under contract from EMC. We've expanded the layout operation interface and the layout policy interface between the layout drivers and the generic pNFS client to accommodate the requirements of the multiple layout drivers. We are designing and coding a pNFS client layout cache to replace the current simple single-layout-per-inode implementation.
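To make the split between the generic client and the per-layout modules concrete, here is a minimal sketch of what such a layout driver interface can look like. All names in it (example_layoutdriver_ops and its members) are hypothetical illustrations, not the actual CITI or Linux kernel symbols.

#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical sketch of a per-layout-type driver interface. */
struct example_layout;                    /* opaque, owned by each driver */

struct example_layoutdriver_ops {
        const char *name;                 /* "file", "pvfs2", "object", "block" */
        uint32_t    layout_type;          /* LAYOUT4_* value from the draft */

        /* Turn an opaque LAYOUTGET result into driver-private state. */
        struct example_layout *(*decode_layout)(const void *body, size_t len);
        void (*free_layout)(struct example_layout *layout);

        /* Layout policy: may the generic client use this layout for this I/O? */
        int (*use_layout)(struct example_layout *layout, off_t off, size_t len);

        /* Send the I/O described by the layout directly to the data servers. */
        int (*read_pagelist)(struct example_layout *layout, off_t off, size_t len);
        int (*write_pagelist)(struct example_layout *layout, off_t off, size_t len);
};

/* Each layout module registers its table; the generic client dispatches
 * through it, keeping the common NFS v2/3/4/4.1 code layout-agnostic. */
int example_register_layoutdriver(const struct example_layoutdriver_ops *ops);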
We've improved the interface between the Linux pNFS server and the underlying file system, which is now used by the Panasas object layout server as well as the IBM GPFS server. We are currently coding the server's pNFS layout management service and file system interfaces to bookkeep layouts, in order to expand the current simple single-layout recall implementation.
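As a rough illustration of the bookkeeping involved, the sketch below records granted layout segments per file so that the server can recall only the segments that conflict with a new request. The structures and the layout_recall_conflicting helper are hypothetical names invented for this example, not the CITI server code.

#include <stdint.h>
#include <stddef.h>

struct layout_record {
        uint64_t client_id;              /* client holding this layout     */
        uint64_t offset, length;         /* byte range covered             */
        uint32_t iomode;                 /* read-only or read/write        */
        struct layout_record *next;
};

struct layout_state {                    /* one per exported file          */
        struct layout_record *granted;
};

/* Recall every granted segment that overlaps [offset, offset + length). */
static int layout_recall_conflicting(struct layout_state *ls,
                                     uint64_t offset, uint64_t length)
{
        int recalls = 0;

        for (struct layout_record *r = ls->granted; r != NULL; r = r->next) {
                if (offset < r->offset + r->length &&
                    r->offset < offset + length) {
                        /* a CB_LAYOUTRECALL would be sent to r->client_id here */
                        recalls++;
                }
        }
        return recalls;
}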
We've continued to develop the PVFS2 layout driver and the PVFS2 pNFS server. (XXX Dean)
We developed prototype implementations of the pNFS operations OP_GETDEVICELIST, OP_GETDEVICEINFO, OP_LAYOUTGET, OP_LAYOUTCOMMIT, OP_LAYOUTRETURN, and OP_CB_LAYOUTRECALL.
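For orientation, the sketch below shows the order in which a client might use these operations for a single write. Every helper in it (op_layoutget, write_to_data_servers, op_layoutcommit, op_layoutreturn, write_through_mds) is a hypothetical placeholder standing in for the real code paths, not an actual kernel function.

#include <stddef.h>
#include <sys/types.h>

struct layout_seg;                                       /* opaque layout handle   */

struct layout_seg *op_layoutget(int fd, off_t off, size_t len);   /* OP_LAYOUTGET    */
int write_to_data_servers(struct layout_seg *seg, const void *buf,
                          size_t len, off_t off);        /* direct data-server I/O */
int op_layoutcommit(int fd, off_t off, size_t len);      /* OP_LAYOUTCOMMIT        */
int op_layoutreturn(int fd, struct layout_seg *seg);     /* OP_LAYOUTRETURN        */
int write_through_mds(int fd, const void *buf, size_t len, off_t off);

int pnfs_write_example(int fd, const void *buf, size_t len, off_t off)
{
        struct layout_seg *seg = op_layoutget(fd, off, len);

        if (seg == NULL)                 /* no layout: fall back to I/O    */
                return write_through_mds(fd, buf, len, off);   /* via the MDS */

        write_to_data_servers(seg, buf, len, off);
        op_layoutcommit(fd, off, len);   /* make the data-server writes visible */
        return op_layoutreturn(fd, seg);
}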
We continue testing the prototype’s ability to send direct I/O data to data servers.
Milestones
At the September NFSv4 bakeathon hosted by CITI, we continued to test the ability of CITI's Linux pNFS client to operate with multiple layouts, and of CITI's Linux pNFS server to export pNFS-capable underlying file systems. We demonstrated the Linux pNFS client's support for multiple layouts by copying files between multiple pNFS back ends.
The following pNFS implementations were tested.

File layout: Linux and Solaris clients; Network Appliance, Linux (IBM GPFS), DESY dCache, and Solaris servers.
Object layout: Linux client; Linux (Panasas) server.
Block layout: Linux client; EMC server.
Activities
We are expanding our simple single whole-file layout implementation to support multiple small byte-range layouts, which requires a new layout cache implementation on the client and a new layout management implementation on the server.
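As an illustration of what such a client cache has to do, the sketch below keeps a list of cached byte-range segments per inode and reports a miss when a new OP_LAYOUTGET is needed. The structure and function names are invented for this example.

#include <stdint.h>
#include <stddef.h>

struct cached_layout_seg {
        uint64_t offset, length;         /* byte range this segment covers */
        uint32_t iomode;                 /* read vs. read/write            */
        struct cached_layout_seg *next;
};

struct inode_layout_cache {              /* replaces the single whole-file entry */
        struct cached_layout_seg *segs;
};

/* Return a cached segment covering [off, off + len), or NULL so the caller
 * knows it must issue a new OP_LAYOUTGET for that range. */
static struct cached_layout_seg *
layout_cache_lookup(struct inode_layout_cache *c, uint64_t off, uint64_t len)
{
        for (struct cached_layout_seg *s = c->segs; s != NULL; s = s->next)
                if (s->offset <= off && off + len <= s->offset + s->length)
                        return s;
        return NULL;
}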
In cooperation with EMC, we continue to develop a block layout driver module for the generic pNFS client.
We continue to measure I/O performance.
We joined the Ultralight project (http://www.ultralight.org) and are testing pNFS I/O using 10G pNFS clients against 1G pNFS clusters. The Linux pNFS client is included in the Ultralight kernel, which is distributed to Ultralight sites, providing opportunities for long-distance WAN testing.
DESY (?)
Task 2. Migration of client from one mount/metadata server to another to be demonstrated. This demonstration may be replicated at LANL depending on success of this work.
When a file system moves, the former server notifies clients with NFS4ERR_MOVED. Clients then reclaim state held on the former server by engaging in reboot recovery with the new server. For cluster file systems, server-to-server state transfer lets clients avoid the reclaim.
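The sketch below outlines that client-side sequence under the assumptions just described. handle_moved and the helpers it calls are hypothetical names used only for illustration; they are not functions in the Linux client.

#include <stdbool.h>

struct nfs_server;                                               /* opaque                 */

struct nfs_server *lookup_fs_locations(struct nfs_server *old);  /* query fs_locations     */
bool server_transferred_state(struct nfs_server *new_srv);       /* did state come along?  */
int reclaim_state_with(struct nfs_server *new_srv);              /* reboot-style recovery  */

/* Called after an operation fails with NFS4ERR_MOVED. */
int handle_moved(struct nfs_server *old_srv)
{
        struct nfs_server *new_srv = lookup_fs_locations(old_srv);

        if (server_transferred_state(new_srv))
                return 0;                /* cluster fs: state moved with the data */

        return reclaim_state_with(new_srv);   /* otherwise reclaim as after a reboot */
}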
We redesigned state bookkeeping to ensure that state created on NFSv4 servers exporting the same cluster file system will not collide. We are rewriting the interface the server uses to save NFSv4 client state in stable storage so that it also supports server-to-server state transfer.
It remains to inform clients that state established with the former server remains valid on the new server. The IETF is considering solutions, e.g., augmented FS_LOCATIONS information or a new error code, NFS4ERR_MOVED_DATA_AND_STATE.
Task 3. Analysis of caching and lock coherency, demonstration of caching and lock performance with scaling, under various levels of conflict, using byte range locks (looking at lock splitting issues etc.).
We have set up test machines and begun planning for tests. We have some immediate concerns over the memory footprint imposed by server lock structures.
Task 4. Analysis of directory delegations – how well does it work and when, when does it totally not work.
Development
We have implemented directory delegations in the Linux client and server. Our server implementation of directory delegations follows the file delegations architecture. We extended the lease API in the Linux VFS to support read-only leases on directories and NFS-specific lease-breaking semantics.
We implemented a /proc interface for enabling or disabling directory delegation at run time. At startup, the client queries the server for directory delegation support.
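A user-space illustration of such a run-time toggle follows; the /proc path shown is a hypothetical placeholder and may not match the name used by the CITI patches.

#include <stdio.h>

/* Enable (1) or disable (0) directory delegations at run time. */
int set_directory_delegations(int enable)
{
        /* hypothetical /proc entry for the directory-delegation switch */
        FILE *f = fopen("/proc/fs/nfsd/dir_delegations", "w");

        if (f == NULL)
                return -1;
        fprintf(f, "%d\n", enable ? 1 : 0);
        return fclose(f);
}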
Directory delegations promise to extend the usefulness of negative dentry caching on the client. Negative caching is unsafe without cache invalidation (positive caching can be treated as a hint). For example, opening a file that does not exist produces an OPEN RPC that fails. Open-to-close semantics and the lack of consistent negative caching require that each subsequent open of the same non-existent file send another OPEN RPC to the server. This happens frequently when searching for an executable in PATH or a shared library in LD_LIBRARY_PATH.
Directory delegation enables negative caching by assuring that no entries have been added or modified in a cached directory. This should markedly decrease unnecessary repeated checks for non-existent files. We are testing this use case.
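The sketch below illustrates why a held delegation makes a remembered ENOENT safe to reuse. All helper names in it are hypothetical and exist only for this example.

#include <stdbool.h>

bool have_dir_delegation(const char *dir);   /* delegation on the directory still held */
bool neg_cache_contains(const char *path);   /* an earlier lookup returned ENOENT      */
void neg_cache_insert(const char *path);
int  send_open_rpc(const char *path);        /* returns a negative value on ENOENT     */

int open_with_negative_cache(const char *dir, const char *path)
{
        /* With a delegation, the server guarantees the directory has not
         * changed, so the cached ENOENT is still valid and no OPEN RPC is
         * needed, as in repeated PATH searches for a missing executable. */
        if (have_dir_delegation(dir) && neg_cache_contains(path))
                return -1;

        int err = send_open_rpc(path);
        if (err < 0)
                neg_cache_insert(path);      /* remember the miss for next time */
        return err;
}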
The server has hooks for a policy layer to control the granting of directory delegations. (No policy is implemented yet.) When and whether to acquire delegations is also a client concern.
Testing
We are testing delegation grant and recall in a test rig with one or two clients. Testing consists mostly of comparing NFS operation counts with directory delegations enabled and disabled.
Tests range from simple UNIX utilities — ls, find, touch — to hosting a CVS repository or compiling with shared libraries and header files on NFS servers. Tests will become more specific.
We have extended PyNFS to support directory delegations. So far, the support is basic and the tests are trivial. Tests will become more specific.
We are designing mechanisms that allow simulation experiments to compare delegation policies on NFSv4 network traces.
Task 5. How do you specify/measure NFS Server load.
We have no progress to report on this task.