Cluster client migration prototype

From Linux NFS


Revision as of 22:07, 9 January 2008

Client migration prototype

This is still a work in progress [2008-01-09].

As part of CITI's work with IBM, we looked at some of the issues involved with NFSv4 client migration and developed an initial prototype. Our setup involved a cluster of equivalent NFS servers attached to a GFS2 disk array, with each server exporting the same directory from the GFS2 filesystem. The intent was to provide an interface by which an administrator could selectively migrate NFSv4 clients from one server to another (e.g., to take a server down for maintenance).


The prototype is a proof of concept. The "right way" to migrate a client would be to transfer all of the client-related state from one server to another and then have the client reorient to the new server and continue without interruption. Instead, this prototype leverages parts of the existing reboot-recovery process. To briefly explain reboot-recovery: when a Linux NFSv4 server starts, it enters a grace period of roughly 90 seconds, during which eligible clients may contact the server and reclaim state for the open files and locks they were holding prior to a server crash or reboot. In order to allow clients to reclaim state without conflicts, new opens, etc., are disallowed during the grace period.
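The grace-period rule described above can be illustrated with a small model. This is only a sketch and not the Linux server's code: the class <tt>GraceServer</tt>, its fields, and the fixed 90-second constant are made up for illustration, and the status strings are borrowed from the NFSv4 error names only to label the outcomes.

```python
from dataclasses import dataclass, field

GRACE_SECONDS = 90  # approximate figure from the text; an assumption here


@dataclass
class GraceServer:
    """Toy model of a server that has just (re)started. Hypothetical names."""
    started_at: float
    # Clients the server remembers from before the restart.
    eligible: set = field(default_factory=set)

    def in_grace(self, now: float) -> bool:
        return now - self.started_at < GRACE_SECONDS

    def open(self, client: str, now: float, reclaim: bool = False) -> str:
        if self.in_grace(now):
            if not reclaim:
                # New opens are refused so they cannot conflict with reclaims.
                return "NFS4ERR_GRACE"
            # Only clients known from before the restart may reclaim.
            return "NFS4_OK" if client in self.eligible else "NFS4ERR_RECLAIM_BAD"
        # After the grace period, normal opens succeed and reclaims are too late.
        return "NFS4ERR_NO_GRACE" if reclaim else "NFS4_OK"
```

For example, with <tt>GraceServer(started_at=0.0, eligible={"clientA"})</tt>, a reclaim by <tt>clientA</tt> at time 10 succeeds while an ordinary open at the same time is refused; at time 100 the situation is reversed.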

During a migration, the cluster is put into an artificial grace period and the target-server is notified that a new client is eligible to perform reclaims. When the client contacts the source-server, it receives an error message saying that the file system has moved and sees that it should migrate to the target-server. The client establishes a connection to the target-server and reclaims its state almost identically to how it would after a server reboot. Shortly thereafter, the grace period expires, the client is purged from the source-server, and then it's business as usual.
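The sequence of events in a migration can be sketched as a small simulation. Everything here is hypothetical (the <tt>Server</tt> class and the functions <tt>start_migration</tt>, <tt>client_request</tt>, and <tt>end_grace</tt> do not correspond to the prototype's actual interfaces); it only mirrors the order of steps in the paragraph above.

```python
class Server:
    """Toy stand-in for one NFS server in the cluster (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.state = {}           # client id -> set of open files
        self.moved = {}           # client id -> Server it should migrate to
        self.reclaimable = set()  # clients allowed to reclaim here
        self.in_grace = False

    def handle(self, client_id):
        # A migrating client is told the file system has moved, and where to.
        if client_id in self.moved:
            return "NFS4ERR_MOVED", self.moved[client_id]
        return "NFS4_OK", None

    def reclaim(self, client_id, files):
        if not (self.in_grace and client_id in self.reclaimable):
            return "NFS4ERR_NO_GRACE"
        self.state[client_id] = set(files)
        return "NFS4_OK"


def start_migration(cluster, client_id, source, target):
    for server in cluster:              # artificial grace period, cluster-wide
        server.in_grace = True
    target.reclaimable.add(client_id)   # target is told to expect reclaims
    source.moved[client_id] = target


def client_request(client_id, open_files, server):
    """Issue one request; on 'moved', reclaim state at the target server."""
    status, target = server.handle(client_id)
    if status == "NFS4ERR_MOVED":
        target.reclaim(client_id, open_files)
        return target
    return server


def end_grace(cluster, client_id, source):
    for server in cluster:
        server.in_grace = False
        server.reclaimable.discard(client_id)
    source.state.pop(client_id, None)   # purge the client from the source
```

A client with one open file on the source would, after <tt>start_migration</tt>, have its next request redirected by <tt>client_request</tt> to the target, where its open state reappears; <tt>end_grace</tt> then removes its state from the source.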

The prototype is limited in many ways: for ease of integration, only the creation of a symlink completes a migration event on the client; there is no security associated with the triggering of a migration; the GFS2 code in the kernel version used in the prototype is very fragile; and the list goes on. Nevertheless, we have migrated clients at CITI that are able to continue functioning normally after a migration, to the extent that the maturity of that earlier GFS2 code permits.


A network trace (http://www.citi.umich.edu/u/richterd/migration-moved-and-good-open-reclaim-3--apikia-rhcl1-rhcl2.pcap) of the client 141.211.133.86 migrating from server 141.211.133.212 to 141.211.133.213 is available from CITI's website. After filtering for only NFS traffic, packets 104/106 show a file initially being opened, after which the migration was triggered. Packets 128/130 then show the client trying to make a symlink and getting a "moved" error; packets 140/142 show the client making contact with the target server; and packets 156/158 show the client reclaiming state for the file it had open. Finally, packets 239/241 show subsequent "normal" operation as another file is read after the artificial grace period expired.
