Client sessions Implementation Issues

From Linux NFS

Revision as of 02:54, 31 October 2009

The client forechannel and backchannel functionality has been integrated into Linux-2.6.31. The server forechannel functionality has been integrated into Linux-2.6.30. The server backchannel functionality has been integrated into Linux-2.6.32. This document highlights functionality that is not yet fully implemented, not fully tested, or whose status needs to be checked. It also lists known issues and bugs. These issues and pending items should be addressed before the NFSv4.1 client can be changed from Developer to Experimental, a change that will let distros include the functionality in their releases more comfortably.

Legend

Note: The labeling still needs to be reviewed by the v4.1 Linux community.

  • An (A) indicates the issue needs to be addressed prior to status change
  • A (B) indicates the issue can be deferred until after the status change

This is only a first pass. In the near future we'll probably break the list into two sections: items needed before the change of status and items needed after the change of status.

NFSv4.1 Sessions

Backchannel

  • Duplicate Reply Cache (B)
    • Not yet implemented
    • The backchannel currently only implements idempotent operations.
  • Alternate connection for the backchannel (B)
    • Not yet implemented
    • The backchannel can currently only be bound to the existing forechannel connection.
    • BIND_CONN_TO_SESSION (Separate Connection) (B)
      • Not yet implemented.
      • The workaround is for the client to destroy and create a new session to reestablish the backchannel.
    • BACKCHANNEL_CTL (B)
      • Not yet implemented
      • Provide alternate Backchannel program number
      • Provide Kerberos (not yet supported) Principals for Backchannel
  • Sequence Flag Processing
    • The client does not yet check the following callback-path flags:
    • SEQ4_STATUS_CB_PATH_DOWN (A)
    • SEQ4_STATUS_CB_PATH_DOWN_SESSION (A)
    • SEQ4_STATUS_BACKCHANNEL_FAULT (A)
    • Section 2.10.12.2.4 recommends (A)
      • Provide a new connection and bind it to the session when the server indicates the backchannel is down

[ AB: I have implemented a first version of this, queued up for submission upstream]

  • Inspect "Referring triples" to detect race with forechannel
    • Section 2.10.6.3
    • Not yet implemented
  • Kerberos (B)
    • Not yet implemented
    • Need to ensure krb5 forechannel with AUTH_SYS backchannel works (A)
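The callback-path flag check listed under Sequence Flag Processing could look roughly like the sketch below. The flag values are the ones RFC 5661 assigns to `sr_status_flags`; the `cb_action` enum and `cb_path_action()` are hypothetical names used only for illustration, not existing client code. Because BIND_CONN_TO_SESSION is not yet implemented, a "rebind" action would in practice mean destroying and recreating the session, as noted above.

```c
#include <stdint.h>

/* Callback-path bits of sr_status_flags in the SEQUENCE reply (RFC 5661). */
#define SEQ4_STATUS_CB_PATH_DOWN          0x00000001u
#define SEQ4_STATUS_CB_PATH_DOWN_SESSION  0x00000200u
#define SEQ4_STATUS_BACKCHANNEL_FAULT     0x00000400u

/* Hypothetical recovery actions; the names are illustrative only. */
enum cb_action {
    CB_ACTION_NONE,           /* no callback-path problem reported */
    CB_ACTION_REBIND_SESSION, /* this session needs a working backchannel */
    CB_ACTION_REBIND_CLIENT   /* no session of this client ID has one */
};

/* Map the callback-path bits to the broadest action they call for. */
static enum cb_action cb_path_action(uint32_t sr_status_flags)
{
    if (sr_status_flags & SEQ4_STATUS_CB_PATH_DOWN)
        return CB_ACTION_REBIND_CLIENT;
    if (sr_status_flags & (SEQ4_STATUS_CB_PATH_DOWN_SESSION |
                           SEQ4_STATUS_BACKCHANNEL_FAULT))
        return CB_ACTION_REBIND_SESSION;
    return CB_ACTION_NONE;
}
```

The check would run on every SEQUENCE reply, since the server keeps reporting these bits until the condition is resolved.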

Slot Management/ Negotiation

None of the following items have yet been implemented

  • Client CB_RECALL_SLOT (Handles server reducing slots) (B?)
  • Server CB_RECALL_SLOT (Reduce slots) (B)
  • Client needs to provide indication of "highest_slotid" and comply with "target" and "enforced highest_slotid" in SEQUENCE OP (B?)
  • Define policy to size slot table (startup, congestion, etc) (B)
  • Statistics to monitor (B)
  • Destroy Session when not in use (B)
  • Verify we ask for LEASE TIMEOUT after every clientid exchange (A)
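As a rough illustration of the highest_slotid item above, the sketch below keeps a bitmap of in-use slots, never allocates a slot above the server's target, and computes the value the client would report as `sa_highest_slotid`. The `slot_table` structure, the 32-slot limit, and the function names are assumptions made for the example; they do not describe the existing client slot table.

```c
#include <stdint.h>

#define MAX_SLOTS 32u /* hypothetical table size; this bitmap holds at most 32 */

struct slot_table {
    uint32_t used;           /* bit i set: slot i has a request in flight */
    uint32_t target_highest; /* last sr_target_highest_slotid from the server */
};

/* Pick the lowest free slot that does not exceed the target; -1 if none. */
static int slot_alloc(struct slot_table *t)
{
    for (uint32_t i = 0; i < MAX_SLOTS && i <= t->target_highest; i++) {
        if (!(t->used & (UINT32_C(1) << i))) {
            t->used |= UINT32_C(1) << i;
            return (int)i;
        }
    }
    return -1;
}

/* Release a slot once its reply has been processed. */
static void slot_free(struct slot_table *t, uint32_t slot)
{
    if (slot < MAX_SLOTS)
        t->used &= ~(UINT32_C(1) << slot);
}

/* Highest slot with a request outstanding (0 when the table is empty). */
static uint32_t slot_highest_in_use(const struct slot_table *t)
{
    for (uint32_t i = MAX_SLOTS; i-- > 0; )
        if (t->used & (UINT32_C(1) << i))
            return i;
    return 0;
}
```

Always taking the lowest free slot keeps the highest in-use slot small, which makes it easier to comply when the server lowers its target.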

Connection Management

  • Rebind session to a new connection (after loss of connection)
    • BIND_CONN_TO_SESSION (B)
    • Not yet implemented - we currently destroy the session and create a new one

Session Reestablishment

  • Need a thorough review of session and state recovery (A)
  • Need to verify that open state, locks, and delegations survive session reestablishment (A)

SessionID Trunking

Increases the I/O pipe and the number of slots

  • Bind a new connection to an existing session (B)
    • BIND_CONN_TO_SESSION (B)
      • Not yet implemented
    • Issue SEQUENCE with existing sessionID?
    • IIRC, the spec states that a SEQUENCE op on a new connection causes the connection to be bound to the specified session
      • Not yet implemented

ClientID Trunking

  • Not yet implemented

State Management

  • State revocation handling
    • Sequence status bits processing (A)
      • Not yet implemented
      • SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING (B)
      • SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED (B)
      • SEQ4_STATUS_EXPIRED_{ALL/SOME}_STATE_REVOKED (A)
        • Propagate error to app
      • SEQ4_STATUS_ADMIN_STATE_REVOKED (A)
      • SEQ4_STATUS_RECALLABLE_STATE_REVOKED (A)
      • SEQ4_STATUS_LEASE_MOVED (B?)
      • SEQ4_STATUS_RESTART_RECLAIM_NEEDED (A)
    • TEST_STATEID (B?)
      • Use to determine status of stateids
      • Not yet implemented
    • FREE_STATEID (B?)
      • Use to tell server to free stateids after revocation
  • Verify we use the correct stateid ordering (Section 8.2.4) (A)
  • Ensure Close with most recent stateid (not v4.1 specific) (A)
  • Backchannel must check for zero seqid in stateid callbacks (Section 8.2.2) (B?)
  • Verify locks and delegations survive session reestablishment (A)
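One possible shape for the sequence status bit processing listed above is a function that turns the state-related bits into a set of recovery work items. The flag values follow RFC 5661; the `RECOVER_*` work bits and `state_recovery_work()` are hypothetical and only illustrate the classification, not how the state manager is actually driven.

```c
#include <stdint.h>

/* State-related bits of sr_status_flags (RFC 5661). */
#define SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED   0x00000008u
#define SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED  0x00000010u
#define SEQ4_STATUS_ADMIN_STATE_REVOKED         0x00000020u
#define SEQ4_STATUS_RECALLABLE_STATE_REVOKED    0x00000040u
#define SEQ4_STATUS_LEASE_MOVED                 0x00000080u
#define SEQ4_STATUS_RESTART_RECLAIM_NEEDED      0x00000100u

/* Hypothetical work items for a recovery thread. */
#define RECOVER_RECLAIM_REBOOT  0x1u /* reclaim state after a server restart */
#define RECOVER_CHECK_STATEIDS  0x2u /* find revoked stateids, report to apps */
#define RECOVER_LEASE_MOVED     0x4u /* locate the moved lease */

/* Translate the status bits into the recovery work they imply. */
static unsigned int state_recovery_work(uint32_t sr_status_flags)
{
    unsigned int work = 0;

    if (sr_status_flags & SEQ4_STATUS_RESTART_RECLAIM_NEEDED)
        work |= RECOVER_RECLAIM_REBOOT;
    if (sr_status_flags & (SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED |
                           SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED |
                           SEQ4_STATUS_ADMIN_STATE_REVOKED |
                           SEQ4_STATUS_RECALLABLE_STATE_REVOKED))
        work |= RECOVER_CHECK_STATEIDS;
    if (sr_status_flags & SEQ4_STATUS_LEASE_MOVED)
        work |= RECOVER_LEASE_MOVED;
    return work;
}
```

In this sketch, the `RECOVER_CHECK_STATEIDS` path is where TEST_STATEID and FREE_STATEID would be used once they are implemented.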

State Reclaim

  • Wait for outstanding RPCs (Section 8.4.2.1) (B)
  • LOCK with RECLAIM (A?)
  • OPEN with CLAIM_PREVIOUS (B?)
  • RECLAIM_COMPLETE (B?)
  • Lock recovery when eir_server_owner is different (Section 8.4.2.1) (B?)
    • Verify client attempts lock recovery when eir_server_scope is same

State Protection

  • SSV Support (for trunking and reconnection) (B)
    • SET_SSV
    • GET_SSV
  • Mach creds (B)

Error Handling Review

  • Thorough error handling inspection and testing (A)


COMPOUND and CB_COMPOUND

  • Correct use of max sizes (A)
    • Client should take care to use correct request and response max sizes
      • Known problem where max sizes do not allow for compound operation header
      • Audit client to ensure proper GETFH usage after FH modifying ops (Section 2.10.6.4) (B)
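To illustrate the max-size problem: RFC 5661 defines `ca_maxrequestsize` and `ca_maxresponsesize` as the XDR-encoded size of the whole COMPOUND including the RPC headers, so a channel sized to exactly wsize or rsize leaves no room for the headers and the non-data operations. A minimal sketch of the adjustment follows; `COMPOUND_OVERHEAD_SKETCH` is a placeholder value invented for the example, and the real overhead would have to be derived from the encoded sizes of the operations actually sent.

```c
#include <stdint.h>

/*
 * Placeholder for the bytes taken by the RPC header, the COMPOUND header,
 * SEQUENCE, PUTFH and the non-data part of READ/WRITE. Not a real constant.
 */
#define COMPOUND_OVERHEAD_SKETCH 512u

/* Channel size to request so that an I/O of io_size bytes still fits. */
static uint32_t channel_max_size(uint32_t io_size)
{
    if (io_size > UINT32_MAX - COMPOUND_OVERHEAD_SKETCH)
        return UINT32_MAX; /* clamp instead of wrapping around */
    return io_size + COMPOUND_OVERHEAD_SKETCH;
}

/* Largest data payload that fits in an already negotiated channel size. */
static uint32_t channel_max_payload(uint32_t negotiated_size)
{
    if (negotiated_size <= COMPOUND_OVERHEAD_SKETCH)
        return 0;
    return negotiated_size - COMPOUND_OVERHEAD_SKETCH;
}
```

Under these assumptions the request size would be derived from wsize and the response size from rsize, and the server's reply to CREATE_SESSION would be passed through `channel_max_payload()` to cap the I/O size actually used.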

Known Bugs

Should file bugzillas for these and refer the BZ# here instead

  1. Interrupting (with Ctrl-C) an umount of a server that is down leads to an oops
    • This happens in NFSv4.1 due to destroy_session / sequence sync issues.
      • Alexandros sent a patch that Trond wants to be revisited
  2. If rsize, wsize are specified at mount time:
    • the request_sz, response_sz take the wsize, rsize values incorrectly
    • options don't propagate to CREATE_SESSION with DSs (pNFS)
    • Alexandros will be submitting a patch soon
  3. If the backchannel connection times out, CB_SEQUENCE is out of order upon re-establishment
  4. Mount two different file systems from the same server
    • Does not reuse the nfs_client structure therefore failing to reuse the existing session
    • Alexandros: (related) mount nfs41_server:/exp1 /mnt, ..., mount nfs41_server:/exp2 /mnt2. New session overrides old
  5. The spec says that the program version number for the backchannel must be set to 4 for NFSv4.1 (unlike v4.0, for which the spec says nothing, AFAICS). Currently we set it to 1. What about NFSv4.0?
  6. Sequence Flooding: the NFSv4.1 client keeps queuing SEQUENCE operations when the server is down (and eventually we run out of slots). Same for NFSv4.0 and RENEWs (minus the slot issue)
    • Alexandros working on a patch
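For the sequence flooding bug (item 6), one conceivable guard is to allow at most one keepalive SEQUENCE (or RENEW) in flight at a time. The sketch below is only an illustration of that idea and is not the patch mentioned above; `keepalive_state` and both functions are hypothetical, and kernel code would need an atomic test-and-set rather than a plain flag.

```c
#include <stdbool.h>

struct keepalive_state {
    bool in_flight; /* a keepalive was queued and has not completed yet */
};

/* Return true if the caller may queue a new keepalive now. */
static bool keepalive_try_start(struct keepalive_state *s)
{
    if (s->in_flight)
        return false; /* previous one still pending: do not queue another */
    s->in_flight = true;
    return true;
}

/* Call when the keepalive completes, whether it succeeded or failed. */
static void keepalive_done(struct keepalive_state *s)
{
    s->in_flight = false;
}
```

With such a guard, a server that is down would hold at most one slot for the keepalive instead of accumulating queued SEQUENCE operations.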