Some NFS file transfers fail and hang automounting

From Linux NFS

Revision as of 15:02, 22 October 2010

About

  • Bug 16213 (https://bugzilla.kernel.org/show_bug.cgi?id=16213)
  • Reported by: Philippe Dax (June 15, 2010)
  • Fixed by: Trond Myklebust (June 16, 2010)
  • Kernel version: 2.6.33.5-112.fc13.x86_64

Symptoms

  • Given a 50 MB file "foo" on a remote machine "remote":
    • This command never finishes:
      <localmachine $> cp /remote_mount_point/foo bar
    • bar ends up smaller than foo.
    • Automounting on the local machine hangs.
  • The following message appears in /var/log/messages:
      kernel: Callback slot table overflowed
  • The problem does not occur if foo is smaller than 10 MB.
  • The final size of bar appears to be random.
  • This incident occurs with:
    sunrpc.tcp_slot_table_entries = 16
  • This incident does NOT occur with:
    sunrpc.tcp_slot_table_entries = 32
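
To see which of the two settings above a client is running with, the value can be read from procfs. The following is a minimal sketch, assuming the sysctl is exposed at /proc/sys/sunrpc/tcp_slot_table_entries (present only while the sunrpc module is loaded). The helper names are hypothetical and chosen for illustration.

```python
from pathlib import Path

# Assumed procfs location of the sunrpc.tcp_slot_table_entries sysctl;
# it is only present while the sunrpc module is loaded.
SLOT_TABLE_PATH = Path("/proc/sys/sunrpc/tcp_slot_table_entries")

# Setting under which this page reports the hang; 32 did not reproduce it.
AFFECTED_VALUE = 16


def read_slot_table_entries(path=SLOT_TABLE_PATH):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def matches_reported_config(entries):
    """True when the setting equals the value reported to trigger the hang."""
    return entries == AFFECTED_VALUE


if __name__ == "__main__":
    entries = read_slot_table_entries()
    if entries is None:
        print("sunrpc.tcp_slot_table_entries is not readable on this machine")
    elif matches_reported_config(entries):
        print(f"tcp_slot_table_entries = {entries}: matches the reported failing setup")
    else:
        print(f"tcp_slot_table_entries = {entries}")
```

A matching value only says the machine is configured like the reported one. Whether the bug is present depends on the kernel carrying the fix described under Resolution.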

Resolution

This problem was fixed by:

commit b76ce56192bcf618013fb9aecd83488cffd645cc
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Wed Jun 16 13:57:32 2010 -0400

    SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()
    
    If the attempt to read the calldir fails, then instead of storing the read
    bytes, we currently discard them. This leads to a garbage final result when
    upon re-entry to the same routine, we read the remaining bytes.
    
    Fixes the regression in bugzilla number 16213. Please see
        https://bugzilla.kernel.org/show_bug.cgi?id=16213
    
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    Cc: stable@kernel.org
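
The failure mode described in the commit message can be illustrated outside the kernel. The sketch below is not the kernel code. It assumes a 4-byte big-endian field that may arrive split across two reads, and the class names and the 0xff filler (standing in for whatever unsaved storage happens to hold) are hypothetical. One reader remembers only how far it got, the other also keeps the bytes it has already read.

```python
FIELD_SIZE = 4  # assumed size of the field being read, in bytes


class DiscardingReader:
    """Buggy variant: remembers the offset but not the bytes of a short read."""

    def __init__(self):
        self.offset = 0

    def feed(self, data):
        # Fresh scratch space on every call; 0xff stands in for stale memory.
        scratch = bytearray(b"\xff" * FIELD_SIZE)
        want = FIELD_SIZE - self.offset
        chunk = data[:want]
        scratch[self.offset:self.offset + len(chunk)] = chunk
        if len(chunk) < want:
            self.offset += len(chunk)  # progress is kept, the bytes are lost
            return None
        self.offset = 0
        return int.from_bytes(scratch, "big")


class StoringReader:
    """Fixed variant: partial bytes persist in the reader between calls."""

    def __init__(self):
        self.buf = bytearray(FIELD_SIZE)
        self.offset = 0

    def feed(self, data):
        want = FIELD_SIZE - self.offset
        chunk = data[:want]
        self.buf[self.offset:self.offset + len(chunk)] = chunk
        if len(chunk) < want:
            self.offset += len(chunk)  # both progress and bytes are kept
            return None
        self.offset = 0
        return int.from_bytes(self.buf, "big")


def read_split(reader, first, second):
    """Feed one field in two pieces and return the decoded value."""
    assert reader.feed(first) is None  # first piece is a short read
    return reader.feed(second)


if __name__ == "__main__":
    # The value 1, split in the middle of the field.
    print(hex(read_split(DiscardingReader(), b"\x00\x00", b"\x00\x01")))
    print(hex(read_split(StoringReader(), b"\x00\x00", b"\x00\x01")))
```

When the field arrives in a single read, both variants decode it correctly. Only a split read exposes the difference, which would be consistent with the symptom that small transfers succeed while large ones fail at an apparently random point.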