General troubleshooting recommendations

From Linux NFS

Revision as of 14:43, 29 June 2005

Depending on your configuration, there are a number of ways that NFS can fail to work, and it can be difficult to determine exactly why. This page describes some general techniques for diagnosing the issue.

If you cannot resolve your problem and plan to report it to the developer, see Reporting bugs.

General NFSv4 Issues

Check server's exports

An easy first thing to double-check is that your server is exporting what you think it is. On the server, run the command:

exportfs -v

If you need to make modifications, edit /etc/exports and re-export using the command:

exportfs -r

Remember that NFSv4 exports, which are presented through a pseudo-filesystem, work very differently from NFSv3 exports. Review the Using NFSv4 directions if you have questions.

Check server mount functionality

Try mounting the NFSv4 export on the server itself by mounting localhost:/. If the local mount fails, the problem is in the server configuration; if it succeeds, look at the network or the client instead.
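The local mount test could be sketched as below. The helper name `test_local_mount` and the scratch mount point /mnt/nfs4test are assumptions made for illustration, not part of the original instructions; run it as root on the server.

```shell
# Hypothetical helper: mount the NFSv4 root from localhost, list it, unmount.
# $1: scratch mount point (assumed default: /mnt/nfs4test)
test_local_mount() {
    mnt=${1:-/mnt/nfs4test}
    mkdir -p "$mnt" || return 1
    if mount -t nfs4 localhost:/ "$mnt"; then
        # The mount worked, so show what the server actually exports.
        ls -l "$mnt"
        umount "$mnt"
    else
        echo "local NFSv4 mount failed: check the server configuration" >&2
        return 1
    fi
}
```

Call it as `test_local_mount` or `test_local_mount /some/empty/dir`.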

Getting detailed debug output of the client/server interactions

If you suspect the problem may involve some sort of miscommunication between the client and server, it can be useful for debugging purposes to dump the communication stream:

Start `tcpdump -s 9000 -w /tmp/dump.out port 2049` on the client, then conduct the client/server interaction. Review the /tmp/dump.out file (or include it with your bug report).

Useful tips:

  1. If you build your own kernels, enable CONFIG_PACKET_MMAP (under Device Drivers --> Networking Support --> Network Options) to help tcpdump keep up with traffic.
  2. Use a tmpfs file system for the tcpdump output file. tcpdump will keep up more easily, especially with gigabit speed transfer rates.
  3. Capture a trace on both ends if you suspect a network problem. Comparing the traces will show what each side of the communication is seeing.
  4. Leave off the "port 2049" to capture DNS, NIS, LDAP, or Kerberos traffic, if you suspect one of these auxiliary protocols is causing misbehavior.
  5. Don't forget about tcpslice and tethereal's command line parsers if you have a really big trace and you need to split it into manageable chunks.
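
As a rough sketch of tips 2 and 4, the helper below only prints a tcpdump command line so you can review it before running it as root. The name `capture_cmd` is hypothetical, and /dev/shm is assumed to be a tmpfs mount on your system.

```shell
# Print (do not run) a tcpdump command line based on the tips above.
# $1: output file (default /tmp/dump.out, as in the text)
# $2: capture filter (default "port 2049"); pass '' to capture all traffic
capture_cmd() {
    out=${1:-/tmp/dump.out}
    filter="${2-port 2049}"
    # Append the filter only when it is non-empty.
    echo "tcpdump -s 9000 -w $out${filter:+ $filter}"
}
```

For example, `capture_cmd /dev/shm/dump.out ''` prints a command that captures all traffic, including DNS, NIS, LDAP and Kerberos, into tmpfs. A saved trace can be read back with `tcpdump -r /tmp/dump.out`.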

Kernel Stack Traceback

If you have hung processes, capture a stack traceback to show where the processes are waiting in the kernel.

First, look in /etc/sysctl.conf to see if kernel.sysrq is set to 1. If not, then run this command:

echo 1 > /proc/sys/kernel/sysrq

Next, trigger a stack traceback via this command:

echo t > /proc/sysrq-trigger

Look on your console or in /var/log/messages for the output.
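
The steps above could be wrapped in a small check like the following. It is a sketch only: the function names are hypothetical, it reads the live value in /proc/sys/kernel/sysrq rather than /etc/sysctl.conf, and it treats only the value 1 as enabled, as the text does. It prints the next command to run and changes nothing itself.

```shell
# Report whether the value in a sysrq control file is 1.
# $1 defaults to the live kernel setting; another path can be given for testing.
sysrq_enabled() {
    [ "$(cat "${1:-/proc/sys/kernel/sysrq}" 2>/dev/null)" = "1" ]
}

# Print the next command from the procedure above (to be run as root).
sysrq_next_step() {
    if sysrq_enabled "$@"; then
        echo "echo t > /proc/sysrq-trigger"
    else
        echo "echo 1 > /proc/sys/kernel/sysrq"
    fi
}
```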

"Reboot" the NFSv4 server without shutting down the machine

Just shut down rpc.nfsd and start it again.

Comparing results when mounting via NFSv3 and NFSv4

Find a file that differs between v3 and v4, and look at the output from the `stat` utility.

Or use `ls -lid --time-style=full-iso` and `ls -lid --time=ctime --time-style=full-iso` if you don't have stat.
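
One way to make the comparison mechanical is sketched below, assuming GNU `stat` with its `-c` format option. The function names and the choice of fields are illustrative, not taken from the original page.

```shell
# Print selected attributes of one file on a single line (GNU stat assumed).
stat_fields() {
    stat -c 'size=%s mode=%a uid=%u gid=%g links=%h mtime=%Y ctime=%Z' "$1"
}

# Compare the same file as seen through two mounts.
# $1: path under the NFSv3 mount, $2: path under the NFSv4 mount
# Returns 0 if the selected attributes match, 1 if they differ, 2 on error.
compare_stat() {
    v3=$(stat_fields "$1") || return 2
    v4=$(stat_fields "$2") || return 2
    echo "v3: $v3"
    echo "v4: $v4"
    [ "$v3" = "$v4" ]
}
```

With hypothetical mount points, a call would look like `compare_stat /mnt/v3/some/file /mnt/v4/some/file`.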

Kerberos issues

Check hostnames

Kerberos requires that the hostname and domain name used in the keytab be correct. Run `hostname` and look in /etc/hosts to double-check that the name is set properly. Compare it with what is listed in your keytab file.
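
The /etc/hosts part of this check could be sketched as follows. The helper name `host_listed` is hypothetical, and the match is a simple whole-word search, so treat a positive result as a hint rather than proof.

```shell
# Return 0 if the given name appears on a non-comment line of a hosts file.
# $1: hostname to look for, $2: hosts file (default /etc/hosts)
host_listed() {
    grep -v '^[[:space:]]*#' "${2:-/etc/hosts}" | grep -qwF -- "$1"
}
```

For example: `host_listed "$(hostname)" || echo "hostname not found in /etc/hosts"`.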

Check keytabs

Run the following command to check your keytab:

klist -k
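
To compare the keytab against the hostname, a filter along these lines may help. It assumes the NFS service principal has the form nfs/<fully qualified hostname>@<REALM>; the helper name `has_nfs_principal` is hypothetical.

```shell
# Read `klist -k` output on stdin and return 0 if it contains an
# nfs/<hostname>@ principal for the hostname given as $1.
has_nfs_principal() {
    grep -qF -- "nfs/$1@"
}
```

For example: `klist -k | has_nfs_principal "$(hostname -f)" || echo "no matching nfs principal in keytab"`.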