NewMountDesignSpec
From Linux NFS
Chucklever (Talk | contribs) m (→Return Codes and Error Reporting) |
Chucklever (Talk | contribs) (A couple of bugs Neil found) |
Latest revision as of 17:03, 2 November 2007
Introduction
This wiki page is a working design specification for the new text-based NFS mount API. Here we discuss use cases, requirement statements, error reporting, and design specifications, in addition to minute behavioral details of mounting NFS shares. The purpose of this discussion is to understand how to implement the new interface, and to construct a unit test plan for both the legacy user-space mount command and the new in-kernel mount client.
Requirements
There are several broad requirements for the new text-based NFS mount API.
- Scalability - Allow for thousands of NFS mount points, and a large number of simultaneous mount operations
- No user-space dependency on a versioned binary blob for passing NFS mount options to the kernel
- Support version fallback - If NFS version 4 is not supported, fall back to version 3; if version 3 is not supported, fall back to version 2
- NFSv4 mounts will ignore legacy options in order to make fallback work
- Support transport protocol fallback - If TCP is not supported, fall back to UDP
- Improve the ability to interrupt an ongoing mount request
- Provide reasonable default behavior in the presence of network firewalls and misconfigured servers
- Facilitate new features - IPv6, RDMA, FS cache should be easy to introduce
- Better error reporting - Report and log useful, relevant, clear error messages when a failure has occurred; prepare for i18n
- Update and clarify NFS mount documentation
Use Cases
To mount a remote share using NFS version 2, use the nfs file system type and specify the nfsvers=2 mount option. To mount using NFS version 3, use the nfs file system type and specify the nfsvers=3 mount option. To mount using NFS version 4, use the nfs4 file system type (the nfsvers mount option is not supported for the nfs4 file system type).
Here is an example from an /etc/fstab file for an NFS version 3 mount over TCP.
server:/export/share /mnt nfs nfsvers=3,proto=tcp
Here is an example for an NFS version 4 mount over TCP using Kerberos 5 mutual authentication.
server:/export/share /mnt nfs4 sec=krb5
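The rule above (nfsvers is accepted for the nfs file system type but not for nfs4) can be checked mechanically. The following is only a sketch; the helper name check_fstab_entry is hypothetical and is not part of mount.nfs.

```python
def check_fstab_entry(line):
    """Return a list of problems found in one /etc/fstab NFS entry.

    Hypothetical helper for illustration only. It checks just the
    file system type and the nfsvers rule described above.
    """
    fields = line.split()
    if len(fields) < 3:
        return ["expected at least: device, mount point, file system type"]
    fstype = fields[2]
    options = fields[3].split(",") if len(fields) > 3 else []
    # Keep only the keyword part of keyword=value options.
    keys = {opt.split("=", 1)[0] for opt in options}

    problems = []
    if fstype not in ("nfs", "nfs4"):
        problems.append("not an NFS file system type: " + fstype)
    elif fstype == "nfs4" and "nfsvers" in keys:
        problems.append("nfsvers is not supported for the nfs4 file system type")
    return problems
```

Both example entries above pass this check; an nfs4 entry that carries nfsvers=3 would be flagged.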
Design Specification For Text-Based NFS mount processing
Basic Architecture
Text-based NFS mount functionality will be split between user-space and the kernel. As much policy as possible will be moved up into user space. "Policy" includes decisions about when and how hard to retry mount requests, including when to fork into the background during a "bg" mount. These decisions should be made by mount.nfs, and not in the kernel, to allow the most administrative flexibility, and the most simplicity in the kernel implementation.
During a basic "fg" mount, the mount.nfs command will add an "addr=" option to the mount option string, and call the mount(2) system call. The kernel will make the mountd request and set up the NFS mount, then attach the share to the mount.nfs command's name space as usual.
If kernel processing fails in any way (bad mount options, unable to connect to the server's rpcbind, mountd, or NFS server port, or the server doesn't support the requested protocol or transport) the mount(2) system call will return with an appropriate error code. The mount.nfs command is then responsible for arbitrating a series of zero or more retry requests, depending on the exact error code that was returned.
For example, if the server's rpcbind does not have NFSv3 registered, mount(2) will report EOPNOTSUPP. The mount.nfs command could simply replace "vers=3" with "vers=2" in the mount option string, and send a new mount request to the kernel. On the other hand, if the kernel finds an invalid mount option in the mount option string, it will report EINVAL. The mount.nfs command will not retry the request in this case.
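The fallback step in this example might look roughly like the sketch below. The function name next_mount_options is hypothetical, and the sketch assumes fallback goes only from vers=3 to vers=2, as in the example; the real mount.nfs policy may differ.

```python
import errno


def next_mount_options(options, err):
    """Return the option string to retry with, or None to give up.

    Assumption for illustration: the only adjustment attempted is
    replacing vers=3 with vers=2 after EOPNOTSUPP.
    """
    if err == errno.EINVAL:
        # Bad option string: retrying cannot help.
        return None
    if err == errno.EOPNOTSUPP:
        opts = options.split(",")
        if "vers=3" in opts:
            return ",".join("vers=2" if o == "vers=3" else o for o in opts)
    # Nothing left to adjust for this error.
    return None
```

A caller would pass the returned string to the next mount(2) attempt and stop when the function returns None.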
NFSv4 mounts
NFSv4 mounts do not require an interaction with the server's rpcbind or mountd services, but do require an extra "clientaddr=" in order to advertise the client's callback address to the server. This address is usually discovered by mount.nfs4 by noting the network path between client and server, in order to account for multi-homed clients that may want all traffic to and from the server to travel via a particular NIC. Unlike the "addr=" option, the "clientaddr=" can also be specified by admins who need to advertise a specific address when mounting an NFSv4 server through a NAT router.
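One common way to learn which local address the kernel would use to reach a given server is to connect a UDP socket and read back its local name. The sketch below shows that technique; it is not necessarily how mount.nfs4 performs discovery, and the function name discover_clientaddr is hypothetical. Port 2049 is the standard NFS port.

```python
import socket


def discover_clientaddr(server_ip, port=2049):
    """Return the local IPv4 address the kernel selects to reach server_ip.

    connect() on a UDP socket sends no packets. It only picks a route
    and a local address, which getsockname() then reports.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((server_ip, port))
        return sock.getsockname()[0]
```

On a multi-homed client the result depends on the routing table, which is why an administrator may still need to set "clientaddr=" by hand, for example behind a NAT router.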
Umount processing
Unmounting an NFS share does not change with the new text-based NFS mount interface. The mount.nfs and mount.nfs4 commands must take care to record the specific mount option string that was finally successful in /etc/mtab so that umount can discover and use the same "addr=" and network transport protocol during unmount processing.
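As an illustration of what umount needs from /etc/mtab, the sketch below pulls the recorded options for one mount point out of mtab-formatted text. The helper name mtab_mount_options is hypothetical, and the sketch does not decode the octal escapes (such as \040 for a blank) that mtab uses in path names.

```python
def mtab_mount_options(mtab_text, mount_point):
    """Return the options recorded for mount_point as a dict, or None.

    Boolean options such as "rw" map to an empty string.
    """
    found = None
    for line in mtab_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        opts = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            opts[key] = value
        # Later entries win, since mounts can be stacked on one directory.
        found = opts
    return found
```

With an entry such as "server:/export/share /mnt nfs rw,vers=3,proto=tcp,addr=192.0.2.1 0 0", the returned dict gives umount the "addr" and "proto" values to reuse.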
Return Codes and Error Reporting
Currently mount.nfs's error messages are very problematic.
- Some error messages are incorrect.
- Some error messages are repeated.
- Some errors are never reported.
- Some error messages are too specific to be useful to an average administrator. For example, reporting an "RPC program/version mismatch occurred" is not helpful if the real problem is that "proto=udp" is not supported.
- Some error messages are too general to be useful. For example, reporting "mount.nfs: not a directory" is obviously an errno string, but more specific information would provide a course of corrective action.
Perhaps a clear error message can be reported to the command line, and a lot of detail should be reported in the system log? That's easy enough with in-kernel mount option parsing!
mount(2) API return codes
The mount.nfs program needs to distinguish between temporary problems and permanent errors in order to determine whether it's worth retrying a mount request in the background.
For text-based NFS mounts, the version/protocol fallback mechanism should occur in user space -- certainly fallback policy is easier to set and implement in user space, but the kernel must provide specific information about how a mount request failed so that user space can make an appropriate choice about the next step to try.
The current mount(2) API is described in a man page. The man page describes a set of generic error return codes, which we excerpt here. It also suggests that we can add specific error codes for NFS mounts.
RETURN VALUE
    On success, zero is returned. On error, -1 is returned, and errno is set appropriately.
ERRORS
    The error values given below result from filesystem type independent errors. Each filesystem type may have its own special errors and its own special behavior. See the kernel source code for details.
    EACCES    A component of a path was not searchable. (See also path_resolution(2).) Or, mounting a read-only filesystem was attempted without giving the MS_RDONLY flag. Or, the block device source is located on a filesystem mounted with the MS_NODEV option.
    EAGAIN    A call to umount2() specifying MNT_EXPIRE successfully marked an unbusy file system as expired.
    EBUSY    source is already mounted. Or, it cannot be remounted read-only, because it still holds files open for writing. Or, it cannot be mounted on target because target is still busy (it is the working directory of some task, the mount point of another device, has open files, etc.). Or, it could not be unmounted because it is busy.
    EFAULT    One of the pointer arguments points outside the user address space.
    EINVAL    source had an invalid superblock. Or, a remount (MS_REMOUNT) was attempted, but source was not already mounted on target. Or, a move (MS_MOVE) was attempted, but source was not a mount point, or was ’/’. Or, an unmount was attempted, but target was not a mount point. Or, umount2() was called with MNT_EXPIRE and either MNT_DETACH or MNT_FORCE.
    ELOOP    Too many link encountered during pathname resolution. Or, a move was attempted, while target is a descendant of source.
    EMFILE    (In case no block device is required:) Table of dummy devices is full.
    ENAMETOOLONG    A pathname was longer than MAXPATHLEN.
    ENODEV    filesystemtype not configured in the kernel.
    ENOENT    A pathname was empty or had a nonexistent component.
    ENOMEM    The kernel could not allocate a free page to copy filenames or data into.
    ENOTBLK    source is not a block device (and a device was required).
    ENOTDIR    The second argument, or a prefix of the first argument, is not a directory.
    ENXIO    The major number of the block device source is out of range.
    EPERM    The caller does not have the required privileges.
In the following table, we discuss how each of these error values is used.
EACCES | A component of a path was not searchable. (See also path_resolution(2).) Or, mounting a read-only filesystem was attempted without giving the MS_RDONLY flag. Or, the block device source is located on a filesystem mounted with the MS_NODEV option. |
EAGAIN | A call to umount2() specifying MNT_EXPIRE successfully marked an unbusy file system as expired. |
EBUSY | source is already mounted. Or, it cannot be remounted read-only, because it still holds files open for writing. Or, it cannot be mounted on target because target is still busy (it is the working directory of some task, the mount point of another device, has open files, etc.). Or, it could not be unmounted because it is busy. |
EFAULT | One of the pointer arguments points outside the user address space. |
EINVAL | source had an invalid superblock. Or, a remount (MS_REMOUNT) was attempted, but source was not already mounted on target. Or, a move (MS_MOVE) was attempted, but source was not a mount point, or was ’/’. Or, an unmount was attempted, but target was not a mount point. Or, umount2() was called with MNT_EXPIRE and either MNT_DETACH or MNT_FORCE.
Note that NFS uses this error return code to signal bad mount options: The mount option string was not able to be parsed, or an unrecognized option was specified, or a keyword option was specified with a value that is out of range. This appears to be a precedent set by OCFS2 and CIFS. |
ELOOP | Too many link encountered during pathname resolution. Or, a move was attempted, while target is a descendant of source. |
EMFILE | (In case no block device is required:) Table of dummy devices is full. |
ENAMETOOLONG | A pathname was longer than MAXPATHLEN. |
ENODEV | filesystemtype not configured in the kernel. |
ENOENT | A pathname was empty or had a nonexistent component. |
ENOMEM | The kernel could not allocate a free page to copy filenames or data into. |
ENOTBLK | source is not a block device (and a device was required). |
ENOTDIR | The second argument, or a prefix of the first argument, is not a directory. |
ENXIO | The major number of the block device source is out of range. |
EPERM | The caller does not have the required privileges. |
Here are some additional return codes I recommend for NFS mounts, just as a start. These should allow a calling program to report a reasonably specific error message, and decide whether and how to retry the request.
    EINVAL    The mount option string was not able to be parsed, or an unrecognized option was specified, or a keyword option was specified with a value that is out of range.
    EIO    An unknown error occurred while attempting the mount request.
These are permanent mount errors. The calling program should not retry this request with the same options.
    ESTALE    The server denied access to the requested share.
    ETIMEDOUT    The kernel's mount attempt timed out after n seconds (I think n is 15).
These are temporary errors. The calling program may choose to retry this request using the same mount options, or fail immediately.
    EOPNOTSUPP    The server reports that the program, version, or transport protocol is not currently available.
    EPROTONOSUPPORT    The client is missing support for the requested NFS version, or the server doesn't support the requested NFS version.
    ECONNREFUSED    The kernel's mount connection attempt was refused by the server at the network transport layer.
These are temporary errors. The calling program can attempt to recover by adjusting the mount options and retrying the request.
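The three groups of return codes above map naturally onto a small lookup that a calling program could consult before deciding what to do next. This is a sketch of that idea only; the names below are hypothetical, and returning "unknown" for any other code is an assumption of this example.

```python
import errno

# Groupings taken from the three lists above.
PERMANENT = {errno.EINVAL, errno.EIO}
RETRY_SAME_OPTIONS = {errno.ESTALE, errno.ETIMEDOUT}
RETRY_ADJUSTED_OPTIONS = {errno.EOPNOTSUPP, errno.EPROTONOSUPPORT, errno.ECONNREFUSED}


def classify_mount_error(err):
    """Classify an errno value returned by an NFS mount(2) attempt."""
    if err in PERMANENT:
        return "permanent"
    if err in RETRY_SAME_OPTIONS:
        return "retry-same-options"
    if err in RETRY_ADJUSTED_OPTIONS:
        return "retry-adjusted-options"
    # Generic mount(2) errors are outside the scope of this sketch.
    return "unknown"
```

A "bg" mount would keep retrying only for the two retry classes and would report the error immediately for a permanent one.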
Test Planning
Each section below will provide an abbreviated description of a unit test plan for that mount option. Our goal is to construct an automated test harness that can run all of these unit tests at once, acting either as a check-in test or as a final release test. We'd like something similar to the t/ directory in the git-core distribution.
Mount system call testing
We can begin with some simple tests to make sure the mount system call API, as implemented by the NFS client, is working. The obvious stuff:
- Testing first parameter sanity checking:
- Called with first parameter set to NULL
- Called with no ":" in the first parameter string
- Called with first parameter set to a very long string
- Called with first parameter pointing to unallocated storage
- Testing second parameter sanity checking:
- Called with second parameter set to NULL
- Called with second parameter set to a very long string
- Called with second parameter pointing to a path with too many symlinks
- Called with second parameter pointing to unallocated storage
- Testing option string sanity checking:
- Called with option string set to NULL
- Called with option string set to a very long string
- Called with option string pointing to unallocated storage
- Testing security checking
- Called by root
- Called by a normal user
In-kernel mount client testing
Another set of tests should drive the in-kernel mount client implementation. The in-kernel mount client is used to contact the server's mountd during NFSv2 and NFSv3 mounting. It is not used during NFSv4 mounts.
- Test behavior when server doesn't support all NFS versions
- Configure server to support only NFSv2, and check the return code from mount(2) for v3 mount
- Configure server to support only NFSv3, and check the return code from mount(2) for v2 mount
- Test behavior when server doesn't support all socket transports
- Configure server to support only UDP, and check the return code from mount(2) for TCP mount
- Configure server to support only TCP, and check the return code from mount(2) for UDP mount
- Test behavior in the presence of firewalls
- Block the rpcbind port on the server, and check the return code from mount(2)
- Block UDP traffic to the server's NFS port, and check the return code from mount(2)
- Block TCP traffic to the server's NFS port, and check the return code from mount(2)
User-space mount.nfs and umount.nfs testing
These tests probe the behavior of mount.nfs in the presence of various mount options that require user-level processing.
- Test behavior of "bg" option
- Test behavior of "fg" option
- Test behavior in the presence of slow servers
- Test behavior when server host is up but not exporting NFS
- Test behavior when server host is up, but export is not available
- Test behavior of "retry=" option
- Test behavior of falling back to NFSv2
- Works iff no vers= is specified
- Test behavior of falling back to UDP
- Works iff no proto= is specified
- Test behavior of mounting when mounted-on dir isn't yet present
- Run under valgrind to watch for memory leaks
- Check long-running mount requests - especially "bg" mounts against missing servers
- Test /etc/mtab corner cases
- Test behavior when /etc/mtab doesn't exist
- Test behavior when /etc/mtab isn't writable
- Test behavior when /etc/mtab is a link to /proc/mounts
- Test behavior when mounted-on dir or share name contains non-alphanum characters (blanks, slashes, ampersands, and so on)
These are static and run-time code checks. They are probably reasonable check-in tests, able to be run easily via a make target.
- gcc -Wall
- sparse
- valgrind
- gcov
Discussion of Individual NFS Mount Options
There are four classes of mount options for nfs and nfs4 file systems. Fix this: All four classes of options are specified as normal NFS mount options because there is only one way to specify mount options in the /etc/fstab file.
- There are generic mount options available to all Linux file systems, such as "ro" or "sync". See mount(8) for a description of generic mount options available for all file systems.
- Some mount options can determine how the mount command behaves, such as "mountport" or "retry". These options have no effect after the mount operation has completed, but might be used to mount an NFS share through a network firewall.
- Some mount options determine how the NFS client behaves during normal operation, such as "rsize" and "wsize". These may be used to tune performance, or change the client's caching or file locking behavior.
- Mount options such as timeo= or retrans= can control aspects of Remote Procedure Call behavior. NFS clients send requests to NFS servers via Remote Procedure Calls, or RPCs. RPCs handle per-request authentication, adjust request parameters for different byte endianness on client and server, and retransmit requests that may have been lost by the network or server.
Note that some options take the form of keyword=value while some options are boolean, taking either the form of keyword or nokeyword. All options which do not use the keyword=value form use the boolean form, except for hard | soft, udp | tcp, and fg | bg.
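The three option forms described above can be told apart with a few lines of parsing. The sketch below is illustrative only; parse_option is a hypothetical name, and it assumes that any bare option beginning with "no" is the negated form of a boolean option.

```python
# Bare options that select one of two alternatives instead of being boolean.
CHOICES = {
    "hard": "hard|soft", "soft": "hard|soft",
    "udp": "udp|tcp", "tcp": "udp|tcp",
    "fg": "fg|bg", "bg": "fg|bg",
}


def parse_option(opt):
    """Return (form, name, value) for a single mount option."""
    if "=" in opt:
        key, value = opt.split("=", 1)
        return ("value", key, value)
    if opt in CHOICES:
        return ("choice", CHOICES[opt], opt)
    # Assumption: a leading "no" always negates a boolean option.
    if opt.startswith("no") and len(opt) > 2:
        return ("boolean", opt[2:], False)
    return ("boolean", opt, True)
```

For example, "rsize=8192" parses as a keyword=value option, "soft" as a choice within hard | soft, and "nointr" as the boolean intr set to false.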
To Do
- Format this section
- Add status information about each option
- Tested (legacy / text-based)
- Works, does not work as documented (legacy / text-based)
- Implementation/fix priority
- Details about how it works and/or how it should work
Valid options for either the nfs or nfs4 file system type
soft | hard
- Description
- Determines the recovery behavior of the RPC client after an RPC request times out. If neither option is specified, or if the hard option is specified, the RPC is retried indefinitely. If the soft option is specified, then the RPC client fails the RPC request after a major timeout occurs, and causes the NFS client to return an error to the calling application.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
timeo=n
- Description
- The value, in tenths of a second, before timing out an RPC request. The default value is 600 (60 seconds) for NFS over TCP. On a UDP transport, the Linux RPC client uses an adaptive algorithm to estimate the timeout value for frequently used request types such as READ and WRITE, and uses the timeo= setting for infrequently used requests such as FSINFO. The timeo= value defaults to 7 tenths of a second for NFS over UDP. After each timeout, the RPC client may retransmit the timed-out request, or it may take some other action depending on the settings of the hard or retrans= options.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
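Because timeo= is expressed in tenths of a second, it is easy to misread. The sketch below converts a timeo= value to seconds using the defaults stated in the description; the function name timeo_seconds is hypothetical, and the adaptive UDP estimator is not modeled.

```python
def timeo_seconds(timeo=None, proto="tcp"):
    """Convert a timeo= value (tenths of a second) to seconds.

    When timeo is None, the documented default for the transport is used:
    600 for TCP and 7 for UDP.
    """
    defaults = {"tcp": 600, "udp": 7}
    tenths = defaults[proto] if timeo is None else timeo
    return tenths / 10
```

So timeo=14 means 1.4 seconds, and an unspecified timeo= over TCP means 60 seconds.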
retrans=n
- Description
- The number of RPC timeouts that must occur before a major timeout occurs. The default is 3 timeouts. If the file system is mounted with the hard option, the RPC client will generate a "server not responding" message after a major timeout, then continue to retransmit the request. If the file system is mounted with the soft option, the RPC client will abandon the request after a major timeout, and cause NFS to return an error to the application.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
rsize=n
- Description
- The maximum number of bytes in each network READ request that the NFS client can use when reading data from a file on an NFS server; the actual data payload size of each NFS READ request is equal to or smaller than the rsize value. The rsize value is a positive integral multiple of 1024, and the largest value supported by the Linux NFS client is 1,048,576 bytes. Specified values outside of this range are rounded down to the closest multiple of 1024, and specified values smaller than 1024 are replaced with a default of 4096. If an rsize value is not specified, or if a value is specified but is larger than the maximum that either the client or the server supports, the client and server negotiate the largest rsize value that both will support. The rsize option as specified on the mount(8) command line appears in the /etc/mtab file, but the effective rsize value negotiated by the client and server is reported in the /proc/mounts file.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
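The rounding rules in the rsize description can be written out as a short function. This is a sketch of the stated rules, not the kernel's code; normalize_rsize is a hypothetical name, and clamping oversized requests to the client maximum stands in for the client/server negotiation, which is not modeled here.

```python
MAX_RSIZE = 1048576  # largest value supported by the Linux NFS client


def normalize_rsize(requested):
    """Apply the documented rsize rules to a requested byte count."""
    if requested < 1024:
        # Values smaller than 1024 are replaced with a default of 4096.
        return 4096
    if requested > MAX_RSIZE:
        # Assumption: stand-in for negotiation with the server.
        return MAX_RSIZE
    # Round down to the closest multiple of 1024.
    return requested - requested % 1024
```

For instance, a request of rsize=5000 would become 4096, and rsize=32768 would be used unchanged.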
wsize=n
- Description
- The maximum number of bytes per network WRITE request that the NFS client can use when writing data to a file on an NFS server. See the description of the rsize option for more details.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
acregmin=n
- Description
- The minimum time in seconds that the NFS client caches attributes of a regular file before it requests fresh attribute information from a server. The default is 3 seconds.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
acregmax=n
- Description
- The maximum time in seconds that the NFS client caches attributes of a regular file before it requests fresh attribute information from a server. The default is 60 seconds.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
acdirmin=n
- Description
- The minimum time in seconds that the NFS client caches attributes of a directory before it requests fresh attribute information from a server. The default is 30 seconds.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
acdirmax=n
- Description
- The maximum time in seconds that the NFS client caches attributes of a directory before it requests fresh attribute information from a server. The default is 60 seconds.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
actimeo=n
- Description
- Using actimeo sets all of acregmin, acregmax, acdirmin, and acdirmax to the same value. There is no default value.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
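- Example
- A minimal Python sketch of how actimeo relates to the four attribute cache options, using the defaults listed above. The function name and the dictionary representation of parsed options are hypothetical. How actimeo interacts with an individual option given in the same option string is not specified here; the sketch lets actimeo win, which is an assumption.

```python
# Defaults taken from the acregmin, acregmax, acdirmin and acdirmax descriptions.
DEFAULT_TIMEOUTS = {"acregmin": 3, "acregmax": 60, "acdirmin": 30, "acdirmax": 60}


def attribute_cache_timeouts(options):
    """Return the four attribute cache timeouts (in seconds) for parsed options."""
    timeouts = dict(DEFAULT_TIMEOUTS)
    # Individually specified options replace the corresponding default.
    for name in DEFAULT_TIMEOUTS:
        if name in options:
            timeouts[name] = options[name]
    # actimeo has no default; when present it sets all four values at once.
    # Assumption: it overrides any individual option given alongside it.
    if "actimeo" in options:
        for name in timeouts:
            timeouts[name] = options["actimeo"]
    return timeouts
```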
bg | fg
- Description
- This mount option determines how the mount(8) command behaves if an attempt to mount a remote share fails. The fg option causes mount(8) to exit with an error status if any part of the mount request times out or fails outright. This is called a "foreground" mount, and is the default behavior if neither fg nor bg is specified. If the bg option is specified, a timeout or failure causes the mount(8) command to fork a child which continues to attempt to mount the remote share. The parent immediately returns with a zero exit code. This is known as a "background" mount. If the local mount point directory is missing, the mount(8) command treats that as if the mount request timed out. This permits nested NFS mounts.
- Implementation priority
- Questionable. There is some debate about whether users are still using this option, or are using autofs instead.
- Implementation
- The mount.nfs command must distinguish between permanent mount errors (such as a bad mount option), which prevent the mount request as specified from ever being valid, and temporary errors (such as an unreachable server), which might allow the mount request as specified to complete at some future point. See the discussion of mount(2) return codes for more detail.
- Test plan (fg - v2/v3)
- Remove the local mount point, then attempt an NFS mount with the "fg" option set. The mount should fail with (what error code and what error message?).
- Shut down the NFS server (service nfs stop), then attempt an NFS mount with the "fg" option set. The mount should fail with (what error code and what error message?).
- Block the NFS server ports on the server with iptables, then attempt an NFS mount with the "fg" option set. The mount should fail with (what error code and what error message?).
- Block the mountd server ports on the server with iptables, then attempt an NFS mount with the "fg" option set. The mount should fail with (what error code and what error message?).
- Block the rpcbind server ports on the server with iptables, then attempt an NFS mount with the "fg" option set. The mount should fail with (what error code and what error message?).
- Test plan (bg - v2/v3)
- Remove the local mount point, then attempt an NFS mount with the "bg" option set. The mount should succeed once the mount point has been recreated.
- Shut down the NFS server (service nfs stop), then attempt an NFS mount with the "bg" option set. The mount should succeed once the NFS server has been restarted.
- Block the NFS server ports on the server with iptables, then attempt an NFS mount with the "bg" option set. The mount should succeed once the ports are unblocked.
- Block the mountd server ports on the server with iptables, then attempt an NFS mount with the "bg" option set. The mount should succeed once the ports are unblocked.
- Block the rpcbind server ports on the server with iptables, then attempt an NFS mount with the "bg" option set. The mount should succeed once the ports are unblocked.
- Testing status
- Tested with legacy mount.nfs; works for v2/v3, not for v4
- Tested with text-based mount.nfs; does not work for any version
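- Example
- The distinction between permanent and temporary errors could be sketched in Python as below. The set of errno values treated as temporary is purely an assumption for illustration (the actual mapping depends on the mount(2) return codes discussed elsewhere in this document), and the function names and returned strings are hypothetical. ENOENT is included because a missing mount point is treated as if the request timed out.

```python
import errno

# Assumption: an illustrative set of errors that might clear up later.
TEMPORARY_ERRORS = {
    errno.ETIMEDOUT,     # request timed out
    errno.ECONNREFUSED,  # service not (yet) listening
    errno.EHOSTUNREACH,  # server unreachable
    errno.ENETUNREACH,   # network unreachable
    errno.ENOENT,        # local mount point missing; treated like a timeout
}


def is_temporary(err):
    """True if retrying the same mount request later might succeed."""
    return err in TEMPORARY_ERRORS


def handle_failure(err, background):
    """Decide what mount.nfs does after a failed attempt.

    Per the description: fg exits with an error; bg forks a child that
    keeps trying, but only when the error is not permanent.
    """
    if background and is_temporary(err):
        return "fork-and-retry"
    return "exit-with-error"
```

- With this classification, a bad mount option (EINVAL) ends the request even when bg is set, while a missing mount point under bg leads to a background retry, which is what permits nested NFS mounts.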
retry=n
- Description
- The number of minutes to retry an NFS mount operation in the foreground or background before giving up. The default value for foreground mounts is 2 minutes. The default value for background mounts is 10000 minutes, which is roughly one week.
- Implementation
- The ten thousand minute default might be too long. Perhaps foreground mounts should also use a much shorter default.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
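- Example
- A rough Python sketch of a retry loop bounded by the retry= value, using the defaults given in the description. The attempt callable, the pause between attempts, and the injectable sleep and clock parameters are all hypothetical and exist only to keep the sketch self-contained.

```python
import time


def retry_minutes(options, background):
    """Return the retry window in minutes: retry=n if given, else the default."""
    if "retry" in options:
        return options["retry"]
    return 10000 if background else 2


def mount_with_retry(attempt, minutes, pause=30, sleep=time.sleep, clock=time.monotonic):
    """Call attempt() until it returns True or the retry window has elapsed.

    attempt -- hypothetical callable performing one mount attempt
    pause   -- assumed number of seconds to wait between attempts
    """
    deadline = clock() + minutes * 60
    while True:
        if attempt():
            return True
        if clock() >= deadline:
            return False   # give up: the retry window is exhausted
        sleep(pause)
```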
sec=mode
- Description
- The RPCGSS security flavor to use for accessing files on this mount point. If the sec= option is not specified, or if sec=sys is specified, the RPC client uses the AUTH_SYS security flavor for all RPC operations on this mount point. Valid security flavors are none, sys, krb5, krb5i, krb5p, lkey, lkeyi, lkeyp, spkm, spkmi, and spkmp. See the SECURITY CONSIDERATIONS section for details.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
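- Example
- A small Python sketch of validating the sec= value against the flavors listed in the description. The function name and the dictionary of parsed options are hypothetical; rejecting unknown values with ValueError is an assumption about how a front end might behave.

```python
# Flavors copied from the description above.
VALID_SEC_FLAVORS = (
    "none", "sys",
    "krb5", "krb5i", "krb5p",
    "lkey", "lkeyi", "lkeyp",
    "spkm", "spkmi", "spkmp",
)


def security_flavor(options):
    """Return the security flavor for a mount; defaults to sys (AUTH_SYS)."""
    mode = options.get("sec", "sys")
    if mode not in VALID_SEC_FLAVORS:
        raise ValueError("unrecognized security flavor: %r" % (mode,))
    return mode
```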
sharecache | nosharecache
- Description
- Determines how the client's data cache is shared between mount points that mount the same remote share. If the option is not specified, or the sharecache option is specified, then all mounts of the same remote share on a client use the same data cache. If the nosharecache option is specified, then files under that mount point are cached separately from files under other mount points that may be accessing the same remote share. As of kernel 2.6.18, the behavior selected by nosharecache is legacy caching behavior, and is considered a data risk since two cached copies of the same file on the same client can become out of sync following an update of one of the copies.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
Valid options for the nfs file system type
proto=netid
- Description
- The transport protocol used by the RPC client to transmit requests to the NFS server for this mount point. The value of netid can be either udp or tcp. Each transport protocol uses different default retrans and timeo settings; see the description of these two mount options for details.
- NB: This mount option controls both how the mount(8) command communicates with the portmapper and the MNT and NFS server, and what transport protocol the in-kernel NFS client uses to transmit requests to the NFS server. Specifying proto=tcp forces all traffic from the mount command and the NFS client to use TCP. Specifying proto=udp forces all traffic types to use UDP. If the proto= mount option is not specified, the mount(8) command chooses the best transport for each type of request (GETPORT, MNT, and NFS), and by default the in-kernel NFS client uses the TCP protocol. If the server doesn't support one or the other protocol, the mount(8) command attempts to discover which protocol is supported and use that.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
port=n
- Description
- The numeric value of the port used by the remote NFS service. If the port= option is not specified, or if the specified port value is 0, then the NFS client uses the NFS service port provided by the remote portmapper service. If any other value is specified, then the NFS client uses that value as the destination port when connecting to the remote NFS service. If the remote host's NFS service is not registered with its portmapper, or if the NFS service is not available on the specified port, the mount fails.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
namlen=n
- Description
- When an NFS server does not support version two of the RPC mount protocol, this option can be used to specify the maximum length of a filename that is supported on the remote filesystem. This is used to support the POSIX pathconf functions. The default is 255 characters.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
mountport=n
- Description
- The numeric value of the mountd port.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
mounthost=name
- Description
- The name of the host running mountd.
- Implementation
- The mount.nfs command must convert the value of this option into a resolved IP address, and pass that to the kernel. The current text-based implementation treats the value not as a name, but as an address.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
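- Example
- The name-to-address conversion described in the implementation note could look roughly like the following Python sketch. The function name is hypothetical, and restricting the lookup to IPv4 is an assumption made for brevity. A value that is already a dotted-quad address passes through unchanged, so both names and addresses are accepted.

```python
import socket


def resolve_mounthost(name):
    """Resolve a mounthost= value to an IPv4 address string.

    Raises socket.gaierror if the name cannot be resolved.
    """
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                               type=socket.SOCK_DGRAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return infos[0][4][0]
```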
mountprog=n
- Description
- Use an alternate RPC program number to contact the mount daemon on the remote host. This option is useful for hosts that can run multiple NFS servers. The default value is 100005 which is the standard RPC mount daemon program number.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
mountvers=n
- Description
- Use an alternate RPC version number to contact the mount daemon on the remote host. This option is useful for hosts that can run multiple NFS servers. The default value depends on which kernel you are using.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
nfsprog=n
- Description
- Use an alternate RPC program number to contact the NFS daemon on the remote host. This option is useful for hosts that can run multiple NFS servers. The default value is 100003 which is the standard RPC NFS daemon program number.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
nfsvers=n
- Description
- Use an alternate RPC version number to contact the NFS daemon on the remote host. This option is useful for hosts that can run multiple NFS servers. The default value depends on which kernel you are using.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
vers=n
- Description
- vers is an alternative to nfsvers and is compatible with many other operating systems.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
nolock
- Description
- Disable NFS locking. Do not start lockd. This is appropriate for mounting the root filesystem or /usr or /var. These filesystems are typically either read-only or not shared, and in those cases, remote locking is not needed. This also needs to be used with some old NFS servers that don't support locking.
- Note that applications can still get locks on files, but the locks only provide exclusion locally. Other clients mounting the same filesystem will not be able to detect the locks.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
intr
- Description
- If an NFS file operation has a major timeout and it is hard mounted, then allow signals to interrupt the file operation and cause it to return EINTR to the calling program. The default is to not allow file operations to be interrupted.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
posix
- Description
- Mount the NFS filesystem using POSIX semantics. This allows an NFS filesystem to properly support the POSIX pathconf command by querying the mount server for the maximum length of a filename. To do this, the remote host must support version two of the RPC mount protocol. Many NFS servers support only version one.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
nocto
- Description
- Suppress the retrieval of new attributes when creating a file.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
noac
- Description
- Disable all forms of attribute caching entirely. This exacts a significant performance penalty, but it allows two different NFS clients to get reasonable results when both clients are actively writing to a common export on the server.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
noacl
- Description
- Disables Access Control List (ACL) processing.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
nordirplus
- Description
- Disables NFSv3 READDIRPLUS RPCs. Use this option when mounting servers that don't support or have broken READDIRPLUS implementations.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
Valid options for the nfs4 file system type
proto=netid
- Description
- The transport protocol used by the RPC client to transmit requests to the NFS server. The value of netid can be either udp or tcp. All NFS version 4 servers are required to support TCP, so the default transport protocol for NFS version 4 is TCP.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
port=n
- Description
- The numeric value of the port used by the remote NFS service. If the port= option is not specified, the NFS client uses the standard NFS port number of 2049 without checking the remote portmapper service. If the specified port value is 0, then the NFS client uses the NFS service port provided by the remote portmapper service. If any other value is specified, then the NFS client uses that value as the destination port when connecting to the remote NFS service. If the remote host's NFS service is not registered with its portmapper, or if the NFS service is not available on the specified port, the mount fails.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
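- Example
- The difference in port= handling between the nfs and nfs4 file system types can be summarized with a short Python sketch. The function name, the dictionary of parsed options, and the query_portmapper callable (standing in for a GETPORT request to the remote portmapper) are all hypothetical.

```python
NFS_STANDARD_PORT = 2049


def destination_port(fstype, options, query_portmapper):
    """Pick the NFS service port following the port= descriptions.

    fstype           -- "nfs" or "nfs4"
    options          -- parsed mount options; "port" may be absent
    query_portmapper -- hypothetical callable returning the registered NFS port
    """
    port = options.get("port")
    if port is None and fstype == "nfs4":
        # nfs4: an omitted port= means 2049, with no portmapper query.
        return NFS_STANDARD_PORT
    if port is None or port == 0:
        # nfs with port= omitted, or either type with port=0: ask the portmapper.
        return query_portmapper()
    # Any other value is used directly as the destination port.
    return port
```

- Note that port=0 behaves the same for both file system types; only the behavior when the option is omitted differs.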
clientaddr=n.n.n.n
- Description
- Specifies a single IPv4 address in dotted-quad form that the NFS client advertises to allow servers to perform NFSv4 callback requests against files on this mount point. If the server is not able to establish callback connections to clients, performance may degrade, or accesses to files may temporarily hang.
- If this option is not specified, the mount(8) command attempts to discover an appropriate callback address automatically. The automatic discovery process is not perfect, however. In the presence of multiple client network interfaces, special routing policies, or atypical network topologies, the exact address to use for callbacks may be nontrivial to determine.
- Implementation priority
- High
- Implementation
- The client address option must discover the local address the server will use to contact the client. On multi-homed hosts, the client's local address depends on which NIC is used to route requests to the server. The address is set automatically by the user-space mount command if the admin doesn't provide one.
- Test plan
- Specify no mount options, and check that the kernel is getting a valid clientaddr= option from the mount.nfs command (using rpcdebug).
- Check that the clientaddr= option in /etc/mtab properly reflects the clientaddr that is in use by the kernel client.
- Specify clientaddr=garbage string, and check that the client's kernel and user-space mount.nfs command properly reject it.
- Specify a good address for clientaddr=, and check that the client's kernel gets the same address.
- Specify a working address for another client for clientaddr=, and check how mounting, normal NFS operation, and umounting behave.
- Specify a nonworking address for clientaddr=, and check how mounting, normal NFS operation, and unmounting behave.
- Testing status
- Not tested with the legacy mount.nfs4 command
- Partially tested with the text-based mount.nfs4 command
- Partially tested with umount.nfs4 -- reports "Server failed to unmount" and hangs if Solaris server can't reach callback address
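- Example
- One common way to discover which local address routes to a given server is to connect a UDP socket to the server and read back the socket's local name; no packet is sent by the connect itself. The Python sketch below illustrates that technique together with a dotted-quad check for user-supplied values. Whether mount.nfs uses this approach is not stated here, so treat it as an assumption; the function names are hypothetical.

```python
import ipaddress
import socket


def discover_clientaddr(server_ip, port=2049):
    """Return the local IPv4 address the routing table selects for server_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # connect() on a datagram socket only fixes the peer and picks a
        # local address; it does not transmit anything.
        sock.connect((server_ip, port))
        return sock.getsockname()[0]


def validate_clientaddr(value):
    """Return value if it is a dotted-quad IPv4 address, else raise ValueError."""
    return str(ipaddress.IPv4Address(value))
```

- As the description warns, an address found this way is not guaranteed to be reachable from the server, for example when address translation sits between client and server.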
intr
- Description
- If an NFS file operation has a major timeout and it is hard mounted, then allow signals to interrupt the file operation and cause it to return EINTR to the calling program. The default is to not allow file operations to be interrupted.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
nocto
- Description
- Suppress the retrieval of new attributes when creating a file.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
noac
- Description
- Disable attribute caching, and force synchronous writes. This exacts a server performance penalty, but it allows two different NFS clients to get reasonably good results when both clients are actively writing to a common filesystem on the server.
- Implementation
- No notes.
- Testing status
- Not tested with legacy mount.nfs
- Not tested with text-based mount.nfs
Security Considerations
NFS provides access control for data, but depends on its RPC implementation to provide authentication of NFS requests. Traditional NFS access control mimics the standard mode bit access control provided in local file systems. Traditional RPC authentication uses a number to represent each user (usually the user's own uid), a number to represent the user's group (the user's gid), and a set of up to 16 auxiliary group numbers to represent other groups of which the user may be a member. File data and user ID values appear in the clear on the network.
Moreover, NFS versions 2 and 3 use separate protocols for mounting, for locking and unlocking files, and for reporting system status of clients and servers. These auxiliary protocols use no authentication.
In addition to combining all the auxiliary protocols into a single protocol, NFS version 4 introduces more advanced forms of access control, authentication, and in-transit data protection. Linux also implements the proprietary NFSv3 access control list implementation built into Solaris, but never standardized, and allows the use of advanced authentication modes for NFS version 2 and version 3 mounts.
The NFS version 4 specification mandates NFSv4 ACLs, RPCGSS authentication, and RPCGSS security flavors that provide per-RPC integrity checking and encryption, and it applies to all NFS version 4 operations including mounting, file locking, and so on. Note that Linux does not yet implement security mode negotiation between NFS version 4 clients and servers.
A mount option enables the RPCGSS security mode that is in effect on a given NFS mount point. Using the sec=krb5 mount option provides a cryptographic proof of a user's identity in each RPC request that passes between client and server. This makes a very strong guarantee about who is accessing what data on the server.
Two other flavors of Kerberos security are supported as well. krb5i provides a cryptographically strong guarantee that the data in each RPC request has not been tampered with. And krb5p encrypts every RPC request so the data is not exposed at all during transit on networks between NFS client and server. There can be some performance impact when using integrity checking or encryption, however.
Support for other forms of cryptographic security is also available, including lipkey and SPKM3.
Citations
fstab(5), mount(8), umount(8), mount.nfs(5), umount.nfs(5), exports(5), nfsd(8), rpc.idmapd(8), rpc.gssd(8), rpc.svcgssd(8), kerberos(1)
- RFC 768 for the UDP specification.
- RFC 793 for the TCP specification.
- RFC 1094 for the NFS version 2 specification.
- RFC 1813 for the NFS version 3 specification.
- RFC 1832 for the XDR specification.
- RFC 1833 for the RPC bind specification.
- RFC 2203 for the RPCSEC GSS API protocol specification.
- RFC 3530 for the NFS version 4 specification.