Server-side silly rename

From Linux NFS

Jump to: navigation, search

The NFSv3 protocol has no way to say "I'm unlinking this file, but please keep it around because I have an application that's still using it". So if the client wants to provide unix-like semantics, it has to resort to this hack (called "silly rename") on unlink of an open file. See also [1], or the earliest description I'm aware of, in "Design and Implementation of the Sun Network Filesystem" (1985):

We tried very hard to make the NFS client obey UNIX filesystem semantics without modifying the server or the protocol. In some cases this was hard to do. For example, UNIX allows removal of open files. A process can open a file, then remove the directory entry for the file so that it has no name anywhere in the filesystem, and still read and write the file. This is a disgusting bit of UNIX trivia and at first we were just not going to support it, but it turns out that all of the programs that we didn't want to have to fix (csh, sendmail, etc.) use this for temporary files.

What we did to make open file removal work on remote files was check in the client VFS remove operation if the file is open, and if so rename it instead of removing it. This makes it (sort of) invisible to the client and still allows reading and writing. The client kernel then removes the new name when the vnode becomes inactive. We call this the 3/4 solution because if the client crashes between the rename and remove a garbage file is left on the server. An entry to cron can be added to clean up on the server.

Silly rename is indeed an imperfect solution. Another case when users sometimes notice the ".nfsXXXX" files is when they try to remove a directory that contains them. Also, it doesn't help if a file is unlinked by a different client than the one that holds it open.

NFSv4 actually does have open and close calls, and our server won't free a file until last close--unless the server reboots, at which point the file will disappear even if an application on the client is still using it. NFS is supposed to keep working normally across server reboots, so the client still does silly rename even in the v4 case.

We could move the responsibility for silly rename to the server--the server could keep a hardlink to the file after unlink, and that would preserve the file after reboot as well. (And it could use a separate directory for the purpose, and avoid the rmdir). We even added a bit to the NFSv4 protocol so that the server can tell the client it does this, allowing the client to skip sillyrename (see references to OPEN4_RESULT_PRESERVE_UNLINKED in [2].)

My rough plan for knfsd is to create a "hidden" directory in the root of the exported filesystem and create a link there for each open file on open, deleted on close. On restart, the server would wait till the end of its grace period for clients to reclaim any files they had open before, and then delete any leftover links.

We may want to think about how exactly to hide that directory. Maybe we could get some kind of help from the filesystem.

Filesystems already have to deal with the case where the system crashes while there are unlinked open files. I believe they keep a list of such files so they can free them in fsck or next mount. Another possibility I considered was hooking into that process somehow--perhaps the server could be given an interface allowing it to discover those orphaned files. But that would require nfsd to be involved in the mount process (currently we mount first, then export). And we'd have to figure out how to perform clean shutdowns without losing those files. And we'd have to worry about losing them any time an administrator fsck'd or mounted without running nfsd. So I think that would be too complicated and fragile.

Personal tools