PNFS File-based Distribution Options
From Linux NFS
Dhildebz
Latest revision as of 21:29, 10 March 2009
Distribute NFSv4 state via File System or NFSD
==============================================

The discussion of whether to distribute NFSv4 stateids via the NFSD or
through the file system is an ongoing debate. I would say it started when
I implemented the first pNFS prototype in 2004. I'm sure this conversation
will continue until someone actually spends the time to implement an
nfsd-to-nfsd stateid mechanism and compares its performance and
functionality with doing the distribution via the file system. Until then,
this page tries to capture the issues.

I think the ideal compromise would have NFSD offer a default
server-to-server protocol that does the most basic of things. File systems
that need more control can implement their own mechanism and hook into it
via a standard interface (via export ops or something).

Outside file system (via NFSD or something)
===========================================

Pro: Simplifies work of FS

Con:
a) NFSD MUST be exactly in sync with the FS about which data servers are
   active and about which client has accessed data on which data server.
   (Note the number of export ops it would take to maintain this
   coherence.)

   NFSD must know which devices to distribute state to and revoke state
   from. This means that it must be in sync with regard to the active
   data servers. Ideally, it would also track which data servers have
   which stateids so it can avoid calling data servers that don't have
   any information. In the laissez-faire case, it would even know which
   clients accessed which data servers, so the revokes can be even more
   specific.

   In short, it seems to me that the only place that really understands
   the whole picture is the FS.

b) FS already has a protocol to communicate between servers, why
   duplicate it?

c) Scalability is an issue. Many FSs use sophisticated communication
   algorithms to spread information among a large number of servers
   (tree broadcast, etc.).

Inside file system
==================

Pro:
a) Leaves the problem of performance and scalability to the file system
   (which has already solved the problem).

b) The FS has all the required information to optimize distribution and
   revocation of all state information. No sync with NFSD layer required.

Con: Each FS must implement extra export ops
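
The compromise suggested in the introduction (a basic NFSD default, with a hook for file systems that want more control) can be sketched as a small model. This is only an illustration under stated assumptions, not kernel code: Python is used purely as a modeling language, and every name here (`StateDistributor`, `DefaultNfsdDistributor`, `FsDistributor`, `Export`, `pick_distributor`) is hypothetical and stands in for the "export ops or something" interface, which the page does not define. Each call returns the list of data servers that would be contacted, to make the cost difference visible.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Protocol, Set


class StateDistributor(Protocol):
    """Hypothetical hook interface (a stand-in for extra export ops)."""

    def distribute(self, stateid: str, servers: List[str]) -> List[str]: ...

    def revoke(self, stateid: str) -> List[str]: ...


@dataclass
class DefaultNfsdDistributor:
    """Assumed basic NFSD fallback: knows only the active data servers."""

    active_servers: List[str]

    def distribute(self, stateid: str, servers: List[str]) -> List[str]:
        # No per-stateid bookkeeping: send to every active data server.
        return list(self.active_servers)

    def revoke(self, stateid: str) -> List[str]:
        # Cannot tell who holds the stateid, so contact everyone.
        return list(self.active_servers)


@dataclass
class FsDistributor:
    """Assumed FS-supplied mechanism: tracks which servers hold a stateid."""

    holders: Dict[str, Set[str]] = field(default_factory=dict)

    def distribute(self, stateid: str, servers: List[str]) -> List[str]:
        # Remember exactly which data servers received this stateid.
        self.holders.setdefault(stateid, set()).update(servers)
        return sorted(servers)

    def revoke(self, stateid: str) -> List[str]:
        # Contact only the servers known to hold the stateid.
        return sorted(self.holders.pop(stateid, set()))


@dataclass
class Export:
    """An exported file system, optionally providing its own mechanism."""

    name: str
    fs_distributor: Optional[StateDistributor] = None


def pick_distributor(export: Export, default: StateDistributor) -> StateDistributor:
    # Prefer the file system's hook; otherwise fall back to the NFSD default.
    if export.fs_distributor is not None:
        return export.fs_distributor
    return default
```

In this model, an export without a hook falls back to the default and a revoke touches every active data server. An export that supplies `FsDistributor` revokes only from the servers it recorded. That is the "FS has the whole picture" argument in miniature, at the cost of each file system implementing the hook.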