Peer-to-peer NFS
From Linux NFS
(Difference between revisions)
Line 1: | Line 1: | ||
+ | [[P2P Design Specification]] | ||
+ | == Test Plan == | ||
+ | * Basic tests: | ||
+ | ** pnfsd exports filesystem | ||
+ | ** p2pds reads file from pnfsd | ||
+ | ** p2pclient opens file on pnfsd, is referred to p2pds for reading | ||
+ | * Scalability tests | ||
+ | ** Recreate a "boot storm" environment | ||
+ | ** pnfsd exports filesystem | ||
+ | ** N DSs read files from pnfsd | ||
+ | ** M * N DSs open files on pnfsd and are referred to DSs for reading | ||
+ | ** Compare times to read files on each client to times required to read all files without p2p | ||
+ | |||
== Implementation == | == Implementation == | ||
=== NFSD === | === NFSD === | ||
Line 45: | Line 58: | ||
* [most of PROXY_REVOKE] | * [most of PROXY_REVOKE] | ||
* Delete the extra delegation state stored during PROXY_OPEN. This will force a new PROXY_OPEN call (and possibly start some kind of recovery) the next time the file is accessed. | * Delete the extra delegation state stored during PROXY_OPEN. This will force a new PROXY_OPEN call (and possibly start some kind of recovery) the next time the file is accessed. | ||
+ | |||
+ | === VFS === | ||
+ | ==== struct super_block -> struct vfsmount ==== | ||
+ | * [NFSD: Implement PROXY_OPEN] | ||
+ | * Look through sb->s_mounts to find a mount instance, inspired by d_find_any_alias(). All vfsmounts in the super_block point to the same filesystem, so we don't care which one is returned as long as NFSD gets something to pass to dentry_open(). | ||
+ | * This is the biggest hack I used. | ||
+ | |||
+ | == Prerequisite patches == | ||
+ | * These are patches that introduce features and code that I directly modify; they might have their own prerequisites that I haven't tracked down yet. | ||
+ | * Some patches may be bugfixes or add code near lines that I change and may not actually be required. | ||
+ | === Generic PNFSD === | ||
+ | ==== pnfsd: get device list/info ==== | ||
+ | <nowiki>commit 50af02e4b053ac4fed827ba983539a28db91046c | ||
+ | Author: Benny Halevy <bhalevy@panasas.com> | ||
+ | Date: Tue Aug 11 17:07:04 2009 -0400 | ||
+ | |||
+ | pnfsd: get device list/info | ||
+ | |||
+ | Implement the generic handling of GETDEVICELIST and GETDEVICEINFO. | ||
+ | |||
+ | After verifying that the requested layout type is supported, | ||
+ | getdevlist uses the get_device_iter pnfs export method | ||
+ | to encode the list of deviceids and get the cookie, verifier, | ||
+ | and eof flag to be used by the client to iterate through | ||
+ | the whole device list. | ||
+ | |||
+ | Getdevinfo uses the get_device_info pnfs export method | ||
+ | to encode the device info for the given deviceid. | ||
+ | |||
+ | The filesystem can choose to return valid cookie and cookieverf | ||
+ | on eof, pointing at the end of the device list so that subsequent | ||
+ | calls to GETDEVICELIST will return an empty list. | ||
+ | |||
+ | Note that with the file layout, lots of devices are sent under a | ||
+ | single device id, so the client will need to send a relatively | ||
+ | large value of maxcount. | ||
+ | |||
+ | If maxcount is 0 then just update notifications. | ||
+ | The nfsv4.1 spec forbids returning ETOOSMALL in this case. | ||
+ | It is up to the implementor of the get_device_info method | ||
+ | to verify the deviceid in this case and return no | ||
+ | info for it. | ||
+ | |||
+ | If no notifications are given represent gdir_notification as an empty | ||
+ | bitmap array rather than one consisting of a single zeroed entry. | ||
+ | Thanks to Dean Hildebrand for suggesting this optimization | ||
+ | and to Peter Staubach for convincing that it's worth it. | ||
+ | |||
+ | Nfsd should return sbid while getting device list so that it can operate it properly later in nfsd4_getdevinfo. | ||
+ | |||
+ | [extracted from pnfsd: Initial pNFS server implementation.] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: update pNFS server ops to draft 13] | ||
+ | Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> | ||
+ | [pnfsd: Fix server getdevicelist update to draft 13] | ||
+ | Signed-off-by: Andy Adamson<andros@umich.edu> | ||
+ | [pnfsd: update pNFS server ops to draft 13] | ||
+ | Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> | ||
+ | [pnfsd: Fix server GETDEVICELIST to comply with NFSv4.1 Draft 13] | ||
+ | Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> | ||
+ | [pnfsd: Streamline error code checking for non-pnfs filesystems] | ||
+ | Signed-off-by: Dean Hildebrand <seattleplus@gmail.com> | ||
+ | [pnfsd: Simplify device export ops.] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfs: fix compile problems if CONFIG_PNFS turned off - exportfs.h] | ||
+ | Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> | ||
+ | [pnfsd: Implement getdevlist maxcount checking.] | ||
+ | [pnfsd: use nfs error codes] | ||
+ | [pnfsd: Use 128 bit deviceid on server] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: fix warning in nfsd4_encode_devlist_iterator()] | ||
+ | Signed-off-by: Mike Sager <sager@netapp.com> | ||
+ | [pnfsd: Update getdeviceinfo for draft-19] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: encode empty getdeviceinfo notify bitmap rather than zeroed] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: do not depend on the current file handle in getdeviceinfo] | ||
+ | [pnfsd: update export hold count] | ||
+ | Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> | ||
+ | [pnfsd: Update getdevlist for draft 19] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: fix GETDEVICELIST encoding] | ||
+ | Signed-off-by: Mike Sager <sager@netapp.com> | ||
+ | [pnfsd: use nfsd4_compoundres pointer in pnfs_xdr_info] | ||
+ | [pnfsd: fix NFS4ERR_TOOSMALL for getdeviceinfo] | ||
+ | [pnfsd: enable multipage getdeviceinfo da_addr_body] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: move vfs api structures to nfsd4_pnfs.h] | ||
+ | [pnfsd: convert generic code to use new pnfs api] | ||
+ | [pnfsd: define pnfs_export_operations] | ||
+ | [pnfsd: obliterate old vfs api] | ||
+ | [pnfsd: fixup ENCODE_HEAD for getdevicelist/info] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: get device list/info all layout types] | ||
+ | [pnfsd: check ex_pnfs in nfsd4_verify_layout] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [removed nfsd4_pnfs_fl_getdev{info,iter} stubs] | ||
+ | [pnfsd: filelayout: convert to using exp_xdr] | ||
+ | [pnfsd: get rid of getdevinfo notify_types] | ||
+ | [pnfsd: copy getdevinfo deviceid in one piece] | ||
+ | [pnfsd: rename deviceid_t struct pnfs_deviceid] | ||
+ | [pnfsd: fix cosmetic checkpatch warnings] | ||
+ | [pnfsd: handle s_pnfs_op==NULL] | ||
+ | [pnfsd: move getdevinfo xdr structure to private header] | ||
+ | [pnfsd: clean up getdeviceinfo export op API] | ||
+ | [pnfsd: getdeviceinfo deviceid needs to be const.] | ||
+ | [pnfsd: allow returning empty device list.] | ||
+ | [pnfsd: return NFS4ERR_INVAL when maxdevices is zero.] | ||
+ | [pnfsd: move getdevlist xdr structure to private header] | ||
+ | [pnfsd: dev_iter: clean up export API] | ||
+ | [pnfsd: rename device fsid member to sbid] | ||
+ | [pnfsd: use devid.sbid for locating super block in getdevinfo] | ||
+ | [pnfsd: fixup nfsd4_encode_getdev{list,info} to use __be32 nfserr] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Use list_move instead list_del and list_add] | ||
+ | Signed-off-by: Bian Naimeng <biannm@cn.fujitsu.com> | ||
+ | [pnfsd: using sbid instead of fsid while returning device list to client] | ||
+ | Signed-off-by: Zhengju Sha <shazhengju@nrchpc.ac.cn> | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: layout get ==== | ||
+ | <nowiki>commit 15a582b4dee94ae23b39f672a0590cad54d42fc1 | ||
+ | Author: Benny Halevy <bhalevy@panasas.com> | ||
+ | Date: Tue Aug 11 17:07:06 2009 -0400 | ||
+ | |||
+ | pnfsd: layout get | ||
+ | |||
+ | Currently, always return a single record in the log_layout array. | ||
+ | |||
+ | If an invalid iomode, or an iomode of LAYOUTIOMODE4_ANY is specified, the | ||
+ | metadata server MUST return NFS4ERR_BADIOMODE. | ||
+ | |||
+ | [extracted from pnfsd: Initial pNFS server implementation.] | ||
+ | [pnfsd: nfsd layout cache: layout return changes] | ||
+ | [pnfsd: add debug printouts in return_layout path] | ||
+ | [pnfsd: refactor return_layout] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Streamline error code checking for non-pnfs filesystems] | ||
+ | [pnfsd: Use nfsd4_layout_seg instead of wrapper struct.] | ||
+ | [pnfsd: Move nfsd4_layout_seg to exportfs.h] | ||
+ | [pnfsd: Fix file layout layoutget export op for d13] | ||
+ | [pnfsd: Simplify layout get export interface.] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: improve nfs4_pnfs_get_layout dprintks] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: initialize layoutget return_on_close] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: update server layout xdr for draft 19.] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: use stateid_t for layout stateid xdr data structs] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Update getdeviceinfo for draft-19] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: xdr encode layoutget response logr_layout array count as per draft-19] | ||
+ | [pnfsd: use stateid xdr {en,de}code functions for layoutget] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: use nfsd4_compoundres pointer in pnfs_xdr_info] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: move vfs api structures to nfsd4_pnfs.h] | ||
+ | [pnfsd: convert generic code to use new pnfs api] | ||
+ | [pnfsd: define pnfs_export_operations] | ||
+ | [pnfsd: obliterate old vfs api] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [Split this patch into filelayout only (this patch) and all layout types] | ||
+ | (patch pnfsd: layout get all layout types). | ||
+ | Remove use of pnfs_export_operations. | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: fixup ENCODE_HEAD for layoutget] | ||
+ | [pnfsd: rewind xdr response pointer on nfsd4_encode_layoutget error] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [Move pnfsd code from nfs4state.c to nfs4pnfsd.c] | ||
+ | [Move common state code from linux/nfsd/state.h to fs/nfsd/internal.h] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: Release lock during layout export ops.] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [cosmetic changes from pnfsd: Helper functions for layout stateid processing.] | ||
+ | [pnfsd: layout get all layout types] | ||
+ | [pnfsd: check ex_pnfs in nfsd4_verify_layout] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [removed the nfsd4_pnfs_fl_layoutget stub] | ||
+ | [pnfsd: get rid of layout encoding function vector] | ||
+ | [pnfsd: filelayout: convert to using exp_xdr] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Move pnfsd code out of nfs4state.c/h] | ||
+ | Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> | ||
+ | [fixed !CONFIG_PNFSD and clean up for pnfsd-files] | ||
+ | [gfs2: set pnfs_dlm_export_ops only for CONFIG_PNFSD] | ||
+ | [moved pnfsd defs back into state.h] | ||
+ | [pnfsd: rename deviceid_t struct pnfs_deviceid] | ||
+ | [pnfsd: fix cosmetic checkpatch warnings] | ||
+ | [pnfsd: handle s_pnfs_op==NULL] | ||
+ | [pnfsd: move layoutget xdr structure to xdr4.h] | ||
+ | [pnfsd: clean up layoutget export API] | ||
+ | [pnfsd: moved find_alloc_file to nfs4state.c] | ||
+ | [moved struct nfs4_fsid to public include/linux/nfs4.h] | ||
+ | [pnfsd: rename device fsid member to sbid] | ||
+ | [pnfsd: use sbid hash table to map super_blocks to devid major identifiers] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: fix file system API layout_get error codes] | ||
+ | [pnfsd: fix NFS4ERR_BADIOMODE in layoutget] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: require filesystem layout_get method return a u32 rather than int] | ||
+ | [pnfsd: allow filesystem to return canonical nfs4 errors for layoutget] | ||
+ | [pnfsd: do not allow filesystem to return encoded nfs errors on layout_get] | ||
+ | [pnfsd: fixup nfs4_pnfs_get_layout to use __be32 nfserr] | ||
+ | [pnfsd: allow filesystem to return NFS4ERR_WRONG_TYPE for layout_get] | ||
+ | [pnfsd: fix error handling in layout_get] | ||
+ | [pnfsd: fix uninitialized usage of nfserr in nfs4_pnfs_get_layout] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: process the layout stateid ==== | ||
+ | <nowiki>commit f2e3eade61963b7794c82939a86434b7be09cbb0 | ||
+ | Author: Andy Adamson <andros@umich.edu> | ||
+ | Date: Tue Aug 11 17:07:11 2009 -0400 | ||
+ | |||
+ | pnfsd: process the layout stateid | ||
+ | |||
+ | Common function for LAYOUTGET and LAYOUTRETURN layout stateid processing. | ||
+ | |||
+ | The 'first open, delegation, or lock stateid' presented by the client is | ||
+ | looked up for verification. | ||
+ | |||
+ | Both initial and non-initial parallel LAYOUTGET operations and parallel | ||
+ | LAYOUTRETURN operations are supported. | ||
+ | |||
+ | Note: layout stateid seqid checking is more lax than that specified in | ||
+ | draft-ietf-nfsv4-minorversion1-22 for Connectathon. | ||
+ | |||
+ | Take a reference count whenever the pointer to the layout state | ||
+ | is kept, in particular when the layout structure is listed on the | ||
+ | state's ls_layouts. On dequeue_layout the layout state is being put | ||
+ | and its reference count will drop to zero if the list empties | ||
+ | unless someone's holding a reference transiently within the scope | ||
+ | of the calling function, in which case the layout state is dereferenced | ||
+ | before the function exits. | ||
+ | |||
+ | Note: the layout stateid must be updated by layout get only | ||
+ | on success upon changing the actual state, otherwise, | ||
+ | a parallel layout_recall will send the wrong stateid. | ||
+ | |||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: nfs4_process_layout_stateid print result stateid conditionally] | ||
+ | [pnfsd: use STATEID_FMT and STATEID_VAL for printing stateids] | ||
+ | [pnfsd: debug print layout stateid before putting the layout_state] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: fix layout state reference count] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [used nfs4_check_stateid in nfs4_process_layout_stateid] | ||
+ | [Moved pnfsd code from nfs4state.c to nfs4pnfsd.c] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: use a spinlock for layout state] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Move pnfsd code out of nfs4state.c/h] | ||
+ | Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> | ||
+ | [moved defs back into state.h] | ||
+ | [verify_stateid's return status is __be32] | ||
+ | [update layout stateid properly] | ||
+ | [convert to using 3.2 layout state infrastructure] | ||
+ | [squashed Helper functions for layout stateid processing] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: LAYOUTGET layout stateid processing ==== | ||
+ | <nowiki>commit 16b73ee1ba17b7b8dcb5226f7d0056f419884a0b | ||
+ | Author: Andy Adamson <andros@netapp.com> | ||
+ | Date: Tue Aug 11 17:07:12 2009 -0400 | ||
+ | |||
+ | pnfsd: LAYOUTGET layout stateid processing | ||
+ | |||
+ | [Moved pnfsd code from nfs4state.c to nfs4pnfsd.c] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: clean up layoutget export API] | ||
+ | [pnfsd: fix uninitialized usage of nfserr in nfs4_pnfs_get_layout] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: fix layout state ref counting] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@tonian.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: layout recall callback ==== | ||
+ | <nowiki>commit fcf9d16843b042034cc7e440e68c95958a41856e | ||
+ | Author: Benny Halevy <bhalevy@panasas.com> | ||
+ | Date: Tue Aug 11 17:07:39 2009 -0400 | ||
+ | |||
+ | pnfsd: layout recall callback | ||
+ | |||
+ | when cb_layoutrecall is replied with NFS4ERR_NOMATCHING_LAYOUTS | ||
+ | simulate a return_layout call into the file system and release | ||
+ | all layouts in the range as if a respective LAYOUTRETURN was received. | ||
+ | |||
+ | On a recall, a client may return NFS4ERR_DELAY to indicate | ||
+ | that it is busy with the layout and wants to be polled. | ||
+ | |||
+ | TODO: If the client is stuck he would probably be cleaned | ||
+ | at expire client. But it is possible that the client | ||
+ | is active/renewing but would not acknowledge the | ||
+ | recall. We should take a time stamp on first recall | ||
+ | and expire the client if a lease time has passed. | ||
+ | |||
+ | cb_layout_recall() return codes: | ||
+ | ->cb_layout_recall (nfsd_layout_recall_cb) will always try | ||
+ | to make forward progress. Below is the meaning of the possible | ||
+ | return codes: | ||
+ | |||
+ | -ENOENT: | ||
+ | There are no layouts that match the requested recall. Operation | ||
+ | can proceed. | ||
+ | |||
+ | 0: | ||
+ | All needed recalls were sent and code should wait for | ||
+ | the given cookie to be returned at layout_return. | ||
+ | |||
+ | -EAGAIN: | ||
+ | There were errors sending all of the recalls, but some | ||
+ | forward progress was made. code should wait for the given | ||
+ | cookie to be returned at layout_return, but then call | ||
+ | cb_layout_recall() again, until -ENOENT is returned. | ||
+ | (Note it is always safe to call cb_layout_recall() multiple | ||
+ | times until -ENOENT is returned) | ||
+ | |||
+ | ANY ERROR: (Currently only -ENOMEM) | ||
+ | Any other error means it was not possible to make any | ||
+ | forward progress, the operation should not be attempted | ||
+ | and an error should be returned to the application. (Or | ||
+ | whatever is appropriate in that situation). | ||
+ | |||
+ | Note: do not expire client on cb_layout error. | ||
+ | This may lead to deadlocks on the nfsd state lock. | ||
+ | Instead, wait for the laundromat to expire the client. | ||
+ | |||
+ | TODO: Regardless, even if the callback succeeded, we need to track | ||
+ | a list of recalled layouts and maintain it as they are returned by the client. | ||
+ | Then, scrub it in the laundromat and expire the client if it hasn't | ||
+ | returned all recalled layouts in a timely manner (e.g. after 2 | ||
+ | lease periods after its last return with the respective stateid) | ||
+ | |||
+ | [extracted from pnfsd: Initial pNFS server implementation.] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: nfsd layout cache: layoutrecall changes] | ||
+ | [pnfsd: nfsd layout cache: cb_layoutrecall: minorversion1 xdr infrastructure] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | Signed-off-by: Andy Adamson <andros@umich.edu> | ||
+ | Signed-off-by: Mike Sager <sager@netapp.com> | ||
+ | [pnfsd: fix compile error with gcc 3.4.4] | ||
+ | [pnfsd: cleanup encode_cb_layout dprintks] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Integrated remaining callback patches] | ||
+ | [extracted from: Integrated remaining NFSv4.1 callback channel and cb sequence patches.] | ||
+ | [pnfsd: Spaces to tabs fixes] | ||
+ | Signed-off-by: Mike Sager <sager@netapp.com> | ||
+ | [pnfsd: simulate layoutreturn on nomatching_layouts error] | ||
+ | [pnfsd: get a reference on nfs4_file when cloning nfs4_layoutrecall] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Tested-by: Marc Eshel <eshel@almaden.ibm.com> | ||
+ | [pnfsd: Use nfsd4_layout_seg instead of wrapper struct.] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: update cb_layoutrecall xdr to draft-19] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: remove unused symbol exports] | ||
+ | Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> | ||
+ | [pnfsd: dprint recalled layout stateid] | ||
+ | [pnfsd: handle RETURN_{FSID,ALL} with no nfs4_file] | ||
+ | [pnfsd: properly xdr-encode cb_layoutrecall stateid] | ||
+ | [pnfsd: set up layout recall stateid for RECALL_FILE] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: remove unused clr_status and cbd_status variables] | ||
+ | [pnfsd: Remove layoutrecall dead code.] | ||
+ | [pnfsd: Fixup layoutrecall server handling.] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: refactor create_layout_recall_list] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [part of pnfsd: Do not hold state lock while recalling layouts.] | ||
+ | Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> | ||
+ | [pnfsd: expire_layout when client is expired] | ||
+ | Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> | ||
+ | [pnfsd: use STATEID_FMT and STATEID_VAL for printing stateids] | ||
+ | [pnfsd: convert generic code to use new pnfs api] | ||
+ | [pnfsd: define pnfsd_cb_operations] | ||
+ | [pnfsd: get rid of generic use of old cb ops in export_operations] | ||
+ | [pnfsd: obliterate old vfs api] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Backchannel: Get minorversion directly from arguments] | ||
+ | [pnfsd: Backchannel: Update pnfs callbacks to use new slot locking mechanism] | ||
+ | Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> | ||
+ | [pnfsd: delete unused status in spawn_layout_recall] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: update layout stateid on cb layout recall] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: refactor nfsd4_cb_prepare for pnfs callbacks] | ||
+ | [pnfsd: remove redundant BUG_ON in nfsd_layout_recall_cb] | ||
+ | [pnfsd: fix nfs4_file reference leak in nfsd_layout_recall_cb] | ||
+ | [pnfsd: fix indentation in nfsd_layout_recall_cb's declaration] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [Moved pnfsd code from nfs4state.c to nfs4pnfsd.c] | ||
+ | Signed-off-by: Andy Adamson <andros@netapp.com> | ||
+ | [pnfsd: use a spinlock for layout state] | ||
+ | [pnfsd: fix compiler warnings when CONFIG_PNFSD is not defined] | ||
+ | [pnfsd: expire_client code cleanup] | ||
+ | [pnfsd: bugfix and handle -EIO in nfsd4_cb_layout_done()] | ||
+ | [pnfsd: cb_recall cookie (layout return hint 03)] | ||
+ | Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> | ||
+ | [pnfsd: callbacks must kfree args on rpc_call_async error] | ||
+ | Reported-by: J. Bruce Fields <bfields@fieldses.org> | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: fix compile problem] | ||
+ | Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> | ||
+ | [for 2.6.32: pnfsd: use callback_cred] | ||
+ | [moved recall_return_*_match function definitions here] | ||
+ | [pnfsd: nomatching_layout unlocked (layout_return hint 02)] | ||
+ | [pnfsd: Bug in last pnfs_expire_client changes] | ||
+ | [pnfsd: pnfs_expire client Second bug fix] | ||
+ | Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> | ||
+ | [fixup "SQUASHME: pnfsd: pnfs_expire client Second bug fix"] | ||
+ | [pnfsd: properly initialize lsid in lo_recall_per_client] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Move pnfsd code out of nfs4state.c/h] | ||
+ | Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> | ||
+ | [pnfsd: use a filter function to iterate through all confirmed clients] | ||
+ | [pnfsd: fixup nfs4_pnfs_get_layout to use __be32 nfserr] | ||
+ | [pnfsd: cb_{set,client} moved in 2.6.35] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Support for cb_layout returning NFS4ERR_DELAY] | ||
+ | Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> | ||
+ | [pnfsd: move kfree of the cb_layout rpc_args to release time] | ||
+ | [pnfsd: update layout stateid properly] | ||
+ | [pnfsd: reset status on NFS4ERR_NOMATCHING_LAYOUT] | ||
+ | [rewrite for 2.6.38] | ||
+ | [always dprintk cb_done status] | ||
+ | [rebase onto nfsd-2.6/for-3.2] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com> | ||
+ | [pnfsd: Compile warning in pnfs-all-3.1-2011-10-31-1] | ||
+ | Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> | ||
+ | Signed-off-by: Benny Halevy <bhalevy@tonian.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: decode getdeviceinfo notify types. ==== | ||
+ | <nowiki>commit c4c21cb5a26ce8beac71a05e0ba34c1607aafe72 | ||
+ | Author: Benny Halevy <bhalevy@panasas.com> | ||
+ | Date: Thu Oct 15 15:15:12 2009 +0200 | ||
+ | |||
+ | pnfsd: decode getdeviceinfo notify types. | ||
+ | |||
+ | Reintroduced. Removed for pnfsd-files patchset. | ||
+ | |||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: set_device_notify ==== | ||
+ | <nowiki>commit 81fd9c9102cf0d1a41c0a52be8a84bc19ec8e446 | ||
+ | Author: Benny Halevy <bhalevy@panasas.com> | ||
+ | Date: Thu Oct 15 15:27:48 2009 +0200 | ||
+ | |||
+ | pnfsd: set_device_notify | ||
+ | |||
+ | Set the device notify_types at GETDEVICEINFO time. | ||
+ | The call may be used to initially set or to update | ||
+ | the requested notify types for the given deviceid. | ||
+ | The implementor of the method must update the notify_types | ||
+ | member with the supported notification types. | ||
+ | If none are supported, this method can be left unimplemented. | ||
+ | |||
+ | [pnfsd: nfsd4_pnfs_deviceid] | ||
+ | [pnfsd: fix error handling for s_pnfs_op->get_device_info] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: handle LAYOUTGET with maxcount >= 2^31 ==== | ||
+ | <nowiki>commit e0824c9416352328cd79ea9aa1e12c7444fa3baa | ||
+ | Author: Benny Halevy <bhalevy@tonian.com> | ||
+ | Date: Thu Feb 23 15:50:41 2012 -0800 | ||
+ | |||
+ | pnfsd: handle LAYOUTGET with maxcount >= 2^31 | ||
+ | |||
+ | Signed-off-by: Benny Halevy <bhalevy@tonian.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: verify minlength and range as per RFC5661 ==== | ||
+ | <nowiki>commit 5007875a386990bbaff090a1bc50f533253b5c8c | ||
+ | Author: Benny Halevy <bhalevy@tonian.com> | ||
+ | Date: Wed Mar 14 22:17:30 2012 +0200 | ||
+ | |||
+ | pnfsd: verify minlength and range as per RFC5661 | ||
+ | |||
+ | Implement verification logic for loga_minlength and loga_length | ||
+ | specified in RFC5661, Section 18.43.3. | ||
+ | |||
+ | Signed-off-by: Benny Halevy <bhalevy@tonian.com></nowiki> | ||
+ | |||
+ | ==== pnfsd: make return_on_close state layout-global for the file/client ==== | ||
+ | <nowiki>commit a199108eec467bce691a46f85ed908c906c048ed | ||
+ | Author: Benny Halevy <bhalevy@tonian.com> | ||
+ | Date: Mon May 28 11:18:35 2012 +0300 | ||
+ | |||
+ | pnfsd: make return_on_close state layout-global for the file/client | ||
+ | |||
+ | As per RFC5661 errata #3226 | ||
+ | http://www.ietf.org/mail-archive/web/nfsv4/current/msg10965.html | ||
+ | Once the server returns the return_on_close flag set, all the layout | ||
+ | for that client will be implicitly returned on last close. | ||
+ | |||
+ | Reported-by: Boaz Harrosh <bharrosh@panasas.com> | ||
+ | Signed-off-by: Benny Halevy <bhalevy@tonian.com></nowiki> | ||
+ | |||
+ | === pnfsd-lexp patches === | ||
+ | ==== pnfsd-lexp: get_device_info ==== | ||
+ | <nowiki>commit 16947a6a1e15c33c37f3e98273617be22ab80b1f | ||
+ | Author: Benny Halevy <bhalevy@panasas.com> | ||
+ | Date: Mon Jun 16 18:57:15 2008 +0300 | ||
+ | |||
+ | pnfsd-lexp: get_device_info | ||
+ | |||
+ | [directly call filelayout_encode_{layout,devinfo}] | ||
+ | [get rid of getdevinfo notify_types] | ||
+ | [move to new getdevinfo API] | ||
+ | [fixup LAYOUT_NFSV4_1_FILES] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== pnfsd-lexp: layout_get ==== | ||
+ | <nowiki>commit 3f05b23b3f81f65076b8adda3b8e38f6e54ecc1a | ||
+ | Author: Benny Halevy <bhalevy@panasas.com> | ||
+ | Date: Mon Jun 16 19:01:15 2008 +0300 | ||
+ | |||
+ | pnfsd-lexp: layout_get | ||
+ | |||
+ | [directly call filelayout_encode_{layout,devinfo}] | ||
+ | [move to new layoutget API] | ||
+ | [rename device fsid member to sbid] | ||
+ | [fixup layout_get return type to u32] | ||
+ | [change layout_get return type to enum nfsstat4] | ||
+ | [fixup LAYOUT_NFSV4_1_FILES] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== pnfsd-lexp: simulate layout segments ==== | ||
+ | <nowiki>commit d60b53d55629ad30275e3a9ec2188659703eba38 | ||
+ | Author: Benny Halevy <bhalevy@panasas.com> | ||
+ | Date: Wed Apr 20 18:47:07 2011 +0300 | ||
+ | |||
+ | pnfsd-lexp: simulate layout segments | ||
+ | |||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== pnfsd-lexp: Export correct ipv6 address for local exports ==== | ||
+ | <nowiki>commit d518f7cfed490367bc6833f614e7fc9ee480333a | ||
+ | Author: Michael Groshans <groshans@umich.edu> | ||
+ | Date: Tue May 31 14:20:29 2011 -0400 | ||
+ | |||
+ | pnfsd-lexp: Export correct ipv6 address for local exports | ||
+ | |||
+ | A sockaddr is not big enough to hold most ipv6 addresses. Changed to use | ||
+ | sockaddr_storage to hold the ipv6 address used for local exports. | ||
+ | |||
+ | Signed-off-by: Michael Groshans <groshans@umich.edu> | ||
+ | Signed-off-by: Benny Halevy <bhalevy@panasas.com></nowiki> | ||
+ | |||
+ | ==== SQUASHME: pnfsd-lexp: return only nfs4errs from pnfsd_lexp_layout_get ==== | ||
+ | <nowiki>commit 3fb7ad9fe7b10bca98539e95c31085edfe87cb2f | ||
+ | Author: Benny Halevy <bhalevy@tonian.com> | ||
+ | Date: Mon Jun 13 14:25:01 2011 -0400 | ||
+ | |||
+ | SQUASHME: pnfsd-lexp: return only nfs4errs from pnfsd_lexp_layout_get | ||
+ | |||
+ | Signed-off-by: Benny Halevy <bhalevy@tonian.com></nowiki> | ||
+ | |||
+ | ==== pnfsd-lexp: return_on_close config option ==== | ||
+ | <nowiki>commit f7f207ea96cb07be4b9b1244db4107ae8dc2f4d1 | ||
+ | Author: Benny Halevy <bhalevy@tonian.com> | ||
+ | Date: Mon May 28 14:03:58 2012 +0300 | ||
+ | |||
+ | pnfsd-lexp: return_on_close config option | ||
+ | |||
+ | [fix default for PNFSD_LEXP_RETURN_ON_CLOSE] | ||
+ | Signed-off-by: Benny Halevy <bhalevy@tonian.com></nowiki> |
Latest revision as of 18:30, 3 December 2012
Test Plan
- Basic tests:
- pnfsd exports filesystem
- p2pds reads file from pnfsd
- p2pclient opens file on pnfsd, is referred to p2pds for reading
- Scalability tests
- Recreate a "boot storm" environment
- pnfsd exports filesystem
- N DSs read files from pnfsd
- M * N DSs open files on pnfsd and are referred to DSs for reading
- Compare times to read files on each client to times required to read all files without p2p
Implementation
NFSD
Register DS
- [NFSD: Implement REGISTER_DS]
- Creates a new pnfs_p2p_client structure to store arguments from REGISTER_DS (netid, ip address, mds identifier, stateid) and attaches this structure to the struct nfs4_client sending the request. Assumes REGISTER_DS_ALL.
Unregister DS
- [NFSD: Implement UNREGISTER_DS]
- Frees the pnfs_p2p_client structure created by REGISTER_DS.
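The register/unregister pair above can be modelled in userspace as follows. This is only a minimal sketch: `pnfs_p2p_client` and the four stored arguments come from the text, but the field names and sizes, the `nfs4_client_model` stand-in for `struct nfs4_client`, the function names, and the choice to reject a second registration are all assumptions made for illustration, not the actual kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an NFSv4 stateid (layout assumed). */
struct p2p_stateid {
	uint32_t seqid;
	uint8_t other[12];
};

/* Arguments saved from REGISTER_DS; field names and sizes are assumed. */
struct pnfs_p2p_client {
	char netid[8];          /* e.g. "tcp" */
	char ip_addr[64];       /* address string sent by the DS */
	uint32_t mds_id;        /* mds identifier chosen by the DS */
	struct p2p_stateid stid;
};

/* Userspace model of the struct nfs4_client that sent the request. */
struct nfs4_client_model {
	uint64_t clientid;
	struct pnfs_p2p_client *p2p;   /* NULL until REGISTER_DS succeeds */
};

/* Returns 0 on success, -1 on allocation failure or double registration. */
int p2p_register_ds(struct nfs4_client_model *clp, const char *netid,
		    const char *ip, uint32_t mds_id,
		    const struct p2p_stateid *stid)
{
	struct pnfs_p2p_client *p;

	assert(clp != NULL && netid != NULL && ip != NULL && stid != NULL);
	if (clp->p2p != NULL)
		return -1;      /* assumption of this sketch: no re-register */

	p = calloc(1, sizeof(*p));
	if (p == NULL)
		return -1;

	snprintf(p->netid, sizeof(p->netid), "%s", netid);
	snprintf(p->ip_addr, sizeof(p->ip_addr), "%s", ip);
	p->mds_id = mds_id;
	p->stid = *stid;
	clp->p2p = p;           /* attach to the requesting client */
	return 0;
}

/* UNREGISTER_DS: free whatever REGISTER_DS attached. */
void p2p_unregister_ds(struct nfs4_client_model *clp)
{
	assert(clp != NULL);
	free(clp->p2p);
	clp->p2p = NULL;
}
```

Keeping the p2p state behind a single pointer on the client means the expiry path only has to call the unregister helper to release it.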
Layout get
- [NFSD: Find p2p device addresses for layoutget]
- Use most of pnfsd_lexp_layout_get(), adding a new function call when the device ID is set. I also extend filelayout_encode_layout() to prepend the mds identifier to the returned filehandle. pnfs_p2p_find_deviceid() searches the file's delegation list and returns the first client found that isn't the client placing the LAYOUTGET call. I send the DS's clientid as the device ID for GETDEVICEINFO.
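The delegation-list search described above might look roughly like the following user-space sketch. `struct deleg` and `p2p_find_deviceid()` are hypothetical names; the real pnfs_p2p_find_deviceid() works on the kernel's delegation structures and its exact signature is not shown in these notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified delegation list entry. */
struct deleg {
	uint64_t clientid;   /* client holding this delegation */
	struct deleg *next;
};

/*
 * Return (via *devid) the clientid of the first delegation holder that is
 * not the client issuing LAYOUTGET. The clientid doubles as the device ID.
 * Returns 0 on success, -1 if no other client holds a delegation.
 */
int p2p_find_deviceid(const struct deleg *head, uint64_t caller,
		      uint64_t *devid)
{
	assert(devid != NULL);
	for (const struct deleg *d = head; d != NULL; d = d->next) {
		if (d->clientid != caller) {
			*devid = d->clientid;
			return 0;
		}
	}
	return -1;
}
```

Because the device ID and the clientid are the same value, the later device-info lookup needs no translation table.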
Get device info
- [NFSD: Find device address of a p2p DS]
- Use most of pnfsd_lexp_get_device_info(), but have the p2p code fill out the daddr return value in the p2p case. I can easily translate between devid and clientid since they're the same value, and this allows me to look up the netid and ip address from the client structure.
Put filehandle
- [NFSD: Make putfh work with p2p filehandles]
- Assume that any filehandle of length 36 is a p2p filehandle and split out the mds identifier from the rest of the handle. Skip the state checking normally associated with putfh: the NFS client doesn't set up an exports structure, so that checking would crash.
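A minimal sketch of the length-based split, in user-space C. The 36-byte length comes from the text above; the 8-byte width of the mds identifier (`P2P_MDS_ID_LEN`) is purely an assumption for illustration, as are the struct and function names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define P2P_FH_LEN     36  /* from the text: 36-byte handles are treated as p2p */
#define P2P_MDS_ID_LEN 8   /* ASSUMPTION: the identifier width is not stated */

/* Hypothetical result of splitting a p2p filehandle. */
struct p2p_fh {
	uint8_t mds_id[P2P_MDS_ID_LEN];           /* identifier at the front */
	uint8_t fh[P2P_FH_LEN - P2P_MDS_ID_LEN];  /* remainder of the handle */
	size_t fh_len;
};

/* Returns 1 and fills *out if buf looks like a p2p handle, 0 otherwise. */
int p2p_split_fh(const uint8_t *buf, size_t len, struct p2p_fh *out)
{
	assert(buf != NULL && out != NULL);
	if (len != P2P_FH_LEN)
		return 0; /* ordinary filehandle: take the normal putfh path */
	memcpy(out->mds_id, buf, P2P_MDS_ID_LEN);
	out->fh_len = P2P_FH_LEN - P2P_MDS_ID_LEN;
	memcpy(out->fh, buf + P2P_MDS_ID_LEN, out->fh_len);
	return 1;
}
```

A check based on length alone would also match any ordinary handle that happens to be 36 bytes long, so the sketch inherits that limitation.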
Read
- [NFSD: Make putfh work with p2p filehandles]
- If the filehandle is a p2p handle, then call the NFS client's nfs_proxy_open() function to map the p2pfh into a fh recognized by the DS.
Proxy open
- [NFSD: Implement PROXY_OPEN]
- Generate a new stateid representing the proxy open using the new struct pnfs_p2p_po_stid to store relevant information. Verify the filehandle exists using fh_verify(), since the DS skipped this check during the PUTFH call.
Expire clients
- [NFSD: Clean up p2p clients when they expire]
- Free up a client's p2p state when it expires.
Proxy revoke
- [most of PROXY_REVOKE]
- I send PROXY_REVOKE to clients that have PROXY_OPEN-ed files when a p2pclient (not a p2pds) is expired. I haven't yet figured out the right way to free the p2p stateid during nfsd4_cb_proxy_revoke_release(); all of my attempts seem to cause an oops.
NFS
Register DS
- [NFS: Place a call to REGISTER_DS]
- Generate a new mds identifier using the cl_cb_ident and a static int that is incremented every REGISTER_DS call. Always sends REGISTER_DS_ALL as the sharing type. Called from nfs4_remote_mount() if nfs_fs_mount_common() returns success.
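One way to combine the two inputs is sketched below. The notes do not say how cl_cb_ident and the counter are merged, so the 64-bit layout (callback ident in the high half, counter in the low half) and the name `next_mds_id()` are assumptions.

```c
#include <stdint.h>

/*
 * Hypothetical generator: a 32-bit callback ident in the high half and a
 * per-call counter in the low half. The function-local static mirrors the
 * "static int incremented every REGISTER_DS call" from the text; it is not
 * thread-safe as written.
 */
uint64_t next_mds_id(uint32_t cb_ident)
{
	static uint32_t counter;

	return ((uint64_t)cb_ident << 32) | counter++;
}
```

With this layout two mounts that share a callback ident still receive distinct identifiers until the counter wraps.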
Unregister DS
- [NFS: Place a call to UNREGISTER_DS]
- Called from nfs4_destroy_server() when the mount uses NFSv4.1.
Proxy open
- [NFS: Place a call to PROXY_OPEN]
- I use the mds identifier to find the struct nfs_server that opened the file for caching and use it to place a call to PROXY_OPEN on the mounted server. This gives me a translated fh that I use to first find an associated inode (using nfs_delegation_find_inode()) and then a dentry (using d_find_any_alias()). The dentry is returned to the nfsd code.
Proxy open
- [NFS: Store list of proxy opened files in the delegation]
- Check if we already have the file proxy-opened by storing some extra state in the delegation structure. If we already have a mapping to the real filehandle, then use that instead of placing a new PROXY_OPEN call.
Proxy revoke
- [most of PROXY_REVOKE]
- Delete the extra delegation state stored during PROXY_OPEN. This will force a new PROXY_OPEN call (and possibly start some kind of recovery) the next time the file is accessed.
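The caching and revoke behaviour on the NFS client side can be sketched as follows. All names here (`struct deleg_state`, `proxy_open_cached()`, `proxy_revoke()`, `fake_proxy_open()`) are hypothetical; the real code hangs the extra state off the kernel's delegation structure and issues an actual PROXY_OPEN RPC where this sketch calls a function pointer.

```c
#include <assert.h>
#include <stddef.h>

#define FH_MAX 128 /* NFSv4 filehandles are at most 128 bytes */

struct fh {
	size_t len;
	unsigned char data[FH_MAX];
};

/* Hypothetical extra state kept alongside a delegation. */
struct deleg_state {
	int have_proxy_fh;   /* non-zero once PROXY_OPEN has succeeded */
	struct fh proxy_fh;  /* cached mapping to the real filehandle */
};

/* Stand-in for the PROXY_OPEN call: returns 0 and fills *real on success. */
typedef int (*proxy_open_fn)(const struct fh *p2pfh, struct fh *real);

/* Use the cached mapping if present, otherwise place a new PROXY_OPEN. */
int proxy_open_cached(struct deleg_state *d, const struct fh *p2pfh,
		      proxy_open_fn rpc, struct fh *real)
{
	assert(d != NULL && p2pfh != NULL && rpc != NULL && real != NULL);
	if (!d->have_proxy_fh) {
		int err = rpc(p2pfh, &d->proxy_fh);

		if (err)
			return err;
		d->have_proxy_fh = 1;
	}
	*real = d->proxy_fh;
	return 0;
}

/* PROXY_REVOKE: drop the mapping so the next access re-opens. */
void proxy_revoke(struct deleg_state *d)
{
	assert(d != NULL);
	d->have_proxy_fh = 0;
}

/* Demonstration-only fake RPC that counts calls and echoes the handle. */
int rpc_calls;
int fake_proxy_open(const struct fh *p2pfh, struct fh *real)
{
	rpc_calls++;
	*real = *p2pfh;
	return 0;
}
```

Calling `proxy_open_cached()` twice with `fake_proxy_open` issues one fake RPC; after `proxy_revoke()` the next call issues another.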
VFS
struct super_block -> struct vfsmount
- [NFSD: Implement PROXY_OPEN]
- Look through sb->s_mounts to find a mount instance, inspired by d_find_any_alias(). All vfsmounts in the super_block point to the same filesystem, so we don't care which one is returned as long as NFSD gets something to pass to dentry_open().
- This is the biggest hack I used.
Prerequisite patches
- These are patches that introduce features and code that I directly modify. They might have their own prerequisites that I haven't tracked down yet.
- Some patches may be bugfixes or add code near lines that I change and may not actually be required.
Generic PNFSD
pnfsd: get device list/info
commit 50af02e4b053ac4fed827ba983539a28db91046c Author: Benny Halevy <bhalevy@panasas.com> Date: Tue Aug 11 17:07:04 2009 -0400 pnfsd: get device list/info Implement the generic handling of GETDEVICELIST and GETDEVICEINFO. After verifying that the requested layout type is supported, getdevlist uses the get_device_iter pnfs export method to encode the list of deviceids and get the cookie, verifier, and eof flag to be used be the client to iterate through the whole device list. Getdevinfo uses the get_device_info pnfs export method to encode the device info for the given deviceid. The filesystem can choose to return valid cookie and cookieverf on eof, pointing at the end of the device list so that subsequent calls to GETDEVIE LIST will return an empty list. Note that with the file layout, lots of devices are sent under a single device id, so the client will need to send a relatively large value of maxcount. If maxcount is 0 then just update notifications. The nfsv4.1 spec forbids returning ETOOSMALL in this case. It is up to the implementor of the get_device_info method to verify the deviceid in this case and return no info for it. If no notifications are given represent gdir_notification as an empty bitmap array rather than one consisting of a single zeroed entry. Thanks to Dean Hildebrand for suggesting this optimization and to Peter Staubach for convincing that it's worth it. Nfsd should return sbid while getting device list so that it can operate it properly later in nfsd4_getdevinfo. [extracted from pnfsd: Initial pNFS server implementation.] 
Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: update pNFS server ops to draft 13] Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> [pnfsd: Fix server getdevicelist update to draft 13] Signed-off-by: Andy Adamson<andros@umich.edu> [pnfsd: update pNFS server ops to draft 13] Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> [pnfsd: Fix server GETDEVICELIST to comply with NFSv4.1 Draft 13] Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> [pnfsd: Streamline error code checking for non-pnfs filesystems] Signed-off-by: Dean Hildebrand <seattleplus@gmail.com> [pnfsd: Simplify device export ops.] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfs: fix compile problems if CONFIG_PNFS turned off - exportfs.h] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> [pnfsd: Implement getdevlist maxcount checking.] [pnfsd: use nfs error codes] [pnfsd: Use 128 bit deviceid on server] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: fix warning in nfsd4_encode_devlist_iterator()] Signed-off-by: Mike Sager <sager@netapp.com> [pnfsd: Update getdeviceinfo for draft-19] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: encode empty getdeviceinfo notify bitmap rather than zeroed] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: do not depend on the current file handle in getdeviceinfo] [pnfsd: update export hold count] Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> [pnfsd: Update getdevlist for draft 19] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: fix GETDEVICELIST encoding] Signed-off-by: Mike Sager <sager@netapp.com> [pnfsd: use nfsd4_compoundres pointer in pnfs_xdr_info] [pnfsd: fix NFS4ERR_TOOSMALL for getdeviceinfo] [pnfsd: enable multipage getdeviceinfo da_addr_body] Signed-off-by: Andy Adamson <andros@netapp.com> [pnfsd: move vfs api structures to nfsd4_pnfs.h] [pnfsd: convert generic code to use new pnfs api] [pnfsd: define pnfs_export_operations] [pnfsd: obliterate old vfs api] [pnfsd: fixup 
ENCODE_HEAD for getdevicelist/info] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: get device list/info all layout types] [pnfsd: check ex_pnfs in nfsd4_verify_layout] Signed-off-by: Andy Adamson <andros@netapp.com> [removed nfsd4_pnfs_fl_getdev{info,iter} stubs] [pnfsd: filelayout: convert to using exp_xdr] [pnfsd: get rid of getdevinfo notify_types] [pnfsd: copy getdevinfo deviceid in one piece] [pnfsd: rename deviceid_t struct pnfs_deviceid] [pnfsd: fix cosmetic checkpatch warnings] [pnfsd: handle s_pnfs_op==NULL] [pnfsd: move getdevinfo xdr structure to private header] [pnfsd: clean up getdeviceinfo export op API] [pnfsd: getdeviceinfo deviceid needs to be const.] [pnfsd: allow returning empty device list.] [pnfsd: return NFS4ERR_INVAL when maxdevices is zero.] [pnfsd: move getdevlist xdr structure to private header] [pnfsd: dev_iter: clean up export API] [pnfsd: rename device fsid member to sbid] [pnfsd: use devid.sbid for locating super block in getdevinfo] [pnfsd: fixup nfsd4_encode_getdev{list,info} to use __be32 nfserr] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Use list_move instead list_del and list_add] Signed-off-by: Bian Naimeng <biannm@cn.fujitsu.com> [pnfsd: using sbid instead of fsid while returning device list to client] Signed-off-by: Zhengju Sha <shazhengju@nrchpc.ac.cn> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
pnfsd: layout get
commit 15a582b4dee94ae23b39f672a0590cad54d42fc1 Author: Benny Halevy <bhalevy@panasas.com> Date: Tue Aug 11 17:07:06 2009 -0400 pnfsd: layout get Currently, always return a single record in the log_layout array. If an invalid iomode, or an iomode of LAYOUTIOMODE4_ANY is specified, the metadata server MUST return NFS4ERR_BADIOMODE. [extracted from pnfsd: Initial pNFS server implementation.] [pnfsd: nfsd layout cache: layout return changes] [pnfsd: add debug printouts in return_layout path] [pnfsd: refactor return_layout] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Streamline error code checking for non-pnfs filesystems] [pnfsd: Use nfsd4_layout_seg instead of wrapper struct.] [pnfsd: Move nfsd4_layout_seg to exportfs.h] [pnfsd: Fix file layout layoutget export op for d13] [pnfsd: Simplify layout get export interface.] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: improve nfs4_pnfs_get_layout dprintks] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: initialize layoutget return_on_close] Signed-off-by: Andy Adamson <andros@netapp.com> [pnfsd: update server layout xdr for draft 19.] 
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: use stateid_t for layout stateid xdr data structs] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Update getdeviceinfo for draft-19] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: xdr encode layoutget response logr_layout array count as per draft-19] [pnfsd: use stateid xdr {en,de}code functions for layoutget] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: use nfsd4_compoundres pointer in pnfs_xdr_info] Signed-off-by: Andy Adamson <andros@netapp.com> [pnfsd: move vfs api structures to nfsd4_pnfs.h] [pnfsd: convert generic code to use new pnfs api] [pnfsd: define pnfs_export_operations] [pnfsd: obliterate old vfs api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Split this patch into filelayout only (this patch) and all layout types] (patch pnfsd: layout get all layout types). Remove use of pnfs_export_operations. Signed-off-by: Andy Adamson <andros@netapp.com> [pnfsd: fixup ENCODE_HEAD for layoutget] [pnfsd: rewind xdr response pointer on nfsd4_encode_layoutget error] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Move pnfsd code from nfs4state.c to nfs4pnfsd.c] [Move common state code from linux/nfsd/state.h to fs/nfsd/internal.h] Signed-off-by: Andy Adamson <andros@netapp.com> [pnfsd: Release lock during layout export ops.] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [cosmetic changes from pnfsd: Helper functions for layout stateid processing.] 
[pnfsd: layout get all layout types] [pnfsd: check ex_pnfs in nfsd4_verify_layout] Signed-off-by: Andy Adamson <andros@netapp.com> [removed the nfsd4_pnfs_fl_layoutget stub] [pnfsd: get rid of layout encoding function vector] [pnfsd: filelayout: convert to using exp_xdr] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Move pnfsd code out of nfs4state.c/h] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [fixed !CONFIG_PNFSD and clean up for pnfsd-files] [gfs2: set pnfs_dlm_export_ops only for CONFIG_PNFSD] [moved pnfsd defs back into state.h] [pnfsd: rename deviceid_t struct pnfs_deviceid] [pnfsd: fix cosmetic checkpatch warnings] [pnfsd: handle s_pnfs_op==NULL] [pnfsd: move layoutget xdr structure to xdr4.h] [pnfsd: clean up layoutget export API] [pnfsd: moved find_alloc_file to nfs4state.c] [moved struct nfs4_fsid to public include/linux/nfs4.h] [pnfsd: rename device fsid member to sbid] [pnfsd: use sbid hash table to map super_blocks to devid major identifiers] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: fix file system API layout_get error codes] [pnfsd: fix NFS4ERR_BADIOMODE in layoutget] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: require filesystem layout_get method return a u32 rather than int] [pnfsd: allow filesystem to return canonical nfs4 errors for layoutget] [pnfsd: do not allow filesystem to return encoded nfs errors on layout_get] [pnfsd: fixup nfs4_pnfs_get_layout to use __be32 nfserr] [pnfsd: allow filesystem to return NFS4ERR_WRONG_TYPE for layout_get] [pnfsd: fix error handling in layout_get] [pnfsd: fix uninitialized usage of nfserr in nfs4_pnfs_get_layout] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
pnfsd: process the layout stateid
commit f2e3eade61963b7794c82939a86434b7be09cbb0 Author: Andy Adamson <andros@umich.edu> Date: Tue Aug 11 17:07:11 2009 -0400 pnfsd: process the layout stateid Common function for LAYOUTGET and LAYOUTRETURN layout stateid processing. The 'first open, delegation, or lock stateid' presented by the client is looked up for verification. Both initial and non-initial parallel LAYOUTGET operations and parallel LAYOUTRETURN operations are supported. Note: layout stateid seqid checking is more lax than that specified in draft-ietf-nfsv4-minorversion1-22 for Connectathon. Take a reference count whenever the pointer to the layout state is kept, in particular when the layout structure is listed on the state's ls_layouts. On dequeue_layout the layout state if being put and its reference count will drop to zero if the list empties unless someone's holding a reference transiently within the scope of teh calling function, in which case the layout state is dereferenced before the function exits. Note: the layout stateid must be updated by layout get only on success upon changing the actual state, otherwise, a parallel layout_recall will send the wrong stateid. 
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: nfs4_process_layout_stateid print result stateid conditionally] [pnfsd: use STATEID_FMT and STATEID_VAL for printing stateids] [pnfsd: debug print layout stateid before putting the layout_state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: fix layout state reference count] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [used nfs4_check_stateid in nfs4_process_layout_stateid] [Moved pnfsd code from nfs4state.c to nfs4pnfsd.c] Signed-off-by: Andy Adamson <andros@netapp.com> [pnfsd: use a spinlock for layout state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Move pnfsd code out of nfs4state.c/h] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [moved defs back into state.h] [verify_stateid's return status is __be32] [update layout stateid properly] [convert to using 3.2 layout state infrastructure] [squashed Helper functions for layout stateid processing] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
pnfsd: LAYOUTGET layout stateid processing
commit 16b73ee1ba17b7b8dcb5226f7d0056f419884a0b
Author: Andy Adamson <andros@netapp.com>
Date: Tue Aug 11 17:07:12 2009 -0400

    pnfsd: LAYOUTGET layout stateid processing

    [Moved pnfsd code from nfs4state.c to nfs4pnfsd.c]
    Signed-off-by: Andy Adamson <andros@netapp.com>
    [pnfsd: clean up layoutget export API]
    [pnfsd: fix uninitialized usage of nfserr in nfs4_pnfs_get_layout]
    Signed-off-by: Benny Halevy <bhalevy@panasas.com>
    [pnfsd: fix layout state ref counting]
    Signed-off-by: Benny Halevy <bhalevy@tonian.com>
pnfsd: layout recall callback
commit fcf9d16843b042034cc7e440e68c95958a41856e Author: Benny Halevy <bhalevy@panasas.com> Date: Tue Aug 11 17:07:39 2009 -0400 pnfsd: layout recall callback when cb_layoutrecall is replied with NFS4ERR_NOMATCHING_LAYOUTS simulate a return_layout call into the file system and release all layouts in the range as if a respective LAYOUTRETURN was received. On a recall, a client may return NFS4ERR_DELAY to indicate that it is busy with the layout and wants to be poled. TODO: If the client is stuck he would probably be cleaned at expire client. But it is possible that the client is active/renewing but would not acknowledge the recall. We should take a time stamp on first recall and expire the client if a lease time has passed. cb_layout_recall() return codes: ->cb_layout_recall (nfsd_layout_recall_cb) will always try to make forward progress. Below is the meaning of the possible return codes: -ENOENT: There are no layouts that match the requested recall. Operation can proceed. 0: All needed recalls were sent and code should wait for the given cookie to be returned at layout_return. -EAGAIN: There were errors sending all of the recalls, but some forward progress was made. code should wait for the given cookie to be returned at layout_return, but then call cb_layout_recall() again, until -ENOENT is returned. (Note it is always safe to call cb_layout_recall() multiple times until -ENOENT is returned) ANY ERROR: (Currently only -ENOMEM) Any other error means it was not possible to make any forward progress, the operation should not be attempted and an error should be returned to the application. (Or whatever is appropriate in that situation). Note: do not expire client on cb_layout error. This may lead to deadlocks on the nfsd state lock. Instead, wait for the laundromat to expire the client. 
TODO: Regardless, even if the callback succeeded, we need to track a list of recalled layout and maintain it as they returned by the client Then, scrub it in the laundromat and expire the client if it hasn't returned all recalled layouts in a timely manner (e.g. after 2 lease periods after its last return with the respective stateid) [extracted from pnfsd: Initial pNFS server implementation.] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: nfsd layout cache: layoutrecall changes] [pnfsd: nfsd layout cache: cb_layoutrecall: minorversion1 xdr infrastructure] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@umich.edu> Signed-off-by: Mike Sager <sager@netapp.com> [pnfsd: fix compile error with gcc 3.4.4] [pnfsd: cleanup encode_cb_layout dprintks] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Integrated remaining callback patches] [extracted from: Integrated remaining NFSv4.1 callback channel and cb sequence patches.] [pnfsd: Spaces to tabs fixes] Signed-off-by: Mike Sager <sager@netapp.com> [pnfsd: simulate layoutreturn on nomatching_layouts error] [pnfsd: get a reference on nfs4_file when cloning nfs4_layoutrecall] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Tested-by: Marc Eshel <eshel@almaden.ibm.com> [pnfsd: Use nfsd4_layout_seg instead of wrapper struct.] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: update cb_layoutrecall xdr to draft-19] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: remove unused symbol exports] Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> [pnfsd: dprint recalled layout stateid] [pnfsd: handle RETURN_{FSID,ALL} with no nfs4_file] [pnfsd: properly xdr-encode cb_layoutrecall stateid] [pnfsd: set up layout recall stateid for RECALL_FILE] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: remove unused clr_status and cbd_status variables] [pnfsd: Remove layoutrecall dead code.] 
[pnfsd: Fixup layoutrecall server handling.] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: refactor create_layout_recall_list] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [part of pnfsd: Do not hold state lock while recalling layouts.] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [pnfsd: expire_layout when client is expired] Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> [pnfsd: use STATEID_FMT and STATEID_VAL for printing stateids] [pnfsd: convert generic code to use new pnfs api] [pnfsd: define pnfsd_cb_operations] [pnfsd: get rid of generic use of old cb ops in export_operations] [pnfsd: obliterate old vfs api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Backchannel: Get minorversion directly from arguments] [pnfsd: Backchannel: Update pnfs callbacks to use new slot locking mechanism] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [pnfsd: delete unused status in spawn_layout_recall] Signed-off-by: Benny Halevy <bhalevy@panasas.com> pnfsd: update layout stateid on cb layout recall] Signed-off-by: Andy Adamson <andros@netapp.com> [pnfsd: refactor nfsd4_cb_prepare for pnfs callbacks] [pnfsd: remove redundant BUG_ON in nfsd_layout_recall_cb] [pnfsd: fix nfs4_file reference leak in nfsd_layout_recall_cb] [pnfsd: fix indentation in nfsd_layout_recall_cb's declaration] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Moved pnfsd code from nfs4state.c to nfs4pnfsd.c] Signed-off-by: Andy Adamson <andros@netapp.com> [pnfsd: use a spinlock for layout state] [pnfsd: fix compiler warnings when CONFIG_PNFSD is not defined] [pnfsd: expire_client code cleanup] [pnfsd: bugfix and handle -EIO in nfsd4_cb_layout_done()] [pnfsd: cb_recall cookie (layout return hint 03)] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [pnfsd: callbacks must kfree args on rpc_call_async error] Reported-by: J. 
Bruce Fields <bfields@fieldses.org> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: fix compile problem] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> [for 2.6.32: pnfsd: use callback_cred] [moved recall_return_*_match function definitions here] [pnfsd: nomatching_layout unlocked (layout_return hint 02)] [pnfsd: Bug in last pnfs_expire_client changes] [pnfsd: pnfs_expire client Second bug fix] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [fixup "SQUASHME: pnfsd: pnfs_expire client Second bug fix"] [pnfsd: properly initialize lsid in lo_recall_per_client] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Move pnfsd code out of nfs4state.c/h] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [pnfsd: use a filter function to iterate through all confirmed clients] [pnfsd: fixup nfs4_pnfs_get_layout to use __be32 nfserr] [pnfsd: cb_{set,client} moved in 2.6.35] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Support for cb_layout returning NFS4ERR_DELAY] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [pnfsd: move kfree of the cb_layout rpc_args to release time] [pnfsd: update layout stateid properly] [pnfsd: reset status on NFS4ERR_NOMATCHING_LAYOUT] [rewrite for 2.6.38] [always dprintk cb_done status] [rebase onto nfsd-2.6/for-3.2] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsd: Compile warning in pnfs-all-3.1-2011-10-31-1] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>
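The cb_layout_recall() return codes listed in the commit message above suggest a caller loop along these lines. This is a user-space sketch only: `recall_all_layouts()`, the function-pointer types, and the scripted helpers are hypothetical, and waiting for the cookie at layout_return is reduced to a callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*recall_fn)(void *ctx);   /* stand-in for ->cb_layout_recall */
typedef void (*wait_fn)(void *ctx);    /* stand-in for "wait for the cookie" */

/*
 * Drive recalls until nothing is left:
 *   -ENOENT  no matching layouts, the operation can proceed
 *   0        all recalls sent, wait for the cookie and proceed
 *   -EAGAIN  partial progress, wait for the cookie and then recall again
 *   other    no forward progress, give up and report the error
 */
int recall_all_layouts(recall_fn recall, wait_fn wait_for_return, void *ctx)
{
	assert(recall != NULL && wait_for_return != NULL);
	for (;;) {
		int rc = recall(ctx);

		if (rc == -ENOENT)
			return 0;
		if (rc == 0) {
			wait_for_return(ctx);
			return 0;
		}
		if (rc == -EAGAIN) {
			wait_for_return(ctx);
			continue;
		}
		return rc;
	}
}

/* Demonstration-only helpers: replay a fixed list of return codes. */
struct script {
	const int *rcs;
	int pos;
	int waits;
};

int scripted_recall(void *ctx)
{
	struct script *s = ctx;

	return s->rcs[s->pos++];
}

void count_wait(void *ctx)
{
	struct script *s = ctx;

	s->waits++;
}
```

The loop relies on the stated guarantee that calling cb_layout_recall() repeatedly is safe until -ENOENT comes back.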
pnfsd: decode getdeviceinfo notify types.
commit c4c21cb5a26ce8beac71a05e0ba34c1607aafe72
Author: Benny Halevy <bhalevy@panasas.com>
Date: Thu Oct 15 15:15:12 2009 +0200

    pnfsd: decode getdeviceinfo notify types.

    Reintroduced.
    Removed for pnfsd-files patchset.

    Signed-off-by: Benny Halevy <bhalevy@panasas.com>
pnfsd: set_device_notify
commit 81fd9c9102cf0d1a41c0a52be8a84bc19ec8e446
Author: Benny Halevy <bhalevy@panasas.com>
Date: Thu Oct 15 15:27:48 2009 +0200

    pnfsd: set_device_notify

    Set the device notify_types at GETDEVICEINFO time. The call may be used to initially set or to update the requested notify types for the given deviceid. The implementor of the method must update the notify_types member with the supported notification types. If none are supported, this method can be left unimplemented.

    [pnfsd: nfsd4_pnfs_deviceid]
    [pnfsd: fix error handling for s_pnfs_op->get_device_info]
    Signed-off-by: Benny Halevy <bhalevy@panasas.com>
pnfsd: handle LAYOUTGET with maxcount >= 2^31
commit e0824c9416352328cd79ea9aa1e12c7444fa3baa
Author: Benny Halevy <bhalevy@tonian.com>
Date: Thu Feb 23 15:50:41 2012 -0800

    pnfsd: handle LAYOUTGET with maxcount >= 2^31

    Signed-off-by: Benny Halevy <bhalevy@tonian.com>
pnfsd: verify minlength and range as per RFC5661
commit 5007875a386990bbaff090a1bc50f533253b5c8c
Author: Benny Halevy <bhalevy@tonian.com>
Date: Wed Mar 14 22:17:30 2012 +0200

    pnfsd: verify minlength and range as per RFC5661

    Implement verification logic for loga_minlength and loga_length specified in RFC5661, Section 18.43.3.

    Signed-off-by: Benny Halevy <bhalevy@tonian.com>
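A sketch of the kind of range check this commit refers to. The three conditions reflect one reading of RFC 5661, Section 18.43.3 and should be checked against the RFC text; the function name and the boolean return convention are hypothetical and not taken from the patch.

```c
#include <stdint.h>

#define NFS4_UINT64_MAX UINT64_MAX

/*
 * Returns 1 if the LAYOUTGET range arguments are acceptable and 0 if the
 * server should answer NFS4ERR_INVAL:
 *   - loga_length must not be smaller than loga_minlength
 *   - offset + minlength must not overflow, unless minlength is "all ones"
 *   - offset + length must not overflow, unless length is "all ones"
 */
int layoutget_range_ok(uint64_t offset, uint64_t length, uint64_t minlength)
{
	if (length < minlength)
		return 0;
	if (minlength != NFS4_UINT64_MAX &&
	    minlength > NFS4_UINT64_MAX - offset)
		return 0;
	if (length != NFS4_UINT64_MAX &&
	    length > NFS4_UINT64_MAX - offset)
		return 0;
	return 1;
}
```

Comparing against `NFS4_UINT64_MAX - offset` avoids computing a sum that could wrap around.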
pnfsd: make return_on_close state layout-global for the file/client
commit a199108eec467bce691a46f85ed908c906c048ed
Author: Benny Halevy <bhalevy@tonian.com>
Date: Mon May 28 11:18:35 2012 +0300

    pnfsd: make return_on_close state layout-global for the file/client

    As per RFC5661 errata #3226
    http://www.ietf.org/mail-archive/web/nfsv4/current/msg10965.html

    Once the server returns the return_on_close flag set, all the layout for that client will be implicitly returned on last close.

    Reported-by: Boaz Harrosh <bharrosh@panasas.com>
    Signed-off-by: Benny Halevy <bhalevy@tonian.com>
pnfsd-lexp patches
pnfsd-lexp: get_device_info
commit 16947a6a1e15c33c37f3e98273617be22ab80b1f
Author: Benny Halevy <bhalevy@panasas.com>
Date: Mon Jun 16 18:57:15 2008 +0300

    pnfsd-lexp: get_device_info

    [directly call filelayout_encode_{layout,devinfo}]
    [get rid of getdevinfo notify_types]
    [move to new getdevinfo API]
    [fixup LAYOUT_NFSV4_1_FILES]
    Signed-off-by: Benny Halevy <bhalevy@panasas.com>
pnfsd-lexp: layout_get
commit 3f05b23b3f81f65076b8adda3b8e38f6e54ecc1a
Author: Benny Halevy <bhalevy@panasas.com>
Date: Mon Jun 16 19:01:15 2008 +0300

    pnfsd-lexp: layout_get

    [directly call filelayout_encode_{layout,devinfo}]
    [move to new layoutget API]
    [rename device fsid member to sbid]
    [fixup layout_get return type to u32]
    [change layout_get return type to enum nfsstat4]
    [fixup LAYOUT_NFSV4_1_FILES]
    Signed-off-by: Benny Halevy <bhalevy@panasas.com>
pnfsd-lexp: simulate layout segments
commit d60b53d55629ad30275e3a9ec2188659703eba38
Author: Benny Halevy <bhalevy@panasas.com>
Date: Wed Apr 20 18:47:07 2011 +0300

    pnfsd-lexp: simulate layout segments

    Signed-off-by: Benny Halevy <bhalevy@panasas.com>
pnfsd-lexp: Export correct ipv6 address for local exports
commit d518f7cfed490367bc6833f614e7fc9ee480333a
Author: Michael Groshans <groshans@umich.edu>
Date: Tue May 31 14:20:29 2011 -0400

    pnfsd-lexp: Export correct ipv6 address for local exports

    A sockaddr is not big enough to hold most ipv6 addresses. Changed to use sockaddr_storage to hold the ipv6 address used for local exports.

    Signed-off-by: Michael Groshans <groshans@umich.edu>
    Signed-off-by: Benny Halevy <bhalevy@panasas.com>
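The size problem behind this fix can be shown with a short user-space sketch. On Linux, struct sockaddr is 16 bytes while struct sockaddr_in6 is 28, so an IPv6 address has to be kept in a struct sockaddr_storage, which POSIX sizes to hold any supported address type. `store_ipv6()` is a hypothetical helper, not code from the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* sockaddr_storage must be able to hold an IPv6 socket address. */
_Static_assert(sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_in6),
	       "sockaddr_storage too small for sockaddr_in6");

/* Copy an IPv6 socket address into generic storage; returns bytes used. */
size_t store_ipv6(struct sockaddr_storage *ss, const struct sockaddr_in6 *sin6)
{
	assert(ss != NULL && sin6 != NULL);
	memset(ss, 0, sizeof(*ss));
	memcpy(ss, sin6, sizeof(*sin6));
	return sizeof(*sin6);
}
```

Copying the same structure into a plain struct sockaddr would write past the end of the buffer.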
SQUASHME: pnfsd-lexp: return only nfs4errs from pnfsd_lexp_layout_get
commit 3fb7ad9fe7b10bca98539e95c31085edfe87cb2f
Author: Benny Halevy <bhalevy@tonian.com>
Date: Mon Jun 13 14:25:01 2011 -0400

    SQUASHME: pnfsd-lexp: return only nfs4errs from pnfsd_lexp_layout_get

    Signed-off-by: Benny Halevy <bhalevy@tonian.com>
pnfsd-lexp: return_on_close config option
commit f7f207ea96cb07be4b9b1244db4107ae8dc2f4d1
Author: Benny Halevy <bhalevy@tonian.com>
Date: Mon May 28 14:03:58 2012 +0300

    pnfsd-lexp: return_on_close config option

    [fix default for PNFSD_LEXP_RETURN_ON_CLOSE]
    Signed-off-by: Benny Halevy <bhalevy@tonian.com>