Fallocate

From Linux NFS

Amschuma

Revision as of 16:43, 28 October 2013

Whenever a client wishes to zero the blocks backing a particular region of a file, it calls the WRITE_PLUS operation with the current filehandle set to the filehandle of that file, and with the start offset and length of the region, in bytes, set in wpa_hole.di_offset and wpa_hole.di_length respectively. If wpa_hole.di_allocated is set to TRUE, the blocks are zeroed; if it is set to FALSE, they are deallocated. In both cases, all further reads from this region MUST return zeros until it is overwritten. The filehandle specified must be that of a regular file.



Data type reference

typedef uint32_t count4;
typedef uint64_t length4;
typedef uint64_t offset4;

enum stable_how4 {
        UNSTABLE4       = 0,
        DATA_SYNC4      = 1,
        FILE_SYNC4      = 2
};

enum data_content4 {
        NFS4_CONTENT_DATA = 0,
        NFS4_CONTENT_APP_DATA_HOLE = 1,
        NFS4_CONTENT_HOLE = 2
};

struct data_info4 {
        offset4         di_offset;
        length4         di_length;
        bool            di_allocated;
};


Argument

union write_plus_arg4 switch (data_content4 wpa_content) {
case NFS4_CONTENT_HOLE:
        data_info4      wpa_hole;
default:
        void;
};

struct WRITE_PLUS4args {
        /* CURRENT_FH: file */
        stateid4        wp_stateid;
        stable_how4     wp_stable;
        write_plus_arg4 wp_data<>;
};


Result

struct write_response4 {
        stateid4        wr_callback_id<1>;
        count4          wr_count;
        stable_how4     wr_committed;
        verifier4       wr_writeverf;
};

union WRITE_PLUS4res switch (nfsstat4 wp_status) {
case NFS4_OK:
        write_response4         wp_resok4;
default:
        void;
};


Sync Client

  • Fill in the fallocate field in the struct file_operations for NFS v4
    • Return -EOPNOTSUPP for NFS < 4.2
  • Punch holes one at a time through the system call
    • The spec allows many holes per WRITE_PLUS, but the system call only allows one per call.
    • Simpler design, whoo!
  • Create an nfs42_proc_fallocate()
    • Check the flags provided by the VFS
    • If zero, set di_allocated to true
    • If FALLOC_FL_PUNCH_HOLE is set, set di_allocated to false
  • Would setting wp_stable to FILE_SYNC4 disable the callback from the server?
  • Send the compound:
        SEQUENCE
        PUTFH
        WRITE_PLUS
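
The flag check in the list above can be sketched in plain userspace C. This is only an illustration of the mapping, not the kernel code: fallocate_to_data_info() is a hypothetical helper name, the struct is a hand-written mirror of the XDR data_info4, and rejecting any mode other than 0 or a hole punch with -EOPNOTSUPP is an assumption rather than something these notes specify.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/falloc.h>   /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */

/* Userspace mirror of the XDR data_info4 structure (illustration only). */
struct data_info4 {
    uint64_t di_offset;
    uint64_t di_length;
    bool     di_allocated;
};

/*
 * Hypothetical helper: translate fallocate(2)-style arguments into the
 * single hole that one WRITE_PLUS call would carry.
 * Returns 0 on success or a negative errno value.
 */
static int fallocate_to_data_info(int mode, int64_t offset, int64_t len,
                                  struct data_info4 *hole)
{
    const int punch = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;

    assert(hole != NULL);

    if (offset < 0 || len <= 0)
        return -EINVAL;

    if (mode == 0) {
        /* Plain allocation request: zeroed, allocated blocks. */
        hole->di_allocated = true;
    } else if ((mode & FALLOC_FL_PUNCH_HOLE) && !(mode & ~punch)) {
        /* Hole punch: ask the server to deallocate the blocks. */
        hole->di_allocated = false;
    } else {
        /* Assumption: every other flag combination is rejected. */
        return -EOPNOTSUPP;
    }

    hole->di_offset = (uint64_t)offset;
    hole->di_length = (uint64_t)len;
    return 0;
}
```

A caller would fill one data_info4 per system call, for example fallocate_to_data_info(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, 4096, 8192, &hole), and then encode it as the only element of wp_data.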


Sync Server

  • Define operation to use the same server flags as WRITE
  • Call do_fallocate() once the request arrives
    • Need to return an error if the underlying filesystem doesn't support hole punching (check for -EOPNOTSUPP)
  • Check the di_allocated flag
    • If true, call do_fallocate() with mode = 0
    • If false, call do_fallocate() with mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
  • Decode list, process everything in order.
  • Return NFS4ERR_UNION_NOTSUPP for data_content4 != NFS4_CONTENT_HOLE
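
The server-side steps above could look roughly like the following userspace sketch. It only computes the fallocate mode for each list entry and does not touch a file. The names process_write_plus() and hole_to_fallocate_mode() are hypothetical, the flattened struct stands in for the XDR union, and the numeric value of NFS4ERR_UNION_NOTSUPP is the one used in the later published NFSv4.2 specification, so treat it as an assumption for this draft.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/falloc.h>   /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */

enum data_content4 {
    NFS4_CONTENT_DATA = 0,
    NFS4_CONTENT_APP_DATA_HOLE = 1,
    NFS4_CONTENT_HOLE = 2
};

struct data_info4 {
    uint64_t di_offset;
    uint64_t di_length;
    bool     di_allocated;
};

/* Flattened stand-in for the XDR union write_plus_arg4. */
struct write_plus_arg4 {
    enum data_content4 wpa_content;
    struct data_info4  wpa_hole;    /* valid only for NFS4_CONTENT_HOLE */
};

/* Status codes for the sketch; 10090 is an assumed value (see lead-in). */
enum nfsstat4_sketch {
    NFS4_OK = 0,
    NFS4ERR_UNION_NOTSUPP = 10090
};

/* di_allocated == true -> mode 0, false -> punch a hole, keep file size. */
static int hole_to_fallocate_mode(const struct data_info4 *hole)
{
    return hole->di_allocated
        ? 0
        : (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE);
}

/*
 * Walk the decoded wp_data list in order and record the mode that would be
 * passed to do_fallocate() for each entry. Stops at the first entry whose
 * content type is not NFS4_CONTENT_HOLE.
 */
static enum nfsstat4_sketch
process_write_plus(const struct write_plus_arg4 *args, size_t nargs, int *modes)
{
    assert(args != NULL && modes != NULL);

    for (size_t i = 0; i < nargs; i++) {
        if (args[i].wpa_content != NFS4_CONTENT_HOLE)
            return NFS4ERR_UNION_NOTSUPP;
        modes[i] = hole_to_fallocate_mode(&args[i].wpa_hole);
    }
    return NFS4_OK;
}
```

In a real server the loop body would call do_fallocate() with the computed mode and translate -EOPNOTSUPP from the underlying filesystem into an NFS error; that translation is left out here because these notes do not say which status should be returned.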


Async Client

  • Similar to COPY
    • Check for an offload stateid, add to offload list
  • Use wait_for_completion() before returning


Async Server

  • Similar to COPY
  • Run the entire operation later through a work queue
  • Send CB_OFFLOAD to the client when complete
  • Don't submit this patch since OFFLOAD_STATUS and OFFLOAD_ABORT aren't implemented (yet?).


Testing

  • XFS test #255 tests generic fallocate hole punching
  • XFS test #285 should run tests 9 and 10 once fallocate support is added
  • XFS test #316 tests fallocate hole punching without unwritten extents
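
Separately from the XFS tests, a quick userspace check of the "reads MUST return zeros" rule might look like the sketch below. It is not part of any test suite: check_punch_hole() is a hypothetical helper, the 4096-byte region size is arbitrary, and treating ENOSYS like EOPNOTSUPP is an assumption. Pointing it at a directory on an NFS mount would exercise the client path once fallocate support exists.

```c
#define _GNU_SOURCE         /* for fallocate() and FALLOC_FL_* in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Write three 4096-byte regions of 0xab into a temporary file in `dir`,
 * punch out the middle one, and verify that it reads back as zeros while
 * its neighbours are untouched.
 * Returns 1 if the check passes, 0 if the filesystem does not support hole
 * punching, and -1 on any other failure.
 */
static int check_punch_hole(const char *dir)
{
    char path[4096];
    unsigned char buf[3 * 4096];
    int fd, n, ret = -1;

    assert(dir != NULL);

    n = snprintf(path, sizeof(path), "%s/punchXXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);           /* file disappears when fd is closed */

    memset(buf, 0xab, sizeof(buf));
    if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
        goto out;

    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  4096, 4096) < 0) {
        ret = (errno == EOPNOTSUPP || errno == ENOSYS) ? 0 : -1;
        goto out;
    }

    if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
        goto out;

    ret = 1;
    for (size_t i = 0; i < sizeof(buf); i++) {
        unsigned char expect = (i >= 4096 && i < 8192) ? 0x00 : 0xab;
        if (buf[i] != expect) {
            ret = -1;
            break;
        }
    }
out:
    close(fd);
    return ret;
}
```

Because FALLOC_FL_KEEP_SIZE is set, the file length stays at 12288 bytes, so the final pread() still returns the full buffer.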