Cthon06 Meeting Notes
From Linux NFS
Linux pNFS Implementation meeting at Connectathon 2006
Note that these are raw, unprocessed notes!
pNFS client: walk through mount, open, and I/O WRT pNFS operations. LD = layout driver
0) register LD
1) MOUNT
a) FS_LAYOUT_TYPES - currently only one
b) Match returned types to LD
c) register SB for pNFS private area use
d) LD -> client GETDEVICELIST
ISSUE: why do GETDEVICELIST at mount? current client doesn't have a clientid until first open - device list might change
- PVFS2 doesn't use GETDEVICELIST
- file layout could wait until open
- block (object) will call immediately to determine connectivity
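The mount steps above can be sketched as a small user-space model. This is only an illustration, not the kernel code: the names `LayoutDriver`, `register_layout_driver`, `pnfs_mount`, the `devicelist_at_mount` flag, and the layout type numbers are all assumptions made for the sketch. Step 1c (the superblock private area) is not modelled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Layout type numbers: illustrative values chosen for this sketch.
LAYOUT_FILE = 1
LAYOUT_OBJECT = 2
LAYOUT_BLOCK = 3


@dataclass
class LayoutDriver:
    name: str
    layout_type: int
    # Block/object drivers want the device list at mount to check
    # connectivity; a file layout driver could wait until open.
    devicelist_at_mount: bool


# Hypothetical registry kept by the generic pNFS client code.
_drivers: Dict[int, LayoutDriver] = {}


def register_layout_driver(ld: LayoutDriver) -> None:
    """Step 0: make a layout driver known to the generic pNFS client."""
    _drivers[ld.layout_type] = ld


def pnfs_mount(
    server_layout_types: List[int],
    getdevicelist: Callable[[], List[str]],
) -> Tuple[Optional[LayoutDriver], Optional[List[str]]]:
    """Steps 1a, 1b, 1d: match server layout types to a registered driver."""
    for layout_type in server_layout_types:
        ld = _drivers.get(layout_type)
        if ld is not None:
            # Only drivers that asked for it trigger GETDEVICELIST at mount.
            devices = getdevicelist() if ld.devicelist_at_mount else None
            return ld, devices
    return None, None  # no match: plain NFS through the MDS


register_layout_driver(LayoutDriver("file", LAYOUT_FILE, devicelist_at_mount=False))
register_layout_driver(LayoutDriver("block", LAYOUT_BLOCK, devicelist_at_mount=True))
```

Letting each driver declare whether it needs the device list at mount is one way to cover both positions in the ISSUE: the block driver gets its connectivity check, and the file driver defers until a clientid exists.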
2) LAYOUTGET
a) LAYOUTGET in OPEN compound? opportunity for LD to say whether or not to do LAYOUTGET
ISSUE:
- might already have a layout
- blocks case OPEN references hardlink? one open for write, one open for read
b) currently waits until first IO: LD -> LAYOUTGET
QUESTION: file size - do we get a layout or not?
trond - client. when the VM asks for IO, it is on a per-page or larger-extent basis. NFS client tracks this because it decides whether it can coalesce IO or not. obvious to NFS whether or not to call into pNFS
marc - wants server policy, e.g. ask server
garth - what if you already have a layout?
no way for server to communicate.
brent - ask client layout driver
trond - nfs_flush_list(): list of requests, wsize requests on wire. in pNFS case call into LD to see if it wants to deal with it.
brent: who sets up parameters on LAYOUTGET?
dean: client asks for IO size, server decides layout size
trond: 5 byte file, why should client ask for a layout?
benny: load distribution (10,000 5 byte files going through MDS)
trond/bruce: MDS needs to keep state resources for each layout
brent/marc: cache small files in data server, need to redirect
ask MDS prior to getting layout, or
garth: IO threshold for pNFS? per file attr for getattr?
layout driver should have some control over the range asked?
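One of the options raised above, an IO threshold below which the client does not bother with a layout, could look roughly like this. It is a sketch of the idea only: `should_call_layoutget`, `route_io`, and `threshold_bytes` are hypothetical names, and reusing an already-held layout regardless of IO size is an assumption, not something the meeting decided.

```python
def should_call_layoutget(io_bytes: int, have_layout: bool,
                          threshold_bytes: int) -> bool:
    """Return True if the client should send LAYOUTGET before this IO."""
    if io_bytes < 0 or threshold_bytes < 0:
        raise ValueError("sizes must be non-negative")
    if have_layout:
        return False  # assumption: an existing layout is simply reused
    # Small IO (e.g. a 5 byte file) is not worth the layout state on the MDS.
    return io_bytes >= threshold_bytes


def route_io(io_bytes: int, have_layout: bool, threshold_bytes: int) -> str:
    """Say where the IO would go under this threshold policy."""
    if have_layout or should_call_layoutget(io_bytes, have_layout,
                                            threshold_bytes):
        return "data servers"
    return "MDS"
```

Where the threshold comes from (a per-file attribute returned by GETATTR, a mount option, or the layout driver itself) is exactly the open question in the notes; the sketch just takes it as a parameter.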
c) where is it stored? layout is stored off inode.
ISSUE: private pointer off inode, private area, hang as much stuff off it as you want - all private to LD
trond: may want to reuse VFS locking code - new type of file_lock. don't want to add lock management code into the NFS client.
- properties of POSIX locks - coalesce locks
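The coalescing property mentioned here, where overlapping or adjacent byte ranges collapse into one as they do for POSIX locks, can be illustrated in a few lines. This is a generic range-merge sketch and not the VFS file_lock code; `coalesce_ranges` is a hypothetical helper and ranges are half-open `(start, end)` byte offsets.

```python
from typing import List, Tuple

ByteRange = Tuple[int, int]  # half-open: [start, end)


def coalesce_ranges(ranges: List[ByteRange]) -> List[ByteRange]:
    """Merge overlapping or adjacent byte ranges into a minimal sorted list."""
    merged: List[ByteRange] = []
    for start, end in sorted(ranges):
        if start >= end:
            raise ValueError("empty or inverted range")
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

A layout cache keyed by byte range would want the same behaviour, which is presumably why reusing the existing lock code is attractive.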
3) READ
two ways: standard page cache (normal NFS clients), PVFS2 way.
regular: read ahead code in VM to determine actual size. calls into nfs code to read. needs to id the layout (get the whole inode, layout pointer). gives page list. regular read
brent: any hint from file system?
trond: primitive - largest you can accept, some hooks
dean: different read ahead size for data servers
garth: read aheads span stripes by just a small amount
trond: modification to VM read-ahead code, currently reads in PAGE_SIZE
brent: preferred max matches up to the stripe - stripe aligned?
trond: always page aligned - chunk size aligned
read-ahead: application decides how much to read ahead
brent: important: ask 1 DS for 1K, ask 64 DS for 1K - same amount of time
trond: end up filling up much more of the page cache and confusing VM LRU ...
random access vrs
brent: read ahead needs chunk alignment within page alignment notion
dean: currently have an interface to ask for this now (in terms of PAGES). stripe size is a multiple of PAGE_SIZE
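The alignment discussion above can be made concrete with a small calculation. This is a sketch under the assumption stated by dean, that the stripe size is a multiple of PAGE_SIZE; `align_readahead` is a hypothetical helper and the 4096-byte page is just a common value.

```python
from typing import Tuple

PAGE_SIZE = 4096  # assumed page size for the example


def align_readahead(offset: int, length: int,
                    stripe_size: int) -> Tuple[int, int]:
    """Widen a read to whole stripes; returns (start, length) in bytes."""
    if stripe_size <= 0 or stripe_size % PAGE_SIZE != 0:
        raise ValueError("stripe size must be a positive multiple of PAGE_SIZE")
    if offset < 0 or length <= 0:
        raise ValueError("need a non-negative offset and a positive length")
    start = (offset // stripe_size) * stripe_size            # round down
    end = -(-(offset + length) // stripe_size) * stripe_size  # round up
    return start, end - start


def pages_in(length: int) -> int:
    """Number of pages covering a stripe-aligned length."""
    return length // PAGE_SIZE
```

With a 64K stripe, a 20-byte read that straddles a stripe boundary widens to two full stripes (32 pages), which is the case garth describes where a read-ahead spans stripes by just a small amount, and also the page cache cost trond is worried about.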
trond: O_DIRECT needs some consideration.
dean: can't get more than a half meg out of the VM
d) last void * pointer in the read/write interface. lots of NFS code and then needs to call LD, so used to need type checking: declare struct blah; in PVFS2
trond: setting it up as a cookie is done when you have two or more that need it, not when you have one.
ISSUE:
4) WRITE
a) uses standard NFS coalesce code (stripe size) to construct wsize chunks
different for O_DIRECT - map user mem into VM space, not file system specific.
dichotomy - like to convert the O_DIRECT AIO support only for O_DIRECT. normal code: writes go into the VM and return. generic AIO code hacky - waiting for locks, etc
b) sent to write function. write page list
c) marks pages for commit: close, fsync, memory pressure
d) instead of calling regular commit, calls LD commit. LD commit - to data servers.
ISSUE: if not using NFS code for commits (e.g. your own), when can you clear pages? LD needs to clear its own pages on the commit operation
5) LAYOUTCOMMIT:
called on fsync, stat calls, close, lock, locku, on trunc setattr? triggered by user land call.
ISSUE: garth: preferred that it is done prior to any getattr, but getattr is always called.
benny: only stat system call getattr.
don't overwrite the file size on client on getattr if we haven't sent LAYOUTCOMMIT.
added flag to inode structure: writes have happened, no LAYOUTCOMMIT.
set flag when writes are issued. used to trigger a LAYOUTCOMMIT on fsync, etc
and then cleared.
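The flag handling described above can be modelled in user space as follows. This is a sketch, not the kernel inode structure: `PnfsInode`, `layoutcommit_needed`, and the method names are hypothetical, and the LAYOUTCOMMIT itself is reduced to a callback that receives the new file size.

```python
from typing import Callable


class PnfsInode:
    """Toy inode with a 'writes happened, no LAYOUTCOMMIT yet' flag."""

    def __init__(self, size: int = 0) -> None:
        self.size = size
        self.layoutcommit_needed = False

    def write(self, offset: int, length: int) -> None:
        # Set the flag when writes are issued; track the size locally.
        self.size = max(self.size, offset + length)
        self.layoutcommit_needed = True

    def update_from_getattr(self, server_size: int) -> None:
        # The MDS does not know about data-server writes until
        # LAYOUTCOMMIT, so its size must not clobber the local one.
        if not self.layoutcommit_needed:
            self.size = server_size

    def fsync(self, send_layoutcommit: Callable[[int], None]) -> None:
        # fsync (and close, etc.) triggers LAYOUTCOMMIT, then clears the flag.
        if self.layoutcommit_needed:
            send_layoutcommit(self.size)
            self.layoutcommit_needed = False
```

For example, after `write(100, 50)` on a 100-byte file, a GETATTR reply that still says 100 bytes leaves the local size at 150 until `fsync` has sent the commit.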
ISSUE: which creds to use? add a pointer to the write creds, bump counter. use nfsopen context
ISSUE: who constructs LAYOUTCOMMIT - currently generic pNFS. needs to be an LD call because of different byte ranges etc.
LD->LAYOUTCOMMIT send an array of commits. block layout uses layout update structure, object
Small files
currently three round trips OPEN READ CLOSE whole file current stateid = down to one