Matrix robustness section

From Linux NFS

Revision as of 09:48, 19 May 2005

Section III

ROBUSTNESS TESTING

Basic stability assessments

III.A.1

Run iozone for 2 weeks on basic client/server operations, using:

  • Both data and metadata options
  • Cached and direct I/O
  • Various mount options

Tool: IOzone. Status: Done. Owner: BULL. Notes: Now testing with fsstress and FFSB

III.A.2

Run automounter use case for 2 weeks on amd, autofs, and autong, using:

  • Large number of maps
  • Randomly mount and run workloads on an automounted partition
  • Use a variety of workloads, such as randomly chosen fs tests

Tool: e.g. Crashme (http://people.delphiforums.com/gjc/crashme.html). Status: New. Owner: none. Notes: none

III.A.3

Run NFS server for 2 weeks with random configuration changes, using:

  • Interrupt server in various ways (reboot, power cycle, LAN failure)
  • Change/reexport export rules at random
  • Trigger a client workload at arbitrary times
  • Analyze client recovery behaviors

Status: Open. Owner: OSDL

III.A.4

Run connectathon locking tests against NFS server for 2 weeks, using:

  • Multiple client machines
  • Reboot at random
  • Analyze client cache coherency behaviors
  • Analyze locking behaviors

Status: New

III.A.5

Run fsstress for 2 weeks on basic client/server operations, using:

  • Long list of random operations (1000 operations)
  • High number of processes (100)

Tool: fsstress. Status: Done. Owner: BULL. Notes: 1 week

III.A.6

Run FFSB for 1 day on basic client/server operations in stress configuration, using:

  • 1 200 000 files
  • 100 directories

Tool: ffsb. Status: Done. Owner: BULL. Notes: 1 day
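
For III.A.3, one way to make "random" interruptions reproducible is to generate the event schedule from a seeded random number generator before the run starts. The following is a minimal sketch of that idea only. The event names, the `build_schedule` function, and the six-hour mean gap are assumptions made for illustration and are not part of the test plan; how each event is actually carried out is outside the sketch.

```python
import random

# Hypothetical event names, loosely following the III.A.3 bullet list.
EVENTS = ["reboot", "power_cycle", "lan_fail", "reexport", "client_workload"]


def build_schedule(duration_s, mean_gap_s, seed):
    """Return a time-ordered list of (offset_seconds, event) pairs.

    Gaps between events are drawn from an exponential distribution with
    the given mean, so events arrive at arbitrary times within duration_s.
    """
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    schedule = []
    t = rng.expovariate(1.0 / mean_gap_s)
    while t < duration_s:
        schedule.append((t, rng.choice(EVENTS)))
        t += rng.expovariate(1.0 / mean_gap_s)
    return schedule


TWO_WEEKS_S = 14 * 24 * 3600
plan = build_schedule(TWO_WEEKS_S, mean_gap_s=6 * 3600, seed=2005)
```

Recording the seed alongside the test results would let the same sequence of interruptions be replayed when analyzing a client recovery problem.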

Resource limit testing

  • Test stability of client when out of PIDs
  • Test stability of client when the server is out of disk space. Status: Done
  • Test stability of client when out of inodes
  • Test stability of client when out of swap space
  • Test stability of server when out of PIDs
  • Test stability of server when out of memory
  • Test stability of server when out of disk space. Status: Done
  • Test stability of server when out of inodes
  • Test stability of server when out of swap space

Stress load testing

  • Run LTP NFS fstress in a standard configuration on each release. Tools: fsx, fsstress (1 hour), ffsb (1 hour). Status: In Progress
  • Analyze load balancing, failure modes, etc. under different stress loads. Status: New
  • Perform destructive testing by measuring the point of failure for various loads. Status: New
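
Measuring the point of failure can be framed as a search over load levels. The sketch below uses bisection and assumes the system behaves monotonically, that is, once a load level fails every higher level fails too. That assumption may not hold for real NFS workloads. The names `find_failure_point` and `passes_at` are hypothetical; `passes_at` stands for whatever runs the workload at a given load and reports success.

```python
def find_failure_point(passes_at, low, high):
    """Return the largest load in [low, high] for which passes_at is True.

    Assumes monotonic behavior: if a load fails, all higher loads fail.
    Returns None when even the lowest load fails.
    """
    if not passes_at(low):
        return None
    if passes_at(high):
        return high  # no failure observed in the tested range
    good, bad = low, high  # invariant: good passes, bad fails
    while bad - good > 1:
        mid = (good + bad) // 2
        if passes_at(mid):
            good = mid
        else:
            bad = mid
    return good
```

Bisection needs roughly log2(high - low) runs, which matters when each run is a long stress test. A linear ramp would be the safer choice if the monotonic assumption is in doubt.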

Scalability (robustness)

  • Find maximum number of connections to a Linux IA-32 server. Tools: fsstress, fsx. Status: New
  • Find maximum number of files for a Linux IA-32 exported file system. Tools: fsstress, fsx
  • Find maximum file size on Linux IA-32. Tools: fsstress, fsx. Status: New
  • Find maximum number of mounted file systems on client. Tools: fsstress, fsx. Status: New
  • Test robustness on NUMA when scaling CPU, memory, NIC, or disk count. Status: New
  • Test robustness on SMP when scaling CPU, memory, NIC, or disk count. Status: New
  • Test correctness of NFS client when backed by a large (>100 GB) cachefs. Status: New
  • Find maximum number of exported file systems on server. Status: New
  • Find maximum size of exported file systems on server. Status: New

Recovery from problems while under light/normal/heavy loads

  • Test short and long term local network failure (unplugged cable, ifdown eth0, etc.). Status: Open. Owner: OSDL
  • Test short and long duration remote network partition. Status: Open. Owner: OSDL
  • Test behavior during crash/reboot of server with clients holding various states. Status: Open. Owner: OSDL
  • Test multiple clients using, locking, etc. the same files. Status: New
  • Test behavior of server with failed storage device. Status: New
  • Test behavior during crash of client with open delegations and locks. Status: New
  • Test recovery from denied permission. Status: New
  • Test recovery from JUKEBOX/DELAY. Status: New
  • Test recovery from ESTALE. Status: New
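
From user space, an ESTALE recovery test needs a workload that notices the error and tries again with a fresh file handle. The sketch below shows one such pattern, covering ESTALE only: it reopens the path on each attempt, because a stale handle cannot be reused. The function name `read_with_estale_retry`, the retry count, and the `opener` parameter (a hook so the behavior can be exercised without an NFS mount) are assumptions for illustration.

```python
import errno
import time


def read_with_estale_retry(path, retries=3, delay_s=1.0, opener=open):
    """Read a whole file, reopening it by path if ESTALE is raised.

    Any other OSError, or ESTALE on the final attempt, is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            # Reopen on every attempt to obtain a fresh file handle.
            with opener(path, "rb") as f:
                return f.read()
        except OSError as e:
            if e.errno != errno.ESTALE or attempt == retries:
                raise
            time.sleep(delay_s)
```

A test harness could count how many attempts were needed after a reexport or server restart and flag runs where the error never clears.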