Revision as of 22:37, 19 May 2005

Section III

ROBUSTNESS TESTING

Basic stability assessments

ID	test	tool	status	owner	notes
III.A.1	Run IOzone for 2 weeks on basic client/server operations, using: both data and metadata options; cached and direct I/O; various mount options	IOzone	Done	BULL	Now testing with fsstress and FFSB
III.A.2	Run automounter use case for 2 weeks on amd, autofs, and autong, using: large number of maps; randomly mount and run workloads on an automounted partition; use a variety of workloads, such as randomly chosen fs tests	e.g. Crashme (http://people.delphiforums.com/gjc/crashme.html)	New
III.A.3	Run NFS server for 2 weeks with random configuration changes, using: interrupt server in various ways (reboot, power cycle, LAN failure); change/reexport export rules at random; trigger a client workload at arbitrary times; analyze client recovery behaviors		Open	OSDL
III.A.4	Run Connectathon locking tests against NFS server for 2 weeks, using: multiple client machines; reboot at random; analyze client cache coherency behaviors; analyze locking behaviors		New
III.A.5	Run fsstress for 2 weeks on basic client/server operations, using: long list of random operations (1000 operations); high number of processes (100)	fsstress	Done	BULL	1 week
III.A.6	Run FFSB for 1 day on basic client/server operations in stress configuration, using: 1,200,000 files; 100 directories	ffsb	Done	BULL	1 day
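
The random interruptions in III.A.3 are easier to analyze when the exact sequence can be replayed after a failure. A minimal sketch of a seeded schedule generator follows. The event names, the exponential spacing between events, and the function name build_schedule are illustrative assumptions and not part of the test plan.

```python
import random

# Hypothetical event names, loosely based on the III.A.3 list.
EVENTS = ["reboot", "power_cycle", "lan_failure", "reexport", "client_workload"]


def build_schedule(duration_s, mean_gap_s, seed):
    """Return a time-ordered list of (offset_seconds, event) pairs.

    Gaps between events are drawn from an exponential distribution with
    mean mean_gap_s. The generator is seeded so a run can be replayed.
    """
    rng = random.Random(seed)
    schedule = []
    t = rng.expovariate(1.0 / mean_gap_s)
    while t < duration_s:
        schedule.append((int(t), rng.choice(EVENTS)))
        t += rng.expovariate(1.0 / mean_gap_s)
    return schedule
```

For a two-week run with an event roughly every six hours, one might call build_schedule(14 * 24 * 3600, 6 * 3600, seed=2005) and log the seed alongside the test results.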

Resource limit testing

ID	test	tool	status	owner	notes
III.B.1	Test stability of client in an out-of-PID situation
III.B.2	Test stability of client in an out-of-memory situation	valgrind	New
III.B.3	Test stability of client when the server is out of disk space	dd, fsstress	Done	Bull	Simple error message: "No space left on device"
III.B.4	Test stability of client in an out-of-inode situation
III.B.5	Test stability of client in an out-of-swap-space situation
III.B.6	Test stability of server in an out-of-PID situation
III.B.7	Test stability of server in an out-of-memory situation	valgrind	New
III.B.8	Test stability of server in an out-of-disk-space situation	dd, fsstress	Done	Bull	Simple error message: "No space left on device"
III.B.9	Test stability of server in an out-of-inode situation
III.B.10	Test stability of server in an out-of-swap-space situation
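
The out-of-disk-space cases (III.B.3, III.B.8) expect a clean "No space left on device" error rather than a hang or crash. A minimal sketch of a check for that outcome follows. The helper name fill_until_enospc and its parameters are assumptions for illustration; the matrix itself lists dd and fsstress as the tools actually used.

```python
import errno


def fill_until_enospc(write, chunk, max_chunks):
    """Call write(chunk) up to max_chunks times.

    Returns (bytes_written, hit_enospc). Only ENOSPC counts as the expected
    clean failure; any other OSError is re-raised as unexpected.
    Assumes each successful write() call writes the whole chunk.
    """
    written = 0
    for _ in range(max_chunks):
        try:
            write(chunk)
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                return written, True
            raise
        written += len(chunk)
    return written, False
```

In practice write could be the write method of a file opened on the NFS mount. Because NFS clients cache writes, the error may only surface at fsync or close, so a real test would probably need to check those calls as well.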

Stress load testing

ID	test	tool	status	owner	notes
III.C.1	Run stress tools in a standard configuration on each release	fsx, fsstress, ffsb	In progress	BULL	fsstress and ffsb are run for 1 hour
III.C.2	Analyze load balancing, failure modes, etc. under different stress loads		New
III.C.3	Destructive testing by measuring point of failure for various loads		New

Scalability (robustness)

ID	test	tool	status	owner
III.D.1	Find maximum number of connections to Linux IA-32 server	fsstress, fsx	New	Bull (partial)
III.D.2	Find maximum number of files for Linux IA-32 exported file system	ffsb	New
III.D.3	Find maximum file size on Linux IA-32		New
III.D.4	Find maximum number of mounted file systems on client	fsstress, fsx	New	Bull
III.D.5	Test robustness on NUMA when scaling CPU, memory, NIC, or disk count	fsstress, fsx	New
III.D.6	Test robustness on SMP when scaling CPU, memory, NIC, or disk count	fsstress, fsx	New	Bull (partial)
III.D.7	Test correctness of NFS client when backed by a large (>100GB) cachefs		New
III.D.8	Find maximum number of exported file systems on server		New
III.D.9	Find maximum size of exported file systems on server		New
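
Several of the scalability items ask for a maximum (connections, files, mounted file systems, exports). One way to locate such a limit without testing every value is to double the load until a trial fails and then bisect. The sketch below assumes the trial is monotonic, meaning it succeeds for every size up to the limit and fails beyond it; find_max and works are hypothetical names, and works(n) stands in for whatever sets up n connections, mounts, or exports and reports success.

```python
def find_max(works, upper_bound):
    """Return the largest n in [0, upper_bound] for which works(n) is true.

    Assumes works is monotonic: true up to some limit, false beyond it.
    Returns 0 if works(1) already fails.
    """
    lo, hi = 0, 1
    # Phase 1: double the size until a trial fails or the cap is passed.
    while hi <= upper_bound and works(hi):
        lo, hi = hi, hi * 2
    hi = min(hi, upper_bound + 1)
    # Phase 2: bisect between the last success (lo) and the first
    # known failure or the cap (hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

The number of trials grows with the logarithm of the limit, which matters when each trial involves mounting or exporting thousands of file systems. If failures are intermittent rather than monotonic, the result is only an estimate.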

Recovery from problems while under light/normal/heavy loads

ID	test	status	owner	notes
III.E.1	Test short & long term local network failure (unplugged cable, ifdown eth0, etc.)	Open	OSDL
III.E.2	Test short & long duration remote network partition	Open	OSDL
III.E.3	Test behavior during crash/reboot of server with clients holding various states	Open	OSDL	more
III.E.4	Test multiple clients using, locking, etc. the same files	New
III.E.5	Test behavior of server with failed storage device	New
III.E.6	Test behavior during crash of client with open delegations and locks	New
III.E.7	Test recovery from denied permission	New
III.E.8	Test recovery from JUKEBOX/DELAY	New
III.E.9	Test recovery from ESTALE	New
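
For the network failure and server crash cases, it can help to record how long a client takes to recover once the fault is cleared. A minimal sketch of a polling helper follows. The name wait_for_recovery, the probe callable, and the clock and sleep parameters are assumptions for illustration. Note that on a hard-mounted NFS file system an operation typically blocks instead of raising an error, so this approach fits soft mounts or probes run with their own timeout.

```python
import time


def wait_for_recovery(probe, timeout_s, interval_s=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it stops raising OSError.

    Returns the seconds elapsed until the first success, or None if
    timeout_s passes without a success. clock and sleep are injectable
    so the helper can be exercised without real delays.
    """
    start = clock()
    while True:
        try:
            probe()
            return clock() - start
        except OSError:
            if clock() - start >= timeout_s:
                return None
            sleep(interval_s)
```

A probe might be as simple as lambda: os.stat(path) on a file under the mount point, with the returned duration logged per load level (light, normal, heavy).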

Race conditions

ID	test	tool	status	owner	notes
III.F.1	Test for race conditions and locking bugs on PPC64		New	(Polyserve?)	Olaf Kirch says PPC64 is good at exposing problems because of its weak CPU cache coherency semantics
III.F.	Test for race conditions on new architectures		New	(Polyserve?)	Faster CPU, memory, and buses can expose race conditions

Automounter robustness

For more info about Automounter, see notes in nfsv4 list archive for 2/16/05

ID	test	status	notes
III.G.1	Test interruptible automounting in the following cases: indirect mount; direct mount; browsed mount; multimount offset	New
III.G.2	Test concurrent access (tests for races in automounter): have multiple threads working in parallel	New
III.G.3	Test replicated file system selection	New
III.G.4	Test remounting after expire corner cases: something (a process) sitting in the scaffolding; common case for /net	New	Needs to be supported at nfs level
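
The concurrent access item (III.G.2) calls for multiple threads working in parallel against the automounter. A minimal sketch of such a driver follows. The function name concurrent_access and the choice of stat and listdir as the operations are assumptions for illustration; a real test would likely use a wider mix of operations and several mount points.

```python
import os
import threading


def concurrent_access(path, n_threads, iterations):
    """Have n_threads stat and list `path` in parallel.

    Returns the list of OSError exceptions raised by any thread.
    """
    errors = []
    lock = threading.Lock()
    start = threading.Barrier(n_threads)  # release all threads together

    def worker():
        start.wait()
        for _ in range(iterations):
            try:
                os.stat(path)      # on an automount point this may trigger the mount
                os.listdir(path)
            except OSError as exc:
                with lock:
                    errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

The barrier is there so that all threads hit the mount point at nearly the same moment, which is when a race in mount triggering is most likely to show up. An empty result does not prove the absence of races; it only means none were observed in that run.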


Matrix robustness section

From Linux NFS
