Crashes after stacktraces about flush and kswapd

From Linux NFS

Revision as of 19:08, 22 October 2010 by Amschuma (Talk | contribs)

About

  • Kernel version: 2.6.32, 2.6.33
  • bug 15552
  • bug 15578
  • Reported by: a.radke@arcor.de (March 17, 2010), lkolbe@techfak.uni-bielefeld.de (March 19, 2010)
  • Fixed by: Trond Myklebust (March 19, 2010)

Symptoms

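  • The kernel log shows kswapd and a flush thread blocked in uninterruptible sleep (state D), both waiting in nfs_wb_page() after being called from nfs_release_page(). The traces look similar to the following: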
Mar 11 06:45:14 river kernel: [40200.628071] kswapd0       D 0000000000000002     0    47      2 0x00000000
Mar 11 06:45:14 river kernel: [40200.628076]  ffff88022f073880 0000000000000046 0000000000000000 ffffffff810114ce
Mar 11 06:45:14 river kernel: [40200.628080]  ffffffffb468199e 000000000000f8a0 ffff88022c8e3fd8 00000000000155c0
Mar 11 06:45:14 river kernel: [40200.628084]  00000000000155c0 ffff88022f135bd0 ffff88022f135ec8 0000000100000000
Mar 11 06:45:14 river kernel: [40200.628088] Call Trace:
Mar 11 06:45:14 river kernel: [40200.628112]  [<ffffffff810114ce>] ? common_interrupt+0xe/0x13
Mar 11 06:45:14 river kernel: [40200.628116]  [<ffffffff81098e5e>] ? delayacct_end+0x74/0x7f
Mar 11 06:45:14 river kernel: [40200.628129]  [<ffffffffa038dc38>] ? nfs_wait_bit_uninterruptible+0x0/0xd [nfs]
Mar 11 06:45:14 river kernel: [40200.628134]  [<ffffffff812ee03d>] ? io_schedule+0x73/0xb7
Mar 11 06:45:14 river kernel: [40200.628140]  [<ffffffffa038dc41>] ? nfs_wait_bit_uninterruptible+0x9/0xd [nfs]
Mar 11 06:45:14 river kernel: [40200.628143]  [<ffffffff812ee53d>] ? __wait_on_bit+0x41/0x70
Mar 11 06:45:14 river kernel: [40200.628148]  [<ffffffff8118a10f>] ? __lookup_tag+0xad/0x11b
Mar 11 06:45:14 river kernel: [40200.628154]  [<ffffffffa038dc38>] ? nfs_wait_bit_uninterruptible+0x0/0xd [nfs]
Mar 11 06:45:14 river kernel: [40200.628157]  [<ffffffff812ee5d7>] ? out_of_line_wait_on_bit+0x6b/0x77
Mar 11 06:45:14 river kernel: [40200.628161]  [<ffffffff81064a64>] ? wake_bit_function+0x0/0x23
Mar 11 06:45:14 river kernel: [40200.628168]  [<ffffffffa0391bc3>] ? nfs_sync_mapping_wait+0xfa/0x227 [nfs]
Mar 11 06:45:14 river kernel: [40200.628175]  [<ffffffffa0391d84>] ? nfs_wb_page+0x94/0xc3 [nfs]
Mar 11 06:45:14 river kernel: [40200.628179]  [<ffffffff810b4968>] ? __remove_from_page_cache+0x33/0xb6
Mar 11 06:45:14 river kernel: [40200.628185]  [<ffffffffa0384e54>] ? nfs_release_page+0x3a/0x57 [nfs]
Mar 11 06:45:14 river kernel: [40200.628189]  [<ffffffff810bd244>] ? shrink_page_list+0x481/0x617
Mar 11 06:45:14 river kernel: [40200.628192]  [<ffffffff8101166e>] ? apic_timer_interrupt+0xe/0x20
Mar 11 06:45:14 river kernel: [40200.628195]  [<ffffffff810bc317>] ? isolate_pages_global+0x1a0/0x20f
Mar 11 06:45:14 river kernel: [40200.628198]  [<ffffffff810bdaf1>] ? shrink_list+0x44a/0x725
Mar 11 06:45:14 river kernel: [40200.628206]  [<ffffffffa0276828>] ? jbd2_journal_release_jbd_inode+0x55/0x10e [jbd2]
Mar 11 06:45:14 river kernel: [40200.628211]  [<ffffffff810e2fd7>] ? add_partial+0x11/0x58
Mar 11 06:45:14 river kernel: [40200.628214]  [<ffffffff810be04c>] ? shrink_zone+0x280/0x342
Mar 11 06:45:14 river kernel: [40200.628216]  [<ffffffff810be24f>] ? shrink_slab+0x141/0x153
Mar 11 06:45:14 river kernel: [40200.628219]  [<ffffffff810bea71>] ? kswapd+0x4b9/0x683
Mar 11 06:45:14 river kernel: [40200.628222]  [<ffffffff810bc177>] ? isolate_pages_global+0x0/0x20f
Mar 11 06:45:14 river kernel: [40200.628224]  [<ffffffff81064a36>] ? autoremove_wake_function+0x0/0x2e
Mar 11 06:45:14 river kernel: [40200.628227]  [<ffffffff810be5b8>] ? kswapd+0x0/0x683
Mar 11 06:45:14 river kernel: [40200.628229]  [<ffffffff81064769>] ? kthread+0x79/0x81
Mar 11 06:45:14 river kernel: [40200.628232]  [<ffffffff81011baa>] ? child_rip+0xa/0x20
Mar 11 06:45:14 river kernel: [40200.628234]  [<ffffffff810646f0>] ? kthread+0x0/0x81
Mar 11 06:45:14 river kernel: [40200.628237]  [<ffffffff81011ba0>] ? child_rip+0x0/0x20
Mar 11 06:45:14 river kernel: [40200.628409] flush-0:24    D 0000000000000002     0  4682      2 0x00000000
Mar 11 06:45:14 river kernel: [40200.628412]  ffff88022f0754c0 0000000000000046 0000000000000000 ffff88020bfbf1ac
Mar 11 06:45:14 river kernel: [40200.628415]  0000000000000000 000000000000f8a0 ffff88020bfbffd8 00000000000155c0
Mar 11 06:45:14 river kernel: [40200.628418]  00000000000155c0 ffff880203973170 ffff880203973468 0000000200000000
Mar 11 06:45:14 river kernel: [40200.628421] Call Trace:
Mar 11 06:45:14 river kernel: [40200.628423]  [<ffffffff81098e5e>] ? delayacct_end+0x74/0x7f
Mar 11 06:45:14 river kernel: [40200.628430]  [<ffffffffa038dc38>] ? nfs_wait_bit_uninterruptible+0x0/0xd [nfs]
Mar 11 06:45:14 river kernel: [40200.628433]  [<ffffffff812ee03d>] ? io_schedule+0x73/0xb7
Mar 11 06:45:14 river kernel: [40200.628439]  [<ffffffffa038dc41>] ? nfs_wait_bit_uninterruptible+0x9/0xd [nfs]
Mar 11 06:45:14 river kernel: [40200.628442]  [<ffffffff812ee53d>] ? __wait_on_bit+0x41/0x70
Mar 11 06:45:14 river kernel: [40200.628445]  [<ffffffff8118a10f>] ? __lookup_tag+0xad/0x11b
Mar 11 06:45:14 river kernel: [40200.628451]  [<ffffffffa038dc38>] ? nfs_wait_bit_uninterruptible+0x0/0xd [nfs]
Mar 11 06:45:14 river kernel: [40200.628454]  [<ffffffff812ee5d7>] ? out_of_line_wait_on_bit+0x6b/0x77
Mar 11 06:45:14 river kernel: [40200.628456]  [<ffffffff81064a64>] ? wake_bit_function+0x0/0x23
Mar 11 06:45:14 river kernel: [40200.628464]  [<ffffffffa0391bc3>] ? nfs_sync_mapping_wait+0xfa/0x227 [nfs]
Mar 11 06:45:14 river kernel: [40200.628471]  [<ffffffffa0391d84>] ? nfs_wb_page+0x94/0xc3 [nfs]
Mar 11 06:45:14 river kernel: [40200.628473]  [<ffffffff810b4968>] ? __remove_from_page_cache+0x33/0xb6
Mar 11 06:45:14 river kernel: [40200.628479]  [<ffffffffa0384e54>] ? nfs_release_page+0x3a/0x57 [nfs]
Mar 11 06:45:14 river kernel: [40200.628482]  [<ffffffff810bd244>] ? shrink_page_list+0x481/0x617
Mar 11 06:45:14 river kernel: [40200.628486]  [<ffffffff812564c7>] ? sch_direct_xmit+0x7f/0x14c
Mar 11 06:45:14 river kernel: [40200.628489]  [<ffffffff810bc317>] ? isolate_pages_global+0x1a0/0x20f
Mar 11 06:45:14 river kernel: [40200.628492]  [<ffffffff810bdaf1>] ? shrink_list+0x44a/0x725
Mar 11 06:45:14 river kernel: [40200.628495]  [<ffffffff810b9dfc>] ? determine_dirtyable_memory+0xd/0x1d
Mar 11 06:45:14 river kernel: [40200.628498]  [<ffffffff810b9e74>] ? get_dirty_limits+0x1d/0x259
Mar 11 06:45:14 river kernel: [40200.628500]  [<ffffffff810be04c>] ? shrink_zone+0x280/0x342
Mar 11 06:45:14 river kernel: [40200.628504]  [<ffffffff810c63ac>] ? zone_statistics+0x3c/0x5d
Mar 11 06:45:14 river kernel: [40200.628507]  [<ffffffff810bf110>] ? try_to_free_pages+0x232/0x38e
Mar 11 06:45:14 river kernel: [40200.628510]  [<ffffffff810bc177>] ? isolate_pages_global+0x0/0x20f
Mar 11 06:45:14 river kernel: [40200.628512]  [<ffffffff810b92c5>] ? __alloc_pages_nodemask+0x3bb/0x5ce
Mar 11 06:45:14 river kernel: [40200.628516]  [<ffffffff810e5190>] ? new_slab+0x42/0x1ca
Mar 11 06:45:14 river kernel: [40200.628519]  [<ffffffff810e5508>] ? __slab_alloc+0x1f0/0x39b
Mar 11 06:45:14 river kernel: [40200.628526]  [<ffffffffa03924c9>] ? nfs_writedata_alloc+0x74/0x98 [nfs]
Mar 11 06:45:14 river kernel: [40200.628530]  [<ffffffff810e67eb>] ? __kmalloc+0xf1/0x141
Mar 11 06:45:14 river kernel: [40200.628532]  [<ffffffff8118a10f>] ? __lookup_tag+0xad/0x11b
Mar 11 06:45:14 river kernel: [40200.628539]  [<ffffffffa03924c9>] ? nfs_writedata_alloc+0x74/0x98 [nfs]
Mar 11 06:45:14 river kernel: [40200.628546]  [<ffffffffa03924c9>] ? nfs_writedata_alloc+0x74/0x98 [nfs]
Mar 11 06:45:14 river kernel: [40200.628553]  [<ffffffffa0392501>] ? nfs_flush_one+0x14/0xce [nfs]
Mar 11 06:45:14 river kernel: [40200.628559]  [<ffffffffa038daf0>] ? nfs_pageio_doio+0x2a/0x51 [nfs]
Mar 11 06:45:14 river kernel: [40200.628566]  [<ffffffffa038dbdc>] ? nfs_pageio_add_request+0xc5/0xd5 [nfs]
Mar 11 06:45:14 river kernel: [40200.628573]  [<ffffffffa0390dfd>] ? nfs_do_writepage+0x100/0x122 [nfs]
Mar 11 06:45:14 river kernel: [40200.628580]  [<ffffffffa0391316>] ? nfs_writepages_callback+0xf/0x21 [nfs]
Mar 11 06:45:14 river kernel: [40200.628583]  [<ffffffff810b9b69>] ? write_cache_pages+0x20b/0x327
Mar 11 06:45:14 river kernel: [40200.628590]  [<ffffffffa0391307>] ? nfs_writepages_callback+0x0/0x21 [nfs]
Mar 11 06:45:14 river kernel: [40200.628597]  [<ffffffffa03912c6>] ? nfs_writepages+0xef/0x130 [nfs]
Mar 11 06:45:14 river kernel: [40200.628604]  [<ffffffffa03924ed>] ? nfs_flush_one+0x0/0xce [nfs]
Mar 11 06:45:14 river kernel: [40200.628607]  [<ffffffff81064957>] ? bit_waitqueue+0x10/0xa0
Mar 11 06:45:14 river kernel: [40200.628611]  [<ffffffff8110637e>] ? writeback_single_inode+0xe7/0x2da
Mar 11 06:45:14 river kernel: [40200.628614]  [<ffffffff81107057>] ? writeback_inodes_wb+0x423/0x4fe
Mar 11 06:45:14 river kernel: [40200.628617]  [<ffffffff8110725e>] ? wb_writeback+0x12c/0x1ab
Mar 11 06:45:14 river kernel: [40200.628620]  [<ffffffff811073f2>] ? wb_do_writeback+0x73/0x15b
Mar 11 06:45:14 river kernel: [40200.628622]  [<ffffffff8110750b>] ? bdi_writeback_task+0x31/0x9d
Mar 11 06:45:14 river kernel: [40200.628626]  [<ffffffff810c7c1e>] ? bdi_start_fn+0x0/0xca
Mar 11 06:45:14 river kernel: [40200.628628]  [<ffffffff810c7c8e>] ? bdi_start_fn+0x70/0xca
Mar 11 06:45:14 river kernel: [40200.628631]  [<ffffffff810c7c1e>] ? bdi_start_fn+0x0/0xca
Mar 11 06:45:14 river kernel: [40200.628633]  [<ffffffff81064769>] ? kthread+0x79/0x81
Mar 11 06:45:14 river kernel: [40200.628636]  [<ffffffff81011baa>] ? child_rip+0xa/0x20
Mar 11 06:45:14 river kernel: [40200.628638]  [<ffffffff810646f0>] ? kthread+0x0/0x81
Mar 11 06:45:14 river kernel: [40200.628640]  [<ffffffff81011ba0>] ? child_rip+0x0/0x20

Cause

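  • A deadlock in nfs_release_page(): the NFS client attempted to free a page, which waits for writeback in nfs_wb_page(), even when the caller had not set __GFP_FS. In the flush trace above, an allocation in the NFS write path enters memory reclaim, which calls back into nfs_release_page() and then waits on NFS writeback.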

Resolution

commit d812e575822a2b7ab1a7cadae2571505ec6ec2bd
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Fri Mar 19 13:55:17 2010 -0400

    NFS: Prevent another deadlock in nfs_release_page()
    
    We should not attempt to free the page if __GFP_FS is not set. Otherwise we
    can deadlock as per
    
      http://bugzilla.kernel.org/show_bug.cgi?id=15578
    
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    Cc: stable@kernel.org
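
The rule stated in the commit message can be illustrated with a small user-space model. This is only a sketch, not the kernel patch: the SKETCH_GFP_* flag values, struct sketch_page and both functions are hypothetical names invented for illustration. It shows a release routine that refuses to wait for writeback unless the caller has set the flag that permits calling back into the filesystem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; the kernel's real __GFP_* values differ. */
#define SKETCH_GFP_WAIT 0x1u
#define SKETCH_GFP_IO   0x2u
#define SKETCH_GFP_FS   0x4u

/* Hypothetical model of a page-cache page with an NFS write outstanding. */
struct sketch_page {
    bool write_pending;  /* a write for this page has not completed yet */
    int  waits;          /* how many times a caller waited for writeback */
};

/* Models the blocking wait that appears as nfs_wb_page() in the traces. */
void sketch_wait_for_writeback(struct sketch_page *page)
{
    page->waits++;
    page->write_pending = false;
}

/*
 * Models a guarded release path: returns true if the page may be freed.
 * Without SKETCH_GFP_FS the caller may already be inside filesystem
 * writeback, so waiting here could deadlock; refuse instead of waiting.
 */
bool sketch_release_page(struct sketch_page *page, unsigned int gfp)
{
    assert(page != NULL);

    if (!(gfp & SKETCH_GFP_FS))
        return false;

    if (page->write_pending)
        sketch_wait_for_writeback(page);

    return true;
}
```

In this model, calling sketch_release_page() with only SKETCH_GFP_WAIT | SKETCH_GFP_IO leaves the page untouched and reports that it cannot be freed, while adding SKETCH_GFP_FS allows the wait and the release. Returning early keeps the reclaim path from sleeping on writeback that the caller itself may be responsible for completing.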
