linux/fs/nilfs2
Ryusuke Konishi 283ee1482f nilfs2: fix deadlock of segment constructor during recovery
According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
during recovery at mount time.  The code path that caused the deadlock was
as follows:

  nilfs_fill_super()
    load_nilfs()
      nilfs_salvage_orphan_logs()
        * Do roll-forwarding, attach segment constructor for recovery,
          and kick it.

        nilfs_segctor_thread()
          nilfs_segctor_thread_construct()
           * A lock is held with nilfs_transaction_lock()
             nilfs_segctor_do_construct()
               nilfs_segctor_drop_written_files()
                 iput()
                   iput_final()
                     write_inode_now()
                       writeback_single_inode()
                         __writeback_single_inode()
                           do_writepages()
                             nilfs_writepage()
                               nilfs_construct_dsync_segment()
                                 nilfs_transaction_lock() --> deadlock

This can happen if commit 7ef3ff2fea ("nilfs2: fix deadlock of segment
constructor over I_SYNC flag") is applied and roll-forward recovery was
performed at mount time.  The roll-forward recovery can happen if datasync
write is done and the file system crashes immediately after that.  For
instance, we can reproduce the issue with the following steps:

 < nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
 # dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
 # dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k
 count=1 && reboot -nfh
 < the system will immediately reboot >
 # mount -t nilfs2 /dev/sdb1 /nilfs

The deadlock occurs because iput() can run segment constructor through
writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags.  The
above commit changed segment constructor so that it calls iput()
asynchronously for inodes with i_nlink == 0, but that change was
imperfect.

This fixes the another deadlock by deferring iput() in segment constructor
even for the case that mount is not finished, that is, for the case that
MS_ACTIVE flag is not set.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reported-by: Yuxuan Shui <yshuiv7@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-12 18:46:08 -07:00
..
alloc.c nilfs2: implement calculation of free inodes count 2013-07-03 16:08:01 -07:00
alloc.h nilfs2: implement calculation of free inodes count 2013-07-03 16:08:01 -07:00
bmap.c nilfs2: get rid of NILFS_I_NILFS 2011-05-10 22:21:56 +09:00
bmap.h nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
btnode.c nilfs2: use mark_buffer_dirty to mark btnode or meta data dirty 2011-05-10 22:21:57 +09:00
btnode.h nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
btree.c nilfs2: fix potential memory overrun on inode 2015-02-28 09:57:51 -08:00
btree.h
cpfile.c nilfs2: verify metadata sizes read from disk 2014-04-03 16:21:26 -07:00
cpfile.h nilfs2: use iget for all metadata files 2010-10-23 09:24:38 +09:00
dat.c nilfs2: verify metadata sizes read from disk 2014-04-03 16:21:26 -07:00
dat.h nilfs2: use iget for all metadata files 2010-10-23 09:24:38 +09:00
dir.c [readdir] convert nilfs2 2013-06-29 12:56:36 +04:00
direct.c nilfs2: record used amount of each checkpoint in checkpoint list 2011-03-08 14:58:31 +09:00
direct.h
export.h nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
file.c mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub 2015-02-10 14:30:30 -08:00
gcinode.c fs: remove mapping->backing_dev_info 2015-01-20 14:03:05 -07:00
ifile.c ] nilfs2: use atomic64_t type for inodes_count and blocks_count fields in nilfs_root struct 2013-07-03 16:08:01 -07:00
ifile.h nilfs2: implement calculation of free inodes count 2013-07-03 16:08:01 -07:00
inode.c nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races 2014-12-10 17:41:16 -08:00
ioctl.c nilfs2: add missing blkdev_issue_flush() to nilfs_sync_fs() 2014-10-14 02:18:20 +02:00
Kconfig fs/nilfs2: remove depends on CONFIG_EXPERIMENTAL 2013-01-11 11:39:04 -08:00
Makefile nilfs2: integrate sysfs support into driver 2014-08-08 15:57:21 -07:00
mdt.c fs: remove mapping->backing_dev_info 2015-01-20 14:03:05 -07:00
mdt.h nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
namei.c nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races 2014-12-10 17:41:16 -08:00
nilfs.h nilfs2: fix deadlock of segment constructor over I_SYNC flag 2015-02-05 13:35:29 -08:00
page.c fs: remove mapping->backing_dev_info 2015-01-20 14:03:05 -07:00
page.h fs: remove mapping->backing_dev_info 2015-01-20 14:03:05 -07:00
recovery.c nilfs2: drop vmtruncate 2012-12-20 18:40:54 -05:00
segbuf.c block: Abstract out bvec iterator 2013-11-23 22:33:47 -08:00
segbuf.h
segment.c nilfs2: fix deadlock of segment constructor during recovery 2015-03-12 18:46:08 -07:00
segment.h nilfs2: fix deadlock of segment constructor over I_SYNC flag 2015-02-05 13:35:29 -08:00
sufile.c nilfs2: verify metadata sizes read from disk 2014-04-03 16:21:26 -07:00
sufile.h nilfs2: add nilfs_sufile_trim_fs to trim clean segs 2014-04-03 16:21:25 -07:00
super.c fs: remove mapping->backing_dev_info 2015-01-20 14:03:05 -07:00
sysfs.c nilfs2: integrate sysfs support into driver 2014-08-08 15:57:21 -07:00
sysfs.h nilfs2: add /sys/fs/nilfs2/<device>/mounted_snapshots/<snapshot> group 2014-08-08 15:57:21 -07:00
the_nilfs.c nilfs2: deletion of an unnecessary check before the function call "iput" 2014-12-10 17:41:16 -08:00
the_nilfs.h nilfs2: add missing blkdev_issue_flush() to nilfs_sync_fs() 2014-10-14 02:18:20 +02:00