linux/fs/ext4
Dmitry Monakhov bb55748805 ext4: clean up online defrag bugs in move_extent_per_page()
Non-full list of bugs:
1) uninitialized extent optimization does not hold page's lock,
   and simply replace brunches after that writeback code goes
   crazy because block mapping changed under it's feets
   kernel BUG at fs/ext4/inode.c:1434!  ( 288'th xfstress)

2) uninitialized extent may became initialized right after we
   drop i_data_sem, so extent state must be rechecked

3) Locked pages goes uptodate via following sequence:
   ->readpage(page); lock_page(page); use_that_page(page)
   But after readpage() one may invalidate it because it is
   uptodate and unlocked (reclaimer does that)
   As result kernel bug at include/linux/buffer_head.c:133!

4) We call write_begin() with already opened stansaction which
   result in following deadlock:
->move_extent_per_page()
  ->ext4_journal_start()-> hold journal transaction
  ->write_begin()
    ->ext4_da_write_begin()
      ->ext4_nonda_switch()
        ->writeback_inodes_sb_if_idle()  --> will wait for journal_stop()

5) try_to_release_page() may fail and it does fail if one of page's bh was
   pinned by journal

6) If we about to change page's mapping we MUST hold it's lock during entire
   remapping procedure, this is true for both pages(original and donor one)

Fixes:

- Avoid (1) and (2) simply by temproraly drop uninitialized extent handling
  optimization, this will be reimplemented later.

- Fix (3) by manually forcing page to uptodate state w/o dropping it's lock

- Fix (4) by rearranging existing locking:
  from: journal_start(); ->write_begin
  to: write_begin(); journal_extend()
- Fix (5) simply by checking retvalue
- Fix (6) by locking both (original and donor one) pages during extent swap
  with help of mext_page_double_lock()

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-26 12:52:07 -04:00
..
acl.c switch posix_acl_equiv_mode() to umode_t * 2011-08-01 02:10:06 -04:00
acl.h fs: take the ACL checks to common code 2011-07-25 14:30:23 -04:00
balloc.c ext4: don't call ext4_error while block group is locked 2012-08-17 09:06:06 -04:00
bitmap.c ext4: don't call ext4_error while block group is locked 2012-08-17 09:06:06 -04:00
block_validity.c ext2/3/4: delete unneeded includes of module.h 2012-01-09 13:52:10 +01:00
dir.c ext4: use core vfs llseek code for dir seeks 2012-07-23 00:00:28 +04:00
ext4_extents.h ext4: verify and calculate checksums for extent tree blocks 2012-04-29 18:37:10 -04:00
ext4_jbd2.c ext4: remove unnecessary argument from __ext4_handle_dirty_metadata() 2012-07-22 20:37:31 -04:00
ext4_jbd2.h ext4: remove unnecessary argument from __ext4_handle_dirty_metadata() 2012-07-22 20:37:31 -04:00
ext4.h ext4: grow the s_group_info array as needed 2012-09-05 01:31:50 -04:00
extents.c ext4: speed up truncate/unlink by not using bforget() unless needed 2012-09-19 14:14:53 -04:00
file.c The usual collection of bug fixes and optimizations. Perhaps of 2012-07-27 20:52:25 -07:00
fsync.c ext4: check return value of blkdev_issue_flush() 2012-08-17 09:58:17 -04:00
hash.c ext4: return 32/64-bit dir name hash according to usage type 2012-03-18 22:44:40 -04:00
ialloc.c ext4: check free inode count before allocating an inode 2012-09-23 23:16:03 -04:00
indirect.c ext2/3/4: delete unneeded includes of module.h 2012-01-09 13:52:10 +01:00
inode.c ext4: fix potential deadlock in ext4_nonda_switch() 2012-09-19 22:42:36 -04:00
ioctl.c ext4: add online resizing support for meta_bg and 64-bit file systems 2012-09-05 01:33:50 -04:00
Kconfig ext4: load the crc32c driver if necessary 2012-04-29 18:27:10 -04:00
Makefile ext4: move ext4_ind_* functions from inode.c to indirect.c 2011-06-27 19:40:50 -04:00
mballoc.c ext4: check free block counters in ext4_mb_find_by_goal 2012-09-23 23:10:51 -04:00
mballoc.h ext4: remove unused macro MB_DEFAULT_MAX_GROUPS_TO_SCAN 2012-08-17 10:00:17 -04:00
migrate.c userns: Convert ext4 to user kuid/kgid where appropriate 2012-05-15 14:59:27 -07:00
mmp.c ext4: Convert to new freezing mechanism 2012-07-31 09:45:48 +04:00
move_extent.c ext4: clean up online defrag bugs in move_extent_per_page() 2012-09-26 12:52:07 -04:00
namei.c ext4: make orphan functions be no-op in no-journal mode 2012-09-18 13:38:59 -04:00
page-io.c Revert "ext4: don't release page refs in ext4_end_bio()" 2012-03-29 17:00:56 -07:00
resize.c ext4: don't call update_backups() multiple times for the same bg 2012-09-26 00:08:57 -04:00
super.c ext4: fix crash when accessing /proc/mounts concurrently 2012-09-23 22:49:12 -04:00
symlink.c ext4: symlink must be handled via filesystem specific operation 2010-05-16 02:00:00 -04:00
truncate.h ext4: move common truncate functions to header file 2011-06-27 19:16:04 -04:00
xattr_security.c Merge branch 'for_linus' into for_linus_merged 2012-01-10 11:54:07 -05:00
xattr_trusted.c ext2/3/4: delete unneeded includes of module.h 2012-01-09 13:52:10 +01:00
xattr_user.c ext2/3/4: delete unneeded includes of module.h 2012-01-09 13:52:10 +01:00
xattr.c ext4: use s_csum_seed instead of i_csum_seed for xattr block 2012-07-09 16:29:27 -04:00
xattr.h ext4: change on-disk layout to support extended metadata checksumming 2012-04-29 18:23:10 -04:00