linux/fs/ext4
Aditya Kali 5356f2615c ext4: attempt to fix race in bigalloc code path
Currently, there exists a race between delayed allocated writes and
the writeback when bigalloc feature is in use. The race was because we
wanted to determine what blocks in a cluster are under delayed
allocation and we were using buffer_delayed(bh) check for it. But, the
writeback codepath clears this bit without any synchronization which
resulted in a race and an ext4 warning similar to:

EXT4-fs (ram1): ext4_da_update_reserve_space: ino 13, used 1 with only 0
		reserved data blocks

The race existed in two places.
(1) between ext4_find_delalloc_range() and ext4_map_blocks() when called from
    writeback code path.
(2) between ext4_find_delalloc_range() and ext4_da_get_block_prep() (where
    buffer_delayed(bh) is set.

To fix (1), this patch introduces a new buffer_head state bit -
BH_Da_Mapped.  This bit is set under the protection of
EXT4_I(inode)->i_data_sem when we have actually mapped the delayed
allocated blocks during the writeout time. We can now reliably check
for this bit inside ext4_find_delalloc_range() to determine whether
the reservation for the blocks have already been claimed or not.

To fix (2), it was necessary to set buffer_delay(bh) under the
protection of i_data_sem.  So, I extracted the very beginning of
ext4_map_blocks into a new function - ext4_da_map_blocks() - and
performed the required setting of bh_delay bit and the quota
reservation under the protection of i_data_sem.  These two fixes makes
the checking of buffer_delay(bh) and buffer_da_mapped(bh) consistent,
thus removing the race.

Tested: I was able to reproduce the problem by running 'dd' and
'fsync' in parallel. Also, xfstests sometimes used to reproduce this
race. After the fix both my test and xfstests were successful and no
race (warning message) was observed.

Google-Bug-Id: 4997027

Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-09-09 19:20:51 -04:00
..
acl.c switch posix_acl_equiv_mode() to umode_t * 2011-08-01 02:10:06 -04:00
acl.h fs: take the ACL checks to common code 2011-07-25 14:30:23 -04:00
balloc.c ext4: rename ext4_has_free_blocks() to ext4_has_free_clusters() 2011-09-09 19:16:51 -04:00
bitmap.c ext4: Change unsigned long to unsigned int 2008-11-05 00:14:04 -05:00
block_validity.c ext4: move ext4_ind_* functions from inode.c to indirect.c 2011-06-27 19:40:50 -04:00
dir.c ext4: Use ext4_error_file() to print the pathname to the corrupted inode 2011-01-10 12:10:55 -05:00
ext4_extents.h ext4: Fix bigalloc quota accounting and i_blocks value 2011-09-09 19:04:51 -04:00
ext4_jbd2.c jbd2: add debugging information to jbd2_journal_dirty_metadata() 2011-09-04 10:18:14 -04:00
ext4_jbd2.h ext4: Fix ext4_should_writeback_data() for no-journal mode 2011-08-13 11:25:18 -04:00
ext4.h ext4: attempt to fix race in bigalloc code path 2011-09-09 19:20:51 -04:00
extents.c ext4: attempt to fix race in bigalloc code path 2011-09-09 19:20:51 -04:00
file.c fs: take the ACL checks to common code 2011-07-25 14:30:23 -04:00
fsync.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2011-08-01 13:56:03 -10:00
hash.c ext4: Add support for non-native signed/unsigned htree hash algorithms 2008-10-28 13:21:44 -04:00
ialloc.c ext4: rename ext4_free_blocks_after_init() to ext4_free_clusters_after_init() 2011-09-09 19:12:51 -04:00
indirect.c ext4: enforce bigalloc restrictions (e.g., no online resizing, etc.) 2011-09-09 18:36:51 -04:00
inode.c ext4: attempt to fix race in bigalloc code path 2011-09-09 19:20:51 -04:00
ioctl.c ext4: enforce bigalloc restrictions (e.g., no online resizing, etc.) 2011-09-09 18:36:51 -04:00
Kconfig ext4: Don't ask about supporting ext2/3 in ext4 if ext4 is not configured 2009-12-21 10:54:09 -05:00
Makefile ext4: move ext4_ind_* functions from inode.c to indirect.c 2011-06-27 19:40:50 -04:00
mballoc.c ext4: rename ext4_claim_free_blocks() to ext4_claim_free_clusters() 2011-09-09 19:14:51 -04:00
mballoc.h ext4: teach ext4_free_blocks() about bigalloc and clusters 2011-09-09 18:50:51 -04:00
migrate.c ext4: add some tracepoints in ext4/extents.c 2011-09-09 19:18:51 -04:00
mmp.c ext4: add support for multiple mount protection 2011-05-24 18:31:25 -04:00
move_extent.c ext4: add some tracepoints in ext4/extents.c 2011-09-09 19:18:51 -04:00
namei.c ext4: call ext4_handle_dirty_metadata with correct inode in ext4_dx_add_entry 2011-08-31 12:02:51 -04:00
page-io.c ext4: remove i_mutex lock in ext4_evict_inode to fix lockdep complaining 2011-08-31 11:50:51 -04:00
resize.c ext4: Rename ext4_free_blks_{count,set}() to refer to clusters 2011-09-09 19:08:51 -04:00
super.c ext4: rename ext4_count_free_blocks() to ext4_count_free_clusters() 2011-09-09 19:10:51 -04:00
symlink.c ext4: symlink must be handled via filesystem specific operation 2010-05-16 02:00:00 -04:00
truncate.h ext4: move common truncate functions to header file 2011-06-27 19:16:04 -04:00
xattr_security.c fs/vfs/security: pass last path component to LSM on inode creation 2011-02-01 11:12:29 -05:00
xattr_trusted.c ext4: constify xattr_handler 2010-05-21 18:31:19 -04:00
xattr_user.c ext4: constify xattr_handler 2010-05-21 18:31:19 -04:00
xattr.c ext4: add flag to ext4_has_free_blocks 2011-05-25 07:41:26 -04:00
xattr.h fs/vfs/security: pass last path component to LSM on inode creation 2011-02-01 11:12:29 -05:00