linux/fs
Filipe Manana 410f954cb1 Btrfs: fix assertion failure during fsync and use of stale transaction
Sometimes when fsync'ing a file we need to log that other inodes exist and
when we need to do that we acquire a reference on the inodes and then drop
that reference using iput() after logging them.

That generally is not a problem except if we end up doing the final iput()
(dropping the last reference) on the inode and that inode has a link count
of 0, which can happen in a very short time window if the logging path
gets a reference on the inode while it's being unlinked.

In that case we end up getting the eviction callback, btrfs_evict_inode(),
invoked through the iput() call chain which needs to drop all of the
inode's items from its subvolume btree, and in order to do that, it needs
to join a transaction at the helper function evict_refill_and_join().
However because the task previously started a transaction at the fsync
handler, btrfs_sync_file(), it has current->journal_info already pointing
to a transaction handle and therefore evict_refill_and_join() will get
that transaction handle from btrfs_join_transaction(). From this point on,
two different problems can happen:

1) evict_refill_and_join() will often change the transaction handle's
   block reserve (->block_rsv) and set its ->bytes_reserved field to a
   value greater than 0. If evict_refill_and_join() never commits the
   transaction, the eviction handler ends up decreasing the reference
   count (->use_count) of the transaction handle through the call to
   btrfs_end_transaction(), and after that point we have a transaction
   handle with a NULL ->block_rsv (which is the value prior to the
   transaction join from evict_refill_and_join()) and a ->bytes_reserved
   value greater than 0. If after the eviction/iput completes the inode
   logging path hits an error or it decides that it must fallback to a
   transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
   non-zero value from btrfs_log_dentry_safe(), and because of that
   non-zero value it tries to commit the transaction using a handle with
   a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
   the transaction commit hit an assertion failure at
   btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
   the ->block_rsv is NULL. The produced stack trace for that is like the
   following:

   [192922.917158] assertion failed: !trans->bytes_reserved, file: fs/btrfs/transaction.c, line: 816
   [192922.917553] ------------[ cut here ]------------
   [192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
   [192922.918310] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
   [192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: G        W         5.1.4-btrfs-next-47 #1
   [192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
   [192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
   (...)
   [192922.920925] RSP: 0018:ffffaebdc8a27da8 EFLAGS: 00010286
   [192922.921315] RAX: 0000000000000051 RBX: ffff95c9c16a41c0 RCX: 0000000000000000
   [192922.921692] RDX: 0000000000000000 RSI: ffff95cab6b16838 RDI: ffff95cab6b16838
   [192922.922066] RBP: ffff95c9c16a41c0 R08: 0000000000000000 R09: 0000000000000000
   [192922.922442] R10: ffffaebdc8a27e70 R11: 0000000000000000 R12: ffff95ca731a0980
   [192922.922820] R13: 0000000000000000 R14: ffff95ca84c73338 R15: ffff95ca731a0ea8
   [192922.923200] FS:  00007f337eda4e80(0000) GS:ffff95cab6b00000(0000) knlGS:0000000000000000
   [192922.923579] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   [192922.923948] CR2: 00007f337edad000 CR3: 00000001e00f6002 CR4: 00000000003606e0
   [192922.924329] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
   [192922.924711] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
   [192922.925105] Call Trace:
   [192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
   [192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
   [192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
   [192922.926731]  do_fsync+0x38/0x60
   [192922.927138]  __x64_sys_fdatasync+0x13/0x20
   [192922.927543]  do_syscall_64+0x60/0x1c0
   [192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
   (...)
   [192922.934077] ---[ end trace f00808b12068168f ]---

2) If evict_refill_and_join() decides to commit the transaction, it will
   be able to do it, since the nested transaction join only increments the
   transaction handle's ->use_count reference counter and it does not
   prevent the transaction from getting committed. This means that after
   eviction completes, the fsync logging path will be using a transaction
   handle that refers to an already committed transaction. What happens
   when using such a stale transaction can be unpredictable, we are at
   least having a use-after-free on the transaction handle itself, since
   the transaction commit will call kmem_cache_free() against the handle
   regardless of its ->use_count value, or we can end up silently losing
   all the updates to the log tree after that iput() in the logging path,
   or using a transaction handle that in the meanwhile was allocated to
   another task for a new transaction, etc, pretty much unpredictable
   what can happen.

In order to fix both of them, instead of using iput() during logging, use
btrfs_add_delayed_iput(), so that the logging path of fsync never drops
the last reference on an inode, that step is offloaded to a safe context
(usually the cleaner kthread).

The assertion failure issue was sporadically triggered by the test case
generic/475 from fstests, which loads the dm error target while fsstress
is running, which lead to fsync failing while logging inodes with -EIO
errors and then trying later to commit the transaction, triggering the
assertion failure.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-09-12 13:37:19 +02:00
..
9p treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 188 2019-05-30 11:29:21 -07:00
adfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
affs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
afs AFS fixes 2019-06-28 08:34:12 +08:00
autofs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 83 2019-05-24 17:37:52 +02:00
befs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
bfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
btrfs Btrfs: fix assertion failure during fsync and use of stale transaction 2019-09-12 13:37:19 +02:00
cachefiles treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
ceph ceph: fix ceph_mdsc_build_path to not stop on first component 2019-06-27 18:27:36 +02:00
cifs SPDX update for 5.2-rc6 2019-06-21 09:58:42 -07:00
coda treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
configfs SPDX update for 5.2-rc3, round 1 2019-05-31 08:34:32 -07:00
cramfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
crypto treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
debugfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
devpts treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 83 2019-05-24 17:37:52 +02:00
dlm treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 398 2019-06-05 17:37:12 +02:00
ecryptfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
efivarfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
efs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
exportfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
ext2 treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
ext4 Bug fixes (including a regression fix) for ext4. 2019-05-25 15:03:12 -07:00
f2fs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
fat treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282 2019-06-05 17:36:37 +02:00
freevxfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
fscache treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
fuse Revert "fuse: require /dev/fuse reads to have enough buffer capacity" 2019-06-11 13:35:22 +02:00
gfs2 Fix rounding error in gfs2_iomap_page_prepare 2019-06-14 17:27:12 -10:00
hfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
hfsplus treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
hostfs This pull request contains the following changes for UML: 2019-05-12 17:52:13 -04:00
hpfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
hugetlbfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
isofs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 142 2019-05-30 11:25:17 -07:00
jbd2 treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
jffs2 treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
jfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
kernfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
lockd Revert "lockd: Show pid of lockd for remote locks" 2019-05-31 09:43:26 -04:00
minix treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
nfs NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O 2019-06-28 11:48:52 -04:00
nfs_common treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
nfsd treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 1 2019-05-21 11:28:39 +02:00
nilfs2 treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
nls treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
notify fanotify: update connector fsid cache on add mark 2019-06-19 15:53:58 +02:00
ntfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 97 2019-05-24 17:37:53 +02:00
ocfs2 fs/ocfs2: fix race in ocfs2_dentry_attach_lock() 2019-06-13 17:34:56 -10:00
omfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 2019-05-30 11:29:53 -07:00
openpromfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
orangefs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
overlayfs SPDX update for 5.2-rc6 2019-06-21 09:58:42 -07:00
proc fs/proc/array.c: allow reporting eip/esp for all coredumping threads 2019-06-29 16:43:44 +08:00
pstore SPDX update for 5.2-rc4 2019-06-08 12:52:42 -07:00
qnx4 treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
qnx6 treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
quota quota: fix a problem about transfer quota 2019-06-19 15:25:41 +02:00
ramfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
reiserfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
romfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
squashfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 2019-06-19 17:09:53 +02:00
sysfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
sysv treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
tracefs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
ubifs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 336 2019-06-05 17:37:07 +02:00
udf treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
ufs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
unicode treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 294 2019-06-05 17:36:38 +02:00
xfs block: return from __bio_try_merge_page if merging occured in the same page 2019-06-17 09:33:02 -06:00
aio.c signal: remove the wrong signal_pending() check in restore_user_sigmask() 2019-06-29 16:43:45 +08:00
anon_inodes.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
attr.c
bad_inode.c
binfmt_aout.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_elf_fdpic.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
binfmt_elf.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_em86.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_flat.c fs/binfmt_flat.c: make load_flat_shared_library() work 2019-06-29 16:43:45 +08:00
binfmt_misc.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_script.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
block_dev.c block: Don't revalidate bdev of hidden gendisk 2019-05-27 07:35:02 -06:00
buffer.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
char_dev.c
compat_binfmt_elf.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193 2019-05-30 11:29:21 -07:00
compat_ioctl.c
compat.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
coredump.c
d_path.c
dax.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
dcache.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
dcookies.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
direct-io.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
drop_caches.c
eventfd.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
eventpoll.c signal: remove the wrong signal_pending() check in restore_user_sigmask() 2019-06-29 16:43:45 +08:00
exec.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
fcntl.c
fhandle.c
file_table.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
file.c
filesystems.c
fs_context.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
fs_parser.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
fs_pin.c
fs_struct.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
fs_types.c
fs-writeback.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
fsopen.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
inode.c mm: fix page cache convergence regression 2019-05-31 13:52:41 -04:00
internal.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
io_uring.c Merge branch 'akpm' (patches from Andrew) 2019-06-29 17:11:01 +08:00
ioctl.c
iomap.c block: return from __bio_try_merge_page if merging occured in the same page 2019-06-17 09:33:02 -06:00
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Kconfig.binfmt treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
libfs.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
locks.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
Makefile Add as a feature case-insensitive directories (the casefold feature) 2019-05-07 21:12:44 -07:00
mbcache.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mount.h
mpage.c block: remove the i argument to bio_for_each_segment_all 2019-04-30 09:26:13 -06:00
namei.c Clean up fscrypt's dcache revalidation support, and other 2019-05-07 21:28:04 -07:00
namespace.c fs/namespace: fix unprivileged mount propagation 2019-06-17 17:36:09 -04:00
no-block.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
nsfs.c
open.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
pipe.c
pnode.c fs/namespace: fix unprivileged mount propagation 2019-06-17 17:36:09 -04:00
pnode.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 2019-05-30 11:29:53 -07:00
posix_acl.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
proc_namespace.c
read_write.c vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files 2019-05-06 17:46:52 +03:00
readdir.c
select.c signal: remove the wrong signal_pending() check in restore_user_sigmask() 2019-06-29 16:43:45 +08:00
seq_file.c
signalfd.c
splice.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
stack.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
stat.c
statfs.c
super.c [fix] get rid of checking for absent device name in vfs_get_tree() 2019-04-28 21:34:21 -04:00
sync.c fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback 2019-05-14 09:47:50 -07:00
timerfd.c
userfaultfd.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 2019-06-19 17:09:53 +02:00
utimes.c
xattr.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00