linux/fs
Qu Wenruo 6282675e67 btrfs: relocation: fix reloc_root lifespan and access
[BUG]
There are several different KASAN reports for balance + snapshot
workloads.  Involved call paths include:

   should_ignore_root+0x54/0xb0 [btrfs]
   build_backref_tree+0x11af/0x2280 [btrfs]
   relocate_tree_blocks+0x391/0xb80 [btrfs]
   relocate_block_group+0x3e5/0xa00 [btrfs]
   btrfs_relocate_block_group+0x240/0x4d0 [btrfs]
   btrfs_relocate_chunk+0x53/0xf0 [btrfs]
   btrfs_balance+0xc91/0x1840 [btrfs]
   btrfs_ioctl_balance+0x416/0x4e0 [btrfs]
   btrfs_ioctl+0x8af/0x3e60 [btrfs]
   do_vfs_ioctl+0x831/0xb10

   create_reloc_root+0x9f/0x460 [btrfs]
   btrfs_reloc_post_snapshot+0xff/0x6c0 [btrfs]
   create_pending_snapshot+0xa9b/0x15f0 [btrfs]
   create_pending_snapshots+0x111/0x140 [btrfs]
   btrfs_commit_transaction+0x7a6/0x1360 [btrfs]
   btrfs_mksubvol+0x915/0x960 [btrfs]
   btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs]
   btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs]
   btrfs_ioctl+0x241b/0x3e60 [btrfs]
   do_vfs_ioctl+0x831/0xb10

   btrfs_reloc_pre_snapshot+0x85/0xc0 [btrfs]
   create_pending_snapshot+0x209/0x15f0 [btrfs]
   create_pending_snapshots+0x111/0x140 [btrfs]
   btrfs_commit_transaction+0x7a6/0x1360 [btrfs]
   btrfs_mksubvol+0x915/0x960 [btrfs]
   btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs]
   btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs]
   btrfs_ioctl+0x241b/0x3e60 [btrfs]
   do_vfs_ioctl+0x831/0xb10

[CAUSE]
All these call sites are only relying on root->reloc_root, which can
undergo btrfs_drop_snapshot(), and since we don't have real refcount
based protection to reloc roots, we can reach already dropped reloc
root, triggering KASAN.

[FIX]
To avoid such access to unstable root->reloc_root, we should check
BTRFS_ROOT_DEAD_RELOC_TREE bit first.

This patch introduces wrappers that provide the correct way to check the
bit with memory barriers protection.

Most callers don't distinguish merged reloc tree and no reloc tree.  The
only exception is should_ignore_root(), as merged reloc tree can be
ignored, while no reloc tree shouldn't.

[CRITICAL SECTION ANALYSIS]
Although test_bit()/set_bit()/clear_bit() doesn't imply a barrier, the
DEAD_RELOC_TREE bit has extra help from transaction as a higher level
barrier, the lifespan of root::reloc_root and DEAD_RELOC_TREE bit are:

	NULL: reloc_root is NULL	PTR: reloc_root is not NULL
	0: DEAD_RELOC_ROOT bit not set	DEAD: DEAD_RELOC_ROOT bit set

	(NULL, 0)    Initial state		 __
	  |					 /\ Section A
        btrfs_init_reloc_root()			 \/
	  |				 	 __
	(PTR, 0)     reloc_root initialized      /\
          |					 |
	btrfs_update_reloc_root()		 |  Section B
          |					 |
	(PTR, DEAD)  reloc_root has been merged  \/
          |					 __
	=== btrfs_commit_transaction() ====================
	  |					 /\
	clean_dirty_subvols()			 |
	  |					 |  Section C
	(NULL, DEAD) reloc_root cleanup starts   \/
          |					 __
	btrfs_drop_snapshot()			 /\
	  |					 |  Section D
	(NULL, 0)    Back to initial state	 \/

Every have_reloc_root() or test_bit(DEAD_RELOC_ROOT) caller holds
transaction handle, so none of such caller can cross transaction boundary.

In Section A, every caller just found no DEAD bit, and grab reloc_root.

In the cross section A-B, caller may get no DEAD bit, but since reloc_root
is still completely valid thus accessing reloc_root is completely safe.

No test_bit() caller can cross the boundary of Section B and Section C.

In Section C, every caller found the DEAD bit, so no one will access
reloc_root.

In the cross section C-D, either caller gets the DEAD bit set, avoiding
access reloc_root no matter if it's safe or not.  Or caller get the DEAD
bit cleared, then access reloc_root, which is already NULL, nothing will
be wrong.

The memory write barriers are between the reloc_root updates and bit
set/clear, the pairing read side is before test_bit.

Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Fixes: d2311e6985 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ barriers ]
Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-13 23:10:56 +01:00
..
9p 9p pull request for inclusion in 5.4 2019-09-27 15:10:34 -07:00
adfs Merge branch 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 11:33:22 -07:00
affs fs: affs: Initialize filesystem timestamp ranges 2019-08-30 07:27:18 -07:00
afs afs: Fix race in commit bulk status fetch 2019-11-15 10:28:02 -08:00
autofs autofs: fix a leak in autofs_expire_indirect() 2019-10-25 00:03:11 -04:00
befs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
bfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
btrfs btrfs: relocation: fix reloc_root lifespan and access 2020-01-13 23:10:56 +01:00
cachefiles treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
ceph ceph: increment/decrement dio counter on async requests 2019-11-14 18:44:51 +01:00
cifs SMB3: Fix persistent handles reconnect 2019-11-06 21:32:18 -06:00
coda y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
configfs configfs: calculate the depth of parent item 2019-11-06 18:36:01 +01:00
cramfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
crypto fscrypt: require that key be added when setting a v2 encryption policy 2019-08-12 19:18:50 -07:00
debugfs Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-09-28 08:14:15 -07:00
devpts devpts_pty_kill(): don't bother with d_delete() 2019-09-03 09:30:56 -04:00
dlm dlm for 5.3 2019-07-12 17:37:53 -07:00
ecryptfs ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either 2019-11-10 11:57:45 -05:00
efivarfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
efs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
erofs erofs: fix mis-inplace determination related with noio chain 2019-10-01 04:54:45 +08:00
exportfs exportfs_decode_fh(): negative pinned may become positive without the parent locked 2019-11-10 11:56:05 -05:00
ext2 \n 2019-09-21 13:53:34 -07:00
ext4 Merge branch 'entropy' 2019-09-29 19:25:39 -07:00
f2fs f2fs-for-5.4-rc1 2019-09-21 14:26:33 -07:00
fat fat: delete an unnecessary check before brelse() 2019-09-25 17:51:40 -07:00
freevxfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
fscache Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
fuse fuse: redundant get_fuse_inode() calls in fuse_writepages_fill() 2019-10-23 14:26:37 +02:00
gfs2 gfs2: Fix initialisation of args for remount 2019-10-30 12:16:53 +01:00
hfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
hfsplus fs/hfsplus/xattr.c: replace strncpy with memcpy 2019-07-16 19:23:23 -07:00
hostfs This pull request contains the following changes for UML: 2019-05-12 17:52:13 -04:00
hpfs fs: hpfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
hugetlbfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
iomap iomap: move the iomap_dio_rw ->end_io callback into a structure 2019-09-19 15:32:45 -07:00
isofs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
jbd2 jbd2: remove jbd2_journal_inode_add_[write|wait] 2019-09-24 15:54:07 -07:00
jffs2 Merge branch 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-26 11:33:30 -07:00
jfs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
kernfs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
lockd lockd: Make two symbols static 2019-07-03 17:52:09 -04:00
minix fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
nfs NFS: Fix an RCU lock leak in nfs4_refresh_delegation_stateid() 2019-11-01 11:03:56 -04:00
nfs_common treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
nfsd Highlights: 2019-09-27 17:00:27 -07:00
nilfs2 vfs: create a generic checking and prep function for FS_IOC_SETFLAGS 2019-07-01 08:25:34 -07:00
nls treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
notify Highlights: 2019-09-27 17:00:27 -07:00
ntfs ntfs: remove (un)?likely() from IS_ERR() conditions 2019-09-26 10:10:44 -07:00
ocfs2 ocfs2: protect extent tree in ocfs2_prepare_inode_for_write() 2019-11-06 08:47:08 -08:00
omfs fs: omfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
openpromfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
orangefs Orangefs: a fix and a cleanup 2019-09-19 10:21:35 -07:00
overlayfs ovl: filter of trusted xattr results in audit 2019-09-11 16:11:45 +02:00
proc proc/meminfo: fix output alignment 2019-10-19 06:32:32 -04:00
pstore pstore: fs superblock limits 2019-08-30 08:11:25 -07:00
qnx4 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
qnx6 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
quota quota: fix condition for resetting time limit in do_set_dqblk() 2019-07-31 12:04:42 +02:00
ramfs vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API 2019-09-12 21:05:34 -04:00
reiserfs fs/reiserfs/do_balan.c: remove set but not used variable 2019-09-25 17:51:40 -07:00
romfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
squashfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
sysfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
sysv fs: sysv: Initialize filesystem timestamp ranges 2019-08-30 07:27:18 -07:00
tracefs tracing: Do not create tracefs files if tracefs lockdown is in effect 2019-10-12 20:49:07 -04:00
ubifs This pull request contains the following changes for UBI, UBIFS and JFFS2: 2019-09-21 11:10:16 -07:00
udf fs-udf: Delete an unnecessary check before brelse() 2019-09-04 18:19:43 +02:00
ufs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
unicode unicode: make array 'token' static const, makes object smaller 2019-09-17 11:48:24 -04:00
verity fs-verity: support builtin file signatures 2019-08-12 19:33:50 -07:00
xfs xfs: change the seconds fields in xfs_bulkstat to signed 2019-10-15 08:46:07 -07:00
aio.c aio: Fix io_pgetevents() struct __compat_aio_sigset layout 2019-10-21 19:12:19 -04:00
anon_inodes.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
attr.c timestamp_truncate: Replace users of timespec64_trunc 2019-08-30 07:27:17 -07:00
bad_inode.c
binfmt_aout.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_elf_fdpic.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
binfmt_elf.c elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings 2019-10-06 13:53:27 -07:00
binfmt_em86.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_flat.c fs/binfmt_flat.c: remove set but not used variable 'inode' 2019-07-16 19:23:22 -07:00
binfmt_misc.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
binfmt_script.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
block_dev.c Changes for 5.4: 2019-09-18 17:35:20 -07:00
buffer.c for-linus-20190715 2019-07-15 21:20:52 -07:00
char_dev.c chardev: set variable ret to -EBUSY before checking minor range overlap 2019-05-24 20:50:36 +02:00
compat_binfmt_elf.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193 2019-05-30 11:29:21 -07:00
compat_ioctl.c compat_ioctl: pppoe: fix PPPOEIOCSFWD handling 2019-07-30 14:42:13 -07:00
compat.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
coredump.c coredump: split pipe command whitespace before expanding template 2019-08-03 07:02:01 -07:00
d_path.c [PATCH] fix d_absolute_path() interplay with fsmount() 2019-08-30 19:31:09 -04:00
dax.c fs/dax: Fix pmd vs pte conflict detection 2019-10-22 22:53:02 -07:00
dcache.c Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-20 09:15:51 -07:00
dcookies.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
direct-io.c fs/direct-io.c: fix kernel-doc warning 2019-10-14 15:04:01 -07:00
drop_caches.c
eventfd.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
eventpoll.c PM / wakeup: Show wakeup sources stats in sysfs 2019-08-21 00:20:40 +02:00
exec.c sched/membarrier: Fix p->mm->membarrier_state racy load 2019-09-25 17:42:30 +02:00
fcntl.c fs: mark expected switch fall-throughs 2019-04-08 18:21:02 -05:00
fhandle.c fs/handle.c - fix up kerneldoc 2019-08-07 21:51:47 -04:00
file_table.c vfs: Export flush_delayed_fput for use by knfsd. 2019-08-19 11:00:39 -04:00
file.c
filesystems.c
fs_context.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
fs_parser.c vfs: Make fs_parse() handle fs_param_is_fd-type params better 2019-09-12 21:06:14 -04:00
fs_pin.c switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
fs_struct.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
fs_types.c
fs-writeback.c cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead 2019-11-08 13:37:24 -07:00
fsopen.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
inode.c mm,thp: avoid writes to file with THP in pagecache 2019-09-24 15:54:11 -07:00
internal.h Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-20 09:15:51 -07:00
io_uring.c io_uring: ensure registered buffer import returns the IO length 2019-11-13 16:15:14 -07:00
ioctl.c
Kconfig fs-verity for 5.4 2019-09-18 16:59:14 -07:00
Kconfig.binfmt binfmt_flat: make support for old format binaries optional 2019-06-24 09:16:47 +10:00
libfs.c fs/libfs.c: fix kernel-doc warning 2019-10-14 15:04:01 -07:00
locks.c Highlights: 2019-09-27 17:00:27 -07:00
Makefile fs-verity for 5.4 2019-09-18 16:59:14 -07:00
mbcache.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mount.h switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
mpage.c blkcg, writeback: Rename wbc_account_io() to wbc_account_cgroup_owner() 2019-07-10 09:00:57 -06:00
namei.c fs/namei.c: keep track of nd->root refcount status 2019-09-03 09:30:45 -04:00
namespace.c fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry() 2019-10-16 23:15:09 -04:00
no-block.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
nsfs.c vfs: Convert nsfs to use the new mount API 2019-05-25 18:00:06 -04:00
open.c fs: remove unlikely() from WARN_ON() condition 2019-09-26 10:10:30 -07:00
pipe.c vfs: Convert pipe to use the new mount API 2019-05-25 18:00:07 -04:00
pnode.c fs/namespace: fix unprivileged mount propagation 2019-06-17 17:36:09 -04:00
pnode.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 2019-05-30 11:29:53 -07:00
posix_acl.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
proc_namespace.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
read_write.c vfs: fix page locking deadlocks when deduping files 2019-08-16 18:43:24 -07:00
readdir.c filldir[64]: remove WARN_ON_ONCE() for bad directory entries 2019-10-18 18:41:16 -04:00
select.c fs/select.c: use struct_size() in kmalloc() 2019-07-16 19:23:25 -07:00
seq_file.c seq_file: fix problem when seeking mid-record 2019-08-13 16:06:52 -07:00
signalfd.c fs: mark expected switch fall-throughs 2019-04-08 18:21:02 -05:00
splice.c uio: make import_iovec()/compat_import_iovec() return bytes on success 2019-05-31 15:30:03 -06:00
stack.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
stat.c
statfs.c vfs: Fix EOVERFLOW testing in put_compat_statfs64 2019-10-03 14:21:35 -07:00
super.c Merge branch 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-10-10 08:16:44 -07:00
sync.c fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback 2019-05-14 09:47:50 -07:00
timerfd.c timerfd: Prepare for PREEMPT_RT 2019-08-01 20:51:23 +02:00
userfaultfd.c userfaultfd: untag user pointers 2019-09-25 17:51:41 -07:00
utimes.c utimes: Clamp the timestamps before update 2019-08-30 07:27:17 -07:00
xattr.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00