linux/fs
Josef Bacik 49d11bead7 btrfs: add a helper to read the tree_root commit root for backref lookup
I got the following lockdep splat with tree locks converted to rwsem
patches on btrfs/104:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.9.0+  Not tainted
  ------------------------------------------------------
  btrfs-cleaner/903 is trying to acquire lock:
  ffff8e7fab6ffe30 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x32/0x170

  but task is already holding lock:
  ffff8e7fab628a88 (&fs_info->commit_root_sem){++++}-{3:3}, at: btrfs_find_all_roots+0x41/0x80

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  ->  (&fs_info->commit_root_sem){++++}-{3:3}:
	 down_read+0x40/0x130
	 caching_thread+0x53/0x5a0
	 btrfs_work_helper+0xfa/0x520
	 process_one_work+0x238/0x540
	 worker_thread+0x55/0x3c0
	 kthread+0x13a/0x150
	 ret_from_fork+0x1f/0x30

  ->  (&caching_ctl->mutex){+.+.}-{3:3}:
	 __mutex_lock+0x7e/0x7b0
	 btrfs_cache_block_group+0x1e0/0x510
	 find_free_extent+0xb6e/0x12f0
	 btrfs_reserve_extent+0xb3/0x1b0
	 btrfs_alloc_tree_block+0xb1/0x330
	 alloc_tree_block_no_bg_flush+0x4f/0x60
	 __btrfs_cow_block+0x11d/0x580
	 btrfs_cow_block+0x10c/0x220
	 commit_cowonly_roots+0x47/0x2e0
	 btrfs_commit_transaction+0x595/0xbd0
	 sync_filesystem+0x74/0x90
	 generic_shutdown_super+0x22/0x100
	 kill_anon_super+0x14/0x30
	 btrfs_kill_super+0x12/0x20
	 deactivate_locked_super+0x36/0xa0
	 cleanup_mnt+0x12d/0x190
	 task_work_run+0x5c/0xa0
	 exit_to_user_mode_prepare+0x1df/0x200
	 syscall_exit_to_user_mode+0x54/0x280
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  ->  (&space_info->groups_sem){++++}-{3:3}:
	 down_read+0x40/0x130
	 find_free_extent+0x2ed/0x12f0
	 btrfs_reserve_extent+0xb3/0x1b0
	 btrfs_alloc_tree_block+0xb1/0x330
	 alloc_tree_block_no_bg_flush+0x4f/0x60
	 __btrfs_cow_block+0x11d/0x580
	 btrfs_cow_block+0x10c/0x220
	 commit_cowonly_roots+0x47/0x2e0
	 btrfs_commit_transaction+0x595/0xbd0
	 sync_filesystem+0x74/0x90
	 generic_shutdown_super+0x22/0x100
	 kill_anon_super+0x14/0x30
	 btrfs_kill_super+0x12/0x20
	 deactivate_locked_super+0x36/0xa0
	 cleanup_mnt+0x12d/0x190
	 task_work_run+0x5c/0xa0
	 exit_to_user_mode_prepare+0x1df/0x200
	 syscall_exit_to_user_mode+0x54/0x280
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  ->  (btrfs-root-00){++++}-{3:3}:
	 __lock_acquire+0x1167/0x2150
	 lock_acquire+0xb9/0x3d0
	 down_read_nested+0x43/0x130
	 __btrfs_tree_read_lock+0x32/0x170
	 __btrfs_read_lock_root_node+0x3a/0x50
	 btrfs_search_slot+0x614/0x9d0
	 btrfs_find_root+0x35/0x1b0
	 btrfs_read_tree_root+0x61/0x120
	 btrfs_get_root_ref+0x14b/0x600
	 find_parent_nodes+0x3e6/0x1b30
	 btrfs_find_all_roots_safe+0xb4/0x130
	 btrfs_find_all_roots+0x60/0x80
	 btrfs_qgroup_trace_extent_post+0x27/0x40
	 btrfs_add_delayed_data_ref+0x3fd/0x460
	 btrfs_free_extent+0x42/0x100
	 __btrfs_mod_ref+0x1d7/0x2f0
	 walk_up_proc+0x11c/0x400
	 walk_up_tree+0xf0/0x180
	 btrfs_drop_snapshot+0x1c7/0x780
	 btrfs_clean_one_deleted_snapshot+0xfb/0x110
	 cleaner_kthread+0xd4/0x140
	 kthread+0x13a/0x150
	 ret_from_fork+0x1f/0x30

  other info that might help us debug this:

  Chain exists of:
    btrfs-root-00 --> &caching_ctl->mutex --> &fs_info->commit_root_sem

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(&fs_info->commit_root_sem);
				 lock(&caching_ctl->mutex);
				 lock(&fs_info->commit_root_sem);
    lock(btrfs-root-00);

   *** DEADLOCK ***

  3 locks held by btrfs-cleaner/903:
   : ffff8e7fab628838 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: cleaner_kthread+0x6e/0x140
   : ffff8e7faadac640 (sb_internal){.+.+}-{0:0}, at: start_transaction+0x40b/0x5c0
   : ffff8e7fab628a88 (&fs_info->commit_root_sem){++++}-{3:3}, at: btrfs_find_all_roots+0x41/0x80

  stack backtrace:
  CPU: 0 PID: 903 Comm: btrfs-cleaner Not tainted 5.9.0+ 
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
  Call Trace:
   dump_stack+0x8b/0xb0
   check_noncircular+0xcf/0xf0
   __lock_acquire+0x1167/0x2150
   ? __bfs+0x42/0x210
   lock_acquire+0xb9/0x3d0
   ? __btrfs_tree_read_lock+0x32/0x170
   down_read_nested+0x43/0x130
   ? __btrfs_tree_read_lock+0x32/0x170
   __btrfs_tree_read_lock+0x32/0x170
   __btrfs_read_lock_root_node+0x3a/0x50
   btrfs_search_slot+0x614/0x9d0
   ? find_held_lock+0x2b/0x80
   btrfs_find_root+0x35/0x1b0
   ? do_raw_spin_unlock+0x4b/0xa0
   btrfs_read_tree_root+0x61/0x120
   btrfs_get_root_ref+0x14b/0x600
   find_parent_nodes+0x3e6/0x1b30
   btrfs_find_all_roots_safe+0xb4/0x130
   btrfs_find_all_roots+0x60/0x80
   btrfs_qgroup_trace_extent_post+0x27/0x40
   btrfs_add_delayed_data_ref+0x3fd/0x460
   btrfs_free_extent+0x42/0x100
   __btrfs_mod_ref+0x1d7/0x2f0
   walk_up_proc+0x11c/0x400
   walk_up_tree+0xf0/0x180
   btrfs_drop_snapshot+0x1c7/0x780
   ? btrfs_clean_one_deleted_snapshot+0x73/0x110
   btrfs_clean_one_deleted_snapshot+0xfb/0x110
   cleaner_kthread+0xd4/0x140
   ? btrfs_alloc_root+0x50/0x50
   kthread+0x13a/0x150
   ? kthread_create_worker_on_cpu+0x40/0x40
   ret_from_fork+0x1f/0x30
  BTRFS info (device sdb): disk space caching is enabled
  BTRFS info (device sdb): has skinny extents

This happens because qgroups does a backref lookup when we create a
delayed ref.  From here it may have to look up a root from an indirect
ref, which does a normal lookup on the tree_root, which takes the read
lock on the tree_root nodes.

To fix this we need to add a variant for looking up roots that searches
the commit root of the tree_root.  Then when we do the backref search
using the commit root we are sure to not take any locks on the tree_root
nodes.  This gets rid of the lockdep splat when running btrfs/104.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-26 15:04:57 +01:00
..
9p treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
adfs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
affs affs: fix basic permission bits to actually work 2020-08-31 12:20:31 +02:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-03 18:50:48 -07:00
autofs autofs: use __kernel_write() for the autofs pipe writing 2020-09-29 17:18:34 -07:00
befs block: move struct block_device to blk_types.h 2020-06-24 09:16:02 -06:00
bfs
btrfs btrfs: add a helper to read the tree_root commit root for backref lookup 2020-10-26 15:04:57 +01:00
cachefiles cachefiles: switch to kernel_write 2020-07-08 08:27:56 +02:00
ceph We have an inode number handling change, prompted by s390x which is 2020-08-28 10:33:04 -07:00
cifs cifs: fix DFS mount with cifsacl/modefromsid 2020-09-06 23:59:53 -05:00
coda
configfs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
cramfs
crypto mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
debugfs debugfs: Fix module state check condition 2020-09-04 18:12:52 +02:00
devpts
dlm treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ecryptfs mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
efivarfs efi/efivars: Expose RT service availability via efivars abstraction 2020-07-09 10:14:29 +03:00
efs block: move struct block_device to blk_types.h 2020-06-24 09:16:02 -06:00
erofs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
exfat exfat: retain 'VolumeFlags' properly 2020-08-12 08:31:13 +09:00
exportfs
ext2 ext2: don't update mtime on COW faults 2020-09-05 10:00:05 -07:00
ext4 \n 2020-08-28 10:57:14 -07:00
f2fs f2fs: Return EOF on unaligned end of file DIO read 2020-09-08 20:31:33 -07:00
fat fat: fix fat_ra_init() for data clusters == 0 2020-08-12 10:58:01 -07:00
freevxfs
fscache Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
fuse fuse: fix the ->direct_IO() treatment of iov_iter 2020-09-17 17:26:56 -04:00
gfs2 Fix memory leak on filesystem withdraw 2020-08-28 10:41:00 -07:00
hfs block: move block-related definitions out of fs.h 2020-06-24 09:16:02 -06:00
hfsplus treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
hostfs
hpfs hpfs: fix warning due to superfluous semicolon 2020-06-06 10:08:17 -07:00
hugetlbfs hugetlbfs: prevent filesystem stacking of hugetlbfs 2020-08-12 10:57:56 -07:00
iomap treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
isofs Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
jbd2 jbd2: clean up checksum verification in do_one_pass() 2020-08-19 12:04:35 -04:00
jffs2 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
jfs block: move struct block_device to blk_types.h 2020-06-24 09:16:02 -06:00
kernfs fsnotify: pass dir and inode arguments to fsnotify() 2020-07-27 23:15:48 +02:00
lockd
minix fs/minix: remove expected error message in block_to_path() 2020-08-12 10:58:00 -07:00
nfs pNFS/flexfiles: Be consistent about mirror index types 2020-09-18 09:25:33 -04:00
nfs_common treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nfsd Fixes: 2020-08-25 18:01:36 -07:00
nilfs2 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nls treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
notify treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ntfs ntfs: fix ntfs_test_inode and ntfs_init_locked_inode function type 2020-08-07 11:33:21 -07:00
ocfs2 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
omfs treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
openpromfs
orangefs orangefs: remove unnecessary assignment to variable ret 2020-08-04 15:01:58 -04:00
overlayfs Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
proc mm, oom: make the calculation of oom badness more accurate 2020-08-12 10:57:56 -07:00
pstore treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
qnx4
qnx6 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
quota treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ramfs
reiserfs \n 2020-08-06 19:28:26 -07:00
romfs romfs: fix uninitialized memory leak in romfs_dev_read() 2020-08-21 09:52:53 -07:00
squashfs squashfs: avoid bio_alloc() failure with 1Mbyte blocks 2020-08-21 09:52:53 -07:00
sysfs RDMA 5.8 merge window pull request 2020-06-05 14:05:57 -07:00
sysv
tracefs
ubifs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
udf treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ufs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
unicode
vboxsf Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-09-22 15:08:41 -07:00
verity fs-verity: use smp_load_acquire() for ->i_verity_info 2020-07-21 16:02:41 -07:00
xfs Fixes (2) for 5.9: 2020-09-05 10:04:53 -07:00
zonefs zonefs: add zone-capacity support 2020-08-11 17:42:24 +09:00
aio.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
anon_inodes.c
attr.c
bad_inode.c fs: move the fiemap definitions out of fs.h 2020-06-03 23:16:55 -04:00
binfmt_aout.c
binfmt_elf_fdpic.c Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
binfmt_elf.c kill elf_fpxregs_t 2020-07-27 14:29:23 -04:00
binfmt_em86.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_flat.c binfmt_flat: revert "binfmt_flat: don't offset the data start" 2020-08-24 08:49:13 +10:00
binfmt_misc.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_script.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
block_dev.c for-5.9/io_uring-20200802 2020-08-03 13:01:22 -07:00
buffer.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
char_dev.c
compat_binfmt_elf.c Split the old READ_IMPLIES_EXEC workaround from executable PT_GNU_STACK 2020-06-05 13:45:21 -07:00
compat.c
coredump.c coredump: add %f for executable filename 2020-08-12 10:58:01 -07:00
d_path.c
dax.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dcache.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
dcookies.c
direct-io.c fs: remove no longer used dio_end_io() 2020-10-07 12:17:59 +02:00
drop_caches.c
eventfd.c
eventpoll.c ep_create_wakeup_source(): dentry name can change under you... 2020-09-24 19:41:58 -04:00
exec.c mm/gup: remove task_struct pointer for all gup code 2020-08-12 10:58:04 -07:00
fcntl.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
fhandle.c
file_table.c Revert "fs: Do not check if there is a fsnotify watcher on pseudo inodes" 2020-06-29 09:40:55 -07:00
file.c Merge branch 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 09:40:34 -07:00
filesystems.c
fs_context.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
fs_parser.c
fs_pin.c
fs_struct.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
fs_types.c
fs-writeback.c fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype 2020-09-19 13:13:39 -07:00
fsopen.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
init.c init: add an init_dup helper 2020-08-04 21:02:38 -04:00
inode.c AFS Changes 2020-06-05 16:26:36 -07:00
internal.h Merge branch 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 09:40:34 -07:00
io_uring.c io_uring-5.9-2020-10-02 2020-10-02 14:38:10 -07:00
io-wq.c io-wq: fix hang after cancelling pending hashed work 2020-08-23 11:38:50 -06:00
io-wq.h io_uring/io-wq: move RLIMIT_FSIZE to io-wq 2020-07-24 13:00:44 -06:00
ioctl.c fs: remove ksys_ioctl 2020-07-31 08:16:01 +02:00
Kconfig tmpfs: support 64-bit inums per-sb 2020-08-07 11:33:24 -07:00
Kconfig.binfmt treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
libfs.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
locks.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
Makefile init: add an init_mount helper 2020-07-31 08:17:51 +02:00
mbcache.c
mount.h
mpage.c fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
namei.c exec: restore EACCES of S_ISDIR execve() 2020-08-14 19:56:56 -07:00
namespace.c Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 21:03:25 -07:00
no-block.c
nsfs.c
open.c exec: move S_ISREG() check earlier 2020-08-12 10:58:01 -07:00
pipe.c pipe: remove pipe_wait() and fix wakeup race with splice 2020-10-01 19:14:36 -07:00
pnode.c
pnode.h
posix_acl.c vfs: clean up posix_acl_permission() logic aroudn MAY_NOT_BLOCK 2020-06-08 11:04:19 -07:00
proc_namespace.c Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-06-04 13:54:34 -07:00
read_write.c autofs: use __kernel_write() for the autofs pipe writing 2020-09-29 17:18:34 -07:00
readdir.c fs: remove ksys_getdents64 2020-07-31 08:16:00 +02:00
select.c pselect6() and friends: take handling the combined 6th/7th args into helper 2020-05-29 19:10:42 -04:00
seq_file.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
signalfd.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
splice.c pipe: remove pipe_wait() and fix wakeup race with splice 2020-10-01 19:14:36 -07:00
stack.c
stat.c New code for 5.8: 2020-06-02 19:45:12 -07:00
statfs.c
super.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-10 16:09:11 -07:00
sync.c overlayfs update for 5.8 2020-06-09 15:40:50 -07:00
timerfd.c
userfaultfd.c A set of locking fixes and updates: 2020-08-10 19:07:44 -07:00
utimes.c fs: expose utimes_common 2020-07-31 08:16:01 +02:00
xattr.c xattr: add a function to check if a namespace is supported 2020-07-13 17:27:03 -04:00