linux/fs
Qu Wenruo 6b7faadd98 btrfs: Ensure we trim ranges across block group boundary
[BUG]
When deleting large files (which cross block group boundary) with
discard mount option, we find some btrfs_discard_extent() calls only
trimmed part of its space, not the whole range:

  btrfs_discard_extent: type=0x1 start=19626196992 len=2144530432 trimmed=1073741824 ratio=50%

type:		bbio->map_type, in above case, it's SINGLE DATA.
start:		Logical address of this trim
len:		Logical length of this trim
trimmed:	Physically trimmed bytes
ratio:		trimmed / len

Thus leaving some unused space not discarded.

[CAUSE]
When discard mount option is specified, after a transaction is fully
committed (super block written to disk), we begin to cleanup pinned
extents in the following call chain:

btrfs_commit_transaction()
|- btrfs_finish_extent_commit()
   |- find_first_extent_bit(unpin, 0, &start, &end, EXTENT_DIRTY);
   |- btrfs_discard_extent()

However, pinned extents are recorded in an extent_io_tree, which can
merge adjacent extent states.

When a large file gets deleted and it has adjacent file extents across
block group boundary, we will get a large merged range like this:

      |<---    BG1    --->|<---      BG2     --->|
      |//////|<--   Range to discard   --->|/////|

To discard that range, we have the following calls:

  btrfs_discard_extent()
  |- btrfs_map_block()
  |  Returned bbio will end at BG1's end. As btrfs_map_block()
  |  never returns result across block group boundary.
  |- btrfs_issuse_discard()
     Issue discard for each stripe.

So we will only discard the range in BG1, not the remaining part in BG2.

Furthermore, this bug is not that reliably observed, for above case, if
there is no other extent in BG2, BG2 will be empty and btrfs will trim
all space of BG2, covering up the bug.

[FIX]
- Allow __btrfs_map_block_for_discard() to modify @length parameter
  btrfs_map_block() uses its @length paramter to notify the caller how
  many bytes are mapped in current call.
  With __btrfs_map_block_for_discard() also modifing the @length,
  btrfs_discard_extent() now understands when to do extra trim.

- Call btrfs_map_block() in a loop until we hit the range end Since we
  now know how many bytes are mapped each time, we can iterate through
  each block group boundary and issue correct trim for each range.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:48 +01:00
..
9p 9p pull request for inclusion in 5.4 2019-09-27 15:10:34 -07:00
adfs Merge branch 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 11:33:22 -07:00
affs fs: affs: Initialize filesystem timestamp ranges 2019-08-30 07:27:18 -07:00
afs afs: Fix race in commit bulk status fetch 2019-11-15 10:28:02 -08:00
autofs autofs: fix a leak in autofs_expire_indirect() 2019-10-25 00:03:11 -04:00
befs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
bfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
btrfs btrfs: Ensure we trim ranges across block group boundary 2019-11-18 17:51:48 +01:00
cachefiles treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
ceph ceph: increment/decrement dio counter on async requests 2019-11-14 18:44:51 +01:00
cifs SMB3: Fix persistent handles reconnect 2019-11-06 21:32:18 -06:00
coda y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
configfs configfs: calculate the depth of parent item 2019-11-06 18:36:01 +01:00
cramfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
crypto fscrypt: require that key be added when setting a v2 encryption policy 2019-08-12 19:18:50 -07:00
debugfs Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-09-28 08:14:15 -07:00
devpts devpts_pty_kill(): don't bother with d_delete() 2019-09-03 09:30:56 -04:00
dlm dlm for 5.3 2019-07-12 17:37:53 -07:00
ecryptfs ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either 2019-11-10 11:57:45 -05:00
efivarfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
efs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
erofs erofs: fix mis-inplace determination related with noio chain 2019-10-01 04:54:45 +08:00
exportfs exportfs_decode_fh(): negative pinned may become positive without the parent locked 2019-11-10 11:56:05 -05:00
ext2 \n 2019-09-21 13:53:34 -07:00
ext4 Merge branch 'entropy' 2019-09-29 19:25:39 -07:00
f2fs f2fs-for-5.4-rc1 2019-09-21 14:26:33 -07:00
fat fat: delete an unnecessary check before brelse() 2019-09-25 17:51:40 -07:00
freevxfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
fscache Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
fuse fuse: redundant get_fuse_inode() calls in fuse_writepages_fill() 2019-10-23 14:26:37 +02:00
gfs2 gfs2: Fix initialisation of args for remount 2019-10-30 12:16:53 +01:00
hfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
hfsplus fs/hfsplus/xattr.c: replace strncpy with memcpy 2019-07-16 19:23:23 -07:00
hostfs This pull request contains the following changes for UML: 2019-05-12 17:52:13 -04:00
hpfs fs: hpfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
hugetlbfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
iomap iomap: move the iomap_dio_rw ->end_io callback into a structure 2019-09-19 15:32:45 -07:00
isofs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
jbd2 jbd2: remove jbd2_journal_inode_add_[write|wait] 2019-09-24 15:54:07 -07:00
jffs2 Merge branch 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-26 11:33:30 -07:00
jfs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
kernfs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
lockd lockd: Make two symbols static 2019-07-03 17:52:09 -04:00
minix fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
nfs NFS: Fix an RCU lock leak in nfs4_refresh_delegation_stateid() 2019-11-01 11:03:56 -04:00
nfs_common treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
nfsd Highlights: 2019-09-27 17:00:27 -07:00
nilfs2 vfs: create a generic checking and prep function for FS_IOC_SETFLAGS 2019-07-01 08:25:34 -07:00
nls treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
notify Highlights: 2019-09-27 17:00:27 -07:00
ntfs ntfs: remove (un)?likely() from IS_ERR() conditions 2019-09-26 10:10:44 -07:00
ocfs2 ocfs2: protect extent tree in ocfs2_prepare_inode_for_write() 2019-11-06 08:47:08 -08:00
omfs fs: omfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
openpromfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
orangefs Orangefs: a fix and a cleanup 2019-09-19 10:21:35 -07:00
overlayfs ovl: filter of trusted xattr results in audit 2019-09-11 16:11:45 +02:00
proc proc/meminfo: fix output alignment 2019-10-19 06:32:32 -04:00
pstore pstore: fs superblock limits 2019-08-30 08:11:25 -07:00
qnx4 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
qnx6 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
quota quota: fix condition for resetting time limit in do_set_dqblk() 2019-07-31 12:04:42 +02:00
ramfs vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API 2019-09-12 21:05:34 -04:00
reiserfs fs/reiserfs/do_balan.c: remove set but not used variable 2019-09-25 17:51:40 -07:00
romfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
squashfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
sysfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
sysv fs: sysv: Initialize filesystem timestamp ranges 2019-08-30 07:27:18 -07:00
tracefs tracing: Do not create tracefs files if tracefs lockdown is in effect 2019-10-12 20:49:07 -04:00
ubifs This pull request contains the following changes for UBI, UBIFS and JFFS2: 2019-09-21 11:10:16 -07:00
udf fs-udf: Delete an unnecessary check before brelse() 2019-09-04 18:19:43 +02:00
ufs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
unicode unicode: make array 'token' static const, makes object smaller 2019-09-17 11:48:24 -04:00
verity fs-verity: support builtin file signatures 2019-08-12 19:33:50 -07:00
xfs xfs: change the seconds fields in xfs_bulkstat to signed 2019-10-15 08:46:07 -07:00
aio.c aio: Fix io_pgetevents() struct __compat_aio_sigset layout 2019-10-21 19:12:19 -04:00
anon_inodes.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
attr.c timestamp_truncate: Replace users of timespec64_trunc 2019-08-30 07:27:17 -07:00
bad_inode.c
binfmt_aout.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_elf_fdpic.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
binfmt_elf.c elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings 2019-10-06 13:53:27 -07:00
binfmt_em86.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_flat.c fs/binfmt_flat.c: remove set but not used variable 'inode' 2019-07-16 19:23:22 -07:00
binfmt_misc.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
binfmt_script.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
block_dev.c Changes for 5.4: 2019-09-18 17:35:20 -07:00
buffer.c for-linus-20190715 2019-07-15 21:20:52 -07:00
char_dev.c chardev: set variable ret to -EBUSY before checking minor range overlap 2019-05-24 20:50:36 +02:00
compat_binfmt_elf.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193 2019-05-30 11:29:21 -07:00
compat_ioctl.c compat_ioctl: pppoe: fix PPPOEIOCSFWD handling 2019-07-30 14:42:13 -07:00
compat.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
coredump.c coredump: split pipe command whitespace before expanding template 2019-08-03 07:02:01 -07:00
d_path.c [PATCH] fix d_absolute_path() interplay with fsmount() 2019-08-30 19:31:09 -04:00
dax.c fs/dax: Fix pmd vs pte conflict detection 2019-10-22 22:53:02 -07:00
dcache.c Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-20 09:15:51 -07:00
dcookies.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
direct-io.c fs/direct-io.c: fix kernel-doc warning 2019-10-14 15:04:01 -07:00
drop_caches.c
eventfd.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
eventpoll.c PM / wakeup: Show wakeup sources stats in sysfs 2019-08-21 00:20:40 +02:00
exec.c sched/membarrier: Fix p->mm->membarrier_state racy load 2019-09-25 17:42:30 +02:00
fcntl.c fs: mark expected switch fall-throughs 2019-04-08 18:21:02 -05:00
fhandle.c fs/handle.c - fix up kerneldoc 2019-08-07 21:51:47 -04:00
file_table.c vfs: Export flush_delayed_fput for use by knfsd. 2019-08-19 11:00:39 -04:00
file.c
filesystems.c
fs_context.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
fs_parser.c vfs: Make fs_parse() handle fs_param_is_fd-type params better 2019-09-12 21:06:14 -04:00
fs_pin.c switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
fs_struct.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
fs_types.c
fs-writeback.c cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead 2019-11-08 13:37:24 -07:00
fsopen.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
inode.c mm,thp: avoid writes to file with THP in pagecache 2019-09-24 15:54:11 -07:00
internal.h Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-20 09:15:51 -07:00
io_uring.c io_uring: ensure registered buffer import returns the IO length 2019-11-13 16:15:14 -07:00
ioctl.c
Kconfig fs-verity for 5.4 2019-09-18 16:59:14 -07:00
Kconfig.binfmt binfmt_flat: make support for old format binaries optional 2019-06-24 09:16:47 +10:00
libfs.c fs/libfs.c: fix kernel-doc warning 2019-10-14 15:04:01 -07:00
locks.c Highlights: 2019-09-27 17:00:27 -07:00
Makefile fs-verity for 5.4 2019-09-18 16:59:14 -07:00
mbcache.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mount.h switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
mpage.c blkcg, writeback: Rename wbc_account_io() to wbc_account_cgroup_owner() 2019-07-10 09:00:57 -06:00
namei.c fs/namei.c: keep track of nd->root refcount status 2019-09-03 09:30:45 -04:00
namespace.c fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry() 2019-10-16 23:15:09 -04:00
no-block.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
nsfs.c vfs: Convert nsfs to use the new mount API 2019-05-25 18:00:06 -04:00
open.c fs: remove unlikely() from WARN_ON() condition 2019-09-26 10:10:30 -07:00
pipe.c vfs: Convert pipe to use the new mount API 2019-05-25 18:00:07 -04:00
pnode.c fs/namespace: fix unprivileged mount propagation 2019-06-17 17:36:09 -04:00
pnode.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 2019-05-30 11:29:53 -07:00
posix_acl.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
proc_namespace.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
read_write.c vfs: fix page locking deadlocks when deduping files 2019-08-16 18:43:24 -07:00
readdir.c filldir[64]: remove WARN_ON_ONCE() for bad directory entries 2019-10-18 18:41:16 -04:00
select.c fs/select.c: use struct_size() in kmalloc() 2019-07-16 19:23:25 -07:00
seq_file.c seq_file: fix problem when seeking mid-record 2019-08-13 16:06:52 -07:00
signalfd.c fs: mark expected switch fall-throughs 2019-04-08 18:21:02 -05:00
splice.c uio: make import_iovec()/compat_import_iovec() return bytes on success 2019-05-31 15:30:03 -06:00
stack.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
stat.c
statfs.c vfs: Fix EOVERFLOW testing in put_compat_statfs64 2019-10-03 14:21:35 -07:00
super.c Merge branch 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-10-10 08:16:44 -07:00
sync.c fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback 2019-05-14 09:47:50 -07:00
timerfd.c timerfd: Prepare for PREEMPT_RT 2019-08-01 20:51:23 +02:00
userfaultfd.c userfaultfd: untag user pointers 2019-09-25 17:51:41 -07:00
utimes.c utimes: Clamp the timestamps before update 2019-08-30 07:27:17 -07:00
xattr.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00