linux/fs
Omar Sandoval d7eae3403f Btrfs: rework delayed ref total_bytes_pinned accounting
The total_bytes_pinned counter is completely broken when accounting
delayed refs:

- If two drops for the same extent are merged, we will decrement
  total_bytes_pinned twice but only increment it once.
- If an add is merged into a drop or vice versa, we will decrement the
  total_bytes_pinned counter but never increment it.
- If multiple references to an extent are dropped, we will account it
  multiple times, potentially vastly over-estimating the number of bytes
  that will be freed by a commit and doing unnecessary work when we're
  close to ENOSPC.

The last issue is relatively minor, but the first two make the
total_bytes_pinned counter leak or underflow very often. These
accounting issues were introduced in b150a4f10d ("Btrfs: use a percpu
to keep track of possibly pinned bytes"), but they were papered over by
zeroing out the counter on every commit until d288db5dc0 ("Btrfs: fix
race of using total_bytes_pinned").

We need to make sure that an extent is accounted as pinned exactly once
if and only if we will drop references to it when when the transaction
is committed. Ideally we would only add to total_bytes_pinned when the
*last* reference is dropped, but this information isn't readily
available for data extents. Again, this over-estimation can lead to
extra commits when we're close to ENOSPC, but it's not as bad as before.

The fix implemented here is to increment total_bytes_pinned when the
total refmod count for an extent goes negative and decrement it if the
refmod count goes back to non-negative or after we've run all of the
delayed refs for that extent.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-29 20:17:01 +02:00
..
9p 9p: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
adfs
affs fs/affs: add rename exchange 2017-05-05 15:24:52 -04:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
autofs4 Fix dead URLs to ftp.kernel.org 2017-03-28 16:16:52 +02:00
befs befs: make export work with cold dcache 2017-05-05 11:35:35 +01:00
bfs Tigran has moved 2017-05-12 15:57:15 -07:00
btrfs Btrfs: rework delayed ref total_bytes_pinned accounting 2017-06-29 20:17:01 +02:00
cachefiles sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
ceph ceph: unify inode i_ctime update 2017-06-14 19:37:23 +02:00
cifs [CIFS] Minor cleanup of xattr query function 2017-05-12 20:59:10 -05:00
coda coda: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
configfs configfs: Introduce config_item_get_unless_zero() 2017-06-12 13:20:20 +02:00
cramfs
crypto fscrypt: introduce helper function for filename matching 2017-05-04 11:44:37 -04:00
debugfs fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
devpts
dlm net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
ecryptfs ecryptfs: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
efivarfs
efs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
exofs fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
exportfs Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
ext2 dax, xfs, ext4: compile out iomap-dax paths in the FS_DAX=n case 2017-05-13 17:52:16 -07:00
ext4 ext4: fix fdatasync(2) after extent manipulation operations 2017-05-29 13:24:55 -04:00
f2fs crypto: Work around deallocated stack frame reference gcc bug on sparc. 2017-06-08 17:36:03 +08:00
fat fat: fix using uninitialized fields of fat_inode/fsinfo_inode 2017-03-09 17:01:10 -08:00
freevxfs
fscache KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload() 2017-03-02 10:09:00 +11:00
fuse Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2017-05-20 16:12:30 -07:00
gfs2 gfs2: Make flush bios explicitely sync 2017-05-24 13:35:20 +02:00
hfs fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
hfsplus fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
hostfs vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
hpfs sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
hugetlbfs mm: larger stack guard gap, between vmas 2017-06-19 21:50:20 +08:00
isofs sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
jbd2 jbd2: preserve original nofs flag during journal restart 2017-05-21 22:32:23 -04:00
jffs2 jffs2: fix spelling mistake: "requestied" -> "requested" 2017-04-19 11:35:55 -07:00
jfs jfs: Remove jfs_get_inode_flags() 2017-04-19 14:21:23 +02:00
kernfs kernfs: Check KERNFS_HAS_RELEASE before calling kernfs_release_file() 2017-03-17 10:25:59 +09:00
lockd Another RDMA update from Chuck Lever, and a bunch of miscellaneous 2017-05-10 13:29:23 -07:00
minix statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
ncpfs ncpfs: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
nfs NFS client bugfixes for Linux 4.12 2017-06-04 11:56:53 -07:00
nfs_common netns: make struct pernet_operations::id unsigned int 2016-11-18 10:59:15 -05:00
nfsd nfsd4: fix null dereference on replay 2017-05-23 14:20:58 -04:00
nilfs2 fs: Remove SB_I_DYNBDI flag 2017-04-20 12:09:55 -06:00
nls
notify fanotify: don't expose EOPENSTALE to userspace 2017-04-25 15:48:06 +02:00
ntfs ntfs: Use ERR_CAST() to avoid cross-structure cast 2017-05-28 10:11:48 -07:00
ocfs2 ocfs2: Use ERR_CAST() to avoid cross-structure cast 2017-05-28 10:11:49 -07:00
omfs sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
openpromfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
orangefs Some cleanups: 2017-05-05 13:36:10 -07:00
overlayfs ovl: filter trusted xattr for non-admin 2017-05-29 15:15:27 +02:00
proc mm: larger stack guard gap, between vmas 2017-06-19 21:50:20 +08:00
pstore Annotation of module parameters that specify device settings 2017-05-10 19:13:03 -07:00
qnx4
qnx6
quota ext4: fix quota charging for shared xattr blocks 2017-05-24 18:24:07 -04:00
ramfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
reiserfs reiserfs: Make flush bios explicitely sync 2017-05-24 13:35:20 +02:00
romfs romfs: use different way to generate fsid for BLOCK or MTD 2017-01-24 16:26:14 -08:00
squashfs fs/pstore: fs/squashfs: change usage of LZ4 to work with new LZ4 version 2017-02-24 17:46:57 -08:00
sysfs sysfs: be careful of error returns from ops->show() 2017-04-08 17:33:32 +02:00
sysv statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
tracefs fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
ubifs This pull request contains updates for both UBI and UBIFS: 2017-05-13 10:23:12 -07:00
udf udf: use kmap_atomic for memcpy copying 2017-04-24 16:28:02 +02:00
ufs Merge branch 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-06-17 17:30:07 +09:00
xfs Changes since last update: 2017-06-17 17:34:41 +09:00
aio.c Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-03 10:16:38 -08:00
anon_inodes.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
attr.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
bad_inode.c statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
binfmt_aout.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
binfmt_elf_fdpic.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
binfmt_elf.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
binfmt_em86.c
binfmt_flat.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
binfmt_misc.c fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
binfmt_script.c
block_dev.c Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm 2017-05-12 15:43:10 -07:00
buffer.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
char_dev.c chardev: add helper function to register char devs with a struct device 2017-03-21 06:44:32 +01:00
compat_binfmt_elf.c fs/binfmt: Convert obsolete cputime type to nsecs 2017-02-01 09:13:51 +01:00
compat_ioctl.c fs: compat: Remove warning from COMPATIBLE_IOCTL 2017-04-29 17:47:19 -04:00
compat.c fs/compat.c: trim unused includes 2017-04-17 12:52:27 -04:00
coredump.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
dax.c dax: fix race between colliding PMD & PTE entries 2017-06-02 15:07:37 -07:00
dcache.c Hang/soft lockup in d_invalidate with simultaneous calls 2017-06-15 06:52:09 -04:00
dcookies.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
direct-io.c fs: add i_blocksize() 2017-02-27 18:43:46 -08:00
drop_caches.c
eventfd.c sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
eventpoll.c epoll: Add busy poll support to epoll with socket fds. 2017-03-24 20:49:31 -07:00
exec.c x86/arch_prctl: Add ARCH_[GET|SET]_CPUID 2017-03-20 16:10:34 +01:00
fcntl.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
fhandle.c fhandle: move compat syscalls from compat.c 2017-04-17 12:52:26 -04:00
file_table.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
file.c mm, vmalloc: use __GFP_HIGHMEM implicitly 2017-05-08 17:15:13 -07:00
filesystems.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
fs_pin.c
fs_struct.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
fs-writeback.c writeback: fix memory leak in wb_queue_work() 2017-03-13 08:27:34 -06:00
inode.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
internal.h Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
ioctl.c sched/headers: Prepare for the reduction of <linux/sched.h>'s signal API dependency 2017-03-02 08:42:37 +01:00
iomap.c fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
Kconfig block, dax: move "select DAX" from BLOCK to FS_DAX 2017-05-08 10:55:27 -07:00
Kconfig.binfmt
libfs.c fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
locks.c locks: Set FL_CLOSE when removing flock locks on close() 2017-04-21 10:45:01 -04:00
Makefile logfs: remove from tree 2016-12-14 23:48:11 -05:00
mbcache.c mbcache: document that "find" functions only return reusable entries 2016-12-03 15:55:01 -05:00
mount.h fsnotify: Free fsnotify_mark_connector when there is no mark attached 2017-04-10 17:37:35 +02:00
mpage.c fs: add i_blocksize() 2017-02-27 18:43:46 -08:00
namei.c Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-12 11:39:59 -07:00
namespace.c fs: don't forget to put old mntns in mntns_install 2017-06-15 06:53:05 -04:00
no-block.c
nsfs.c ns: allow ns_entries to have custom symlink content 2017-05-08 17:15:12 -07:00
open.c Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-12 11:39:59 -07:00
pipe.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
pnode.c mnt: Tuck mounts under others instead of creating shadow/side mounts. 2017-02-04 00:01:06 +13:00
pnode.h mnt: Tuck mounts under others instead of creating shadow/side mounts. 2017-02-04 00:01:06 +13:00
posix_acl.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
proc_namespace.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
read_write.c fs: pass on flags in compat_writev 2017-06-16 18:40:51 +09:00
readdir.c readdir: move compat syscalls from compat.c 2017-04-17 12:52:24 -04:00
select.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
seq_file.c mm: introduce kv[mz]alloc helpers 2017-05-08 17:15:12 -07:00
signalfd.c mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU 2017-04-18 11:42:36 -07:00
splice.c Merge branch 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:38:06 -07:00
stack.c
stat.c ufs: restore maintaining ->i_blocks 2017-06-09 16:28:01 -04:00
statfs.c statfs: move compat syscalls from compat.c 2017-04-17 12:52:23 -04:00
super.c bdi: Drop 'parent' argument from bdi_register[_va]() 2017-04-20 12:09:55 -06:00
sync.c vfs: use helper for calling f_op->fsync() 2017-02-20 16:51:23 +01:00
timerfd.c timerfd: Only check CAP_WAKE_ALARM when it is needed 2017-03-01 12:53:44 +01:00
userfaultfd.c userfaultfd: shmem: handle coredumping in handle_userfault() 2017-06-17 06:37:05 +09:00
utimes.c utimes: move compat syscalls from compat.c 2017-04-17 12:52:23 -04:00
xattr.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00