linux/fs/btrfs
Omar Sandoval d7eae3403f Btrfs: rework delayed ref total_bytes_pinned accounting
The total_bytes_pinned counter is completely broken when accounting
delayed refs:

- If two drops for the same extent are merged, we will decrement
  total_bytes_pinned twice but only increment it once.
- If an add is merged into a drop or vice versa, we will decrement the
  total_bytes_pinned counter but never increment it.
- If multiple references to an extent are dropped, we will account it
  multiple times, potentially vastly over-estimating the number of bytes
  that will be freed by a commit and doing unnecessary work when we're
  close to ENOSPC.

The last issue is relatively minor, but the first two make the
total_bytes_pinned counter leak or underflow very often. These
accounting issues were introduced in b150a4f10d ("Btrfs: use a percpu
to keep track of possibly pinned bytes"), but they were papered over by
zeroing out the counter on every commit until d288db5dc0 ("Btrfs: fix
race of using total_bytes_pinned").

We need to make sure that an extent is accounted as pinned exactly once
if and only if we will drop references to it when when the transaction
is committed. Ideally we would only add to total_bytes_pinned when the
*last* reference is dropped, but this information isn't readily
available for data extents. Again, this over-estimation can lead to
extra commits when we're close to ENOSPC, but it's not as bad as before.

The fix implemented here is to increment total_bytes_pinned when the
total refmod count for an extent goes negative and decrement it if the
refmod count goes back to non-negative or after we've run all of the
delayed refs for that extent.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-29 20:17:01 +02:00
..
tests Btrfs: replace tree->mapping with tree->private_data 2017-06-19 18:25:58 +02:00
acl.c posix_acl: Clear SGID bit when setting file permissions 2016-09-22 10:55:32 +02:00
async-thread.c btrfs: fix crash when tracepoint arguments are freed by wq callbacks 2017-01-09 11:24:50 +01:00
async-thread.h btrfs: limit async_work allocation and worker func duration 2016-12-13 11:01:30 -08:00
backref.c btrfs: use GFP_KERNEL in init_ipath 2017-06-19 18:26:02 +02:00
backref.h
btrfs_inode.h Btrfs: fix reported number of inode blocks 2017-04-26 16:27:26 +01:00
check-integrity.c btrfs: sink gfp parameter to btrfs_io_bio_alloc 2017-06-19 18:26:04 +02:00
check-integrity.h btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
compression.c btrfs: pass bytes to btrfs_bio_alloc 2017-06-19 18:26:03 +02:00
compression.h btrfs: reduce arguments for decompress_bio ops 2017-06-19 18:26:00 +02:00
ctree.c btrfs: adjust includes after vmalloc removal 2017-06-19 18:26:02 +02:00
ctree.h btrfs: Check name_len with boundary in verify dir_item 2017-06-21 19:16:04 +02:00
dedupe.h btrfs: expand cow_file_range() to support in-band dedup and subpage-blocksize 2016-07-26 13:52:25 +02:00
delayed-inode.c btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t 2017-04-18 14:07:23 +02:00
delayed-inode.h btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t 2017-04-18 14:07:23 +02:00
delayed-ref.c Btrfs: return old and new total ref mods when adding delayed refs 2017-06-29 20:17:01 +02:00
delayed-ref.h Btrfs: return old and new total ref mods when adding delayed refs 2017-06-29 20:17:01 +02:00
dev-replace.c Btrfs: switch to div64_u64 if with a u64 divisor 2017-04-18 14:07:26 +02:00
dev-replace.h btrfs: constify device path passed to relevant helpers 2017-02-28 14:26:07 +01:00
dir-item.c btrfs: fix validation of XATTR_ITEM dir items 2017-06-29 20:06:11 +02:00
disk-io.c btrfs: move dev stats accounting out of wait_dev_flush 2017-06-21 19:03:39 +02:00
disk-io.h btrfs: btrfs_wait_tree_block_writeback can be void return 2017-06-19 18:26:01 +02:00
export.c btrfs: Check name_len before reading btrfs_get_name 2017-06-21 19:16:04 +02:00
export.h
extent_io.c btrfs: sink gfp parameter to btrfs_io_bio_alloc 2017-06-19 18:26:04 +02:00
extent_io.h btrfs: sink gfp parameter to btrfs_io_bio_alloc 2017-06-19 18:26:04 +02:00
extent_map.c btrfs: convert extent_map.refs from atomic_t to refcount_t 2017-04-18 14:07:23 +02:00
extent_map.h btrfs: convert extent_map.refs from atomic_t to refcount_t 2017-04-18 14:07:23 +02:00
extent-tree.c Btrfs: rework delayed ref total_bytes_pinned accounting 2017-06-29 20:17:01 +02:00
file-item.c Btrfs: change how we iterate bios in endio 2017-06-19 18:25:59 +02:00
file.c Btrfs: fix invalid extent maps due to hole punching 2017-06-21 16:52:45 +02:00
free-space-cache.c btrfs: use clear_page where appropriate 2017-04-18 14:07:26 +02:00
free-space-cache.h btrfs: free-space-cache, clean up unnecessary root arguments 2017-02-17 12:03:56 +01:00
free-space-tree.c Btrfs: use memalloc_nofs and kvzalloc() for free space tree bitmaps 2017-06-19 18:26:01 +02:00
free-space-tree.h Btrfs: implement the free space B-tree 2015-12-17 12:16:47 -08:00
hash.c crypto: Work around deallocated stack frame reference gcc bug on sparc. 2017-06-08 17:36:03 +08:00
hash.h btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
inode-item.c btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
inode-map.c btrfs: all btrfs_delalloc_release_metadata take btrfs_inode 2017-02-28 11:30:07 +01:00
inode-map.h Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-01-15 19:25:02 +01:00
inode.c btrfs: Check name_len with boundary in verify dir_item 2017-06-21 19:16:04 +02:00
ioctl.c btrfs: use GFP_KERNEL in init_ipath 2017-06-19 18:26:02 +02:00
Kconfig
locking.c btrfs: cleanup, remove stray return statements 2016-01-07 14:30:52 +01:00
locking.h
lzo.c btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspace 2017-06-19 18:26:02 +02:00
Makefile Btrfs: add free space tree sanity tests 2015-12-17 12:16:47 -08:00
math.h
ordered-data.c btrfs: convert btrfs_ordered_extent.refs from atomic_t to refcount_t 2017-04-18 14:07:23 +02:00
ordered-data.h btrfs: convert btrfs_ordered_extent.refs from atomic_t to refcount_t 2017-04-18 14:07:23 +02:00
orphan.c
print-tree.c Btrfs: let btrfs_print_leaf print more about block group 2017-06-19 18:26:00 +02:00
print-tree.h btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
props.c btrfs: Verify dir_item in iterate_object_props 2017-06-21 19:16:04 +02:00
props.h
qgroup.c btrfs: add cond_resched to btrfs_qgroup_trace_leaf_items 2017-06-21 15:48:01 +02:00
qgroup.h btrfs: qgroup: Re-arrange tracepoint timing to co-operate with reserved space tracepoint 2017-04-18 14:07:26 +02:00
raid56.c btrfs: sink gfp parameter to btrfs_io_bio_alloc 2017-06-19 18:26:04 +02:00
raid56.h btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
rcu-string.h
reada.c btrfs: remove unused member err from reada_extent 2017-06-19 18:25:59 +02:00
relocation.c Btrfs: replace tree->mapping with tree->private_data 2017-06-19 18:25:58 +02:00
root-tree.c btrfs: Check name_len before in btrfs_del_root_ref 2017-06-21 19:16:04 +02:00
scrub.c btrfs: sink gfp parameter to btrfs_io_bio_alloc 2017-06-19 18:26:04 +02:00
send.c btrfs: Check name_len before read in iterate_dir_item 2017-06-21 19:16:04 +02:00
send.h Btrfs: use linux/sizes.h to represent constants 2016-01-07 14:38:02 +01:00
struct-funcs.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
super.c btrfs: obsolete and remove mount option alloc_start 2017-06-20 14:22:48 +02:00
sysfs.c btrfs: Add quota_override knob into sysfs 2017-06-19 18:25:58 +02:00
sysfs.h btrfs: sysfs: introduce helper for syncing bits with sysfs files 2016-01-21 18:50:40 +01:00
transaction.c btrfs: move fs_info::fs_frozen to the flags 2017-06-20 14:22:42 +02:00
transaction.h btrfs: remove unused qgroup members from btrfs_trans_handle 2017-04-18 14:07:25 +02:00
tree-defrag.c Btrfs: fix locking bugs when defragging leaves 2015-12-18 02:51:32 +00:00
tree-log.c btrfs: Check name_len in btrfs_check_ref_name_override 2017-06-21 19:16:04 +02:00
tree-log.h btrfs: Make btrfs_del_inode_ref take btrfs_inode 2017-02-14 15:50:54 +01:00
ulist.c btrfs: ulist: rename ulist_fini to ulist_release 2017-02-17 12:03:50 +01:00
ulist.h btrfs: ulist: rename ulist_fini to ulist_release 2017-02-17 12:03:50 +01:00
uuid-tree.c btrfs: return the actual error value from from btrfs_uuid_tree_iterate 2016-12-19 18:08:15 +01:00
volumes.c btrfs: preallocate device flush bio 2017-06-21 19:03:38 +02:00
volumes.h btrfs: preallocate device flush bio 2017-06-21 19:03:38 +02:00
xattr.c btrfs: Check name_len with boundary in verify dir_item 2017-06-21 19:16:04 +02:00
xattr.h btrfs: Switch to generic xattr handlers 2016-05-17 19:17:09 -04:00
zlib.c btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspace 2017-06-19 18:26:02 +02:00