linux/fs
Christoph Hellwig 43ff2122e6 xfs: on-stack delayed write buffer lists
Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
and write back the buffers per-process instead of by waking up xfsbufd.

This is now easily doable given that we have very few places left that write
delwri buffers:

 - log recovery:
	Only done at mount time, and already forcing out the buffers
	synchronously using xfs_flush_buftarg

 - quotacheck:
	Same story.

 - dquot reclaim:
	Writes out dirty dquots on the LRU under memory pressure.  We might
	want to look into doing more of this via xfsaild, but it's already
	more optimal than the synchronous inode reclaim that writes each
	buffer synchronously.

 - xfsaild:
	This is the main beneficiary of the change.  By keeping a local list
	of buffers to write we reduce latency of writing out buffers, and
	more importably we can remove all the delwri list promotions which
	were hitting the buffer cache hard under sustained metadata loads.

The implementation is very straight forward - xfs_buf_delwri_queue now gets
a new list_head pointer that it adds the delwri buffers to, and all callers
need to eventually submit the list using xfs_buf_delwi_submit or
xfs_buf_delwi_submit_nowait.  Buffers that already are on a delwri list are
skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
list.  The biggest change to pass down the buffer list was done to the AIL
pushing. Now that we operate on buffers the trylock, push and pushbuf log
item methods are merged into a single push routine, which tries to lock the
item, and if possible add the buffer that needs writeback to the buffer list.
This leads to much simpler code than the previous split but requires the
individual IOP_PUSH instances to unlock and reacquire the AIL around calls
to blocking routines.

Given that xfsailds now also handle writing out buffers, the conditions for
log forcing and the sleep times needed some small changes.  The most
important one is that we consider an AIL busy as long we still have buffers
to push, and the other one is that we do increment the pushed LSN for
buffers that are under flushing at this moment, but still count them towards
the stuck items for restart purposes.  Without this we could hammer on stuck
items without ever forcing the log and not make progress under heavy random
delete workloads on fast flash storage devices.

[ Dave Chinner:
	- rebase on previous patches.
	- improved comments for XBF_DELWRI_Q handling
	- fix XBF_ASYNC handling in queue submission (test 106 failure)
	- rename delwri submit function buffer list parameters for clarity
	- xfs_efd_item_push() should return XFS_ITEM_PINNED ]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-05-14 16:20:31 -05:00
..
9p 9p changes for the 3.4 merge window 2012-03-28 09:58:38 -07:00
adfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
affs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
afs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
autofs4 Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
befs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
bfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2012-03-30 12:44:29 -07:00
cachefiles switch touch_atime to struct path 2012-03-20 21:29:41 -04:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2012-03-28 10:01:29 -07:00
cifs Fix UNC parsing on mount 2012-04-03 20:46:09 -05:00
coda Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
configfs make configfs_pin_fs() return root dentry on success 2012-03-20 21:29:48 -04:00
cramfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
debugfs simple_open: automatically convert to simple_open() 2012-04-05 15:25:50 -07:00
devpts Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
dlm simple_open: automatically convert to simple_open() 2012-04-05 15:25:50 -07:00
ecryptfs ecryptfs: make register_filesystem() the last potential failure exit 2012-03-20 21:29:49 -04:00
efs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
exofs Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd 2012-03-28 20:04:27 -07:00
exportfs
ext2 migrate ext2_fs.h guts to fs/ext2/ext2.h 2012-03-31 16:03:16 -04:00
ext3 ext3: move headers to fs/ext3/ 2012-03-31 16:03:16 -04:00
ext4 Revert "ext4: don't release page refs in ext4_end_bio()" 2012-03-29 17:00:56 -07:00
fat fat: fix bug in enforcing Long File Name length 2012-03-23 16:58:40 -07:00
freevxfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
fscache
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
gfs2 get rid of pointless includes of ext2_fs.h 2012-03-31 16:03:15 -04:00
hfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
hfsplus hfsplus: add an ioctl to bless files 2012-03-20 21:29:53 -04:00
hostfs Merge branch 'for-linus-3.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2012-03-27 18:29:53 -07:00
hpfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
hppfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
hugetlbfs hugetlbfs: remove unregister_filesystem() when initializing module 2012-04-05 15:25:50 -07:00
isofs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
jbd Power management updates for 3.4 2012-03-21 10:15:51 -07:00
jbd2 Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
jffs2 MTD merge for 3.4 2012-03-30 17:31:56 -07:00
jfs jfs: mising cleanup on register_filesystem() failure 2012-03-20 21:29:48 -04:00
lockd Merge nfs containerization work from Trond's tree 2012-03-26 11:48:54 -04:00
logfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
minix Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
ncpfs Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
nfs NFS client bugfixes for Linux 3.4 2012-03-28 19:02:35 -07:00
nfs_common
nfsd Merge branch 'for-3.4' of git://linux-nfs.org/~bfields/linux 2012-03-29 14:53:25 -07:00
nilfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
nls
notify fs/notify/notification.c: make subsys_initcall function static 2012-03-23 16:58:31 -07:00
ntfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
ocfs2 get rid of pointless includes of ext2_fs.h 2012-03-31 16:03:15 -04:00
omfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
openpromfs switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
proc Merge branch 'akpm' (Andrew's patch-bomb) 2012-04-05 15:30:34 -07:00
pstore Merge branch 'akpm' (Andrew's patch-bomb) 2012-04-05 15:30:34 -07:00
qnx4 qnx4: new helper - try_extent() 2012-03-20 21:29:52 -04:00
qnx6 fs: initial qnx6fs addition 2012-03-20 21:29:38 -04:00
quota Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-03-28 10:00:14 -07:00
ramfs tidy up after d_make_root() conversion 2012-03-20 21:29:37 -04:00
reiserfs Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
romfs MTD merge for 3.4 2012-03-30 17:31:56 -07:00
squashfs Add an extra mount time sanity check, plus some code cleanups and bug fixes. 2012-03-28 18:05:54 -07:00
sysfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
sysv switch open-coded instances of d_make_root() to new helper 2012-03-20 21:29:35 -04:00
ubifs - Improve error messages 2012-03-23 09:27:40 -07:00
udf Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-03-28 10:00:14 -07:00
ufs Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
xfs xfs: on-stack delayed write buffer lists 2012-05-14 16:20:31 -05:00
aio.c aio: take final put_ioctx() into callers of io_destroy() 2012-03-31 16:03:15 -04:00
anon_inodes.c anon_inodes: move allocation of anon_inode into ->mount() 2012-03-20 21:29:45 -04:00
attr.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
bad_inode.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
binfmt_aout.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
binfmt_elf_fdpic.c Add #includes needed to permit the removal of asm/system.h 2012-03-28 18:30:03 +01:00
binfmt_elf.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
binfmt_em86.c __register_binfmt() made void 2012-03-20 21:29:46 -04:00
binfmt_flat.c Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
binfmt_misc.c magic.h: move some FS magic numbers into magic.h 2012-03-23 16:58:31 -07:00
binfmt_script.c __register_binfmt() made void 2012-03-20 21:29:46 -04:00
binfmt_som.c take removal of PF_FORKNOEXEC to flush_old_exec() 2012-03-20 21:29:51 -04:00
bio-integrity.c fs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:21 +08:00
bio.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
block_dev.c magic.h: move some FS magic numbers into magic.h 2012-03-23 16:58:31 -07:00
buffer.c fs: only send IPI to invalidate LRU BH when needed 2012-03-28 17:14:35 -07:00
char_dev.c char_dev.c: fix up some whitespace errors 2011-12-13 11:18:17 -08:00
compat_binfmt_elf.c
compat_ioctl.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
compat.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
dcache.c vfs: fix d_ancestor() case in d_materialize_unique 2012-03-28 09:54:34 -07:00
dcookies.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
direct-io.c Restore direct_io / truncate locking API 2012-02-23 15:56:21 -08:00
drop_caches.c
eventfd.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
eventpoll.c Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
exec.c Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-04-04 10:04:42 -07:00
fcntl.c Wrap accesses to the fd_sets in struct fdtable 2012-02-19 10:30:52 -08:00
fhandle.c vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
fifo.c
file_table.c vfs: drop_file_write_access() made static 2012-03-20 21:29:32 -04:00
file.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
filesystems.c vfs: convert fs_supers to hlist 2012-01-03 22:52:39 -05:00
fs_struct.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
fs-writeback.c trivial writeback fixes 2012-03-28 10:07:27 -07:00
generic_acl.c
inode.c trim includes in inode.c 2012-03-20 21:29:51 -04:00
internal.h vfs: protect remounting superblock read-only 2012-01-06 23:20:12 -05:00
ioctl.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
ioprio.c block: strip out locking optimization in put_io_context() 2012-02-07 07:51:30 +01:00
Kconfig Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
Kconfig.binfmt fs: binfmt_elf: create Kconfig variable for PIE randomization 2012-01-10 16:30:51 -08:00
libfs.c libfs: add simple_open() 2012-04-05 15:25:50 -07:00
locks.c CIFS: Fix VFS lock usage for oplocked files 2012-04-01 13:54:27 -05:00
Makefile fs: initial qnx6fs addition 2012-03-20 21:29:38 -04:00
mbcache.c
mount.h vfs: keep list of mounts for each superblock 2012-01-06 23:20:12 -05:00
mpage.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
namei.c Make the "word-at-a-time" helper functions more commonly usable 2012-04-06 13:54:56 -07:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
no-block.c
open.c Wrap accesses to the fd_sets in struct fdtable 2012-02-19 10:30:52 -08:00
pipe.c magic.h: move some FS magic numbers into magic.h 2012-03-23 16:58:31 -07:00
pnode.c vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
pnode.h vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
posix_acl.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
proc_namespace.c vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
read_write.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
read_write.h
readdir.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
select.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
seq_file.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
signalfd.c epoll: ep_unregister_pollwait() can use the freed pwq->whead 2012-02-24 11:42:50 -08:00
splice.c tcp: tcp_sendpages() should call tcp_push() once 2012-04-05 19:04:27 -04:00
stack.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
stat.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
statfs.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
super.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
sync.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
timerfd.c
utimes.c
xattr_acl.c fs: reduce the use of module.h wherever possible 2012-02-28 19:31:58 -05:00
xattr.c fs/xattr.c:setxattr(): improve handling of allocation failures 2012-04-05 15:25:50 -07:00