linux/fs/nfs
Jens Axboe fe0f07d08e direct-io: only inc/dec inode->i_dio_count for file systems
do_blockdev_direct_IO() increments and decrements the inode
->i_dio_count for each IO operation. It does this to protect against
truncate of a file. Block devices don't need this sort of protection.

For a capable multiqueue setup, this atomic int is the only shared
state between applications accessing the device for O_DIRECT, and it
presents a scaling wall for that. In my testing, as much as 30% of
system time is spent incrementing and decrementing this value. A mixed
read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with
better latencies too. Before:

clat percentiles (usec):
 |  1.00th=[   33],  5.00th=[   34], 10.00th=[   34], 20.00th=[   34],
 | 30.00th=[   34], 40.00th=[   34], 50.00th=[   35], 60.00th=[   35],
 | 70.00th=[   35], 80.00th=[   35], 90.00th=[   37], 95.00th=[   80],
 | 99.00th=[   98], 99.50th=[  151], 99.90th=[  155], 99.95th=[  155],
 | 99.99th=[  165]

After:

clat percentiles (usec):
 |  1.00th=[   95],  5.00th=[  108], 10.00th=[  129], 20.00th=[  149],
 | 30.00th=[  155], 40.00th=[  161], 50.00th=[  167], 60.00th=[  171],
 | 70.00th=[  177], 80.00th=[  185], 90.00th=[  201], 95.00th=[  270],
 | 99.00th=[  390], 99.50th=[  398], 99.90th=[  418], 99.95th=[  422],
 | 99.99th=[  438]

In other setups, Robert Elliott reported seeing good performance
improvements:

https://lkml.org/lkml/2015/4/3/557

The more applications accessing the device, the worse it gets.

Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells
do_blockdev_direct_IO() that it need not worry about incrementing
or decrementing the inode i_dio_count for this caller.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Elliott, Robert (Server Storage) <elliott@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-24 15:45:28 -04:00
..
blocklayout pnfs: release lseg in pnfs_generic_pg_cleanup 2015-02-03 11:06:44 -08:00
filelayout pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit 2015-02-18 07:20:35 -08:00
flexfilelayout pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit 2015-02-18 07:20:35 -08:00
objlayout nfs: add nfs_pgio_current_mirror helper 2015-02-03 11:06:48 -08:00
cache_lib.c NFS: simplify and clean cache library 2013-02-15 10:43:36 -05:00
cache_lib.h NFS: simplify and clean cache library 2013-02-15 10:43:36 -05:00
callback_proc.c NFSv4.1: Don't set up a backchannel if the server didn't agree to do so 2015-02-18 12:30:47 -08:00
callback_xdr.c NFSv4.1: Convert open-coded array allocation calls to kmalloc_array() 2015-02-11 19:02:52 -05:00
callback.c nfs: don't call blocking operations while !TASK_RUNNING 2015-01-30 20:39:50 -05:00
callback.h NFS: Add in v4.2 callback operation 2013-06-08 16:20:18 -04:00
client.c NFSv4: Fix a race in NFSv4.1 server trunking discovery 2015-03-03 20:42:23 -05:00
delegation.c NFSv4: Ensure we skip delegations that are already being returned 2015-03-02 18:09:15 -05:00
delegation.h NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return 2014-11-12 17:19:04 -05:00
dir.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
direct.c direct-io: only inc/dec inode->i_dio_count for file systems 2015-04-24 15:45:28 -04:00
dns_resolve.c NFS: Enabling v4.2 should not recompile nfsd and lockd 2013-11-19 16:20:40 -05:00
dns_resolve.h
file.c nfs: generic_write_checks() shouldn't be done on swapout... 2015-04-15 15:04:27 -04:00
fscache-index.c NFS: Fabricate fscache server index key correctly 2014-09-25 21:25:18 -04:00
fscache.c nfs: define nfs_inc_fscache_stats and using it as possible 2014-11-24 20:08:47 -05:00
fscache.h NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open() 2013-09-27 18:40:25 +01:00
getroot.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
idmap.c pnfs/flexfiles: Add the FlexFile Layout Driver 2015-02-03 11:06:52 -08:00
inode.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
internal.h NFS: Add attribute update barriers to NFS writebacks 2015-03-01 23:23:06 -05:00
iostat.h nfs: define nfs_inc_fscache_stats and using it as possible 2014-11-24 20:08:47 -05:00
Kconfig pnfs/flexfiles: Add the FlexFile Layout Driver 2015-02-03 11:06:52 -08:00
Makefile pnfs/flexfiles: Add the FlexFile Layout Driver 2015-02-03 11:06:52 -08:00
mount_clnt.c nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it 2013-06-28 15:51:51 -04:00
namespace.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
netns.h pnfs/blocklayout: serialize GETDEVICEINFO calls 2014-11-12 14:22:52 -05:00
nfs2super.c NFS: Convert v2 into a module 2012-07-30 19:06:41 -04:00
nfs2xdr.c nfs: save server READ/WRITE/COMMIT status 2015-02-03 11:06:40 -08:00
nfs3_fs.h nfsv3: introduce nfs3_set_ds_client 2015-02-03 11:06:34 -08:00
nfs3acl.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs3client.c nfs: set hostname when creating nfsv3 ds connection 2015-02-03 11:06:38 -08:00
nfs3proc.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs3super.c nfsv3: introduce nfs3_set_ds_client 2015-02-03 11:06:34 -08:00
nfs3xdr.c NFSv3: Use the readdir fileid as the mounted-on-fileid 2015-03-01 23:23:07 -05:00
nfs4_fs.h Merge branch 'flexfiles' 2015-02-03 16:01:27 -05:00
nfs4client.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs4file.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs4getroot.c NFSv4: Fix security auto-negotiation 2013-09-07 16:18:30 -04:00
nfs4namespace.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs4proc.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs4renewd.c NFSv4.1: Fix an NFSv4.1 state renewal regression 2014-09-30 17:18:42 -04:00
nfs4session.c NFSv4.1: Don't set up a backchannel if the server didn't agree to do so 2015-02-18 12:30:47 -08:00
nfs4session.h NFSv4.1: Clear the old state by our client id before establishing a new lease 2015-03-03 21:52:30 -05:00
nfs4state.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs4super.c Merge branch 'for-3.20/bdi' of git://git.kernel.dk/linux-block 2015-02-12 13:50:21 -08:00
nfs4sysctl.c nfs: convert use of typedef ctl_table to struct ctl_table 2014-06-06 16:08:16 -07:00
nfs4trace.c NFSv4.1: Add tracepoints for debugging slot table operations 2013-08-22 08:58:27 -04:00
nfs4trace.h VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs4xdr.c NFSv4.1: Clean up bind_conn_to_session 2015-02-18 13:11:09 -08:00
nfs42.h nfs: Add DEALLOCATE support 2014-11-25 16:38:32 -05:00
nfs42proc.c nfs: Add DEALLOCATE support 2014-11-25 16:38:32 -05:00
nfs42xdr.c nfs: Add DEALLOCATE support 2014-11-25 16:38:32 -05:00
nfs.h NFS: Convert v4 into a module 2012-07-30 19:06:52 -04:00
nfsroot.c NFS: a couple off by ones 2015-01-30 20:43:30 -05:00
nfstrace.c NFS: Add event tracing for generic NFS lookups 2013-08-22 08:58:18 -04:00
nfstrace.h NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping 2014-01-27 15:35:56 -05:00
pagelist.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
pnfs_dev.c pnfs: remove GETDEVICELIST implementation 2014-09-12 13:20:54 -04:00
pnfs_nfs.c pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit 2015-02-18 07:20:35 -08:00
pnfs.c pnfs: delete an unintended goto 2015-02-10 08:41:23 -05:00
pnfs.h VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
proc.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
read.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
super.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
symlink.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
sysctl.c nfs: convert use of typedef ctl_table to struct ctl_table 2014-06-06 16:08:16 -07:00
unlink.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
write.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00