linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-13 15:41:39 +00:00

Waiman Long 232d2d60aa dcache: Translating dentry into pathname without taking rename_lock

When running the AIM7's short workload, Linus' lockref patch eliminated most of the spinlock contention. However, there were still some left:

    8.46%  reaim  [kernel.kallsyms]  [k] _raw_spin_lock
              |--42.21%-- d_path
              |          proc_pid_readlink
              |          SyS_readlinkat
              |          SyS_readlink
              |          system_call
              |          __GI___readlink
              |
              |--40.97%-- sys_getcwd
              |          system_call
              |          __getcwd

The big one here is the rename_lock (seqlock) contention in d_path() and the getcwd system call. This patch will eliminate the need to take the rename_lock while translating dentries into the full pathnames.

The need to take the rename_lock is to make sure that no rename operation can be ongoing while the translation is in progress. However, only one thread can take the rename_lock thus blocking all the other threads that need it even though the translation process won't make any change to the dentries.

This patch will replace the writer's write_seqlock/write_sequnlock sequence of the rename_lock of the callers of the prepend_path() and __dentry_path() functions with the reader's read_seqbegin/read_seqretry sequence within these 2 functions. As a result, the code will have to retry if one or more rename operations had been performed. In addition, RCU read lock will be taken during the translation process to make sure that no dentries will go away. To prevent live-lock from happening, the code will switch back to take the rename_lock if read_seqretry() fails for three times.

To further reduce spinlock contention, this patch does not take the dentry's d_lock when copying the filename from the dentries. Instead, it treats the name pointer and length as unreliable and just copy the string byte-by-byte over until it hits a null byte or the end of string as specified by the length. This should avoid stepping into invalid memory address. The error cases are left to be handled by the sequence number check.

The following code re-factoring are also made:

1. Move prepend('/') into prepend_name() to remove one conditional check.
2. Move the global root check in prepend_path() back to the top of the while loop.

With this patch, the _raw_spin_lock will now account for only 1.2% of the total CPU cycles for the short workload. This patch also has the effect of reducing the effect of running perf on its profile since the perf command itself can be a heavy user of the d_path() function depending on the complexity of the workload.

When taking the perf profile of the high-systime workload, the amount of spinlock contention contributed by running perf without this patch was about 16%. With this patch, the spinlock contention caused by the running of perf will go away and we will have a more accurate perf profile.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2013-09-09 13:44:16 -04:00
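The retry scheme described in the commit message can be illustrated with a small user-space sketch. This is not the kernel's code. Every name in it (`toy_seqlock`, `translate`, `copy_name`, `rename_once`, `rename_always`) is hypothetical, and the `interfere` callback exists only to simulate a rename landing in the middle of a lockless pass. The sketch assumes a sequence counter that writers bump on entry and exit, a reader that samples the counter before and after its work, and a fallback to the exclusive lock after three failed lockless attempts. It is a single-threaded walk-through of the control flow and leaves out RCU and the memory-ordering details a real seqlock needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOCKLESS_TRIES 3   /* the commit message says "three times" */

/* Toy seqlock: even sequence = no writer, odd = write in progress. */
struct toy_seqlock {
    atomic_uint seq;
    atomic_flag locked;        /* stand-in for the spinlock half */
};

static void toy_init(struct toy_seqlock *sl)
{
    atomic_init(&sl->seq, 0u);
    atomic_flag_clear(&sl->locked);
}

static void toy_write_lock(struct toy_seqlock *sl)
{
    while (atomic_flag_test_and_set(&sl->locked))
        ;                                  /* spin until we own the lock */
    atomic_fetch_add(&sl->seq, 1u);        /* sequence becomes odd */
}

static void toy_write_unlock(struct toy_seqlock *sl)
{
    atomic_fetch_add(&sl->seq, 1u);        /* sequence becomes even again */
    atomic_flag_clear(&sl->locked);
}

static unsigned toy_read_begin(struct toy_seqlock *sl)
{
    unsigned s;
    while ((s = atomic_load(&sl->seq)) & 1u)
        ;                                  /* wait out an active writer */
    return s;
}

static bool toy_read_retry(struct toy_seqlock *sl, unsigned start)
{
    return atomic_load(&sl->seq) != start; /* true if a writer intervened */
}

/* Copy byte by byte, stopping at a NUL or after len bytes, whichever comes
 * first. dst must hold len + 1 bytes. Returns the number of bytes copied. */
static size_t copy_name(char *dst, const char *src, size_t len)
{
    size_t n = 0;
    while (n < len && src[n] != '\0') {
        dst[n] = src[n];
        n++;
    }
    dst[n] = '\0';
    return n;
}

/* Hypothetical hook used to simulate a concurrent rename. */
typedef void (*interfere_fn)(struct toy_seqlock *sl, int attempt);

static void rename_once(struct toy_seqlock *sl, int attempt)
{
    if (attempt == 0) {
        toy_write_lock(sl);
        toy_write_unlock(sl);
    }
}

static void rename_always(struct toy_seqlock *sl, int attempt)
{
    (void)attempt;
    toy_write_lock(sl);
    toy_write_unlock(sl);
}

/* Try the copy locklessly; after MAX_LOCKLESS_TRIES failed passes, take the
 * exclusive lock instead. Returns how many lockless passes failed. */
static int translate(struct toy_seqlock *sl, const char *name, size_t len,
                     char *out, interfere_fn interfere)
{
    int failed = 0;

    while (failed < MAX_LOCKLESS_TRIES) {
        unsigned start = toy_read_begin(sl);
        copy_name(out, name, len);
        if (interfere)
            interfere(sl, failed);         /* simulated rename, if any */
        if (!toy_read_retry(sl, start))
            return failed;                 /* result is consistent */
        failed++;                          /* discard the result and retry */
    }

    toy_write_lock(sl);                    /* fallback: exclude renames */
    copy_name(out, name, len);
    toy_write_unlock(sl);
    return failed;
}
```

With no interference, `translate` succeeds on the first lockless pass. With `rename_once` it retries once. With `rename_always` it exhausts the three lockless attempts and completes under the lock, which is the live-lock guard the commit message describes.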
..
9p	Second round of 9p patches for the 3.11 merge window.	2013-07-11 10:21:23 -07:00
adfs	Don't pass inode to ->d_hash() and ->d_compare()	2013-06-29 12:57:36 +04:00
affs	Don't pass inode to ->d_hash() and ->d_compare()	2013-06-29 12:57:36 +04:00
afs	afs: get rid of redundant ->d_name.len checks	2013-09-07 19:54:55 -04:00
autofs4	autofs4 - fix device ioctl mount lookup	2013-09-08 22:07:47 -04:00
befs	[readdir] convert befs	2013-06-29 12:56:55 +04:00
bfs	bfs: iget_locked() doesn't return an ERR_PTR	2013-08-24 12:10:22 -04:00
btrfs	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial	2013-09-06 09:36:28 -07:00
cachefiles	mm: remove lru parameter from __pagevec_lru_add and remove parts of pagevec API	2013-07-03 16:07:31 -07:00
ceph	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client	2013-07-09 12:39:10 -07:00
cifs	direct-io: Handle O_(D)SYNC AIO	2013-09-04 09:23:46 -04:00
coda	helper for reading ->d_count	2013-07-05 18:59:33 +04:00
configfs	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-07-14 11:42:26 -07:00
cramfs	[readdir] convert f2fs	2013-06-29 12:56:46 +04:00
debugfs	debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)	2013-07-31 12:16:31 -04:00
devpts
dlm	dlm: remove signal blocking	2013-08-12 15:22:43 -05:00
ecryptfs	Code cleanups and improved buffer handling during page crypto operations	2013-07-11 10:20:18 -07:00
efivarfs	efivarfs: we can use simple_lookup() now	2013-07-14 17:48:35 +04:00
efs	efs: iget_locked() doesn't return an ERR_PTR()	2013-08-24 12:10:22 -04:00
exofs	Lots of bug fixes, cleanups and optimizations. In the bug fixes	2013-07-02 09:39:34 -07:00
exportfs	exportfs: don't assume that ->iterate() won't feed us too long entries	2013-09-07 19:54:55 -04:00
ext2	[O_TMPFILE] it's still short a few helpers, but infrastructure should be OK now...	2013-06-29 12:57:10 +04:00
ext3	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs	2013-09-06 09:06:02 -07:00
ext4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial	2013-09-06 09:36:28 -07:00
f2fs	f2fs: optimize gc for better performance	2013-09-05 13:50:32 +09:00
fat	fatfs: add FAT_IOCTL_GET_VOLUME_ID	2013-07-09 10:33:25 -07:00
freevxfs	[readdir] convert freevxfs	2013-06-29 12:56:53 +04:00
fscache	FS-Cache: Don't use spin_is_locked() in assertions	2013-06-19 14:16:47 +01:00
fuse	fuse: drop dentry on failed revalidate	2013-09-05 16:23:54 -04:00
gfs2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-09-07 14:36:57 -07:00
hfs	Don't pass inode to ->d_hash() and ->d_compare()	2013-06-29 12:57:36 +04:00
hfsplus	Don't pass inode to ->d_hash() and ->d_compare()	2013-06-29 12:57:36 +04:00
hostfs	[readdir] convert hostfs	2013-06-29 12:56:59 +04:00
hpfs	Merge branch 'hpfs' from Mikulas Patocka	2013-07-04 11:22:55 -07:00
hppfs	clean up scary strncpy(dst, src, strlen(src)) uses	2013-07-03 16:07:41 -07:00
hugetlbfs	cope with potentially long ->d_dname() output for shmem/hugetlb	2013-08-24 12:10:17 -04:00
isofs	isofs: Refuse RW mount of the filesystem instead of making it RO	2013-07-31 22:14:50 +02:00
jbd	jbd: use a single printk for jbd_debug()	2013-08-09 10:49:00 +02:00
jbd2	jbd2: Fix endian mixing problems in the checksumming code	2013-08-28 14:59:58 -04:00
jffs2	[readdir] convert jffs2	2013-06-29 12:56:47 +04:00
jfs	jfs: fix readdir cookie incompatibility with NFSv4	2013-08-15 17:22:29 -05:00
lockd	LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs	2013-08-05 15:03:46 -04:00
logfs	Lots of bug fixes, cleanups and optimizations. In the bug fixes	2013-07-02 09:39:34 -07:00
minix	minix: bug widening a binary "not" operation	2013-06-29 12:57:35 +04:00
ncpfs	ncpfs: fix error return code in ncp_parse_options()	2013-07-09 10:33:25 -07:00
nfs	nfs: use check_submounts_and_drop()	2013-09-05 16:23:52 -04:00
nfs_common
nfsd	nfsd: racy access to ->d_name in nsfd4_encode_path()	2013-09-03 22:50:28 -04:00
nilfs2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-09-05 08:50:26 -07:00
nls
notify	fsnotify: update comments concerning locking scheme	2013-07-09 10:33:20 -07:00
ntfs	Lots of bug fixes, cleanups and optimizations. In the bug fixes	2013-07-02 09:39:34 -07:00
ocfs2	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2013-09-07 14:34:07 -07:00
omfs	[readdir] convert omfs	2013-06-29 12:56:37 +04:00
openpromfs	[readdir] convert openpromfs	2013-06-29 12:56:32 +04:00
proc	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2013-09-07 14:35:32 -07:00
pstore	pstore/ram: (really) fix undefined usage of rounddown_pow_of_two	2013-08-30 15:57:01 -07:00
qnx4	[readdir] convert qnx4	2013-06-29 12:56:38 +04:00
qnx6	[readdir] convert qnx6	2013-06-29 12:56:39 +04:00
quota	quota: provide interface for readding allocated space into reserved space	2013-08-17 09:32:32 -04:00
ramfs
reiserfs	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs	2013-09-06 09:06:02 -07:00
romfs	[readdir] convert romfs	2013-06-29 12:56:29 +04:00
squashfs	[readdir] convert squashfs	2013-06-29 12:56:28 +04:00
sysfs	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-09-07 14:36:57 -07:00
sysv	Don't pass inode to ->d_hash() and ->d_compare()	2013-06-29 12:57:36 +04:00
ubifs	Only a single patch which fixes a message.	2013-07-05 12:08:47 -07:00
udf	udf: Refuse RW mount of the filesystem instead of making it RO	2013-07-31 22:14:51 +02:00
ufs	[readdir] simple local unixlike: switch to ->iterate()	2013-06-29 12:46:47 +04:00
xfs	direct-io: Implement generic deferred AIO completions	2013-09-04 09:23:46 -04:00
aio.c	aio: fix wrong comment in aio_complete()	2013-07-03 16:08:06 -07:00
anon_inodes.c
attr.c
bad_inode.c	[readdir] ->readdir() is gone	2013-06-29 12:57:04 +04:00
binfmt_aout.c	mm: remove free_area_cache	2013-07-10 18:11:34 -07:00
binfmt_elf_fdpic.c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc	2013-05-02 10:16:16 -07:00
binfmt_elf.c	mm: remove free_area_cache	2013-07-10 18:11:34 -07:00
binfmt_em86.c
binfmt_flat.c	new helper: read_code()	2013-04-29 15:40:23 -04:00
binfmt_misc.c	binfmt_misc: reuse string_unescape_inplace()	2013-04-30 17:04:03 -07:00
binfmt_script.c
binfmt_som.c
bio-integrity.c
bio.c	Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup	2013-09-03 18:25:03 -07:00
block_dev.c	direct-io: Handle O_(D)SYNC AIO	2013-09-04 09:23:46 -04:00
buffer.c	mm: vmscan: take page buffers dirty and locked state into account	2013-07-03 16:07:29 -07:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c	compat.c: LOOP_CLR_FD is taken care of in loop.c itself...	2013-06-29 12:46:44 +04:00
compat.c	[readdir] constify ->actor	2013-06-29 12:57:05 +04:00
coredump.c	coredump: '% at the end' shouldn't bypass core_uses_pid logic	2013-07-03 16:08:02 -07:00
coredump.h
dcache.c	dcache: Translating dentry into pathname without taking rename_lock	2013-09-09 13:44:16 -04:00
dcookies.c
direct-io.c	direct-io: Handle O_(D)SYNC AIO	2013-09-04 09:23:46 -04:00
drop_caches.c
eventfd.c
eventpoll.c	switch epoll_ctl() to fdget	2013-09-03 23:04:44 -04:00
exec.c	Fix TLB gather virtual address range invalidation corner cases	2013-08-16 08:52:46 -07:00
fcntl.c	vfs: add missing check for __O_TMPFILE in fcntl_init()	2013-08-05 18:25:32 +04:00
fhandle.c
file_table.c	only regular files with FMODE_WRITE need to be on s_files	2013-09-03 22:50:28 -04:00
file.c	don't bother with deferred freeing of fdtables	2013-05-01 17:31:42 -04:00
filesystems.c
fs_struct.c
fs-writeback.c	mm/writeback: don't check force_wait to handle bdi->work_list	2013-07-09 10:33:22 -07:00
generic_acl.c
inode.c	constify touch_atime()	2013-09-03 22:52:45 -04:00
internal.h	rename user_path_umountat() to user_path_mountpoint_at()	2013-09-08 20:20:21 -04:00
ioctl.c
ioprio.c
Kconfig	efivarfs: Move to fs/efivarfs	2013-04-17 13:25:09 +01:00
Kconfig.binfmt	fs: make binfmt support for #! scripts modular and removable	2013-04-30 17:04:04 -07:00
libfs.c	make simple_lookup() usable for filesystems that set ->s_d_op	2013-07-14 17:43:25 +04:00
locks.c	locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock	2013-07-08 13:36:42 +04:00
Makefile	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-05-01 17:51:54 -07:00
mbcache.c
mount.h	get rid of full-hash scan on detaching vfsmounts	2013-04-09 14:12:52 -04:00
mpage.c
namei.c	introduce kern_path_mountpoint()	2013-09-08 20:20:23 -04:00
namespace.c	rename user_path_umountat() to user_path_mountpoint_at()	2013-09-08 20:20:21 -04:00
no-block.c
open.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2013-09-07 14:35:32 -07:00
pipe.c	aio: don't include aio.h in sched.h	2013-05-07 20:16:25 -07:00
pnode.c	vfs: Fix invalid ida_remove() call	2013-05-31 15:16:33 -04:00
pnode.h	vfs: Don't copy mount bind mounts of /proc/<pid>/ns/mnt between namespaces	2013-08-26 18:42:15 -07:00
posix_acl.c
proc_namespace.c
read_write.c	vfs: export lseek_execute() to modules	2013-07-03 16:23:27 +04:00
readdir.c	[readdir] constify ->actor	2013-06-29 12:57:05 +04:00
select.c	net: rename include/net/ll_poll.h to include/net/busy_poll.h	2013-07-10 17:08:27 -07:00
seq_file.c	seq_file: add seq_list_*_percpu helpers	2013-07-08 13:36:41 +04:00
signalfd.c
splice.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-07-03 09:10:19 -07:00
stack.c
stat.c	quota: provide interface for readding allocated space into reserved space	2013-08-17 09:32:32 -04:00
statfs.c
super.c	prune_super(): sb->s_op is never NULL	2013-09-07 19:54:56 -04:00
sync.c
timerfd.c	timerfd: Add alarm timers	2013-05-29 12:57:34 -07:00
utimes.c
xattr_acl.c
xattr.c