linux/fs
Jeff Layton 1c8c601a8c locks: protect most of the file_lock handling with i_lock
Having a global lock that protects all of this code is a clear
scalability problem. Instead of doing that, move most of the code to be
protected by the i_lock instead. The exceptions are the global lists
that the ->fl_link sits on, and the ->fl_block list.

->fl_link is what connects these structures to the global lists, so we
must ensure that we hold the locks protecting those lists when iterating
over or updating them.

Furthermore, sound deadlock detection requires that we hold the
blocked_list state steady while checking for loops. We also must ensure
that the search and update to the list are atomic.

For the checking and insertion side of the blocked_list, push the
acquisition of the global lock into __posix_lock_file and ensure that
checking and update of the blocked_list are done without dropping the
lock in between.
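
As an illustration of the check-and-insert rule above, here is a minimal user-space sketch, not the kernel code. It assumes a simplified model in which each owner waits on at most one other owner; `struct owner`, `blocked_lock`, `would_deadlock()` and `block_on()` are hypothetical names, and a pthread mutex stands in for the global lock. The point is only that the loop check and the insertion happen inside one critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock owner; not the kernel's struct file_lock. */
struct owner {
    struct owner *waits_for;    /* owner we are blocked on, or NULL */
};

/* Plays the role of the global lock that guards the blocked list. */
static pthread_mutex_t blocked_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Caller must hold blocked_lock.
 * Returns 1 if making 'waiter' block on 'blocker' would close a cycle.
 */
static int would_deadlock(const struct owner *waiter,
                          const struct owner *blocker)
{
    const struct owner *o;

    for (o = blocker; o != NULL; o = o->waits_for)
        if (o == waiter)
            return 1;
    return 0;
}

/*
 * Check for a loop and record the new wait edge without dropping the
 * lock in between. Returns 0 on success, -1 if it would deadlock.
 */
static int block_on(struct owner *waiter, struct owner *blocker)
{
    int ret = -1;

    assert(waiter != NULL && blocker != NULL);

    pthread_mutex_lock(&blocked_lock);
    if (!would_deadlock(waiter, blocker)) {
        waiter->waits_for = blocker;
        ret = 0;
    }
    pthread_mutex_unlock(&blocked_lock);
    return ret;
}
```

If the lock were released between the check and the update, two owners could each pass the check and then insert edges that together form a cycle. Build with `-pthread` if the toolchain requires it.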

On the removal side, when waking up blocked lock waiters, take the
global lock before walking the blocked list and dequeue the waiters from
the global list prior to removal from the fl_block list.
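
The removal ordering can be sketched the same way. This is a simplified user-space model under stated assumptions, not the kernel implementation: `struct waiter`, `struct blocker`, `global_lock` and `wake_up_waiters()` are hypothetical names, a flag models membership in the global blocked list, and a singly linked list models the per-lock fl_block list.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical waiter; 'next' models its link on the per-lock wait list. */
struct waiter {
    struct waiter *next;
    int on_global_list;     /* models membership in the global blocked list */
};

/* Hypothetical lock holder with a list of waiters blocked on it. */
struct blocker {
    struct waiter *waiters;
};

/* Plays the role of the global lock. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Take the global lock before walking the wait list. For each waiter,
 * leave the global list first, then unlink from the per-lock list.
 * Returns the number of waiters released.
 */
static int wake_up_waiters(struct blocker *b)
{
    int woken = 0;

    assert(b != NULL);

    pthread_mutex_lock(&global_lock);
    while (b->waiters != NULL) {
        struct waiter *w = b->waiters;

        w->on_global_list = 0;  /* step 1: dequeue from the global list */
        b->waiters = w->next;   /* step 2: remove from the per-lock list */
        w->next = NULL;
        woken++;
    }
    pthread_mutex_unlock(&global_lock);
    return woken;
}
```

Holding the lock across the whole walk means a concurrent deadlock check, which would take the same lock in this model, never observes a waiter that is half removed.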

With this, deadlock detection should be race free while we minimize
excessive file_lock_lock thrashing.

Finally, in order to avoid a lock inversion problem when handling
/proc/locks output we must ensure that manipulations of the fl_block
list are also protected by the file_lock_lock.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:42 +04:00
9p [readdir] convert 9p 2013-06-29 12:56:45 +04:00
adfs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
affs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
afs locks: protect most of the file_lock handling with i_lock 2013-06-29 12:57:42 +04:00
autofs4 [readdir] switch dcache_readdir() users to ->iterate() 2013-06-29 12:46:48 +04:00
befs [readdir] convert befs 2013-06-29 12:56:55 +04:00
bfs [readdir] convert bfs 2013-06-29 12:56:33 +04:00
btrfs btrfs: more open-coded file_inode() 2013-06-29 12:57:24 +04:00
cachefiles lift sb_start_write() out of ->write() 2013-04-09 14:12:56 -04:00
ceph locks: protect most of the file_lock handling with i_lock 2013-06-29 12:57:42 +04:00
cifs locks: protect most of the file_lock handling with i_lock 2013-06-29 12:57:42 +04:00
coda coda: don't bother with find_inode_number() 2013-06-29 12:57:20 +04:00
configfs [readdir] convert configfs 2013-06-29 12:56:30 +04:00
cramfs [readdir] convert f2fs 2013-06-29 12:56:46 +04:00
debugfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
devpts fs: Limit sys_mount to only request filesystem modules (Part 2). 2013-03-07 01:08:55 -08:00
dlm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-05-01 14:08:52 -07:00
ecryptfs ecryptfs: switch ecryptfs_decode_and_decrypt_filename() from dentry to sb 2013-06-29 12:57:25 +04:00
efivarfs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
efs [readdir] convert efs 2013-06-29 12:56:31 +04:00
exofs [readdir] convert exofs 2013-06-29 12:56:34 +04:00
exportfs [readdir] constify ->actor 2013-06-29 12:57:05 +04:00
ext2 [O_TMPFILE] it's still short a few helpers, but infrastructure should be OK now... 2013-06-29 12:57:10 +04:00
ext3 ext3 ->tmpfile() support 2013-06-29 12:57:12 +04:00
ext4 [readdir] convert ext4 2013-06-29 12:56:40 +04:00
f2fs [readdir] convert f2fs 2013-06-29 12:56:46 +04:00
fat Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
freevxfs [readdir] convert freevxfs 2013-06-29 12:56:53 +04:00
fscache fs/fscache/stats.c: fix memory leak 2013-04-29 15:54:27 -07:00
fuse fuse: another open-coded file_inode() 2013-06-29 12:57:24 +04:00
gfs2 locks: protect most of the file_lock handling with i_lock 2013-06-29 12:57:42 +04:00
hfs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
hfsplus Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
hostfs [readdir] convert hostfs 2013-06-29 12:56:59 +04:00
hpfs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
hppfs [readdir] convert procfs 2013-06-29 12:56:32 +04:00
hugetlbfs hugetlbfs: fix mmap failure in unaligned size request 2013-05-07 18:38:27 -07:00
isofs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
jbd Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2013-05-03 09:56:25 -07:00
jbd2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
jffs2 [readdir] convert jffs2 2013-06-29 12:56:47 +04:00
jfs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
lockd locks: protect most of the file_lock handling with i_lock 2013-06-29 12:57:42 +04:00
logfs [readdir] convert logfs 2013-06-29 12:56:43 +04:00
minix minix: bug widening a binary "not" operation 2013-06-29 12:57:35 +04:00
ncpfs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
nfs locks: protect most of the file_lock handling with i_lock 2013-06-29 12:57:42 +04:00
nfs_common nfs_common: Update the translation between nfsv3 acls linux posix acls 2013-02-13 06:15:14 -08:00
nfsd locks: protect most of the file_lock handling with i_lock 2013-06-29 12:57:42 +04:00
nilfs2 [readdir] convert nilfs2 2013-06-29 12:56:36 +04:00
nls
notify fanotify: quit wanking with FASYNC in ->release() 2013-06-29 12:57:23 +04:00
ntfs [readdir] convert ntfs 2013-06-29 12:56:48 +04:00
ocfs2 [readdir] convert ocfs2 2013-06-29 12:57:02 +04:00
omfs [readdir] convert omfs 2013-06-29 12:56:37 +04:00
openpromfs [readdir] convert openpromfs 2013-06-29 12:56:32 +04:00
proc Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
pstore Couple of pstore cleanups 2013-05-09 16:42:10 -07:00
qnx4 [readdir] convert qnx4 2013-06-29 12:56:38 +04:00
qnx6 [readdir] convert qnx6 2013-06-29 12:56:39 +04:00
quota quota: add missing use of dq_data_lock in __dquot_initialize 2013-03-11 22:05:56 +01:00
ramfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
reiserfs reiserfs: switch reiserfs_readdir_dentry to inode 2013-06-29 12:56:51 +04:00
romfs [readdir] convert romfs 2013-06-29 12:56:29 +04:00
squashfs [readdir] convert squashfs 2013-06-29 12:56:28 +04:00
sysfs [readdir] convert sysfs 2013-06-29 12:56:36 +04:00
sysv Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
ubifs [readdir] convert ubifs 2013-06-29 12:56:25 +04:00
udf udf: provide ->tmpfile() 2013-06-29 12:57:12 +04:00
ufs [readdir] simple local unixlike: switch to ->iterate() 2013-06-29 12:46:47 +04:00
xfs [readdir] convert xfs 2013-06-29 12:57:00 +04:00
aio.c constify rw_verify_area() 2013-06-29 12:57:34 +04:00
anon_inodes.c get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero 2013-02-26 02:46:11 -05:00
attr.c
bad_inode.c [readdir] ->readdir() is gone 2013-06-29 12:57:04 +04:00
binfmt_aout.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
binfmt_elf_fdpic.c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2013-05-02 10:16:16 -07:00
binfmt_elf.c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2013-05-02 10:16:16 -07:00
binfmt_em86.c
binfmt_flat.c new helper: read_code() 2013-04-29 15:40:23 -04:00
binfmt_misc.c binfmt_misc: reuse string_unescape_inplace() 2013-04-30 17:04:03 -07:00
binfmt_script.c
binfmt_som.c
bio-integrity.c bio-integrity: Add explicit field for owner of bip_buf 2013-03-23 14:26:34 -07:00
bio.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
block_dev.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
buffer.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c compat.c: LOOP_CLR_FD is taken care of in loop.c itself... 2013-06-29 12:46:44 +04:00
compat.c [readdir] constify ->actor 2013-06-29 12:57:05 +04:00
coredump.c do_coredump(): don't wait for thaw if coredump has already been interrupted 2013-05-04 14:45:54 -04:00
coredump.h
dcache.c Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
dcookies.c consolidate compat lookup_dcookie() 2013-03-03 23:00:23 -05:00
direct-io.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
drop_caches.c
eventfd.c
eventpoll.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-05-01 07:21:43 -07:00
exec.c allow build_open_flags() to return an error 2013-06-29 12:57:09 +04:00
fcntl.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
fhandle.c
file_table.c Replace a bunch of file->dentry->d_inode refs with file_inode() 2013-06-29 12:57:13 +04:00
file.c don't bother with deferred freeing of fdtables 2013-05-01 17:31:42 -04:00
filesystems.c fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
fs_struct.c constify path_get/path_put and fs_struct.c stuff 2013-03-01 23:51:07 -05:00
fs-writeback.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
generic_acl.c
inode.c allow the temp files created by open() to be linked to 2013-06-29 12:57:11 +04:00
internal.h constify rw_verify_area() 2013-06-29 12:57:34 +04:00
ioctl.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
ioprio.c
Kconfig efivarfs: Move to fs/efivarfs 2013-04-17 13:25:09 +01:00
Kconfig.binfmt fs: make binfmt support for #! scripts modular and removable 2013-04-30 17:04:04 -07:00
libfs.c [readdir] switch dcache_readdir() users to ->iterate() 2013-06-29 12:46:48 +04:00
locks.c locks: protect most of the file_lock handling with i_lock 2013-06-29 12:57:42 +04:00
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
mbcache.c
mount.h get rid of full-hash scan on detaching vfsmounts 2013-04-09 14:12:52 -04:00
mpage.c
namei.c Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
namespace.c create_mnt_ns: unidiomatic use of list_add() 2013-05-04 15:18:53 -04:00
no-block.c
open.c [O_TMPFILE] it's still short a few helpers, but infrastructure should be OK now... 2013-06-29 12:57:10 +04:00
pipe.c aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
pnode.c vfs: Fix invalid ida_remove() call 2013-05-31 15:16:33 -04:00
pnode.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
posix_acl.c
proc_namespace.c
read_write.c constify rw_verify_area() 2013-06-29 12:57:34 +04:00
readdir.c [readdir] constify ->actor 2013-06-29 12:57:05 +04:00
select.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
seq_file.c new helper: single_open_size() 2013-04-09 14:13:29 -04:00
signalfd.c switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE 2013-03-03 22:58:46 -05:00
splice.c splice: lift checks from do_splice_from() into callers 2013-06-29 12:57:35 +04:00
stack.c
stat.c switch vfs_getattr() to struct path 2013-02-26 02:46:08 -05:00
statfs.c
super.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
sync.c teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long 2013-03-03 22:46:22 -05:00
timerfd.c compat: restore timerfd settime and gettime compat syscalls 2013-03-02 09:35:13 -05:00
utimes.c
xattr_acl.c
xattr.c