linux/fs
Tejun Heo 1ae06819c7 kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers
Sometimes it's necessary to implement a node which wants to delete
nodes including itself.  This isn't straightforward because of kernfs
active reference.  While a file operation is in progress, an active
reference is held and kernfs_remove() waits for all such references to
drain before completing.  For a self-deleting node, this is a deadlock
as kernfs_remove() ends up waiting for an active reference that itself
is sitting on top of.

This currently is worked around in the sysfs layer using
sysfs_schedule_callback() which makes such removals asynchronous.
While it works, it's rather cumbersome and inherently breaks
synchronicity of the operation - the file operation which triggered
the operation may complete before the removal is finished (or even
started) and the removal may fail asynchronously.  If a removal
operation is immmediately followed by another operation which expects
the specific name to be available (e.g. removal followed by rename
onto the same name), there's no way to make the latter operation
reliable.

The thing is there's no inherent reason for this to be asynchrnous.
All that's necessary to do this synchronous is a dedicated operation
which drops its own active ref and deactivates self.  This patch
implements kernfs_remove_self() and its wrappers in sysfs and driver
core.  kernfs_remove_self() is to be called from one of the file
operations, drops the active ref and deactivates using
__kernfs_deactivate_self(), removes the self node, and restores active
ref to the dead node using __kernfs_reactivate_self() so that the ref
is balanced afterwards.  __kernfs_remove() is updated so that it takes
an early exit if the target node is already fully removed so that the
active ref restored by kernfs_remove_self() after removal doesn't
confuse the deactivation path.

This makes implementing self-deleting nodes very easy.  The normal
removal path doesn't even need to be changed to use
kernfs_remove_self() for the self-deleting node.  The method can
invoke kernfs_remove_self() on itself before proceeding the normal
removal path.  kernfs_remove() invoked on the node by the normal
deletion path will simply be ignored.

This will replace sysfs_schedule_callback().  A subtle feature of
sysfs_schedule_callback() is that it collapses multiple invocations -
even if multiple removals are triggered, the removal callback is run
only once.  An equivalent effect can be achieved by testing the return
value of kernfs_remove_self() - only the one which gets %true return
value should proceed with actual deletion.  All other instances of
kernfs_remove_self() will wait till the enclosing kernfs operation
which invoked the winning instance of kernfs_remove_self() finishes
and then return %false.  This trivially makes all users of
kernfs_remove_self() automatically show correct synchronous behavior
even when there are multiple concurrent operations - all "echo 1 >
delete" instances will finish only after the whole operation is
completed by one of the instances.

v2: For !CONFIG_SYSFS, dummy version kernfs_remove_self() was missing
    and sysfs_remove_file_self() had incorrect return type.  Fix it.
    Reported by kbuild test bot.

v3: Updated to use __kernfs_{de|re}activate_self().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-10 14:01:05 -08:00
..
9p consolidate simple ->d_delete() instances 2013-11-15 22:04:17 -05:00
adfs adfs: delayed freeing of sbi 2013-10-24 23:43:27 -04:00
affs remove obsolete references to powertweak 2013-11-27 20:34:32 -08:00
afs Merge branch 'fscache' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into linux-next 2013-10-28 19:36:46 -04:00
autofs4 autofs4: make freeing sbi rcu-delayed 2013-10-24 23:43:27 -04:00
befs befs: split symlink iops in two - for short and long symlinks resp. 2013-10-24 23:34:50 -04:00
bfs truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2013-12-12 15:25:10 -08:00
cachefiles Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
ceph ceph: Avoid data inconsistency due to d-cache aliasing in readpage() 2013-12-13 09:11:38 -08:00
cifs [CIFS] Do not use btrfs refcopy ioctl for SMB2 copy offload 2013-11-25 09:50:31 -06:00
coda coda_revalidate_inode(): switch to passing inode... 2013-11-09 00:16:21 -05:00
configfs configfs: fix race between dentry put and lookup 2013-11-21 16:42:27 -08:00
cramfs cramfs: mark as obsolete 2013-11-13 12:09:12 +09:00
debugfs debugfs: use list_next_entry() in debugfs_remove_recursive() 2013-11-13 12:09:24 +09:00
devpts devpts: plug the memory leak in kill_sb 2013-11-13 12:09:36 +09:00
dlm genetlink: only pass array to genl_register_family_with_ops() 2013-11-19 16:39:05 -05:00
ecryptfs Quiet static checkers by removing unneeded conditionals 2013-11-22 10:58:14 -08:00
efivarfs consolidate simple ->d_delete() instances 2013-11-15 22:04:17 -05:00
efs efs: iget_locked() doesn't return an ERR_PTR() 2013-08-24 12:10:22 -04:00
exofs truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
exportfs exportfs: fix quadratic behavior in filehandle lookup 2013-11-09 00:16:38 -05:00
ext2 ext2: Fix fs corruption in ext2_get_xip_mem() 2013-11-05 11:26:47 +01:00
ext3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2013-11-13 15:25:47 +09:00
ext4 Ext4 updates for 3.13. Mostly bug fixes and cleanups. 2013-11-14 17:19:58 +09:00
f2fs f2fs: issue more large discard command 2013-11-11 09:36:32 +09:00
fat fat: rcu-delay unloading nls and freeing sbi 2013-10-24 23:43:28 -04:00
freevxfs [readdir] convert freevxfs 2013-06-29 12:56:53 +04:00
fscache Merge branch 'for-3.13/core' of git://git.kernel.dk/linux-block 2013-11-14 12:08:14 +09:00
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
gfs2 GFS2: Fix ref count bug relating to atomic_open 2013-11-21 18:47:57 +00:00
hfs fs/hfs/btree.h: remove duplicate defines 2013-11-13 12:09:32 +09:00
hfsplus block: submit_bio_wait() conversions 2013-11-24 16:33:41 -07:00
hostfs consolidate simple ->d_delete() instances 2013-11-15 22:04:17 -05:00
hpfs locks: break delegations on any attribute modification 2013-11-09 00:16:44 -05:00
hppfs clean up scary strncpy(dst, src, strlen(src)) uses 2013-07-03 16:07:41 -07:00
hugetlbfs cope with potentially long ->d_dname() output for shmem/hugetlb 2013-08-24 12:10:17 -04:00
isofs isofs: don't pass dentry to isofs_hash{i,}_common() 2013-10-24 23:34:59 -04:00
jbd jbd: Revert "jbd: remove dependency on __GFP_NOFAIL" 2013-10-31 20:37:15 +01:00
jbd2 jbd2: Fix endian mixing problems in the checksumming code 2013-08-28 14:59:58 -04:00
jffs2 jffs2: do not support the MLC nand 2013-10-27 16:27:07 -07:00
jfs Just a patch to fix an oops in an error path. 2013-10-22 09:01:11 +01:00
kernfs kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers 2014-01-10 14:01:05 -08:00
lockd LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs 2013-08-05 15:03:46 -04:00
logfs block: submit_bio_wait() conversions 2013-11-24 16:33:41 -07:00
minix fs/minix: Drop dependency on H8300 2013-09-16 18:20:25 -07:00
ncpfs ncpfs: rcu-delay unload_nls() and freeing ncp_server 2013-10-24 23:43:28 -04:00
nfs NFS client bugfixes 2013-12-05 13:05:48 -08:00
nfs_common
nfsd nfsd: when reusing an existing repcache entry, unhash it first 2013-12-10 20:34:44 -05:00
nilfs2 nilfs2: fix issue with race condition of competition between segments for dirty blocks 2013-09-30 14:31:02 -07:00
nls
notify fsnotify: update comments concerning locking scheme 2013-07-09 10:33:20 -07:00
ntfs iget/iget5: don't bother with ->i_lock until we find a match 2013-11-09 00:16:31 -05:00
ocfs2 tree-wide: use reinit_completion instead of INIT_COMPLETION 2013-11-15 09:32:21 +09:00
omfs truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
openpromfs [readdir] convert openpromfs 2013-06-29 12:56:32 +04:00
proc procfs: also fix proc_reg_get_unmapped_area() for !MMU case 2013-12-12 18:19:26 -08:00
pstore pstore: Don't allow high traffic options on fragile devices 2013-12-20 13:12:01 -08:00
qnx4 qnx4: i_sb is never NULL 2013-11-09 00:16:32 -05:00
qnx6 [readdir] convert qnx6 2013-06-29 12:56:39 +04:00
quota genetlink: make multicast groups const, prevent abuse 2013-11-19 16:39:06 -05:00
ramfs initmpfs: move rootfs code from fs/ramfs/ to init/ 2013-09-11 15:59:37 -07:00
reiserfs reiserfs: fix race with flush_used_journal_lists and flush_journal_list 2013-09-24 11:24:21 +02:00
romfs [readdir] convert romfs 2013-06-29 12:56:29 +04:00
squashfs Squashfs: fix failure to unlock pages on decompress error 2013-11-24 01:02:50 +00:00
sysfs kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers 2014-01-10 14:01:05 -08:00
sysv sysv: Add forgotten superblock lock init for v7 fs 2013-09-29 22:02:02 -04:00
ubifs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
udf udf: fix for pathetic mount times in case of invalid file system 2013-10-18 22:39:07 +02:00
ufs truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
xfs xfs: abort metadata writeback on permanent errors 2013-12-17 09:40:23 -06:00
aio.c Merge git://git.kvack.org/~bcrl/aio-next 2013-12-22 11:03:49 -08:00
anon_inodes.c ... and kill anon_inode_getfile_private() 2013-11-09 00:16:28 -05:00
attr.c locks: break delegations on any attribute modification 2013-11-09 00:16:44 -05:00
bad_inode.c [readdir] ->readdir() is gone 2013-06-29 12:57:04 +04:00
binfmt_aout.c dump_skip(): dump_seek() replacement taking coredump_params 2013-11-09 00:16:26 -05:00
binfmt_elf_fdpic.c elf{,_fdpic} coredump: get rid of pointless if (siginfo->si_signo) 2013-11-09 00:16:30 -05:00
binfmt_elf.c elf{,_fdpic} coredump: get rid of pointless if (siginfo->si_signo) 2013-11-09 00:16:30 -05:00
binfmt_em86.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
binfmt_som.c
bio-integrity.c Merge branch 'for-3.12/core' of git://git.kernel.dk/linux-block 2013-09-22 15:00:11 -07:00
bio.c bio: fix argument of __bio_add_page() for max_sectors > 0xffff 2013-11-18 12:31:27 -07:00
block_dev.c a trivial writeback fix 2013-09-13 23:06:40 -04:00
buffer.c fs: buffer: move allocation failure loop into the allocator 2013-10-16 21:35:53 -07:00
char_dev.c Merge branch 'for-3.13/core' of git://git.kernel.dk/linux-block 2013-11-14 12:08:14 +09:00
compat_binfmt_elf.c
compat_ioctl.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
compat.c [readdir] constify ->actor 2013-06-29 12:57:05 +04:00
coredump.c dump_emit(): use __kernel_write(), not vfs_write() 2013-11-15 22:04:09 -05:00
coredump.h
dcache.c dcache: allow word-at-a-time name hashing with big-endian CPUs 2013-12-12 10:39:01 -08:00
dcookies.c
direct-io.c direct-io: Use return from cmpxchg to decide of assignment happened 2013-09-09 10:47:42 -07:00
drop_caches.c shrinker: add node awareness 2013-09-10 18:56:31 -04:00
eventfd.c
eventpoll.c epoll: drop EPOLLWAKEUP if PM_SLEEP is disabled 2013-12-03 15:35:52 +01:00
exec.c Merge git://git.infradead.org/users/eparis/audit 2013-11-21 19:18:14 -08:00
fcntl.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
fhandle.c
file_table.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
file.c
filesystems.c
fs_struct.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
fs-writeback.c Merge branch 'akpm' (patches from Andrew Morton) 2013-11-13 15:45:43 +09:00
generic_acl.c
inode.c locks: break delegations on any attribute modification 2013-11-09 00:16:44 -05:00
internal.h get rid of s_files and files_lock 2013-11-09 00:16:20 -05:00
ioctl.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
ioprio.c
Kconfig
Kconfig.binfmt
libfs.c consolidate simple ->d_delete() instances 2013-11-15 22:04:17 -05:00
locks.c locks: missing unlock on error in generic_add_lease() 2013-11-13 07:30:53 -05:00
Makefile sysfs, kernfs: add skeletons for kernfs 2013-11-27 13:28:24 -08:00
mbcache.c fs: convert fs shrinkers to new scan/count API 2013-09-10 18:56:31 -04:00
mount.h RCU'd vfsmounts 2013-11-09 00:16:19 -05:00
mpage.c
namei.c dcache: allow word-at-a-time name hashing with big-endian CPUs 2013-12-12 10:39:01 -08:00
namespace.c sysfs, kernfs: prepare mount path for kernfs 2013-11-29 18:16:08 -08:00
no-block.c
open.c locks: break delegations on any attribute modification 2013-11-09 00:16:44 -05:00
pipe.c vfs: fix subtle use-after-free of pipe_inode_info 2013-12-02 09:44:51 -08:00
pnode.c split __lookup_mnt() in two functions 2013-10-24 23:35:00 -04:00
pnode.h vfs: Don't copy mount bind mounts of /proc/<pid>/ns/mnt between namespaces 2013-08-26 18:42:15 -07:00
posix_acl.c
proc_namespace.c don't bother with vfsmount_lock in mounts_poll() 2013-10-24 23:34:59 -04:00
read_write.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
readdir.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
select.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
seq_file.c seq_file: always clear m->count when we free m->buf 2013-11-18 19:07:53 -08:00
signalfd.c
splice.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
stack.c
stat.c vfs: split out vfs_getattr_nosec 2013-11-09 00:16:31 -05:00
statfs.c vfs: allow O_PATH file descriptors for fstatfs() 2013-10-12 13:12:31 -07:00
super.c get rid of s_files and files_lock 2013-11-09 00:16:20 -05:00
sync.c Merge branch 'akpm' (patches from Andrew Morton) 2013-11-13 15:45:43 +09:00
timerfd.c
utimes.c locks: break delegations on any attribute modification 2013-11-09 00:16:44 -05:00
xattr_acl.c
xattr.c