linux/fs
Pavel Emelyanov cf7b708c8d Make access to task's nsproxy lighter
When someone wants to deal with some other taks's namespaces it has to lock
the task and then to get the desired namespace if the one exists.  This is
slow on read-only paths and may be impossible in some cases.

E.g.  Oleg recently noticed a race between unshare() and the (sent for
review in cgroups) pid namespaces - when the task notifies the parent it
has to know the parent's namespace, but taking the task_lock() is
impossible there - the code is under write locked tasklist lock.

On the other hand switching the namespace on task (daemonize) and releasing
the namespace (after the last task exit) is rather rare operation and we
can sacrifice its speed to solve the issues above.

The access to other task namespaces is proposed to be performed
like this:

     rcu_read_lock();
     nsproxy = task_nsproxy(tsk);
     if (nsproxy != NULL) {
             / *
               * work with the namespaces here
               * e.g. get the reference on one of them
               * /
     } / *
         * NULL task_nsproxy() means that this task is
         * almost dead (zombie)
         * /
     rcu_read_unlock();

This patch has passed the review by Eric and Oleg :) and,
of course, tested.

[clg@fr.ibm.com: fix unshare()]
[ebiederm@xmission.com: Update get_net_ns_by_pid]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:37 -07:00
..
9p 9p: fix bad kconfig cross-dependency 2007-10-17 14:31:07 -05:00
adfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
affs fs: mark nibblemap const 2007-10-17 08:42:47 -07:00
afs KEYS: Make request_key() and co fundamentally asynchronous 2007-10-17 08:42:57 -07:00
autofs pid namespaces: round up the API 2007-10-19 11:53:37 -07:00
autofs4 pid namespaces: round up the API 2007-10-19 11:53:37 -07:00
befs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
bfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
cifs CIFS: ignore mode change if it's just for clearing setuid/setgid bits 2007-10-18 14:37:22 -07:00
coda pid namespaces: round up the API 2007-10-19 11:53:37 -07:00
configfs r/o bind mounts: filesystem helpers for custom 'struct file's 2007-10-17 08:43:04 -07:00
cramfs fs/cramfs/inode.c: replace hardcoded value with preprocessor constant 2007-10-18 14:37:29 -07:00
debugfs docbook: fix filesystems content 2007-10-15 17:56:36 -07:00
devpts devpts: add fsnotify create event 2007-05-08 11:14:59 -07:00
dlm menuconfig: transform NLS and DLM menus 2007-10-17 08:43:00 -07:00
ecryptfs ecryptfs: allow lower fs to interpret ATTR_KILL_S*ID 2007-10-18 14:37:21 -07:00
efs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
exportfs knfsd: exportfs: split out reconnecting a dentry from find_exported_dentry 2007-07-17 10:23:06 -07:00
ext2 ext2 reservations 2007-10-17 08:43:02 -07:00
ext3 JBD: Fix JBD warnings when compiling with CONFIG_JBD_DEBUG 2007-10-19 11:53:35 -07:00
ext4 ext4: lighten up resize transaction requirements 2007-10-17 18:50:04 -04:00
fat Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
freevxfs mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
fuse fuse: add blksize field to fuse_attr 2007-10-18 14:37:31 -07:00
gfs2 fs: correct SuS compliance for open of large file without options 2007-10-17 08:43:01 -07:00
hfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
hfsplus Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
hostfs uml: fix hostfs style 2007-10-16 09:43:07 -07:00
hpfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
hppfs
hugetlbfs r/o bind mounts: filesystem helpers for custom 'struct file's 2007-10-17 08:43:04 -07:00
isofs fs/isofs/namei.c: Remove uninitialized local vars warning 2007-10-17 08:42:58 -07:00
jbd JBD: Fix JBD warnings when compiling with CONFIG_JBD_DEBUG 2007-10-19 11:53:35 -07:00
jbd2 JBD2: debug code cleanup. 2007-10-17 18:49:59 -04:00
jffs2 Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
jfs introduce I_SYNC 2007-10-17 08:43:02 -07:00
lockd NFS/SUNRPC: use transport protocol naming 2007-10-09 17:17:53 -04:00
minix limit minixfs printks on corrupted dir i_size 2007-10-17 08:42:53 -07:00
msdos
ncpfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
nfs NFS: if ATTR_KILL_S*ID bits are set, then skip mode change 2007-10-18 14:37:22 -07:00
nfs_common
nfsd knfsd: only set ATTR_KILL_S*ID if ATTR_MODE isn't being explicitly set 2007-10-18 14:37:22 -07:00
nls sparse pointer use of zero as null 2007-10-18 14:37:31 -07:00
ntfs writeback: fix ntfs with sb_has_dirty_inodes() 2007-10-17 08:43:02 -07:00
ocfs2 Fix f_version type: should be u64 instead of unsigned long 2007-10-17 08:42:53 -07:00
openpromfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
partitions fs/partitions/sun.c endianness annotations 2007-10-14 12:41:51 -07:00
proc Make access to task's nsproxy lighter 2007-10-19 11:53:37 -07:00
qnx4 Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
ramfs Remove valueless definition of hard-selected RAMFS option 2007-10-17 08:42:56 -07:00
reiserfs reiserfs: ignore on disk s_bmap_nr value 2007-10-19 11:53:35 -07:00
romfs fs/romfs/inode.c: trivial improvements 2007-10-17 08:42:47 -07:00
smbfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
sysfs spin_lock_unlocked cleanups 2007-10-17 08:43:01 -07:00
sysv Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
udf fs/udf/balloc.c: mark a variable as uninitialized_var() 2007-10-17 08:43:00 -07:00
ufs ufs: Fix mount check in ufs_fill_super() 2007-10-17 08:42:51 -07:00
vfat
xfs Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 2007-10-17 09:04:11 -07:00
aio.c Remove struct task_struct::io_wait 2007-10-18 14:37:20 -07:00
anon_inodes.c anon-inodes use open coded atomic_inc for the shared inode 2007-10-17 08:43:00 -07:00
attr.c VFS: make notify_change pass ATTR_KILL_S*ID to setattr operations 2007-10-18 14:37:22 -07:00
bad_inode.c sendfile: remove bad_sendfile() from bad_file_ops 2007-07-10 08:04:15 +02:00
binfmt_aout.c core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe 2007-10-17 08:42:50 -07:00
binfmt_elf_fdpic.c pid namespaces: round up the API 2007-10-19 11:53:37 -07:00
binfmt_elf.c pid namespaces: round up the API 2007-10-19 11:53:37 -07:00
binfmt_em86.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_flat.c binfmt_flat: warning fixes 2007-10-17 08:42:54 -07:00
binfmt_misc.c mm: variable length argument support 2007-07-19 10:04:45 -07:00
binfmt_script.c mm: variable length argument support 2007-07-19 10:04:45 -07:00
binfmt_som.c core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe 2007-10-17 08:42:50 -07:00
bio.c bio: make freeing of ->bi_io_vec conditional in bio_free() 2007-10-16 11:03:52 +02:00
block_dev.c Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
buffer.c writeback: remove pages_skipped accounting in __block_write_full_page() 2007-10-17 08:43:02 -07:00
char_dev.c mm: bdi init hooks 2007-10-17 08:42:45 -07:00
compat_ioctl.c sparse pointer use of zero as null 2007-10-18 14:37:31 -07:00
compat.c mm: variable length argument support 2007-07-19 10:04:45 -07:00
dcache.c vfs: use the predefined d_unhashed inline function instead 2007-10-17 08:43:00 -07:00
dcookies.c Remove fs.h from mm.h 2007-07-29 17:09:29 -07:00
direct-io.c remove ZERO_PAGE 2007-10-16 09:42:53 -07:00
dnotify.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
dquot.c quota: send messages via netlink 2007-10-17 08:42:56 -07:00
drop_caches.c invalidate_mapping_pages(): add cond_resched 2007-07-16 09:05:36 -07:00
eventfd.c eventfd use waitqueue lock ... 2007-05-18 13:09:34 -07:00
eventpoll.c sparse pointer use of zero as null 2007-10-18 14:37:31 -07:00
exec.c pid namespaces: use task_pid() to find leader's pid 2007-10-19 11:53:37 -07:00
fcntl.c F_DUPFD_CLOEXEC implementation 2007-10-17 08:43:01 -07:00
fifo.c Detach sched.h from mm.h 2007-05-21 09:18:19 -07:00
file_table.c r/o bind mounts: filesystem helpers for custom 'struct file's 2007-10-17 08:43:04 -07:00
file.c
filesystems.c add filesystem subtype support 2007-05-08 11:15:01 -07:00
fs-writeback.c introduce I_SYNC 2007-10-17 08:43:02 -07:00
generic_acl.c Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check 2007-07-17 12:00:03 -07:00
inode.c introduce I_SYNC 2007-10-17 08:43:02 -07:00
inotify_user.c change inotifyfs magic as the same magic is used for futexfs 2007-10-17 08:43:00 -07:00
inotify.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
internal.h cleanup compat ioctl handling 2007-05-08 11:15:09 -07:00
ioctl.c drop obsolete sys_ioctl export 2007-07-16 09:05:48 -07:00
ioprio.c
Kconfig jbd: config_jbd_debug cannot create /proc entry 2007-10-19 11:53:35 -07:00
Kconfig.binfmt fs: Kill sh dependency for binfmt_flat. 2007-05-21 14:34:00 +09:00
libfs.c make fs/libfs.c:simple_commit_write() static 2007-10-17 08:42:53 -07:00
locks.c Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
Makefile Remove valueless definition of hard-selected RAMFS option 2007-10-17 08:42:56 -07:00
mbcache.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
mpage.c mm: buffered write cleanup 2007-10-16 09:42:54 -07:00
namei.c VFS: allow filesystems to implement atomic open+truncate 2007-10-18 14:37:30 -07:00
namespace.c fs: remove the unused mempages parameter 2007-10-17 08:42:49 -07:00
nfsctl.c nfsctl: use vfs_path_lookup 2007-07-19 10:04:45 -07:00
no-block.c
open.c Implement file posix capabilities 2007-10-17 08:43:07 -07:00
pipe.c sched: affine sync wakeups 2007-10-15 17:00:19 +02:00
pnode.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
pnode.h
posix_acl.c
quota_v1.c
quota_v2.c
quota.c [IA64] Fix build failure in fs/quota.c 2007-07-27 15:40:13 -07:00
read_write.c Cleanup macros for distinguishing mandatory locks 2007-10-09 18:32:46 -04:00
read_write.h
readdir.c ROUND_UP macro cleanup in fs/(select|compat|readdir).c 2007-05-08 11:15:09 -07:00
select.c Use ERESTART_RESTARTBLOCK if poll() is interrupted by a signal 2007-10-17 08:42:53 -07:00
seq_file.c [FS] seq_file: Introduce the seq_open_private() 2007-10-10 16:55:33 -07:00
signalfd.c rename signalfd_siginfo fields 2007-10-17 08:43:01 -07:00
splice.c Implement file posix capabilities 2007-10-17 08:43:07 -07:00
stack.c
stat.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
super.c put declaration of put_filesystem() in fs.h 2007-10-19 11:53:33 -07:00
sync.c Introduce fixed sys_sync_file_range2() syscall, implement on PowerPC and ARM 2007-06-28 11:38:30 -07:00
timerfd.c make timerfd return a u64 and fix the __put_user 2007-07-26 11:35:17 -07:00
utimes.c VFS: check nanoseconds in utimensat 2007-10-17 08:42:52 -07:00
xattr_acl.c
xattr.c Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check 2007-07-17 12:00:03 -07:00