linux/fs/proc
Calvin Owens bdb4d100af procfs: always expose /proc/<pid>/map_files/ and make it readable
Currently, /proc/<pid>/map_files/ is restricted to CAP_SYS_ADMIN, and is
only exposed if CONFIG_CHECKPOINT_RESTORE is set.

Each mapped file region gets a symlink in /proc/<pid>/map_files/
corresponding to the virtual address range at which it is mapped.  The
symlinks work like the symlinks in /proc/<pid>/fd/, so you can follow them
to the backing file even if that backing file has been unlinked.

Currently, files which are mapped, unlinked, and closed are impossible to
stat() from userspace.  Exposing /proc/<pid>/map_files/ closes this
functionality "hole".

Not being able to stat() such files makes noticing and explicitly
accounting for the space they use on the filesystem impossible.  You can
work around this by summing up the space used by every file in the
filesystem and subtracting that total from what statfs() tells you, but
that obviously isn't great, and it becomes unworkable once your filesystem
becomes large enough.

This patch moves map_files/ out from behind CONFIG_CHECKPOINT_RESTORE, and
adjusts the permissions enforced on it as follows:

* proc_map_files_lookup()
* proc_map_files_readdir()
* map_files_d_revalidate()

	Remove the CAP_SYS_ADMIN restriction, leaving only the current
	restriction requiring PTRACE_MODE_READ. The information made
	available to userspace by these three functions is already
	available in /proc/PID/maps with MODE_READ, so I don't see any
	reason to limit them any further (see below for more detail).

* proc_map_files_follow_link()

	This stub has been added, and requires that the user have
	CAP_SYS_ADMIN in order to follow the links in map_files/,
	since there was concern on LKML both about the potential for
	bypassing permissions on ancestor directories in the path to
	files pointed to, and about what happens with more exotic
	memory mappings created by some drivers (ie dma-buf).

In older versions of this patch, I changed every permission check in
the four functions above to enforce MODE_ATTACH instead of MODE_READ.
This was an oversight on my part, and after revisiting the discussion
it seems that nobody was concerned about anything outside of what is
made possible by ->follow_link(). So in this version, I've left the
checks for PTRACE_MODE_READ as-is.

[akpm@linux-foundation.org: catch up with concurrent proc_pid_follow_link() changes]
Signed-off-by: Calvin Owens <calvinowens@fb.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
..
array.c capabilities: ambient capabilities 2015-09-04 16:54:41 -07:00
base.c procfs: always expose /proc/<pid>/map_files/ and make it readable 2015-09-10 13:29:01 -07:00
cmdline.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
consoles.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
cpuinfo.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
devices.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
fd.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
fd.h proc: Move proc_fd() to fs/proc/fd.h 2013-05-01 17:29:39 -04:00
generic.c proc: Allow creating permanently empty directories that serve as mount points 2015-07-01 10:36:41 -05:00
inode.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
internal.h proc: Allow creating permanently empty directories that serve as mount points 2015-07-01 10:36:41 -05:00
interrupts.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
Kconfig fs, proc: add help for CONFIG_PROC_CHILDREN 2015-07-17 16:39:52 -07:00
kcore.c x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86 2015-07-18 03:42:51 +02:00
kmsg.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
loadavg.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
Makefile proc: Implement /proc/thread-self to point at the directory of the current thread 2014-08-04 10:07:11 -07:00
meminfo.c fs/proc/meminfo.c: include cma info in proc/meminfo 2014-12-18 19:08:10 -08:00
namespaces.c don't pass nameidata to ->follow_link() 2015-05-10 22:20:15 -04:00
nommu.c vfs: add seq_file_path() helper 2015-06-23 18:01:07 -04:00
page.c proc: add cond_resched to /proc/kpage* read/write loop 2015-09-10 13:29:01 -07:00
proc_net.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
proc_sysctl.c sysctl: Allow creating permanently empty directories that serve as mountpoints. 2015-07-01 10:36:39 -05:00
proc_tty.c proc: remove proc_tty_ldisc variable 2014-08-08 15:57:22 -07:00
root.c vfs: Commit to never having exectuables on proc and sysfs. 2015-07-10 10:39:25 -05:00
self.c don't pass nameidata to ->follow_link() 2015-05-10 22:20:15 -04:00
softirqs.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
stat.c genirq: Prevent proc race against freeing of irq descriptors 2014-12-13 13:33:07 +01:00
task_mmu.c mm: introduce idle page tracking 2015-09-10 13:29:01 -07:00
task_nommu.c vfs: add seq_file_path() helper 2015-06-23 18:01:07 -04:00
thread_self.c don't pass nameidata to ->follow_link() 2015-05-10 22:20:15 -04:00
uptime.c cputime: Default implementation of nsecs -> cputime conversion 2014-03-13 15:56:43 +01:00
version.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
vmcore.c vmcore: fix PT_NOTE n_namesz, n_descsz overflow issue 2015-02-17 14:34:52 -08:00