linux/fs/proc
Pavel Emelyanov 040fa02077 clear_refs: sanitize accepted commands declaration
This is the implementation of the soft-dirty bit concept that should
help keep track of changes in user memory, which in turn is very-very
required by the checkpoint-restore project (http://criu.org).

To create a dump of an application(s) we save all the information about
it to files, and the biggest part of such dump is the contents of tasks'
memory.  However, there are usage scenarios where it's not required to
get _all_ the task memory while creating a dump.  For example, when
doing periodical dumps, it's only required to take full memory dump only
at the first step and then take incremental changes of memory.  Another
example is live migration.  We copy all the memory to the destination
node without stopping all tasks, then stop them, check for what pages
has changed, dump it and the rest of the state, then copy it to the
destination node.  This decreases freeze time significantly.

That said, some help from kernel to watch how processes modify the
contents of their memory is required.

The proposal is to track changes with the help of new soft-dirty bit
this way:

1. First do "echo 4 > /proc/$pid/clear_refs".
   At that point kernel clears the soft dirty _and_ the writable bits from all
   ptes of process $pid. From now on every write to any page will result in #pf
   and the subsequent call to pte_mkdirty/pmd_mkdirty, which in turn will set
   the soft dirty flag.

2. Then read the /proc/$pid/pagemap2 and check the soft-dirty bit reported there
   (the 55'th one). If set, the respective pte was written to since last call
   to clear refs.

The soft-dirty bit is the _PAGE_BIT_HIDDEN one.  Although it's used by
kmemcheck, the latter one marks kernel pages with it, while the former
bit is put on user pages so they do not conflict to each other.

This patch:

A new clear-refs type will be added in the next patch, so prepare
code for that.

[akpm@linux-foundation.org: don't assume that sizeof(enum clear_refs_types) == sizeof(int)]
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:25 -07:00
..
array.c kthread: Prevent unpark race which puts threads on the wrong cpu 2013-04-12 14:18:43 +02:00
base.c proc_fill_cache(): clean up, get rid of pointless find_inode_number() use 2013-06-29 12:57:19 +04:00
cmdline.c proc: switch /proc/cmdline to seq_file 2008-10-23 14:29:04 +04:00
consoles.c console: rename acquire/release_console_sem() to console_lock/unlock() 2011-01-26 10:50:06 +10:00
cpuinfo.c proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c 2008-10-23 15:05:11 +04:00
devices.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
fd.c proc_fill_cache(): just make instantiate_t return int 2013-06-29 12:57:18 +04:00
fd.h proc: Move proc_fd() to fs/proc/fd.h 2013-05-01 17:29:39 -04:00
generic.c [readdir] convert procfs 2013-06-29 12:56:32 +04:00
inode.c proc: Split the namespace stuff out into linux/proc_ns.h 2013-05-01 17:29:39 -04:00
internal.h proc_fill_cache(): just make instantiate_t return int 2013-06-29 12:57:18 +04:00
interrupts.c proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c 2008-10-23 15:15:46 +04:00
Kconfig kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT 2011-01-20 17:02:05 -08:00
kcore.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
kmsg.c kmsg: honor dmesg_restrict sysctl on /dev/kmsg 2013-06-12 16:29:44 -07:00
loadavg.c sched, timers: cleanup avenrun users 2009-05-15 15:32:45 +02:00
Makefile mm, vmalloc: move get_vmalloc_info() to vmalloc.c 2013-04-29 15:54:33 -07:00
meminfo.c mm, vmalloc: move get_vmalloc_info() to vmalloc.c 2013-04-29 15:54:33 -07:00
namespaces.c proc_fill_cache(): just make instantiate_t return int 2013-06-29 12:57:18 +04:00
nommu.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
page.c kpageflags: fix wrong KPF_THP on non-huge compound pages 2012-10-09 16:23:00 +09:00
proc_devtree.c proc_devtree: Replace include linux/module.h with linux/export.h 2013-05-04 15:31:01 -04:00
proc_net.c [readdir] convert procfs 2013-06-29 12:56:32 +04:00
proc_sysctl.c Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
proc_tty.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
root.c [readdir] convert procfs 2013-06-29 12:56:32 +04:00
self.c Include missing linux/slab.h inclusions 2013-04-29 15:42:01 -04:00
softirqs.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
stat.c stat: Use size_t for sizes instead of unsigned 2013-02-01 12:32:08 +02:00
task_mmu.c clear_refs: sanitize accepted commands declaration 2013-07-03 16:07:25 -07:00
task_nommu.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
uptime.c Merge branch 'sched/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into cputime-tip 2011-12-19 19:23:15 +01:00
version.c proc: switch /proc/version to seq_file 2008-10-23 14:19:58 +04:00
vmcore.c proc: Supply a function to remove a proc entry by PDE 2013-05-01 17:29:46 -04:00