linux/kernel
Michal Hocko 1860033237 mm: make PR_SET_THP_DISABLE immediately active
PR_SET_THP_DISABLE has a rather subtle semantic.  It doesn't affect any
existing mapping because it only updated mm->def_flags which is a
template for new mappings.

The mappings created after prctl(PR_SET_THP_DISABLE) have VM_NOHUGEPAGE
flag set.  This can be quite surprising for all those applications which
do not do prctl(); fork() & exec() and want to control their own THP
behavior.

Another usecase when the immediate semantic of the prctl might be useful
is a combination of pre- and post-copy migration of containers with
CRIU.  In this case CRIU populates a part of a memory region with data
that was saved during the pre-copy stage.  Afterwards, the region is
registered with userfaultfd and CRIU expects to get page faults for the
parts of the region that were not yet populated.  However, khugepaged
collapses the pages and the expected page faults do not occur.

In more general case, the prctl(PR_SET_THP_DISABLE) could be used as a
temporary mechanism for enabling/disabling THP process wide.

Implementation wise, a new MMF_DISABLE_THP flag is added.  This flag is
tested when decision whether to use huge pages is taken either during
page fault of at the time of THP collapse.

It should be noted, that the new implementation makes PR_SET_THP_DISABLE
master override to any per-VMA setting, which was not the case
previously.

Fixes: a0715cc226 ("mm, thp: add VM_INIT_DEF_MASK and PRCTL_THP_DISABLE")
Link: http://lkml.kernel.org/r/1496415802-30944-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10 16:32:31 -07:00
..
bpf Merge branch 'work.memdup_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-05 16:05:24 -07:00
cgroup mm, cpuset: always use seqlock when changing task's nodemask 2017-07-06 16:24:34 -07:00
configs config: android-base: disable CONFIG_NFSD and CONFIG_NFS_FS 2017-06-09 11:47:38 +02:00
debug sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
events Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-07-05 12:31:59 -07:00
gcov gcov: support GCC 7.1 2017-05-12 15:57:15 -07:00
irq Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-07-09 10:24:46 -07:00
livepatch livepatch: Fix stacking of patches with respect to RCU 2017-06-20 10:42:19 +02:00
locking Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-07-09 10:47:50 -07:00
power kernel/power/snapshot.c: use linux/set_memory.h 2017-07-06 16:24:30 -07:00
printk Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2017-07-05 11:11:26 -07:00
rcu rcu: Remove RCU CPU stall warnings from Tiny RCU 2017-06-08 18:52:45 -07:00
sched sched/fair: Fix load_balance() affinity redo path 2017-07-05 16:28:48 +02:00
time Merge branch 'timers-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-05 15:34:35 -07:00
trace Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-07-09 10:49:47 -07:00
.gitignore
acct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
async.c async: Adjust system_state checks 2017-05-23 10:01:37 +02:00
audit_fsnotify.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-05-03 11:05:15 -07:00
audit_tree.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-05-03 11:05:15 -07:00
audit_watch.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-05-03 11:05:15 -07:00
audit.c Merge branch 'stable-4.13' of git://git.infradead.org/users/pcmoore/audit 2017-07-05 11:24:05 -07:00
audit.h audit: style fix 2017-06-12 18:07:43 -04:00
auditfilter.c audit: kernel generated netlink traffic should have a portid of 0 2017-05-02 10:16:05 -04:00
auditsc.c Merge branch 'stable-4.13' of git://git.infradead.org/users/pcmoore/audit 2017-07-05 11:24:05 -07:00
backtracetest.c
bounds.c
capability.c capability: export has_capability 2017-01-12 07:01:56 -07:00
compat.c Merge branch 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-06 20:57:13 -07:00
configs.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
context_tracking.c
cpu_pm.c
cpu.c smp/hotplug: Move unparking of percpu threads to the control CPU 2017-07-06 10:55:10 +02:00
crash_core.c ia64: reuse append_elf_note() and final_note() functions 2017-05-08 17:15:11 -07:00
crash_dump.c
cred.c doc: ReSTify credentials.txt 2017-05-18 10:30:19 -06:00
delayacct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
dma.c
elfcore.c
exec_domain.c
exit.c fix waitid(2) breakage 2017-07-08 11:26:39 -04:00
extable.c kernel/extable.c: mark core_kernel_text notrace 2017-07-06 16:24:29 -07:00
fork.c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-07-09 10:52:16 -07:00
freezer.c freezer, oom: check TIF_MEMDIE on the correct task 2016-07-28 16:07:41 -07:00
futex_compat.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
futex.c There has been a fair amount of activity in the docs tree this time 2017-07-03 21:13:25 -07:00
groups.c mm, vmalloc: use __GFP_HIGHMEM implicitly 2017-05-08 17:15:13 -07:00
hung_task.c kernel/hung_task.c: defer showing held locks 2017-05-08 17:15:10 -07:00
irq_work.c
jump_label.c jump_label: Reorder hotplug lock and jump_label_lock 2017-05-26 10:10:45 +02:00
kallsyms.c bpf: make jited programs visible in traces 2017-02-17 13:40:05 -05:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/mutex: Allow MUTEX_SPIN_ON_OWNER when DEBUG_MUTEXES 2016-10-25 11:31:51 +02:00
Kconfig.preempt
kcov.c kcov: simplify interrupt check 2017-05-08 17:15:12 -07:00
kexec_core.c objtool, x86: Add several functions and files to the objtool whitelist 2017-06-30 10:19:19 +02:00
kexec_file.c kimage_file_prepare_segments(): don't open-code memdup_user() 2017-06-30 02:04:10 -04:00
kexec_internal.h kexec, x86/purgatory: Unbreak it and clean it up 2017-03-10 20:55:09 +01:00
kexec.c kexec: allow architectures to override boot mapping 2016-08-02 19:35:27 -04:00
kmod.c sched/headers, vfs/execve: Prepare to move the do_execve*() prototypes from <linux/sched.h> to <linux/binfmts.h> 2017-03-02 08:42:39 +01:00
kprobes.c kprobes: Ensure that jprobe probepoints are at function entry 2017-07-08 11:05:35 +02:00
ksysfs.c crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE 2017-05-08 17:15:11 -07:00
kthread.c cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups 2017-03-17 10:18:47 -04:00
latencytop.c sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h> 2017-03-02 08:42:39 +01:00
Makefile crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE 2017-05-08 17:15:11 -07:00
membarrier.c Fix: Disable sys_membarrier when nohz_full is enabled 2017-01-23 11:32:16 -08:00
memremap.c mm, memory_hotplug: replace for_device by want_memblock in arch_add_memory 2017-07-06 16:24:32 -07:00
module_signing.c
module-internal.h
module.c Merge branch 'akpm' (patches from Andrew) 2017-07-06 22:27:08 -07:00
notifier.c kernel/notifier.c: simplify expression 2017-02-24 17:46:56 -08:00
nsproxy.c perf: Add PERF_RECORD_NAMESPACES to include namespaces related info 2017-03-13 15:57:41 -03:00
padata.c padata: Avoid nested calls to cpus_read_lock() in pcrypt_init_padata() 2017-05-26 10:10:37 +02:00
panic.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
params.c boot/param: Move next_arg() function to lib/cmdline.c for later reuse 2017-04-18 10:37:13 +02:00
pid_namespace.c pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes 2017-05-13 17:26:01 -05:00
pid.c mm: update callers to use HASH_ZERO flag 2017-07-06 16:24:33 -07:00
profile.c sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h> 2017-03-02 08:42:39 +01:00
ptrace.c ptrace: Properly initialize ptracer_cred on fork 2017-05-23 07:40:44 -05:00
range.c
reboot.c
relay.c Merge branch 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:38:06 -07:00
resource.c
seccomp.c seccomp: Switch from atomic_t to recount_t 2017-06-26 09:24:00 -07:00
signal.c Merge branch 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-06 20:57:13 -07:00
smp.c smp, cpumask: Use non-atomic cpumask_{set,clear}_cpu() 2017-05-23 10:01:32 +02:00
smpboot.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
smpboot.h
softirq.c sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags() 2017-04-11 09:06:32 +02:00
stacktrace.c stacktrace/x86: add function for detecting reliable stack traces 2017-03-08 09:18:02 +01:00
stop_machine.c stop_machine: Provide stop_machine_cpuslocked() 2017-05-26 10:10:36 +02:00
sys_ni.c move aio compat to fs/aio.c 2016-12-22 22:58:37 -05:00
sys.c mm: make PR_SET_THP_DISABLE immediately active 2017-07-10 16:32:31 -07:00
sysctl_binary.c sysctl: switch to use uuid_t 2017-06-05 16:59:15 +02:00
sysctl.c proc/sysctl: fix the int overflow for jiffies conversion 2017-05-08 17:15:10 -07:00
task_work.c task_work: use READ_ONCE/lockless_dereference, avoid pi_lock if !task_works 2016-08-02 19:35:02 -04:00
taskstats.c taskstats: add e/u/stime for TGID command 2017-05-08 17:15:12 -07:00
test_kprobes.c
torture.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
tracepoint.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
tsacct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
ucount.c ucount: Remove the atomicity from ucount->count 2017-03-06 15:26:37 -06:00
uid16.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
up.c smp: Add function to execute a function synchronously on a CPU 2016-09-05 13:52:39 +02:00
user_namespace.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
user-return-notifier.c
user.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h> 2017-03-02 08:42:29 +01:00
utsname_sysctl.c sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> 2017-03-03 01:45:36 +01:00
utsname.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
watchdog_hld.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
watchdog.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
workqueue_internal.h
workqueue.c sched/wait: Rename wait_queue_t => wait_queue_entry_t 2017-06-20 12:18:27 +02:00