linux/kernel
Willy Tarreau 759c01142a pipe: limit the per-user amount of pages allocated in pipes
On no-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that are never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limit are controlled by two new sysctls : pipe-user-pages-soft, and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (eg: for splicing).

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-19 19:25:21 -05:00
..
bpf net: bpf: reject invalid shifts 2016-01-12 17:06:53 -05:00
configs
debug module: use a structure to encapsulate layout. 2015-12-04 22:46:25 +01:00
events mm, shmem: add internal shmem resident memory accounting 2016-01-14 16:00:49 -08:00
gcov gcov: use within_module() helper. 2015-12-04 22:46:25 +01:00
irq Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-01-11 18:28:06 -08:00
livepatch livepatch: Cleanup module page permission changes 2015-12-04 22:51:07 +01:00
locking Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-01-11 14:18:38 -08:00
power PM / sleep: Add support for read-only sysfs attributes 2016-01-04 22:28:59 +01:00
printk printk: prevent userland from spoofing kernel messages 2015-11-06 17:50:42 -08:00
rcu Merge branches 'doc.2015.12.05a', 'exp.2015.12.07a', 'fixes.2015.12.07a', 'list.2015.12.04b' and 'torture.2015.12.05a' into HEAD 2015-12-07 17:02:54 -08:00
sched vmstat: make vmstat_updater deferrable again and shut down on idle 2016-01-14 16:00:49 -08:00
time Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2016-01-11 18:53:13 -08:00
trace Not much new with tracing for this release. Mostly just clean ups and 2016-01-12 20:04:15 -08:00
.gitignore certs: add .gitignore to stop git nagging about x509_certificate_list 2015-10-21 15:18:35 +01:00
acct.c
async.c
audit_fsnotify.c audit: clean simple fsnotify implementation 2015-08-06 16:14:53 -04:00
audit_tree.c audit: audit_tree_match can be boolean 2015-11-04 08:23:51 -05:00
audit_watch.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-09-08 13:34:59 -07:00
audit.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
audit.h audit: audit_tree_match can be boolean 2015-11-04 08:23:51 -05:00
auditfilter.c audit: fix comment block whitespace 2015-11-04 08:23:51 -05:00
auditsc.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-09-08 13:34:59 -07:00
backtracetest.c
bounds.c
capability.c
cgroup_freezer.c cgroup: kill cgrp_ss_priv[CGROUP_CANFORK_COUNT] and friends 2015-12-03 10:24:08 -05:00
cgroup_pids.c cgroup_pids: fix a typo. 2015-12-14 14:54:37 -05:00
cgroup.c Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2016-01-12 19:20:32 -08:00
compat.c
configs.c
context_tracking.c context_tracking: Switch to new static_branch API 2015-11-24 09:56:43 +01:00
cpu_pm.c kernel/cpu_pm: fix cpu_cluster_pm_exit comment 2015-09-03 02:42:20 +02:00
cpu.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 18:03:50 -08:00
cpuset.c Merge branch 'for-4.4-fixes' into for-4.5 2015-12-03 10:22:52 -05:00
crash_dump.c
cred.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
delayacct.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
dma.c
elfcore.c
exec_domain.c
exit.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 18:03:50 -08:00
extable.c kernel/extable.c: remove duplicated include 2015-09-10 13:29:01 -07:00
fork.c mm: rework virtual memory accounting 2016-01-14 16:00:49 -08:00
freezer.c
futex_compat.c
futex.c futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op 2015-12-20 12:43:25 +01:00
groups.c
hung_task.c
irq_work.c treewide: Remove old email address 2015-11-23 09:44:58 +01:00
jump_label.c treewide: Remove old email address 2015-11-23 09:44:58 +01:00
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec_core.c kexec: Fix race between panic() and crash_kexec() 2015-12-19 11:07:01 +01:00
kexec_file.c kexec: use file name as the output message prefix 2015-11-06 17:50:42 -08:00
kexec_internal.h kexec: split kexec_file syscall code to kexec_file.c 2015-09-10 13:29:01 -07:00
kexec.c kexec: use file name as the output message prefix 2015-11-06 17:50:42 -08:00
kmod.c kmod: don't run async usermode helper as a child of kworker thread 2015-10-23 17:55:10 +09:00
kprobes.c perf/x86/hw_breakpoints: Disallow kernel breakpoints unless kprobe-safe 2015-08-04 10:16:54 +02:00
ksysfs.c rcu: Remove TINY_RCU bloat from pointless boot parameters 2015-12-07 16:59:37 -08:00
kthread.c kernel/kthread.c:kthread_create_on_node(): clarify documentation 2015-09-04 16:54:41 -07:00
latencytop.c
Makefile sys_membarrier(): system-wide memory barrier (generic, x86) 2015-09-11 15:21:34 -07:00
membarrier.c sys_membarrier(): system-wide memory barrier (generic, x86) 2015-09-11 15:21:34 -07:00
memremap.c libnvdimm for 4.4: 2015-11-10 12:07:22 -08:00
module_signing.c KEYS: Merge the type-specific data with the payload data 2015-10-21 15:18:36 +01:00
module-internal.h
module.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2016-01-14 16:38:02 -08:00
notifier.c Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-09-01 08:40:25 -07:00
nsproxy.c
padata.c
panic.c kexec: Fix race between panic() and crash_kexec() 2015-12-19 11:07:01 +01:00
params.c Nothing exciting, minor tweaks and cleanups. 2015-11-09 15:53:39 -08:00
pid_namespace.c
pid.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
profile.c mm: rename alloc_pages_exact_node() to __alloc_pages_node() 2015-09-08 15:35:28 -07:00
ptrace.c seccomp, ptrace: add support for dumping seccomp filters 2015-10-27 19:55:13 -07:00
range.c
reboot.c kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
relay.c kernel/relay.c: use kvfree() in relay_free_page_array() 2015-06-30 19:44:59 -07:00
resource.c restrict /dev/mem to idle io memory ranges 2016-01-09 06:30:49 -08:00
seccomp.c seccomp, ptrace: add support for dumping seccomp filters 2015-10-27 19:55:13 -07:00
signal.c kernel/signal.c: unexport sigsuspend() 2015-11-20 16:17:32 -08:00
smp.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
smpboot.c stop_machine: Kill smp_hotplug_thread->pre_unpark, introduce stop_machine_unpark() 2015-10-20 10:23:55 +02:00
smpboot.h
softirq.c
stacktrace.c
stop_machine.c Merge branch 'sched/urgent' into sched/core, to pick up fixes before merging new patches 2016-01-06 11:02:29 +01:00
sys_ni.c vfs: add copy_file_range syscall and vfs helper 2015-12-01 14:00:53 -05:00
sys.c pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode 2015-11-06 17:50:42 -08:00
sysctl_binary.c
sysctl.c pipe: limit the per-user amount of pages allocated in pipes 2016-01-19 19:25:21 -05:00
task_work.c task_work: remove fifo ordering guarantee 2015-09-05 13:46:58 -07:00
taskstats.c
test_kprobes.c
torture.c torture: Consolidate cond_resched_rcu_qs() into stutter_wait() 2015-10-06 11:25:01 -07:00
tracepoint.c tracepoint: Give priority to probes of tracepoints 2015-10-25 21:33:54 -04:00
tsacct.c
uid16.c
up.c
user_namespace.c kernel/*: switch to memdup_user_nul() 2016-01-04 10:27:55 -05:00
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog.c Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2016-01-11 18:53:13 -08:00
workqueue_internal.h
workqueue.c workqueue: simplify the apply_workqueue_attrs_locked() 2016-01-07 11:04:34 -05:00