linux/mm
Linus Torvalds bb4aa39676 mm: avoid repeated anon_vma lock/unlock sequences in anon_vma_clone()
In anon_vma_clone() we traverse the vma->anon_vma_chain of the source
vma, locking the anon_vma for each entry.

But they are all going to have the same root entry, which means that
we're locking and unlocking the same lock over and over again.  Which is
expensive in locked operations, but can get _really_ expensive when that
root entry sees any kind of lock contention.

In fact, Tim Chen reports a big performance regression due to this: when
we switched to use a mutex instead of a spinlock, the contention case
gets much worse.

So to alleviate this all, this commit creates a small helper function
(lock_anon_vma_root()) that can be used to take the lock just once
rather than taking and releasing it over and over again.

We still have the same "take the lock and release" it behavior in the
exit path (in unlink_anon_vmas()), but that one is a bit harder to fix
since we're actually freeing the anon_vma entries as we go, and that
will touch the lock too.

Reported-and-tested-by: Tim Chen <tim.c.chen@linux.intel.com>
Tested-by: Hugh Dickins <hughd@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-17 19:20:49 -07:00
..
backing-dev.c backing-dev: Kill set but not used var in bdi_debug_stats_show() 2011-05-20 21:23:37 +02:00
bootmem.c crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn 2011-03-23 19:47:19 -07:00
bounce.c bounce: call flush_dcache_page() after bounce_copy_vec() 2010-09-09 18:57:25 -07:00
cleancache.c mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
compaction.c mm: compaction: abort compaction if too many pages are isolated and caller is asynchronous V2 2011-06-15 20:04:02 -07:00
debug-pagealloc.c
dmapool.c mm/dmapool.c: use TASK_UNINTERRUPTIBLE in dma_pool_alloc() 2011-01-13 17:32:48 -08:00
fadvise.c readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM 2010-03-06 11:26:25 -08:00
failslab.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
filemap_xip.c mm: Convert i_mmap_lock to a mutex 2011-05-25 08:39:18 -07:00
filemap.c more conservative S_NOSEC handling 2011-06-03 18:24:58 -04:00
fremap.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
highmem.c mm,x86: fix kmap_atomic_push vs ioremap_32.c 2010-10-27 18:03:05 -07:00
huge_memory.c mm: remove khugepaged double thp vmstat update with CONFIG_NUMA=n 2011-06-15 20:03:58 -07:00
hugetlb.c mm: fix negative commitlimit when gigantic hugepages are allocated 2011-06-15 20:04:01 -07:00
hwpoison-inject.c Fix common misspellings 2011-03-31 11:26:23 -03:00
init-mm.c mm: convert mm->cpu_vm_cpumask into cpumask_var_t 2011-05-25 08:39:21 -07:00
internal.h mm: nommu: sort mm->mmap list properly 2011-05-25 08:39:05 -07:00
Kconfig mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
Kconfig.debug mm: debug-pagealloc: fix kconfig dependency warning 2011-03-22 17:44:02 -07:00
kmemcheck.c kmemcheck: Fix build errors due to missing slab.h 2010-03-30 22:02:32 +09:00
kmemleak-test.c kmemleak: remove memset by using kzalloc 2011-01-27 18:31:51 +00:00
kmemleak.c kmemleak: Do not return a pointer to an object that kmemleak did not get 2011-05-19 17:35:28 +01:00
ksm.c ksm: fix NULL pointer dereference in scan_get_next_rmap_item() 2011-06-15 20:04:02 -07:00
maccess.c maccess,probe_kernel: Make write/read src const void * 2011-05-25 19:56:23 -04:00
madvise.c thp: khugepaged: make khugepaged aware about madvise 2011-01-13 17:32:47 -08:00
Makefile mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
memblock.c mm/memblock: properly handle overlaps and fix error path 2011-03-22 17:44:09 -07:00
memcontrol.c memcg: avoid percpu cached charge draining at softlimit 2011-06-15 20:04:01 -07:00
memory_hotplug.c mm/memory_hotplug.c: fix building of node hotplug zonelist 2011-06-15 20:04:01 -07:00
memory-failure.c mm/memory-failure.c: fix page isolated count mismatch 2011-06-15 20:04:01 -07:00
memory.c mm: fix wrong kunmap_atomic() pointer 2011-06-15 20:04:00 -07:00
mempolicy.c mm: proc: move show_numa_map() to fs/proc/task_mmu.c 2011-05-25 08:39:34 -07:00
mempool.c
migrate.c migrate: don't account swapcache as shmem 2011-06-16 15:01:24 -07:00
mincore.c thp: mincore transparent hugepage support 2011-01-13 17:32:44 -08:00
mlock.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
mm_init.c
mmap.c mm: get rid of the most spurious find_vma_prev() users 2011-06-16 00:35:09 -07:00
mmu_context.c exit: fix oops in sync_mm_rss 2010-03-24 16:31:21 -07:00
mmu_notifier.c thp: mmu_notifier_test_young 2011-01-13 17:32:46 -08:00
mmzone.c mm: page allocator: adjust the per-cpu counter threshold when memory is low 2011-01-13 17:32:31 -08:00
mprotect.c thp: mprotect: transparent huge page support 2011-01-13 17:32:44 -08:00
mremap.c mm: Convert i_mmap_lock to a mutex 2011-05-25 08:39:18 -07:00
msync.c sanitize vfs_fsync calling conventions 2010-05-21 18:31:21 -04:00
nobootmem.c memblock/nobootmem: remove unneeded code from alloc_bootmem_node_high() 2011-05-25 08:39:31 -07:00
nommu.c nommu: add page alignment to mmap 2011-05-25 08:39:38 -07:00
oom_kill.c oom: replace PF_OOM_ORIGIN with toggling oom_score_adj 2011-05-25 08:39:10 -07:00
page_alloc.c Revert "mm: fail GFP_DMA allocations when ZONE_DMA is not configured" 2011-06-02 06:11:24 +09:00
page_cgroup.c memcg: fix init_page_cgroup nid with sparsemem 2011-06-15 20:04:01 -07:00
page_io.c block: kill off REQ_UNPLUG 2011-03-10 08:52:27 +01:00
page_isolation.c mm: page_isolation: codeclean fix comment and rm unneeded val init 2010-10-26 16:52:11 -07:00
page-writeback.c Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
pagewalk.c pagewalk: only split huge pages when necessary 2011-03-22 17:44:04 -07:00
percpu-km.c percpu: clear memory allocated with the km allocator 2010-10-02 10:28:42 +03:00
percpu-vm.c mm: remove gfp mask from pcpu_get_vm_areas 2011-01-13 17:32:34 -08:00
percpu.c Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2011-05-24 11:53:42 -07:00
pgtable-generic.c mm/pgtable-generic.c: fix CONFIG_SWAP=n build 2011-01-26 10:49:58 +10:00
prio_tree.c sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
quicklist.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
readahead.c readahead: readahead page allocations are OK to fail 2011-05-25 08:39:25 -07:00
rmap.c mm: avoid repeated anon_vma lock/unlock sequences in anon_vma_clone() 2011-06-17 19:20:49 -07:00
shmem.c tmpfs: fix race between truncate and writepage 2011-05-28 16:09:26 -07:00
slab.c SLAB: Record actual last user of freed objects. 2011-06-03 19:33:50 +03:00
slob.c mm: Remove support for kmem_cache_name() 2011-01-23 21:00:05 +02:00
slub.c slub: always align cpu_slab to honor cmpxchg_double requirement 2011-06-03 19:33:49 +03:00
sparse-vmemmap.c tree-wide: fix comment/printk typos 2010-11-01 15:38:34 -04:00
sparse.c Fix common misspellings 2011-03-31 11:26:23 -03:00
swap_state.c block: remove per-queue plugging 2011-03-10 08:52:07 +01:00
swap.c mm: batch activate_page() to reduce lock contention 2011-05-25 08:39:37 -07:00
swapfile.c oom: replace PF_OOM_ORIGIN with toggling oom_score_adj 2011-05-25 08:39:10 -07:00
thrash.c vmscan: implement swap token priority aging 2011-06-15 20:03:59 -07:00
truncate.c mm/fs: add hooks to support cleancache 2011-05-26 10:01:43 -06:00
util.c mm: nommu: sort mm->mmap list properly 2011-05-25 08:39:05 -07:00
vmalloc.c Merge branch 'upstream/tidy-xen-mmu-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen 2011-05-26 19:01:15 -07:00
vmscan.c mm: vmscan: do not use page_count without a page pin 2011-06-15 20:04:02 -07:00
vmstat.c mm, mem-hotplug: update pcp->stat_threshold when memory hotplug occur 2011-05-25 08:39:09 -07:00