linux/mm
Vladimir Davydov 832f37f5d5 slub: never fail to shrink cache
SLUB's version of __kmem_cache_shrink() not only removes empty slabs, but
also tries to rearrange the partial lists to place slabs filled up most to
the head to cope with fragmentation.  To achieve that, it allocates a
temporary array of lists used to sort slabs by the number of objects in
use.  If the allocation fails, the whole procedure is aborted.

This is unacceptable for the kernel memory accounting extension of the
memory cgroup, where we want to make sure that kmem_cache_shrink()
successfully discarded empty slabs.  Although the allocation failure is
utterly unlikely with the current page allocator implementation, which
retries GFP_KERNEL allocations of order <= 2 infinitely, it is better not
to rely on that.

This patch therefore makes __kmem_cache_shrink() allocate the array on
stack instead of calling kmalloc, which may fail.  The array size is
chosen to be equal to 32, because most SLUB caches store not more than 32
objects per slab page.  Slab pages with <= 32 free objects are sorted
using the array by the number of objects in use and promoted to the head
of the partial list, while slab pages with > 32 free objects are left in
the end of the list without any ordering imposed on them.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:10 -08:00
..
backing-dev.c Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block 2014-10-18 11:53:51 -07:00
balloon_compaction.c mm/balloon_compaction: fix deflation when compaction is disabled 2014-10-29 16:33:15 -07:00
bootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
cleancache.c mm: fix cleancache debugfs directory path 2015-01-20 14:08:31 +01:00
cma.c mm: cma: fix totalcma_pages to include DT defined CMA regions 2015-02-11 17:06:03 -08:00
compaction.c mm/compaction: add tracepoint to observe behaviour of compaction defer 2015-02-11 17:06:04 -08:00
debug-pagealloc.c mm/debug-pagealloc: make debug-pagealloc boottime configurable 2014-12-13 12:42:48 -08:00
debug.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
dmapool.c mm/dmapool.c: fixed a brace coding style issue 2014-10-09 22:26:00 -04:00
early_ioremap.c mm: create generic early_ioremap() support 2014-04-07 16:36:15 -07:00
fadvise.c mm: fadvise: document the fadvise(FADV_DONTNEED) behaviour for partial pages 2014-12-13 12:42:49 -08:00
failslab.c
filemap_xip.c mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub 2015-02-10 14:30:30 -08:00
filemap.c mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub 2015-02-10 14:30:30 -08:00
frontswap.c mm/frontswap.c: fix the condition in BUG_ON 2014-12-10 17:41:08 -08:00
gup.c mm: convert p[te|md]_numa users to p[te|md]_protnone_numa 2015-02-12 18:54:08 -08:00
highmem.c mm/highmem: make kmap cache coloring aware 2014-08-06 18:01:22 -07:00
huge_memory.c mm: numa: avoid unnecessary TLB flushes when setting NUMA hinting entries 2015-02-12 18:54:08 -08:00
hugetlb_cgroup.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
hugetlb.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
hwpoison-inject.c mm/hwpoison-inject.c: remove unnecessary null test before debugfs_remove_recursive 2014-08-06 18:01:19 -07:00
init-mm.c
internal.h mm: reduce try_to_compact_pages parameters 2015-02-11 17:06:02 -08:00
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
iov_iter.c copy_from_iter_nocache() 2014-12-08 20:25:23 -05:00
Kconfig rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
Kconfig.debug mm/debug_pagealloc: remove obsolete Kconfig options 2015-01-08 15:10:52 -08:00
kmemcheck.c mm/slab_common: move kmem_cache definition to internal header 2014-10-09 22:25:50 -04:00
kmemleak-test.c mm/kmemleak-test.c: use pr_fmt for logging 2014-06-06 16:08:18 -07:00
kmemleak.c mm: introduce kmemleak_update_trace() 2014-06-06 16:08:17 -07:00
ksm.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
list_lru.c memcg: reparent list_lrus and free kmemcg_id on css offline 2015-02-12 18:54:10 -08:00
maccess.c
madvise.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
Makefile mm: replace remap_file_pages() syscall with emulation 2015-02-10 14:30:30 -08:00
memblock.c mm/memblock.c: refactor functions to set/clear MEMBLOCK_HOTPLUG 2014-12-13 12:42:46 -08:00
memcontrol.c memcg: reparent list_lrus and free kmemcg_id on css offline 2015-02-12 18:54:10 -08:00
memory_hotplug.c mm, memory_hotplug/failure: drain single zone pcplists 2014-12-10 17:41:05 -08:00
memory-failure.c vmscan: per memory cgroup slab shrinkers 2015-02-12 18:54:09 -08:00
memory.c mm: numa: add paranoid check around pte_protnone_numa 2015-02-12 18:54:08 -08:00
mempolicy.c mm: convert p[te|md]_mknonnuma and remaining page table manipulations 2015-02-12 18:54:08 -08:00
mempool.c mm/mempool.c: update the kmemleak stack trace for mempool allocations 2014-06-06 16:08:17 -07:00
migrate.c mm: convert p[te|md]_mknonnuma and remaining page table manipulations 2015-02-12 18:54:08 -08:00
mincore.c mincore: apply page table walker on do_mincore() 2015-02-11 17:06:06 -08:00
mlock.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-10-13 15:44:12 +02:00
mm_init.c mm: bring back /sys/kernel/mm 2014-01-27 21:02:39 -08:00
mmap.c mm/mmap.c: fix arithmetic overflow in __vm_enough_memory() 2015-02-11 17:06:07 -08:00
mmu_context.c sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm 2014-02-21 08:50:17 +01:00
mmu_notifier.c mmu_notifier: add the callback for mmu_notifier_invalidate_range() 2014-11-13 13:46:09 +11:00
mmzone.c mm: microoptimize zonelist operations 2015-02-11 17:06:02 -08:00
mprotect.c mm: numa: avoid unnecessary TLB flushes when setting NUMA hinting entries 2015-02-12 18:54:08 -08:00
mremap.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
msync.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
nobootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
nommu.c mm/nommu.c: fix arithmetic overflow in __vm_enough_memory() 2015-02-11 17:06:07 -08:00
oom_kill.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
page_alloc.c mm: more aggressive page stealing for UNMOVABLE allocations 2015-02-11 17:06:06 -08:00
page_counter.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
page_ext.c mm/page_owner: keep track of page owners 2014-12-13 12:42:48 -08:00
page_io.c fix __swap_writepage() compile failure on old gcc versions 2014-06-14 19:30:48 -05:00
page_isolation.c mm, page_isolation: drain single zone pcplists 2014-12-10 17:41:05 -08:00
page_owner.c mm/page_owner.c: remove unnecessary stack_trace field 2015-02-11 17:06:07 -08:00
page-writeback.c page_writeback: put account_page_redirty() after set_page_dirty() 2015-02-11 17:06:04 -08:00
pagewalk.c mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP) 2015-02-11 17:06:06 -08:00
percpu-km.c percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated 2014-09-02 14:46:05 -04:00
percpu-vm.c percpu: move region iterations out of pcpu_[de]populate_chunk() 2014-09-02 14:46:02 -04:00
percpu.c percpu: off by one in BUG_ON() 2014-10-29 10:34:34 -04:00
pgtable-generic.c mm: convert p[te|md]_mknonnuma and remaining page table manipulations 2015-02-12 18:54:08 -08:00
process_vm_access.c mm: gup: use get_user_pages_unlocked 2015-02-11 17:06:05 -08:00
quicklist.c
readahead.c mm/readahead.c: remove unused file_ra_state from count_history_pages 2014-08-06 18:01:15 -07:00
rmap.c mm: memcontrol: track move_lock state internally 2015-02-11 17:06:00 -08:00
shmem.c swap: remove unused mem_cgroup_uncharge_swapcache declaration 2015-02-11 17:06:00 -08:00
slab_common.c memcg: free memcg_caches slot on css offline 2015-02-12 18:54:10 -08:00
slab.c slab: link memcg caches of the same kind into a list 2015-02-12 18:54:09 -08:00
slab.h slab: link memcg caches of the same kind into a list 2015-02-12 18:54:09 -08:00
slob.c mm/sl[ao]b: always track caller in kmalloc_(node_)track_caller() 2014-10-09 22:25:50 -04:00
slub.c slub: never fail to shrink cache 2015-02-12 18:54:10 -08:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
swap_cgroup.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap_state.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap.c rmap: drop support of non-linear mappings 2015-02-10 14:30:31 -08:00
swapfile.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
truncate.c mm: Fix comment before truncate_setsize() 2014-11-07 08:29:25 +11:00
util.c mm: gup: use get_user_pages_unlocked within get_user_pages_fast 2015-02-11 17:06:05 -08:00
vmacache.c mm,vmacache: count number of system-wide flushes 2014-12-13 12:42:48 -08:00
vmalloc.c mm/vmalloc.c: fix memory ordering bug 2014-12-13 12:42:49 -08:00
vmpressure.c mm/vmpressure.c: fix race in vmpressure_work_fn() 2014-12-02 17:32:07 -08:00
vmscan.c vmscan: per memory cgroup slab shrinkers 2015-02-12 18:54:09 -08:00
vmstat.c vmstat: Reduce time interval to stat update on idle cpu 2015-02-11 17:06:07 -08:00
workingset.c list_lru: add helpers to isolate items 2015-02-12 18:54:10 -08:00
zbud.c mm/zbud: init user ops only when it is needed 2014-12-13 12:42:51 -08:00
zpool.c mm/zpool: use prefixed module loading 2014-08-29 16:28:16 -07:00
zsmalloc.c mm/zsmalloc: adjust order of functions 2014-12-18 19:08:11 -08:00
zswap.c mm/zswap: delete unnecessary check before calling free_percpu() 2014-12-13 12:42:50 -08:00