linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-12 07:01:57 +00:00

Mel Gorman fed2719e7a mm: page_alloc: avoid marking zones full prematurely after zone_reclaim()

The following problem was reported against a distribution kernel when zone_reclaim was enabled but the same problem applies to the mainline kernel. The reproduction case was as follows

1. Run numactl -m +0 dd if=largefile of=/dev/null
   This allocates a large number of clean pages in node 0

2. numactl -N +0 memhog 0.5*Mg
   This start a memory-using application in node 0.

The expected behaviour is that the clean pages get reclaimed and the application uses node 0 for its memory. The observed behaviour was that the memory for the memhog application was allocated off-node since commits `cd38b115d5` ("mm: page allocator: initialise ZLC for first zone eligible for zone_reclaim") and commit `76d3fbf8fb` ("mm: page allocator: reconsider zones for allocation after direct reclaim").

The assumption of those patches was that it was always preferable to allocate quickly than stall for long periods of time and they were meant to take care that the zone was only marked full when necessary but an important case was missed.

In the allocator fast path, only the low watermarks are checked. If the zones free pages are between the low and min watermark then allocations from the allocators slow path will succeed. However, zone_reclaim will only reclaim SWAP_CLUSTER_MAX or 1<<order pages. There is no guarantee that this will meet the low watermark causing the zone to be marked full prematurely.

This patch will only mark the zone full after zone_reclaim if it the min watermarks are checked or if page reclaim failed to make sufficient progress.

[mhocko@suse.cz: fix alloc_flags test]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Hedi Berriche <hedi@sgi.com>
Tested-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2013-04-29 15:54:35 -07:00
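
The decision described in the commit message can be illustrated with a small user-space model. This is only a sketch, not the code in page_alloc.c: `struct toy_zone`, `enum toy_wmark`, `enum toy_reclaim`, `toy_watermark_ok()` and `toy_should_mark_full()` are hypothetical names invented here, and the watermark check is reduced to a single comparison (the real check also accounts for lowmem reserves and allocation order).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for illustration only. */
enum toy_wmark { TOY_WMARK_MIN, TOY_WMARK_LOW };

/* Outcome of a reclaim attempt: partial progress or enough progress. */
enum toy_reclaim { TOY_RECLAIM_SOME, TOY_RECLAIM_SUCCESS };

struct toy_zone {
    unsigned long free_pages;
    unsigned long wmark_min;
    unsigned long wmark_low;
};

/* True when the zone has more free pages than the selected watermark. */
static bool toy_watermark_ok(const struct toy_zone *z, enum toy_wmark which)
{
    assert(z);
    unsigned long mark = (which == TOY_WMARK_MIN) ? z->wmark_min
                                                  : z->wmark_low;
    return z->free_pages > mark;
}

/*
 * Called after a reclaim attempt. The zone is only treated as full when
 * the checked watermark is still not met AND either the min watermark was
 * the one being checked, or reclaim failed to make sufficient progress.
 * A zone sitting between min and low after a successful but small reclaim
 * is therefore not marked full.
 */
static bool toy_should_mark_full(const struct toy_zone *z,
                                 enum toy_wmark checked,
                                 enum toy_reclaim result)
{
    if (toy_watermark_ok(z, checked))
        return false; /* reclaimed enough; allocate from this zone */

    return checked == TOY_WMARK_MIN || result == TOY_RECLAIM_SOME;
}
```

In this model, a zone with 90 free pages, a min watermark of 50 and a low watermark of 100 is not marked full after a successful reclaim checked against the low watermark, which is the case the commit message says was previously missed.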
..
backing-dev.c	bdi: allow block devices to say that they require stable page writes	2013-02-21 17:22:19 -08:00
balloon_compaction.c	mm: introduce a common interface for balloon pages mobility	2012-12-11 17:22:26 -08:00
bootmem.c	mm: Add alloc_bootmem_low_pages_nopanic()	2013-01-29 19:32:59 -08:00
bounce.c	mm: make snapshotting pages for stable writes a per-bio operation	2013-04-29 15:54:33 -07:00
cleancache.c	fs: encode_fh: return FILEID_INVALID if invalid fid_type	2013-02-26 02:46:10 -05:00
compaction.c	mm: add & use zone_end_pfn() and zone_spans_pfn()	2013-02-23 17:50:20 -08:00
debug-pagealloc.c	mm, x86: Remove debug_pagealloc_enabled	2011-12-06 09:24:07 +01:00
dmapool.c	dmapool: make DMAPOOL_DEBUG detect corruption of free marker	2012-12-11 17:22:24 -08:00
fadvise.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-02-26 20:16:07 -08:00
failslab.c	switch debugfs to umode_t	2012-01-03 22:54:56 -05:00
filemap_xip.c	mm: move all mmu notifier invocations to be done outside the PT lock	2012-10-09 16:22:58 +09:00
filemap.c	mm: trace filemap add and del	2013-04-29 15:54:28 -07:00
fremap.c	Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs"	2013-03-28 17:45:51 -07:00
frontswap.c	frontswap: support exclusive gets if tmem backend is capable	2012-09-21 10:38:12 -04:00
highmem.c	Some nice cleanups, and even a patch my wife did as a "live" demo for	2012-12-20 08:37:05 -08:00
huge_memory.c	hlist: drop the node parameter from iterators	2013-02-27 19:10:24 -08:00
hugetlb_cgroup.c	mm/hugetlb: create hugetlb cgroup file in hugetlb_init	2012-12-18 15:02:15 -08:00
hugetlb.c	mm, hugetlb: include hugepages in meminfo	2013-04-29 15:54:35 -07:00
hwpoison-inject.c	memcg: rename config variables	2012-07-31 18:42:43 -07:00
init-mm.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
internal.h	mm: accelerate munlock() treatment of THP pages	2013-02-27 19:10:09 -08:00
interval_tree.c	mm: add CONFIG_DEBUG_VM_RB build option	2012-10-09 16:22:42 +09:00
Kconfig	Select VIRT_TO_BUS directly where needed	2013-03-12 11:16:40 -07:00
Kconfig.debug	mm: more intensive memory corruption debugging	2012-01-10 16:30:42 -08:00
kmemcheck.c
kmemleak-test.c	kmemleak: remove memset by using kzalloc	2011-01-27 18:31:51 +00:00
kmemleak.c	hlist: drop the node parameter from iterators	2013-02-27 19:10:24 -08:00
ksm.c	ksm: fix m68k build: only NUMA needs pfn_to_nid	2013-03-08 15:05:34 -08:00
maccess.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
madvise.c	mm: make madvise(MADV_WILLNEED) support swap file prefetch	2013-02-23 17:50:10 -08:00
Makefile	mm: introduce a common interface for balloon pages mobility	2012-12-11 17:22:26 -08:00
memblock.c	memblock: add assertion for zero allocation alignment	2013-04-29 15:54:28 -07:00
memcontrol.c	memcg: do not check for do_swap_account in mem_cgroup_{read,write,reset}	2013-04-29 15:54:34 -07:00
memory_hotplug.c	mm: walk_memory_range(): fix typo in comment	2013-04-29 15:54:28 -07:00
memory-failure.c	HWPOISON: check dirty flag to match against clean page	2013-04-29 15:54:28 -07:00
memory.c	vm: add vm_iomap_memory() helper function	2013-04-16 16:45:45 -07:00
mempolicy.c	mm/mempolicy.c: fix sp_node_init() argument ordering	2013-03-08 15:05:34 -08:00
mempool.c	mempool: add @gfp_mask to mempool_create_node()	2012-06-25 11:53:47 +02:00
migrate.c	mm: remove offlining arg to migrate_pages	2013-02-23 17:50:19 -08:00
mincore.c	swap: make each swap partition have one address_space	2013-02-23 17:50:17 -08:00
mlock.c	Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs"	2013-03-28 17:45:51 -07:00
mm_init.c	mm: init: report on last-nid information stored in page->flags	2013-02-23 17:50:18 -08:00
mmap.c	mm: merging memory blocks resets mempolicy	2013-04-29 15:54:35 -07:00
mmu_context.c	mm, counters: remove task argument to sync_mm_rss() and __sync_task_rss_stat()	2012-03-21 17:54:59 -07:00
mmu_notifier.c	hlist: drop the node parameter from iterators	2013-02-27 19:10:24 -08:00
mmzone.c	mm: rename page struct field helpers	2013-02-23 17:50:18 -08:00
mprotect.c	mm/mprotect.c: coding-style cleanups	2012-12-18 15:02:15 -08:00
mremap.c	mm/rmap: rename anon_vma_unlock() => anon_vma_unlock_write()	2013-02-23 17:50:17 -08:00
msync.c
nobootmem.c	mm: Add alloc_bootmem_low_pages_nopanic()	2013-01-29 19:32:59 -08:00
nommu.c	mm, vmalloc: export vmap_area_list, instead of vmlist	2013-04-29 15:54:34 -07:00
oom_kill.c	memcg, oom: provide more precise dump info while memcg oom happening	2013-02-23 17:50:08 -08:00
page_alloc.c	mm: page_alloc: avoid marking zones full prematurely after zone_reclaim()	2013-04-29 15:54:35 -07:00
page_cgroup.c	memcontrol: use N_MEMORY instead N_HIGH_MEMORY	2012-12-12 17:38:32 -08:00
page_io.c	mm: add support for direct_IO to highmem pages	2012-07-31 18:42:47 -07:00
page_isolation.c	mm: fix zone_watermark_ok_safe() accounting of isolated pages	2013-01-04 16:11:46 -08:00
page-writeback.c	mm: make snapshotting pages for stable writes a per-bio operation	2013-04-29 15:54:33 -07:00
pagewalk.c	thp: change split_huge_page_pmd() interface	2012-12-12 17:38:31 -08:00
percpu-km.c
percpu-vm.c	mm: fix kernel-doc warnings	2012-06-20 14:39:36 -07:00
percpu.c	mm, percpu: Make sure percpu_alloc early parameter has an argument	2012-12-02 06:23:04 -08:00
pgtable-generic.c	mm: Only flush the TLB when clearing an accessible pte	2012-12-11 14:28:34 +00:00
process_vm_access.c	Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys	2013-03-12 11:05:45 -07:00
quicklist.c	mm: delete various needless include <linux/module.h>	2011-10-31 09:20:11 -04:00
readahead.c	switch simple cases of fget_light to fdget	2012-09-26 22:20:08 -04:00
rmap.c	rmap: recompute pgoff for unmapping huge page	2013-04-29 15:54:28 -07:00
shmem.c	mm/shmem.c: remove an ifdef	2013-04-29 15:54:28 -07:00
slab_common.c	slab: propagate tunable values	2012-12-18 15:02:14 -08:00
slab.c	taint: add explicit flag to show whether lock dep is still OK.	2013-01-21 17:17:57 +10:30
slab.h	slab: propagate tunable values	2012-12-18 15:02:14 -08:00
slob.c	mm: rename page struct field helpers	2013-02-23 17:50:18 -08:00
slub.c	The sweeping change is to make add_taint() explicitly indicate whether to disable	2013-02-25 15:41:43 -08:00
sparse-vmemmap.c	sparse-vmemmap: specify vmemmap population range in bytes	2013-04-29 15:54:35 -07:00
sparse.c	sparse-vmemmap: specify vmemmap population range in bytes	2013-04-29 15:54:35 -07:00
swap_state.c	swap: add per-partition lock for swapfile	2013-02-23 17:50:17 -08:00
swap.c	swap: make each swap partition have one address_space	2013-02-23 17:50:17 -08:00
swapfile.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-02-26 20:16:07 -08:00
truncate.c	mm: drop vmtruncate	2012-12-20 18:46:29 -05:00
util.c	swap: make each swap partition have one address_space	2013-02-23 17:50:17 -08:00
vmalloc.c	kexec, vmalloc: export additional vmalloc layer information	2013-04-29 15:54:34 -07:00
vmscan.c	mm/vmscan.c: minor cleanup for kswapd	2013-04-29 15:54:29 -07:00
vmstat.c	mm: add & use zone_end_pfn() and zone_spans_pfn()	2013-02-23 17:50:20 -08:00