linux/mm
Wu Fengguang e50e37201a writeback: balance_dirty_pages(): reduce calls to global_page_state
Reducing the number of times balance_dirty_pages calls global_page_state
reduces the cache references and so improves write performance on a
variety of workloads.

'perf stats' of simple fio write tests shows the reduction in cache
access.  Where the test is fio 'write,mmap,600Mb,pre_read' on AMD AthlonX2
with 3Gb memory (dirty_threshold approx 600 Mb) running each test 10
times, dropping the fasted & slowest values then taking the average &
standard deviation

		average (s.d.) in millions (10^6)
2.6.31-rc8	648.6 (14.6)
+patch		620.1 (16.5)

Achieving this reduction is by dropping clip_bdi_dirty_limit as it rereads
the counters to apply the dirty_threshold and moving this check up into
balance_dirty_pages where it has already read the counters.

Also by rearrange the for loop to only contain one copy of the limit tests
allows the pdflush test after the loop to use the local copies of the
counters rather than rereading them.

In the common case with no throttling it now calls global_page_state 5
fewer times and bdi_stat 2 fewer.

Fengguang:

This patch slightly changes behavior by replacing clip_bdi_dirty_limit()
with the explicit check (nr_reclaimable + nr_writeback >= dirty_thresh) to
avoid exceeding the dirty limit.  Since the bdi dirty limit is mostly
accurate we don't need to do routinely clip.  A simple dirty limit check
would be enough.

The check is necessary because, in principle we should throttle everything
calling balance_dirty_pages() when we're over the total limit, as said by
Peter.

We now set and clear dirty_exceeded not only based on bdi dirty limits,
but also on the global dirty limit.  The global limit check is added in
place of clip_bdi_dirty_limit() for safety and not intended as a behavior
change.  The bdi limits should be tight enough to keep all dirty pages
under the global limit at most time; occasional small exceeding should be
OK though.  The change makes the logic more obvious: the global limit is
the ultimate goal and shall be always imposed.

We may now start background writeback work based on outdated conditions.
That's safe because the bdi flush thread will (and have to) double check
the states.  It reduces overall overheads because the test based on old
states still have good chance to be right.

[akpm@linux-foundation.org] fix uninitialized dirty_exceeded
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-12 08:43:29 -07:00
..
backing-dev.c writeback: fix bad _bh spinlock nesting 2010-08-07 18:53:57 +02:00
bootmem.c x86,nobootmem: make alloc_bootmem_node fall back to other node when 32bit numa is used 2010-07-20 16:25:40 -07:00
bounce.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
compaction.c mm: compaction: add a tunable that decides when memory should be compacted and when it should be reclaimed 2010-05-25 08:06:59 -07:00
debug-pagealloc.c
dmapool.c
fadvise.c readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM 2010-03-06 11:26:25 -08:00
failslab.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
filemap_xip.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
filemap.c gcc-4.6: mm: fix unused but set warnings 2010-08-09 20:44:58 -07:00
fremap.c mm: clean up mm_counter 2010-03-06 11:26:23 -08:00
highmem.c mm,kdb,kgdb: Add a debug reference for the kdb kmap usage 2010-08-05 09:22:24 -05:00
hugetlb.c hugetlb: call mmu notifiers on hugepage cow 2010-08-09 20:44:54 -07:00
hwpoison-inject.c HWPOISON: Don't do early filtering if filter is disabled 2009-12-16 12:20:01 +01:00
init-mm.c mm: provide init_mm mm_context initializer 2010-08-09 20:44:54 -07:00
internal.h HWPOISON: add an interface to switch off/on all the page filters 2009-12-16 12:19:59 +01:00
Kconfig lmb: rename to memblock 2010-07-14 17:14:00 +10:00
Kconfig.debug trivial: improve help text for mm debug config options 2009-09-21 15:14:57 +02:00
kmemcheck.c kmemcheck: Fix build errors due to missing slab.h 2010-03-30 22:02:32 +09:00
kmemleak-test.c
kmemleak.c kmemleak: Fix typo in the comment 2010-08-08 21:57:23 +01:00
ksm.c ksm: cleanup for mm_slots_hash 2010-08-09 20:45:03 -07:00
maccess.c maccess,probe_kernel: Allow arch specific override probe_kernel_(read|write) 2010-01-07 11:58:36 -06:00
madvise.c HWPOISON: Add a madvise() injector for soft page offlining 2009-12-16 12:20:00 +01:00
Makefile lmb: rename to memblock 2010-07-14 17:14:00 +10:00
memblock.c memblock: Fix memblock_is_region_reserved() to return a boolean 2010-08-09 11:21:38 +10:00
memcontrol.c memcg: convert to use zone_to_nid() from bare zone->zone_pgdat->node_id 2010-08-11 08:59:19 -07:00
memory_hotplug.c mem-hotplug: fix potential race while building zonelist for new populated zone 2010-05-25 08:07:02 -07:00
memory-failure.c KVM: Fix a race condition for usage of is_hwpoison_address() 2010-08-01 10:47:11 +03:00
memory.c mmu-notifiers: remove mmu notifier calls in apply_to_page_range() 2010-08-09 20:45:03 -07:00
mempolicy.c mempolicy: reduce stack size of migrate_pages() 2010-08-09 20:44:58 -07:00
mempool.c mm: remove broken 'kzalloc' mempool 2009-09-22 07:17:35 -07:00
migrate.c mm: extend KSM refcounts to the anon_vma root 2010-08-09 20:44:55 -07:00
mincore.c mincore: do nested page table walks 2010-05-25 08:06:58 -07:00
mlock.c x86, perf, bts, mm: Delete the never used BTS-ptrace code 2010-03-26 11:33:55 +01:00
mm_init.c
mmap.c mmap: remove unnecessary lock from __vma_link 2010-08-09 20:44:58 -07:00
mmu_context.c exit: fix oops in sync_mm_rss 2010-03-24 16:31:21 -07:00
mmu_notifier.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
mmzone.c
mprotect.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
mremap.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
msync.c sanitize vfs_fsync calling conventions 2010-05-21 18:31:21 -04:00
nommu.c nommu: allow private mappings of read-only devices 2010-05-26 08:19:23 -07:00
oom_kill.c memcg: use find_lock_task_mm() in memory cgroups oom 2010-08-11 08:59:19 -07:00
page_alloc.c vmscan: kill prev_priority completely 2010-08-09 20:45:00 -07:00
page_cgroup.c kmemleak: Annotate false positive in init_section_page_cgroup() 2010-07-19 11:54:14 +01:00
page_io.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
page_isolation.c
page-writeback.c writeback: balance_dirty_pages(): reduce calls to global_page_state 2010-08-12 08:43:29 -07:00
pagewalk.c pagemap: fix pfn calculation for hugepage 2010-04-07 08:38:04 -07:00
percpu_up.c percpu: don't implicitly include slab.h from percpu.h 2010-03-30 22:02:32 +09:00
percpu-km.c percpu: implement kernel memory based chunk allocation 2010-05-01 08:30:50 +02:00
percpu-vm.c percpu: move vmalloc based chunk management into percpu-vm.c 2010-05-01 08:30:50 +02:00
percpu.c percpu: allow limited allocation before slab is online 2010-06-27 18:50:00 +02:00
prio_tree.c
quicklist.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
readahead.c readahead.c: fix comment 2010-05-25 08:07:00 -07:00
rmap.c rmap: add exclusive page to private anon_vma on swapin 2010-08-09 20:45:02 -07:00
shmem.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
slab.c gcc-4.6: mm: fix unused but set warnings 2010-08-09 20:44:58 -07:00
slob.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 2010-08-06 11:44:08 -07:00
slub.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 2010-08-06 11:44:08 -07:00
sparse-vmemmap.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
sparse.c sparsemem: on no vmemmap path put mem_map on node high too 2010-05-25 08:06:56 -07:00
swap_state.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
swap.c mm: export lru_cache_add_*() to modules 2010-05-25 15:06:06 +02:00
swapfile.c hibernation: freeze swap at hibernation 2010-08-09 20:45:04 -07:00
thrash.c
truncate.c check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
util.c mm: use memdup_user 2010-08-09 20:44:54 -07:00
vmalloc.c mm/vmalloc.c: check kmalloc() return value 2010-08-09 20:45:03 -07:00
vmscan.c memcg: remove nid and zid argument from mem_cgroup_soft_limit_reclaim() 2010-08-11 08:59:19 -07:00
vmstat.c vmscan: kill prev_priority completely 2010-08-09 20:45:00 -07:00