linux/mm
Nick Piggin 0ed361dec3 mm: fix PageUptodate data race
After running SetPageUptodate, preceeding stores to the page contents to
actually bring it uptodate may not be ordered with the store to set the
page uptodate.

Therefore, another CPU which checks PageUptodate is true, then reads the
page contents can get stale data.

Fix this by having an smp_wmb before SetPageUptodate, and smp_rmb after
PageUptodate.

Many places that test PageUptodate, do so with the page locked, and this
would be enough to ensure memory ordering in those places if
SetPageUptodate were only called while the page is locked.  Unfortunately
that is not always the case for some filesystems, but it could be an idea
for the future.

Also bring the handling of anonymous page uptodateness in line with that of
file backed page management, by marking anon pages as uptodate when they
_are_ uptodate, rather than when our implementation requires that they be
marked as such.  Doing allows us to get rid of the smp_wmb's in the page
copying functions, which were especially added for anonymous pages for an
analogous memory ordering problem.  Both file and anonymous pages are
handled with the same barriers.

FAQ:
Q. Why not do this in flush_dcache_page?
A. Firstly, flush_dcache_page handles only one side (the smb side) of the
ordering protocol; we'd still need smp_rmb somewhere. Secondly, hiding away
memory barriers in a completely unrelated function is nasty; at least in the
PageUptodate macros, they are located together with (half) the operations
involved in the ordering. Thirdly, the smp_wmb is only required when first
bringing the page uptodate, wheras flush_dcache_page should be called each time
it is written to through the kernel mapping. It is logically the wrong place to
put it.

Q. Why does this increase my text size / reduce my performance / etc.
A. Because it is adding the necessary instructions to eliminate the data-race.

Q. Can it be improved?
A. Yes, eg. if you were to create a rule that all SetPageUptodate operations
run under the page lock, we could avoid the smp_rmb places where PageUptodate
is queried under the page lock. Requires audit of all filesystems and at least
some would need reworking. That's great you're interested, I'm eagerly awaiting
your patches.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:19 -08:00
..
allocpercpu.c Slab allocators: Replace explicit zeroing with __GFP_ZERO 2007-07-17 10:23:02 -07:00
backing-dev.c mm/backing-dev.c: fix percpu_counter_destroy call bug in bdi_init 2007-12-05 09:21:18 -08:00
bootmem.c [PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols 2006-12-07 08:39:44 -08:00
bounce.c block: Initial support for data-less (or empty) barrier support 2007-10-16 11:03:56 +02:00
fadvise.c check ADVICE of fadvise64_64 even if get_xip_page is given 2008-02-05 09:44:19 -08:00
filemap_xip.c Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user 2008-02-05 09:44:13 -08:00
filemap.c mm: remove fastcall from mm/ 2008-02-05 09:44:18 -08:00
fremap.c sys_remap_file_pages: fix ->vm_file accounting 2008-02-05 09:44:07 -08:00
highmem.c mm: remove fastcall from mm/ 2008-02-05 09:44:18 -08:00
hugetlb.c mm: fix PageUptodate data race 2008-02-05 09:44:19 -08:00
internal.h set_page_refcounted() VM_BUG_ON fix 2008-02-05 09:44:19 -08:00
Kconfig sh: Bump number of quicklists for SH-5. 2008-01-28 13:18:55 +09:00
madvise.c speed up madvise_need_mmap_write() usage 2007-07-16 09:05:36 -07:00
Makefile maps4: make page monitoring /proc file optional 2008-02-05 09:44:17 -08:00
memory_hotplug.c Page allocator: clean up pcp draining functions 2008-02-05 09:44:17 -08:00
memory.c mm: fix PageUptodate data race 2008-02-05 09:44:19 -08:00
mempolicy.c Migration: find correct vma in new_vma_page() 2007-11-14 18:45:38 -08:00
mempool.c spelling fixes: mm/ 2007-10-20 01:27:18 +02:00
migrate.c page migraton: handle orphaned pages 2008-02-05 09:44:19 -08:00
mincore.c [PATCH] mincore: vma crossing fix 2007-02-15 09:57:03 -08:00
mlock.c do not limit locked memory when RLIMIT_MEMLOCK is RLIM_INFINITY 2007-07-16 09:05:37 -07:00
mmap.c arch_rebalance_pgtables call 2008-02-05 09:44:18 -08:00
mmzone.c [PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols 2006-12-07 08:39:44 -08:00
mprotect.c fix mprotect vma_wants_writenotify prot 2007-10-23 08:32:06 -07:00
mremap.c sparse pointer use of zero as null 2007-10-18 14:37:31 -07:00
msync.c Detach sched.h from mm.h 2007-05-21 09:18:19 -07:00
nommu.c vmalloc: add const to void* parameters 2008-02-05 09:44:14 -08:00
oom_kill.c sched: sched_rt_entity 2008-01-25 21:08:27 +01:00
page_alloc.c Include count of pagecache pages in show_mem() output 2008-02-05 09:44:19 -08:00
page_io.c mm: fix PageUptodate data race 2008-02-05 09:44:19 -08:00
page_isolation.c memory hotremove: unset migrate type "ISOLATE" after removal 2007-11-14 18:45:38 -08:00
page-writeback.c mm: remove fastcall from mm/ 2008-02-05 09:44:18 -08:00
pagewalk.c maps4: introduce a generic page walker 2008-02-05 09:44:16 -08:00
pdflush.c Freezer: make kernel threads nonfreezable by default 2007-07-17 10:23:02 -07:00
prio_tree.c spelling fixes: mm/ 2007-10-20 01:27:18 +02:00
quicklist.c quicklists: Only consider memory that can be used with GFP_KERNEL 2008-01-14 08:52:22 -08:00
readahead.c mm: bdi init hooks 2007-10-17 08:42:45 -07:00
rmap.c mm: don't waste swap on locked pages 2008-02-05 09:44:18 -08:00
shmem_acl.c [PATCH] Fix typos in mm/shmem_acl.c 2006-10-11 11:14:23 -07:00
shmem.c tmpfs: fix shmem_swaplist races 2008-02-05 09:44:16 -08:00
slab.c cpu-hotplug: replace per-subsystem mutexes with get_online_cpus() 2008-01-25 21:08:02 +01:00
slob.c Avoid double memclear() in SLOB/SLUB 2007-12-09 10:17:52 -08:00
slub.c SLUB: Do not upset lockdep 2008-02-04 10:56:02 -08:00
sparse-vmemmap.c memory hotplug fix: fix section mismatch in vmammap_allock_block() 2007-11-29 09:24:54 -08:00
sparse.c is_vmalloc_addr(): Check if an address is within the vmalloc boundaries 2008-02-05 09:44:14 -08:00
swap_state.c mm: fix PageUptodate data race 2008-02-05 09:44:19 -08:00
swap.c mm: remove fastcall from mm/ 2008-02-05 09:44:18 -08:00
swapfile.c tmpfs: open a window in shmem_unuse_inode 2008-02-05 09:44:15 -08:00
thrash.c Bug in mm/thrash.c function grab_swap_token() 2007-05-11 08:29:32 -07:00
tiny-shmem.c Remove unused code from mm/tiny-shmem.c 2008-02-05 09:44:17 -08:00
truncate.c page migraton: handle orphaned pages 2008-02-05 09:44:19 -08:00
util.c fix mm/util.c:krealloc() 2007-11-14 18:45:41 -08:00
vmalloc.c mm: don't allow ioremapping of ranges larger than vmalloc space 2008-02-05 09:44:18 -08:00
vmscan.c spelling fixes: mm/ 2007-10-20 01:27:18 +02:00
vmstat.c vmstat: remove prefetch 2008-02-05 09:44:18 -08:00