linux/mm
Dave Hansen 9fa2dd9467 mm: fix pin vs. gup mismatch with gate pages
Gate pages were missed when converting from get to pin_user_pages().
This can lead to refcount imbalances.  This is reliably and quickly
reproducible running the x86 selftests when vsyscall=emulate is enabled
(the default).  Fix by using try_grab_page() with appropriate flags
passed.

The long story:

Today, pin_user_pages() and get_user_pages() are similar interfaces for
manipulating page reference counts.  However, "pins" use a "bias" value
and manipulate the actual reference count by 1024 instead of 1 used by
plain "gets".

That means that pin_user_pages() must be matched with unpin_user_pages()
and can't be mixed with a plain put_user_pages() or put_page().

Enter gate pages, like the vsyscall page.  They are pages usually in the
kernel image, but which are mapped to userspace.  Userspace is allowed
access to them, including interfaces using get/pin_user_pages().  The
refcount of these kernel pages is manipulated just like a normal user
page on the get/pin side so that the put/unpin side can work the same
for normal user pages or gate pages.

get_gate_page() uses try_get_page() which only bumps the refcount by
1, not 1024, even if called in the pin_user_pages() path.  If someone
pins a gate page, this happens:

	pin_user_pages()
		get_gate_page()
			try_get_page() // bump refcount +1
	... some time later
	unpin_user_pages()
		page_ref_sub_and_test(page, 1024))

... and boom, we get a refcount off by 1023.  This is reliably and
quickly reproducible running the x86 selftests when booted with
vsyscall=emulate (the default).  The selftests use ptrace(), but I
suspect anything using pin_user_pages() on gate pages could hit this.

To fix it, simply use try_grab_page() instead of try_get_page(), and
pass 'gup_flags' in so that FOLL_PIN can be respected.

This bug traces back to the very beginning of the FOLL_PIN support in
commit 3faa52c03f ("mm/gup: track FOLL_PIN pages"), which showed up in
the 5.7 release.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Fixes: 3faa52c03f ("mm/gup: track FOLL_PIN pages")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org
Cc: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-03 18:36:55 -07:00
..
kasan mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h 2020-06-04 19:06:21 -07:00
backing-dev.c bdi: remove the name field in struct backing_dev_info 2020-05-09 16:15:13 -06:00
balloon_compaction.c
cleancache.c
cma_debug.c mm/cma_debug.c: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
cma.c mm/cma.c: use exact_nid true to fix possible per-numa cma leak 2020-07-03 16:15:25 -07:00
cma.h
compaction.c mm, compaction: make capture control handling safe wrt interrupts 2020-06-26 00:27:36 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: fix build failure with powerpc 8xx 2020-06-26 00:27:37 -07:00
debug.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
dmapool.c mm/dmapool.c: micro-optimisation remove unnecessary branch 2020-04-07 10:43:42 -07:00
early_ioremap.c mm/early_ioremap.c: use %pa to print resource_size_t variables 2020-01-31 10:30:38 -08:00
fadvise.c mm: return void from various readahead functions 2020-06-02 10:59:06 -07:00
failslab.c
filemap.c fs: Add IOCB_NOIO flag for generic_file_read_iter 2020-07-07 23:40:08 +02:00
frame_vector.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
frontswap.c mm/frontswap: fix some typos in frontswap.c 2020-06-04 19:06:24 -07:00
gup_benchmark.c mm/gup_benchmark: support pin_user_pages() and related calls 2020-04-02 09:35:27 -07:00
gup.c mm: fix pin vs. gup mismatch with gate pages 2020-09-03 18:36:55 -07:00
highmem.c mm, x86/mm: Untangle address space layout definitions from basic pgtable type definitions 2019-12-10 10:12:55 +01:00
hmm.c mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked() 2020-06-09 09:39:14 -07:00
huge_memory.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
hugetlb_cgroup.c mm: use fallthrough; 2020-04-07 10:43:41 -07:00
hugetlb.c mm/hugetlb: avoid hardcoding while checking if cma is enabled 2020-07-24 12:42:41 -07:00
hwpoison-inject.c mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
init-mm.c mmap locking API: add MMAP_LOCK_INITIALIZER 2020-06-09 09:39:14 -07:00
internal.h mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
interval_tree.c
Kconfig Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next 2020-06-07 17:25:29 -07:00
Kconfig.debug treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
khugepaged.c khugepaged: fix null-pointer dereference due to race 2020-07-24 12:42:41 -07:00
kmemleak-test.c
kmemleak.c mm/kmemleak.c: use address-of operator on section symbols 2020-04-02 09:35:26 -07:00
ksm.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
list_lru.c mm/list_lru: fix a typo in comment "numbesr"->"numbers" 2020-06-04 19:06:24 -07:00
maccess.c maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault 2020-06-17 10:57:41 -07:00
madvise.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
Makefile The Kernel Concurrency Sanitizer (KCSAN) 2020-06-11 18:55:43 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: update huge page-table entry callbacks 2020-04-02 09:35:29 -07:00
memblock.c mm/memblock: fix a typo in comment "implict"->"implicit" 2020-06-04 19:06:23 -07:00
memcontrol.c mm/memcg: fix refcount error while moving and swapping 2020-07-24 12:42:41 -07:00
memfd.c
memory_hotplug.c mm/memory_hotplug.c: fix false softlockup during pfn range removal 2020-06-26 00:27:38 -07:00
memory-failure.c mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread 2020-06-11 18:17:47 -07:00
memory.c mm: initialize return of vm_insert_pages 2020-07-24 12:42:41 -07:00
mempolicy.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mempool.c
memremap.c mm/memremap: set caching mode for PCI P2PDMA memory to WC 2020-04-10 15:36:21 -07:00
memtest.c
migrate.c Raise gcc version requirement to 4.9 2020-07-08 10:48:35 -07:00
mincore.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
mlock.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mm_init.c mm/mm_init.c: report kasan-tag information stored in page->flags 2020-06-02 10:59:12 -07:00
mmap.c mm/mmap.c: close race between munmap() and expand_upwards()/downwards() 2020-07-24 12:42:41 -07:00
mmu_gather.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mmu_notifier.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mmzone.c
mprotect.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mremap.c mm: document warning in move_normal_pmd() and make it warn only once 2020-07-13 11:37:39 -07:00
msync.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
nommu.c mm: remove vmalloc_exec 2020-06-26 00:27:38 -07:00
oom_kill.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
page_alloc.c mm/page_alloc: fix documentation error 2020-07-03 16:15:25 -07:00
page_counter.c mm, memcg: prevent memory.min load/store tearing 2020-04-02 09:35:29 -07:00
page_ext.c mm/page_ext.c: drop pfn_present() check when onlining 2020-04-07 10:43:40 -07:00
page_idle.c mm/page_idle.c: skip offline pages 2020-06-08 11:05:55 -07:00
page_io.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
page_isolation.c mm: Allow to offline unmovable PageOffline() pages via MEM_GOING_OFFLINE 2020-06-04 15:36:52 -04:00
page_owner.c mm: rename gfpflags_to_migratetype to gfp_migratetype for same convention 2020-06-03 20:09:45 -07:00
page_poison.c
page_reporting.c mm/page_reporting: add budget limit on how many pages can be reported per pass 2020-04-07 10:43:39 -07:00
page_reporting.h mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
page_vma_mapped.c mm/page_vma_mapped.c: explicitly compare pfn for normal, hugetlbfs and THP page 2020-01-31 10:30:38 -08:00
page-writeback.c mm/page-writeback: fix a typo in comment "effictive"->"effective" 2020-06-04 19:06:24 -07:00
pagewalk.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
percpu-internal.h
percpu-km.c
percpu-stats.c percpu: update copyright emails to dennis@kernel.org 2020-04-01 10:09:12 -07:00
percpu-vm.c
percpu.c mm: remove the pgprot argument to __vmalloc 2020-06-02 10:59:11 -07:00
pgtable-generic.c mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
process_vm_access.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
ptdump.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
readahead.c mm: use memalloc_nofs_save in readahead path 2020-06-02 10:59:07 -07:00
rmap.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
rodata_test.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
shmem.c vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way 2020-07-24 12:42:41 -07:00
shuffle.c mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
shuffle.h mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
slab_common.c mm: memcg/slab: fix memory leak at non-root kmem_cache destroy 2020-07-24 12:42:41 -07:00
slab.c mm/page_alloc: integrate classzone_idx and high_zoneidx 2020-06-03 20:09:44 -07:00
slab.h mm, slab: fix sign conversion problem in memcg_uncharge_slab() 2020-06-26 00:27:37 -07:00
slob.c mm/sl[uo]b: export __kmalloc_track(_node)_caller 2020-03-26 14:45:51 +01:00
slub.c slub: cure list_slab_objects() from double fix 2020-06-26 00:27:37 -07:00
sparse-vmemmap.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
sparse.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
swap_cgroup.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
swap_slots.c mm/swap_slots.c: assign|reset cache slot by value directly 2020-04-02 09:35:27 -07:00
swap_state.c mm: fix swap cache node allocation mask 2020-06-26 00:27:37 -07:00
swap.c mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages" 2020-06-26 00:27:38 -07:00
swapfile.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
truncate.c mm/thp: allow dropping THP from page cache 2019-10-19 06:32:33 -04:00
usercopy.c
userfaultfd.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
util.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
vmacache.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
vmalloc.c mm: remove vmalloc_exec 2020-06-26 00:27:38 -07:00
vmpressure.c mm: vmpressure: use mem_cgroup_is_root API 2020-04-02 09:35:31 -07:00
vmscan.c mm: workingset: age nonresident information alongside anonymous pages 2020-06-26 00:27:37 -07:00
vmstat.c mm/vmstat.c: convert to use DEFINE_SEQ_ATTRIBUTE macro 2020-06-04 19:06:26 -07:00
workingset.c mm: workingset: age nonresident information alongside anonymous pages 2020-06-26 00:27:37 -07:00
z3fold.c mm/z3fold: silence kmemleak false positives of slots 2020-05-28 11:35:40 -07:00
zbud.c mm: use false for bool variable 2020-06-04 19:06:24 -07:00
zpool.c zpool: add malloc_support_movable to zpool_driver 2019-09-24 15:54:12 -07:00
zsmalloc.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
zswap.c mm/zswap: allow setting default status, compressor and allocator in Kconfig 2020-04-07 10:43:41 -07:00