linux/mm
Daniel Jordan 2be7cfed99 mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors
Commit 9a291a7c94 ("mm/hugetlb: report -EHWPOISON not -EFAULT when
FOLL_HWPOISON is specified") causes __get_user_pages to ignore certain
errors from follow_hugetlb_page.  After such error, __get_user_pages
subsequently calls faultin_page on the same VMA and start address that
follow_hugetlb_page failed on instead of returning the error immediately
as it should.

In follow_hugetlb_page, when hugetlb_fault returns a value covered under
VM_FAULT_ERROR, follow_hugetlb_page returns it without setting nr_pages
to 0 as __get_user_pages expects in this case, which causes the
following to happen in __get_user_pages: the "while (nr_pages)" check
succeeds, we skip the "if (!vma..." check because we got a VMA the last
time around, we find no page with follow_page_mask, and we call
faultin_page, which calls hugetlb_fault for the second time.

This issue also slightly changes how __get_user_pages works.  Before, it
only returned error if it had made no progress (i = 0).  But now,
follow_hugetlb_page can clobber "i" with an error code since its new
return path doesn't check for progress.  So if "i" is nonzero before a
failing call to follow_hugetlb_page, that indication of progress is lost
and __get_user_pages can return error even if some pages were
successfully pinned.

To fix this, change follow_hugetlb_page so that it updates nr_pages,
allowing __get_user_pages to fail immediately and restoring the "error
only if no progress" behavior to __get_user_pages.

Tested that __get_user_pages returns when expected on error from
hugetlb_fault in follow_hugetlb_page.

Fixes: 9a291a7c94 ("mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified")
Link: http://lkml.kernel.org/r/1500406795-58462-1-git-send-email-daniel.m.jordan@oracle.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: James Morse <james.morse@arm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: zhong jiang <zhongjiang@huawei.com>
Cc: <stable@vger.kernel.org>	[4.12.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-02 16:34:46 -07:00
..
kasan kasan: make get_wild_bug_type() static 2017-07-10 16:32:33 -07:00
backing-dev.c bdi: Drop 'parent' argument from bdi_register[_va]() 2017-04-20 12:09:55 -06:00
balloon_compaction.c mm/balloon_compaction.c: enqueue zero page to balloon device 2017-07-10 16:32:32 -07:00
bootmem.c mm/bootmem.c: cosmetic improvement of code readability 2017-02-22 16:41:29 -08:00
cleancache.c fs: switch ->s_uuid to uuid_t 2017-06-05 16:59:12 +02:00
cma_debug.c cma: Store a name in the cma structure 2017-04-18 20:41:12 +02:00
cma.c cma: fix calculation of aligned offset 2017-07-10 16:32:32 -07:00
cma.h cma: Store a name in the cma structure 2017-04-18 20:41:12 +02:00
compaction.c mm, compaction: skip over holes in __reset_isolation_suitable 2017-07-06 16:24:32 -07:00
debug_page_ref.c mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
debug.c mm, debug: print raw struct page data in __dump_page() 2016-12-12 18:55:08 -08:00
dmapool.c lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
early_ioremap.c
fadvise.c mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED 2016-12-20 09:48:46 -08:00
failslab.c mm: fault-inject take over bootstrap kmem_cache check 2016-03-15 16:55:16 -07:00
filemap.c mm: hugetlb: return immediately for hugetlb page in __delete_from_page_cache() 2017-07-10 16:32:30 -07:00
frame_vector.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
frontswap.c mm, frontswap: convert frontswap_enabled to static key 2016-07-26 16:19:19 -07:00
gup.c mm, gup: ensure real head page is ref-counted when using hugepages 2017-07-06 16:24:34 -07:00
highmem.c mm/highmem: make nr_free_highpages() handles all highmem zones by itself 2016-05-19 19:12:14 -07:00
huge_memory.c mm, THP, swap: check whether THP can be split firstly 2017-07-06 16:24:31 -07:00
hugetlb_cgroup.c mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size 2016-05-20 17:58:30 -07:00
hugetlb.c mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors 2017-08-02 16:34:46 -07:00
hwpoison-inject.c mm: hwpoison: call shake_page() unconditionally 2017-05-03 15:52:12 -07:00
init-mm.c mm: Add a user_ns owner to mm_struct and fix ptrace permission checks 2016-11-22 11:49:48 -06:00
internal.h mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic 2017-07-12 16:26:03 -07:00
interval_tree.c
Kconfig mm/kasan: add support for memory hotplug 2017-07-10 16:32:33 -07:00
Kconfig.debug mm: enable page poisoning early at boot 2017-05-03 15:52:10 -07:00
khugepaged.c mm: make PR_SET_THP_DISABLE immediately active 2017-07-10 16:32:31 -07:00
kmemcheck.c mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU 2017-04-18 11:42:36 -07:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c mm: kmemleak: treat vm_struct as alternative reference to vmalloc'ed objects 2017-07-06 16:24:34 -07:00
ksm.c ksm: optimize refile of stable_node_dup at the head of the chain 2017-07-06 16:24:31 -07:00
list_lru.c mm/list_lru.c: fix list_lru_count_node() to be race free 2017-07-10 16:32:33 -07:00
maccess.c x86: remove more uaccess_32.h complexity 2016-05-22 17:21:27 -07:00
madvise.c userfaultfd: non-cooperative: add madvise() event for MADV_FREE request 2017-07-10 16:32:32 -07:00
Makefile percpu: expose statistics about percpu memory via debugfs 2017-06-20 15:31:38 -04:00
memblock.c mm, memory_hotplug: move movable_node to the hotplug proper 2017-07-06 16:24:35 -07:00
memcontrol.c mm, memcg: fix potential undefined behavior in mem_cgroup_event_ratelimit() 2017-07-10 16:32:32 -07:00
memory_hotplug.c mm/memory-hotplug: switch locking to a percpu rwsem 2017-07-10 16:32:33 -07:00
memory-failure.c mm, hugetlb, soft_offline: use new_page_nodemask for soft offline migration 2017-07-10 16:32:32 -07:00
memory.c mm/memory.c: mark create_huge_pmd() inline to prevent build failure 2017-07-12 16:25:59 -07:00
mempolicy.c mm, migration: do not trigger OOM killer when migrating memory 2017-07-12 16:26:04 -07:00
mempool.c sched/wait: Rename wait_queue_t => wait_queue_entry_t 2017-06-20 12:18:27 +02:00
memtest.c
migrate.c mm/migrate.c: stabilise page count when migrating transparent hugepages 2017-07-10 16:32:31 -07:00
mincore.c mm: remove shmem_mapping() shmem_zero_setup() duplicates 2017-02-24 17:46:56 -08:00
mlock.c mlock: fix mlock count can not decrease in race condition 2017-06-02 15:07:38 -07:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c mm: fix overflow check in expand_upwards() 2017-07-14 15:05:12 -07:00
mmu_context.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
mmu_notifier.c mm: Use static initialization for "srcu" 2017-04-18 11:38:22 -07:00
mmzone.c mm/mmzone.c: swap likely to unlikely as code logic is different for next_zones_zonelist() 2017-02-22 16:41:29 -08:00
mprotect.c mm: drop NULL return check of pte_offset_map_lock() 2017-07-06 16:24:33 -07:00
mremap.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
msync.c
nobootmem.c mm/nobootmem.c: return 0 when start_pfn equals end_pfn 2017-07-06 16:24:31 -07:00
nommu.c mm, vmalloc: use __GFP_HIGHMEM implicitly 2017-05-08 17:15:13 -07:00
oom_kill.c mm/oom_kill.c: add tracepoints for oom reaper-related events 2017-07-10 16:32:32 -07:00
page_alloc.c mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic 2017-07-12 16:26:03 -07:00
page_counter.c
page_ext.c mm: enable page poisoning early at boot 2017-05-03 15:52:10 -07:00
page_idle.c mm: make rmap_one boolean function 2017-05-03 15:52:10 -07:00
page_io.c swap: add block io poll in swapin path 2017-07-10 16:32:30 -07:00
page_isolation.c mm: unify new_node_page and alloc_migrate_target 2017-07-10 16:32:31 -07:00
page_owner.c mm: avoid taking zone lock in pagetypeinfo_showmixed() 2017-07-10 16:32:32 -07:00
page_poison.c mm: enable page poisoning early at boot 2017-05-03 15:52:10 -07:00
page_vma_mapped.c mm/hugetlb: add size parameter to huge_pte_offset() 2017-07-06 16:24:34 -07:00
page-writeback.c writeback: rework wb_[dec|inc]_stat family of functions 2017-07-12 16:26:05 -07:00
pagewalk.c mm/hugetlb: add size parameter to huge_pte_offset() 2017-07-06 16:24:34 -07:00
percpu-internal.h percpu: fix early calls for spinlock in pcpu_stats 2017-06-21 13:53:52 -04:00
percpu-km.c percpu: fix static checker warnings in pcpu_destroy_chunk 2017-06-29 11:23:38 -04:00
percpu-stats.c percpu: expose statistics about percpu memory via debugfs 2017-06-20 15:31:38 -04:00
percpu-vm.c percpu: fix static checker warnings in pcpu_destroy_chunk 2017-06-29 11:23:38 -04:00
percpu.c percpu: resolve err may not be initialized in pcpu_alloc 2017-06-21 12:00:45 -04:00
pgtable-generic.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
process_vm_access.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> 2017-03-02 08:42:28 +01:00
quicklist.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
readahead.c mm: don't cap request size based on read-ahead setting 2016-12-12 18:55:08 -08:00
rmap.c mm: memcontrol: per-lruvec stats infrastructure 2017-07-06 16:24:35 -07:00
rodata_test.c mm: remove rodata_test_data export, add pr_fmt 2017-05-03 15:52:09 -07:00
shmem.c mm: make PR_SET_THP_DISABLE immediately active 2017-07-10 16:32:31 -07:00
slab_common.c mm: allow slab_nomerge to be set at build time 2017-07-06 16:24:31 -07:00
slab.c mm: memcontrol: account slab stats per lruvec 2017-07-06 16:24:35 -07:00
slab.h mm: memcontrol: account slab stats per lruvec 2017-07-06 16:24:35 -07:00
slob.c mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU 2017-04-18 11:42:36 -07:00
slub.c mm: memcontrol: account slab stats per lruvec 2017-07-06 16:24:35 -07:00
sparse-vmemmap.c mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic 2017-07-12 16:26:03 -07:00
sparse.c mm, memory_hotplug: do not associate hotadded memory to zones until online 2017-07-06 16:24:32 -07:00
swap_cgroup.c mm, THP, swap: delay splitting THP during swap out 2017-07-06 16:24:31 -07:00
swap_slots.c mm/swap_slots.c: don't disable preemption while taking the per-CPU cache 2017-07-10 16:32:32 -07:00
swap_state.c swap: add block io poll in swapin path 2017-07-10 16:32:30 -07:00
swap.c mm: swap: provide lru_add_drain_all_cpuslocked() 2017-07-10 16:32:33 -07:00
swapfile.c swap: add block io poll in swapin path 2017-07-10 16:32:30 -07:00
truncate.c mm/truncate.c: fix THP handling in invalidate_mapping_pages() 2017-07-10 16:32:32 -07:00
usercopy.c mm/usercopy: Drop extra is_vmalloc_or_module() check 2017-04-05 12:30:18 -07:00
userfaultfd.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
util.c Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-15 12:00:42 -07:00
vmacache.c sched/headers: Prepare to move 'init_task' and 'init_thread_union' from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
vmalloc.c mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic 2017-07-12 16:26:03 -07:00
vmpressure.c mm, vmpressure: pass-through notification support 2017-07-10 16:32:31 -07:00
vmscan.c mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic 2017-07-12 16:26:03 -07:00
vmstat.c mm: avoid taking zone lock in pagetypeinfo_showmixed() 2017-07-10 16:32:32 -07:00
workingset.c mm: memcontrol: per-lruvec stats infrastructure 2017-07-06 16:24:35 -07:00
z3fold.c z3fold: fix page locking in z3fold_alloc() 2017-04-13 18:24:20 -07:00
zbud.c
zpool.c
zsmalloc.c mm/zsmalloc: simplify zs_max_alloc_size handling 2017-07-10 16:32:33 -07:00
zswap.c mm/zswap.c: delete an error message for a failed memory allocation in zswap_dstmem_prepare() 2017-07-06 16:24:35 -07:00