linux/drivers/gpu/drm/i915
Linus Torvalds 17839856fd gup: document and work around "COW can break either way" issue
Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.

Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.

End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.

So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.

At the same time, some users simply don't even care.

For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.

This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.
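
As a rough illustration of the rule in the paragraph above, here is a minimal userspace sketch in C. It assumes a mapping is "copy-on-write" when it is private (not shared) but may become writable. The flag values are placeholders for this example only, and effective_gup_flags() and sketch_self_check() are hypothetical helpers, not kernel functions. The real patch works on a struct vm_area_struct inside mm/gup.c and may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values for this sketch; the real ones live in kernel headers. */
#define VM_SHARED   0x08u
#define VM_MAYWRITE 0x20u

#define FOLL_WRITE  0x01u
#define FOLL_GET    0x04u
#define FOLL_PIN    0x40000u

/* Assumption: a private mapping that may become writable is a COW mapping. */
static bool is_cow_mapping(unsigned int vm_flags)
{
	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Only callers that keep a page pointer (FOLL_GET/FOLL_PIN) need the break. */
static bool should_force_cow_break(unsigned int vm_flags, unsigned int gup_flags)
{
	return is_cow_mapping(vm_flags) && (gup_flags & (FOLL_GET | FOLL_PIN)) != 0;
}

/* Hypothetical helper: the flags the lookup would actually proceed with. */
static unsigned int effective_gup_flags(unsigned int vm_flags, unsigned int gup_flags)
{
	if (should_force_cow_break(vm_flags, gup_flags))
		gup_flags |= FOLL_WRITE;
	return gup_flags;
}

/* Small usage example: a read-only FOLL_GET on a private mapping gains FOLL_WRITE. */
static void sketch_self_check(void)
{
	assert(effective_gup_flags(VM_MAYWRITE, FOLL_GET) == (FOLL_GET | FOLL_WRITE));
	assert(effective_gup_flags(VM_MAYWRITE | VM_SHARED, FOLL_GET) == FOLL_GET);
}
```

In this sketch a shared mapping, or a lookup that takes no page reference, keeps its original flags, which matches the "some users simply don't even care" case described above.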

The current semantics end up being:

 - __get_user_pages_fast(): no change. If you don't ask for a write,
   you won't break COW. You'd better know what you're doing.

 - get_user_pages_fast(): the fast-case "look it up in the page tables
   without anything getting mmap_sem" now refuses to follow a read-only
   page, since it might need COW breaking.  Which happens in the slow
   path - the fast path doesn't know if the memory might be COW or not.

 - get_user_pages() (including the slow-path fallback for gup_fast()):
   for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
   very similar semantics to FOLL_FORCE.
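
The difference between the two fast-path entries listed above can be pictured with a toy model. This is only a sketch under stated assumptions: struct toy_pte, toy_fast_lookup(), toy_gup_fast() and toy_gup_fast_only() are invented for illustration, the FOLL_WRITE value is a placeholder, and the real lockless walk in mm/gup.c is far more involved.

```c
#include <assert.h>
#include <stdbool.h>

#define FOLL_WRITE 0x01u /* placeholder value for this sketch */

enum toy_gup_path { TOY_FAST_PATH, TOY_SLOW_PATH };

/* Toy page-table entry: present, and possibly writable. */
struct toy_pte {
	bool present;
	bool writable;
};

/* Lockless lookup: succeeds only if the entry allows the requested access. */
static bool toy_fast_lookup(struct toy_pte pte, unsigned int gup_flags)
{
	if (!pte.present)
		return false;
	if ((gup_flags & FOLL_WRITE) && !pte.writable)
		return false;
	return true;
}

/*
 * Models get_user_pages_fast(): the fast path looks the page up as if for
 * writing, so a read-only entry is refused and the slow path, which can
 * see whether the mapping is COW, takes over.
 */
static enum toy_gup_path toy_gup_fast(struct toy_pte pte, unsigned int gup_flags)
{
	if (toy_fast_lookup(pte, gup_flags | FOLL_WRITE))
		return TOY_FAST_PATH;
	return TOY_SLOW_PATH;
}

/* Models __get_user_pages_fast(): unchanged, honours only the caller's flags. */
static bool toy_gup_fast_only(struct toy_pte pte, unsigned int gup_flags)
{
	return toy_fast_lookup(pte, gup_flags);
}
```

With a present but read-only entry, toy_gup_fast() falls back to the slow path even for a read request, while toy_gup_fast_only() still follows the entry for reading.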

If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path.  So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".

Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.

But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.

[ It might be worth noting that we've always had this ambiguity, and it
  could arguably be seen as a user-space issue.

  You only get private COW mappings that could break either way in
  situations where user space is doing cooperative things (ie fork()
  before an execve() etc), but it _is_ surprising and very subtle, and
  fork() is supposed to give you independent address spaces.

  So let's treat this as a kernel issue and make the semantics of
  get_user_pages() easier to understand. Note that obviously a true
  shared mapping will still get a page that can change under us, so this
  does _not_ mean that get_user_pages() somehow returns any "stable"
  page ]

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Christoph Hellwig <hch@lst.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill Shutemov <kirill@shutemov.name>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:19:17 -07:00
display Make the "Reducing compressed framebufer size" message be DRM_INFO_ONCE() 2020-05-04 11:08:39 -07:00
gem gup: document and work around "COW can break either way" issue 2020-06-02 10:19:17 -07:00
gt drm/i915: Mark concurrent submissions with a weak-dependency 2020-05-11 10:54:04 -07:00
gvt Merge tag 'gvt-fixes-2020-05-12' of https://github.com/intel/gvt-linux into drm-intel-fixes 2020-05-11 23:55:26 -07:00
oa drm/i915: reimplement header test feature 2020-01-02 12:24:10 +02:00
selftests drm/i915/execlists: Track inflight CCID 2020-05-06 15:37:59 -07:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
i915_active_types.h
i915_active.c drm/i915: Extend i915_request_await_active to use all timelines 2020-03-11 10:54:59 +00:00
i915_active.h drm/i915: Extend i915_request_await_active to use all timelines 2020-03-11 10:54:59 +00:00
i915_buddy.c drm/i915/buddy: avoid double list_add 2020-03-06 14:33:08 +00:00
i915_buddy.h
i915_cmd_parser.c drm/i915/cmd_parser: conversion to struct drm_device logging macros. 2020-02-04 11:29:40 +02:00
i915_debugfs_params.c drm/i915: Include the debugfs params header for its own definition 2020-01-17 13:00:16 +00:00
i915_debugfs_params.h drm/i915/params: add i915 parameters to debugfs 2020-01-15 15:10:16 +02:00
i915_debugfs.c drm/i915: Remove debugfs i915_drpc_info and i915_forcewake_domains 2020-03-11 09:47:12 +00:00
i915_debugfs.h drm/i915: split out display debugfs to a separate file 2020-02-14 13:26:51 +02:00
i915_drv.c drm for 5.7-rc1 2020-04-01 15:24:20 -07:00
i915_drv.h drm/i915/tgl: Add Wa_14010477008:tgl 2020-04-20 10:12:32 -07:00
i915_fixed.h
i915_gem_evict.c drm/i915: Handle idling during i915_gem_evict_something busy loops 2020-05-13 14:39:41 -07:00
i915_gem_fence_reg.c drm/i915/vgpu: improve vgpu abstractions 2020-03-03 17:46:54 +02:00
i915_gem_fence_reg.h
i915_gem_gtt.c drm/i915: significantly reduce the use of <drm/i915_drm.h> 2020-02-27 08:35:09 +02:00
i915_gem_gtt.h drm/i915/gtt: split up i915_gem_gtt 2020-01-07 19:27:36 +00:00
i915_gem.c drm/i915: significantly reduce the use of <drm/i915_drm.h> 2020-02-27 08:35:09 +02:00
i915_gem.h i915 features for v5.6: 2019-12-27 15:25:04 +10:00
i915_getparam.c
i915_globals.c drm/i915: Ratelimit i915_globals_park 2019-12-18 17:38:56 +00:00
i915_globals.h
i915_gpu_error.c drm/i915: Avoid dereferencing a dead context 2020-05-04 10:35:47 -07:00
i915_gpu_error.h drm/i915: Track hw reported context runtime 2020-02-16 15:16:22 +00:00
i915_ioc32.c drm/i915: add i915_ioc32.h for compat 2020-03-02 13:32:37 +02:00
i915_ioc32.h drm/i915: add i915_ioc32.h for compat 2020-03-02 13:32:37 +02:00
i915_irq.c drm/i915/tgl+: Fix interrupt handling for DP AUX transactions 2020-05-06 06:47:09 -07:00
i915_irq.h drm/i915: Convert to CRTC VBLANK callbacks 2020-02-13 13:08:13 +01:00
i915_memcpy.c drm/i915: remove always-defined CONFIG_AS_MOVNTDQA 2020-04-09 00:01:59 +09:00
i915_memcpy.h
i915_mm.c drm/i915/gem: Extend mmap support for lmem 2020-01-04 17:57:46 +00:00
i915_params.c drm/i915: Remove 'prefault_disable' modparam 2020-01-27 11:45:35 +00:00
i915_params.h drm/i915: Mark i915.reset as unsigned 2020-02-05 18:51:52 +00:00
i915_pci.c drm/i915/tgl: Remove require_force_probe protection 2020-03-13 14:26:09 -07:00
i915_perf_types.h drm/i915/perf: Reintroduce wait on OA configuration completion 2020-03-04 13:49:26 +02:00
i915_perf.c Linux 5.7-rc7 2020-05-28 07:58:12 +02:00
i915_perf.h drm/i915/perf: Register sysctl path globally 2019-12-13 20:16:23 +00:00
i915_pmu.c drm/i915/pmu: Avoid using globals for PMU events 2020-02-26 14:07:50 +02:00
i915_pmu.h drm/i915: significantly reduce the use of <drm/i915_drm.h> 2020-02-27 08:35:09 +02:00
i915_priolist_types.h
i915_pvinfo.h
i915_query.c
i915_query.h
i915_reg.h drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore 2020-05-06 15:37:09 -07:00
i915_request.c drm/i915: Mark concurrent submissions with a weak-dependency 2020-05-11 10:54:04 -07:00
i915_request.h drm/i915: Defer semaphore priority bumping to a workqueue 2020-03-11 23:12:39 +02:00
i915_scatterlist.c
i915_scatterlist.h
i915_scheduler_types.h drm/i915: Mark concurrent submissions with a weak-dependency 2020-05-11 10:54:04 -07:00
i915_scheduler.c drm/i915: Mark concurrent submissions with a weak-dependency 2020-05-11 10:54:04 -07:00
i915_scheduler.h drm/i915: Mark concurrent submissions with a weak-dependency 2020-05-11 10:54:04 -07:00
i915_selftest.h
i915_suspend.c drm/i915: significantly reduce the use of <drm/i915_drm.h> 2020-02-27 08:35:09 +02:00
i915_suspend.h
i915_sw_fence_work.c drm/i915: Unpin vma->obj on early error 2019-12-18 10:13:03 +00:00
i915_sw_fence_work.h
i915_sw_fence.c drm/i915/gem: Don't leak non-persistent requests on changing engines 2020-02-11 21:58:39 +00:00
i915_sw_fence.h drm/i915/gem: Don't leak non-persistent requests on changing engines 2020-02-11 21:58:39 +00:00
i915_switcheroo.c drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count 2020-01-24 17:41:34 +00:00
i915_switcheroo.h
i915_syncmap.c
i915_syncmap.h
i915_sysfs.c drm/i915/gt: Expose engine properties via sysfs 2020-02-28 22:03:19 +00:00
i915_sysfs.h
i915_trace_points.c
i915_trace.h drm/i915/trace: i915_request.prio is a signed value 2020-01-28 15:53:36 +00:00
i915_user_extensions.c
i915_user_extensions.h
i915_utils.c drm/i915: Force DPCD backlight mode on X1 Extreme 2nd Gen 4K AMOLED panel 2020-03-03 20:34:32 -05:00
i915_utils.h drm/i915: be more solid in checking the alignment 2020-03-11 23:12:39 +02:00
i915_vgpu.c drm/i915/vgpu: improve vgpu abstractions 2020-03-03 17:46:54 +02:00
i915_vgpu.h drm/i915/vgpu: improve vgpu abstractions 2020-03-03 17:46:54 +02:00
i915_vma_types.h drm/i915/gem: Extract transient execbuf flags from i915_vma 2020-03-03 21:52:51 +00:00
i915_vma.c drm/i915: Check current i915_vma.pin_count status first on unbind 2020-05-06 15:36:54 -07:00
i915_vma.h drm/i915: Use the async worker to avoid reclaim tainting the ggtt->mutex 2020-01-30 21:35:43 +00:00
intel_device_info.c drm/i915: significantly reduce the use of <drm/i915_drm.h> 2020-02-27 08:35:09 +02:00
intel_device_info.h drm/i915: Read rawclk_freq earlier 2020-02-19 14:09:18 +00:00
intel_dram.c drm/i915/dram: hide the dram structs better 2020-03-02 13:32:27 +02:00
intel_dram.h drm/i915: split out intel_dram.[ch] from i915_drv.c 2020-02-27 09:16:01 +02:00
intel_gvt.c drm/i915/gvt: make intel_gvt_active internal to intel_gvt 2020-03-03 17:47:03 +02:00
intel_gvt.h
intel_memory_region.c drm/i915: convert to new logging macros in i915/intel_memory_region.c 2020-01-17 17:44:19 +02:00
intel_memory_region.h drm/i915: lookup for mem_region of a mem_type 2020-01-05 01:08:09 +00:00
intel_pch.c drm/i915: Make WARN* drm specific where drm_priv ptr is available 2020-01-22 17:54:33 +02:00
intel_pch.h
intel_pm.c drm/i915: Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled 2020-05-04 10:36:00 -07:00
intel_pm.h drm/i915: Manipulate DBuf slices properly 2020-02-05 19:19:23 +02:00
intel_region_lmem.c drm/i915/lmem: use new struct drm_device based logging macros. 2020-01-10 16:11:04 +02:00
intel_region_lmem.h
intel_runtime_pm.c
intel_runtime_pm.h
intel_sideband.c drm for 5.7-rc1 2020-04-01 15:24:20 -07:00
intel_sideband.h
intel_uncore.c drm/i915: Make WARN* drm specific where uncore or stream ptr is available 2020-01-22 17:57:39 +02:00
intel_uncore.h
intel_wakeref.c drm/i915/gt: Flush ongoing retires during wait_for_idle 2020-01-03 00:33:07 +00:00
intel_wakeref.h drm/i915/gt: Flush ongoing retires during wait_for_idle 2020-01-03 00:33:07 +00:00
intel_wopcm.c
intel_wopcm.h
Kconfig drm/i915: Update drm/i915 bug filing URL 2020-02-17 21:16:45 +02:00
Kconfig.debug
Kconfig.profile drm/i915/gen12: Disable preemption timeout 2020-03-12 13:46:01 +00:00
Kconfig.unstable
Makefile drm/i915: remove always-defined CONFIG_AS_MOVNTDQA 2020-04-09 00:01:59 +09:00
vlv_suspend.c drm/i915: switch vlv_suspend to use intel uncore register accessors 2020-02-17 11:29:51 +02:00
vlv_suspend.h drm/i915: split out vlv/chv specific suspend/resume code 2020-02-17 11:29:35 +02:00