linux/drivers/gpu/drm/i915/gem
Chris Wilson 4f2a572eda drm/i915/userptr: Never allow userptr into the mappable GGTT
Daniel Vetter uncovered a nasty cycle in using the mmu-notifiers to
invalidate userptr objects which also happen to be pulled into GGTT
mmaps. That is when we unbind the userptr object (on mmu invalidation),
we revoke all CPU mmaps, which may then recurse into mmu invalidation.

We looked for ways of breaking the cycle, but the revocation on
invalidation is required and cannot be avoided. The only solution we
could see was to not allow such GGTT bindings of userptr objects in the
first place. In practice, no one really wants to use a GGTT mmapping of
a CPU pointer...

Just before Daniel's explosive lockdep patches land in v5.4-rc1, we got
a genuine blip from CI:

<4>[  246.793958] ======================================================
<4>[  246.793972] WARNING: possible circular locking dependency detected
<4>[  246.793989] 5.3.0-gbd6c56f50d15-drmtip_372+ #1 Tainted: G     U
<4>[  246.794003] ------------------------------------------------------
<4>[  246.794017] kswapd0/145 is trying to acquire lock:
<4>[  246.794030] 000000003f565be6 (&dev->struct_mutex/1){+.+.}, at: userptr_mn_invalidate_range_start+0x18f/0x220 [i915]
<4>[  246.794250]
                  but task is already holding lock:
<4>[  246.794263] 000000001799cef9 (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe6/0x2a0
<4>[  246.794291]
                  which lock already depends on the new lock.

<4>[  246.794307]
                  the existing dependency chain (in reverse order) is:
<4>[  246.794322]
                  -> #3 (&anon_vma->rwsem){++++}:
<4>[  246.794344]        down_write+0x33/0x70
<4>[  246.794357]        __vma_adjust+0x3d9/0x7b0
<4>[  246.794370]        __split_vma+0x16a/0x180
<4>[  246.794385]        mprotect_fixup+0x2a5/0x320
<4>[  246.794399]        do_mprotect_pkey+0x208/0x2e0
<4>[  246.794413]        __x64_sys_mprotect+0x16/0x20
<4>[  246.794429]        do_syscall_64+0x55/0x1c0
<4>[  246.794443]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  246.794456]
                  -> #2 (&mapping->i_mmap_rwsem){++++}:
<4>[  246.794478]        down_write+0x33/0x70
<4>[  246.794493]        unmap_mapping_pages+0x48/0x130
<4>[  246.794519]        i915_vma_revoke_mmap+0x81/0x1b0 [i915]
<4>[  246.794519]        i915_vma_unbind+0x11d/0x4a0 [i915]
<4>[  246.794519]        i915_vma_destroy+0x31/0x300 [i915]
<4>[  246.794519]        __i915_gem_free_objects+0xb8/0x4b0 [i915]
<4>[  246.794519]        drm_file_free.part.0+0x1e6/0x290
<4>[  246.794519]        drm_release+0xa6/0xe0
<4>[  246.794519]        __fput+0xc2/0x250
<4>[  246.794519]        task_work_run+0x82/0xb0
<4>[  246.794519]        do_exit+0x35b/0xdb0
<4>[  246.794519]        do_group_exit+0x34/0xb0
<4>[  246.794519]        __x64_sys_exit_group+0xf/0x10
<4>[  246.794519]        do_syscall_64+0x55/0x1c0
<4>[  246.794519]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  246.794519]
                  -> #1 (&vm->mutex){+.+.}:
<4>[  246.794519]        i915_gem_shrinker_taints_mutex+0x6d/0xe0 [i915]
<4>[  246.794519]        i915_address_space_init+0x9f/0x160 [i915]
<4>[  246.794519]        i915_ggtt_init_hw+0x55/0x170 [i915]
<4>[  246.794519]        i915_driver_probe+0xc9f/0x1620 [i915]
<4>[  246.794519]        i915_pci_probe+0x43/0x1b0 [i915]
<4>[  246.794519]        pci_device_probe+0x9e/0x120
<4>[  246.794519]        really_probe+0xea/0x3d0
<4>[  246.794519]        driver_probe_device+0x10b/0x120
<4>[  246.794519]        device_driver_attach+0x4a/0x50
<4>[  246.794519]        __driver_attach+0x97/0x130
<4>[  246.794519]        bus_for_each_dev+0x74/0xc0
<4>[  246.794519]        bus_add_driver+0x13f/0x210
<4>[  246.794519]        driver_register+0x56/0xe0
<4>[  246.794519]        do_one_initcall+0x58/0x300
<4>[  246.794519]        do_init_module+0x56/0x1f6
<4>[  246.794519]        load_module+0x25bd/0x2a40
<4>[  246.794519]        __se_sys_finit_module+0xd3/0xf0
<4>[  246.794519]        do_syscall_64+0x55/0x1c0
<4>[  246.794519]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  246.794519]
                  -> #0 (&dev->struct_mutex/1){+.+.}:
<4>[  246.794519]        __lock_acquire+0x15d8/0x1e90
<4>[  246.794519]        lock_acquire+0xa6/0x1c0
<4>[  246.794519]        __mutex_lock+0x9d/0x9b0
<4>[  246.794519]        userptr_mn_invalidate_range_start+0x18f/0x220 [i915]
<4>[  246.794519]        __mmu_notifier_invalidate_range_start+0x85/0x110
<4>[  246.794519]        try_to_unmap_one+0x76b/0x860
<4>[  246.794519]        rmap_walk_anon+0x104/0x280
<4>[  246.794519]        try_to_unmap+0xc0/0xf0
<4>[  246.794519]        shrink_page_list+0x561/0xc10
<4>[  246.794519]        shrink_inactive_list+0x220/0x440
<4>[  246.794519]        shrink_node_memcg+0x36e/0x740
<4>[  246.794519]        shrink_node+0xcb/0x490
<4>[  246.794519]        balance_pgdat+0x241/0x580
<4>[  246.794519]        kswapd+0x16c/0x530
<4>[  246.794519]        kthread+0x119/0x130
<4>[  246.794519]        ret_from_fork+0x24/0x50
<4>[  246.794519]
                  other info that might help us debug this:

<4>[  246.794519] Chain exists of:
                    &dev->struct_mutex/1 --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem

<4>[  246.794519]  Possible unsafe locking scenario:

<4>[  246.794519]        CPU0                    CPU1
<4>[  246.794519]        ----                    ----
<4>[  246.794519]   lock(&anon_vma->rwsem);
<4>[  246.794519]                                lock(&mapping->i_mmap_rwsem);
<4>[  246.794519]                                lock(&anon_vma->rwsem);
<4>[  246.794519]   lock(&dev->struct_mutex/1);
<4>[  246.794519]
                   *** DEADLOCK ***

v2: Say no to mmap_ioctl

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111744
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111870
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190928082546.3473-1-chris@chris-wilson.co.uk
(cherry picked from commit a4311745bb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-10-16 10:56:50 -07:00
..
selftests Merge drm/drm-next into drm-intel-next-queued 2019-08-22 00:10:36 -07:00
i915_gem_busy.c dma-buf: Restore seqlock around dma_resv updates 2019-08-16 12:40:58 +01:00
i915_gem_clflush.c drm/i915: Generalise the clflush dma-worker 2019-08-22 08:27:44 +01:00
i915_gem_clflush.h drm/i915: Move more GEM objects under gem/ 2019-05-28 12:45:29 +01:00
i915_gem_client_blt.c Merge drm/drm-next into drm-intel-next-queued 2019-08-22 00:10:36 -07:00
i915_gem_client_blt.h drm/i915: add in-kernel blitter client 2019-05-30 12:01:44 +01:00
i915_gem_context_types.h drm/i915: Push the ring creation flags to the backend 2019-08-09 20:18:30 +01:00
i915_gem_context.c drm/i915/tgl: add GEN12_MAX_CONTEXT_HW_ID 2019-08-20 14:23:45 +01:00
i915_gem_context.h drm/i915: Remove i915_gem_context_create_gvt() 2019-08-09 20:18:30 +01:00
i915_gem_dmabuf.c dma-buf: rename reservation_object to dma_resv 2019-08-13 09:09:30 +02:00
i915_gem_domain.c drm/i915: Replace i915_vma_put_fence() 2019-08-22 08:53:42 +01:00
i915_gem_execbuffer.c drm main pull for 5.4-rc1 2019-09-19 16:24:24 -07:00
i915_gem_fence.c Merge drm/drm-next into drm-intel-next-queued 2019-08-22 00:10:36 -07:00
i915_gem_internal.c drm/i915: Pull scatterlist utils out of i915_gem.h 2019-05-28 12:45:29 +01:00
i915_gem_ioctls.h
i915_gem_mman.c drm/i915/userptr: Never allow userptr into the mappable GGTT 2019-10-16 10:56:50 -07:00
i915_gem_object_blt.c drm/i915: Serialize against vma moves 2019-08-19 15:25:56 +01:00
i915_gem_object_blt.h drm/i915/blt: support copying objects 2019-08-10 19:35:36 +01:00
i915_gem_object_types.h drm/i915/userptr: Never allow userptr into the mappable GGTT 2019-10-16 10:56:50 -07:00
i915_gem_object.c Merge drm/drm-next into drm-intel-next-queued 2019-08-22 00:10:36 -07:00
i915_gem_object.h drm/i915/userptr: Never allow userptr into the mappable GGTT 2019-10-16 10:56:50 -07:00
i915_gem_pages.c Merge drm/drm-next into drm-intel-next-queued 2019-08-22 00:10:36 -07:00
i915_gem_phys.c drm/i915: Free the imported shmemfs file for phys objects 2019-08-09 13:32:29 +01:00
i915_gem_pm.c drm/i915: Perform GGTT restore much earlier during resume 2019-10-07 10:44:46 -07:00
i915_gem_pm.h drm/i915: Move more GEM objects under gem/ 2019-05-28 12:45:29 +01:00
i915_gem_shmem.c drm/i915: avoid including intel_drv.h via i915_drv.h->i915_trace.h 2019-08-07 12:43:14 +03:00
i915_gem_shrinker.c drm/i915/gem: Make caps.scheduler static 2019-08-06 15:00:14 +01:00
i915_gem_shrinker.h drm/i915: extract i915_gem_shrinker.h from i915_drv.h 2019-08-09 12:03:32 +03:00
i915_gem_stolen.c drm/i915: Convert a few more bland dmesg info to be device specific 2019-08-15 13:13:23 +01:00
i915_gem_stolen.h drm/i915: extract gem/i915_gem_stolen.h from i915_drv.h 2019-08-09 12:03:29 +03:00
i915_gem_throttle.c drm/i915/gt: Use intel_gt as the primary object for handling resets 2019-07-12 21:06:56 +01:00
i915_gem_tiling.c drm/i915: Move more GEM objects under gem/ 2019-05-28 12:45:29 +01:00
i915_gem_userptr.c drm/i915/userptr: Never allow userptr into the mappable GGTT 2019-10-16 10:56:50 -07:00
i915_gem_wait.c dma-buf: rename reservation_object to dma_resv 2019-08-13 09:09:30 +02:00
i915_gemfs.c drm/i915: Stop reconfiguring our shmemfs mountpoint 2019-08-09 20:18:30 +01:00
i915_gemfs.h drm/i915: Move more GEM objects under gem/ 2019-05-28 12:45:29 +01:00
Makefile drm/i915: use upstream version of header tests 2019-07-30 12:11:57 +03:00