17 Commits

Author SHA1 Message Date
Lucas De Marchi
6266992cf1 drm/i915/gt: remove GRAPHICS_VER == 10
Replace all remaining handling of GRAPHICS_VER {==,>=} 10 with
{==,>=} 11. With the removal of CNL, there is no platform with a graphics
version equal to 10.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728220326.1578242-5-lucas.demarchi@intel.com
2021-07-29 10:06:10 -07:00
Matthew Auld
32334c9b1f drm/i915/gtt: ignore min_page_size for paging structures
The min_page_size is only needed for pages inserted into the GTT, and
for our paging structures we only need at most 4K bytes, so simply
ignore the min_page_size restrictions here, otherwise we might see some
severe overallocation on some devices.
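
To make the overallocation concrete, here is a minimal userspace sketch, not
the driver code. The helper names and the 64K min_page_size used in the usage
note are assumptions chosen for illustration only.

```c
#include <stddef.h>

#define SZ_4K  ((size_t)4096)
#define SZ_64K ((size_t)65536)

/* Round size up to a multiple of align; align must be a power of two. */
static inline size_t round_up_pow2(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/*
 * Hypothetical: backing size of one 4K paging structure if the
 * allocation honoured the region's min_page_size.
 */
static inline size_t pt_size_with_min_page_size(size_t min_page_size)
{
	return round_up_pow2(SZ_4K, min_page_size);
}

/* Hypothetical: backing size when min_page_size is ignored. */
static inline size_t pt_size_ignoring_min_page_size(void)
{
	return SZ_4K;
}
```

With an assumed 64K min_page_size, pt_size_with_min_page_size(SZ_64K) is
65536 bytes for a structure that only needs 4096, i.e. 16 times more than
required per page table.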

v2(Thomas): add some commentary

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210625103824.558481-2-matthew.auld@intel.com
2021-06-30 13:24:29 +01:00
Lucas De Marchi
c816723b6b drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VER
This was done by the following semantic patch:

	@@ expression i915; @@
	- INTEL_GEN(i915)
	+ GRAPHICS_VER(i915)

	@@ expression i915; expression E; @@
	- INTEL_GEN(i915) >= E
	+ GRAPHICS_VER(i915) >= E

	@@ expression dev_priv; expression E; @@
	- !IS_GEN(dev_priv, E)
	+ GRAPHICS_VER(dev_priv) != E

	@@ expression dev_priv; expression E; @@
	- IS_GEN(dev_priv, E)
	+ GRAPHICS_VER(dev_priv) == E

	@@
	expression dev_priv;
	expression from, until;
	@@
	- IS_GEN_RANGE(dev_priv, from, until)
	+ IS_GRAPHICS_VER(dev_priv, from, until)

	@def@
	expression E;
	identifier id =~ "^gen$";
	@@
	- id = GRAPHICS_VER(E)
	+ ver = GRAPHICS_VER(E)

	@@
	identifier def.id;
	@@
	- id
	+ ver

It also takes care of renaming the variable we assign to GRAPHICS_VER()
so as to use "ver" rather than "gen".
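
As a rough illustration of what the rules above produce, the following
userspace sketch shows the rewritten form of each pattern. struct mock_i915
and the helper functions are hypothetical stand-ins, and the two macros are
simplified approximations rather than the driver's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's device structure. */
struct mock_i915 {
	unsigned int graphics_ver;
};

/* Simplified approximations of the macros named in the semantic patch. */
#define GRAPHICS_VER(i915) ((i915)->graphics_ver)
#define IS_GRAPHICS_VER(i915, from, until) \
	(GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))

/* Before: IS_GEN(i915, 9) */
static inline bool is_ver9(const struct mock_i915 *i915)
{
	return GRAPHICS_VER(i915) == 9;
}

/* Before: !IS_GEN(i915, 9) */
static inline bool is_not_ver9(const struct mock_i915 *i915)
{
	return GRAPHICS_VER(i915) != 9;
}

/* Before: IS_GEN_RANGE(i915, 8, 11) */
static inline bool is_ver8_to_11(const struct mock_i915 *i915)
{
	return IS_GRAPHICS_VER(i915, 8, 11);
}

/* Before: a local named "gen" held INTEL_GEN(i915); it is now "ver". */
static inline bool is_ver11_or_later(const struct mock_i915 *i915)
{
	unsigned int ver = GRAPHICS_VER(i915);

	return ver >= 11;
}
```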

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
2021-06-05 15:09:06 -07:00
Thomas Hellström
4d8151ae53 drm/i915: Don't free shared locks while shared
We are currently sharing the VM reservation locks across a number of
gem objects with page-table memory. Since TTM will individualize the
reservation locks when freeing objects, including accessing the shared
locks, make sure that the shared locks are not freed until that is done.
For PPGTT we add an additional refcount, for GGTT we take additional
measures to make sure objects sharing the GGTT reservation lock are
freed at GGTT takedown.
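
The extra refcount idea for the PPGTT case can be sketched in userspace as
follows. This is a single-threaded toy under stated assumptions: every type
and function name here is hypothetical, and the real driver uses its own
reference-counting and locking primitives.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reservation lock shared by several objects. */
struct shared_resv {
	int refcount;	/* the owning VM plus every object sharing the lock */
	bool freed;	/* set once the last reference has been dropped */
};

/* Hypothetical stand-in for a page-table object using the shared lock. */
struct mock_obj {
	struct shared_resv *resv;
};

static inline void shared_resv_init(struct shared_resv *r)
{
	r->refcount = 1;	/* reference held by the owning VM */
	r->freed = false;
}

static inline void shared_resv_get(struct shared_resv *r)
{
	r->refcount++;
}

static inline void shared_resv_put(struct shared_resv *r)
{
	if (--r->refcount == 0)
		r->freed = true;	/* the real code would free the lock here */
}

static inline void mock_obj_init(struct mock_obj *obj, struct shared_resv *r)
{
	shared_resv_get(r);
	obj->resv = r;
}

/* Teardown may still touch obj->resv, so the reference is dropped last. */
static inline void mock_obj_free(struct mock_obj *obj)
{
	/* ... teardown that may access obj->resv would run here ... */
	shared_resv_put(obj->resv);
	obj->resv = NULL;
}
```

If the VM drops its own reference while an object still exists, the lock
stays alive until mock_obj_free() has finished with it.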

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210601074654.3103-3-thomas.hellstrom@linux.intel.com
2021-06-01 09:32:33 +01:00
Maarten Lankhorst
bc6f80cce9 drm/i915: Use trylock in shrinker for ggtt on bsw vt-d and bxt, v2.
The stop_machine() lock may allocate memory, but is called inside
vm->mutex, which is taken in the shrinker. This will cause a lockdep
splat, as can be seen below:

<4>[  462.585762] ======================================================
<4>[  462.585768] WARNING: possible circular locking dependency detected
<4>[  462.585773] 5.12.0-rc5-CI-Trybot_7644+ #1 Tainted: G     U
<4>[  462.585779] ------------------------------------------------------
<4>[  462.585783] i915_selftest/5540 is trying to acquire lock:
<4>[  462.585788] ffffffff826440b0 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x12/0x30
<4>[  462.585814]
                  but task is already holding lock:
<4>[  462.585818] ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
<4>[  462.586301]
                  which lock already depends on the new lock.

<4>[  462.586305]
                  the existing dependency chain (in reverse order) is:
<4>[  462.586309]
                  -> #2 (&vm->mutex/1){+.+.}-{3:3}:
<4>[  462.586323]        i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
<4>[  462.586719]        i915_address_space_init+0x12d/0x130 [i915]
<4>[  462.587092]        ppgtt_init+0x4e/0x80 [i915]
<4>[  462.587467]        gen8_ppgtt_create+0x3e/0x5c0 [i915]
<4>[  462.587828]        i915_ppgtt_create+0x28/0xf0 [i915]
<4>[  462.588203]        intel_gt_init+0x123/0x370 [i915]
<4>[  462.588572]        i915_gem_init+0x129/0x1f0 [i915]
<4>[  462.588971]        i915_driver_probe+0x753/0xd80 [i915]
<4>[  462.589320]        i915_pci_probe+0x43/0x1d0 [i915]
<4>[  462.589671]        pci_device_probe+0x9e/0x110
<4>[  462.589680]        really_probe+0xea/0x410
<4>[  462.589690]        driver_probe_device+0xd9/0x140
<4>[  462.589697]        device_driver_attach+0x4a/0x50
<4>[  462.589704]        __driver_attach+0x83/0x140
<4>[  462.589711]        bus_for_each_dev+0x75/0xc0
<4>[  462.589718]        bus_add_driver+0x14b/0x1f0
<4>[  462.589724]        driver_register+0x66/0xb0
<4>[  462.589731]        i915_init+0x70/0x87 [i915]
<4>[  462.590053]        do_one_initcall+0x56/0x2e0
<4>[  462.590061]        do_init_module+0x55/0x200
<4>[  462.590068]        load_module+0x2703/0x2990
<4>[  462.590074]        __do_sys_finit_module+0xad/0x110
<4>[  462.590080]        do_syscall_64+0x33/0x80
<4>[  462.590089]        entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[  462.590096]
                  -> #1 (fs_reclaim){+.+.}-{0:0}:
<4>[  462.590109]        fs_reclaim_acquire+0x9f/0xd0
<4>[  462.590118]        kmem_cache_alloc_trace+0x3d/0x430
<4>[  462.590126]        intel_cpuc_prepare+0x3b/0x1b0
<4>[  462.590133]        cpuhp_invoke_callback+0x9e/0x890
<4>[  462.590141]        _cpu_up+0xa4/0x130
<4>[  462.590147]        cpu_up+0x82/0x90
<4>[  462.590153]        bringup_nonboot_cpus+0x4a/0x60
<4>[  462.590159]        smp_init+0x21/0x5c
<4>[  462.590167]        kernel_init_freeable+0x8a/0x1b7
<4>[  462.590175]        kernel_init+0x5/0xff
<4>[  462.590181]        ret_from_fork+0x22/0x30
<4>[  462.590187]
                  -> #0 (cpu_hotplug_lock){++++}-{0:0}:
<4>[  462.590199]        __lock_acquire+0x1520/0x2590
<4>[  462.590207]        lock_acquire+0xd1/0x3d0
<4>[  462.590213]        cpus_read_lock+0x39/0xc0
<4>[  462.590219]        stop_machine+0x12/0x30
<4>[  462.590226]        bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
<4>[  462.590601]        ggtt_bind_vma+0x5d/0x80 [i915]
<4>[  462.590970]        i915_vma_bind+0xdc/0x1c0 [i915]
<4>[  462.591374]        i915_vma_pin_ww+0x435/0xb40 [i915]
<4>[  462.591779]        make_obj_busy+0xcb/0x330 [i915]
<4>[  462.592170]        igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
<4>[  462.592562]        __i915_subtests.cold.7+0x42/0x92 [i915]
<4>[  462.592995]        __run_selftests.part.3+0x10d/0x172 [i915]
<4>[  462.593428]        i915_live_selftests.cold.5+0x1f/0x47 [i915]
<4>[  462.593860]        i915_pci_probe+0x93/0x1d0 [i915]
<4>[  462.594210]        pci_device_probe+0x9e/0x110
<4>[  462.594217]        really_probe+0xea/0x410
<4>[  462.594226]        driver_probe_device+0xd9/0x140
<4>[  462.594233]        device_driver_attach+0x4a/0x50
<4>[  462.594240]        __driver_attach+0x83/0x140
<4>[  462.594247]        bus_for_each_dev+0x75/0xc0
<4>[  462.594254]        bus_add_driver+0x14b/0x1f0
<4>[  462.594260]        driver_register+0x66/0xb0
<4>[  462.594267]        i915_init+0x70/0x87 [i915]
<4>[  462.594586]        do_one_initcall+0x56/0x2e0
<4>[  462.594592]        do_init_module+0x55/0x200
<4>[  462.594599]        load_module+0x2703/0x2990
<4>[  462.594605]        __do_sys_finit_module+0xad/0x110
<4>[  462.594612]        do_syscall_64+0x33/0x80
<4>[  462.594618]        entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[  462.594625]
                  other info that might help us debug this:

<4>[  462.594629] Chain exists of:
                    cpu_hotplug_lock --> fs_reclaim --> &vm->mutex/1

<4>[  462.594645]  Possible unsafe locking scenario:

<4>[  462.594648]        CPU0                    CPU1
<4>[  462.594652]        ----                    ----
<4>[  462.594655]   lock(&vm->mutex/1);
<4>[  462.594664]                                lock(fs_reclaim);
<4>[  462.594671]                                lock(&vm->mutex/1);
<4>[  462.594679]   lock(cpu_hotplug_lock);
<4>[  462.594686]
                   *** DEADLOCK ***

<4>[  462.594690] 4 locks held by i915_selftest/5540:
<4>[  462.594696]  #0: ffff888100fbc240 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x50
<4>[  462.594715]  #1: ffffc900006cb9a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: make_obj_busy+0x81/0x330 [i915]
<4>[  462.595118]  #2: ffff88812a6081e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: make_obj_busy+0x21f/0x330 [i915]
<4>[  462.595519]  #3: ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
<4>[  462.595934]
                  stack backtrace:
<4>[  462.595939] CPU: 0 PID: 5540 Comm: i915_selftest Tainted: G     U            5.12.0-rc5-CI-Trybot_7644+ #1
<4>[  462.595947] Hardware name: GOOGLE Kefka/Kefka, BIOS MrChromebox 02/04/2018
<4>[  462.595952] Call Trace:
<4>[  462.595961]  dump_stack+0x7f/0xad
<4>[  462.595974]  check_noncircular+0x12e/0x150
<4>[  462.595982]  ? save_stack.isra.17+0x3f/0x70
<4>[  462.595991]  ? drm_mm_insert_node_in_range+0x34a/0x5b0
<4>[  462.596000]  ? i915_vma_pin_ww+0x9ec/0xb40 [i915]
<4>[  462.596410]  __lock_acquire+0x1520/0x2590
<4>[  462.596419]  ? do_init_module+0x55/0x200
<4>[  462.596429]  lock_acquire+0xd1/0x3d0
<4>[  462.596435]  ? stop_machine+0x12/0x30
<4>[  462.596445]  ? gen8_ggtt_insert_entries+0xf0/0xf0 [i915]
<4>[  462.596816]  cpus_read_lock+0x39/0xc0
<4>[  462.596824]  ? stop_machine+0x12/0x30
<4>[  462.596831]  stop_machine+0x12/0x30
<4>[  462.596839]  bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
<4>[  462.597210]  ggtt_bind_vma+0x5d/0x80 [i915]
<4>[  462.597580]  i915_vma_bind+0xdc/0x1c0 [i915]
<4>[  462.597986]  i915_vma_pin_ww+0x435/0xb40 [i915]
<4>[  462.598395]  ? make_obj_busy+0xcb/0x330 [i915]
<4>[  462.598786]  make_obj_busy+0xcb/0x330 [i915]
<4>[  462.599180]  ? 0xffffffff81000000
<4>[  462.599187]  ? debug_mutex_unlock+0x50/0xa0
<4>[  462.599198]  igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
<4>[  462.599592]  __i915_subtests.cold.7+0x42/0x92 [i915]
<4>[  462.600026]  ? i915_perf_selftests+0x20/0x20 [i915]
<4>[  462.600422]  ? __i915_nop_setup+0x10/0x10 [i915]
<4>[  462.600820]  __run_selftests.part.3+0x10d/0x172 [i915]
<4>[  462.601253]  i915_live_selftests.cold.5+0x1f/0x47 [i915]
<4>[  462.601686]  i915_pci_probe+0x93/0x1d0 [i915]
<4>[  462.602037]  ? _raw_spin_unlock_irqrestore+0x3d/0x60
<4>[  462.602047]  pci_device_probe+0x9e/0x110
<4>[  462.602057]  really_probe+0xea/0x410
<4>[  462.602067]  driver_probe_device+0xd9/0x140
<4>[  462.602075]  device_driver_attach+0x4a/0x50
<4>[  462.602084]  __driver_attach+0x83/0x140
<4>[  462.602091]  ? device_driver_attach+0x50/0x50
<4>[  462.602099]  ? device_driver_attach+0x50/0x50
<4>[  462.602107]  bus_for_each_dev+0x75/0xc0
<4>[  462.602116]  bus_add_driver+0x14b/0x1f0
<4>[  462.602124]  driver_register+0x66/0xb0
<4>[  462.602133]  i915_init+0x70/0x87 [i915]
<4>[  462.602453]  ? 0xffffffffa0606000
<4>[  462.602458]  do_one_initcall+0x56/0x2e0
<4>[  462.602466]  ? kmem_cache_alloc_trace+0x374/0x430
<4>[  462.602476]  do_init_module+0x55/0x200
<4>[  462.602484]  load_module+0x2703/0x2990
<4>[  462.602500]  ? __do_sys_finit_module+0xad/0x110
<4>[  462.602507]  __do_sys_finit_module+0xad/0x110
<4>[  462.602519]  do_syscall_64+0x33/0x80
<4>[  462.602527]  entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[  462.602535] RIP: 0033:0x7fab69d8d89d

Changes since v1:
- Add lockdep annotations during init, to ensure that lockdep is primed.
  This also fixes a false positive when reading /proc/lockdep_stats
  during module reload.
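
The general trylock pattern named in the subject can be sketched in userspace
like this. It is a toy model only: toy_mutex stands in for vm->mutex, and
toy_shrink() is a hypothetical shrinker step, not the i915 shrinker.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock standing in for vm->mutex. */
struct toy_mutex {
	atomic_flag locked;
};

/* Try to take the lock without waiting; true means we now hold it. */
static inline bool toy_trylock(struct toy_mutex *m)
{
	/* test_and_set returns the previous state: false means it was free. */
	return !atomic_flag_test_and_set(&m->locked);
}

static inline void toy_unlock(struct toy_mutex *m)
{
	atomic_flag_clear(&m->locked);
}

/*
 * Hypothetical shrinker step. Instead of blocking on a lock that may be
 * held across an allocation, it gives up and reports no progress.
 */
static inline unsigned long toy_shrink(struct toy_mutex *vm_mutex,
				       unsigned long nr_to_scan)
{
	if (!toy_trylock(vm_mutex))
		return 0;	/* lock busy: skip this VM rather than wait */

	/* ... up to nr_to_scan objects would be unbound here ... */

	toy_unlock(vm_mutex);
	return nr_to_scan;
}
```

Because the shrinker never waits for the lock, reclaim no longer has to
block on vm->mutex, which is the edge the splat above complains about.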

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210426102351.921874-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2021-04-29 11:32:32 +02:00
Matthew Auld
6aed5673f0 drm/i915/gtt/dgfx: place the PD in LMEM
It's a requirement that for dgfx we place all the paging structures in
device local-memory.

v2: use i915_coherent_map_type()
v3: improve the shared dma-resv object comment

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-4-matthew.auld@intel.com
2021-04-27 16:21:47 +01:00
Matthew Auld
529b9ec809 drm/i915/gtt: map the PD up front
We need to generalise our accessor for the page directories and tables from
using the simple kmap_atomic to support local memory, and this setup
must be done on acquisition of the backing storage prior to entering
fence execution contexts. Here we replace the kmap with the object
mapping code, which for a simple single-page shmemfs object will return a
plain kmap that is then kept for the lifetime of the page directory.

Note that keeping the mapping around is a potential concern here, since
while the vma is pinned the mapping remains there for the PDs
underneath, or at least until the used_count reaches zero, at which
point we can safely destroy the mapping. For 32b this will be even worse
since the address space is more limited, but since this change mostly
impacts full ppGTT platforms, the justification is that for modern
platforms we shouldn't care too much about 32b.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-3-matthew.auld@intel.com
2021-04-27 16:21:47 +01:00
Maarten Lankhorst
26ad4f8b73 drm/i915: Use a single page table lock for each gtt.
We may create page table objects on the fly, but we may need to
wait with the ww lock held. Instead of waiting on a freed obj
lock, ensure we have the same lock for each object to keep
-EDEADLK working. This ensures that i915_vma_pin_ww can lock
the page tables when required.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-41-maarten.lankhorst@linux.intel.com
2021-03-24 17:30:37 +01:00
Maarten Lankhorst
2a66596838 drm/i915: Move pinning to inside engine_wa_list_verify()
This should be done as part of the ww loop, in order to remove an
i915_vma_pin that needs ww held.

Now only i915_ggtt_pin() callers remain.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-25-maarten.lankhorst@linux.intel.com
2021-03-24 17:27:20 +01:00
Chris Wilson
a4d86249c7 drm/i915/gt: Provide a utility to create a scratch buffer
Primarily used by selftests, but also by runtime debugging of engine
w/a, is a routine to create a temporarily bound buffer for readback.
Amalgamate the duplicated routines into one.

Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201219020343.22681-2-chris@chris-wilson.co.uk
2020-12-21 09:53:52 +00:00
Ville Syrjälä
c0888e9e22 drm/i915: Enable eLLC caching of display buffers for SKL+
Since SKL the eLLC has been sitting on the far side of the system
agent, meaning the display engine can utilize it. Let's enable that.

I chose WB for the caching mode, because my numbers are indicating
that WT might actually be WB and WC might actually be UC. I'm not
100% sure that is indeed the case but at least my simple rendercopy
based benchmark didn't see any difference in performance.

Also if I configure things to do LLC+eLLC+WT I still get cache dirt
on my screen, suggesting that is in fact operating in WB mode
anyway. This is also the reason I had to fix the MOCS target cache
to really say PTE rather than LLC+eLLC.

Eero's earlier benchmark numbers:
"* Results in GfxBench and Unigine (Valley/Heaven) tests were within daily
   variation on the tested SKL machines

 * SKL GT4e (128MB eLLC) / Wayland / Weston:
   +15-20% SynMark TexMem512 (512MB of textures)
   +4-6% SynMark TerrainFly*, CSCloth, ShMapVsm
   -5-10% SynMark TexMem128 (128MB of textures)

 * SKL GT3e (64MB eLLC) / Xorg / Unity:
   +4-8% GpuTest Triangle fullscreen (FullHD)
   -5-10% GpuTest Triangle windowed (1/2 screen)

 * SKL GT2 (no eLLC) / Xorg / Unity:
   * Some of the higher FPS SynMark pixel and vertex shader tests
     are few percent higher, more than daily variance
   => Do you see any reason why this machine would be impacted
      although it doesn't have eLLC?"

Caveats:
- Still haven't tested with a prime setup
- Still not entirely sure this is a good idea, but I've been
  using it on my cfl anyway :)

v2: Split the MOCS PTE change out

Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-3-ville.syrjala@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-3-chris@chris-wilson.co.uk
2020-10-15 15:38:20 +01:00
Chris Wilson
89351925a4 drm/i915/gt: Switch to object allocations for page directories
The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and since we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.

In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.

v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating plain struct pages full of
prior content.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:08 +03:00
Chris Wilson
ad2f9bc9bc drm/i915/gt: Pull marking vm as closed underneath the vm->mutex
Pull the final atomic_dec of vm->open (marking the vm as closed)
underneath the same vm->mutex as used to close it. This is required to
correctly serialise with attempting to reuse the vma as the vm is closed
by a second thread.

References: 00de702c6c ("drm/i915: Check that the vma hasn't been closed before we insert it")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-10-chris@chris-wilson.co.uk
2020-02-28 12:33:07 +00:00
Daniele Ceraolo Spurio
69edc390a5 drm/i915/ggtt: do not set bits 1-11 in gen12 ptes
On TGL, bits 2-4 in the GGTT PTE are not ignored anymore and are
instead used for some extra VT-d capabilities. We don't (yet?) have
support for those capabilities, but, given that we shared the pte_encode
function between GGTT and PPGTT, we still set those bits to the PPGTT
PPAT values. The DMA engine gets very confused when those bits are
set while the iommu is enabled, leading to errors. E.g. when loading
the GuC we get:

[    9.796218] DMAR: DRHD: handling fault status reg 2
[    9.796235] DMAR: [DMA Write] Request device [00:02.0] PASID ffffffff fault addr 0 [fault reason 02] Present bit in context entry is clear
[    9.899215] [drm:intel_guc_fw_upload [i915]] *ERROR* GuC firmware signature verification failed

To fix this, just have dedicated gen8_pte_encode function per type of
gtt. Also, explicitly set vm->pte_encode for gen8_ppgtt, even if we
don't use it, to make sure we don't accidentally assign it to the GGTT
one, like we do for gen6_ppgtt, in case we need it in the future.
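
A minimal sketch of the "one encode function per type of gtt" idea follows.
The function names and bit values are assumptions for illustration: the bit
positions simply follow the x86 page-table convention, and the cache handling
is reduced to a single flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions (x86 page-table style); assumptions only. */
#define TOY_PAGE_PRESENT (UINT64_C(1) << 0)
#define TOY_PAGE_RW      (UINT64_C(1) << 1)
#define TOY_PAGE_PWT     (UINT64_C(1) << 3)
#define TOY_PAGE_PCD     (UINT64_C(1) << 4)

/* PPGTT flavour: may set access and cache-control bits in the low bits. */
static inline uint64_t toy_ppgtt_pte_encode(uint64_t addr, bool uncached)
{
	uint64_t pte = addr | TOY_PAGE_PRESENT | TOY_PAGE_RW;

	if (uncached)
		pte |= TOY_PAGE_PWT | TOY_PAGE_PCD;

	return pte;
}

/* GGTT flavour: only the present bit, so bits 1-11 stay clear. */
static inline uint64_t toy_ggtt_pte_encode(uint64_t addr)
{
	return addr | TOY_PAGE_PRESENT;
}
```

For a page-aligned address, the GGTT variant leaves every bit between 1 and
11 clear, while the PPGTT variant is free to keep using them.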

Reported-by: "Sodhi, Vunny" <vunny.sodhi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200226185657.26445-1-daniele.ceraolospurio@intel.com
2020-02-27 22:38:11 +00:00
Chris Wilson
82d71e31ae drm/i915/gt: Poison GTT scratch pages
Using a clear page for scratch means that we have relatively benign
errors in case it is accidentally used, but that can be rather too
benign for debugging. If we poison the scratch, ideally it quickly
results in an obvious error.

v2: Set each page individually just in case we are using highmem for our
scratch page.
v3: Pick a new scratch register as MI_STORE_REGISTER_MEM does not work
with GPR0 on gen7, unbelievably.
v4: Haswell still considers 3DPRIM a privileged register!
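
The poisoning itself can be sketched in userspace as below. The poison byte,
the page size, and the function name are assumptions for illustration. In a
contiguous userspace buffer the per-page loop is not strictly needed; it is
kept only to mirror the v2 note about setting each page individually.

```c
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE   ((size_t)4096)
#define TOY_POISON_BYTE 0x6b	/* illustrative poison value */

/*
 * Fill a scratch buffer with a recognisable pattern, one page at a time.
 * Only whole pages are written; a trailing partial page is left untouched.
 */
static inline void toy_poison_scratch(unsigned char *scratch, size_t size)
{
	size_t offset;

	for (offset = 0; offset + TOY_PAGE_SIZE <= size; offset += TOY_PAGE_SIZE)
		memset(scratch + offset, TOY_POISON_BYTE, TOY_PAGE_SIZE);
}
```

A stray read from the scratch area then returns the pattern instead of
zeroes, which is much easier to spot when debugging.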

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200124115133.53360-1-chris@chris-wilson.co.uk
2020-01-24 21:08:24 +00:00
Pankaj Bharadiya
0d4c351a0f drm/i915/gt: Make WARN* drm specific where drm_priv ptr is available
drm specific WARN* calls include device information in the
backtrace, so we know what device the warnings originate from.

Convert all the calls of WARN* with device specific drm_WARN*
variants in functions where drm_i915_private struct pointer is readily
available.

The conversion was done automatically with the coccinelle semantic
patch below. checkpatch errors/warnings were fixed manually.

@rule1@
identifier func, T;
@@
func(...) {
...
struct drm_i915_private *T = ...;
<+...
(
-WARN(
+drm_WARN(&T->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&T->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&T->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&T->drm,
...)
)
...+>
}

@rule2@
identifier func, T;
@@
func(struct drm_i915_private *T,...) {
<+...
(
-WARN(
+drm_WARN(&T->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&T->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&T->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&T->drm,
...)
)
...+>
}

command: spatch --sp-file <script> --dir drivers/gpu/drm/i915/gt \
					--linux-spacing --in-place

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200115034455.17658-7-pankaj.laxminarayan.bharadiya@intel.com
2020-01-22 17:53:37 +02:00
Matthew Auld
2c86e55d2a drm/i915/gtt: split up i915_gem_gtt
Attempt to split i915_gem_gtt.[ch] into more manageable chunks.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200107134009.3255354-1-chris@chris-wilson.co.uk
2020-01-07 19:27:36 +00:00