Commit Graph

6 Commits

Author SHA1 Message Date
Matthew Auld
bfe53be268 drm/i915/ttm: handle blitter failure on DG2
If the move or clear operation somehow fails, and the memory underneath
is not cleared, like when moving to lmem, then we currently fallback to
memcpy or memset. However with small-BAR systems this fallback might no
longer be possible. For now we use the set_wedged sledgehammer if we
ever encounter such a scenario, and mark the object as borked to plug
any holes where access to the memory underneath can happen. Add some
basic selftests to exercise this.

v2:
  - In the selftests make sure we grab the runtime pm around the reset.
    Also make sure we grab the reset lock before checking if the device
    is wedged, since the wedge might still be in-progress and hence the
    bit might not be set yet.
  - Don't wedge or put the object into an unknown state, if the request
    construction fails (or similar). Just returning an error and
    skipping the fallback should be safe here.
  - Make sure we wedge each gt. (Thomas)
  - Peek at the unknown_state in io_reserve, that way we don't have to
    export or hand roll the fault_wait_for_idle. (Thomas)
  - Add the missing read-side barriers for the unknown_state. (Thomas)
  - Some kernel-doc fixes. (Thomas)
v3:
  - Tweak the ordering of the set_wedged, also add FIXME.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com
2022-07-01 08:30:00 +01:00
Thomas Hellström
63cf4cad73 drm/i915: Break out the i915_deps utility
Since it's starting to be used outside the i915 TTM move code, move it
to a separate set of files.

v2:
- Update the documentation.
v4:
- Rebase.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211221200050.436316-4-thomas.hellstrom@linux.intel.com
2021-12-22 08:52:57 +01:00
Thomas Hellström
1193081710 drm/i915: Avoid using the i915_fence_array when collecting dependencies
Since the gt migration code was using only a single fence for
dependencies, these were collected in a dma_fence_array. However, it
turns out that it's illegal to use some dma_fences in a dma_fence_array,
in particular other dma_fence_arrays and dma_fence_chains, and this
causes trouble for us moving forward.

Have the gt migration code instead take a const struct i915_deps for
dependencies. This means we can skip the dma_fence_array creation
and instead pass the struct i915_deps instead to circumvent the
problem.

v2:
- Make the prev_deps() function static. (kernel test robot <lkp@intel.com>)
- Update the struct i915_deps kerneldoc.
v4:
- Rebase.

Fixes: 5652df829b ("drm/i915/ttm: Update i915_gem_obj_copy_ttm() to be asynchronous")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211221200050.436316-2-thomas.hellstrom@linux.intel.com
2021-12-22 08:14:30 +01:00
Thomas Hellström
05d1c76107 drm/i915/ttm: Move the i915_gem_obj_copy_ttm() function
Move the i915_gem_obj_copy_ttm() function to i915_gem_ttm_move.h.
This will help keep a number of functions static when introducing
async moves.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-3-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:15 +01:00
Thomas Hellström
2b0a750caf drm/i915/ttm: Failsafe migration blits
If the initial fill blit or copy blit of an object fails, the old
content of the data might be exposed and read as soon as either CPU- or
GPU PTEs are set up to point at the pages.

Intercept the blit fence with an async callback that checks the
blit fence for errors and if there are errors performs an async cpu blit
instead. If there is a failure to allocate the async dma_fence_work,
allocate it on the stack and sync wait for the blit to complete.

Add selftests that simulate gpu blit failures and failure to allocate
the async dma_fence_work.

A previous version of this pach used dma_fence_work, now that's
opencoded which adds more code but might lower the latency
somewhat in the common non-error case.

v3:
- Style fixes (Matthew Auld)
v4:
- Use "#if IS_ENABLED()" instead of #ifdef (Matthew Auld)
v5:
- Fix an issue where we, if the dependency was already signaled, might
  end up waiting for a memcpy fence that would never signal.
v6:
- Add a missing i915_ttm_memcpy_release() (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104110718.688420-3-thomas.hellstrom@linux.intel.com
2021-11-05 09:05:32 +01:00
Thomas Hellström
3589fdbd3b drm/i915/ttm: Reorganize the ttm move code
We are about to introduce failsafe- and asynchronous migration and
ttm moves.
This will add complexity and code to the TTM move code so it makes sense
to split it out to a separate file to make the i915 TTM code easer to
digest.
Split the i915 TTM move code out and since we will have to change the name
of the gpu_binds_iomem() and cpu_maps_iomem() functions anyway,
we alter the name of gpu_binds_iomem() to i915_ttm_gtt_binds_lmem() which
is more reflecting what it is used for.
With this we also add some more documentation. Otherwise there should be
no functional change.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104110718.688420-2-thomas.hellstrom@linux.intel.com
2021-11-05 09:05:30 +01:00