Commit Graph

14 Commits

Author SHA1 Message Date
Matthew Auld
bfe53be268 drm/i915/ttm: handle blitter failure on DG2
If the move or clear operation somehow fails, and the memory underneath
is not cleared, like when moving to lmem, then we currently fallback to
memcpy or memset. However with small-BAR systems this fallback might no
longer be possible. For now we use the set_wedged sledgehammer if we
ever encounter such a scenario, and mark the object as borked to plug
any holes where access to the memory underneath can happen. Add some
basic selftests to exercise this.

v2:
  - In the selftests make sure we grab the runtime pm around the reset.
    Also make sure we grab the reset lock before checking if the device
    is wedged, since the wedge might still be in-progress and hence the
    bit might not be set yet.
  - Don't wedge or put the object into an unknown state, if the request
    construction fails (or similar). Just returning an error and
    skipping the fallback should be safe here.
  - Make sure we wedge each gt. (Thomas)
  - Peek at the unknown_state in io_reserve, that way we don't have to
    export or hand roll the fault_wait_for_idle. (Thomas)
  - Add the missing read-side barriers for the unknown_state. (Thomas)
  - Some kernel-doc fixes. (Thomas)
v3:
  - Tweak the ordering of the set_wedged, also add FIXME.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com
2022-07-01 08:30:00 +01:00
Tvrtko Ursulin
8ec5c0006c Merge tag 'drm-intel-next-2022-05-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-intel-gt-next
drm/i915 drm-intel-next -> drm-intel-gt-next cross-merge sync

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

# Conflicts:
#	drivers/gpu/drm/i915/gt/intel_rps.c
#	drivers/gpu/drm/i915/i915_vma.c
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87y1ywbh5y.fsf@intel.com
2022-05-23 09:34:47 +01:00
Matthew Auld
a7ce8f821c drm/i915: consider min_page_size when migrating
We can only force migrate an object if the existing object size is
compatible with the new destinations min_page_size for the region.
Currently we blow up with something like:

[ 2857.497462] kernel BUG at drivers/gpu/drm/i915/gt/intel_migrate.c:431!
[ 2857.497497] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 2857.497502] CPU: 1 PID: 8921 Comm: i915_selftest Tainted: G     U  W         5.18.0-rc1-drm-tip+ #27
[ 2857.497513] RIP: 0010:emit_pte.cold+0x11a/0x17e [i915]
[ 2857.497646] Code: 00 48 c7 c2 f0 cd c1 a0 48 c7 c7 e9 99 bd a0 e8 d2 77 5d e0 bf 01 00 00 00 e8 08 47 5d e0 31 f6 bf 09 00 00 00 e8 3c 7b 4d e0 <0f> 0b 48 c7 c1 e0 2a c5 a0 ba 34 00 00 00 48 c7 c6 00 ce c1 a0 48
[ 2857.497654] RSP: 0018:ffffc900000f7748 EFLAGS: 00010246
[ 2857.497658] RAX: 0000000000000000 RBX: ffffc900000f77c8 RCX: 0000000000000006
[ 2857.497662] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
[ 2857.497665] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
[ 2857.497668] R10: 0000000000022302 R11: ffff88846dea08f0 R12: 0000000000010000
[ 2857.497672] R13: 0000000001880000 R14: 000000000000081b R15: ffff888106b7c040
[ 2857.497675] FS:  00007f0d4c4e0600(0000) GS:ffff88845da80000(0000) knlGS:0000000000000000
[ 2857.497679] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2857.497682] CR2: 00007f113966c088 CR3: 0000000211e60003 CR4: 00000000003706e0
[ 2857.497686] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2857.497689] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2857.497692] Call Trace:
[ 2857.497694]  <TASK>
[ 2857.497697]  intel_context_migrate_copy+0x1e5/0x4f0 [i915]

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220420181613.70033-1-matthew.auld@intel.com
2022-04-21 10:10:34 +01:00
Joonas Lahtinen
c16c8bfa09 Merge drm/drm-next into drm-intel-gt-next
Pull in TTM changes needed for DG2 CCS enabling from Ram.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2022-04-12 11:28:42 +03:00
Christian König
1d7f5e6c52 drm/i915: drop bo->moving dependency
That should now be handled by the common dma_resv framework.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-13-christian.koenig@amd.com
2022-04-07 12:53:54 +02:00
Christian König
73511edf8b dma-buf: specify usage while adding fences to dma_resv obj v7
Instead of distingting between shared and exclusive fences specify
the fence usage while adding fences.

Rework all drivers to use this interface instead and deprecate the old one.

v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporary disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids to
    disable warning temporary, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com
2022-04-07 12:53:53 +02:00
Christian König
c8d4c18bfb dma-buf/drivers: make reserving a shared slot mandatory v4
Audit all the users of dma_resv_add_excl_fence() and make sure they
reserve a shared slot also when only trying to add an exclusive fence.

This is the next step towards handling the exclusive fence like a
shared one.

v2: fix missed case in amdgpu
v3: and two more radeon, rename function
v4: add one more case to TTM, fix i915 after rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com
2022-04-06 17:38:25 +02:00
Andi Shyti
fa73208837 drm/i915: Rename INTEL_REGION_LMEM with INTEL_REGION_LMEM_0
With the upcoming multitile support each tile will have its own
local memory. Mark the current LMEM with the suffix '0' to
emphasise that it belongs to the root tile.

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-2-andi.shyti@linux.intel.com
2022-03-21 08:37:33 +00:00
Thomas Hellström
950505cabe drm/i915: Asynchronous migration selftest
Add a selftest to exercise asynchronous migration and -unbining.
Extend the gem_migrate selftest to perform the migrations while
depending on a spinner and a bound vma set up on the migrated
buffer object.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-6-thomas.hellstrom@linux.intel.com
2022-01-11 10:54:11 +01:00
Michał Winiarski
1a9c4db4ca drm/i915/gem: Use to_gt() helper
Use to_gt() helper consistently throughout the codebase.
Pure mechanical s/i915->gt/to_gt(i915). No functional changes.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211214193346.21231-6-andi.shyti@linux.intel.com
2021-12-17 21:50:32 -08:00
Thomas Hellström
2b0a750caf drm/i915/ttm: Failsafe migration blits
If the initial fill blit or copy blit of an object fails, the old
content of the data might be exposed and read as soon as either CPU- or
GPU PTEs are set up to point at the pages.

Intercept the blit fence with an async callback that checks the
blit fence for errors and if there are errors performs an async cpu blit
instead. If there is a failure to allocate the async dma_fence_work,
allocate it on the stack and sync wait for the blit to complete.

Add selftests that simulate gpu blit failures and failure to allocate
the async dma_fence_work.

A previous version of this pach used dma_fence_work, now that's
opencoded which adds more code but might lower the latency
somewhat in the common non-error case.

v3:
- Style fixes (Matthew Auld)
v4:
- Use "#if IS_ENABLED()" instead of #ifdef (Matthew Auld)
v5:
- Fix an issue where we, if the dependency was already signaled, might
  end up waiting for a memcpy fence that would never signal.
v6:
- Add a missing i915_ttm_memcpy_release() (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104110718.688420-3-thomas.hellstrom@linux.intel.com
2021-11-05 09:05:32 +01:00
Jason Ekstrand
f3170ba8c9 drm/i915/gem: Check object_can_migrate from object_migrate
We don't roll them together entirely because there are still a couple
cases where we want a separate can_migrate check.  For instance, the
display code checks that you can migrate a buffer to LMEM before it
accepts it in fb_create.  The dma-buf import code also uses it to do an
early check and return a different error code if someone tries to attach
a LMEM-only dma-buf to another driver.

However, no one actually wants to call object_migrate when can_migrate
has failed.  The stated intention is for self-tests but none of those
actually take advantage of this unsafe migration.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723172142.3273510-2-jason@jlekstrand.net
2021-07-26 16:37:30 +01:00
Matthew Auld
d22632c83b drm/i915: support forcing the page size with lmem
For some specialised objects we might need something larger than the
regions min_page_size due to some hw restriction, and slightly more
hairy is needing something smaller with the guarantee that such objects
will never be inserted into any GTT, which is the case for the paging
structures.

This also fixes how we setup the BO page_alignment, if we later migrate
the object somewhere else. For example if the placements are {SMEM,
LMEM}, then we might get this wrong. Pushing the min_page_size behaviour
into the manager should fix this.

v2(Thomas): push the default page size behaviour into buddy_man, and let
the user override it with the page-alignment, which looks cleaner

v3: rebase on ttm sys changes

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210625103824.558481-1-matthew.auld@intel.com
2021-06-30 13:24:29 +01:00
Matthew Auld
bf74a18ca8 drm/i915/gem: Introduce a selftest for the gem object migrate functionality
A selftest for the gem object migrate functionality. Slightly adapted
from the original by Matthew to the new interface and new fill blit
code.

v4:
- Initialize buffers and check contents after migration
  (Suggested by Matthew Auld)
- Perform async migration (if implemented) in the igt_lmem_pages_migrate
  test
- Test also migration to the current region.

Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> #v3
Link: https://patchwork.freedesktop.org/patch/msgid/20210629151203.209465-3-thomas.hellstrom@linux.intel.com
2021-06-30 11:32:40 +01:00