Commit Graph

199 Commits

Author SHA1 Message Date
Chris Wilson
3447c4c55d drm/i915: Avoid live-lock with i915_vma_parked()
Abuse^W Take advantage that we know we are inside the GT wakeref and
that prevents any client execbuf from reopening the i915_vma in order to
claim all the vma to close without having to drop the spinlock to free
each one individually. By keeping the spinlock, we do not have to
restart if we run concurrently with i915_gem_free_objects -- which
causes them both to restart continually and make very very slow
progress.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1361
Fixes: 77853186e5 ("drm/i915: Claim vma while under closed_lock in i915_vma_parked()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-2-chris@chris-wilson.co.uk
2020-03-23 11:16:14 +00:00
Chris Wilson
29e6ecf3ce drm/i915: Extend i915_request_await_active to use all timelines
Extend i915_request_await_active() to be able to asynchronously wait on
all the tracked timelines simultaneously.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311092044.16353-1-chris@chris-wilson.co.uk
2020-03-11 10:54:59 +00:00
Chris Wilson
2f0003089b drm/i915: Drop vma is-closed assertion on insert
The is-closed flag may be added after we have acquired the vma under the
ctx->mutex, but will not take effect until after we release the
vm->mutex. i.e. the flag may be set on the vma as attempt to bind it and
that will cause the vma to be unbound later after we unpin it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200303093157.1153887-1-chris@chris-wilson.co.uk
2020-03-03 17:30:20 +00:00
Chris Wilson
00de702c6c drm/i915: Check that the vma hasn't been closed before we insert it
As there is a delay before we pin a vma, there is an opportunity for
another thread to have closed the vm and its vma (including us).
Check as soon as we acquire the vm->mutex and know the vm/vma is stable.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1291
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221121940.2741563-1-chris@chris-wilson.co.uk
2020-02-21 17:32:27 +00:00
Chris Wilson
30ca04e16c drm/i915: Hold reference to previous active fence as we queue
Take a reference to the previous exclusive fence on the i915_active, as
we wish to add an await to it in the caller (and so must prevent it from
being freed until we have completed that task).

Fixes: e3793468b4 ("drm/i915: Use the async worker to avoid reclaim tainting the ggtt->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200203094152.4150550-1-chris@chris-wilson.co.uk
2020-02-03 11:25:39 +00:00
Chris Wilson
e3793468b4 drm/i915: Use the async worker to avoid reclaim tainting the ggtt->mutex
On Braswell and Broxton (also known as Valleyview and Apollolake), we
need to serialise updates of the GGTT using the big stop_machine()
hammer. This has the side effect of appearing to lockdep as a possible
reclaim (since it uses the cpuhp mutex and that is tainted by per-cpu
allocations). However, we want to use vm->mutex (including ggtt->mutex)
from within the shrinker and so must avoid such possible taints. For this
purpose, we introduced the asynchronous vma binding and we can apply it
to the PIN_GLOBAL so long as take care to add the necessary waits for
the worker afterwards.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/211
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130181710.2030251-3-chris@chris-wilson.co.uk
2020-01-30 21:35:43 +00:00
Chris Wilson
d62f416f92 drm/i915: Wait on vma activity before taking the mutex
Optimistically wait for the prior vma activity before taking the mutex
to minimise the mutex hold time while unbinding. We will then verify the
vma is idle with a second wait under the mutex to ensure it is safe to
unbind.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123224459.38128-2-chris@chris-wilson.co.uk
2020-01-24 10:37:13 +00:00
Chris Wilson
60e94557ff drm/i915: Check activity on i915_vma after confirming pin_count==0
Only assert that the i915_vma is now idle if and only if no other pins
are present. If another user has the i915_vma pinned, they may submit
more work to the i915_vma skipping the vm->mutex used to serialise the
unbind. We need to wait again, if we want to continue and unbind this
vma.

However, if we own the i915_vma (we hold the vm->mutex for the unbind
and the pin_count is 0), we can assert that the vma remains idle as we
unbind.

Fixes: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/530
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123224459.38128-1-chris@chris-wilson.co.uk
2020-01-24 10:37:13 +00:00
Chris Wilson
5424f5d794 drm/i915: Clear the GGTT_WRITE bit on unbinding the vma
While we do flush writes to the vma before unbinding (to make sure they
go through the right detiling register), we may also be concurrently
poking at the GGTT_WRITE bit from set-domain, as we mark all GGTT vma
associated with an object. We know this is for another vma, as we
are currently unbinding this one -- so if this vma will be reused, it
will be refaulted and have its dirty bit set before the next write.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/999
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200121222447.419489-1-chris@chris-wilson.co.uk
2020-01-22 13:21:16 +00:00
Chris Wilson
c0e60347d4 drm/i915/gt: Hold rpm wakeref before taking ggtt->vm.mutex
We need to hold the runtime-pm wakeref to update the global PTEs (as
they exist behind a PCI BAR). However, some systems invoke ACPI during
runtime resume and so require allocations, which is verboten inside the
vm->mutex. Ergo, we must not use intel_runtime_pm_get() inside the
mutex, but lift the call outside.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/958
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110144418.1415639-1-chris@chris-wilson.co.uk
2020-01-10 20:30:40 +00:00
Chris Wilson
a5972e9315 drm/i915: Reduce warning for i915_vma_pin_iomap() without runtime-pm
Access through the GGTT (iomap) into the vma does require the device to
be awake. However, we often take the i915_vma_pin_iomap() as an early
preparatory step that is long before we use the iomap. Asserting that
the device is awake at pin time does not protect us, and is merely a
nuisance.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200108153550.3803446-2-chris@chris-wilson.co.uk
2020-01-09 08:53:28 +00:00
Chris Wilson
76f9764cc3 drm/i915: Introduce a vma.kref
Start introducing a kref on i915_vma in order to protect the vma unbind
(i915_gem_object_unbind) from a parallel destruction (i915_vma_parked).
Later, we will use the refcount to manage all access and turn i915_vma
into a first class container.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222210256.2066451-2-chris@chris-wilson.co.uk
2019-12-23 12:31:54 +00:00
Chris Wilson
e6ba764802 drm/i915: Remove i915->kernel_context
Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
2019-12-21 16:37:10 +00:00
Chris Wilson
da42104f58 drm/i915: Hold reference to intel_frontbuffer as we track activity
Since obj->frontbuffer is no longer protected by the struct_mutex, as we
are processing the execbuf, it may be removed. Mark the
intel_frontbuffer as rcu protected, and so acquire a reference to
the struct as we track activity upon it.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/827
Fixes: 8e7cb1799b ("drm/i915: Extract intel_frontbuffer active tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191218104043.3539458-1-chris@chris-wilson.co.uk
2019-12-18 12:09:57 +00:00
Chris Wilson
54d7195f8c drm/i915: Unpin vma->obj on early error
If we inherit an error along the fence chain, we skip the main work
callback and go straight to the error. In the case of the vma bind
worker, we only dropped the pinned pages from the worker.

In the process, make sure we call the release earlier rather than wait
until the final reference to the fence is dropped (as a reference is
kept while being listened upon).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191216161717.2688274-1-chris@chris-wilson.co.uk
2019-12-18 10:13:03 +00:00
Chris Wilson
d3e483526c drm/i915: Change i915_vma_unbind() to report -EAGAIN on activity
If someone else acquires the i915_vma before we complete our wait and
unbind it, we currently error out with -EBUSY. Use -EAGAIN instead so
that if necessary the caller is prepared to try again.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/683
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191208161252.3015727-2-chris@chris-wilson.co.uk
2019-12-09 11:49:57 +00:00
Chris Wilson
77853186e5 drm/i915: Claim vma while under closed_lock in i915_vma_parked()
Remove the vma we wish to destroy from the gt->closed_list to avoid
having two i915_vma_parked() try and free it.

Fixes: aa5e4453dc ("drm/i915/gem: Try to flush pending unbind events")
References: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205214159.829727-1-chris@chris-wilson.co.uk
2019-12-06 11:16:49 +00:00
Chris Wilson
ccd2094559 drm/i915: Try hard to bind the context
It is not acceptable for context pinning to fail with -ENOSPC as we
should always be able to make space in the GGTT. The only reason we may
fail is that other "temporary" context pins are reserving their space
and we need to wait for an available slot.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/676
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205113726.413351-2-chris@chris-wilson.co.uk
2019-12-05 13:50:54 +00:00
Abdiel Janulgue
cc662126b4 drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
This is really just an alias of mmap_gtt. The 'mmap offset' nomenclature
comes from the value returned by this ioctl which is the offset into the
device fd which userpace uses with mmap(2).

mmap_gtt was our initial mmap_offset implementation, this extends
our CPU mmap support to allow additional fault handlers that depends on
the object's backing pages.

Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
and use the zero extending behaviour of drm to differentiate between
them, when we inspect the flags.

To support multiple mmap types on an object we need to support multiple
mmap_offsets for an object (each offset in the global device address
space corresponding to a unique instance of the object for a file + mmap
type). As we drop the simplified drm core idea of a single mmap_offset,
we need to provide replacement hooks for the dumb mmap interface as
well.

Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1675
Testcase: igt/gem_mmap_offset
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191204120032.3682839-1-chris@chris-wilson.co.uk
2019-12-04 15:11:44 +00:00
Chris Wilson
dde01d9435 drm/i915: Split detaching and removing the vma
In order to keep the assert_bind_count() valid, we need to hold the vma
page reference until after we drop the bind count. However, we must also
keep the drm_mm_remove_node() as the last action of i915_vma_unbind() so
that it serialises with the unlocked check inside i915_vma_destroy(). So
we need to split up i915_vma_remove() so that we order the detach, drop
pages and remove as required during unbind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112067
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030192159.18404-1-chris@chris-wilson.co.uk
2019-10-31 14:52:19 +00:00
Chris Wilson
71e51ca8dc drm/i915: Lift i915_vma_parked() onto the gt
Currently even though i915_vma_parked() operates on a per-gt struct, it
is called from a global pm notify. This oddity was only because the long
term plan is to decouple the vma cache from the pm notification, but
right now the oddity stands out like a sore thumb!

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021183236.21790-1-chris@chris-wilson.co.uk
2019-10-21 21:07:56 +01:00
Chris Wilson
454a325a97 drm/i915: Remove leftover vma->obj->pages_pin_count on insert/remove
We now do the page pin count upfront in vma_get_pages/vma_put_pages, so
that we do the allocations before we enter the vm->mutex. Our vma
page references we are tracked in vma->pages_count and the extra
obj->pages_pin_count being performed later in i915_vma_insert and
i915_vma_remove is redundant, and worse throws off the shrinker's logic
on when it can free an object by unbinding it.

Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reported-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015100155.10376-1-chris@chris-wilson.co.uk
2019-10-15 11:46:52 +01:00
Chris Wilson
56184a20a8 drm/i915: Drop obj.page_pin_count after a failed vma->set_pages()
Before we attempt to set_pages on the vma, we claim a
obj.pages_pin_count for it. If we subsequently fail to set the pages on
the vma, we need to drop our pinning before returning the error.

Reported-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015093915.3995-1-chris@chris-wilson.co.uk
2019-10-15 11:46:40 +01:00
Chris Wilson
b1e3177bd1 drm/i915: Coordinate i915_active with its own mutex
Forgo the struct_mutex serialisation for i915_active, and interpose its
own mutex handling for active/retire.

This is a multi-layered sleight-of-hand. First, we had to ensure that no
active/retire callbacks accidentally inverted the mutex ordering rules,
nor assumed that they were themselves serialised by struct_mutex. More
challenging though, is the rule over updating elements of the active
rbtree. Instead of the whole i915_active now being serialised by
struct_mutex, allocations/rotations of the tree are serialised by the
i915_active.mutex and individual nodes are serialised by the caller
using the i915_timeline.mutex (we need to use nested spinlocks to
interact with the dma_fence callback lists).

The pain point here is that instead of a single mutex around execbuf, we
now have to take a mutex for active tracker (one for each vma, context,
etc) and a couple of spinlocks for each fence update. The improvement in
fine grained locking allowing for multiple concurrent clients
(eventually!) should be worth it in typical loads.

v2: Add some comments that barely elucidate anything :(

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
2019-10-04 15:39:12 +01:00
Chris Wilson
274cbf20fd drm/i915: Push the i915_active.retire into a worker
As we need to use a mutex to serialise i915_active activation
(because we want to allow the callback to sleep), we need to push the
i915_active.retire into a worker callback in case we get need to retire
from an atomic context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-5-chris@chris-wilson.co.uk
2019-10-04 15:39:10 +01:00
Chris Wilson
2850748ef8 drm/i915: Pull i915_vma_pin under the vm->mutex
Replace the struct_mutex requirement for pinning the i915_vma with the
local vm->mutex instead. Note that the vm->mutex is tainted by the
shrinker (we require unbinding from inside fs-reclaim) and so we cannot
allocate while holding that mutex. Instead we have to preallocate
workers to do allocate and apply the PTE updates after we have we
reserved their slot in the drm_mm (using fences to order the PTE writes
with the GPU work and with later unbind).

In adding the asynchronous vma binding, one subtle requirement is to
avoid coupling the binding fence into the backing object->resv. That is
the asynchronous binding only applies to the vma timeline itself and not
to the pages as that is a more global timeline (the binding of one vma
does not need to be ordered with another vma, nor does the implicit GEM
fencing depend on a vma, only on writes to the backing store). Keeping
the vma binding distinct from the backing store timelines is verified by
a number of async gem_exec_fence and gem_exec_schedule tests. The way we
do this is quite simple, we keep the fence for the vma binding separate
and only wait on it as required, and never add it to the obj->resv
itself.

Another consequence in reducing the locking around the vma is the
destruction of the vma is no longer globally serialised by struct_mutex.
A natural solution would be to add a kref to i915_vma, but that requires
decoupling the reference cycles, possibly by introducing a new
i915_mm_pages object that is own by both obj->mm and vma->pages.
However, we have not taken that route due to the overshadowing lmem/ttm
discussions, and instead play a series of complicated games with
trylocks to (hopefully) ensure that only one destruction path is called!

v2: Add some commentary, and some helpers to reduce patch churn.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
2019-10-04 15:39:02 +01:00
Chris Wilson
5e053450c1 drm/i915: Only track bound elements of the GTT
The premise here is to simply avoiding having to acquire the vm->mutex
inside vma create/destroy to update the vm->unbound_lists, to avoid some
nasty lock recursions later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-2-chris@chris-wilson.co.uk
2019-10-04 15:39:01 +01:00
Chris Wilson
b290a78b5c drm/i915: Use helpers for drm_mm_node booleans
A subset of 71724f7089 ("drm/mm: Use helpers for drm_mm_node booleans")
in order to prepare drm-intel-next-queued for subsequent patches before
we can backmerge 71724f7089 itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004142226.13711-1-chris@chris-wilson.co.uk
2019-10-04 15:34:27 +01:00
Chris Wilson
d19d71fc2b drm/i915: Mark i915_request.timeline as a volatile, rcu pointer
The request->timeline is only valid until the request is retired (i.e.
before it is completed). Upon retiring the request, the context may be
unpinned and freed, and along with it the timeline may be freed. We
therefore need to be very careful when chasing rq->timeline that the
pointer does not disappear beneath us. The vast majority of users are in
a protected context, either during request construction or retirement,
where the timeline->mutex is held and the timeline cannot disappear. It
is those few off the beaten path (where we access a second timeline) that
need extra scrutiny -- to be added in the next patch after first adding
the warnings about dangerous access.

One complication, where we cannot use the timeline->mutex itself, is
during request submission onto hardware (under spinlocks). Here, we want
to check on the timeline to finalize the breadcrumb, and so we need to
impose a second rule to ensure that the request->timeline is indeed
valid. As we are submitting the request, it's context and timeline must
be pinned, as it will be used by the hardware. Since it is pinned, we
know the request->timeline must still be valid, and we cannot submit the
idle barrier until after we release the engine->active.lock, ergo while
submitting and holding that spinlock, a second thread cannot release the
timeline.

v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
as we need it, and tidy up acquiring the timeline with a bit of
refactoring (i915_active_add_request)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
2019-09-20 10:24:09 +01:00
Chris Wilson
4dd2fbbfb5 drm/i915: Make i915_vma.flags atomic_t for mutex reduction
In preparation for reducing struct_mutex stranglehold around the vm,
make the vma.flags atomic so that we can acquire a pin on the vma
atomically before deciding if we need to take the mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911090243.16786-1-chris@chris-wilson.co.uk
2019-09-11 13:39:42 +01:00
Matthew Auld
33dd889923 drm/i915: cleanup cache-coloring
Try to tidy up the cache-coloring such that we rid the code of any
mm.color_adjust assumptions, this should hopefully make it more obvious
in the code when we need to actually use the cache-level as the color,
and as a bonus should make adding a different color-scheme simpler.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-3-matthew.auld@intel.com
2019-09-09 21:00:20 +01:00
Matthew Auld
1e0a96e508 drm/i915: export color_differs
Export color_differs so that we can use it elsewhere.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-1-matthew.auld@intel.com
2019-09-09 21:00:04 +01:00
Chris Wilson
1f7fd484ff drm/i915: Replace i915_vma_put_fence()
Avoid calling i915_vma_put_fence() by using our alternate paths that
bind a secondary vma avoiding the original fenced vma. For the few
instances where we need to release the fence (i.e. on binding when the
GGTT range becomes invalid), replace the put_fence with a revoke_fence.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822061557.18402-1-chris@chris-wilson.co.uk
2019-08-22 08:53:42 +01:00
Chris Wilson
b7d151ba4b drm/i915: Pull obj->userfault tracking under the ggtt->mutex
Since we want to revoke the ggtt vma from only under the ggtt->mutex, we
need to move protection of the userfault tracking from the struct_mutex
to the ggtt->mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822060914.2671-2-chris@chris-wilson.co.uk
2019-08-22 08:53:41 +01:00
Rodrigo Vivi
829e8def7b Merge drm/drm-next into drm-intel-next-queued
We need the rename of reservation_object to dma_resv.

The solution on this merge came from linux-next:
From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Wed, 14 Aug 2019 12:48:39 +1000
Subject: [PATCH] drm: fix up fallout from "dma-buf: rename reservation_object to dma_resv"

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 drivers/gpu/drm/i915/gt/intel_engine_pool.c | 8 ++++----
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c
index 03d90b49584a..4cd54c569911 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c
@@ -43,12 +43,12 @@ static int pool_active(struct i915_active *ref)
 {
        struct intel_engine_pool_node *node =
                container_of(ref, typeof(*node), active);
-       struct reservation_object *resv = node->obj->base.resv;
+       struct dma_resv *resv = node->obj->base.resv;
        int err;

-       if (reservation_object_trylock(resv)) {
-               reservation_object_add_excl_fence(resv, NULL);
-               reservation_object_unlock(resv);
+       if (dma_resv_trylock(resv)) {
+               dma_resv_add_excl_fence(resv, NULL);
+               dma_resv_unlock(resv);
        }

        err = i915_gem_object_pin_pages(node->obj);

which is a simplified version from a previous one which had:
Reviewed-by: Christian König <christian.koenig@amd.com>

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-08-22 00:10:36 -07:00
Dave Airlie
5f680625d9 drm-misc-next for 5.4:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - dma-buf: add reservation_object_fences helper, relax
              reservation_object_add_shared_fence, remove
              reservation_object seq number (and then
              restored)
   - dma-fence: Shrinkage of the dma_fence structure,
                Merge dma_fence_signal and dma_fence_signal_locked,
                Store the timestamp in struct dma_fence in a union with
                cb_list
 
 Driver Changes:
   - More dt-bindings YAML conversions
   - More removal of drmP.h includes
   - dw-hdmi: Support get_eld and various i2s improvements
   - gm12u320: Few fixes
   - meson: Global cleanup
   - panfrost: Few refactors, Support for GPU heap allocations
   - sun4i: Support for DDC enable GPIO
   - New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02,
                 Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1
                 Toppoly TD043MTEA1
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXVqvpwAKCRDj7w1vZxhR
 xa3RAQDzAnt5zeesAxX4XhRJzHoCEwj2PJj9Re6xMJ9PlcfcvwD+OS+bcB6jfiXV
 Ug9IBd/DqjlmD9G9MxFxfSV946rksAw=
 =8uv4
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-08-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.4:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - dma-buf: add reservation_object_fences helper, relax
             reservation_object_add_shared_fence, remove
             reservation_object seq number (and then
             restored)
  - dma-fence: Shrinkage of the dma_fence structure,
               Merge dma_fence_signal and dma_fence_signal_locked,
               Store the timestamp in struct dma_fence in a union with
               cb_list

Driver Changes:
  - More dt-bindings YAML conversions
  - More removal of drmP.h includes
  - dw-hdmi: Support get_eld and various i2s improvements
  - gm12u320: Few fixes
  - meson: Global cleanup
  - panfrost: Few refactors, Support for GPU heap allocations
  - sun4i: Support for DDC enable GPIO
  - New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02,
                Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1
                Toppoly TD043MTEA1

Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: fixup dma_resv rename fallout]

From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190819141923.7l2adietcr2pioct@flea
2019-08-21 16:44:41 +10:00
Chris Wilson
2833ddccbd drm/i915: Be defensive when starting vma activity
Before we acquire the vma for GPU activity, ensure that the underlying
object is not already in the process of being freed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190820100531.8430-1-chris@chris-wilson.co.uk
2019-08-20 14:23:45 +01:00
Chris Wilson
25ffd4b11d drm/i915: Markup expected timeline locks for i915_active
As every i915_active_request should be serialised by a dedicated lock,
i915_active consists of a tree of locks; one for each node. Markup up
the i915_active_request with what lock is supposed to be guarding it so
that we can verify that the serialised updated are indeed serialised.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816121000.8507-2-chris@chris-wilson.co.uk
2019-08-16 18:02:07 +01:00
Chris Wilson
8e7cb1799b drm/i915: Extract intel_frontbuffer active tracking
Move the active tracking for the frontbuffer operations out of the
i915_gem_object and into its own first class (refcounted) object. In the
process of detangling, we switch from low level request tracking to the
easier i915_active -- with the plan that this avoids any potential
atomic callbacks as the frontbuffer tracking wishes to sleep as it
flushes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816074635.26062-1-chris@chris-wilson.co.uk
2019-08-16 09:51:11 +01:00
Christian König
52791eeec1 dma-buf: rename reservation_object to dma_resv
Be more consistent with the naming of the other DMA-buf objects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/
2019-08-13 09:09:30 +02:00
Chris Wilson
3d6792cf0a drm/i915: Forgo last_fence active request tracking
We were using the last_fence to track the last request that used this
vma that might be interpreted by a fence register and forced ourselves
to wait for this request before modifying any fence register that
overlapped our vma. Due to requirement that we need to track any XY_BLT
command, linear or tiled, this in effect meant that we have to track the
vma for its active lifespan anyway, so we can forgo the explicit
last_fence tracking and just use the whole vma->active.

Another solution would be to pipeline the register updates, and would
help resolve some long running stalls for gen3 (but only gen 2 and 3!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190812174804.26180-1-chris@chris-wilson.co.uk
2019-08-12 19:29:16 +01:00
Jani Nikula
a09d9a8002 drm/i915: avoid including intel_drv.h via i915_drv.h->i915_trace.h
Disentangle i915_drv.h from intel_drv.h, which gets included via
i915_trace.h. This necessitates including i915_trace.h wherever it's
needed.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ed82bf259d3b725a1a1a3c3e9d6fb5c08bc4d489.1565085691.git.jani.nikula@intel.com
2019-08-07 12:43:14 +03:00
Chris Wilson
1aff1903d0 drm/i915: Hide unshrinkable context objects from the shrinker
The shrinker cannot touch objects used by the contexts (logical state
and ring). Currently we mark those as "pin_global" to let the shrinker
skip over them, however, if we remove them from the shrinker lists
entirely, we don't event have to include them in our shrink accounting.

By keeping the unshrinkable objects in our shrinker tracking, we report
a large number of objects available to be shrunk, and leave the shrinker
deeply unsatisfied when we fail to reclaim those. The shrinker will
persist in trying to reclaim the unavailable objects, forcing the system
into a livelock (not even hitting the dread oomkiller).

v2: Extend unshrinkable protection for perma-pinned scratch and guc
allocations (Tvrtko)
v3: Notice that we should be pinned when marking unshrinkable and so the
link cannot be empty; merge duplicate paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk
2019-08-02 23:39:46 +01:00
Chris Wilson
cd2a4eaf8c drm/i915: Report resv_obj allocation failure
Since commit 64d6c500a3 ("drm/i915: Generalise GPU activity
tracking"), we have been prepared for i915_vma_move_to_active() to fail.
We can take advantage of this to report the failure for allocating the
shared-fence slot in the reservation_object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730205805.3733-1-chris@chris-wilson.co.uk
2019-08-02 16:03:53 +01:00
Chris Wilson
c082afac86 drm/i915: Move aliasing_ppgtt underneath its i915_ggtt
The aliasing_ppgtt provides a PIN_USER alias for the global gtt, so move
it under the i915_ggtt to simplify later transformations to enable
intel_context.vm.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-1-chris@chris-wilson.co.uk
2019-07-30 16:09:32 +01:00
Chris Wilson
09480072e3 drm/i915: Mark up vma->active as safe for use inside shrinkers
Since a shrinker may be forced to wait on GPU activity,
i915_active_wait(&vma->active) must be safe for use inside a shrinker,
and so let's mark up the lock as being acquired by the shrinker to avoid
any nasty surprises creeping in.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703091726.11690-8-chris@chris-wilson.co.uk
2019-07-03 12:24:08 +01:00
Chris Wilson
12c255b5da drm/i915: Provide an i915_active.acquire callback
If we introduce a callback for i915_active that is only called the first
time we use the i915_active and is symmetrically paired with the
i915_active.retire callback, we can replace the open-coded and
non-atomic implementations -- which will be very fragile (i.e. broken)
upon removing the struct_mutex serialisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-4-chris@chris-wilson.co.uk
2019-06-21 19:47:55 +01:00
Chris Wilson
a93615f900 drm/i915: Throw away the active object retirement complexity
Remove the accumulated optimisations that we have for i915_vma_retire
and reduce it to the bare essential of tracking the active object
reference. This allows us to only use atomic operations, and so will be
able to avoid the struct_mutex requirement.

The principal loss here is the shrinker MRU bumping, so now if we have
to shrink, we will do so in much more random order and more likely to
try and shrink recently used objects. That is a nuisance, but shrinking
active objects is a second step we try to avoid and will always be a
system-wide performance issue.

The other loss is here is in the automatic pruning of the
reservation_object when idling. This is not as large an issue as upon
reservation_object introduction as now adding new fences into the object
replaces already signaled fences, keeping the array compact. But we do
lose the auto-expiration of stale fences and unused arrays. That may be
a noticeable problem for which we need to re-implement autopruning.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-3-chris@chris-wilson.co.uk
2019-06-21 19:47:51 +01:00
Tvrtko Ursulin
a1c8a09e0c drm/i915: Convert i915_gem_flush_ggtt_writes to intel_gt
Having introduced struct intel_gt (named the anonymous structure in i915)
we can start using it to compartmentalize our code better. It makes more
sense logically to have the code internally like this and it will also
help with future split between gt and display in i915.

v2:
 * Keep ggtt flush before fb obj flush. (Chris)

v3:
 * Fix refactoring fail.
 * Always flush ggtt writes. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-23-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:38 +01:00
Chris Wilson
ef78f7b187 drm/i915: Use drm_gem_object.resv
Since commit 1ba627148e ("drm: Add reservation_object to
drm_gem_object"), struct drm_gem_object grew its own builtin
reservation_object rendering our own private one bloat. Remove our
redundant reservation_object and point into obj->base.resv instead.

References: 1ba627148e ("drm: Add reservation_object to drm_gem_object")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618125858.7295-1-chris@chris-wilson.co.uk
2019-06-18 15:30:32 +01:00
Jani Nikula
df0566a641 drm/i915: move modesetting core code under display/
Now that we have a new subdirectory for display code, continue by moving
modesetting core code.

display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this
is, again, a surprisingly clean operation.

v2:
- don't move intel_sideband.[ch] (Ville)
- use tabs for Makefile file lists and sort them

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com
2019-06-17 11:48:32 +03:00
Daniele Ceraolo Spurio
87b391b951 drm/i915: Remove rpm asserts that use i915
Quite a few of the call points have already switched to the version
working directly on the runtime_pm structure, so let's switch over the
rest and kill the i915-based asserts.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-3-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Chris Wilson
ecab9be174 drm/i915: Combine unbound/bound list tracking for objects
With async binding, we don't want to manage a bound/unbound list as we
may end up running before we even acquire the pages. All that is
required is keeping track of shrinkable objects, so reduce it to the
minimum list.

Fixes: 6951e5893b ("drm/i915: Move GEM object domain management from struct_mutex to local")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190612105720.30310-1-chris@chris-wilson.co.uk
2019-06-12 13:36:43 +01:00
Chris Wilson
a8cff4c828 drm/i915: Promote i915->mm.obj_lock to be irqsafe
The intent is to be able to update the mm.lists from inside an irqsoff
section (e.g. from a softirq rcu workqueue), ergo we need to make the
i915->mm.obj_lock irqsafe.

v2: can_discard_pages() ensures we are shrinkable
v3: Beware shadowing of 'flags'

Fixes: 3b4fa9640c ("drm/i915: Track the purgeable objects on a separate eviction list")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110869
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190610145430.17717-1-chris@chris-wilson.co.uk
2019-06-10 20:43:08 +01:00
Chris Wilson
155ab8836c drm/i915: Move object close under its own lock
Use i915_gem_object_lock() to guard the LUT and active reference to
allow us to break free of struct_mutex for handling GEM_CLOSE.

Testcase: igt/gem_close_race
Testcase: igt/gem_exec_parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk
2019-06-06 12:51:13 +01:00
Chris Wilson
d82b4b2621 drm/i915: Report all objects with allocated pages to the shrinker
Currently, we try to report to the shrinker the precise number of
objects (pages) that are available to be reaped at this moment. This
requires searching all objects with allocated pages to see if they
fulfill the search criteria, and this count is performed quite
frequently. (The shrinker tries to free ~128 pages on each invocation,
before which we count all the objects; counting takes longer than
unbinding the objects!) If we take the pragmatic view that with
sufficient desire, all objects are eventually reapable (they become
inactive, or no longer used as framebuffer etc), we can simply return
the count of pinned pages maintained during get_pages/put_pages rather
than walk the lists every time.

The downside is that we may (slightly) over-report the number of
objects/pages we could shrink and so penalize ourselves by shrinking
more than required. This is mitigated by keeping the order in which we
shrink objects such that we avoid penalizing active and frequently used
objects, and if memory is so tight that we need to free them we would
need to anyway.

v2: Only expose shrinkable objects to the shrinker; a small reduction in
not considering stolen and foreign objects.
v3: Restore the tracking from a "backup" copy from before the gem/ split

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-2-chris@chris-wilson.co.uk
2019-05-31 21:23:51 +01:00
Chris Wilson
3b4fa9640c drm/i915: Track the purgeable objects on a separate eviction list
Currently the purgeable objects, I915_MADV_DONTNEED, are mixed in the
normal bound/unbound lists. Every shrinker pass starts with an attempt
to purge from this set of unneeded objects, which entails us doing a
walk over both lists looking for any candidates. If there are none, and
since we are shrinking we can reasonably assume that the lists are
full!, this becomes a very slow futile walk.

If we separate out the purgeable objects into own list, this search then
becomes its own phase that is preferentially handled during shrinking.
Instead the cost becomes that we then need to filter the purgeable list
if we want to distinguish between bound and unbound objects.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-1-chris@chris-wilson.co.uk
2019-05-31 21:23:51 +01:00
Chris Wilson
c017cf6b1a drm/i915: Drop the deferred active reference
An old optimisation to reduce the number of atomics per batch sadly
relies on struct_mutex for coordination. In order to remove struct_mutex
from serialising object/context closing, always taking and releasing an
active reference on first use / last use greatly simplifies the locking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-15-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
6951e5893b drm/i915: Move GEM object domain management from struct_mutex to local
Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Dave Airlie
14ee642c2a Features:
- Engine discovery query (Tvrtko)
 - Support for DP YCbCr4:2:0 outputs (Gwan-gyeong)
 - HDCP revocation support, refactoring (Ramalingam)
 - Remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW (Christian König)
 - Asynchronous display power disabling (Imre)
 - Perma-pin uC firmware and re-enable global reset (Fernando)
 - GTT remapping for display, for bigger fb size and stride (Ville)
 - Enable pipe HDR mode on ICL if only HDR planes are used (Ville)
 - Kconfig to tweak the busyspin durations for i915_wait_request (Chris)
 - Allow multiple user handles to the same VM (Chris)
 - GT/GEM runtime pm improvements using wakerefs (Chris)
 - Gen 4&5 render context support (Chris)
 - Allow userspace to clone contexts on creation (Chris)
 - SINGLE_TIMELINE flags for context creation (Chris)
 - Allow specification of parallel execbuf (Chris)
 
 Refactoring:
 - Header refactoring (Jani)
 - Move GraphicsTechnology files under gt/ (Chris)
 - Sideband code refactoring (Chris)
 
 Fixes:
 - ICL DSI state readout and checker fixes (Vandita)
 - GLK DSI picture corruption fix (Stanislav)
 - HDMI deep color fixes (Clinton, Aditya)
 - Fix driver unbinding from a device in use (Janusz)
 - Fix clock gating with pipe scaling (Radhakrishna)
 - Disable broken FBC on GLK (Daniel Drake)
 - Miscellaneous GuC fixes (Michal)
 - Fix MG PHY DP register programming (Imre)
 - Add missing combo PHY lane power setup (Imre)
 - Workarounds for early ICL VBT issues (Imre)
 - Fix fastset vs. pfit on/off on HSW EDP transcoder (Ville)
 - Add readout and state check for pch_pfit.force_thru (Ville)
 - Miscellaneous display fixes and refactoring (Ville)
 - Display workaround fixes (Ville)
 - Enable audio even if ELD is bogus (Ville)
 - Fix use-after-free in reporting create.size (Chris)
 - Sideband fixes to avoid BYT hard lockups (Chris)
 - Workaround fixes and improvements (Chris)
 
 Maintainer shortcomings:
 - Failure to adequately describe and give credit for all changes (Jani)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFWWmW3ewYy4RJOWc05gHnSar7m8FAlzoK8oACgkQ05gHnSar
 7m8jZg//UuIkz4bIu7A0YfN/VH3/h3fthxboejj27HpO4OO9eFqLVqaEUFEngGvf
 66fnFKNwtLdW7Dsx9iQsKNsVTcdsEE5PvSA6FZ3rVtYOwBdZ9OKYRxci2KcSnjqz
 F0/8Jxgz2G0gu9TV6dgTLrfdJiuJrCbidRV3G5id0XHNEGbpABtmVxYfsbj/w9mU
 luckCgKyRDZNzfhyGIPV763bNGZWLQPcbP99yrZf4+EcsiQ2MfjHJdwe5Ko+iGDk
 sO3lFg/1iEf41gqaD4LPokOtUKZfXI1Sujs1w/0djDbqs9USq0eY1L5C3ZBq5Si1
 woz7ATXO71FfBcNRxLTejNqCVlQMLix/185/ItkDA4gDlHwWZPYaT5VTNgRtEEy6
 XNtscZyM6Z1ghqRqahWWu40g80sOdfYuiTFEAYonVbDAUootgF46uWO/2ib0Hya+
 tYlm60M097eMealzaXEyHPHlW1OeUUJTKxl9j7nHmqVn542OI8gn7xvIXX2VsYDY
 7D4gVPoFg0UpGXM2uuSHVgvxwtg4t083Wu+utYu76RjmwNye4LkHewWGFjmOkYRf
 BraHoA+gKPFtAJjjtkyE/ZnlT4c3tDoQ0a6+gRKVurXzu/Y6JVzquhJvH5mShyZ7
 oTv+erupcz7JEnEeKzgMCyon/Drumiut5I6zr29GNQ3eelpf4jQ=
 =U/nc
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2019-05-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

Features:
- Engine discovery query (Tvrtko)
- Support for DP YCbCr4:2:0 outputs (Gwan-gyeong)
- HDCP revocation support, refactoring (Ramalingam)
- Remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW (Christian König)
- Asynchronous display power disabling (Imre)
- Perma-pin uC firmware and re-enable global reset (Fernando)
- GTT remapping for display, for bigger fb size and stride (Ville)
- Enable pipe HDR mode on ICL if only HDR planes are used (Ville)
- Kconfig to tweak the busyspin durations for i915_wait_request (Chris)
- Allow multiple user handles to the same VM (Chris)
- GT/GEM runtime pm improvements using wakerefs (Chris)
- Gen 4&5 render context support (Chris)
- Allow userspace to clone contexts on creation (Chris)
- SINGLE_TIMELINE flags for context creation (Chris)
- Allow specification of parallel execbuf (Chris)

Refactoring:
- Header refactoring (Jani)
- Move GraphicsTechnology files under gt/ (Chris)
- Sideband code refactoring (Chris)

Fixes:
- ICL DSI state readout and checker fixes (Vandita)
- GLK DSI picture corruption fix (Stanislav)
- HDMI deep color fixes (Clinton, Aditya)
- Fix driver unbinding from a device in use (Janusz)
- Fix clock gating with pipe scaling (Radhakrishna)
- Disable broken FBC on GLK (Daniel Drake)
- Miscellaneous GuC fixes (Michal)
- Fix MG PHY DP register programming (Imre)
- Add missing combo PHY lane power setup (Imre)
- Workarounds for early ICL VBT issues (Imre)
- Fix fastset vs. pfit on/off on HSW EDP transcoder (Ville)
- Add readout and state check for pch_pfit.force_thru (Ville)
- Miscellaneous display fixes and refactoring (Ville)
- Display workaround fixes (Ville)
- Enable audio even if ELD is bogus (Ville)
- Fix use-after-free in reporting create.size (Chris)
- Sideband fixes to avoid BYT hard lockups (Chris)
- Workaround fixes and improvements (Chris)

Maintainer shortcomings:
- Failure to adequately describe and give credit for all changes (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87sgt3n45z.fsf@intel.com
2019-05-28 09:26:52 +10:00
Ville Syrjälä
bb211c3d0c drm/i915/selftests: Add live vma selftest
Add a live selftest to excercise rotated/remapped vmas. We simply
write through the rotated/remapped vma, and confirm that the data
appears in the right page when read through the normal vma.

Not sure what the fallout of making all rotated/remapped vmas
mappable/fenceable would be, hence I just hacked it in the test.

v2: Grab rpm reference (Chris)
    GEM_BUG_ON(view.type not as expected) (Chris)
    Allow CAN_FENCE for rotated/remapped vmas (Chris)
    Update intel_plane_uses_fence() to ask for a fence
    only for normal vmas on gen4+
v3: Deal with intel_wakeref_t
v4: Rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-4-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20 18:04:47 +03:00
Ville Syrjälä
1a74fc0b3f drm/i915: Add a new "remapped" gtt_view
To overcome display engine stride limits we'll want to remap the
pages in the GTT. To that end we need a new gtt_view type which
is just like the "rotated" type except not rotated.

v2: Use intel_remapped_plane_info base type
    s/unused/unused_mbz/ (Chris)
    Separate BUILD_BUG_ON()s (Chris)
    Use I915_GTT_PAGE_SIZE (Chris)
v3: Use i915_gem_object_get_dma_address() (Chris)
    Trim the sg (Tvrtko)
v4: Actually trim this time. Limit the max length
    to one row of pages to keep things simple

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20 18:04:47 +03:00
Linus Torvalds
a2d635decb drm pull request for 5.2
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc04M6AAoJEAx081l5xIa+SJgP/0uIgIOM53vPpydgmr+2IEHF
 jbDqrd+mipgNriRVHjDsWdUHCUNtyhB7YEBCMrj3mY0rRFI7FlQQf4lOwYGoHiKP
 4JZg4kwC37997lFXl1uabGj3DmJLtxKL2/D15zCH/uLe+2EDzWznP6NVdFT3WK0P
 YKZQCWT19PWSsLoBRPutWxkmop4AYvkqE0a6vXUlJlFYZK3Bbytx6/179uWKfiX5
 ZkKEEtx1XiDAvcp5gBb6PISurycrBY0e/bkPBnK3ES5vawMbTU5IrmWOrQ4D8yOd
 z9qOVZawZ6+b2XBDgBWjQ9bM7I5R7Il1q/LglYEaFI9+wHUnlUdDSm6ft5/5BiCZ
 fqgkh5Bj2iEsajbSsacoljMOpxpYPqj63mqc+7fAGXF34V+B+9U1bpt8kCbMKowf
 7Abb7IuiCR6vLDapjP6VqTMvdQ4O466OEAN83ULGFTdmMqYYH4AxaIwc+xcAk/aP
 RNq7/RHhh4FRynRAj9fCkGlF3ArnM88gLINwWuEQq4SClWGcvdw7eaHpwWo77c4g
 iccCnTLqSIg5pDVu07AQzzBlW6KulWxh5o72x+Xx+EXWdYUDHQ1SlNs11bSNUBV1
 5MkrzY2GuD+NFEjsXJEDIPOr40mQOyJCXnxq8nXPsz/hD9kHeJPvWn3J3eVKyb5B
 Z6/knNqM0BDn3SaYR/rD
 =YFiQ
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This has two exciting community drivers for ARM Mali accelerators.
  Since ARM has never been open source friendly on the GPU side of the
  house, the community has had to create open source drivers for the
  Mali GPUs. Lima covers the older t4xx and panfrost the newer 6xx/7xx
  series. Well done to all involved and hopefully this will help ARM
  head in the right direction.

  There is also now the ability if you don't have any of the legacy
  drivers enabled (pre-KMS) to remove all the pre-KMS support code from
  the core drm, this saves 10% or so in codesize on my machine.

  i915 also enable Icelake/Elkhart Lake Gen11 GPUs by default, vboxvideo
  moves out of staging.

  There are also some rcar-du patches which crossover with media tree
  but all should be acked by Mauro.

  Summary:

  uapi changes:
   - Colorspace connector property
   - fourcc - new YUV formts
   - timeline sync objects initially merged
   - expose FB_DAMAGE_CLIPS to atomic userspace

  new drivers:
   - vboxvideo: moved out of staging
   - aspeed: ASPEED SoC BMC chip display support
   - lima: ARM Mali4xx GPU acceleration driver support
   - panfrost: ARM Mali6xx/7xx Midgard/Bitfrost acceleration driver support

  core:
   - component helper docs
   - unplugging fixes
   - devm device init
   - MIPI/DSI rate control
   - shmem backed gem objects
   - connector, display_info, edid_quirks cleanups
   - dma_buf fence chain support
   - 64-bit dma-fence seqno comparison fixes
   - move initial fb config code to core
   - gem fence array helpers for Lima
   - ability to remove legacy support code if no drivers requires it (removes 10% of drm.ko size)
   - lease fixes

  ttm:
   - unified DRM_FILE_PAGE_OFFSET handling
   - Account for kernel allocations in kernel zone only

  panel:
   - OSD070T1718-19TS panel support
   - panel-tpo-td028ttec1 backlight support
   - Ronbo RB070D30 MIPI/DSI
   - Feiyang FY07024DI26A30-D MIPI-DSI panel
   - Rocktech jh057n00900 MIPI-DSI panel

  i915:
   - Comet Lake (Gen9) PCI IDs
   - Updated Icelake PCI IDs
   - Elkhartlake (Gen11) support
   - DP MST property addtions
   - plane and watermark fixes
   - Icelake port sync and VEBOX disable fixes
   - struct_mutex usage reduction
   - Icelake gamma fix
   - GuC reset fixes
   - make mmap more asynchronous
   - sound display power well race fixes
   - DDI/MIPI-DSI clocks for Icelake
   - Icelake RPS frequency changing support
   - Icelake workarounds

  amdgpu:
   - Use HMM for userptr
   - vega20 experimental smu11 support
   - RAS support for vega20
   - BACO support for vega12 + fixes for vega20
   - reworked IH interrupt handling
   - amdkfd RAS support
   - Freesync improvements
   - initial timeline sync object support
   - DC Z ordering fixes
   - NV12 planes support
   - colorspace properties for planes=
   - eDP opts if eDP already initialized

  nouveau:
   - misc fixes

  etnaviv:
   - misc fixes

  msm:
   - GPU zap shader support expansion
   - robustness ABI addition

  exynos:
   - Logging cleanups

  tegra:
   - Shared reset fix
   - CPU cache maintenance fix

  cirrus:
   - driver rewritten using simple helpers

  meson:
   - G12A support

  vmwgfx:
   - Resource dirtying management improvements
   - Userspace logging improvements

  virtio:
   - PRIME fixes

  rockchip:
   - rk3066 hdmi support

  sun4i:
   - DSI burst mode support

  vc4:
   - load tracker to detect underflow

  v3d:
   - v3d v4.2 support

  malidp:
   - initial Mali D71 support in komeda driver

  tfp410:
   - omap related improvement

  omapdrm:
   - drm bridge/panel support
   - drop some omap specific panels

  rcar-du:
   - Display writeback support"

* tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm: (1507 commits)
  drm/msm/a6xx: No zap shader is not an error
  drm/cma-helper: Fix drm_gem_cma_free_object()
  drm: Fix timestamp docs for variable refresh properties.
  drm/komeda: Mark the local functions as static
  drm/komeda: Fixed warning: Function parameter or member not described
  drm/komeda: Expose bus_width to Komeda-CORE
  drm/komeda: Add sysfs attribute: core_id and config_id
  drm: add non-desktop quirk for Valve HMDs
  drm/panfrost: Show stored feature registers
  drm/panfrost: Don't scream about deferred probe
  drm/panfrost: Disable PM on probe failure
  drm/panfrost: Set DMA masks earlier
  drm/panfrost: Add sanity checks to submit IOCTL
  drm/etnaviv: initialize idle mask before querying the HW db
  drm: introduce a capability flag for syncobj timeline support
  drm: report consistent errors when checking syncobj capibility
  drm/nouveau/nouveau: forward error generated while resuming objects tree
  drm/nouveau/fb/ramgk104: fix spelling mistake "sucessfully" -> "successfully"
  drm/nouveau/i2c: Disable i2c bus access after ->fini()
  drm/nouveau: Remove duplicate ACPI_VIDEO_NOTIFY_PROBE definition
  ...
2019-05-08 21:35:19 -07:00
Thomas Gleixner
487f3c7fb1 drm: Simplify stacktrace handling
Replace the indirection through struct stack_trace by using the storage
array based interfaces.

The original code in all printing functions is really wrong. It allocates a
storage array on stack which is unused because depot_fetch_stack() does not
store anything in it. It overwrites the entries pointer in the stack_trace
struct so it points to the depot storage.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: intel-gfx@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: linux-mm@kvack.org
Cc: David Rientjes <rientjes@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: iommu@lists.linux-foundation.org
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: linux-btrfs@vger.kernel.org
Cc: dm-devel@redhat.com
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: linux-arch@vger.kernel.org
Link: https://lkml.kernel.org/r/20190425094802.622094226@linutronix.de
2019-04-29 12:37:53 +02:00
Chris Wilson
112ed2d31a drm/i915: Move GraphicsTechnology files under gt/
Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
2019-04-24 21:01:46 +01:00
Chris Wilson
103b76eeff drm/i915: Use i915_global_register()
Rather than manually add every new global into each hook, use
i915_global_register() function and keep a list of registered globals to
invoke instead.

However, I haven't found a way for random drivers to add an .init table
to avoid having to manually add ourselves to i915_globals_init() each
time.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305213830.18094-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2019-03-06 10:00:50 +00:00
Chris Wilson
13f1bfd3b3 drm/i915: Make object/vma allocation caches global
As our allocations are not device specific, we can move our slab caches
to a global scope.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk
2019-02-28 11:08:02 +00:00
Chris Wilson
21950ee7cc drm/i915: Pull i915_gem_active into the i915_active family
Looking forward, we need to break the struct_mutex dependency on
i915_gem_active. In the meantime, external use of i915_gem_active is
quite beguiling, little do new users suspect that it implies a barrier
as each request it tracks must be ordered wrt the previous one. As one
of many, it can be used to track activity across multiple timelines, a
shared fence, which fits our unordered request submission much better. We
need to steer external users away from the singular, exclusive fence
imposed by i915_gem_active to i915_active instead. As part of that
process, we move i915_gem_active out of i915_request.c into
i915_active.c to start separating the two concepts, and rename it to
i915_active_request (both to tie it to the concept of tracking just one
request, and to give it a longer, less appealing name).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
2019-02-05 17:20:11 +00:00
Chris Wilson
64d6c500a3 drm/i915: Generalise GPU activity tracking
We currently track GPU memory usage inside VMA, such that we never
release memory used by the GPU until after it has finished accessing it.
However, we may want to track other resources aside from VMA, or we may
want to split a VMA into multiple independent regions and track each
separately. For this purpose, generalise our request tracking (akin to
struct reservation_object) so that we can embed it into other objects.

v2: Tweak error handling during selftest setup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk
2019-02-05 17:12:00 +00:00
Chris Wilson
528cbd17ce drm/i915: Move vma lookup to its own lock
Remove the struct_mutex requirement for looking up the vma for an
object.

v2: Highlight how the race for duplicate vma creation is resolved on
reacquiring the lock with a short comment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk
2019-01-28 16:24:16 +00:00
Chris Wilson
09d7e46b97 drm/i915: Pull VM lists under the VM mutex.
A starting point to counter the pervasive struct_mutex. For the goal of
avoiding (or at least blocking under them!) global locks during user
request submission, a simple but important step is being able to manage
each clients GTT separately. For which, we want to replace using the
struct_mutex as the guard for all things GTT/VM and switch instead to a
specific mutex inside i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk
2019-01-28 16:24:13 +00:00
Chris Wilson
499197dc16 drm/i915: Stop tracking MRU activity on VMA
Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmaping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.

Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think a LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).

v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
2019-01-28 16:24:09 +00:00
Jani Nikula
2ac5e38ea4 Merge drm/drm-next into drm-intel-next-queued
Pull in v4.20-rc3 via drm-next.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2018-11-20 13:14:08 +02:00
Christian König
ca05359f1e dma-buf: allow reserving more than one shared fence slot
Let's support simultaneous submissions to multiple engines.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.kernel.org/patch/10626149/
2018-10-25 13:45:07 +02:00
Tvrtko Ursulin
bbb8a9d7e0 drm/i915: GEM_WARN_ON considered harmful
GEM_WARN_ON currently has dangerous semantics where it is completely
compiled out on !GEM_DEBUG builds. This can leave users who expect it to
be more like a WARN_ON, just without a warning in non-debug builds, in
complete ignorance.

Another gotcha with it is that it cannot be used as a statement. Which is
again different from a standard kernel WARN_ON.

This patch fixes both problems by making it behave as one would expect.

It can now be used both as an expression and as statement, and also the
condition evaluates properly in all builds - code under the conditional
will therefore not unexpectedly disappear.

To satisfy call sites which really want the code under the conditional to
completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the
callers to it. This one can also be used as both expression and statement.

>From the above it follows GEM_DEBUG_WARN_ON should be used in situations
where we are certain the condition will be hit during development, but at
a place in code where error can be handled to the benefit of not crashing
the machine.

GEM_WARN_ON on the other hand should be used where condition may happen in
production and we just want to distinguish the level of debugging output
emitted between the production and debug build.

v2:
 * Dropped BUG_ON hunk.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com
2018-10-18 10:10:12 +01:00
Chris Wilson
a4417b7b41 drm/i915: Stop holding a ref to the ppgtt from each vma
The context owns both the ppgtt and the vma within it, and our activity
tracking on the context ensures that we do not release active ppgtt. As
the context fulfils our obligations for active memory tracking, we can
relinquish the reference from the vma.

This fixes a silly transient refleak from closed vma being kept alive
until the entire system was idle, keeping all vm alive as well.

Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Testcase: igt/gem_ctx_create/files
Fixes: 3365e2268b ("drm/i915: Lazily unbind vma on close")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180816073448.19396-1-chris@chris-wilson.co.uk
2018-08-16 13:31:37 +01:00
Chris Wilson
6a2f59e45a drm/i915: Pull unpin map into vma release
A reasonably common operation is to pin the map of the vma alongside the
vma itself for the lifetime of the vma, and so release both pins at the
same time as destroying the vma. It is common enough to pull into the
release function, making that central function more attractive to a
couple of other callsites.

The continual ulterior motive is to sweep over errors on module load
aborting...

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180721125037.20127-1-chris@chris-wilson.co.uk
2018-07-24 09:55:12 +01:00
Chris Wilson
46b1063f91 drm/i915: Handle recursive shrinker for vma->last_active allocation
If we call into the shrinker for direct relcaim inside kmalloc, it will
retire the requests. If we retire the vma->last_active while processing a
new i915_vma_move_to_active() we can upset the delicate bookkeeping
required for the cache. After the possible invocation of the shrinker, we
need to double check the vma->last_active is still valid.

Fixes: 8b293eb53a ("drm/i915: Track the last-active inside the i915_vma")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105600#c39
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719072206.16015-1-chris@chris-wilson.co.uk
2018-07-19 12:27:46 +01:00
Chris Wilson
8b293eb53a drm/i915: Track the last-active inside the i915_vma
Using a VMA on more than one timeline concurrently is the exception
rather than the rule (using it concurrently on multiple engines). As we
expect to only use one active tracker, store the most recently used
tracker inside the i915_vma itself and only fallback to the rbtree if
we need a second or more concurrent active trackers.

v2: Comments on how we overwrite any existing last_active cache.
v3: __list_del_entry() before list_replace_init() is confusing and, much
more important, entirely redundant.
v4: Note that both last_active and the rbtree may be simultaneously
tracking this timeline, albeit with different requests, and so the vma
may be retired twice for the same timeline.
v5: No, that list_del is required!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706123157.9645-1-chris@chris-wilson.co.uk
2018-07-06 18:23:07 +01:00
Chris Wilson
5c3f8c221c drm/i915: Track vma activity per fence.context, not per engine
In the next patch, we will want to be able to use more flexible request
timelines that can hop between engines. From the vma pov, we can then
not rely on the binding of this request to an engine and so can not
ensure that different requests are ordered through a per-engine
timeline, and so we must track activity of all timelines. (We track
activity on the vma itself to prevent unbinding from HW before the HW
has finished accessing it.)

v2: Switch to a rbtree for 32b safety (since using u64 as a radixtree
index is fraught with aliasing of unsigned longs).
v3: s/lookup_active/active_instance/ because we can never agree on names

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-5-chris@chris-wilson.co.uk
2018-07-06 18:22:43 +01:00
Chris Wilson
e6bb1d7f1a drm/i915: Move i915_vma_move_to_active() to i915_vma.c
i915_vma_move_to_active() has grown beyond its execbuf origins, and
should take its rightful place in i915_vma.c as a method for i915_vma!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-4-chris@chris-wilson.co.uk
2018-07-06 18:22:41 +01:00
Chris Wilson
1eca65d922 drm/i915: Squelch very verbose error logging
Having found the error causing the IGT test to fail, downgrade the
verbose logging so that we stop flooding the syslogs as we deliberately
provoke it many thousands of time during selftests.

References: 10195b1e44 ("drm/i915: Show vma allocator stack when in doubt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-1-chris@chris-wilson.co.uk
2018-07-06 11:23:47 +01:00
Chris Wilson
7e7367d3bc drm/i915: Try GGTT mmapping whole object as partial
If the whole object is already pinned by HW for use as scanout, we will
fail to move it to the mappable region and so must resort to using a
partial VMA covering the whole object.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104513
Fixes: aa136d9d72 ("drm/i915: Convert partial ggtt vma to full ggtt if it spans the entire object")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180630090509.469-1-chris@chris-wilson.co.uk
2018-07-02 17:36:09 +01:00
Chris Wilson
10195b1e44 drm/i915: Show vma allocator stack when in doubt
At the moment, gem_exec_gttfill fails with a sporadic EBUSY due to us
wanting to unbind a pinned batch. Let's dump who first bound that vma to
see if that helps us identify who still unexpectedly has it pinned.

v2: We cannot allocate inside the printer (as it may be on an fs-reclaim
path), so hope for the best and build the string on the stack
v3: stack depth of 16 routinely overflows a 512 character string, limit
it to 12 to avoid unsightly truncation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628132206.8329-1-chris@chris-wilson.co.uk
2018-06-28 20:56:35 +01:00
Chris Wilson
93f2cde2a4 drm/i915: Decouple vma vfuncs from vm
To allow for future non-object backed vma, we need to be able to
specialise the callbacks for binding, et al, the vma. For example,
instead of calling vma->vm->bind_vma(), we now call
vma->ops->bind_vma(). This gives us the opportunity to later override the
operation for a custom vma.

v2: flip order of unbind/bind

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-2-chris@chris-wilson.co.uk
2018-06-07 21:53:11 +01:00
Chris Wilson
520ea7c581 drm/i915: Prepare for non-object vma
In order to allow ourselves to use VMA to wrap other entities other than
GEM objects, we need to allow for the vma->obj backpointer to be NULL.
In most cases, we know we are operating on a GEM object and its vma, but
we need the core code (such as i915_vma_pin/insert/bind/unbind) to work
regardless of the innards.

The remaining eyesore here is vma->obj->cache_level and related (but
less of an issue) vma->obj->gt_ro. With a bit of care we should mirror
those on the vma itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-1-chris@chris-wilson.co.uk
2018-06-07 21:53:10 +01:00
Chris Wilson
82ad6443a5 drm/i915/gtt: Rename i915_hw_ppgtt base member
In the near future, I want to subclass gen6_hw_ppgtt as it contains a
few specialised members and I wish to add more. To avoid the ugliness of
using ppgtt->base.base, rename the i915_hw_ppgtt base member
(i915_address_space) as vm, which is our common shorthand for an
i915_address_space local.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk
2018-06-05 21:11:20 +01:00
Chris Wilson
83d317adfb drm/i915/vma: Move the bind_count vs pin_count assertion to a helper
To spare ourselves a long line later, refactor the repeated check of
bind_count vs pin_count to a helper.

v2: Fix up the commentary!

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605094107.31367-1-chris@chris-wilson.co.uk
2018-06-05 15:16:07 +01:00
Chris Wilson
3365e2268b drm/i915: Lazily unbind vma on close
When userspace is passing around swapbuffers using DRI, we frequently
have to open and close the same object in the foreign address space.
This shows itself as the same object being rebound at roughly 30fps
(with a second object also being rebound at 30fps), which involves us
having to rewrite the page tables and maintain the drm_mm range manager
every time.

However, since the object still exists and it is only the local handle
that disappears, if we are lazy and do not unbind the VMA immediately
when the local user closes the object but defer it until the GPU is
idle, then we can reuse the same VMA binding. We still have to be
careful to mark the handle and lookup tables as closed to maintain the
uABI, just allowing the underlying VMA to be resurrected if the user is
able to access the same object from the same context again.

If the object itself is destroyed (neither userspace keeping a handle to
it), the VMA will be reaped immediately as usual.

In the future, this will be even more useful as instantiating a new VMA
for use on the GPU will become heavier. A nuisance indeed, so nip it in
the bud.

v2: s/__i915_vma_final_close/i915_vma_destroy/ etc.
v3: Leave a hint as to why we deferred the unbind on close.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk
2018-05-04 07:26:56 +01:00
Chris Wilson
e61e0f51ba drm/i915: Rename drm_i915_gem_request to i915_request
We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)

In short, the spatch:
@@

@@
- struct drm_i915_gem_request
+ struct i915_request

A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-02-21 20:57:22 +00:00
Matthew Auld
73ebd50303 drm/i915: make mappable struct resource centric
Now that we are using struct resource to track the stolen region, it is
more convenient if we track the mappable region in a resource as well.

v2: prefer iomap and gmadr naming scheme
    prefer DEFINE_RES_MEM

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-8-matthew.auld@intel.com
2017-12-12 12:30:21 +02:00
Chris Wilson
e2189dd078 drm/i915: Refactor common list iteration over GGTT vma
In quite a few places, we have a list iteration over the vma on an
object that only want to inspect GGTT vma. By construction, these are
placed at the start of the list, so we have copied that knowledge into
many callsites. Pull that knowledge back to i915_vma.h and provide a
for_each_ggtt_vma() to tidy up the code.

v2: Add a backreference from vma_create() to remind ourselves why we put
ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail).
v3: Fixup s/vma/V/

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171207211407.31549-1-chris@chris-wilson.co.uk
2017-12-07 23:26:55 +00:00
Chris Wilson
7125397b82 drm/i915: Track GGTT writes on the vma
As writes through the GTT and GGTT PTE updates do not share the same
path, they are not strictly ordered and so we must explicitly flush the
indirect writes prior to modifying the PTE. We do track outstanding GGTT
writes on the object itself, but since the object may have multiple GGTT
vma, that is overly coarse as we can track and flush individual vma as
required.

Whilst here, update the GGTT flushing behaviour for Cannonlake.

v2: Hard-code ring offset to allow use during unload (after RCS may have
been freed, or never existed!)

References: https://bugs.freedesktop.org/show_bug.cgi?id=104002
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-2-chris@chris-wilson.co.uk
2017-12-07 14:01:59 +00:00
Chris Wilson
010e3e68cd drm/i915: Remove vma from object on destroy, not close
Originally we translated from the object to the vma by walking
obj->vma_list to find the matching vm (for user lookups). Now we process
user lookups using the rbtree, and we only use obj->vma_list itself for
maintaining state (e.g. ensuring that all vma are flushed or rebound).
As such maintenance needs to go on beyond the user's awareness of the
vma, defer removal of the vma from the obj->vma_list from i915_vma_close()
to i915_vma_destroy()

Fixes: 5888fc9eac ("drm/i915: Flush pending GTT writes before unbinding")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104155
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-12-07 14:00:20 +00:00
Chris Wilson
7f017b19fb drm/i915: Mark up i915_vma_unbind() as a potential sleeper
Whenever we want to unbind a vma, we must wait on all GPU activity to
complete first. (This is what gives us the ability to do fine grained
eviction and purging by only having to wait on the VMA that we need to
unbind to proceed; though if pushed we can make it a rule that we are
only allowed to unbind already idle VMA and move the burden of the work
and organising the sleep onto the caller.) Currently, we might only
sleep if the vma is still active on the GPU, but in principle
i915_vma_unbind() always implies a sleep, so mark it up with a
might_sleep().

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=103638
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171109213450.13875-2-chris@chris-wilson.co.uk
2017-11-09 22:06:09 +00:00
Chris Wilson
1ab22356b3 drm/i915: Prune the reservation shared fence array
The shared fence array is not autopruning and may continue to grow as an
object is shared between new timelines. Take the opportunity when we
think the object is idle (we have to confirm that any external fence is
also signaled) to decouple all the fences.

We apply a similar trick after waiting on an object, see commit
e54ca97747 ("drm/i915: Remove completed fences after a wait")

v2: No longer need to handle the batch pool as a special case.
v3: Need to trylock from within i915_vma_retire as this may be called
form the shrinker - and we may later try to allocate underneath the
reservation lock, so a deadlock is possible.

References: https://bugs.freedesktop.org/show_bug.cgi?id=102936
Fixes: d07f0e59b2 ("drm/i915: Move GEM activity tracking into a common struct reservation_object")
Fixes: 80b204bce8 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107220656.5020-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-08 15:40:31 +00:00
Chris Wilson
d36caeea4b drm/i915: Assert vma->flags are updated correctly during binding
As we bind, and unbind on error, we want to be sure that the vma->flags
are updated to reflect the binding state so that on the next invocation
all is well.

v2: Take two.
v3: Take three; vma-misplaced is checking map-and-fenceable so keep it
last!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171105124550.32715-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
2017-11-06 09:30:36 +00:00
Chris Wilson
f2123818ff drm/i915: Move dev_priv->mm.[un]bound_list to its own lock
Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61 ("drm/i915/shrinker: Flush active on objects before
counting")).

v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
2017-10-16 20:44:19 +01:00
Chris Wilson
a65adaf8a8 drm/i915: Track user GTT faulting per-vma
We don't wish to refault the entire object (other vma) when unbinding
one partial vma. To do this track which vma have been faulted into the
user's address space.

v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09 17:07:29 +01:00
Chris Wilson
3bd4073524 drm/i915: Consolidate get_fence with pin_fence
Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. I.e. instead of
calling get_fence(); pin_fence(), we now just need to call pin_fence().
This will make it easier to reduce the locking requirements around
fence registers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09 17:07:29 +01:00