Arguably some of these should use intel_de_read() or intel_de_write(),
however not all. Prioritize I915_READ() and I915_WRITE() removal in
general over migrating to the pedantically correct replacements right
away.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-8-jani.nikula@intel.com
Arguably some of these should use intel_de_read() or intel_de_write(),
however not all. Prioritize I915_READ() and I915_WRITE() removal in
general over migrating to the pedantically correct replacements right
away.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-7-jani.nikula@intel.com
Another straggler with I915_READ() uses gone.
Arguably some of these should use intel_de_read(), however not
all. Prioritize I915_READ() removal in general over migrating to the
pedantically correct replacement right away.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-5-jani.nikula@intel.com
The i915_cache_sharing file is a debugfs interface for gen 6-7 with no
validation or user. Remove it.
This also removes the last I915_WRITE() use in i915_debugfs.c.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-4-jani.nikula@intel.com
Good riddance! Remove the macros and their remaining references in
comments.
intel_uncore_read_fw() and intel_uncore_write_fw() should be used
instead.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-2-jani.nikula@intel.com
The information is no longer relevant, so remove it. This also removes
the last users of I915_READ_FW()
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-1-jani.nikula@intel.com
Ville noticed that the last mocs entry is used unconditionally by the HW
when it performs cache evictions, and noted that while the value is not
meant to be writable by the driver, we should program it to a reasonable
value nevertheless.
As it turns out, we can change the value of mocs:63 and the value we
were programming into it would cause hard hangs in conjunction with
atomic operations.
v2: Add details from bspec about how it is used by HW
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2707
Fixes: 3bbaba0cea ("drm/i915: Added Programming of the MOCS")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140841.1982-1-chris@chris-wilson.co.uk
(cherry picked from commit 977933b5da)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
After a cursory check on the parameters to i915_gem_object_pin_map(),
where we return a precise error, if the backend rejects the mapping we
always return PTR_ERR(-ENOMEM). Let us also return a more precise error
here so we can differentiate between running out of memory and
programming errors (or situations where we may be trying different paths
and looking for an error from an unsupported map).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127195334.13134-1-chris@chris-wilson.co.uk
Block sizes are only limited by the largest power-of-two that will fit
in the region size, but to construct an object we also require feeding
it into an sg list, where the upper limit of the sg entry is at most
UINT_MAX. Therefore to prevent issues with allocating blocks that are
too large, add the flag I915_ALLOC_MAX_SEGMENT_SIZE which should limit
block sizes to the i915_sg_segment_size().
v2: (matt)
- query the max segment.
- prefer flag to limit block size to 4G, since it's best not to assume
the user will feed the blocks into an sg list.
- simple selftest so we don't have to guess.
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130134721.54457-1-matthew.auld@intel.com
For the LMEM case if we have suitable alignment and 2M physical pages we
should always get 2M GTT pages within the constraints of the hugepages
selftest. If we don't then something might be wrong in our construction
of the backing pages.
References: 330b7d3305 ("drm/i915/region: fix order when adding blocks")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130141809.65330-2-matthew.auld@intel.com
In igt_ppgtt_sanity_check we should also exercise the non-contiguous
option for LMEM, since this will give us slightly different sg layouts
and alignment.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130141809.65330-1-matthew.auld@intel.com
We treat idling the GT (intel_rps_park) as a downclock event, and reduce
the frequency we intend to restart the GT with. Since the two workloads
are likely related (e.g. a compositor rendering every 16ms), we want to
carry the frequency and load information from across the idling.
However, we do also need to update the frequencies so that workloads
that run for less than 1ms are autotuned by RPS (otherwise we leave
compositors running at max clocks, draining excess power). Conversely,
if we try to run too slowly, the next workload has to run longer. Since
there is a hysteresis in the power graph, below a certain frequency
running a short workload for longer consumes more energy than running it
slightly higher for less time. The exact balance point is unknown
beforehand, but measurements with 30fps media playback indicate that RPe
is a better choice.
Reported-by: Edward Baker <edward.baker@intel.com>
Tested-by: Edward Baker <edward.baker@intel.com>
Fixes: 043cd2d14e ("drm/i915/gt: Leave rps->cur_freq on unpark")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201124183521.28623-1-chris@chris-wilson.co.uk
idr_init() uses base 0 which is an invalid identifier. The new function
idr_init_base allows IDR to set the ID lookup from base 1. This avoids
all lookups that otherwise starts from 0 since 0 is always unused.
References: commit 6ce711f275 ("idr: Make 1-based IDRs more efficient")
Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104150339.GA68663@localhost
We know a problem exists in the ifwi shipped with the early
pre-production Tigerlake and DG1 prototypes, later revisions are fine.
However, CI still relies on the earlier ifwi and we grow tired of
the volume of warnings as we wait for replacements.
Since the warning is a bug, we do not want to lose the warning in its
entirety, so only suppress the warning for the platforms currently
exhibiting the issue.
Suggested-by: José Roberto de Souza <gitlab@gitlab.freedesktop.org>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2411
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127210059.10702-1-chris@chris-wilson.co.uk
As we use a shmemfs file to hold the context state, when not in use it
may be swapped out, such as across suspend. Since we wrote into the
shmemfs without marking the pages as dirty, the contents may be dropped
instead of being written back to swap. On re-using the shmemfs file,
such as creating a new context after resume, the contents of that file
were likely garbage and so the new context could then hang the GPU.
Simply mark the page as being written when copying into the shmemfs
file, and it the new contents will be retained across swapout.
Fixes: be1cb55a07 ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.8+
Link: https://patchwork.freedesktop.org/patch/msgid/20201127120718.454037-161-matthew.auld@intel.com
We now use ilk_hpd_irq_setup for all GMCH platforms that do not have
hotplug. These are early gen3 and gen2 devices that now explode on boot
as they try to access non-existent registers.
Fixes: 794d61a190 ("drm/i915: re-order if/else ladder for hpd_irq_setup")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127145748.29491-1-chris@chris-wilson.co.uk
We checked the table size against a hardcoded number of entries, and
that number was excluding the special mocs registers at the end.
Fixes: 977933b5da ("drm/i915/gt: Program mocs:63 for cache eviction on gen9")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127102540.13117-1-chris@chris-wilson.co.uk
If while we are cancelling the breadcrumb signaling, we find that the
request is already completed, move it to the irq signaler and let it be
signaled.
v2: Tweak reference counting so that we only acquire a new reference on
adding to a signal list, as opposed to a hidden i915_request_put of the
caller's reference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-5-chris@chris-wilson.co.uk
Allow a brief period for continued access to a dead intel_context by
deferring the release of the struct until after an RCU grace period.
As we are using a dedicated slab cache for the contexts, we can defer
the release of the slab pages via RCU, with the caveat that individual
structs may be reused from the freelist within an RCU grace period. To
handle that, we have to avoid clearing members of the zombie struct.
This is required for a later patch to handle locking around virtual
requests in the signaler, as those requests may want to move between
engines and be destroyed while we are holding b->irq_lock on a physical
engine.
v2: Drop mutex_reinit(), if we never mark the mutex as destroyed we
don't need to reset the debug code, at the loss of having the mutex
debug code spot us attempting to destroy a locked mutex.
v3: As the intended use will remain strongly referenced counted, with
very little inflight access across reuse, drop the ctor.
v4: Drop the unrequired change to remove the temporary reference around
dropping the active context, and add back some more missing ctor
operations.
v5: The ctor is back. Tvrtko spotted that ce->signal_lock [introduced
later] maybe accessed under RCU and so needs special care not to be
reinitialised.
v6: Don't mix SLAB_TYPESAFE_BY_RCU and RCU list iteration.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-3-chris@chris-wilson.co.uk
Pull the repeated check for the last active request being completed to a
single spot, when deciding whether or not execlist preemption is
required.
In doing so, we remove the tasklet kick, introduced with the completion
checks in commit 35f3fd8182 ("drm/i915/execlists: Workaround switching
back to a completed context"), if we find the request was completed but
have not yet seen the corresponding CS event. This was devolving into a
busy spin of the tasklet while we waited for the event as the delivery
was not as instantaneous as expected. Under load this is sufficient to
exhaust the tasklet softirq timeslice, and force ksoftirqd. Quite
noticeable overhead for no apparent improvement in latency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-2-chris@chris-wilson.co.uk
Since the introduction of preempt-to-busy, requests can complete in the
background, even while they are not on the engine->active.requests list.
As such, the engine->active.request list itself is not in strict
retirement order, and we have to scan the entire list while unwinding to
not miss any. However, if the request is completed we currently leave it
on the list [until retirement], but we could just as simply remove it
and stop treating it as active. We would only have to then traverse it
once while unwinding in quick succession.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-1-chris@chris-wilson.co.uk
Ville noticed that the last mocs entry is used unconditionally by the HW
when it performs cache evictions, and noted that while the value is not
meant to be writable by the driver, we should program it to a reasonable
value nevertheless.
As it turns out, we can change the value of mocs:63 and the value we
were programming into it would cause hard hangs in conjunction with
atomic operations.
v2: Add details from bspec about how it is used by HW
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2707
Fixes: 3bbaba0cea ("drm/i915: Added Programming of the MOCS")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140841.1982-1-chris@chris-wilson.co.uk
Since preempt-to-busy, we may unsubmit a request while it is still on
the HW and completes asynchronously. That means it may be retired and in
the process destroy the virtual engine (as the user has closed their
context), but that engine may still be holding onto the unsubmitted
compelted request. Therefore we need to potentially cleanup the old
request on destroying the virtual engine. We also have to keep the
virtual_engine alive until after the sibling's execlists_dequeue() have
finished peeking into the virtual engines, for which we serialise with
RCU.
v2: Be paranoid and flush the tasklet as well.
v3: And flush the tasklet before the engines, as the tasklet may
re-attach an rb_node after our removal from the siblings.
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-4-chris@chris-wilson.co.uk
(cherry picked from commit 46eecfccb4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We currently want to keep the interrupt enabled until the interrupt after
which we have no more work to do. This heuristic was broken by us
kicking the irq-work on adding a completed request without attaching a
signaler -- hence it appearing to the irq-worker that an interrupt had
fired when we were idle.
Fixes: 2854d86632 ("drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-3-chris@chris-wilson.co.uk
(cherry picked from commit 3aef910d26)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Move the register slow register write and readback from out of the
critical path for execlists submission and delay it until the following
worker, shaving off around 200us. Note that the same signal_irq_work() is
allowed to run concurrently on each CPU (but it will only be queued once,
once running though it can be requeued and reexecuted) so we have to
remember to lock the global interactions as we cannot rely on the
signal_irq_work() itself providing the serialisation (in constrast to a
tasklet).
By pushing the arm/disarm into the central signaling worker we can close
the race for disarming the interrupt (and dropping its associated
GT wakeref) on parking the engine. If we loose the race, that GT wakeref
may be held indefinitely, preventing the machine from sleeping while
the GPU is ostensibly idle.
v2: Move the self-arming parking of the signal_irq_work to a flush of
the irq-work from intel_breadcrumbs_park().
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2271
Fixes: e23005604b ("drm/i915/gt: Hold context/request reference while breadcrumbs are active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-1-chris@chris-wilson.co.uk
(cherry picked from commit 9d5612ca16)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
After having written the entire OA buffer with reports, the HW will
write again at the beginning of the OA buffer. It'll indicate it by
setting the WRAP bits in the OASTATUS register.
When a wrap happens and that at the end of the read vfunc we write the
OASTATUS register back to clear the REPORT_LOST bit, we sometimes see
that the OATAILPTR register is reset to a previous position on Gen8/9
(apparently not the case on Gen11+). This leads the next call to the
read vfunc to process reports we've already read. Because we've marked
those as read by clearing the reason & timestamp dwords, they're
discarded and a "Skipping spurious, invalid OA report" message is
emitted.
The workaround to avoid this OATAILPTR value reset seems to be to set
the wrap bits when writing back OASTATUS.
This change has no impact on userspace, it only avoids a bunch of
DRM_NOTE("Skipping spurious, invalid OA report\n") messages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 19f81df285 ("drm/i915/perf: Add OA unit support for Gen 8+")
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117130124.829979-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 059a0beb48)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Get rid of the __call_single_node union and clean up the API a little
to avoid external code relying on the structure layout as much.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
CT event handler is called under the gt->irq_lock from the interrupt
handling paths so make it the same from the init path. I don't think this
mismatch caused any functional issue but we need to wean the code of the
global i915->irq_lock.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120095636.1987395-2-tvrtko.ursulin@linux.intel.com
Guc->mmio_msg is set under the guc->irq_lock in guc_get_mmio_msg so it
should be consumed under the same lock from guc_handle_mmio_msg.
I am not sure if the overall flow here makes complete sense but at least
the correct lock is now used.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120095636.1987395-1-tvrtko.ursulin@linux.intel.com
Since preempt-to-busy, we may unsubmit a request while it is still on
the HW and completes asynchronously. That means it may be retired and in
the process destroy the virtual engine (as the user has closed their
context), but that engine may still be holding onto the unsubmitted
compelted request. Therefore we need to potentially cleanup the old
request on destroying the virtual engine. We also have to keep the
virtual_engine alive until after the sibling's execlists_dequeue() have
finished peeking into the virtual engines, for which we serialise with
RCU.
v2: Be paranoid and flush the tasklet as well.
v3: And flush the tasklet before the engines, as the tasklet may
re-attach an rb_node after our removal from the siblings.
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-4-chris@chris-wilson.co.uk
We currently want to keep the interrupt enabled until the interrupt after
which we have no more work to do. This heuristic was broken by us
kicking the irq-work on adding a completed request without attaching a
signaler -- hence it appearing to the irq-worker that an interrupt had
fired when we were idle.
Fixes: 2854d86632 ("drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-3-chris@chris-wilson.co.uk
Move the register slow register write and readback from out of the
critical path for execlists submission and delay it until the following
worker, shaving off around 200us. Note that the same signal_irq_work() is
allowed to run concurrently on each CPU (but it will only be queued once,
once running though it can be requeued and reexecuted) so we have to
remember to lock the global interactions as we cannot rely on the
signal_irq_work() itself providing the serialisation (in constrast to a
tasklet).
By pushing the arm/disarm into the central signaling worker we can close
the race for disarming the interrupt (and dropping its associated
GT wakeref) on parking the engine. If we loose the race, that GT wakeref
may be held indefinitely, preventing the machine from sleeping while
the GPU is ostensibly idle.
v2: Move the self-arming parking of the signal_irq_work to a flush of
the irq-work from intel_breadcrumbs_park().
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2271
Fixes: e23005604b ("drm/i915/gt: Hold context/request reference while breadcrumbs are active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-1-chris@chris-wilson.co.uk
The current interface of intel_gvt_register_hypervisor() expects a
non-const pointer to struct intel_gvt_mpt, even though the mediator
never modifies (or should modifiy) the content of this struct.
Change the function signature and relevant struct members to const to
properly express the API's intent and allow instances of intel_gvt_mpt
to be allocated as const.
While I was here, I also made KVM's instance of this struct const to
reduce the number of writable function pointers in the kernel.
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: intel-gvt-dev@lists.freedesktop.org
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201111172811.558443-1-julian.stecklina@cyberus-technology.de
The old IPS interface did not match the RPS interface that we tried to
plug it into (bool vs int return). Once repaired, our minimal
selftesting is finally happy!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201121190352.15996-1-chris@chris-wilson.co.uk
If we run out of ring space, or exceed the desired runtime, we wish to
stop the subtest. Put these checks together, so that we always keep the
requests flushed on completion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120140314.24749-3-chris@chris-wilson.co.uk
We print out the "logical" context support before we discover whether or
not the engines have logical contexts. No one, except Tvrtko, seems to
have noticed the error, so the debug message must not be useful to
anyone.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120140314.24749-1-chris@chris-wilson.co.uk
For DG1 we have a little of mix up wrt to DDI/port names and indexes.
Bspec refers to the ports as DDIA, DDIB, DDI USBC1 and DDI USBC2
(besides the DDIA, DDIB, DDIC, DDID), but the previous naming is the
most unambiguous one. This means that for any register on Display Engine
we should use the index of A, B, D and E. However in some places this is
not true:
- VBT: uses C and D and have to be mapped to D/E
- IO/Combo: uses C and D, but we already differentiate those when
we created the phy vs port distinction.
This additional mapping for VBT and phy are already covered in previous
patches, so now we can initialize all the DDIs as A, B, D and E.
v2: Squash previous patch enabling just ports A and B since most of the
pumbling code is already merged now
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117084836.2318234-1-lucas.demarchi@intel.com
Include the signalers each request in the timeline is waiting on, as a
means to try and identify the cause of a stall. This can be quite
verbose, even as for now we only show each request in the timeline and
its immediate antecedents.
This generates output like:
Timeline 886: { count 1, ready: 0, inflight: 0, seqno: { current: 664, last: 666 }, engine: rcs0 }
U 886:29a- prio=0 @ 134ms: gem_exec_parall<4621>
U bc1:27a- prio=0 @ 134ms: gem_exec_parall[4917]
Timeline 825: { count 1, ready: 0, inflight: 0, seqno: { current: 802, last: 804 }, engine: vcs0 }
U 825:324 prio=0 @ 107ms: gem_exec_parall<4518>
U b75:140- prio=0 @ 110ms: gem_exec_parall<5486>
Timeline b46: { count 1, ready: 0, inflight: 0, seqno: { current: 782, last: 784 }, engine: vcs0 }
U b46:310- prio=0 @ 70ms: gem_exec_parall<5428>
U c11:170- prio=0 @ 70ms: gem_exec_parall[5501]
Timeline 96b: { count 1, ready: 0, inflight: 0, seqno: { current: 632, last: 634 }, engine: vcs0 }
U 96b:27a- prio=0 @ 67ms: gem_exec_parall<4878>
U b75:19e- prio=0 @ 67ms: gem_exec_parall<5486>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-6-chris@chris-wilson.co.uk
Include the active timelines for debugfs/i915_engine_info, so that we
can see which have unready requests inflight which are not shown
otherwise.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-4-chris@chris-wilson.co.uk
We plan to expand upon the number of available statuses for when we
pretty-print the requests along the timelines, and so need a new set of
flags. We have settled upon:
Unready [U]
- initial status after being submitted, the request is not
ready for execution as it is waiting for external fences
Ready [R]
- all fences the request was waiting on have been signaled,
and the request is now ready for execution and will be
in a backend queue
- a ready request may still need to wait on semaphores
[internal fences]
Ready/virtual [V]
- same as ready, but queued over multiple backends
Executing [E]
- the request has been transferred from the backend queue and
submitted for execution on HW
- a completed request may still be regarded as executing, its
status may not be updated until it is retired and removed
from the lists
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-3-chris@chris-wilson.co.uk
Extract i915_request_show for reuse in other request chain pretty
printers.
For a bonus point, quietly change the seqno format from %llx to %lld to
match everywhere else.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-2-chris@chris-wilson.co.uk
Forcing mocs:1 [used for our winsys follows-pte mode] to be cached
caused display glitches. Though it is documented as deprecated (and so
likely behaves as uncached) use the follow-pte bit and force it out of
L3 cache.
Testcase: igt/kms_frontbuffer_tracking
Testcase: igt/kms_big_fb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-4-chris@chris-wilson.co.uk
(cherry picked from commit a04ac82736)
Fixes: 849c0fe9e8 ("drm/i915/gt: Initialize reserved and unspecified MOCS indices")
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo: Updated Fixes tag]
After having written the entire OA buffer with reports, the HW will
write again at the beginning of the OA buffer. It'll indicate it by
setting the WRAP bits in the OASTATUS register.
When a wrap happens and that at the end of the read vfunc we write the
OASTATUS register back to clear the REPORT_LOST bit, we sometimes see
that the OATAILPTR register is reset to a previous position on Gen8/9
(apparently not the case on Gen11+). This leads the next call to the
read vfunc to process reports we've already read. Because we've marked
those as read by clearing the reason & timestamp dwords, they're
discarded and a "Skipping spurious, invalid OA report" message is
emitted.
The workaround to avoid this OATAILPTR value reset seems to be to set
the wrap bits when writing back OASTATUS.
This change has no impact on userspace, it only avoids a bunch of
DRM_NOTE("Skipping spurious, invalid OA report\n") messages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 19f81df285 ("drm/i915/perf: Add OA unit support for Gen 8+")
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117130124.829979-1-lionel.g.landwerlin@intel.com
Add the new vma_set_file() function to allow changing
vma->vm_file with the necessary refcount dance.
v2: add more users of this.
v3: add missing EXPORT_SYMBOL, rebase on mmap cleanup,
add comments why we drop the reference on two occasions.
v4: make it clear that changing an anonymous vma is illegal.
v5: move vma_set_file to mm/util.c
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://patchwork.freedesktop.org/patch/399360/
Just a normal comment, not a kerneldoc function description.
drivers/gpu/drm/i915/gvt/handlers.c:1666: warning: Function parameter or member 'vgpu' not described in 'bxt_ppat_low_write'
drivers/gpu/drm/i915/gvt/handlers.c:1666: warning: Function parameter or member 'offset' not described in 'bxt_ppat_low_write'
drivers/gpu/drm/i915/gvt/handlers.c:1666: warning: Function parameter or member 'p_data' not described in 'bxt_ppat_low_write'
drivers/gpu/drm/i915/gvt/handlers.c:1666: warning: Function parameter or member 'bytes' not described in 'bxt_ppat_low_write'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201103204307.15723-1-chris@chris-wilson.co.uk
Since we allocate some breadcrumbs for the virtual engine, and the
virtual engine has a custom destructor, we also need to free the
breadcrumbs after use.
Fixes: b3786b2937 ("drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201118133839.1783-1-chris@chris-wilson.co.uk
(cherry picked from commit 45e50f48b7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
EDID can declare the maximum supported bpc up to 16,
and apparently there are displays that do so. Currently
we assume 12 bpc is tha max. Fix the assumption and
toss in a MISSING_CASE() for any other value we don't
expect to see.
This fixes modesets with a display with EDID max bpc > 12.
Previously any modeset would just silently fail on platforms
that didn't otherwise limit this via the max_bpc property.
In particular we don't add the max_bpc property to HDMI
ports on gmch platforms, and thus we would see the raw
max_bpc coming from the EDID.
I suppose we could already adjust this to also allow 16bpc,
but seeing as no current platform supports that there is
little point.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2632
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110210447.27454-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 2ca5a7b85b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Since we allocate some breadcrumbs for the virtual engine, and the
virtual engine has a custom destructor, we also need to free the
breadcrumbs after use.
Fixes: b3786b2937 ("drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201118133839.1783-1-chris@chris-wilson.co.uk
We can't call drm_plane_state_src() this late for the slave plane since
it would consult the wrong uapi state. We've alreayd done the correct
uapi->hw copy earlier, so let's just preserve the unclipped src/dst
rects using a temp copy across the intel_atomic_plane_check_clipping()
call.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-14-manasi.d.navare@intel.com
Dump debugfs and planar links as well, this will make it easier to debug
when things go wrong.
v4:
* Rebase
Changes since v1:
- Report planar slaves as such, now that we have the plane_state switch.
Changes since v2:
- Rebase on top of the new plane format dumping
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-12-manasi.d.navare@intel.com
We need to look at hw.fb for the framebuffer, and add the translation
for the slave_plane_state. With these changes we set the correct
rectangle on the bigjoiner slave, and don't set incorrect
src/dst/visibility on the slave plane.
v2:
* Manual rebase (Manasi)
v3:
* hw.rotation instead of uapi.rotation (Ville)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-11-manasi.d.navare@intel.com
When using bigjoiner userspace is only controlling the "master"
plane, so use its uapi state for the "slave" plane as well.
hw.crtc needs a bit of magic since we don't want to copy that from
the uapi state (as it points to the wrong pipe for the "slave
" plane). Instead we pass the right crtc in explicitly but only
assign it when the uapi state indicates the plane to be logically
enabled (ie. uapi.crtc != NULL).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-10-manasi.d.navare@intel.com
Make sure both the bigjoiner "master" and "slave" plane are
in the state whenever either of them is in the state.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-9-manasi.d.navare@intel.com
Skip iterating over bigjoiner slaves, only the master has the state we
care about.
Add the width of the bigjoiner slave to the reconstructed fb.
Hide the bigjoiner slave to userspace, and double the mode on bigjoiner
master.
And last, disable bigjoiner slave from primary if reconstruction fails.
v3:
* Fix the ddi_get_config slave error (Ankit Nautiyal)
v2:
* Unsupported bigjoiner config for initial fb (Ville)
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala:
* Don't do any hw->uapi state copy for bigjoiner slave
* We still have hw.mode so no need to pass it in
* Appease checkpatch]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-7-manasi.d.navare@intel.com
Enabling is done in a special sequence and so should plane updates
be. Ideally the end user never notices the second pipe is used.
This way ideally everything will be tear free, and updates are
really atomic as userspace expects it.
This uses generic modeset_enables() calls like trans port sync
but still has special handling for disable since for slave we
should not disable things like encoder, plls that are not enabled
for slave.
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala: Appease checkpatch]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-6-manasi.d.navare@intel.com
Make vdsc work when no output is enabled. The big joiner needs VDSC
on the slave, so enable it and set the appropriate bits.
So remove encoder usage from dsc functions.
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-5-manasi.d.navare@intel.com
When the clock is higher than the dotclock, try with 2 pipes enabled.
If we can enable 2, then we will go into big joiner mode, and steal
the adjacent crtc.
This only links the crtc's in software, no hardware or plane
programming is done yet. Blobs are also copied from the master's
crtc_state, so it doesn't depend at commit time on the other
crtc_state.
v6:
* Enable dSC for any mode->hdisplay > 5120
v5:
* Remove intel_dp_max_dotclock (Manasi)
v4:
* Fixes in intel_crtc_compute_config (Ville)
v3:
* Manual Rebase (Manasi)
Changes since v1:
- Rename pipe timings to transcoder timings, as they are now different.
Changes since v2:
- Rework bigjoiner checks; always disable slave when recalculating
master. No need to have a separate bigjoiner pass any more.
- Use pipe_mode instead of transcoder_mode, to clean up the code.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala:
* hskew isn't a thing
* Do the dsc compute if bigjoiner is enabled, not the other way around]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-4-manasi.d.navare@intel.com
Small changes to intel_dp_mode_valid(), allow listing modes that
can only be supported in the bigjoiner configuration, which is
not supported yet.
v13:
* Allow bigjoiner if hdisplay >5120
v12:
* slice_count logic simplify (Ville)
* Fix unnecessary changes in downstream_mode_valid (Ville)
v11:
* Make intel_dp_can_bigjoiner non static
so it can be used in intel_display (Manasi)
v10:
* Simplify logic (Ville)
* Allow bigjoiner on edp (Ville)
v9:
* Restric Bigjoiner on PORT A (Ville)
v8:
* use source dotclock for max dotclock (Manasi)
v7:
* Add can_bigjoiner() helper (Ville)
* Pass bigjoiner to plane_size validation (Ville)
v6:
* Rebase after dp_downstream mode valid changes (Manasi)
v5:
* Increase max plane width to support 8K with bigjoiner (Maarten)
v4:
* Rebase (Manasi)
Changes since v1:
- Disallow bigjoiner on eDP.
Changes since v2:
- Rename intel_dp_downstream_max_dotclock to intel_dp_max_dotclock,
and split off the downstream and source checking to its own function.
(Ville)
v3:
* Rebase (Manasi)
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[vsyrjala:
* Keep bigjoiner disabled until everything is ready
* Appease checkpatch]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-3-manasi.d.navare@intel.com
When doing the plane state copy from the UV plane to the Y plane
let's just copy the hw state directly instead of using the original
uapi state. The UV plane has already had its uapi state copied into
its hw state, so this extra detour via the uapi state for the Y plane
is pointless.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-2-manasi.d.navare@intel.com
I totally fumbled the ?: usage when generating the DDI encoder
names. Reverse the things that need reversing, and to make it
a bit less messy add a few macros to hide the arithmetic on the
port enums.
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Fixes: 2d709a5a62 ("drm/i915: Give DDI encoders even better names")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117154028.8516-1-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
The presumption was that some time would always elapse between recording
the start and the finish of a context switch. This turns out to be a
regular occurrence and emitting a debug statement superfluous.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117113103.21480-4-chris@chris-wilson.co.uk
The WA specifies that we need to toggle a SDE chicken bit on and then
off as the final step in preparation for s0ix entry.
Bspec: 33450
Bspec: 8402
However, something is happening after we toggle the bit that causes
the WA to be invalidated. This makes dispcnlunit1_cp_xosc_clkreq
active being already in s0ix state i.e SLP_S0 counter incremented.
Tweaking the Wa_14010685332 by setting the bit on suspend and clearing
it on resume turns down the dispcnlunit1_cp_xosc_clkreq.
B.Spec has Documented this tweaked sequence of WA as an alternative.
Let keep this tweaked WA for Gen11 platforms and keep untweaked WA for
other platforms which never observed this issue.
v2 (MattR):
- Change the comment on the workaround to give PCH names rather than
platform names. Although the bspec is setup to list workarounds by
platform, the hardware team has confirmed that the actual issue being
worked around here is something that was introduced back in the
Cannon Lake PCH and carried forward to subsequent PCH's.
- Extend the untweaked version of the workaround to include PCH_CNP as
well. Note that since PCH_CNP is used to represent CMP, this will
apply on CML and some variants of RKL too.
- Cap the untweaked version of the workaround so that it won't apply to
"fake" PCH's (i.e., DG1). The issue we're working around really is
an issue in the PCH itself, not the South Display, so it shouldn't
apply when there isn't a real PCH.
v3:
- use intel_de_rmw(). [Rodrigo]
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110121700.4338-1-anshuman.gupta@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
EDID can declare the maximum supported bpc up to 16,
and apparently there are displays that do so. Currently
we assume 12 bpc is tha max. Fix the assumption and
toss in a MISSING_CASE() for any other value we don't
expect to see.
This fixes modesets with a display with EDID max bpc > 12.
Previously any modeset would just silently fail on platforms
that didn't otherwise limit this via the max_bpc property.
In particular we don't add the max_bpc property to HDMI
ports on gmch platforms, and thus we would see the raw
max_bpc coming from the EDID.
I suppose we could already adjust this to also allow 16bpc,
but seeing as no current platform supports that there is
little point.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2632
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110210447.27454-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Just like for rkl and tgl, this should be permanent as well for dg1
instead just for A0. The commit making it permanent for those platforms
ended up "racing" with the commit adding the DG1 WAs, so now fix that up.
v2: Add "tgl,dg1" to WA comment (Matt)
Cc: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027043228.696518-3-lucas.demarchi@intel.com
If intel context create failed, the perf_request_latency() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: 25c26f18ea ("drm/i915/selftests: Measure dispatch latency")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116143540.3648870-1-zhangxiaoxu5@huawei.com
(cherry picked from commit 1938445205)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If intel context create failed, the perf_series_engines() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: cbfd3a0c5a ("drm/i915/selftests: Add request throughput measurement to perf")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116144112.3673011-1-zhangxiaoxu5@huawei.com
(cherry picked from commit 01d708840c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
I forgot to free the old list when growing past 16 entries.
Luckily, as much as I checked, none of the current platforms has more than
16 workarounds on a single list.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 452420d22d ("drm/i915: Fuse per-context workaround handling with the common framework")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201113132510.2298483-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 77c296966e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Some media power gates are disabled by default. commit 5d86923060
("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
tried to enable it, but it duplicated an existent register.
So, the main PG setup sequences ended up overwriting it.
So, let's now merge this to the main PG setup sequence.
v2: (Chris): s/BIT/REG_BIT, remove useless comment,
remove useless =0, use the right gt,
remove rc6 sequence doubt from commit message.
Fixes: 5d86923060 ("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: stable@vger.kernel.org#v5.5+
Cc: Dale B Stimson <dale.b.stimson@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201111072859.1186070-1-rodrigo.vivi@intel.com
(cherry picked from commit 695dc55b57)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently the DPLL .get_freq() uses pll->state.hw_state which
is not the thing we actually read out (except during driver
load/resume). Outside of that pll->state.hw_state is just the
thing we committed last time around. During state check we
just read the thing into crtc_state->dpll_hw_state, so that
is what we should use for calculating the DPLL output frequency.
I think we used to do this so that the results of the readout
were actually used, but somehow it got changed when the
.get_freq() refactoring happened.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201109231239.17002-3-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
On icl+ we want to populate both crtc_state.{shared_dpll,dpll_hw_state}
and crtc_state.port_dplls[] during readout, whereas on pre-icl we
want to leave the latter stuff untouched. Rather than adding more ifs
into hsw_get_ddi_port_state() to copy the DPLL hw state around let's
just move the whole dpll readout into hsw_get_ddi_dpll() & co.
Slightly repetitive, but meh.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201109231239.17002-2-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Store the relative data rate for planes in the crtc state
so that we don't have to use
intel_atomic_crtc_state_for_each_plane_state() to compute
it even for the planes that are no part of the current state.
Should probably just nuke this stuff entirely an use the normal
plane data rate instead. The two are slightly different since this
relative data rate doesn't factor in the actual pixel clock, so
it's a bit odd thing to even call a "data rate". And since the
watermarks are computed based on the actual data rate anyway
I don't really see what the point of this relative data rate
is. But that's for the future...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-6-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
In order to remove intel_atomic_crtc_state_for_each_plane_state()
from skl_crtc_can_enable_sagv() we can simply precompute whether
each wm level can tolerate the SAGV block time latency or not.
This has the nice side benefit that we remove the duplicated
wm level latency calculation. In fact the copy of that code
we had in skl_crtc_can_enable_sagv() didn't even handle
WaIncreaseLatencyIPCEnabled/Display WA #1141 whereas the copy
in skl_compute_plane_wm() did. So now we just have the one
copy which handles all the w/as.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-5-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
If intel context create failed, the perf_request_latency() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: 25c26f18ea ("drm/i915/selftests: Measure dispatch latency")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116143540.3648870-1-zhangxiaoxu5@huawei.com
If intel context create failed, the perf_series_engines() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: cbfd3a0c5a ("drm/i915/selftests: Add request throughput measurement to perf")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116144112.3673011-1-zhangxiaoxu5@huawei.com
I forgot to free the old list when growing past 16 entries.
Luckily, as much as I checked, none of the current platforms has more than
16 workarounds on a single list.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 452420d22d ("drm/i915: Fuse per-context workaround handling with the common framework")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201113132510.2298483-1-tvrtko.ursulin@linux.intel.com
intel_atomic_crtc_state_for_each_plane_state() peeks at the
plane's current state without holding the plane's mutex, trusting
that the crtc's mutex will protect it. In practice that does work
since our planes can't move between pipes, but it sets a bad
example. intel_atomic_crtc_state_for_each_plane_state() also
relies on crtc_state.uapi.plane_mask which may be full of lies
when it comes to the bigjoiner stuff, so soon we can't use it as
is anyway. So best to just get rid of it entirely. Which we can
easily do by switching to the g4x/vlv "raw" watermark approach.
Later on we should even be able to move the "raw" watermark
computation into the normal .plane_check() code, leaving only
the merging/clamping of the final watermarks to the later
stages. But that will require adjusting the ilk+ wm code
similarly as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-3-ville.syrjala@linux.intel.com
Pass the whole intel_atomic_state to skl_build_pipe_wm()
and skl_allocate_pipe_ddb() so we can start to iterate
stuff containerd in the commit.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-2-ville.syrjala@linux.intel.com
No functional changes here, just adds a from_crtc_state
as a prep for bigjoiner
v2:
* More prep with intel_atomic_state (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201113155656.17630-2-manasi.d.navare@intel.com
No functional changes, to align with previous cleanups pass
intel_atomic_state instead of drm_atomic_state.
Also pass this intel_atomic_state with crtc_state to
some of the atomic_check functions.
v2:
* Squash some changes from next patch (Ville)
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201113155656.17630-1-manasi.d.navare@intel.com
With bigjoiner, there will be 2 pipes driving 2 halves of 1 transcoder,
because of this, we need a pipe_mode for various calculations, including
for example watermarks, plane clipping, etc.
v10:
* remove redundant pipe_mode assignment (Ville)
v9:
* pipe_mode in state dump nd state check (Ville)
v8:
* Add pipe_mode in readout in verify_crtc_state (Ville)
v7:
* Remove redundant comment (Ville)
* Just keep mode instead of pipe_mode (Ville)
v6:
* renaming in separate function, only pipe_mode here (Ville)
* Add description (Maarten)
v5:
* Rebase (Manasi)
v4:
* Manual rebase (Manasi)
v3:
* Change state to crtc_state, fix rebase err (Manasi)
v2:
* Manual Rebase (Manasi)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
[vsyrjala:
* Fix state checker
* Fix state dump
* Use pipe_mode for linetime watermarks
* Make sure pipe_mode normal timings are correct since the
silly ddb code uses them
* Drop the redundant pipe_mode copies from intel_modeset_pipe_config()
and intel_crtc_copy_uapi_to_hw_state()
* Use drm_mode_copy() all over]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-7-ville.syrjala@linux.intel.com
Collect up a bunch of derived state "readout" into
a common helper, which we can call from both
intel_encoder_get_config() and intel_crtc_get_pipe_config().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-6-ville.syrjala@linux.intel.com
Generalize intel_mode_from_pipe_config() to work on any two
arbitrary modes. Also relocate the code for the future, and
make it static since it's not needed elsewhere.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-5-ville.syrjala@linux.intel.com
No reason to make the callers of intel_crtc_get_pipe_config()
populate hw.active. Let's do it in intel_crtc_get_pipe_config()
itself. hw.enable we leave up to the callers since it's slightly
different for readout vs. state check.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-4-ville.syrjala@linux.intel.com
Create a new function intel_crtc_get_pipe_config()
that calls platform specific hooks for get_pipe_config()
No functional change here.
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala: Conform to modern i915 coding style, fix patch subject]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-3-ville.syrjala@linux.intel.com
No functional changes, create a separate intel_encoder_get_config()
function that calls encoder->get_config hook.
This is needed so that later we can add beigjoienr related
readout here.
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala: Move the code around for the future]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-2-ville.syrjala@linux.intel.com
Cross-subsystem Changes:
- DMA mapped scatterlist fixes in i915 to unblock merging of
https://lkml.org/lkml/2020/9/27/70 (Tvrtko, Tom)
Driver Changes:
- Fix for user reported issue #2381 (Graphical output stops with "switching to inteldrmfb from simple"):
Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init (Ville, Chris)
- Fix for Tigerlake (and earlier) to avoid spurious empty CSB events leading to hang (Chris, Bruce)
- Delay execlist processing for Tigerlake to avoid hang (Chris)
- Fix for Tigerlake RCS engine health check through heartbeat (Chris)
- Fix for Tigerlake reserved MOCS entries (Ayaz, Chris)
- Fix Media power gate sequence on Tigerlake (Rodrigo)
- Enable eLLC caching of display buffers for SKL+ (Ville)
- Support parsing of oversize batches on Gen9 (Matt, Chris)
- Exclude low pages (128KiB) of stolen from use to avoid thrashing during reset (Chris)
- Flush engines before Tigerlake breadcrumbs (Chris)
- Use the local HWSP offset during submission (Chris)
- Flush coherency domains on first set-domain-ioctl (Chris, Zbigniew)
- Use the active reference on the vma while capturing to avoid use-after-free (Chris)
- Fix MOCS PTE setting for gen9+ (Ville)
- Avoid NULL dereference on IPS driver callback while unbinding i915 (Chris)
- Avoid NULL dereference from PT/PD stash allocation error (Matt)
- Hold request reference for canceling an active context (Chris)
- Avoid infinite loop on x86-32 when mapping a lot of objects (Chris)
- Disallow WC mappings when processor doesn't support them (Chris)
- Return correct error in i915_gem_object_copy_blt() error path (Dan)
- Return correct error in intel_context_create_request() error path (Maarten)
- Tune down GuC communication enabled/disabled messages to debug (Jani)
- Fix rebased commit "Remove i915_request.lock requirement for execution callbacks" (Chris)
- Cancel outstanding work after disabling heartbeats on an engine (Chris)
- Signal cancelled requests (Chris)
- Retire cancelled requests on unload (Chris)
- Scrub HW state on driver remove (Chris)
- Undo forced context restores after trivial preemptions (Chris)
- Handle PCI unbind in PMU code (Tvrtko)
- Fix CPU hotplug with multiple GPUs in PMU code (Trtkko)
- Correctly set SFC capability for video engines (Venkata)
- Update GuC code to use firmware v49.0.1 (John, Matthew B., Daniele, Oscar, Michel, Rodrigo, Michal)
- Improve GuC warnings on loading failure (John)
- Avoid ownership race in buffer pool by clearing age (Chris)
- Use MMIO to read CSB in case of failure (Chris, Mika)
- Show engine properties in engine state dump to indicate changes (Chris, Joonas)
- Break up error capture compression loops with cond_resched() (Chris)
- Reduce GPU error capture mutex hold time to avoid khungtaskd (Chris)
- Serialise debugfs i915_gem_objects with ctx->mutex (Chris)
- Always test execution status on closing the context and close if not persistent (Chris)
- Avoid mixing integer types during batch copies (Chris, Jared)
- Skip over MI_NOOP when parsing to avoid overhead (Chris)
- Hold onto an explicit ref to i915_vma_work.pinned (Chris)
- Perform all asynchronous waits prior to marking payload start (Chris)
- Pull phys pread/pwrite implementations to the backend (Matt)
- Improve record of hung engines in error state (Tvrtko)
- Allow backends to override pread implementation (Matt)
- Reinforce LRC poisoning checks to confirm context survives execution (Chris)
- Fix memory region max size calculation (Matt)
- Fix order when adding blocks to memory region (Matt)
- Eliminate unused intel_virtual_engine_get_sibling func (Chris)
- Cleanup kasan warning for on-stack (unsigned long) casting (Chris)
- Onion unwind for scratch page allocation failure (Chris)
- Poison stolen pages before use (Chris)
- Selftest improvements (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112163407.GA20320@jlahtine-mobl.ger.corp.intel.com
When we fail to take the module reference, we go to the 'undo*' branch and
return. But the returned variable 'ret' has been set as zero by the
above code. Change 'ret' to '-ENODEV' in this situation.
Fixes: 9bdb073464 ("drm/i915/gvt: Change KVMGT as self load module")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1605187352-51761-1-git-send-email-wangxiongfeng2@huawei.com
Move the specialised interactions with the physical GEM object from the
pread/pwrite ioctl handler into the phys backend.
Currently, if one is able to exhaust the entire aperture and then try to
pwrite into an object not backed by struct page, we accidentally invoked
the phys pwrite handler on a non-phys object; calamitous.
Fixes: c6790dc223 ("drm/i915: Wean off drm_pci_alloc/drm_pci_free")
Testcase: igt/gem_pwrite/exhaustion
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201105154934.16022-2-chris@chris-wilson.co.uk
(cherry picked from commit 852e1b3644)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As there are more and more complicated interactions between the different
backing stores and userspace, push the control into the backends rather
than accumulate them all inside the ioctl handlers.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105154934.16022-1-chris@chris-wilson.co.uk
(cherry picked from commit 0049b68845)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
No functional changes. This patch just moves some mode checks
around to prepare for adding bigjoiner related mode validation
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112023954.12301-1-manasi.d.navare@intel.com
Just following what we do in many other places, DG1 is a exception so
move it to the top instead of add it inside of INTEL_GEN(dev_priv) >= 12.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201111162408.98002-2-jose.souza@intel.com
DC9 has a separate HW flow from the rest of the DC states and it is
available in GEN9 LP platforms and on GEN11 and newer, so here
moving the assignment of the mask to a single conditional block to
simplifly code.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201111162408.98002-1-jose.souza@intel.com
Specification says the bit7 of the DPCD MAX_LANE_COUNT (offset 0x02) must
be set to 1 when comes to the displayport version 1.2. This patch respects
the definition.
W/o this patch, guest i915 driver can only set the resolution to 1024*768,
and complains about the unsuccessful link training:
[ 5.692193] i915 0000:00:02.0: [drm] *ERROR* index 0, lane_count 1 Link Training Unsuccessful
Fixes: e2e02cbb5b ("drm/i915/gvt: make dpcd_fix_data supports DP1.2")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200921065807.247847-1-tina.zhang@intel.com
Now that hpd/display related calls are split from the rest in
intel_irq_init(), skip all of that in case we don't have display.
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-8-lucas.demarchi@intel.com
!HAS_DISPLAY() implies !HAS_OVERLAY(), skipping overlay setup anyway, so
return earlier from intel_modeset_init() for clarity.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-4-lucas.demarchi@intel.com
Display is always disabled and enabled when resetting any engine, but if
there is no display it should not do anything with display and only
reset the needed engines.
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-3-lucas.demarchi@intel.com
Some media power gates are disabled by default. commit 5d86923060
("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
tried to enable it, but it duplicated an existent register.
So, the main PG setup sequences ended up overwriting it.
So, let's now merge this to the main PG setup sequence.
v2: (Chris): s/BIT/REG_BIT, remove useless comment,
remove useless =0, use the right gt,
remove rc6 sequence doubt from commit message.
Fixes: 5d86923060 ("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: stable@vger.kernel.org#v5.5+
Cc: Dale B Stimson <dale.b.stimson@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201111072859.1186070-1-rodrigo.vivi@intel.com
idr_init() uses base 0 which is an invalid identifier. The new function
idr_init_base allows IDR to set the ID lookup from base 1. This avoids
all lookups that otherwise starts from 0 since 0 is always unused.
References: commit 6ce711f275 ("idr: Make 1-based IDRs more efficient")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201104121532.GA48202@localhost
Reduce this maintenance nightmare a bit by converting the plane
min/max width/height stuff into vfuncs.
Now, if I could just think of a nice way to also use this for
intel_mode_valid_max_plane_size()...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924185113.30849-1-ville.syrjala@linux.intel.com
Reviewed-by: Aditya Swarup <aditya.swarup@intel.com>
We need commit f8f6ae5d07 ("mm: always have io_remap_pfn_range() set
pgprot_decrypted()") to be able to merge Jason's cleanup patch.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl+oiOgeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGKBQIAJw6oad/FA7j9OO2
dMoaXb8UaBehGWgW2rdfWrFPV5v0DBnp/GkdRpLoZIjV3W4mBfnog7bIa4Eswlxo
Y8sZxo5/3JlgJQUkHvzR1TYk5z61lHkUw9Kj/cCyx6YdbjSl19AfFsnhQVVMuyp9
TXL2c7hxkHlw8eBGrymVu0Ip7Zq0x8O2g+8nQpmRcvaR6SBuSHdikDF/iWCtU1YW
wpk5eWEVaAO67keZOz6b+aCFHqjFX+1dUBBuPnslucYLR73Qi16hfaU9pebe97Gb
lX/MJ1bR9BeRp314cF0PYbm4WhKjRLudHOFJH8x3dj/BiYNrFK3SJGLiiTwsrAZ8
kytU0Xs=
=Ke/D
-----END PGP SIGNATURE-----
Merge v5.10-rc3 into drm-next
We need commit f8f6ae5d07 ("mm: always have io_remap_pfn_range() set
pgprot_decrypted()") to be able to merge Jason's cleanup patch.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some disply regs are not setup correctly during HPD for BXT/APL thus
vfio_edid still not working. Temporarily disable the vfio_edid dynamic
update until issue fixed.
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201109073939.758302-1-colin.xu@intel.com
Program display related vregs to proper value at initialization, setup
virtual monitor and hotplug.
vGPU virtual display vregs inherit the value from pregs. The virtual DP
monitor is always setup on PORT_B for BXT/APL. However the host may
connect monitor on other PORT or without any monitor connected. Without
properly setup PIPE/DDI/PLL related vregs, guest driver may not setup
the virutal display as expected, and the guest desktop may not be
created.
Since only one virtual display is supported, enable PIPE_A only. And
enable transcoder/DDI/PLL based on which port is setup for BXT/APL.
V2:
Revise commit message.
V3:
set_edid should on PORT_B for BXT.
Inject hpd event for BXT.
V4:
Temporarily disable vfio edid on BXT/APL until issue fixed.
V5:
Rebase to use new HPD define GEN8_DE_PORT_HOTPLUG for BXT.
Put vfio edid disabling on BXT/APL to a separate patch.
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201109073922.757759-1-colin.xu@intel.com
This patch add gvt resume wrapper into i915_drm_resume().
GVT relies on i915 so resume gvt at last.
V2:
- Direct call into gvt suspend/resume wrapper in intel_gvt.h/intel_gvt.c.
The wrapper and implementation will check and call gvt routine. (zhenyu)
V3:
Refresh.
V4:
Rebase.
V5:
Fail intel_gvt_suspend() if fail to save GGTT.
V6:
Save host entry to per-vGPU gtt.ggtt_mm on each host_entry update so
only need the resume routine.
V7:
Refresh.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201027045406.159566-1-colin.xu@intel.com
This patch save/restore necessary GVT info during i915 suspend/resume so
that GVT enabled QEMU VM can continue running.
Only GGTT and fence regs are saved/restored now. GVT will save GGTT
entries on each host_entry update, restore the saved dirty entries
and re-init fence regs in resume routine.
V2:
- Change kzalloc/kfree to vzalloc/vfree since the space allocated
from kmalloc may not enough for all saved GGTT entries.
- Keep gvt suspend/resume wrapper in intel_gvt.h/intel_gvt.c and
move the actual implementation to gvt.h/gvt.c. (zhenyu)
- Check gvt config on and active with intel_gvt_active(). (zhenyu)
V3: (zhenyu)
- Incorrect copy length. Should be num entries * entry size.
- Use memcpy_toio()/memcpy_fromio() instead of memcpy for iomem.
- Add F_PM_SAVE flags to indicate which MMIOs to save/restore for PM.
V4:
Rebase.
V5:
Fail intel_gvt_save_ggtt as -ENOMEM if fail to alloc memory to save
ggtt. Free allocated ggtt_entries on failure.
V6:
Save host entry to per-vGPU gtt.ggtt_mm on each host_entry update.
V7:
Restore GGTT entry based on present bit.
Split fence restore and mmio restore in different functions.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201027045308.158955-1-colin.xu@intel.com
JSL has update in vswing table for eDP.
BSpec: 21257
Changes since V1:
- Fixed few checkpatch errors
Cc: Souza Jose <jose.souza@intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020053657.99890-1-tejaskumarx.surendrakumar.upadhyay@intel.com
DG1 uses 2 registers for the ddi clock mapping, with PHY A and B using
DPCLKA_CFGCR0 and PHY C and D using DPCLKA1_CFGCR0. Hide this behind a
single macro that chooses the correct register according to the phy
being accessed, use the correct bitfields for each pll/phy and implement
separate functions for DG1 since it doesn't share much with ICL/TGL
anymore.
The previous values were correct for PHY A and B since they were using
the same register as before and the bitfields were matching.
v2: Add comment and try to simplify DG1_DPCLKA* macros by reusing
previous ones
v3:
- Fix DG1_DPCLKA_CFGCR0_DDI_CLK_SEL_MASK() after wrong macro reuse
- Move phy -> id map to a separate macro (Aditya)
- Remove DG1_DPCLKA_CFGCR0_DDI_CLK_SEL_MASK where not required
(Aditya)
- Use drm_WARN_ON
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Aditya Swarup <aditya.swarup@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106210006.837953-1-lucas.demarchi@intel.com
When performing an allocation we try split it down into the largest
possible power-of-two blocks/pages-sizes, and for the common case we
expect to allocate the blocks in descending order. This also naturally
fits with our GTT alignment tricks(including the hugepages selftest),
where we sometimes try to align to the largest possible GTT page-size
for the allocation, in the hope that translates to bigger GTT
page-sizes. Currently, we seem to incorrectly add the blocks in the
opposite order, which is definitely not the intended behaviour.
Reported-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201109111249.109365-1-matthew.auld@intel.com
Instead of printing out the internal engine mask, which can change between
kernel versions making it difficult to map to actual engines, present a
bitmask of hanging engines ABI classes. For example:
[drm] GPU HANG: ecode 9:8:24dffffd, in gem_exec_schedu [1334]
Engine ABI class is useful to quickly categorize render vs media etc hangs
in bug reports. Considering virtual engine even more so than the current
scheme.
v2:
* Do not re-order fields. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105113842.1395391-1-tvrtko.ursulin@linux.intel.com
Between events which trigger engine and GPU resets and capturing the error
state we lose information on which engine triggered the reset. Improve
this by passing in the hung engine mask down to error capture.
Result is that the list of engines in user visible "GPU HANG: ecode
<gen>:<engines>:<ecode>, <process>" is now a list of hanging and not just
active engines. Most importantly the displayed process is now the one
which was actually hung.
v2:
* Stub prototype. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104134743.916027-1-tvrtko.ursulin@linux.intel.com
SFC capability of video engines is not set correctly because i915
is testing for incorrect bits.
Fixes: c5d3e39caa ("drm/i915: Engine discovery query")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniele.ceraolospurio@intel.com
Only the following drivers aren't converted:
- amdgpu, because of the driver_feature mangling due to virt support.
Subsequent patch will address this.
- nouveau, because DRIVER_ATOMIC uapi is still not the default on the
platforms where it's supported (i.e. again driver_feature mangling)
- vc4, again because of driver_feature mangling
- qxl, because the ioctl table is somewhere else and moving that is
maybe a bit too much, hence the num_ioctls assignment prevents a
const driver structure.
- arcpgu, because that is stuck behind a pending tiny-fication series
from me.
- legacy drivers, because legacy requires non-const drm_driver.
Note that for armada I also went ahead and made the ioctl array const.
Only cc'ing the driver people who've not been converted (everyone else
is way too much).
v2: Fix one misplaced const static, should be static const (0day)
v3:
- Improve commit message (Sam)
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: kernel test robot <lkp@intel.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104100425.1922351-5-daniel.vetter@ffwll.ch
UAPI Changes:
Cross-subsystem Changes:
- arch/arm64: Describe G12b GPU as coherent
- iommu: Support coherency for Mali LPAE
Core Changes:
- atomic: Pass full state to CRTC atomic_{check, begin, flush}(); Use
atomic-state pointers
- drm: Remove SCATTER_LIST_MAX_SEGMENT; Cleanups
- doc: Document legacy_cursor_update better; cleanups
- edid: Don't warn n EDIDs of zero
- ttm: New backend allocation pool; Remove old page allocator; Rework
no_retry handling; Replace flags with booleans in struct ttm_operation_ctx
- vram-helper: Cleanups
- fbdev: Cleanups
- console: Store font size as unsigned value
Driver Changes:
- ast: Support new display mode
- amdgpu: Switch to new TTM allocator
- hisilicon: Cleanups
- nouveau: Switch to new TTM allocator; Fix include of swiotbl.h and
limits.h; Use state helper instead of CRTC state pointer
- panfrost: Support cache-coherent integrations; Fix mutex corruption on
open/close; Cleanupse
- qxl: Cleanups
- radeon: Switch to new TTM allocator
- ticdc: Fix build failure
- vmwgfx: Switch to new TTM allocator
- xlnx: Use dma_request_chan
- fbdev/sh_mobile: Cleanups
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAl+j0DcACgkQaA3BHVML
eiNdAQf/fadZLQeKHuUYTlvyCkjxcWnpT2gXpcX8pq/7AMv1dW2+R7ZEs1SfuFD7
/4LPd9+prFgYBB5T7GAoUN46Ni6fO3/hfP7KfQNW3o1GC4a0+kvgbgEp+bzRQotX
6AAH0nNA0ADYQWSWpouOEJOQVrAtLFe7eQcieMna8/SYMvKDhxch5sxP5HYONbsU
eF9wPzGVlcVFf2wdow2lllp3pKU40etxyNhdJuwbuqBtudJZgr7yWFKC4XoW0V2w
kWmbY5nZfAIPgb9DwD9FhWnhOd2jVVpeea1ZUjhCAqt1E8T7szzJI1BTl4vs8npT
O8KIG+RAI/l5WPrQ1AbMjbknIQTQVw==
=mzXQ
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-11-05' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.11:
UAPI Changes:
Cross-subsystem Changes:
- arch/arm64: Describe G12b GPU as coherent
- iommu: Support coherency for Mali LPAE
Core Changes:
- atomic: Pass full state to CRTC atomic_{check, begin, flush}(); Use
atomic-state pointers
- drm: Remove SCATTER_LIST_MAX_SEGMENT; Cleanups
- doc: Document legacy_cursor_update better; cleanups
- edid: Don't warn n EDIDs of zero
- ttm: New backend allocation pool; Remove old page allocator; Rework
no_retry handling; Replace flags with booleans in struct ttm_operation_ctx
- vram-helper: Cleanups
- fbdev: Cleanups
- console: Store font size as unsigned value
Driver Changes:
- ast: Support new display mode
- amdgpu: Switch to new TTM allocator
- hisilicon: Cleanups
- nouveau: Switch to new TTM allocator; Fix include of swiotbl.h and
limits.h; Use state helper instead of CRTC state pointer
- panfrost: Support cache-coherent integrations; Fix mutex corruption on
open/close; Cleanupse
- qxl: Cleanups
- radeon: Switch to new TTM allocator
- ticdc: Fix build failure
- vmwgfx: Switch to new TTM allocator
- xlnx: Use dma_request_chan
- fbdev/sh_mobile: Cleanups
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105101641.GA13099@linux-uq9g
Move the specialised interactions with the physical GEM object from the
pread/pwrite ioctl handler into the phys backend.
Currently, if one is able to exhaust the entire aperture and then try to
pwrite into an object not backed by struct page, we accidentally invoked
the phys pwrite handler on a non-phys object; calamitous.
Fixes: c6790dc223 ("drm/i915: Wean off drm_pci_alloc/drm_pci_free")
Testcase: igt/gem_pwrite/exhaustion
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201105154934.16022-2-chris@chris-wilson.co.uk
As there are more and more complicated interactions between the different
backing stores and userspace, push the control into the backends rather
than accumulate them all inside the ioctl handlers.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105154934.16022-1-chris@chris-wilson.co.uk
As per W/A implemented for TGL to program half of the nominal
DCO divider fraction value which is also applicable on EHL.
Changes since V2:
- Apply stepping B0 till FOREVER
- B0 - revid update as per Bspec 29153
Changes since V1:
- ehl_ used as to keep earliest platform prefix
- WA required B0 stepping onwards
Cc: Deak Imre <imre.deak@intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104050655.171185-1-tejaskumarx.surendrakumar.upadhyay@intel.com
The kernel test robot is not happy with that.
This reverts commit 2b5b95b1ff.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/394773/
Replace the previous approach to force compute the initial PSR state
after i915 take over from firmware by the better and recently added
initial_fastset_check() hook.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102221048.104294-1-jose.souza@intel.com
Add the new vma_set_file() function to allow changing
vma->vm_file with the necessary refcount dance.
v2: add more users of this.
v3: add missing EXPORT_SYMBOL, rebase on mmap cleanup,
add comments why we drop the reference on two occasions.
v4: make it clear that changing an anonymous vma is illegal.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/394773/
Fix a typo that led to some MST short pulse event handling issue (the
short pulse event was handled for both encoder instances, each having
its own state).
Fixes: 1d8ca00245 ("drm/i915: Add PORT_TCn aliases to enum port")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104010000.4165574-1-imre.deak@intel.com
Since __vma_release is run by a kworker after the fence has been
signaled, it is no longer protected by the active reference on the vma,
and so the alias of vw->pinned to vma->obj is also not protected by a
reference on the object. Add an explicit reference for vw->pinned so it
will always be safe.
Found by inspection.
Fixes: 54d7195f8c ("drm/i915: Unpin vma->obj on early error")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102161931.30031-1-chris@chris-wilson.co.uk
(cherry picked from commit bc73e5d330)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In a simple test case that writes to scratch and then busy-waits for the
batch to be signaled, we observe that the signal is before the write is
posted. That is bad news.
Splitting the flush + write_dword into two separate flush_dw prevents
the issue from being reproduced, we can presume the post-sync op is not
so post-sync.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/216
Testcase: igt/gem_exec_fence/parallel
Testcase: igt/i915_selftest/live/gt_timelines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102221057.29626-2-chris@chris-wilson.co.uk
(cherry picked from commit 09212e81e5)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add another lower level to emit_ggtt_write so that the GGTT nature of
the write is not hardcoded into the emitter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102221057.29626-1-chris@chris-wilson.co.uk
(cherry picked from commit 2739d8cfc5)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We wrap the timeline on construction of the next request, but there may
still be requests in flight that have not yet finalized the breadcrumb.
(The breadcrumb is delayed as we need engine-local offsets, and for the
virtual engine that is not known until execution.) As such, by the time
we write to the timeline's HWSP offset it may have changed, and we
should use the value we preserved in the request instead.
Though the window is small and infrequent (at full flow we can expect a
timeline's seqno to wrap once every 30 minutes), the impact of writing
the old seqno into the new HWSP is severe: the old requests are never
completed, and the new requests are completed before they are even
submitted.
Fixes: ebece75392 ("drm/i915: Keep timeline HWSP allocated until idle across the system")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022064127.10159-1-chris@chris-wilson.co.uk
(cherry picked from commit c10f6019d0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Avoid skipping what appears to be a no-op set-domain-ioctl if the cache
coherency state is inconsistent with our target domain. This also has
the utility of using the population of the pages to validate the backing
store.
The danger in skipping the first set-domain is leaving the cache
inconsistent and submitting stale data, or worse leaving the clean data
in the cache and not flushing it to the GPU. The impact should be small
as it requires a no-op set-domain as the very first ioctl in a
particular sequence not found in typical userspace.
Reported-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Fixes: 754a254427 ("drm/i915: Skip object locking around a no-op set-domain ioctl")
Testcase: igt/gem_mmap_offset/blt-coherency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019203825.10966-1-chris@chris-wilson.co.uk
(cherry picked from commit 44c2200afc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The initial breadcrumb marks the transition from context wait and setup
into the request payload. We use the marker to determine if the request
is merely waiting to begin, or is inside the payload and hung.
Forgetting to include a breadcrumb before the user payload would mean we
do not reset the guilty user request, and conversely if the initial
breadcrumb is too early we blame the user for a problem elsewhere.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007090947.19950-1-chris@chris-wilson.co.uk
Since __vma_release is run by a kworker after the fence has been
signaled, it is no longer protected by the active reference on the vma,
and so the alias of vw->pinned to vma->obj is also not protected by a
reference on the object. Add an explicit reference for vw->pinned so it
will always be safe.
Found by inspection.
Fixes: 54d7195f8c ("drm/i915: Unpin vma->obj on early error")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102161931.30031-1-chris@chris-wilson.co.uk
In a simple test case that writes to scratch and then busy-waits for the
batch to be signaled, we observe that the signal is before the write is
posted. That is bad news.
Splitting the flush + write_dword into two separate flush_dw prevents
the issue from being reproduced, we can presume the post-sync op is not
so post-sync.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/216
Testcase: igt/gem_exec_fence/parallel
Testcase: igt/i915_selftest/live/gt_timelines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102221057.29626-2-chris@chris-wilson.co.uk
Since commit 9a40401cfa ("lib/scatterlist: Do not limit max_segment to
PAGE_ALIGNED values") the max_segment input to sg_alloc_table_from_pages()
does not have to be any special value. The new algorithm will always
create something less than what the user provides. Thus eliminate this
confusing constant.
- vmwgfx should use the HW capability, not mix in the OS page size for
calling dma_set_max_seg_size()
- i915 uses i915_sg_segment_size() both for sg_alloc_table_from_pages
and for some open coded sgl construction. This doesn't change the value
since rounddown(size, UINT_MAX) == SCATTERLIST_MAX_SEGMENT
- drm_prime_pages_to_sg uses it as a default if max_segment is zero,
UINT_MAX is fine to use directly.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Qian Cai <cai@lca.pw>
Cc: "Ursulin, Tvrtko" <tvrtko.ursulin@intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v1-44733fccd781+13d-rm_scatterlist_max_jgg@nvidia.com
If the VBT assigned tc->legacy_port mismatches the live_status indicator
for the connector, we ignore the VBT directive and switch over to the HW
setting. This is not a driver error, unless we happen to misparse the
VBT or the live_status registers. However, for the system in CI where
the error is only reported on 1 port out of 4, the evidence indicates
the VBT is wrong. Stop flaging this as an error since the cause is
beyond our control, fixup the mistake and continue on.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201030153209.14808-1-chris@chris-wilson.co.uk
ibx_irq_pre_postinstall() looks totally pointless. We can just
init both SDEIMR and SDEIER at the same time before enabling the
master interrupt. It's equally racy as the other order due
to doing all of this from the postinstall stage with the interrupt
handler already in place. That is, safe with MSI but racy with
shared legacy interrupts. Fortunately we should have MSI on all ilk+.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-20-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Let's enable the hardware hpd logic only for the ports we
can actually use.
In theory this may save some miniscule amounts of power,
and more importantly it eliminates a lot if platform specific
codepaths since the generic thing can now deal with any
combination of ports being present on each SKU.
v2: Deal with DG1
v3: Deal with DG1 some more
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-18-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
We no longer unmask all HPD irqs, so we can drop the ugly per-platform
HPD IIR masking. IMR will prevent unsupported bits from appearing in
IIR.
v2: Deal with DG1
Include "HOTPLUG" in the mask names (Lucas)
v3: Fix typos in subject
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-17-ville.syrjala@linux.intel.com
No reason that I can see why we should enable the hpd detection logic
already during irq postinstall phase. We don't even do this on all
the platforms. We just need it before we actually enable the hotplug
interrupts in .hpd_irq_setup(), and in fact we already do it there as
well. Let's just eliminate the redundant early setup.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-15-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Use hpd_pin to parametrize BXT_DE_PORT_HP_DDI() to make it clear
these have nothing to do with DDI ports or PHYs as such. The only
thing that matters is the HPD pin assignment.
v2: Remember the gvt
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-8-ville.syrjala@linux.intel.com
Let's make the AUX CH names match the spec (AUX A-F for pre-tgl,
AUX A-C or AUX USBC1-6 for tgl+). And while at it let's include
the full encoder name in the AUX CH name as well (as opposed to
just using port_name() which wouldn't give us the right thing on
tgl+).
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-6-ville.syrjala@linux.intel.com
Let's pimp the DDI encoder->name to reflect what the spec calls them.
Ie. on pre-tgl DDI A-F, on tgl+ DDI A-C or DDI TC1-6.
Also since each encoder is really a combination of the DDI and the PHY
we include the PHY name as well.
ICL is a bit special since it already has the two different types
of DDIs (combo or TC) but it still calls them just DDI A-F regarless
of the type. For that let's add an extra "(TC)" note to remind
is which type of DDI it really is.
The code is darn ugly, but not sure there's much we can do about it.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-4-ville.syrjala@linux.intel.com
- Remove dup mmio handler for BXT/APL. Otherwise mmio handler will fail
to init.
- Add engine GPR with F_CMD_ACCESS since BXT/APL will load them via
LRI. Otherwise, guest will enter failsafe mode.
V2:
Use RCS/BCS GPR macros instead of offset.
Revise commit message.
V3:
Use GEN8_RING_CS_GPR macros on ring base.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201016052913.209248-1-colin.xu@intel.com
One issue exposed after below commit with which the system will freeze
at suspend after vGPU is created (no need to activate the vGPU).
commit e6ba764802 ("drm/i915: Remove i915->kernel_context")
Old implementation pin the intel_context at setup_submission and
unpin it at clean_submission. So after some vGPU is created, the
intel_context is always pinned there although no workload using it.
It will then block i915 enter suspend state.
There is no need to pin it all the time. Pin/unpin it around workload
lifecycle is more reasonable. After GVT enabled suspend/resume, the
pinned intel_context will also get unpined when userspace put VM process
into suspend state since all workloads are retired, then it's safe to
unpin all intel_context for workloads created. So move the pin/unpin to
create_workload and destroy_workload, while still keep the
create/destroy in old place.
V2:
Rebase.
Fixes: e6ba764802 ("drm/i915: Remove i915->kernel_context")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201016054059.238371-1-colin.xu@intel.com
Restore RPS for ILK-M. We lost it when an extra HAS_RPS()
check appeared in intel_rps_enable().
Unfortunaltey this just makes the performance worse on my
ILK because intel_ips insists on limiting the GPU freq to
the minimum. If we don't do the RPS init then intel_ips will
not limit the frequency for whatever reason. Either it can't
get at some required information and thus makes wrong decisions,
or we mess up some weights/etc. and cause it to make the wrong
decisions when RPS init has been done, or the entire thing is
just wrong. Would require a bunch of reverse engineering to
figure out what's going on.
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 9c878557b1 ("drm/i915/gt: Use the RPM config register to determine clk frequencies")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 2bf06370bc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We are incorrectly limiting the max allocation size as per the mm
max_order, which is effectively the largest power-of-two that we can fit
in the region size. However, it's normal to setup the region or
allocator with a non-power-of-two size(for example 3G), which we should
already handle correctly, except it seems for the early too-big-check.
v2: make sure we also exercise the I915_BO_ALLOC_CONTIGUOUS path, which
is quite different, since for that we are actually limited by the
largest power-of-two that we can fit within the region size. (Chris)
Fixes: b908be543e ("drm/i915: support creating LMEM objects")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021103606.241395-1-matthew.auld@intel.com
(cherry picked from commit 83ebef47f8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This patch fixes below warnings reported by coccicheck
./drivers/gpu/drm/i915/i915_debugfs.c:789:5-8: Unneeded variable: "ret". Return "0" on line 1012
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1603937925-53176-1-git-send-email-zou_wei@huawei.com
Clear out some pointers when objects have been de-allocated. This
makes it much easier to track down use-after-free type issues.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-4-John.C.Harrison@Intel.com
Rather than just saying 'GuC failed to load: -110', actually print out
the GuC status register and break it down into the individual fields.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-3-John.C.Harrison@Intel.com
The latest GuC firmware includes a number of interface changes that
require driver updates to match.
* Starting from Gen11, the ID to be provided to GuC needs to contain
the engine class in bits [0..2] and the instance in bits [3..6].
NOTE: this patch breaks pointer dereferences in some existing GuC
functions that use the guc_id to dereference arrays but these functions
are not used for now as we have GuC submission disabled and we will
update these functions in follow up patch which requires new IDs.
* The new GuC requires the additional data structure (ADS) and associated
'private_data' pointer to be setup. This is basically a scratch area
of memory that the GuC owns. The size is read from the CSS header.
* There is now a physical to logical engine mapping table in the ADS
which needs to be configured in order for the firmware to load. For
now, the table is initialised with a 1 to 1 mapping.
* GUC_CTL_CTXINFO has been removed from the initialization params.
* reg_state_buffer is maintained internally by the GuC as part of
the private data.
* The ADS layout has changed significantly. This patch updates the
shared structure and also adds better documentation of the layout.
* While i915 does not use GuC doorbells, the firmware now requires
that some initialisation is done.
* The number of engine classes and instances supported in the ADS has
been increased.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-2-John.C.Harrison@Intel.com
On bxt, we require a VT'd w/a to serialise all GGTT updates with memory
transfers, and use stop_machine() for this purpose. stop_machine() is a
global serialisation barrier and so dangerous to use from within
critical sections, as the stop_machine() will wait for all cpus to enter
the stop_machine callback, and those cpus may be waiting for the
critical section already held.
Fixes: d7085b0faa ("drm/i915/gem: Poison stolen pages before use")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027184759.29888-1-chris@chris-wilson.co.uk
First check in the function is if swsci() is supported. All the error
paths are easy to figure out the reason, so remove the extra debug
message: it's normal not to support swsci() e.g. in dgfx.
v2: Rather than special case dgfx, just remove the debug message
(from Ville)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027044618.719064-2-lucas.demarchi@intel.com
Do not create the display debugfs files when we don't have display.
Based on previous patch by José Souza.
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027044618.719064-1-lucas.demarchi@intel.com
HPD pins are inverted for DG1 platform.
Bspec: 49956
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Clinton A Taylor <clinton.a.taylor@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021082034.3170478-3-lucas.demarchi@intel.com
DG1 has one more combo phy port, no TC and all irq handling goes through
SDE, like for MCC.
v2: Also change intel_hpd_pin_default() to include DG1 mapping
v3, v4: Rebase on hpd refactor
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021082034.3170478-2-lucas.demarchi@intel.com
Writes to CURSURFLIVE in TGL are causing IOMMU errors and visual
glitches that are often reproduced when executing CPU intensive
workloads while a eDP 4K panel is attached.
Manually exiting PSR causes the frontbuffer to be updated without
glitches and the IOMMU errors are also gone but this comes at the cost
of less time with PSR active.
So using this workaround until this issue is root caused and a better
fix is found.
The current code is already ready to enable PSR after this exit if
there is not other frontbuffer modifications.
Adding a new if block in psr_force_hw_tracking_exit() instead of reuse
the else/gen8- block because the plan is to revert this workaround
as soon as a better solution is found.
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002231627.24528-1-jose.souza@intel.com
fbcon/fonts:
- Two patches to prevent OOB access
ttm:
- fix for evicition value range check
amdgpu:
- Sienna Cichlid fixes
- MST manager resource leak fix
- GPU reset fix
amdkfd:
- Luxmark fix for Navi1x
i915:
- Tweak initial DPCD backlight.enabled value (Sean)
- Initialize reserved MOCS indices (Ayaz)
- Mark initial fb obj as WT on eLLC machines to avoid rcu lockup (Ville)
- Support parsing of oversize batches (Chris)
- Delay execlists processing for TGL (Chris)
- Use the active reference on the vma during error capture (Chris)
- Widen CSB pointer (Chris)
- Wait for CSB entries on TGL (Chris)
- Fix unwind for scratch page allocation (Chris)
- Exclude low patches of stolen memory (Chris)
- Force VT'd workarounds when running as a guest OS (Chris)
- Drop runtime-pm assert from vpgu io accessors (Chris)
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJfkhyvAAoJEAx081l5xIa+3qIQAKjvgIvuoLix+bl/gkOHwymv
QHTXVA3Gi4Rup778L3k1GNppQX4/LE42Of9wnsZIELxG0vxKl59JGzlWJVld/rtJ
ZXfoT5kh7Hh7kkiXnC2s11aqQSsgr25lfsY8gOWSPIGfwOn07JSMTkqPWlbwrOz2
il6qlOlgfSNfXwn2NxSTzGxZMrgOyUvKNZczRXU9gSuuoTsEo4bAvS7/vEN4hazX
DFQAZfd82PfcdAIkVzk/gOoaCQ6a9YgjOzg1RQ4gKhrj8UaWu4gUyJwWPjSMnLrh
uP2RM3gU54MhVF3jHt+D0Trv2ti2zStD9wc4AEJwOVQZtcDSsgGduOdUs5Xc/1l5
dBOvpumBmNxsYbVvvThijNeSx6Y5ybzI3iUp8SLDftiRZtwsXds2aRaskuVgMqj4
MSBdOiZqoJudLjwCBHWKe328+r00X+f14Vi30Y0cy4VW59NxLG5D7qjc5BGqJw1q
J3FQ/9uDh0lWbeKT/grapjP43IWcLApykZa3Rn6p2w0mW2+8Wht/WbrFYyNYGlzf
aNS9RnknTaMYWpvZUZLVG83dJpn6Y9ooHa9L/blMzfCxpF6ftEYf+Iq2x8s0gprz
tIq0xsGvBacsnQIOWRuHjuF87zibVbDp9ba+x78F/woyUqEhip+lBXaPofp132gQ
HtIdQewvG9KcLZcoluO4
=rAUM
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-10-23' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
"This should be the last round of things for rc1, a bunch of i915
fixes, some amdgpu, more font OOB fixes and one ttm fix just found
reading code:
fbcon/fonts:
- Two patches to prevent OOB access
ttm:
- fix for evicition value range check
amdgpu:
- Sienna Cichlid fixes
- MST manager resource leak fix
- GPU reset fix
amdkfd:
- Luxmark fix for Navi1x
i915:
- Tweak initial DPCD backlight.enabled value (Sean)
- Initialize reserved MOCS indices (Ayaz)
- Mark initial fb obj as WT on eLLC machines to avoid rcu lockup (Ville)
- Support parsing of oversize batches (Chris)
- Delay execlists processing for TGL (Chris)
- Use the active reference on the vma during error capture (Chris)
- Widen CSB pointer (Chris)
- Wait for CSB entries on TGL (Chris)
- Fix unwind for scratch page allocation (Chris)
- Exclude low patches of stolen memory (Chris)
- Force VT'd workarounds when running as a guest OS (Chris)
- Drop runtime-pm assert from vpgu io accessors (Chris)"
* tag 'drm-next-2020-10-23' of git://anongit.freedesktop.org/drm/drm: (31 commits)
drm/amdgpu: correct the cu and rb info for sienna cichlid
drm/amd/pm: remove the average clock value in sysfs
drm/amd/pm: fix pp_dpm_fclk
Revert drm/amdgpu: disable sienna chichlid UMC RAS
drm/amd/pm: fix pcie information for sienna cichlid
drm/amdkfd: Use same SQ prefetch setting as amdgpu
drm/amd/swsmu: correct wrong feature bit mapping
drm/amd/psp: Fix sysfs: cannot create duplicate filename
drm/amd/display: Avoid MST manager resource leak.
drm/amd/display: Revert "drm/amd/display: Fix a list corruption"
drm/amdgpu: update golden setting for sienna_cichlid
drm/amd/swsmu: add missing feature map for sienna_cichlid
drm/amdgpu: correct the gpu reset handling for job != NULL case
drm/amdgpu: add rlc iram and dram firmware support
drm/amdgpu: add function to program pbb mode for sienna cichlid
drm/i915: Drop runtime-pm assert from vgpu io accessors
drm/i915: Force VT'd workarounds when running as a guest OS
drm/i915: Exclude low pages (128KiB) of stolen from use
drm/i915/gt: Onion unwind for scratch page allocation failure
drm/ttm: fix eviction valuable range check.
...
intel_timeline_read_hwsp() is used to support semaphore waits between
engines, that may themselves be deferred for arbitrary periods -- that
is the read of the target request's HWSP is at an indeterminant point in
the future. To support this, we need to prevent overwriting a HWSP that
is being watched across a seqno wrap (otherwise the next request will
write its value into the old HWSP preventing the watcher from making
progress, ad infinitum.) To simulate the observer across a wrap, let's
create a request that reads from the HWSP and dispatch it at different
points around a wrap to see if the value is lost.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021220411.5777-2-chris@chris-wilson.co.uk
We wrap the timeline on construction of the next request, but there may
still be requests in flight that have not yet finalized the breadcrumb.
(The breadcrumb is delayed as we need engine-local offsets, and for the
virtual engine that is not known until execution.) As such, by the time
we write to the timeline's HWSP offset it may have changed, and we
should use the value we preserved in the request instead.
Though the window is small and infrequent (at full flow we can expect a
timeline's seqno to wrap once every 30 minutes), the impact of writing
the old seqno into the new HWSP is severe: the old requests are never
completed, and the new requests are completed before they are even
submitted.
Fixes: ebece75392 ("drm/i915: Keep timeline HWSP allocated until idle across the system")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022064127.10159-1-chris@chris-wilson.co.uk
Since Ironlake uses intel_ips.ko for its dynamic frequency adjustment,
we do not have direct control over the frequency management so such
tests are defunct. Similarly, we can't check the gen6+ RPS registers on
Ironlake.
Hopefully this catches all the invalid tests now that Ironlake has
rejoined the dynamic GPU frequency club. There is an opportunity for the
reader to add tests to exercise MEMINTRSTS and co.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022210814.23004-1-chris@chris-wilson.co.uk
- Initialize reserved MOCS indices (Ayaz)
- Mark initial fb obj as WT on eLLC machines to avoid rcu lockup (Ville)
- Support parsing of oversize batches (Chris)
- Delay execlists processing for TGL (Chris)
- Use the active reference on the vma during error capture (Chris)
- Widen CSB pointer (Chris)
- Wait for CSB entries on TGL (Chris)
- Fix unwind for scratch page allocation (Chris)
- Exclude low patches of stolen memory (Chris)
- Force VT'd workarounds when running as a guest OS (Chris)
- Drop runtime-pm assert from vpgu io accessors (Chris)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAl+R8a4ACgkQ+mJfZA7r
E8oazwgAu+NP5GlIHTcG6bV4vxRaEDdlDf3uH+ENOGfRT0PoX6qVNpkdc/nmTZC/
RFzHVsKJi1zXS9XetP33YKZ7KFPZxJMXdhGSu+f08Q6nf0jBdFUm3mhnrDWoLO9+
fS88L4HqXREvBLDctN89akuZ1/pZ3p0KsBqAOuP+LjRAsquHgIP3eUzwZMVbTgTZ
KLojKMHHrZxRtynMFrW6JO+gBok+mGydpOp7F1bK4jSmOIlYA5OsIW3xn9+Ytveu
9Od/yIjKeCyPD64ncz8wcbcPxZhnXENzZElBYT6sQSHDZ9wNCOUhfJT0v8t8G5So
vzOvf2YXa+2IKiGY2sd2ov9svz3Wgg==
=oidT
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-fixes-2020-10-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Tweak initia DPCD backlight.enabled value (Sean)
- Initialize reserved MOCS indices (Ayaz)
- Mark initial fb obj as WT on eLLC machines to avoid rcu lockup (Ville)
- Support parsing of oversize batches (Chris)
- Delay execlists processing for TGL (Chris)
- Use the active reference on the vma during error capture (Chris)
- Widen CSB pointer (Chris)
- Wait for CSB entries on TGL (Chris)
- Fix unwind for scratch page allocation (Chris)
- Exclude low patches of stolen memory (Chris)
- Force VT'd workarounds when running as a guest OS (Chris)
- Drop runtime-pm assert from vpgu io accessors (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022205613.GA3469192@intel.com
As we disable the interrupt during suspend, also reset the irq_mask to
short-circuit subsystems that later try to turn off their interrupt
source.
<4>[ 101.816730] i915 0000:00:02.0: drm_WARN_ON(!intel_irqs_enabled(dev_priv))
<4>[ 101.816853] WARNING: CPU: 3 PID: 4241 at drivers/gpu/drm/i915/i915_irq.c:343 ilk_update_display_irq+0xb3/0x130 [i915]
v2: Reset irq_mask for i8xx_irq_reset as well, and split patch to focus
on only i915->irq_mask
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022114246.28566-1-chris@chris-wilson.co.uk
Since we keep a driver global mask of online CPUs and base the decision
whether PMU needs to be migrated upon it, we need to make sure the
migration is done for all registered PMUs (so GPUs).
To do this we need to track the current CPU for each PMU and base the
decision on whether to migrate on a comparison between global and local
state.
At the same time, since dynamic CPU hotplug notification slots are a
scarce resource and given how we already register the multi instance type
state, we can and should add multiple instance of the i915 PMU to this
same state and not allocate a new one for every GPU.
v2:
* Use pr_notice. (Chris)
v3:
* Handle a nasty interaction where unregistration which triggers a false
CPU offline event. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Daniel Vetter <daniel.vetter@intel.com> # dynamic slot optimisation
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161144.678668-1-tvrtko.ursulin@linux.intel.com
Mark the device as closed and keep references to driver data alive to
allow for safe driver unbind with active PMU clients. Perf core does not
otherwise handle this case so we have to do it manually like this.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020100822.543332-1-tvrtko.ursulin@linux.intel.com
Avoid skipping what appears to be a no-op set-domain-ioctl if the cache
coherency state is inconsistent with our target domain. This also has
the utility of using the population of the pages to validate the backing
store.
The danger in skipping the first set-domain is leaving the cache
inconsistent and submitting stale data, or worse leaving the clean data
in the cache and not flushing it to the GPU. The impact should be small
as it requires a no-op set-domain as the very first ioctl in a
particular sequence not found in typical userspace.
Reported-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Fixes: 754a254427 ("drm/i915: Skip object locking around a no-op set-domain ioctl")
Testcase: igt/gem_mmap_offset/blt-coherency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019203825.10966-1-chris@chris-wilson.co.uk
The block comment for cnl_program_nearest_filter_coefs() has a wonderful
diagram, but although it is marked up as kerneldoc does not use the
markup for providing the function definition.
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'dev_priv' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'pipe' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'id' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'set' not described in 'cnl_program_nearest_filter_coefs'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021185649.17759-1-chris@chris-wilson.co.uk
Let's unmask the PCU event irq _after_ we've set up the
hardware and software to deal with the fallout. We can
also drop the PCU event bit from DEIER except when we
need it for rps.
And on the disable side we replace the hand rolled (and
unlocked) DEIER/IIR/IMR frobbing with ilk_disable_display_irq().
Ocd does require me to reorder it to be symmetric with
the enable path however.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-5-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
There is no GEN6_RPSTAT1 on ILK. Instead of reading that let's
try to get the same information from MEMSTAT_ILK. At least it
seems to track MEMSWCTL frequency request perfectly on my ILK.
It needs the same invert trick as the request value.
We don't want to put the invert thing into intel_gpu_freq()
and intel_freq_opcode() because that would incorrectly invert
the min/max/etc frequencies also.
One day someone might want to reverse engineer the formula for
converting these numbers to Hz, but for now we'll just report
them raw.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Restore RPS for ILK-M. We lost it when an extra HAS_RPS()
check appeared in intel_rps_enable().
Unfortunaltey this just makes the performance worse on my
ILK because intel_ips insists on limiting the GPU freq to
the minimum. If we don't do the RPS init then intel_ips will
not limit the frequency for whatever reason. Either it can't
get at some required information and thus makes wrong decisions,
or we mess up some weights/etc. and cause it to make the wrong
decisions when RPS init has been done, or the entire thing is
just wrong. Would require a bunch of reverse engineering to
figure out what's going on.
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 9c878557b1 ("drm/i915/gt: Use the RPM config register to determine clk frequencies")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
We are incorrectly limiting the max allocation size as per the mm
max_order, which is effectively the largest power-of-two that we can fit
in the region size. However, it's normal to setup the region or
allocator with a non-power-of-two size(for example 3G), which we should
already handle correctly, except it seems for the early too-big-check.
v2: make sure we also exercise the I915_BO_ALLOC_CONTIGUOUS path, which
is quite different, since for that we are actually limited by the
largest power-of-two that we can fit within the region size. (Chris)
Fixes: b908be543e ("drm/i915: support creating LMEM objects")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021103606.241395-1-matthew.auld@intel.com
In order to test how fast the heartbeat can respond, we measure with the
interval set to its minimum. Before we measure though, we want to be
sure we start with a fresh pulse, and so wait until any old one is
complete. During that wait though, we were continually flushing the
work, and so continually re-evaluating to see if the pulse was complete,
and each attempt would count as an unresponsive system. If the engine
did not complete the request in the couple of busy-spins, it would flag
an error. This is unfortunate, so let's not busy-spin waiting for the
old heartbeat, but terminate it and start afresh.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019142841.32273-1-chris@chris-wilson.co.uk
The "mmio" writes into vgpu registers are simple memory traps from the
guest into the host. We do not need to assert in the guest that the
device is awake for the io as we do not write to the device itself.
However, over time we have refactored all the mmio accessors with the
result that the vgpu reuses the gen2 accessors and so inherits the
assert for runtime-pm of the native device. The assert though has
actually been there since commit 3be0bf5acc ("drm/i915: Create vGPU
specific MMIO operations to reduce traps").
References: 3be0bf5acc ("drm/i915: Create vGPU specific MMIO operations to reduce traps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200811092532.13753-1-chris@chris-wilson.co.uk
(cherry picked from commit 0e65ce24a3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If i915.ko is being used as a passthrough device, it does not know if
the host is using intel_iommu. Mixing the iommu and gfx causes a few
issues (such as scanout overfetch) which we need to workaround inside
the driver, so if we detect we are running under a hypervisor, also
assume the device access is being virtualised.
Reported-by: Stefan Fritsch <sf@sfritsch.de>
Suggested-by: Stefan Fritsch <sf@sfritsch.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Stefan Fritsch <sf@sfritsch.de>
Cc: stable@vger.kernel.org
Tested-by: Stefan Fritsch <sf@sfritsch.de>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019101523.4145-1-chris@chris-wilson.co.uk
(cherry picked from commit f566fdcd6c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The GPU is trashing the low pages of its reserved memory upon reset. If
we are using this memory for ringbuffers, then we will dutiful resubmit
the trashed rings after the reset causing further resets, and worse. We
must exclude this range from our own use. The value of 128KiB was found
by empirical measurement (and verified now with a selftest) on gen9.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019165005.18128-2-chris@chris-wilson.co.uk
(cherry picked from commit d3606757e6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In switching to using objects for our ppGTT scratch pages, care was not
taken to avoid trying to unref NULL objects on failure. And for gen6
ppGTT, it appears we forgot entirely to unwind after a partial allocation
failure.
Fixes: 89351925a4 ("drm/i915/gt: Switch to object allocations for page directories")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019083444.1286-1-chris@chris-wilson.co.uk
(cherry picked from commit fa812ce96a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
GEN >= 10 hardware supports the programmable scaler filter.
Attach scaling filter property for CRTC and plane for GEN >= 10
hardwares and program scaler filter based on the selected filter
type.
changes since v3:
* None
changes since v2:
* Use updated functions
* Add ps_ctrl var to contain the full PS_CTRL register value (Ville)
* Duplicate the scaling filter in crtc and plane hw state (Ville)
changes since v1:
* None
Changes since RFC:
* Enable properties for GEN >= 10 platforms (Ville)
* Do not round off the crtc co-ordinate (Danial Stone, Ville)
* Add new functions to handle scaling filter setup (Ville)
* Remove coefficient set 0 hardcoding.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-5-pankaj.laxminarayan.bharadiya@intel.com
Integer scaling (IS) is a nearest-neighbor upscaling technique that
simply scales up the existing pixels by an integer
(i.e., whole number) multiplier.Nearest-neighbor (NN) interpolation
works by filling in the missing color values in the upscaled image
with that of the coordinate-mapped nearest source pixel value.
Both IS and NN preserve the clarity of the original image. Integer
scaling is particularly useful for pixel art games that rely on
sharp, blocky images to deliver their distinctive look.
Introduce functions to configure the scaler filter coefficients to
enable nearest-neighbor filtering.
Bspec: 49247
changes since v6:
* Trust compiler, remove pointless inline keyword from cnl_coef_tap()
& cnl_nearest_filter_coef() functions (Ville)
changes since v4:
* Make cnl_coef_tap(), cnl_nearest_filter_coef() inline (Uma)
changes since v3:
* None
changes since v2:
* Move APIs from 5/5 into this patch.
* Change filter programming related function names to cnl_*, move
filter select bits related code into inline function (Ville)
changes since v1:
* Rearrange skl_scaler_setup_nearest_neighbor_filter() to iterate the
registers directly instead of the phases and taps (Ville)
changes since RFC:
* Refine the skl_scaler_setup_nearest_neighbor_filter() logic (Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-4-pankaj.laxminarayan.bharadiya@intel.com
Introduce scaler registers and bit fields needed to configure the
scaling filter in prgrammed mode and configure scaling filter
coefficients.
changes since v3:
* None
changes since v2:
* Change macro names to CNL_* and use +(set)*8 instead of adding
another trip through _PICK_EVEN (Ville).
changes since v1:
* None
changes since RFC:
* Parametrize scaler coeffient macros by 'set' (Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-3-pankaj.laxminarayan.bharadiya@intel.com
No functional changes in this patch.
With Bigjoiner, there are 2 pipes driving 2 halfs of 1
transcoder. The transcoder_mode has the full timings, and is used
for configuring the transcoder with the intended mode after
joining the 2 halves.
To clear the confusion, we rename intel_set_pipe_timings to
intel_set_transcoder_timings
v2:
* Split the renaming into separate patch (Ville)
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201008214535.22942-2-manasi.d.navare@intel.com
Move the DSC stuff out from the middle of the ICP HPD register
definitions. The location seems to have been selected by a
dice roll.
SHPD_FILTER_CNT addition also went astray due to the DSC
mess, so we also fix that vs. ICP_TC_HPD_{SHORT,LONG}_DETECT().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006143349.5561-2-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
i915:
- Set all unused color plane offsets to ~0xfff again (Ville)
- Fix TGL DKL PHY DP vswing handling (Ville)
amdgpu:
- DCN clang warning fix
- eDP fix
- BACO fix
- Kernel documentation fixes
- SMU7 mclk fix
- VCN1 hw bug workaround
amdkfd:
- kvfree vs kfree fix
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJfjSHWAAoJEAx081l5xIa+uFsP/2uljZFRr2IsiEsB7pI4cmpr
lZMRRA6SdCvSbSIF0Lu2Ndi3LVDM0TsLezsy0uoQWHPUB/TTI6uU+FcRCHevSOAs
JCyfp+DsFgJr5OIWiQzgP6qk67ygPLeSpzCr+Lr0HwXdlfuMQi/zo1Flp2srndLk
M1FwTb6WYGWfBB77q9qYzO9sJb8lnykd+cyOkvgYJsEcJUy/XCKyYi4IG21qaSCH
louciBMme9TbuE4IuIvQjQMFBVxCkE0ZTVrLPLC4VIBsQEH9Ld3PSxHIiCZmyo3k
nHRIxuxy4FnbB6bulToyxG4w94HoRtRbtCh6aBdRDpSNuGO9j1hTZhfR9Pbchyph
eI4BF4JpS4K5BoSYVqM/uviB0Ck6I0acr415p0guDI0BdeQCCjDZkZRnou3dW27p
FNWRaFlMCMr9n2elYoB4saKHd8hSjVYTFyaP/ftPZOYiO9IeZg8VrOC2QJcHirVG
4M77pixjCzUNZLGSvg55liLhmt2YsRWqrYABuv20MkeZUEqc329wjPjyeJFB1fBn
msq7dup37pNttD0XlU5x6Goabbcg3BeAyTAuMVWLCf0mQPOo5yfTUoRuyE4qJsfp
JSNe7wDN8U2N1uze5pIO1QriGcWb2++QGm9mXcoDJ0dbdGW4giZ+tVzssDloqb0X
/mQN0Af4HQj0R/Sh4jGx
=/+Vb
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-10-19' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Some fixes queued up already for i915 and amdgpu, I've also included
the fix for the clang warning you've seen.
i915:
- set all unused color plane offsets to ~0xfff again (Ville)
- fix TGL DKL PHY DP vswing handling (Ville)
amdgpu:
- DCN clang warning fix
- eDP fix
- BACO fix
- kernel documentation fixes
- SMU7 mclk fix
- VCN1 hw bug workaround
amdkfd:
- kvfree vs kfree fix"
* tag 'drm-next-2020-10-19' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Fix incorrect dsc force enable logic
drm/amdkfd: Use kvfree in destroy_crat_image
drm/amdgpu: vcn and jpeg ring synchronization
drm/amd/pm: increase mclk switch threshold to 200 us
docs: amdgpu: fix a warning when building the documentation
drm/amd/display: kernel-doc: document force_timing_sync
drm/amdgpu/swsmu: init the baco mutex in early_init
drm/amd/display: Fix module load hangs when connected to an eDP
drm/i915: Set all unused color plane offsets to ~0xfff again
drm/i915: Fix TGL DKL PHY DP vswing handling
Currently we call .hpd_irq_setup() directly just before display
resume, and follow it with another call via intel_hpd_init()
just afterwards. Assuming the hpd pins are marked as enabled
during the open-coded call these two things do exactly the
same thing (ie. enable HPD interrupts). Which even makes sense
since we definitely need working HPD interrupts for MST sideband
during the display resume.
So let's nuke the open-coded call and move the intel_hpd_init()
call earlier. However we need to leave the poll_init_work stuff
behind after the display resume as that will trigger display
detection while we're resuming. We don't want that trampling over
the display resume process. To make this a bit more symmetric
we turn this into a intel_hpd_poll_{enable,disable}() pair.
So we end up with the following transformation:
intel_hpd_poll_init() -> intel_hpd_poll_enable()
lone intel_hpd_init() -> intel_hpd_init()+intel_hpd_poll_disable()
.hpd_irq_setup()+resume+intel_hpd_init() -> intel_hpd_init()+resume+intel_hpd_poll_disable()
If we really would like to prevent all *long* HPD processing during
display resume we'd need some kind of software mechanism to simply
ignore all long HPDs. Currently we appear to have that just for
fbdev via ifbdev->hpd_suspended. Since we aren't exploding left and
right all the time I guess that's mostly sufficient.
For a bit of history on this, we first got a mechanism to block
hotplug processing during suspend in commit 15239099d7 ("drm/i915:
enable irqs earlier when resuming") on account of moving the irq enable
earlier. This then got removed in commit 50c3dc970a ("drm/fb-helper:
Fix hpd vs. initial config races") because the fdev initial config
got pushed to a later point. The second ad-hoc hpd_irq_setup() for
resume was added in commit 0e32b39cee ("drm/i915: add DP 1.2 MST
support (v0.7)") to be able to do MST sideband during the resume.
And finally we got a partial resurrection of the hpd blocking
mechanism in commit e8a8fedd57 ("drm/i915: Block fbdev HPD
processing during suspend"), but this time it only prevent fbdev
from handling hpd while resuming.
v2: Leave the poll_init_work behind
v3: Remove the extra intel_hpd_poll_disable() from display reset (Lyude)
Add the missing intel_hpd_poll_disable() to display init (Imre)
Cc: Lyude Paul <lyude@redhat.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201013181137.30560-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Rename intel_dp_sink_dpms() to intel_dp_set_power()
so one doesn't always have to convert from the DPMS
enum values to the actual DP D-states.
Also when dealing with a branch device this has nothing to
do with any sink, so the old name was nonsense anyway.
Also adjust the debug message accordingly, and pimp it
with the standard encoder id+name thing.
Trivial bits done with cocci:
@@
expression DP;
@@
(
- intel_dp_sink_dpms(DP, DRM_MODE_DPMS_OFF)
+ intel_dp_set_power(DP, DP_SET_POWER_D3)
|
- intel_dp_sink_dpms(DP, DRM_MODE_DPMS_ON)
+ intel_dp_set_power(DP, DP_SET_POWER_D0)
)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201016194800.25581-2-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Rather that try to trick LSPCON back into PCON mode from the .reset()
hook let's just do that as a regular part of the normal modeset
sequence, which is going to take care of the system resume case. During
a normal modeset this should normally be a nop as the mode should have
already been switched by .detect().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201016194800.25581-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The "mmio" writes into vgpu registers are simple memory traps from the
guest into the host. We do not need to assert in the guest that the
device is awake for the io as we do not write to the device itself.
However, over time we have refactored all the mmio accessors with the
result that the vgpu reuses the gen2 accessors and so inherits the
assert for runtime-pm of the native device. The assert though has
actually been there since commit 3be0bf5acc ("drm/i915: Create vGPU
specific MMIO operations to reduce traps").
References: 3be0bf5acc ("drm/i915: Create vGPU specific MMIO operations to reduce traps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200811092532.13753-1-chris@chris-wilson.co.uk
If i915.ko is being used as a passthrough device, it does not know if
the host is using intel_iommu. Mixing the iommu and gfx causes a few
issues (such as scanout overfetch) which we need to workaround inside
the driver, so if we detect we are running under a hypervisor, also
assume the device access is being virtualised.
Reported-by: Stefan Fritsch <sf@sfritsch.de>
Suggested-by: Stefan Fritsch <sf@sfritsch.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Stefan Fritsch <sf@sfritsch.de>
Cc: stable@vger.kernel.org
Tested-by: Stefan Fritsch <sf@sfritsch.de>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019101523.4145-1-chris@chris-wilson.co.uk
The GPU is trashing the low pages of its reserved memory upon reset. If
we are using this memory for ringbuffers, then we will dutiful resubmit
the trashed rings after the reset causing further resets, and worse. We
must exclude this range from our own use. The value of 128KiB was found
by empirical measurement (and verified now with a selftest) on gen9.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019165005.18128-2-chris@chris-wilson.co.uk
Underruns happens when plane height + y offset is not a modulo of 4
when FBC is enabled. It happens when scanline is at vactive - 10 but
that is not feasible to do from the software side so here completely
disabling FBC when height + y offset matches to avoid visual glitches.
Specification says that it only affects TGL display C stepping and
newer but to simply the check and as TGL is already in final costumers
hands, pre-production display stepping A and B was also included.
BSpec: 52887 ICL
BSpec: 52888 EHL/JSL
BSpec: 52890/55378 TGL
BSpec: 53508 DG1
BSpec: 53273 RKL
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019175609.28715-1-jose.souza@intel.com
This sequence is not part of "Sequences to Initialize Display" but
as noted in the MBus page the DBUF_CTL.Tracker_state_service needs
to be set to 8.
BSpec: 49213
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019173906.18892-1-jose.souza@intel.com
On Tigerlake, we are seeing a repeat of commit d8f5053117 ("drm/i915/icl:
Forcibly evict stale csb entries") where, presumably, due to a missing
Global Observation Point synchronisation, the write pointer of the CSB
ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
we see the GPU report more context-switch entries for us to parse, but
those entries have not been written, leading us to process stale events,
and eventually report a hung GPU.
However, this effect appears to be much more severe than we previously
saw on Icelake (though it might be best if we try the same approach
there as well and measure), and Bruce suggested the good idea of resetting
the CSB entry after use so that we can detect when it has been updated by
the GPU. By instrumenting how long that may be, we can set a reliable
upper bound for how long we should wait for:
513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
References: d8f5053117 ("drm/i915/icl: Forcibly evict stale csb entries")
References: HSDES#22011327657, HSDES#1508287568
Suggested-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org # v5.4
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-2-chris@chris-wilson.co.uk
(cherry picked from commit 233c1ae3c8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
A CSB entry is 64b, and it is simpler for us to treat it as an array of
64b entries than as an array of pairs of 32b entries.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-1-chris@chris-wilson.co.uk
(cherry picked from commit f24a44e52f)
(cherry picked from commit 3d4dbe0e0f0d04ebcea917b7279586817da8cf46)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We may try to preempt the currently executing request, only to find that
after unravelling all the dependencies that the original executing
context is still the earliest in the topological sort and re-submitted
back to HW (if we do detect some change in the ELSP that requires
re-submission). However, due to the way we check for wrap-around during
the unravelling, we mark any context that has been submitted just once
(i.e. with the rq->wa_tail set, but the ring->tail earlier) as
potentially wrapping and requiring a forced restore on resubmission.
This was expected to be not a problem, as it was anticipated that most
unwinding for preemption would result in a context switch and the few
that did not would be lost in the noise. It did not take long for
someone to find one particular workload where the cost of those extra
context restores was measurable.
However, since we know the wa_tail is of fixed size, and we know that a
request must be larger than the wa_tail itself, we can safely maintain
the check for request wrapping and check against a slightly future point
in the ring that includes an expected wa_tail. (That is if the
ring->tail is already set to rq->wa_tail, including another 8 bytes in
the check does not invalidate the incremental wrap detection.)
Fixes: 8ab3a3812a ("drm/i915/gt: Incrementally check for rewinding")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002083425.4605-1-chris@chris-wilson.co.uk
(cherry picked from commit bb65548e3c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
When running gem_exec_nop, it floods the system with many requests (with
the goal of userspace submitting faster than the HW can process a single
empty batch). This causes the driver to continually resubmit new
requests onto the end of an active context, a flood of lite-restore
preemptions. If we time this just right, Tigerlake hangs.
Inserting a small delay between the processing of CS events and
submitting the next context, prevents the hang. Naturally it does not
occur with debugging enabled. The suspicion then is that this is related
to the issues with the CS event buffer, and inserting an mmio read of
the CS pointer status appears to be very successful in preventing the
hang. Other registers, or uncached reads, or plain mb, do not prevent
the hang, suggesting that register is key -- but that the hang can be
prevented by a simple udelay, suggests it is just a timing issue like
that encountered by commit 233c1ae3c8 ("drm/i915/gt: Wait for CSB
entries on Tigerlake"). Also note that the hang is not prevented by
applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
between requests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015195023.32346-1-chris@chris-wilson.co.uk
(cherry picked from commit 6ca7217dff)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 435e8fc059 ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk
(cherry picked from commit 57b2d834bf)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently we leave the cache_level of the initial fb obj
set to NONE. This means on eLLC machines the first pin_to_display()
will try to switch it to WT which requires a vma unbind+bind.
If that happens during the fbdev initialization rcu does not
seem operational which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level
as WT when creating it. We still do an excplicit ggtt pin
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.
Cc: <stable@vger.kernel.org> # v5.7+
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@chris-wilson.co.uk
(cherry picked from commit d46b60a2e8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In order to avoid functional breakage of mis-programmed applications that
have grown to depend on unused MOCS entries, we are programming
those entries to be equal to fully cached ("L3 + LLC") entry.
These reserved and unspecified entries should not be used as they may be
changed to less performant variants with better coherency in the future
if more entries are needed.
v2: As suggested by Lucas De Marchi to utilise __init_mocs_table for
programming default value, setting I915_MOCS_PTE index of tgl_mocs_table
with desired value.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Mathew Alwin <alwin.mathew@intel.com>
Cc: Mcguire Russell W <russell.w.mcguire@intel.com>
Cc: Spruit Neil R <neil.r.spruit@intel.com>
Cc: Zhou Cheng <cheng.zhou@intel.com>
Cc: Benemelis Mike G <mike.g.benemelis@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729102539.134731-2-ayaz.siddiqui@intel.com
Cc: stable@vger.kernel.org
(cherry picked from commit 4d8a5cfe3b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In commit 7994672309 ("drm/i915: Assume 100% brightness when not in
DPCD control mode"), we fixed the brightness level when DPCD control was
not active to max brightness. This is as good as we can guess since most
backlights go on full when uncontrolled.
However in doing so we changed the semantics of the initial
'backlight.enabled' value. At least on Pixelbooks, they were relying
on the brightness level in DP_EDP_BACKLIGHT_BRIGHTNESS_MSB to be 0 on
boot such that enabled would be false. This causes the device to be
enabled when the brightness is set. Without this, brightness control
doesn't work. So by changing brightness to max, we also flipped enabled
to be true on boot.
To fix this, make enabled a function of brightness and backlight control
mechanism.
Fixes: 7994672309 ("drm/i915: Assume 100% brightness when not in DPCD control mode")
Cc: Lyude Paul <lyude@redhat.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Kevin Chowski <chowski@chromium.org>>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918002845.32766-1-sean@poorly.run
(cherry picked from commit 4ade8f31c2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In switching to using objects for our ppGTT scratch pages, care was not
taken to avoid trying to unref NULL objects on failure. And for gen6
ppGTT, it appears we forgot entirely to unwind after a partial allocation
failure.
Fixes: 89351925a4 ("drm/i915/gt: Switch to object allocations for page directories")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019083444.1286-1-chris@chris-wilson.co.uk
If guest fills non-priv bb on ApolloLake/Broxton as Mesa i965 does in:
717e7539124d (i965: Use a WC map and memcpy for the batch instead of pw-)
Due to the missing flush of bb filled by VM vCPU, host GPU hangs on
executing these MI_BATCH_BUFFER.
Temporarily workaround this by setting SNOOP bit for PAT3 used by PPGTT
PML4 PTE: PAT(0) PCD(1) PWT(1).
The performance is still expected to be low, will need further improvement.
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201012045231.226748-1-colin.xu@intel.com
Guest driver may reset HWSP to 0 as init value during D3->D0:
The full sequence is:
- Boot ->D0
- Update HWSP
- D0->D3
- ...In D3 state...
- D3->D0
- DMLR reset.
- Set engine HWSP to 0.
- Set engine ring mode to 0.
- Set engine HWSP to correct value.
- Set engine ring mode to correct value.
Ring mode is masked register so set 0 won't take effect.
However HWPS addr 0 is considered as invalid GGTT address which will
report error like:
gvt: vgpu 1: write invalid HWSP address, reg:0x2080, value:0x0
gvt: vgpu 1: fail to emulate MMIO write 00002080 len 4
Detected your guest driver doesn't support GVT-g.
Now vgpu 2 will enter failsafe mode.
Zero out HWSP addr is considered as a valid setting from device driver
so don't treat it as invalid HWSP addr.
V2:
Treat HWSP addr 0 as valid. (zhenyu)
V3:
Change patch title.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200911065239.147789-1-colin.xu@intel.com
i915_gem_object_map implements fairly low-level vmap functionality in a
driver. Split it into two helpers, one for remapping kernel memory which
can use vmap, and one for I/O memory that uses vmap_pfn.
The only practical difference is that alloc_vm_area prefeaults the vmalloc
area PTEs, which doesn't seem to be required here for the kernel memory
case (and could be added to vmap using a flag if actually required).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Link: https://lkml.kernel.org/r/20201002122204.1534411-9-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kmap for !PageHighmem is just a convoluted way to say page_address, and
kunmap is a no-op in that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Link: https://lkml.kernel.org/r/20201002122204.1534411-8-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
shmem_pin_map somewhat awkwardly reimplements vmap using alloc_vm_area and
manual pte setup. The only practical difference is that alloc_vm_area
prefeaults the vmalloc area PTEs, which doesn't seem to be required here
(and could be added to vmap using a flag if actually required). Switch to
use vmap, and use vfree to free both the vmalloc mapping and the page
array, as well as dropping the references to each page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Link: https://lkml.kernel.org/r/20201002122204.1534411-7-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The typical set of driver updates across the subsystem:
- Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma, hns,
usnic, qib, qedr, cxgb4, hns, bnxt_re
- Various rtrs fixes and updates
- Bug fix for mlx4 CM emulation for virtualization scenarios where MRA
wasn't working right
- Use tracepoints instead of pr_debug in the CM code
- Scrub the locking in ucma and cma to close more syzkaller bugs
- Use tasklet_setup in the subsystem
- Revert the idea that 'destroy' operations are not allowed to fail at
the driver level. This proved unworkable from a HW perspective.
- Revise how the umem API works so drivers make fewer mistakes using it
- XRC support for qedr
- Convert uverbs objects RWQ and MW to new the allocation scheme
- Large queue entry sizes for hns
- Use hmm_range_fault() for mlx5 On Demand Paging
- uverbs APIs to inspect the GID table instead of sysfs
- Move some of the RDMA code for building large page SGLs into
lib/scatterlist
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl+J37MACgkQOG33FX4g
mxrKfRAAnIecwdE8df0yvVU5k0Eg6qVjMy9MMHq4va9m7g6GpUcNNI0nIlOASxH2
l+9vnUQS3ebgsPeECaDYzEr0hh/u53+xw2g4WV5ts/hE8KkQ6erruXb9kasCe8yi
5QWJ9K36T3c03Cd3EeH6JVtytAxuH42ombfo9BkFLPVyfG/R2tsAzvm5pVi73lxk
46wtU1Bqi4tsLhyCbifn1huNFGbHp08OIBPAIKPUKCA+iBRPaWS+Dpi+93h3g3Bp
oJwDhL9CBCGcHM+rKWLzek3Dy87FnQn7R1wmTpUFwkK+4AH3U/XazivhX035w1vL
YJyhakVU0kosHlX9hJTNKDHJGkt0YEV2mS8dxAuqilFBtdnrVszb5/MirvlzC310
/b5xCPSEusv9UVZV0G4zbySVNA9knZ4YaRiR3VDVMLKl/pJgTOwEiHIIx+vs3ejk
p8GRWa1SjXw5LfZEQcq39J689ljt6xjCTonyuBSv7vSQq5v8pjBxvHxiAe2FIa2a
ZyZeSCYoSh0SwJQukO2VO7aprhHP3TcCJ/987+X03LQ8tV2VWPktHqm62YCaDcOl
fgiQuQdPivRjDDkJgMfDWDGKfZeHoWLKl5XsJhWByt0lablVrsvc+8ylUl1UI7gI
16hWB/Qtlhfwg10VdApn+aOFpIS+s5P4XIp8ik57MZO+VeJzpmE=
=LKpl
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"A usual cycle for RDMA with a typical mix of driver and core subsystem
updates:
- Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
hns, usnic, qib, qedr, cxgb4, hns, bnxt_re
- Various rtrs fixes and updates
- Bug fix for mlx4 CM emulation for virtualization scenarios where
MRA wasn't working right
- Use tracepoints instead of pr_debug in the CM code
- Scrub the locking in ucma and cma to close more syzkaller bugs
- Use tasklet_setup in the subsystem
- Revert the idea that 'destroy' operations are not allowed to fail
at the driver level. This proved unworkable from a HW perspective.
- Revise how the umem API works so drivers make fewer mistakes using
it
- XRC support for qedr
- Convert uverbs objects RWQ and MW to new the allocation scheme
- Large queue entry sizes for hns
- Use hmm_range_fault() for mlx5 On Demand Paging
- uverbs APIs to inspect the GID table instead of sysfs
- Move some of the RDMA code for building large page SGLs into
lib/scatterlist"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
RDMA/ucma: Fix use after free in destroy id flow
RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
RDMA: Explicitly pass in the dma_device to ib_register_device
lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
IB/mlx4: Convert rej_tmout radix-tree to XArray
RDMA/rxe: Fix bug rejecting all multicast packets
RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
RDMA/rxe: Remove duplicate entries in struct rxe_mr
IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
IB/rdmavt: Fix sizeof mismatch
MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
RDMA/umem: Move to allocate SG table from pages
lib/scatterlist: Add support in dynamic allocation of SG table from pages
tools/testing/scatterlist: Show errors in human readable form
tools/testing/scatterlist: Rejuvenate bit-rotten test
RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
RDMA/uverbs: Expose the new GID query API to user space
...
intel_dp_ycbcr420_config() is rather pointless. Just inline it
directly into intel_dp_compute_config(). This gets rid of the
ugly double assignment of output_format.
Not really sure what the best policy would be when the user
supplies a mode classified by the display as "YCbCr 4:2:0
only", but we know that we can't do YCbCr 4:2:0 output. For
now keep the current behaviour of just silently upgrade
it to RGB 4:4:4.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924184156.24491-3-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Remove the lspcon special case from intel_dp_compute_config() and
just treat it like any other DFP than can do 4:4:4->4:2:0 conversion.
The only difference between the two codepaths was that the lspcon
code tried to already halve port_clock. That was just total nonsense
as we hadn't even computed the base port_clock at that time.
All that stuff happens intel_dp_compute_link_config*() and it
already takes care of the 4:2:0 clock reduction.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924184156.24491-2-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
crtc_state->lspcon_downsampling isn't particularly useful at
the moment since we can't even do proper readout for it.
Let's get rid of it. Will help with unifying the LSPCON with
the regular DFP YCbCr output support.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924184156.24491-1-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Currently we leave the cache_level of the initial fb obj
set to NONE. This means on eLLC machines the first pin_to_display()
will try to switch it to WT which requires a vma unbind+bind.
If that happens during the fbdev initialization rcu does not
seem operational which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level
as WT when creating it. We still do an excplicit ggtt pin
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.
Cc: <stable@vger.kernel.org> # v5.7+
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
WAC6entrylatency is trying to fix excessive rc6 entry latency caused
by the extra delay from FBC_LLC_READ_CTRL, which is there for some
extra sync with uncore for frame buffer caching in LLC.
Reading through the hsd the recommendation was to set the FBC_LLC_FULLY_OPEN
bit to disable this extra delay entirely. This can be done whenever fb LLC
caching is not used. The alternative suggestion was to reduce the delay to
eg. 0x5 via updated BIOS programming instructions. But all the kbl/cfl
machines I've seen still have the default 0xff programmed. As we never use
fb LLC caching let's just apply the w/a to all skl derivatives to get
consistent rc6 latencies.
I was able to measure the effect of FBC_LLC_READ_CTRL to rc6 latency
via forcewake. Here's a graph of some of the results:
sleep;fw_req=1;wait fw_ack==1;sleep;fw_req=0;wait fw_ack==0
fw_ack==1 duration
160us +----------------------------------------------------------------+
| + + $$+ + + |
| $$ $ $ ******$$ ** $ $**$* #########$$######|
140us |-$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$*$$$$$$$$$$$$$$$$ $$$$$$|
| $ * # |
| $ * # |
120us |$+ * # +-|
|$ * # |
|$ * # # |
100us |$+ ************######################## +-|
|$ * *# |
|$ ***** ######### |
80us |$+ * # #### ## +-|
|$ **** ### # # |
| ** #### FBC_LLC_READ_CTRL: 0x8000 ******* |
60us |-###### FBC_LLC_READ_CTRL: 0xffff #######-|
|## + + FBC_LLC_READ_CTRL: 0x400000ff $$$$$$$ |
+----------------------------------------------------------------+
0ms 10ms 20ms 30ms 40ms 50ms 60ms
sleep duration
The default FBC_LLC_READ_CTRL value of 0xff is documented to give us
a 170usec delay. That tracks well with the knees at 0xffff->~44msec and
0x8000->~22msec we see in the graph.
We can see that if we sleep longer than the FBC_LLC_READ_CTRL delay
we always observe the full (~145usec) rc6 wakeup latency. But if we sleep
for less than the FBC_LLC_READ_CTRL delay we see a quicker fw wakeup,
presumably due the hardware not having yet entered rc6 fully.
The other plateaus in the graph I suspect correspond to some shallower
internal rc states.
v2: s/usec/msec/ typo in commit msg
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716190426.17047-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
A recent bspec update has provided a new cdclk table for RKL. All of
the cdclk values are the same as those we've been using on ICL, TGL,
etc., but we obtain them by doubling both the PLL ratio and CD2X divider
numbers.
Bspec: 49202
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015220038.271740-1-matthew.d.roper@intel.com
Repeat our sanitychecks from before execution to after execution. One
expects that if we were to see these, the gpu would already be on fire,
but the timing may be informative.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015190816.31763-1-chris@chris-wilson.co.uk
We may try to preempt the currently executing request, only to find that
after unravelling all the dependencies that the original executing
context is still the earliest in the topological sort and re-submitted
back to HW (if we do detect some change in the ELSP that requires
re-submission). However, due to the way we check for wrap-around during
the unravelling, we mark any context that has been submitted just once
(i.e. with the rq->wa_tail set, but the ring->tail earlier) as
potentially wrapping and requiring a forced restore on resubmission.
This was expected to be not a problem, as it was anticipated that most
unwinding for preemption would result in a context switch and the few
that did not would be lost in the noise. It did not take long for
someone to find one particular workload where the cost of those extra
context restores was measurable.
However, since we know the wa_tail is of fixed size, and we know that a
request must be larger than the wa_tail itself, we can safely maintain
the check for request wrapping and check against a slightly future point
in the ring that includes an expected wa_tail. (That is if the
ring->tail is already set to rq->wa_tail, including another 8 bytes in
the check does not invalidate the incremental wrap detection.)
Fixes: 8ab3a3812a ("drm/i915/gt: Incrementally check for rewinding")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002083425.4605-1-chris@chris-wilson.co.uk
When running gem_exec_nop, it floods the system with many requests (with
the goal of userspace submitting faster than the HW can process a single
empty batch). This causes the driver to continually resubmit new
requests onto the end of an active context, a flood of lite-restore
preemptions. If we time this just right, Tigerlake hangs.
Inserting a small delay between the processing of CS events and
submitting the next context, prevents the hang. Naturally it does not
occur with debugging enabled. The suspicion then is that this is related
to the issues with the CS event buffer, and inserting an mmio read of
the CS pointer status appears to be very successful in preventing the
hang. Other registers, or uncached reads, or plain mb, do not prevent
the hang, suggesting that register is key -- but that the hang can be
prevented by a simple udelay, suggests it is just a timing issue like
that encountered by commit 233c1ae3c8 ("drm/i915/gt: Wait for CSB
entries on Tigerlake"). Also note that the hang is not prevented by
applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
between requests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015195023.32346-1-chris@chris-wilson.co.uk
While we do lack the faster shared LLC, we should still have support
for snooping over PCIe.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-11-lucas.demarchi@intel.com
Update the DMC_DEBUG_DC5 register to its new location and do not try
reading the DC6 counter since DG1 doesn't support DC6.
v2: Use IS_DGFX() instead of IS_DG1(). Even if not having DC6 is not
directly related to DGFX, the register move to a new location is. So in
future, if there is one supporting DC6, it would just need to add the
other register rather than fixing the case of a wrong register being
read (Matt)
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-10-lucas.demarchi@intel.com
DC6 is not supported on DG1, so change the allowed DC mask for DG1.
This is not yet on bspec, but it has been confirmed by HW engineers.
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-9-lucas.demarchi@intel.com
DG1 shares some workarounds with TGL and RKL and also has some
additional workarounds of its own.
v2: Correct location of Wa_1408615072 (JohnH).
v3: Apply WAs 1606700617, 18011464164 and 22010931296 to DG1 (José)
v4 (Anusha)
- Add Wa_22010271021
- s/Wa_14010096844/Wa_1409836686
v5:
- Extend Wa_14010919138 to all revs (Matt Atwood)
- Power gate media is global gen12 design. (Rodrigo)
- Rebase (Lucas)
v6: use REG_BIT() to fix checkpatch warning (Lucas)
BSpec: 53508
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-8-lucas.demarchi@intel.com
Add support to load DMC v2.0.2 on DG1
While we're at it, make TGL use the same GEN12 firmware size definition
and remove obsolete comment.
Bpec: 49230
v2: do not replace GEN12_CSR_MAX_FW_SIZE (from José)
and replace stale comment
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-7-lucas.demarchi@intel.com
Add DG1 DPLL Enable register macro and use the macro to enable the
correct DPLL based on PLL id. Although we use
_MG_PLL1_ENABLE/_MG_PLL2_ENABLE these are rather combo phys.
While at it, fix coding style: wrong newlines and use if/else chain
v2: Rewrite original patch from Aditya Swarup based on refactors
upstream
Bspec: 49443, 49206
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Aditya Swarup <aditya.swarup@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-6-lucas.demarchi@intel.com
Add entries for dg1 plls and setup dg1_pll_mgr to reuse ICL callbacks.
Initial setup for shared dplls DPLL0/1 for DDIA/DDIB and DPLL2/3 for
DDI-TC1/DDI-TC2. Configure dpll cfgcrx registers to drive the plls on
DG1.
v2 (Lucas): Reword commit message and add missing update_ref_clks hook
(requested by Matt Roper)
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-5-lucas.demarchi@intel.com
DG1 has 4 DPLLs where DPLL0 and DPLL1 drive DDIA/B and
DPLL2 and DPLL3 drive DDI-TC1/DDI-TC2.
Introduce DG1_DPLL_CFCRx() helper macros to configure
DPLL registers.
Bspec: 50288, 50299
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-4-lucas.demarchi@intel.com
TGL power wells can be re-used for DG1 with the exception of the fake
power well for TC_COLD.
v2: use logic to skip power wells while copying instead of duplicating
the definition of TGL power wells (Matt Roper)
Bspec: 49182
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-3-lucas.demarchi@intel.com
The skus guarded by IS_CNL_WITH_PORT_F() have port F and thus they need
those power wells. The others don't have those. Up to now we were
just overriding the number of power wells on !IS_CNL_WITH_PORT_F(),
relying on those power wells to be the last ones. Now that we have logic
in place to skip power wells by id, use it instead.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-2-lucas.demarchi@intel.com
This allows us to skip power wells on a platform allowing it to re-use
the table from another one instead of having to create a new table from
scratch that is basically a copy with a few removals.
Cc: Imre Deak <imre.deak@intel.com>
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
[ Adapt ignore logic to be based on pw id rather than adding a new
field, as suggested by Imre ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-1-lucas.demarchi@intel.com
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 435e8fc059 ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk
New driver:
Cadence MHDP8546 DisplayPort bridge driver
core:
- cross-driver scatterlist cleanups
- devm_drm conversions
- remove drm_dev_init
- devm_drm_dev_alloc conversion
ttm:
- lots of refactoring and cleanups
bridges:
- chained bridge support in more drivers
panel:
- misc new panels
scheduler:
- cleanup priority levels
displayport:
- refactor i915 code into helpers for nouveau
i915:
- split into display and GT trees
- WW locking refactoring in GEM
- execbuf2 extension mechanism
- syncobj timeline support
- GEN 12 HOBL display powersaving
- Rocket Lake display additions
- Disable FBC on Tigerlake
- Tigerlake Type-C + DP improvements
- Hotplug interrupt refactoring
amdgpu:
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support for DC
- Plane rotation enabled
- TMZ state info ioctl
- PCIe DPC recovery support
- DC interrupt handling refactor
- OLED panel fixes
amdkfd:
- add SMI events for thermal throttling
- SMI interface events ioctl update
- process eviction counters
radeon:
- move to dma_ for allocations
- expose sclk via sysfs
msm:
- DSI support for sm8150/sm8250
- per-process GPU pagetable support
- Displayport support
mediatek:
- move HDMI phy driver to PHY
- convert mtk-dpi to bridge API
- disable mt2701 tmds
tegra:
- bridge support
exynos:
- misc cleanups
vc4:
- dual display cleanups
ast:
- cleanups
gma500:
- conversion to GPIOd API
hisilicon:
- misc reworks
ingenic:
- clock handling and format improvements
mcde:
- DSI support
mgag200:
- desktop g200 support
mxsfb:
- i.MX7 + i.MX8M
- alpha plane support
panfrost:
- devfreq support
- amlogic SoC support
ps8640:
- EDID from eDP retrieval
tidss:
- AM65xx YUV workaround
virtio:
- virtio-gpu exported resources
rcar-du:
- R8A7742, R8A774E1 and R8A77961 support
- YUV planar format fixes
- non-visible plane handling
- VSP device reference count fix
- Kconfig fix to avoid displaying disabled options in .config
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJfh579AAoJEAx081l5xIa+GqoP/0amz+ZN7y/L7+f32CRinJ7/
3e4xjXNDmtWG4Whe/WKjlYmbAcvSdWV/4HYpurW2BFJnOAB/5lIqYcS/PyqErPzA
w4EpRoJ+ZdFgmlDH0vdsDwPLT/HFmhUN9AopNkoZpbSMxrManSj5QgmePXyiKReP
Q+ZAK5UW5AdOVY4bgXUSEkVq2eilCLXf+bSBR/LrVQuNgu7GULX8SIy/Y1CuMtv8
LgzzjLKfIZaIWC+F/RU7BxJ7YnrVq7z7yXnUx8j2416+k/Wwe+BeSUCSZstT7q9G
UkX8jWfR7ZKqhwP+UQeSwDbHkALz7lv88nyjQdxJZ3SrXRe4hy14YjxnR4maeNAj
3TAYSdcAMWyRHqeEZIZ7Hj5sQtTq5OZAoIjxzH3vpVdAnnAkcWoF77pqxV8XPqTC
nw40DihAxQOshGwMkjd5DqkEwnMv43Hs1WTVYu9dPTOfOdqPNt+Vqp7Xl9Z46+kV
k6PDcx60T9ayDW1QZ6MoIXHta9E7ixzu7gYBL3vP4LuporY0uNG3bzF3CMvof1BK
sHYcYTdZkqbTD2d6rHV+TbpPQXgTtlej9qVlQM4SeX37Xtc7LxCYpnpUHKz2S/fK
1vyeGPgdytHblwlxwZOPZ4R2I/HTfnITdr4kMcJHhxAsEewfW1Rd4+stQqVJ2Mph
Vz+CFP2BngivGFz5vuky
=4H8J
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Not a major amount of change, the i915 trees got split into display
and gt trees to better facilitate higher level review, and there's a
major refactoring of i915 GEM locking to use more core kernel concepts
(like ww-mutexes). msm gets per-process pagetables, older AMD SI cards
get DC support, nouveau got a bump in displayport support with common
code extraction from i915.
Outside of drm this contains a couple of patches for hexint
moduleparams which you've acked, and a virtio common code tree that
you should also get via it's regular path.
New driver:
- Cadence MHDP8546 DisplayPort bridge driver
core:
- cross-driver scatterlist cleanups
- devm_drm conversions
- remove drm_dev_init
- devm_drm_dev_alloc conversion
ttm:
- lots of refactoring and cleanups
bridges:
- chained bridge support in more drivers
panel:
- misc new panels
scheduler:
- cleanup priority levels
displayport:
- refactor i915 code into helpers for nouveau
i915:
- split into display and GT trees
- WW locking refactoring in GEM
- execbuf2 extension mechanism
- syncobj timeline support
- GEN 12 HOBL display powersaving
- Rocket Lake display additions
- Disable FBC on Tigerlake
- Tigerlake Type-C + DP improvements
- Hotplug interrupt refactoring
amdgpu:
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support for DC
- Plane rotation enabled
- TMZ state info ioctl
- PCIe DPC recovery support
- DC interrupt handling refactor
- OLED panel fixes
amdkfd:
- add SMI events for thermal throttling
- SMI interface events ioctl update
- process eviction counters
radeon:
- move to dma_ for allocations
- expose sclk via sysfs
msm:
- DSI support for sm8150/sm8250
- per-process GPU pagetable support
- Displayport support
mediatek:
- move HDMI phy driver to PHY
- convert mtk-dpi to bridge API
- disable mt2701 tmds
tegra:
- bridge support
exynos:
- misc cleanups
vc4:
- dual display cleanups
ast:
- cleanups
gma500:
- conversion to GPIOd API
hisilicon:
- misc reworks
ingenic:
- clock handling and format improvements
mcde:
- DSI support
mgag200:
- desktop g200 support
mxsfb:
- i.MX7 + i.MX8M
- alpha plane support
panfrost:
- devfreq support
- amlogic SoC support
ps8640:
- EDID from eDP retrieval
tidss:
- AM65xx YUV workaround
virtio:
- virtio-gpu exported resources
rcar-du:
- R8A7742, R8A774E1 and R8A77961 support
- YUV planar format fixes
- non-visible plane handling
- VSP device reference count fix
- Kconfig fix to avoid displaying disabled options in .config"
* tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm: (1494 commits)
drm/ingenic: Fix bad revert
drm/amdgpu: Fix invalid number of character '{' in amdgpu_acpi_init
drm/amdgpu: Remove warning for virtual_display
drm/amdgpu: kfd_initialized can be static
drm/amd/pm: setup APU dpm clock table in SMU HW initialization
drm/amdgpu: prevent spurious warning
drm/amdgpu/swsmu: fix ARC build errors
drm/amd/display: Fix OPTC_DATA_FORMAT programming
drm/amd/display: Don't allow pstate if no support in blank
drm/panfrost: increase readl_relaxed_poll_timeout values
MAINTAINERS: Update entry for st7703 driver after the rename
Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"
drm/amd/display: HDMI remote sink need mode validation for Linux
drm/amd/display: Change to correct unit on audio rate
drm/amd/display: Avoid set zero in the requested clk
drm/amdgpu: align frag_end to covered address space
drm/amdgpu: fix NULL pointer dereference for Renoir
drm/vmwgfx: fix regression in thp code due to ttm init refactor.
drm/amdgpu/swsmu: add interrupt work handler for smu11 parts
drm/amdgpu/swsmu: add interrupt work function
...
Forcing mocs:1 [used for our winsys follows-pte mode] to be cached
caused display glitches. Though it is documented as deprecated (and so
likely behaves as uncached) use the follow-pte bit and force it out of
L3 cache.
Fixes: 4d8a5cfe3b ("drm/i915/gt: Initialize reserved and unspecified MOCS indices")
Testcase: igt/kms_frontbuffer_tracking
Testcase: igt/kms_big_fb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-4-chris@chris-wilson.co.uk
Since SKL the eLLC has been sitting on the far side of the system
agent, meaning the display engine can utilize it. Let's enable that.
I chose WB for the caching mode, because my numbers are indicating
that WT might actually be WB and WC might actually be UC. I'm not
100% sure that is indeed the case but at least my simple rendercopy
based benchmark didn't see any difference in performance.
Also if I configure things to do LLCeLLC+WT I still get cache dirt
on my screen, suggesting that is in fact operating in WB mode
anyway. This is also the reason I had to fix the MOCS target cache
to really say PTE rather than LLC+eLLC.
Since SKL the eLLC has been sitting on the far side of the system agent,
meaning the display engine can utilize it. Let's enable that.
Eero's earlier benchmarks numbers:
"* Results in GfxBench and Unigine (Valley/Heaven) tests were within daily
variation on the tested SKL machines
* SKL GT4e (128MB eLLC) / Wayland / Weston:
+15-20% SynMark TexMem512 (512MB of textures)
+4-6% SynMark TerrainFly*, CSCloth, ShMapVsm
-5-10% SynMark TexMem128 (128MB of textures)
* SKL GT3e (64MB eLLC) / Xorg / Unity:
+4-8% GpuTest Triangle fullscreen (FullHD)
-5-10% GpuTest Triangle windowed (1/2 screen)
* SKL GT2 (no eLLC) / Xorg / Unity:
* Some of the higher FPS SynMark pixel and vertex shader tests
are few percent higher, more than daily variance
=> Do you see any reason why this machine would be impacted
although it doesn't eLLC?"
Caveats:
- Still haven't tested with a prime setup
- Still not entirely sure this a good idea, but I've been
using it on my cfl anyway :)
v2: Split the MOCS PTE change out
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-3-ville.syrjala@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-3-chris@chris-wilson.co.uk
Currently we leave the cache_level of the initial fb obj
set to NONE. This means on eLLC machines the first pin_to_display()
will try to switch it to WT which requires a vma unbind+bind.
If that happens during the fbdev initialization rcu does not
seem operational which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level
as WT when creating it. We still do an excplicit ggtt pin
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.
Cc: <stable@vger.kernel.org> # v5.7+
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@chris-wilson.co.uk
Recently we came across requirement to identify EHL and JSL
platform to program them differently. Thus Split the basic
platform definition, macros, and PCI IDs to differentiate
between EHL and JSL platforms. Also, IS_ELKHARTLAKE is replaced
with IS_JSL_EHL everywhere.
Changes since V1 :
- Rebased to avoid merge conflicts
- Added missed check for jasperlake in intel_uc_fw.c
Cc : Matt Roper <matthew.d.roper@intel.com>
Cc : Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201013192948.63470-1-tejaskumarx.surendrakumar.upadhyay@intel.com
i915 does not want to see value entries. Switch it to use
find_lock_page() instead, and remove the export of find_lock_entry().
Move find_lock_entry() and find_get_entry() to mm/internal.h to discourage
any future use.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Link: https://lkml.kernel.org/r/20200910183318.20139-6-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order to avoid functional breakage of mis-programmed applications that
have grown to depend on unused MOCS entries, we are programming
those entries to be equal to fully cached ("L3 + LLC") entry.
These reserved and unspecified entries should not be used as they may be
changed to less performant variants with better coherency in the future
if more entries are needed.
v2: As suggested by Lucas De Marchi to utilise __init_mocs_table for
programming default value, setting I915_MOCS_PTE index of tgl_mocs_table
with desired value.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Mathew Alwin <alwin.mathew@intel.com>
Cc: Mcguire Russell W <russell.w.mcguire@intel.com>
Cc: Spruit Neil R <neil.r.spruit@intel.com>
Cc: Zhou Cheng <cheng.zhou@intel.com>
Cc: Benemelis Mike G <mike.g.benemelis@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729102539.134731-2-ayaz.siddiqui@intel.com
Cc: stable@vger.kernel.org
BOE panel with ID 2270 claims both PWM_PIN_CAP and AUX_SET_CAP backlight
control bits, but default chip backlight failed to control brightness.
Check AUX_SET_CAP and proceed to check quirks or VBT backlight type.
DPCD can control the brightness of this pannel.
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009085750.88490-1-aaron.ma@canonical.com
In commit 7994672309 ("drm/i915: Assume 100% brightness when not in
DPCD control mode"), we fixed the brightness level when DPCD control was
not active to max brightness. This is as good as we can guess since most
backlights go on full when uncontrolled.
However in doing so we changed the semantics of the initial
'backlight.enabled' value. At least on Pixelbooks, they were relying
on the brightness level in DP_EDP_BACKLIGHT_BRIGHTNESS_MSB to be 0 on
boot such that enabled would be false. This causes the device to be
enabled when the brightness is set. Without this, brightness control
doesn't work. So by changing brightness to max, we also flipped enabled
to be true on boot.
To fix this, make enabled a function of brightness and backlight control
mechanism.
Fixes: 7994672309 ("drm/i915: Assume 100% brightness when not in DPCD control mode")
Cc: Lyude Paul <lyude@redhat.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Kevin Chowski <chowski@chromium.org>>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918002845.32766-1-sean@poorly.run
When the number of potential color planes grew to 4 we stopped
setting all unused color plane offsets to ~0xfff. The code
still tries to do this, but actually does nothing since the
loop limits are bogus.
skl_check_main_surface() actually depends on this ~0xfff
behaviour as it will make sure to move the main surface
offset below the aux surface offset because the hardware
AUX_DIST must be a non-negative value [1], and for simplicity
it doesn't bother checking if the AUX plane is actually
needed or not. So currently it may end up shuffling the
main surface around based on some stale leftover AUX offset.
The skl+ plane code also just blindly calculates the AUX_DIST
whether or not the AUX plane is actually needed by the hw or
not, and that too will now potentially use some stale AUX
surface offset in the calculation. Would seem nicer to
guarantee a consistent non-negative AUX_DIST always.
So bring back the original ~0xfff offset behaviour for
unused color planes. Though it doesn't seem super likely
that this inconsistency would cause any real issues.
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Fixes: 2dfbf9d287 ("drm/i915/tgl: Gen-12 display can decompress surfaces compressed by the media engine")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201008101608.8652-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
(cherry picked from commit 79148ce4b2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The HDMI vs. not-HDMI check got inverted whem the bogus encoder->type
checks were eliminated. So now we're using 0 as the link rate on DP
and potentially non-zero on HDMI, which is exactly the opposite of
what we want. The original bogus check actually worked more correctly
by accident since if would always evaluate to true. Due to this we
now always use the RBR/HBR1 vswing table and never ever the HBR2+
vswing table. That is probably not a good way to get a high quality
signal at HBR2+ rates. Fix the check so we pick the right table.
Cc: stable@vger.kernel.org
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Fixes: 94641eb6c6 ("drm/i915/display: Fix the encoder type check")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930223642.28565-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
(cherry picked from commit 945b18fb48)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
- Make all debug object descriptors constant. There is no reason to have
them writeable.
- Free the per CPU object pool after CPU unplug to avoid memory waste.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+EMM8THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYobvLD/95P1eqk5Hku2uHeHowj9t4PVqk68Gq
71rJTj/x32JnGVsNFfwoqm0u8CueAitT1Id5aIjiMAGK2eH83StwdfqFjdHsitwl
4zH6qywjjgq86jOSXXUQ2TSyYjlL7l2Pr8jwEZkTXUrShZyHAOvD9oW3KwlyV88H
QXiTGFAm4aAaFctgGlbVafxtp1IJkS+FzFT1/cyPV064d3uGjX6maKwKjssO/8Bv
WLzPuyOKh7IoEMfyFEXw3Ks3GK3Fo9Rkm4+CNLxjWy7DX7tY2Xvu4kuPqQ0dyn31
zzxJHOESLOjtw8w0vSEKpuNyIfL/66/MF8p4CXzkUWQTa9h26yos+mDnSzxth++d
ZERCpx7udTG3BbteJ8ZEf3/QyH6kJw9RccDET180/CvhBTHELEsroy9qOfDMJkgt
RODtwh+2+O0zqUGjbqEd+PTmkU5p0UwIWAI9t7pic5Dntse7stRwotDtI0FGNnF1
aj/4ZkTb+lq3EF/x9xh6Hw/SfcVGtmySeMaPlaj9fgq39dtTPIspYeI5GieZlE0s
1ZfoRSgWA+/K9+4iTsuGvTwXnLhsRYVZl+GIkzeG2FL3euK0t+vaIvIbE9IDHdsS
mgEFaUXwIosHOfn+8f9KWl6YupBtT/4PyPVu5DseREl4up0W6s26aamM1Ml16A8x
JUNS/oa+cGpI4A==
=BP3+
-----END PGP SIGNATURE-----
Merge tag 'core-debugobjects-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects updates from Thomas Gleixner:
"A small set of updates for debug objects:
- Make all debug object descriptors constant. There is no reason to
have them writeable.
- Free the per CPU object pool after CPU unplug to avoid memory
waste"
* tag 'core-debugobjects-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debugobjects: Free per CPU pool after CPU unplug
treewide: Make all debug_obj_descriptors const
debugobjects: Allow debug_obj_descr to be const
The DP Standard's recommendation is to use the LTTPR non-transparent
mode link training if LTTPRs are detected, so let's do this.
Besides power-saving, the advantages of this are that the maximum number
of LTTPRs can only be used in non-transparent mode (the limit is 5-8 in
transparent mode), and it provides a way to narrow down the reason for a
link training failure to a given link segment. Non-transparent mode is
probably also the mode that was tested the most by the industry.
The changes in this patchset:
- Pass the DP PHY that is currently link trained to all LT helpers, so
that these can access the correct LTTPR/DPRX DPCD registers.
- During LT take into account the LTTPR common lane rate/count and the
per LTTPR-PHY vswing/pre-emph limits.
- Switch to LTTPR non-transparent LT mode and train each link segment
according to the sequence in DP Standard v2.0 (complete CR/EQ for
each segment before continuing with the next segment).
v2:
- Switch to non-transparent mode during connector detection, which is
required before reading the per-PHY LTTPR capabilities.
- Move the DP_PHY_LTTPR() macro to drm_dp_helper.h (Ville)
- Use the new drm_dp_dpcd_read_phy_link_status() instead of adding the
same logic to intel_dp_get_link_status(). (Ville)
- Make intel_dp_lttpr_phy_caps() return a pointer to the whole array
instead of a pointer to its first element. (Ville)
- Add the intel_dp_phy_is_downstream_of_source() helper. (Ville)
- Add a code comment about the disable->enable quirk of
non-transparent mode.
- Add the intel_dp_training_pattern_set_reg() helper.
- Fix checkpatch/sparse warns.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007170917.1764556-7-imre.deak@intel.com
By default LTTPRs should be in transparent link training mode,
nevertheless in this patch we switch to this default mode explicitly.
The DP Standard recommends this, supposedly because an LTTPR may be left
in the non-transparent mode (by BIOS, previous kernel, or after reset
due to a firmware bug). I haven't seen this happening, but let's follow
the DP Standard.
v2:
- Add a code comment about the explicit disabling of non-transparent
mode.
v3:
- Move check to prevent initing LTTPRs on eDP to init_dp_lttpr_init().
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007170917.1764556-6-imre.deak@intel.com
To prepare for a follow-up LTTPR change factor out a helper to disable
the training pattern in DPCD. We'll need to do this for each LTTPR
(without programming the port to output the idle pattern) when training
in LTTPR non-transparent mode.
While at it also move the disable-link-training logic from
intel_dp_set_link_train() to intel_dp_stop_link_train(), since the
latter is the only user of this.
v2:
- Move the disable-link-training logic to intel_dp_stop_link_train()
(Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007170917.1764556-4-imre.deak@intel.com
Split the prepare, link training, fallback-handling steps into their own
functions for clarity and as a preparation for the upcoming LTTPR
changes.
While at it also:
- Unexport and inline intel_dp_set_idle_link_train(), which is used at a
single place.
- Add some documentation to functions that are exported or that can use
a better description about which part of the LT sequence they
implement.
v2: (Ville)
- Unexport/inline intel_dp_set_idle_link_train()
- Make the documentation of
intel_dp_prepare_link_train()/intel_dp_stop_link_train() more accurate
wrt. HW specific details.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007170917.1764556-3-imre.deak@intel.com
An LTTPR can be trained with training pattern 4 even if the DPCD
revision is < 1.4, but drm_dp_training_pattern_mask() would change
pattern 4 to pattern 3 on those DPCD revisions.
Since intel_dp_training_pattern() makes already sure that the proper
training pattern is used, all that needs to be masked out is the
scrambling disable flag, which is or'd to the mask later based on the
training pattern.
v2:
- Use a helper instead of open-coding the masking. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007170917.1764556-2-imre.deak@intel.com
- Fix CRTC state checker (Ville)
Propated from drm-intel-gt-next:
- Avoid implicit vmpa for highmem on 32b (Chris)
- Prevent PAT attriutes for writecombine if CPU doesn't support PAT (Chris)
- Clear the buffer pool age before use. (Chris)
- Fix error code (Dan)
- Break up error capture compression loops (Chris)
- Fix uninitialized variable in context_create_request (Maarten)
- Check for errors on i915_vm_alloc_pt_stash to avoid NULL dereference (Matt)
- Serialize debugfs i915_gem_objects with ctx->mutex (Chris)
- Fix a rebase mistake caused during drm-intel-gt-next creation (Chris)
- Hold request reference for canceling an active context (Chris)
- Heartbeats fixes (Chris)
- Use usigned during batch copies (Chris)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAl93cBoACgkQ+mJfZA7r
E8q8pQf+KvebXTbfD217OMONXqPt0+EC85hqA1LHmMq0E4W/qh2XjO242bdNq9oJ
qLd6YxKOXcWHxlDn1dlPTGOAbpDWeTT228QfC/vQMyvHnWX00J1EXoXnl14gHs7w
rYUdpdmC0qW5W5oJjdUU1P3EprahmOr0XNTOURS8fiylZBo8vTm4H3kB4iVsLSrT
zpUOthQ3PomnOUTeSQVDeYFgT5+S79qguUq9u27DBj+kKwrdx3IeAMHHEtXzg2JD
AmKgjRxH5PyZny9roCoKhm/aA3Zx32CXZI/zW84sKg9/ryh3SGHbIwJbTRaNX8ub
qheu87bWxm1v0a/7ZUr0Frb7tzXpAA==
=vULA
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-fixes-2020-10-02' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Propagated from drm-intel-next-queued:
- Fix CRTC state checker (Ville)
Propated from drm-intel-gt-next:
- Avoid implicit vmpa for highmem on 32b (Chris)
- Prevent PAT attriutes for writecombine if CPU doesn't support PAT (Chris)
- Clear the buffer pool age before use. (Chris)
- Fix error code (Dan)
- Break up error capture compression loops (Chris)
- Fix uninitialized variable in context_create_request (Maarten)
- Check for errors on i915_vm_alloc_pt_stash to avoid NULL dereference (Matt)
- Serialize debugfs i915_gem_objects with ctx->mutex (Chris)
- Fix a rebase mistake caused during drm-intel-gt-next creation (Chris)
- Hold request reference for canceling an active context (Chris)
- Heartbeats fixes (Chris)
- Use usigned during batch copies (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002182610.GA2204465@intel.com
The updated bspec forcewake table also provides us with new multicast
ranges that should be reflected in our workaround code.
Note that there are different types of multicast registers with
different styles of replication and different steering registers. The
i915 MCR range lists we're updating here are only used to ensure we can
verify workarounds properly (i.e., if we can't steer register reads we
don't want to verify workarounds where an unsteered read might hit a
fused-off instance of the unit). Because of this, we don't need to
include any of the multicast ranges where all instances of the register
will always present and fusing doesn't play a role. Specifically, that
means that we are not including the MCR ranges designated as "SQIDI" in
the bspec.
Bspec: 66696
Cc: Caz Yokoyama <caz.yokoyama@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009194442.3668677-4-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
The bspec's forcewake page was very stale and out of date for recent
platforms. The hardware team finally provided us with an updated gen12
table (which applies to TGL, RKL, and DG1) and there are a lot of
changes.
v2:
- Add comments showing the subregions of ranges that we've combined for
ease of code review. (Jose)
- Rebase on the s/FORCEWAKE_BLITTER/FORCEWAKE_GT/ patch
Bspec: 66696
Cc: Caz Yokoyama <caz.yokoyama@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009194442.3668677-3-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
The power well that we've been referring to as the 'blitter' well is
actually more of a general GT power well which contains a lot of things
other than the blitter engine registers. The FORCEWAKE_BLITTER name in
the code was used for historic reasons, but no longer matches how the
bspec describes this power well and just causes confusion for people not
familiar with this area of the code. Let's rename it to FORCEWAKE_GT to
more accurately describe the role of the power well and match how the
modern bspec refers to it.
v2:
- Add a comment noting that the GT power well includes the blitter
engine. (Jose)
Bspec: 66696, 66534, 67609
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009194442.3668677-2-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Another step towards PSR2 selective fetch, here programming plane
selective fetch registers and MAN_TRK_CTL enabling selective fetch but
for now it is fetching the whole area of the planes.
The damaged area calculation will come as next and final step.
v2:
- removed warn on when no plane is visible in state
- removed calculations using plane damaged area in
intel_psr2_program_plane_sel_fetch()
v3:
- do not shift 16 positions the plane dst coordinates, only src is
shifted
v4:
- only setting PLANE_SEL_FETCH_CTL_ENABLE and MCURSOR_MODE in
PLANE_SEL_FETCH_CTL
v5:
- not masking bits for cursor
BSpec: 55229
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007195238.53955-3-jose.souza@intel.com
Due to the debugfs flag, has_psr2 in CRTC state could have a different
value than psr.psr2_enabled and it was causing PSR2 subfeatures(DC3CO
and selective fetch) to be set to not a expected state.
So here only taking in consideration the parameter and debugfs flag
when computing PSR state, this way the CRTC state will also have
the correct state.
intel_psr_fastset_force() was already broken as
intel_psr_compute_config() was already only enabling PSR when
psr_global_enabled() and all other PSR requirements are met.
So some changes was required in this function, now it iterates over
all connectors, if it is a eDP connector and is active force a modeset
in the CRTC driving this connector, what will cause the new PSR state
to be set based on the debugfs flag.
v2:
- end connector iterator in error cases
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007195238.53955-2-jose.souza@intel.com
For platforms without selective fetch this register is reserved so
do not write 0 to it.
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007195238.53955-1-jose.souza@intel.com
This will remove the "Expected child device config size for VBT
version 235 not known" debug message seen in TGL, although this is not
fixing anything it good to keep our VBT parser updated.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201008211932.24989-2-jose.souza@intel.com
Child min_brightness is obsolete from VBT 234+, instead the new
min_brightness field in the main structure should be used.
This new field is 16 bits wide, so backlight_precision_bits is needed
to check if value needs to be scaled down but it is only available in
VBT 236+ so working around it by using the also new backlight_level
in the main struct.
v2:
- missed that backlight_data->level is also obsolete
v3:
- s/backlight/brightness to better match specification
- using u16 to specify brightness level instead of a u32 : 16
BSpec: 20149
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201008211932.24989-1-jose.souza@intel.com
when the hardware isn't going to use the aux plane there's no
real point in dealing with the relevant hardware restrictions.
So let's just skip all that when not necessary.
We can now also remove the offset=~0xfff behaviour for unused
color planes. Let's just zero out everyting so as to not leave
stale garbage behind to confuse people debugging the code.
v2: Explicitly set AUX_DIST to zero when there is no aux plane
Reviewed-by: Imre Deak <imre.deak@intel.com> #v1
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009120028.32422-1-ville.syrjala@linux.intel.com
When the number of potential color planes grew to 4 we stopped
setting all unused color plane offsets to ~0xfff. The code
still tries to do this, but actually does nothing since the
loop limits are bogus.
skl_check_main_surface() actually depends on this ~0xfff
behaviour as it will make sure to move the main surface
offset below the aux surface offset because the hardware
AUX_DIST must be a non-negative value [1], and for simplicity
it doesn't bother checking if the AUX plane is actually
needed or not. So currently it may end up shuffling the
main surface around based on some stale leftover AUX offset.
The skl+ plane code also just blindly calculates the AUX_DIST
whether or not the AUX plane is actually needed by the hw or
not, and that too will now potentially use some stale AUX
surface offset in the calculation. Would seem nicer to
guarantee a consistent non-negative AUX_DIST always.
So bring back the original ~0xfff offset behaviour for
unused color planes. Though it doesn't seem super likely
that this inconsistency would cause any real issues.
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Fixes: 2dfbf9d287 ("drm/i915/tgl: Gen-12 display can decompress surfaces compressed by the media engine")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201008101608.8652-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
i915_{save,restore}_state() are actually all about the display.
Currently they are split into display part + SWF part. But since
the SWF part is also related to the display let's just move that
part into its own thing and flip the roles around so that the
current display part is the main function.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201005171441.26612-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
As with eDP and LVDS we should also respect the power cycle
delay on DSI panels. We are not using the power sequencer
for these, and we have no optimizations around the sleep
duration, so we just msleep() the whole thing away.
Note that the ICL+ DSI code doesn't seem to have any power
off/power cycle delay handling whatsoever. The only thing it
handles is the power on delay. As that looks pretty busted
in general I won't bother dealing with it for the time being.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001151640.14590-6-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Just like with eDP let's wait for the power sequencer power
cycle delay before we reboot the machine, as otherwise we
can't guarantee the panel's minimum power cycle delay will
be respected.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001151640.14590-5-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Extend the eDP panel power cycle delay wait on reboot handling
to cover all platforms. No reason to think that VLV/CHV are
in any way special since the documentation states that the
hardware power cycle delay goes back to its default value on
reset, and that may not be enough for all panels.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001151640.14590-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Currently VLV/CHV use a reboot notifier to make sure the panel
power cycle delay isn't violated across a system reboot. Replace
that with the new encoder .shutdown() hook.
And let's also stop overriding the power cycle delay with the
max value. No idea why the current code does that. The already
programmed delay should be correct.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001151640.14590-3-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Implement the pci .shutdown() hook in order to quiesce the
hardware prior to reboot. The main purpose here is to turn
all displays off. Some displays/other drivers tend to get
confused if the state after reboot isn't exactly as they
expected.
One specific example was the Dell UP2414Q in MST mode.
It would require me to pull the power cord after a reboot
or else it would just not come back to life. Sadly I don't
have that at hand anymore so not sure if it's still
misbehaving without the graceful shutdown, or if we
managed to fix something else since I last tested it.
For good measure we do a gem suspend as well, so that
we match the suspend flow more closely. Also stopping
all DMA and whatnot is probably a good idea for kexec.
I would expect that some kind of GT reset happens on
normal reboot so probably not totally necessary there.
v2: Use the pci .shutdown() hook instead of a reboot notifier (Lukas)
Do the gem suspend for kexec (Chris)
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001151640.14590-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
The only bit we use in PHY_MISC is DE_IO_COMP_PWR_DOWN, and the bspec
details for that bit tell us that it need only be set for PHY-A and
PHY-B. It also turns out that there isn't even an instance of the
PHY_MISC register for PHY-D on this platform. Let's extend the EHL/RKL
logic that conditionally skips PHY_MISC usage to DG1 as well.
Bspec: 50107
Cc: Aditya Swarup <aditya.swarup@intel.com>
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007002210.3678024-6-lucas.demarchi@intel.com
Add tables to map the GMBUS pin pairs to GPIO registers and port to DDC.
From spec we have registers GPIO_CTL[1-4], so we should not do the 4->9
mapping as in ICL/TGL.
The values for VBT seem wrong in BSpec. For the current boards we
actually have a 1:1 mapping.
BSpec: 49311, 49945, 20124
Cc: Aditya Swarup <aditya.swarup@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007002210.3678024-5-lucas.demarchi@intel.com
On DGFX the register range has been extended to go up to 8MB. However we
only actually use up to address 280000h, so let's increase it to 4MB.
v2 (Lucas): add bspec reference and reword commit message to explain
the 4 vs 8 MB used (requested by Matt Roper)
Bspec: 53616
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007002210.3678024-4-lucas.demarchi@intel.com
DG1 has a new MOCS table. We still use the old definition of the table,
but as for any dgfx card it doesn't contain the control_value values
(these values don't matter as we won't program them).
Bspec: 45101
v2: Reword the comment to state that the last few entries are reserved
instead of "the last two". DG1 reserves four instead of two from
previous platforms (from Matt Roper)
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007002210.3678024-3-lucas.demarchi@intel.com
DG1 always uses a 38.4 MHz rawclk rather than the 19.2/24 MHz
frequencies on CNP+. Note that register bits associated with this
frequency confusingly use 37 for the divider field rather than 38 as you
might expect.
For simplicity, let's just assume that this 38.4 MHz frequency will hold
true for other future platforms with "fake" PCH south displays and that
the CNP-style behavior will remain for other platforms with a real PCH.
Bspec: 49950
Bspec: 49309
Cc: Aditya Swarup <aditya.swarup@intel.com>
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007002210.3678024-2-lucas.demarchi@intel.com
Recent update in documentation defeatured eDP HBR3 for EHL and JSL.
v2:
- Remove dead code in ehl_get_combo_buf_trans()
v3:
- Rebase
BSpec: 32247
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Vidya Srinivas <vidya.srinivas@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201005175447.93430-1-jose.souza@intel.com
Since we track the idle_pulse for flushing the barriers and avoid
re-emitting the pulse upon idling if no futher action is required, this
also impacts the heartbeat. Before emitting a fresh heartbeat, we look
at the engine idle status and assume that if the pulse was the last
request emitted along the heartbeat, the engine is idling and a
heartbeat pulse not required. This assumption fails, but we can reuse
the idle pulse as the heartbeat if we are yet to emit one, and so track
the status of that pulse for our engine health check.
This impacts tgl/rcs0 as we rely on the heartbeat for our healthcheck for
the normal preemption detection mechanism is disabled by default.
Testcase: igt/gem_exec_schedule/preempt-hang/rcs0 #tgl
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006094653.7558-1-chris@chris-wilson.co.uk
As the previous patch fixed the places where we walk the whole scatterlist
for DMA addresses, this patch fixes the random lookup functionality.
To achieve this we have to add a second lookup iterator and add a
i915_gem_object_get_sg_dma helper, to be used analoguous to existing
i915_gem_object_get_sg_dma. Therefore two lookup caches are maintained per
object and they are flushed at the same point for simplicity. (Strictly
speaking the DMA cache should be flushed from i915_gem_gtt_finish_pages,
but today this conincides with unsetting of the pages in general.)
Partial VMA view is then fixed to use the new DMA lookup and properly
query sg length.
v2:
* Checkpatch.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Tom Murphy <murphyt7@tcd.ie>
Cc: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-2-tvrtko.ursulin@linux.intel.com
When walking DMA mapped scatterlists sg_dma_len has to be used since it
can be different (coalesced) from the backing store entry.
This also means we have to end the walk when encountering a zero length
DMA entry and cannot rely on the normal sg list end marker.
Both issues were there in theory for some time but were hidden by the fact
Intel IOMMU driver was never coalescing entries. As there are ongoing
efforts to change this we need to start handling it.
v2:
* Use unsigned int for local storing sg_dma_len. (Logan)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 85d1225ec0 ("drm/i915: Introduce & use new lightweight SGL iterators")
References: b31144c0da ("drm/i915: Micro-optimise gen6_ppgtt_insert_entries()")
Reported-by: Tom Murphy <murphyt7@tcd.ie>
Suggested-by: Tom Murphy <murphyt7@tcd.ie> # __sgt_iter
Suggested-by: Logan Gunthorpe <logang@deltatee.com> # __sgt_iter
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-1-tvrtko.ursulin@linux.intel.com
Currently we do a final scrub of the HW state upon release. However,
when rebinding the device, this is too late as the device may either
have been partially rebound or the device is no longer accessible. If
the device has been removed before release, the reset goes astray
leaving the device in an inconsistent state, unlikely to work without a
full PCI reset. Furthermore, if the device is partially rebound before
the HW scrubbing, there may be leftover HW state that should have been
scrubbed. Either way, we need to push the scrubbing earlier before the
removal, so into unregister. The danger is that on older machines,
resetting the GPU also impact the display engine and so the reset should
be after modesetting is disabled (and before reuse we need to recover
modesetting).
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2508
Testcase: igt/core_hotunplug
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200929112639.24223-1-chris@chris-wilson.co.uk
Apply Display WA #22010492432 for combo PHY PLLs too. This should fix a
problem where the PLL output frequency is slightly off with the current
PLL fractional divider value.
I haven't seen an actual case where this causes a problem, but let's
follow the spec. It's also needed on some EHL platforms, but for that we
also need a way to distinguish the affected EHL SKUs, so I leave that
for a follow-up.
v2:
- Apply the WA at one place when calculating the PLL dividers from the
frequency and the frequency from the dividers for all the combo PLL
use cases (DP, HDMI, TBT). (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201003001846.1271151-6-imre.deak@intel.com
Atm, if a full modeset is performed during the initial modeset the link
training will happen with uninitialized max DP rate and lane count. Make
sure the corresponding encoder state is initialized by adding an encoder
hook called during driver init and system resume.
A better alternative would be to store all states in the CRTC state and
make this state available for the link re-training code. Also instead of
the DPCD read in the hook there should be really a proper sink HW
readout in place. Both of these require a bigger rework, so for now opting
for this minimal fix to make at least full initial modesets work.
The patch is based on
https://patchwork.freedesktop.org/patch/101473/?series=10354&rev=3
v2: (Ville)
- s/sanitize_state/sync_state/
- No point in calling the hook when CRTC is disabled, remove the call.
- No point in calling the hook for MST, remove it.
v3: Check only DPCD_REV to avoid clobbering intel_dp->dpcd. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201005230154.1477653-1-imre.deak@intel.com
Some BIOSes set an unsupported/imprecise DP link rate (for instance on
TGL A stepping). Make sure that we do an encoder recompute and a modeset
in this case.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201003001846.1271151-4-imre.deak@intel.com
Move the checks to decide whether a fastset is possible during the
initial commit to an encoder hook. This check is really encoder specific
and the next patch will also require this adding a DP encoder specific
check.
v2: Fix negated condition in gen11_dsi_initial_fastset_check().
v3: Make sure to call the hook for all encoders on the crtc. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201005215311.1475666-1-imre.deak@intel.com
The BIOS of at least one ASUS-Z170M system with an SKL I have programs
the 101b WRPLL PDIV divider value, which is the encoding for PDIV=7 with
bit#0 incorrectly set.
This happens with the
"3840x2160": 30 262750 3840 3888 3920 4000 2160 2163 2168 2191 0x48 0x9
HDMI mode (scaled from a 1024x768 src fb) set by BIOS and the
ref_clock=24000, dco_integer=383, dco_fraction=5802, pdiv=7, qdiv=1, kdiv=1
WRPLL parameters (assuming PDIV=7 was the intended setting). This
corresponds to 262749 PLL frequency/port clock.
Later the driver sets the same mode for which it calculates the same
dco_int/dco_frac/div WRPLL parameters (with the correct PDIV=7 encoding).
Based on the above, let's assume that PDIV=7 was intended and the HW
just ignores bit#0 in the PDIV register field for this setting, treating
100b and 101b encodings the same way.
While at it add the MISSING_CASE() for the p0,p2 divider decodings.
v2: (Ville)
- Add a define for the incorrect divider value.
- Emit only a debug message when detecting the incorrect divider value.
- Use fallthrough from the incorrect divider value case.
- Add the MISSING_CASE()s.
v3: Return 0 freq for incorrect divider values. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006013555.1488262-1-imre.deak@intel.com
Though less likely in practice, igt uses MI_NOOP frequently to pad out
its batch buffers. The lookup and valiation of so many MI_NOOP command
descriptions is noticeable, though the side-effect of poisoning the
last-validated-command cache is more likely to impact upon real CS.
Testcase: igt/gen9_exec_parse/bb-large
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001102632.18789-1-chris@chris-wilson.co.uk
Extend __sg_alloc_table_from_pages to support dynamic allocation of
SG table from pages. It should be used by drivers that can't supply
all the pages at one time.
This function returns the last populated SGE in the table. Users should
pass it as an argument to the function from the second call and forward.
As before, nents will be equal to the number of populated SGEs (chunks).
With this new extension, drivers can benefit the optimization of merging
contiguous pages without a need to allocate all pages in advance and
hold them in a large buffer.
E.g. with the Infiniband driver that allocates a single page for hold the
pages. For 1TB memory registration, the temporary buffer would consume only
4KB, instead of 2GB.
Link: https://lore.kernel.org/r/20201004154340.1080481-2-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
DG1 does some additional pcode/uncore handshaking at
boot time; this handshaking must complete before various other pcode
commands are effective and before general work is submitted to the GPU.
We need to poll a new pcode mailbox during startup until it reports that
this handshaking is complete.
The bspec doesn't give guidance on how long we may need to wait for this
handshaking to complete. For now, let's just set a really long timeout;
if we still don't get a completion status by the end of that timeout,
we'll just continue on and hope for the best.
v2 (Lucas): Rename macros to make clear the relation between command and
result (requested by José)
Bspec: 52065
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001063917.3133475-2-lucas.demarchi@intel.com
When using fake lmem for tests, we are overriding the setting in
device info for dgfx devices. Current users of IS_DGFX() except one are
correct. However, as we add support for DG1, we are going to use it in
additional places to trigger dgfx-only code path.
In future if we need we can use HAS_LMEM() instead of IS_DGFX() in the
places that make sense to also contemplate fake lmem use.
v2: update gen8_gmch_probe() to use HAS_LMEM(): we need to steal the
mappable aperture later(which is fine since it doesn't exist on "DGFX"),
and use it as a substitute for LMEMBAR. The !mappable aperture property
is also useful since it exercises some other parts of the code too.
(Matthew Auld)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001063917.3133475-1-lucas.demarchi@intel.com
Make lspcon_init() static since it's no longer needed outside
the compilation unit. This was correct in Kai-Heng's lspcon
patch, but I fumbled this when applying it.
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: f542d671ff ("drm/i915: Init lspcon after HPD in intel_dp_detect()")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002090446.21104-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On HP 800 G4 DM, if HDMI cable isn't plugged before boot, the HDMI port
becomes useless and never responds to cable hotplugging:
[ 3.031904] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 3.031945] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port D
Seems like the lspcon chip on the system only gets powered after the
cable is plugged.
Consilidate lspcon_init() into lspcon_resume() to dynamically init
lspcon chip, and make HDMI port work.
v6:
- Rebase on latest for-linux-next.
v5:
- Consolidate lspcon_resume() with lspcon_init().
- Move more logic into lspcon code.
v4:
- Trust VBT in intel_infoframe_init().
- Init lspcon in intel_dp_detect().
v3:
- Make sure it's handled under long HPD case.
v2:
- Move lspcon_init() inside of intel_dp_hpd_pulse().
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/203
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200610075542.12882-1-kai.heng.feng@canonical.com
Now that we've plumbed the crtc state all the way down we can
eliminate the DP_TP_{CTL,STATUS} register offsets from intel_dp,
and instead we derive them directly from the crtc state.
And thus we can get rid of the nasty hack in intel_ddi_get_config()
which mutates intel_dp during the readout.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200929233449.32323-12-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Get rid of mode crtc->config usage, and some ad-hoc intel_dp state
usage by plumbing the crtc state all the way down to the link training
code.
Unfortunately we do have to keep some cached state in intel_dp so
that we can do the "does the link need retraining?" checks from
the short hpd handler.
v2: Add intel_crtc_state forward declaration
v3: Don't kill the PHY test code totally since it's
now in the hotplug work where we can get at the states
v4: Don't resurrect the debug scrambling disable bit (Imre)
Use intel_dp_mst_is_master_trans() (Imre)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001111053.24451-1-ville.syrjala@linux.intel.com
Doing any kind modeset stuff from the short hpd handler is
verboten. The ad-hoc PHY test modeset code violates this. And
by calling various link training related functions it's now
blocking further work to plumb the crtc state down into the
link training code.
Let's hack around that by pushing the PHY test stuff into the
hotplug work where it's less of a problem. Still not great but
at least acceptable. We take a few pages from the link retraining
handbook to handle the locking and whatnot.
v2: Fix the intel_dp_hotplug() return value
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930100412.9313-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The HDMI vs. not-HDMI check got inverted whem the bogus encoder->type
checks were eliminated. So now we're using 0 as the link rate on DP
and potentially non-zero on HDMI, which is exactly the opposite of
what we want. The original bogus check actually worked more correctly
by accident since if would always evaluate to true. Due to this we
now always use the RBR/HBR1 vswing table and never ever the HBR2+
vswing table. That is probably not a good way to get a high quality
signal at HBR2+ rates. Fix the check so we pick the right table.
Cc: stable@vger.kernel.org
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Fixes: 94641eb6c6 ("drm/i915/display: Fix the encoder type check")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930223642.28565-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Implement display w/a #1142. This supposedly fixes some underruns
with FBC+VTd. Bspec says we should use the same programming regardless
of circumstances. Apparently we should flip the magic bits before
turning on any planes so let's put this into the early w/as.
Cc: Lee Shawn C <shawn.c.lee@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924194810.10293-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
CNL+ can report DIMM sizes in .5 GB units. In order to not trauncate
away the .5 GB let's switch to storing the DIMM size in Gb units.
Cc: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200929131312.12999-1-ville.syrjala@linux.intel.com
Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.
Fixes: ed13033f02 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk
(cherry picked from commit b7eeb2b413)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Verify that if a context is active at the time it is closed, that it is
either persistent and preemptible (with hangcheck running) or it shall
be removed from execution.
Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-close
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-3-chris@chris-wilson.co.uk
(cherry picked from commit d3bb2f9b5e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently, we check we can send a pulse prior to disabling the
heartbeat to verify that we can change the heartbeat, but since we may
re-evaluate execution upon changing the heartbeat interval we need another
pulse afterwards to refresh execution.
v2: Tvrtko asked if we could reduce the double pulse to a single, which
opened up a discussion of how we should handle the pulse-error after
attempting to change the property, and the desire to serialise
adjustment of the property with its validating pulse, and unwind upon
failure.
Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk
(cherry picked from commit 3dd66a94de)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We only allow persistent requests to remain on the GPU past the closure
of their containing context (and process) so long as they are continuously
checked for hangs or allow other requests to preempt them, as we need to
ensure forward progress of the system. If we allow persistent contexts
to remain on the system after the the hangcheck mechanism is disabled,
the system may grind to a halt. On disabling the mechanism, we sent a
pulse along the engine to remove all executing contexts from the engine
which would check for hung contexts -- but we did not prevent those
contexts from being resubmitted if they survived the final hangcheck.
Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk
(cherry picked from commit 7a991cd3e3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The reordering and rebasing of commit 2e4c6c1a9d ("drm/i915: Remove
i915_request.lock requirement for execution callbacks") caused it to
revert an earlier correction. Let us restore commit 99f0a640d464
("drm/i915: Remove requirement for holding i915_request.lock for
breadcrumbs")
Fixes: 2e4c6c1a9d ("drm/i915: Remove i915_request.lock requirement for execution callbacks")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk
(cherry picked from commit 35faeb7de9)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Since the debugfs may peek into the GEM contexts as the corresponding
client/fd is being closed, we may try and follow a dangling pointer.
However, the context closure itself is serialised with the ctx->mutex,
so if we hold that mutex as we inspect the state coupled in the context,
we know the pointers within the context are stable and will remain valid
as we inspect their tables.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200723172119.17649-3-chris@chris-wilson.co.uk
(cherry picked from commit 102f5aa491)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In case backoff fails with an error, we return an undefined rq,
assign err to rq correctly.
Fixes: 8a929c9eb1 ("drm/i915: Use ww pinning for intel_context_create_request()")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918111208.1392128-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 4316b19dee)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As the error capture will compress user buffers as directed to by the
user, it can take an arbitrary amount of time and space. Break up the
compression loops with a call to cond_resched(), that will allow other
processes to schedule (avoiding the soft lockups) and also serve as a
warning should we try to make this loop atomic in the future.
Testcase: igt/gem_exec_capture/many-*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-2-chris@chris-wilson.co.uk
(cherry picked from commit 293f43c80c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This code should use "vma[1]" instead of "vma". The "vma" variable is a
valid pointer.
Fixes: 6b05030496 ("drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200911075243.GG12635@kadam
(cherry picked from commit 68ba71e3ae)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If we create a new node, it is possible for the slab allocator to return
us a recently freed node. If that node was just retired, it will retain
the current jiffy as its node->age. There is then a miniscule window,
where as that node is retired, it will appear on the free list with an
incorrect age and be eligible for reuse by one thread, and then by a
second thread as the correct node->age is written.
Fixes: 06b73c2d0b ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-3-chris@chris-wilson.co.uk
(cherry picked from commit 9bb34ff25c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Let's not try and use PAT attributes for I915_MAP_WC if the CPU doesn't
support PAT.
Fixes: 6056e50033 ("drm/i915/gem: Support discontiguous lmem object maps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-2-chris@chris-wilson.co.uk
(cherry picked from commit 121ba69ffd)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On 32b, highmem using a finite set of indirect PTE (i.e. vmap) to provide
virtual mappings of the high pages. As these are finite, map_new_virtual()
must wait for some other kmap() to finish when it runs out. If we map a
large number of objects, there is no method for it to tell us to release
the mappings, and we deadlock.
However, if we make an explicit vmap of the page, that uses a larger
vmalloc arena, and also has the ability to tell us to release unwanted
mappings. Most importantly, it will fail and propagate an error instead
of waiting forever.
Fixes: fb8621d3be ("drm/i915: Avoid allocating a vmap arena for a single page") #x86-32
References: e87666b52f ("drm/i915/shrinker: Hook up vmap allocation failure notifier")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.7+
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-1-chris@chris-wilson.co.uk
(cherry picked from commit 060bb115c2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If we manage to hit the intel_gt_set_wedged_on_fini() while active, i.e.
module unload during a stress test, we may cancel the requests but not
clean up. This leads to a very slow module unload as we wait for
something or other to trigger the retirement flushing, or timeout and
unload with a bunch of warnings. Instead if we explicitly cancel then
cleanup on an active unload, it should be instant and quiet.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930163253.2789-3-chris@chris-wilson.co.uk
Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.
Fixes: ed13033f02 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk
This patch updates dma_buf_vunmap() and dma-buf's vunmap callback to
use struct dma_buf_map. The interfaces used to receive a buffer address.
This address is now given in an instance of the structure.
Users of the functions are updated accordingly. This is only an interface
change. It is currently expected that dma-buf memory can be accessed with
system memory load/store operations.
v2:
* include dma-buf-heaps and i915 selftests (kernel test robot)
* initialize cma_obj before using it in drm_gem_cma_free_object()
(kernel test robot)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Tomasz Figa <tfiga@chromium.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925115601.23955-4-tzimmermann@suse.de
This patch updates dma_buf_vmap() and dma-buf's vmap callback to use
struct dma_buf_map.
The interfaces used to return a buffer address. This address now gets
stored in an instance of the structure that is given as an additional
argument. The functions return an errno code on errors.
Users of the functions are updated accordingly. This is only an interface
change. It is currently expected that dma-buf memory can be accessed with
system memory load/store operations.
v3:
* update fastrpc driver (kernel test robot)
v2:
* always clear map parameter in dma_buf_vmap() (Daniel)
* include dma-buf-heaps and i915 selftests (kernel test robot)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Tomasz Figa <tfiga@chromium.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925115601.23955-3-tzimmermann@suse.de
Verify that if a context is active at the time it is closed, that it is
either persistent and preemptible (with hangcheck running) or it shall
be removed from execution.
Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-close
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-3-chris@chris-wilson.co.uk
Currently, we check we can send a pulse prior to disabling the
heartbeat to verify that we can change the heartbeat, but since we may
re-evaluate execution upon changing the heartbeat interval we need another
pulse afterwards to refresh execution.
v2: Tvrtko asked if we could reduce the double pulse to a single, which
opened up a discussion of how we should handle the pulse-error after
attempting to change the property, and the desire to serialise
adjustment of the property with its validating pulse, and unwind upon
failure.
Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk
We only allow persistent requests to remain on the GPU past the closure
of their containing context (and process) so long as they are continuously
checked for hangs or allow other requests to preempt them, as we need to
ensure forward progress of the system. If we allow persistent contexts
to remain on the system after the the hangcheck mechanism is disabled,
the system may grind to a halt. On disabling the mechanism, we sent a
pulse along the engine to remove all executing contexts from the engine
which would check for hung contexts -- but we did not prevent those
contexts from being resubmitted if they survived the final hangcheck.
Fixes: 9a40bddd47 ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk
* DSI support for sm8150/sm8250
* Support for per-process GPU pagetables (finally!) for a6xx.
There are still some iommu/arm-smmu changes required to
enable, without which it will fallback to the current single
pgtable state. The first part (ie. what doesn't depend on
drm side patches) is queued up for v5.10[1].
* DisplayPort support. Userspace DP compliance tool support
is already merged in IGT[2]
* The usual assortment of smaller fixes/cleanups
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvqjuzH=Po_9EzzFsp2Xq3tqJUTKfsA2g09XY7_+6Ypfw@mail.gmail.com
Previously intel_dump_pipe_config() used to dump the full crtc state
whether or not the crtc was logically enabled or not. As that meant
occasionally dumping confusing stale garbage I changed it to
check whether the crtc is logically enabled or not. However I did
not realize that the state checker readout code does not
populate crtc_state.hw.{active,enabled}. Hence the state checker
dump would only give us a full dump of the sw state but not the hw
state. Fix that by populating those bits of the hw state as well.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Fixes: 10d75f5428 ("drm/i915: Fix plane state dumps")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925131656.10022-2-ville.syrjala@linux.intel.com
(cherry picked from commit 504c7bd85c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In case of DSI cmd mode, we get hw vblank counter updated after the TE
comes in, if we try to read the hw vblank counter in te handler we
wouldnt have the udpated vblank counter yet. This will lead to a state
where we would send the vblank event to the user space in the next te,
though the frame update would have completed in the first TE duration
itself. Hence switch to using software timestamp based vblank counter.
v2: Use mode_flags from crtc_state (Ville)
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924124209.17916-6-vandita.kulkarni@intel.com
In TE Gate mode or TE NO_GATE mode on every flip we need to set the
frame update request bit. After this bit is set transcoder hardware will
automatically send the frame data to the panel in case of TE NO_GATE
mode, where it sends after it receives the TE event in case of TE_GATE
mode. Once the frame data is sent to the panel, we see the frame counter
updating.
v2: Use intel_de_read/write
v3: remove the usage of private_flags
v4: Use icl_dsi in func names if non static,
fix code formatting issues. (Jani)
v5: Send frame update request at the beginning of
pipe_update_end, use crtc_state mode_flags (Ville)
v6: Add platform and dsi checks (Ville)
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928110834.15077-1-vandita.kulkarni@intel.com
In case of dual link, we get the TE on slave. So clear the TE on slave
DSI IIR.
If we are operating in TE_GATE mode, after we do a frame update, the
transcoder will send the frame data to the panel, after it receives a
TE. Whereas if we are operating in NO_GATE mode then the transcoder will
immediately send the frame data to the panel. We are not dealing with
the periodic command mode here.
v2: Pass only relevant masked bits to the handler (Jani)
v3: Fix the check for cmd mode in TE handler function.
v4: Use intel_handle_vblank instead of drm_handle_vblank (Jani)
v3: Use static on handler func (Jani)
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924124209.17916-4-vandita.kulkarni@intel.com
Configure TE interrupt as part of the vblank enable call flow.
v2: Hide the private flags check inside configure_te (Jani)
v3: Fix the position of masking de_port_masked for DSI_TE.
v4: Simplify the caller of configure_te (Jani)
v5: Clear IIR, remove the usage of private_flags
v6: including icl_dsi header is not needed
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924124209.17916-3-vandita.kulkarni@intel.com
We need details about enabling TE on which port before we enable TE
through vblank enable path. This is based on the configuration that we
receive from the VBT wrt ports, dual_link.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924124209.17916-2-vandita.kulkarni@intel.com
Dump the sizes of the software LUTs in the state dump. Makes
it a bit easier to see which is present without having to
decode it from the gamma_mode and other bits of state.
v2: Drop a spurious "is" in commit msg (Uma)
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925131656.10022-4-ville.syrjala@linux.intel.com
Previously intel_dump_pipe_config() used to dump the full crtc state
whether or not the crtc was logically enabled or not. As that meant
occasionally dumping confusing stale garbage I changed it to
check whether the crtc is logically enabled or not. However I did
not realize that the state checker readout code does not
populate crtc_state.hw.{active,enabled}. Hence the state checker
dump would only give us a full dump of the sw state but not the hw
state. Fix that by populating those bits of the hw state as well.
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Fixes: 10d75f5428 ("drm/i915: Fix plane state dumps")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925131656.10022-2-ville.syrjala@linux.intel.com
Enable asynchronous flips in i915 for gen9+ platforms.
v2: -Async flip enablement should be a stand alone patch (Paulo)
v3: -Move the patch to the end of the series (Paulo)
v4: -Rebased.
v5: -Rebased.
v6: -Rebased.
v7: -Rebased.
v8: -Rebased.
v9: -Rebased.
v10: -Rebased.
Signed-off-by: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921110210.21182-9-karthik.b.s@intel.com
Add the details of the implementation of asynchronous flips for i915.
v7: -Rebased.
v8: -Rebased.
v9: -Rebased.
v10: Move all documentation changes to this patch. (Ville)
Signed-off-by: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921110210.21182-8-karthik.b.s@intel.com
In Gen 9 and Gen 10 platforms, async address update enable bit is
double buffered. Due to this, during the transition from async flip
to sync flip we have to wait until this bit is updated before continuing
with the normal commit for sync flip.
v9: -Rename skl_toggle_async_sync() to skl_disable_async_flip_wa(). (Ville)
-Place the declarations appropriately as per need. (Ville)
-Take the lock before the reg read. (Ville)
-Fix comment and formatting. (Ville)
-Use IS_GEN_RANGE() for gen check. (Ville)
-Move skl_disable_async_flip_wa() to intel_pre_plane_update(). (Ville)
v10: -Rebased.
Signed-off-by: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921110210.21182-7-karthik.b.s@intel.com
This hook is added to avoid writing other plane registers in case of
async flips, so that we do not write the double buffered registers
during async surface address update.
v7: -Plane ctl needs bits from skl_plane_ctl_crtc as well. (Ville)
-Add a vfunc for skl_program_async_surface_address
and call it from intel_update_plane. (Ville)
v8: -Rebased.
v9: -Use if-else instead of return in intel_update_plane(). (Ville)
-Rename 'program_async_surface_address' to 'async_flip'. (Ville)
v10: -Check if async_flip hook is present before calling it.
Otherwise it will OOPS during legacy cursor updates. (Ville)
v11: -Rename skl_program_async_surface_address(). (Ville)
Signed-off-by: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921110210.21182-6-karthik.b.s@intel.com
Since the flip done event will be sent in the flip_done_handler,
no need to add the event to the list and delay it for later.
v2: -Moved the async check above vblank_get as it
was causing issues for PSR.
v3: -No need to wait for vblank to pass, as this wait was causing a
16ms delay once every few flips.
v4: -Rebased.
v5: -Rebased.
v6: -Rebased.
v7: -No need of irq disable if we are not doing vblank evade. (Ville)
v8: -Rebased.
v9: -Move the return in intel_pipe_update_end before tracepoint. (Ville)
v10: Rebased.
Signed-off-by: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921110210.21182-5-karthik.b.s@intel.com
If flip is requested on any other plane, reject it.
Make sure there is no change in fbc, offset and framebuffer modifiers
when async flip is requested.
If any of these are modified, reject async flip.
v2: -Replace DRM_ERROR (Paulo)
-Add check for changes in OFFSET, FBC, RC(Paulo)
v3: -Removed TODO as benchmarking tests have been run now.
v4: -Added more state checks for async flip (Ville)
-Moved intel_atomic_check_async to the end of intel_atomic_check
as the plane checks needs to pass before this. (Ville)
-Removed crtc_state->enable_fbc check. (Ville)
-Set the I915_MODE_FLAG_GET_SCANLINE_FROM_TIMESTAMP flag for async
flip case as scanline counter is not reliable here.
v5: -Fix typo and other check patch errors seen in CI
in 'intel_atomic_check_async' function.
v6: -Don't call intel_atomic_check_async multiple times. (Ville)
-Remove the check for n_planes in intel_atomic_check_async
-Added documentation for async flips. (Paulo)
v7: -Replace 'intel_plane' with 'plane'. (Ville)
-Replace all uapi.foo as hw.foo. (Ville)
-Do not use intel_wm_need_update function. (Ville)
-Add destination coordinate check. (Ville)
-Do not allow async flip with linear buffer
on older hw as it has issues with this. (Ville)
-Remove break after intel_atomic_check_async. (Ville)
v8: -Rebased.
v9: -Replace DRM_DEBUG_KMS with drm_dbg_kms(). (Ville)
-Fix comment formatting. (Ville)
-Remove gen specific checks. (Ville)
-Remove irrelevant FB size check. (Ville)
-Add missing stride check. (Ville)
-Use drm_rect_equals() instead of individual checks. (Ville)
-Call intel_atomic_check_async before state dump. (Ville)
v10: -Fix the checkpatch errors seen on CI.
v11: -Use const for all plane/crtc states. (Ville)
-Use 'switch' instead of 'if' for modifier check. (Ville)
-Move documentation changes to a single patch. (Ville)
Signed-off-by: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921110210.21182-4-karthik.b.s@intel.com
Set the Async Address Update Enable bit in plane ctl
when async flip is requested.
v2: -Move the Async flip enablement to individual patch (Paulo)
v3: -Rebased.
v4: -Add separate plane hook for async flip case (Ville)
v5: -Rebased.
v6: -Move the plane hook to separate patch. (Paulo)
-Remove the early return in skl_plane_ctl. (Paulo)
v7: -Move async address update enable to skl_plane_ctl_crtc() (Ville)
v8: -Rebased.
v9: -Rebased.
v10: -Rebased.
Signed-off-by: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921110210.21182-3-karthik.b.s@intel.com
Add enable/disable flip done functions and the flip done handler
function which handles the flip done interrupt.
Enable the flip done interrupt in IER.
Enable flip done function is called before writing the
surface address register as the write to this register triggers
the flip done interrupt
Flip done handler is used to send the page flip event as soon as the
surface address is written as per the requirement of async flips.
The interrupt is disabled after the event is sent.
v2: -Change function name from icl_* to skl_* (Paulo)
-Move flip handler to this patch (Paulo)
-Remove vblank_put() (Paulo)
-Enable flip done interrupt for gen9+ only (Paulo)
-Enable flip done interrupt in power_well_post_enable hook (Paulo)
-Removed the event check in flip done handler to handle async
flips without pageflip events.
v3: -Move skl_disable_flip_done out of interrupt handler (Paulo)
-Make the pending vblank event NULL in the beginning of
flip_done_handler to remove sporadic WARN_ON that is seen.
v4: -Calculate timestamps using flip done time stamp and current
timestamp for async flips (Ville)
v5: -Fix the sparse warning by making the function 'g4x_get_flip_counter'
static.(Reported-by: kernel test robot <lkp@intel.com>)
-Fix the typo in commit message.
v6: -Revert back to old time stamping code.
-Remove the break while calling skl_enable_flip_done. (Paulo)
v7: -Rebased.
v8: -Rebased.
v9: -Use struct drm_i915_private *i915 in new code. (Ville)
-Use intel_crtc instead of drm_crtc. (Ville)
-Do not mix the flip done and vblank hooks. (Ville)
v10: -Rebased.
Signed-off-by: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921110210.21182-2-karthik.b.s@intel.com
The reordering and rebasing of commit 2e4c6c1a9d ("drm/i915: Remove
i915_request.lock requirement for execution callbacks") caused it to
revert an earlier correction. Let us restore commit 99f0a640d464
("drm/i915: Remove requirement for holding i915_request.lock for
breadcrumbs")
Fixes: 2e4c6c1a9d ("drm/i915: Remove i915_request.lock requirement for execution callbacks")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk
GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in i915.
v2:
* move object-function instance to i915_gem_object.c (Jani)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200923102159.24084-7-tzimmermann@suse.de
This should make it harder for the kernel to corrupt the debug object
descriptor, used to call functions to fixup state and track debug objects,
by moving the structure to read-only memory.
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200815004027.2046113-3-swboyd@chromium.org
When validating a "YCbCr 4:2:0 only" mode we must take into
account the fact that we're going to be outputting YCbCr
4:2:0 or 4:4:4 (when a DP->HDMI protocol converter is doing
the 4:2:0 downsampling). For YCbCr 4:4:4 the minimum output
bpc is 8, for YCbCr 4:2:0 it'll be half that. The currently
hardcoded 6bpc is only correct for RGB 4:4:4, which we will
never use with these kinds of modes. Figure out what we're
going to output and use the correct min bpp value to validate
whether the link has sufficient bandwidth.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917214335.3569-3-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Pass the output_format directly to intel_dp_{min,output}_bpp()
rather than passing in the crtc_state and digging out the
output_format inside the functions. This will allow us to reuse
the functions for mode validation purposes.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917214335.3569-2-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Since the debugfs may peek into the GEM contexts as the corresponding
client/fd is being closed, we may try and follow a dangling pointer.
However, the context closure itself is serialised with the ctx->mutex,
so if we hold that mutex as we inspect the state coupled in the context,
we know the pointers within the context are stable and will remain valid
as we inspect their tables.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200723172119.17649-3-chris@chris-wilson.co.uk
When roll over detected for seq_num_m, we shouldn't continue with stream
management with rolled over value.
So we are terminating the stream management retry, on roll over of the
seq_num_m.
v2:
using drm_dbg_kms instead of DRM_DEBUG_KMS [Anshuman]
v3:
dev_priv is used as i915 [JaniN]
v4:
roll over is detected at the start of the stream management.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> [v3]
Tested-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200923132435.17039-3-anshuman.gupta@intel.com
As per the HDCP2.2 compliance test 1B-10 expectation, when stream
management for a repeater fails, we retry thrice and when it fails
in all retries, HDCP2.2 reauthentication aborted at kernel.
v2:
seq_num_m++ is extended for steam management failures too.[Anshuman]
v3:
use drm_dbg_kms instead of DRM_DEBUG_KMS [Anshuman]
v4:
dev_priv is used as i915 [JaniN]
v5:
Few improvisements are done [Sean]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Tested-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200923132435.17039-2-anshuman.gupta@intel.com
The GuC communication enabled/disabled messages are too noisy in info
level. Convert them from info to debug level, and switch to device based
logging while at it.
Reported-by: Karol Herbst <kherbst@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917165056.29766-1-jani.nikula@intel.com
Since we store a pointer to the fake iommu device that is allocated on
the stack, as soon as we leave the function it goes out of scope and any
future dereference is undefined behaviour. Just in case we may need to
look at the fake iommu device after initialiation, move the allocation
from the stack into the data.
Fixes: 01b9d4e211 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916105022.28316-2-chris@chris-wilson.co.uk
(cherry picked from commit 9f9f4101fc)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- dev: More devm_drm convertions and removal of drm_dev_init
Driver Changes:
- i915: selftests improvements
- panfrost: support for Amlogic SoC
- vc4: one fix
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX2jGxQAKCRDj7w1vZxhR
xR3DAQCiZOnaxVcY49iG4343Z1aHHaIEShbnB0bDdaWstn7kiQD/UXBXUoOSFoFQ
FkTsW31JsdXNnWP5e6/eJd2Lb6waVAA=
=VlsU
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-09-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.10:
UAPI Changes:
Cross-subsystem Changes:
- virtio: Merged a PR for patches that will affect drm/virtio
Core Changes:
- dev: More devm_drm convertions and removal of drm_dev_init
- atomic: Split out drm_atomic_helper_calc_timestamping_constants of
drm_atomic_helper_update_legacy_modeset_state
- ttm: More rework
Driver Changes:
- i915: selftests improvements
- panfrost: support for Amlogic SoC
- vc4: one fix
- tree-wide: conversions to devm_drm_dev_alloc,
- ast: simplifications of the atomic modesetting code
- panfrost: multiple fixes
- vc4: multiple fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921152956.2gxnsdgxmwhvjyut@gilmour.lan
- Reduce INTEL_DISPLAY_ENABLED to just removed outputs treating it as disconnected (Ville)
- Introducing new AUX, DVO, and TC ports and refactoring code around hot plug interrupts for those. (Ville)
- Centralize PLL_ENABLE register lookup (Anusha)
- Improvements around DP downstream facing ports (DFP). (Ville)
- Enable YCbCr 444->420 conversion for HDMI DFPs. Ville
- Remove the old global state on Display's atomic modeset (Ville)
- Nuke force_min_cdclk_changed (Ville)
- Extend a TGL W/A to all SKUs and to RKL (Swathi)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAl9jzm4ACgkQ+mJfZA7r
E8otXgf/V0gGTWSo/CUiBIDjW6jn9f/1pRmF2W0a0M8duLwlFMGVj/TGecgHRTNf
ZHd66tqb7v2wxc+YouGYYZNcKyWwdH8nhTEjn8Zt3cIc2lweh3cWKIr0S+MiBQGo
klaq+knIbr9gk3tJS4gvM0OQv0lPoXp6Gu8FsTAfmvkdt8L93OeNpmbA4TtSFbv5
sVm6e4LWI36TZuDO5VRDHTfLrQ7XkVte5sk2CzRRap+L77+RpBwD8p+QovRmNK4Q
hTlfDHrLZR2XGpeTlqnqfzYq210hNyDspdhTENcnFrrxtB6pvd/CfOGRqEm/9MvX
A37jLJfTtpeLRlQJDPt3KzOG581a+A==
=gR/1
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2020-09-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:
- Reduce INTEL_DISPLAY_ENABLED to just removed outputs treating it as disconnected (Ville)
- Introducing new AUX, DVO, and TC ports and refactoring code around hot plug interrupts for those. (Ville)
- Centralize PLL_ENABLE register lookup (Anusha)
- Improvements around DP downstream facing ports (DFP). (Ville)
- Enable YCbCr 444->420 conversion for HDMI DFPs. Ville
- Remove the old global state on Display's atomic modeset (Ville)
- Nuke force_min_cdclk_changed (Ville)
- Extend a TGL W/A to all SKUs and to RKL (Swathi)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918173013.GA748558@intel.com
To avoid having to create all the device and driver scaffolding we
just manually create and destroy a devres_group.
v2: Rebased
v3: use devres_open/release_group so we can use devm without real
hacks in the driver core or having to create an entire fake bus for
testing drivers. Might want to extract this into helpers eventually,
maybe as a mock_drm_dev_alloc or test_drm_dev_alloc.
v4:
- Fix IS_ERR handling (Matt)
- Delete surplus put_device() in mock_device_release (intel-gfx-ci)
v5:
- do not switch to device_add - it breaks runtime pm in the tests and
with the devres_group_add/release no longer needed for automatic
cleanup (CI). Update commit message to match.
- print correct error in pr_err (Matt)
v6: Remove now unused err variable (CI).
v7: More warning fixes ...
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (v3)
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> (v4)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200919134032.2488403-1-daniel.vetter@ffwll.ch
Just some prep work before we rework the lifetime handling, which
requires replacing all the drm_dev_put in selftests by something else.
v2: Don't go with a static inline, upsets the header tests and
separation.
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918132505.2316382-2-daniel.vetter@ffwll.ch
As the last user was eliminated in commit e21fecdcde40 ("drm/i915/gt:
Distinguish the virtual breadcrumbs from the irq breadcrumbs"), we can
remove the function. One less implementation detail creeping beyond its
scope.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826132811.17577-16-chris@chris-wilson.co.uk
Shrink the hold time for the error capture mutex to just around the
acquire/release of the PTE used for reading back the object via the
Global GTT. For platforms that do not need the GGTT read back, we can
skip the mutex entirely and allow concurrent error capture. Where we do
use the GGTT, by restricting the hold time around the slow readback and
compression, we are more resilient against softlockups (khungtaskd) as
the heartbeat may well also trigger an error while the first is on
going, and this allows the heartbeat reset to skip past the capture and
not be stalled.
Testcase: igt/gem_exec_capture/many-*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-3-chris@chris-wilson.co.uk
As the error capture will compress user buffers as directed to by the
user, it can take an arbitrary amount of time and space. Break up the
compression loops with a call to cond_resched(), that will allow other
processes to schedule (avoiding the soft lockups) and also serve as a
warning should we try to make this loop atomic in the future.
Testcase: igt/gem_exec_capture/many-*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-2-chris@chris-wilson.co.uk
When debugging the engine state, include the user properties that may,
or may not, have been adjusted by the user/test.
For example,
vecs0
...
Properties:
heartbeat_interval_ms: 2500 [default 2500]
max_busywait_duration_ns: 8000 [default 8000]
preempt_timeout_ms: 640 [default 640]
stop_timeout_ms: 100 [default 100]
timeslice_duration_ms: 1 [default 1]
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-1-chris@chris-wilson.co.uk
Since we now have proper old and new cdclk state we no longer
need to keep this flag to indicate that the force min cdclk has
changed. Instead just check if the old vs. new value are different.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200714152626.380-4-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
For platforms that can't do native 4:2:0 outout we may still be
able to do it by getting the DP->HDMI protocol converter to
perform the 4:4:4->4:2:0 downsamling for us. In this case we
have to configure our hardware to output YCbCr 4:4:4, which we've
already hooked up so all we need to do is flip the switch.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-19-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Account for the TMDS clock limits declared by the DFP
when determining what color depth we're going to use.
v2: Drop the reference to DP++ dongle since it's not handled here
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-17-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pull the "do we want to enable audio?" computation into a small helper
to make intel_hdmi_compute_config() less messy. Will make it easier to
add more checks for this later (eg. we should actually be checking
at the hblank is long enough for audio transmission).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-16-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
DP 1.3 adds some extra control knobs for DP->HDMI protocol conversion.
Let's use that to configure the "HDMI mode" (ie. infoframes vs. not)
based on the capabilities of the sink.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-13-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Use the new helpers to extract the TMDS clock limits from
the downstream facing port and check them in .mode_valid().
TODO: we should check these in .compute_config() too to eg.
determine if we can do deep color on the HDMI side or not
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-12-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Move the downstream facing port dotclock check into a new function
(intel_dp_mode_valid_downstream()) so that we have a nice future
place where we can collect other related checks.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-10-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We want to differentiate between the DFP dotclock and TMDS clock
limits. Let's convert the current thing to just give us the
dotclock limit.
v2: Use Returns: for kdoc (Lyude)
Fix up nouveau code too
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-9-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Deal with more cases in drm_dp_downstream_max_bpc():
- DPCD 1.0 -> assume 8bpc for non-DP
- DPCD 1.1+ DP (or DP++ with DP sink) -> allow anything
- DPCD 1.1+ TMDS -> check the caps, assume 8bpc if the value is crap
- anything else -> assume 8bpc
v2: Use Returns: for kdoc (Lyude)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-8-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Stash the downstream facing port max bpc away during
intel_dp_set_edid(). We'll soon need the EDID in there so
we can't figure this out so easily during .compute_config() anymore.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-6-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Non-HDMI sinks shouldn't be sent infoframes. Check for that when
using LSPCON.
FIXME: How do we turn off infoframes once enabled? Do we even
have to?
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200904115354.25336-3-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This code should use "vma[1]" instead of "vma". The "vma" variable is a
valid pointer.
Fixes: 6b05030496 ("drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200911075243.GG12635@kadam
Please pull a set of fixes for various DRM drivers that finally resolve
incorrect usage of the scatterlists (struct sg_table nents and orig_nents
entries), what causes issues when IOMMU is used.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910080505.24456-1-m.szyprowski@samsung.com
Since we store a pointer to the fake iommu device that is allocated on
the stack, as soon as we leave the function it goes out of scope and any
future dereference is undefined behaviour. Just in case we may need to
look at the fake iommu device after initialiation, move the allocation
from the stack into the data.
Fixes: 01b9d4e211 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916105022.28316-2-chris@chris-wilson.co.uk
Just in case the caller passes in 0 for both slow&fast timeouts, make
sure we initialise the stack value returned. Add an assert so that we
don't make the mistake of passing 0 timeouts for the wait.
drivers/gpu/drm/i915/intel_uncore.c:2011 __intel_wait_for_register_fw() error: uninitialized symbol 'reg_value'.
References: 3f649ab728 ("treewide: Remove uninitialized_var() usage")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916105022.28316-1-chris@chris-wilson.co.uk
(NOTE: This is the minimal backportable fix, a full fix is being
developed at https://patchwork.freedesktop.org/patch/388048/)
The flags passed to the wait_entry.func are passed onwards to
try_to_wake_up(), which has a very particular interpretation for its
wake_flags. In particular, beyond the published WF_SYNC, it has a few
internal flags as well. Since we passed the fence->error down the chain
via the flags argument, these ended up in the default_wake_function
confusing the kernel/sched.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2110
Fixes: ef46884975 ("drm/i915: Propagate fence errors")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152144.1100-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added a note and link about more complete fix]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit f4b3c39554)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
To implement preempt-to-busy (and so efficient timeslicing and best utilization
of the hardware submission ports) we let the GPU run asynchronously in respect
to the ELSP submission queue. This created challenges in keeping and accessing
the driver state mirroring the asynchronous GPU execution.
The latest occurence of this was spotted by KCSAN:
[ 1413.563200] BUG: KCSAN: data-race in __await_execution+0x217/0x370 [i915]
[ 1413.563221]
[ 1413.563236] race at unknown origin, with read to 0xffff88885bb6c478 of 8 bytes by task 9654 on cpu 1:
[ 1413.563548] __await_execution+0x217/0x370 [i915]
[ 1413.563891] i915_request_await_dma_fence+0x4eb/0x6a0 [i915]
[ 1413.564235] i915_request_await_object+0x421/0x490 [i915]
[ 1413.564577] i915_gem_do_execbuffer+0x29b7/0x3c40 [i915]
[ 1413.564967] i915_gem_execbuffer2_ioctl+0x22f/0x5c0 [i915]
[ 1413.564998] drm_ioctl_kernel+0x156/0x1b0
[ 1413.565022] drm_ioctl+0x2ff/0x480
[ 1413.565046] __x64_sys_ioctl+0x87/0xd0
[ 1413.565069] do_syscall_64+0x4d/0x80
[ 1413.565094] entry_SYSCALL_64_after_hwframe+0x44/0xa9
To complicate matters, we have to both avoid the read tearing of *active and
avoid any write tearing as perform the pending[] -> inflight[] promotion of the
execlists.
This is because we cannot rely on the memcpy doing u64 aligned copies on all
kernels/platforms and so we opt to open-code it with explicit WRITE_ONCE
annotations to satisfy KCSAN.
v2: When in doubt, write the same comment again.
v3: Expanded commit message.
Fixes: b55230e5e8 ("drm/i915: Check for awaits on still currently executing requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit b4d9145b01)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
As we now protect the timeline list using RCU, we can drop the
timeline->mutex for guarding the list iteration during context close, as
we are searching for an inflight request. Any new request will see the
context is banned and not be submitted. In doing so, pull the checks for
a concurrent submission of the request (notably the
i915_request_completed()) under the engine spinlock, to fully serialise
with __i915_request_submit()). That is in the case of preempt-to-busy
where the request may be completed during the __i915_request_submit(),
we need to be careful that we sample the request status after
serialising so that we don't miss the request the engine is actually
submitting.
Fixes: 4a31741521 ("drm/i915/gem: Refine occupancy test in kill_context()")
References: d22d2d073e ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622
References: https://gitlab.freedesktop.org/drm/intel/-/issues/2158
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200806105954.7766-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 736e785f9b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Avoid exposing a partially constructed context by deferring the
list_add() from the initial construction to the end of registration.
Otherwise, if we peek into the list of contexts from inside debugfs, we
may see the partially constructed context and chase down some dangling
incomplete pointers.
Reported-by: CQ Tang <cq.tang@intel.com>
Fixes: 3aa9945a52 ("drm/i915: Separate GEM context construction and registration to userspace")
References: f6e8aa3871 ("drm/i915: Report the number of closed vma held by each context in debugfs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730092856.23615-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit eb4dedae92)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We currenty check for platform at multiple parts in the driver
to grab the correct PLL. Let us begin to centralize it through a
helper function.
v2: s/intel_get_pll_enable_reg()/intel_combo_pll_enable_reg() (Ville)
v3: Clean up combo_pll_disable() (Rodrigo)
v4: s/dev_priv/i915 (Jani)
Move static and return type to the same line( Ville, Jani)
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200914175703.15024-1-anusha.srivatsa@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The constant N value (0x8000) is used by i915 DP
driver. Define this value in dp helper header file
to use in multiple Display Port drivers. Change
i915 driver accordingly.
Change in v6: Change commit message
Signed-off-by: Chandan Uddaraju <chandanu@codeaurora.org>
Signed-off-by: Vara Reddy <varar@codeaurora.org>
Signed-off-by: Tanmay Shah <tanmay@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Introduce intel_hpd_hotplug_irqs() as a partner to
intel_hpd_enabled_irqs(). There's no need to care about the
encoders which we're not exposing, so we can avoid hardcoding
the masks in various places.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200630215601.28557-12-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Make a clean split between hpd pins for DDI vs. TC. This matches
how the actual hardware is split.
And with this we move the DDI/PHY->HPD pin mapping into the encoder
init instead of having to remap yet again in the interrupt code.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200630215601.28557-11-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Currently DP/HDMI/DDI encoders init their hpd_pin from the
connector init. Let's move it to the encoder init so that
we don't need to add platform specific junk to the connector
init (which is shared by all g4x+ platforms).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200630215601.28557-10-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
gen11_hpd_detection_setup() is missing ports TC5/6. Add them.
TODO: Might be nice to only enable the hpd detection logic
for ports we actually have. Should be rolled out for all
platforms if/when done...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200630215601.28557-8-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
If we find the GPU didn't update the CSB within 50us, we currently fail
and eventually reset the GPU. Lets report the value from the mmio space
as a last resort, it may just stave off an unnecessary GPU reset.
References: HSDES#22011327657
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-4-chris@chris-wilson.co.uk
Since we expect to inline the csb_parse() routines, the w/a for the
stale CSB data on Tigerlake will be pulled into process_csb(), and so we
might as well simply reuse the logic for all, and so will hopefully
avoid any strange behaviour on Icelake that was not covered by our
previous w/a.
References: d8f5053117 ("drm/i915/icl: Forcibly evict stale csb entries")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-3-chris@chris-wilson.co.uk
On Tigerlake, we are seeing a repeat of commit d8f5053117 ("drm/i915/icl:
Forcibly evict stale csb entries") where, presumably, due to a missing
Global Observation Point synchronisation, the write pointer of the CSB
ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
we see the GPU report more context-switch entries for us to parse, but
those entries have not been written, leading us to process stale events,
and eventually report a hung GPU.
However, this effect appears to be much more severe than we previously
saw on Icelake (though it might be best if we try the same approach
there as well and measure), and Bruce suggested the good idea of resetting
the CSB entry after use so that we can detect when it has been updated by
the GPU. By instrumenting how long that may be, we can set a reliable
upper bound for how long we should wait for:
513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
References: d8f5053117 ("drm/i915/icl: Forcibly evict stale csb entries")
References: HSDES#22011327657, HSDES#1508287568
Suggested-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org # v5.4
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-2-chris@chris-wilson.co.uk
A CSB entry is 64b, and it is simpler for us to treat it as an array of
64b entries than as an array of pairs of 32b entries.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-1-chris@chris-wilson.co.uk
Since the display hardware is all there even when INTEL_DISPLAY_ENABLED
return false we have to be capable of shutting it down cleanly so
as to not anger the hw. To that end let's reduce the effect of
!INTEL_DISPLAY_ENABLE to just treating all outputs as disconnected.
Should prevent anyone from automagically enabling any of them, while
still allowing us to cleanly shut them down.
v2: Put the check into the right place for CRT
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910164256.25983-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Having a mode where the display hardware is present but we try
to pretend it isn't just leads to massive headaches when trying
to reason what the fallout might be from skipping some random
bits of programming.
Let's just neuter INTEL_DISPLAY_ENABLED so that we treat the
hardware as fully present, except we just don't register any
outputs. That's still rather sketchy if the outputs are already
enabled when the driver is loaded. I think the simplest solution
would be to probe everything as normal and just return
disconnected" from all .detect() hooks. That would avoid anything
automagically enabling those outputs, but the driver could then
shut things down using the normal codepaths.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200909213824.12390-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
If we create a new node, it is possible for the slab allocator to return
us a recently freed node. If that node was just retired, it will retain
the current jiffy as its node->age. There is then a miniscule window,
where as that node is retired, it will appear on the free list with an
incorrect age and be eligible for reuse by one thread, and then by a
second thread as the correct node->age is written.
Fixes: 06b73c2d0b ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-3-chris@chris-wilson.co.uk
On 32b, highmem using a finite set of indirect PTE (i.e. vmap) to provide
virtual mappings of the high pages. As these are finite, map_new_virtual()
must wait for some other kmap() to finish when it runs out. If we map a
large number of objects, there is no method for it to tell us to release
the mappings, and we deadlock.
However, if we make an explicit vmap of the page, that uses a larger
vmalloc arena, and also has the ability to tell us to release unwanted
mappings. Most importantly, it will fail and propagate an error instead
of waiting forever.
Fixes: fb8621d3be ("drm/i915: Avoid allocating a vmap arena for a single page") #x86-32
References: e87666b52f ("drm/i915/shrinker: Hook up vmap allocation failure notifier")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.7+
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-1-chris@chris-wilson.co.uk
We update the timestamping constants per-crtc explicitly in
intel_crtc_update_active_timings(). Furtermore the helper will
use uapi.adjusted_mode whereas we want hw.adjusted_mode. Thus
let's drop the helper call an rely on what we already have in
intel_crtc_update_active_timings(). We can now also drop the
hw.adjusted_mode -> uapi.adjusted_mode copy hack that was added
to keep the helper from deriving the timestamping constants from
the wrong thing.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907120026.6360-3-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The timestamping constants have nothing to do with any legacy state
so should not be updated from
drm_atomic_helper_update_legacy_modeset_state().
Let's make everyone call drm_atomic_helper_calc_timestamping_constants()
directly instead of relying on
drm_atomic_helper_update_legacy_modeset_state() to call it.
@@
expression S;
@@
- drm_atomic_helper_calc_timestamping_constants(S);
@@
expression D, S;
@@
drm_atomic_helper_update_legacy_modeset_state(D, S);
+ drm_atomic_helper_calc_timestamping_constants(S);
v2: Update drm_crtc_vblank_helper_get_vblank_timestamp{,_internal}() docs (Daniel)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907120026.6360-2-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9epdgeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG9IMH/jHCRSbcsIXHuQHn
xcRLlhrDHfXoBza7auHfPWx2+9DZsmaSJs/SEiTGNag0Bi7jBcWcwBpsep7iVG/+
WiftD5uOMhZigyuvfMFrt0mjr2Kr3wg5p58lwMBeBdm8iL5uKV8ehKsh05/Fral2
6hu3jP8L0PCZMpF+sZ7s2jlhfVUMmjA8VzXZCvgQtmhoraHiF3mzfkcSMxnHwBPO
HLo+TDDm49u+LbVsJT7+cSTiWxuUJCbix9Q4PCTx/BGg4ezYsjc6v0BnYRaYtrrA
1uYiT6PVBEUkYYBHKQlD3N2KnUmbKx7dGUF4t+peTg5/JiocAJMNi1N9Qzvv7N6Q
CqTiuio=
=q+kJ
-----END PGP SIGNATURE-----
Merge v5.9-rc5 into drm-next
Paul needs 1a21e5b930 ("drm/ingenic: Fix leak of device_node
pointer") and 3b5b005ef7 ("drm/ingenic: Fix driver not probing when
IPU port is missing") from -fixes to be able to merge further ingenic
patches into -next.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
g4x+ sprites have an extra cdclk limitation listed for RGB formats.
For some random reason I chose to use cpp>=4 as the check for that.
While that does actually work let's deobfuscate it by checking
for !is_yuv instead. I suspect is_yuv didn't exist way back when
I originally write the code.
Also drop the duplicate comment.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200206201204.31704-2-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Even if we're not doing downscaling we should account for
some of the extra dotclock limitations for g4x+ sprites. In
particular we must never exceed the 90% rule, and with RGB
that limits actually drops to 80%.
So instead of bailing out when upscaling let's clamp the
scaling factor appropriately and go through the rest of
calculation normally. By luck we already did the full
calculations for the 1:1 case.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200206201204.31704-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
The CACHE_MODE_0 save/restore was added without explanation in
commit 1f84e550a8 ("drm/i915 more registers for S3 (DSPCLK_GATE_D,
CACHE_MODE_0, MI_ARB_STATE)"). If there are any bits we care about
those should be set explicitly during some appropriate init function.
Let's assume it's all good and just nuke this magic save/restore.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200908140210.31048-4-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Originally added in commit 1f84e550a8 ("drm/i915 more registers for
S3 (DSPCLK_GATE_D, CACHE_MODE_0, MI_ARB_STATE)") to fix some underruns.
I suspect that was due to the trickle feed settings getting clobbered
during suspend. We've been disabling trickle feed explicitly since
commit 20f949670f ("drm/i915: Disable trickle feed via MI_ARB_STATE
for the gen4") so this magic save/restore should no longer be needed.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200908140210.31048-3-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
The FBC_CONTROL save restore is there just to preserve the
compression interval setting. Since commit a68ce21ba0
("drm/i915/fbc: Store the fbc1 compression interval in the params")
we've been explicitly setting the interval to a specific
value, so the sace/restore is now entirely pointless.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200908140210.31048-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Current BDW virtual display port is initialized as PORT_B, so need
to use same port for VFIO EDID region, otherwise invalid EDID blob
pointer is assigned which caused kernel null pointer reference. We
might evaluate actual display hotplug for BDW to make this function
work as expected, anyway this is always required to be fixed first.
Reported-by: Alejandro Sior <aho@sior.be>
Cc: Alejandro Sior <aho@sior.be>
Fixes: 0178f4ce3c ("drm/i915/gvt: Enable vfio edid for all GVT supported platform")
Reviewed-by: Hang Yuan <hang.yuan@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200914030302.2775505-1-zhenyuw@linux.intel.com
There's no real reason to stash away the DPIO PHY IOSF sideband port
numbers for VLV/CHV. Just compute them at runtime in the sideband code.
Gets rid of the oddball intel_init_dpio() function from the high level
init flow.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907162709.29579-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Logically part of the display restore.
Note: This has been in place since the introduction of gmbus
support. The gmbus code also does the resets before transfers. Is this
really needed, or a historical accident?
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910095227.9466-3-jani.nikula@intel.com
Disable all display feature flags when there are no pipes i.e. there is
no display. This should help with not having to additionally check for
HAS_DISPLAY() when a feature flag check would suffice.
Also disable modeset and atomic driver features.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910095227.9466-1-jani.nikula@intel.com