Implement multi-lrc submission via a single workqueue entry and single
H2G. The workqueue entry contains an updated tail value for each
request, of all the contexts in the multi-lrc submission, and updates
these values simultaneously. As such, the tasklet and bypass path have
been updated to coalesce requests into a single submission.
v2:
(John Harrison)
- s/wqe/wqi
- Use FIELD_PREP macros
- Add GEM_BUG_ONs ensures length fits within field
- Add comment / white space to intel_guc_write_barrier
(Kernel test robot)
- Make need_tasklet a static function
v3:
(Docs)
- A comment for submission_stall_reason
v4:
(Kernel test robot)
- Initialize return value in bypass tasklt submit function
(John Harrison)
- Add comment near work queue defs
- Add BUILD_BUG_ON to ensure WQ_SIZE is a power of 2
- Update write_barrier comment to talk about work queue
v5:
(John Harrison)
- Fix typo in work queue comment
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-13-matthew.brost@intel.com
When running the GuC the GPU can't be considered idle if the GuC still
has contexts pinned. As such, a call has been added in
intel_gt_wait_for_idle to idle the UC and in turn the GuC by waiting for
the number of unpinned contexts to go to zero.
v2: rtimeout -> remaining_timeout
v3: Drop unnecessary includes, guc_submission_busy_loop ->
guc_submission_send_busy_loop, drop negatie timeout trick, move a
refactor of guc_context_unpin to earlier path (John H)
v4: Add stddef.h back into intel_gt_requests.h, sort circuit idle
function if not in GuC submission mode
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-16-matthew.brost@intel.com
With GuC scheduling, it isn't safe to unpin a context while scheduling
is enabled for that context as the GuC may touch some of the pinned
state (e.g. LRC). To ensure scheduling isn't enabled when an unpin is
done, a call back is added to intel_context_unpin when pin count == 1
to disable scheduling for that context. When the response CTB is
received it is safe to do the final unpin.
Future patches may add a heuristic / delay to schedule the disable
call back to avoid thrashing on schedule enable / disable.
v2:
(John H)
- s/drm_dbg/drm_err
(Daneiel)
- Clean up sched state function
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-9-matthew.brost@intel.com
Implement GuC context operations which includes GuC specific operations
alloc, pin, unpin, and destroy.
v2:
(Daniel Vetter)
- Use msleep_interruptible rather than cond_resched in busy loop
(Michal)
- Remove C++ style comment
v3:
(Matthew Brost)
- Drop GUC_ID_START
(John Harrison)
- Fix a bunch of typos
- Use drm_err rather than drm_dbg for G2H errors
(Daniele)
- Fix ;; typo
- Clean up sched state functions
- Add lockdep for guc_id functions
- Don't call __release_guc_id when guc_id is invalid
- Use MISSING_CASE
- Add comment in guc_context_pin
- Use shorter path to rpm
(Daniele / CI)
- Don't call release_guc_id on an invalid guc_id in destroy
v4:
(Daniel Vetter)
- Add FIXME comment
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-7-matthew.brost@intel.com
CTB writes are now in the path of command submission and should be
optimized for performance. Rather than reading CTB descriptor values
(e.g. head, tail) which could result in accesses across the PCIe bus,
store shadow local copies and only read/write the descriptor values when
absolutely necessary. Also store the current space in the each channel
locally.
v2:
(Michal)
- Add additional sanity checks for head / tail pointers
- Use GUC_CTB_HDR_LEN rather than magic 1
v3:
(Michal / John H)
- Drop redundant check of head value
v4:
(John H)
- Drop redundant checks of tail / head values
v5:
(Michal)
- Address more nits
v6:
(Michal)
- Add GEM_BUG_ON sanity check on ctb->space
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-7-matthew.brost@intel.com
Add non blocking CTB send function, intel_guc_send_nb. GuC submission
will send CTBs in the critical path and does not need to wait for these
CTBs to complete before moving on, hence the need for this new function.
The non-blocking CTB now must have a flow control mechanism to ensure
the buffer isn't overrun. A lazy spin wait is used as we believe the
flow control condition should be rare with a properly sized buffer.
The function, intel_guc_send_nb, is exported in this patch but unused.
Several patches later in the series make use of this function.
v2:
(Michal)
- Use define for H2G room calculations
- Move INTEL_GUC_SEND_NB define
(Daniel Vetter)
- Use msleep_interruptible rather than cond_resched
v3:
(Michal)
- Move includes to following patch
- s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g
v4:
(John H)
- Update comment, add type local variable
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-5-matthew.brost@intel.com
Ensure H2G buffer updates are visible before descriptor tail updates by
inserting a barrier between the H2G buffer update and the tail. The
barrier is simple wmb() for SMEM and is register write for LMEM. This is
needed if more than 1 H2G can be inflight at once.
If this barrier is not inserted it is possible the descriptor tail
update is scene by the GuC before H2G buffer update which results in the
GuC reading a corrupt H2G value. This can bring down the H2G channel
among other bad things.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-16-matthew.brost@intel.com
We have a couple of failure injection points in the CT enablement path,
so we need to use i915_probe_error() to select the appropriate log level.
A new macro (CT_PROBE_ERROR) has been added to the set of CT logging
macros to be used in this scenario and upcoming ones.
While adding the new macros, fix the underlying logging mechanics used
by the existing ones (DRM_DEV_* -> drm_*) and move the inlines to
before they're used inside the macros.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-3-matthew.brost@intel.com
We should never BUG_ON on any corruption in CTB descriptor as
data there can be also modified by the GuC. Instead we can
use flag "is_in_error" to indicate that we will not process
any further messages over this CTB (until reset). While here
move descriptor error reporting to the function that actually
touches that descriptor.
Note that unexpected content of the specific CT messages, that
still complies with generic CT message format, shall not trigger
disabling whole CTB, as that might just indicate new unsupported
message types.
v2: drop redundant message (Daniele)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-2-michal.wajdeczko@intel.com
The GuC supports having multiple CT buffer pairs and we designed our
implementation with that in mind. However, the different channels are not
processed in parallel within the GuC, so there is very little advantage
in having multiple channels (independent locks?), compared to the
drawbacks (one channel can starve the other if messages keep being
submitted to it). Given this, it is unlikely we'll ever add a second
channel and therefore we can simplify our code by removing the
flexibility.
v2: split substructure grouping to separate patch, improve docs (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217012316.13271-3-daniele.ceraolospurio@intel.com