We need to flush TLBs before releasing backing store otherwise userspace
is able to encounter stale entries if a) it is not declaring access to
certain buffers and b) it races with the backing store release from a
such undeclared execution already executing on the GPU in parallel.
The approach taken is to mark any buffer objects which were ever bound
to the GPU and to trigger a serialized TLB flush when their backing
store is released.
Alternatively the flushing could be done on VMA unbind, at which point
we would be able to ascertain whether there is potential a parallel GPU
execution (which could race), but essentially it boils down to paying
the cost of TLB flushes potentially needlessly at VMA unbind time (when
the backing store is not known to be going away so not needed for
safety), versus potentially needlessly at backing store relase time
(since we at that point cannot tell whether there is anything executing
on the GPU which uses that object).
Thereforce simplicity of implementation has been chosen for now with
scope to benchmark and refine later as required.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
TC voltage swing programming sequence was updated with a new step.
BSpec: 54956
Cc: stable@vger.kernel.org
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Clint Taylor <clinton.a.taylor@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220113174826.50272-1-jose.souza@intel.com
(cherry picked from commit 5ff59dddac)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Driver Changes:
- Added bits of DG2 support around page table handling (Stuart Summers, Matthew Auld)
- Fixed wakeref leak in PMU busyness during reset in GuC mode (Umesh Nerlige Ramappa)
- Fixed debugfs access crash if GuC failed to load (John Harrison)
- Bring back GuC error log to error capture, undoing accidental earlier breakage (Thomas Hellström)
- Fixed memory leak in error capture caused by earlier refactoring (Thomas Hellström)
- Exclude reserved stolen from driver use (Chris Wilson)
- Add memory region sanity checking and optional full test (Chris Wilson)
- Fixed buffer size truncation in TTM shmemfs backend (Robert Beckett)
- Use correct lock and don't overwrite internal data structures when stealing GuC context ids (Matthew Brost)
- Don't hog IRQs when destroying GuC contexts (John Harrison)
- Make GuC to Host communication more robust (Matthew Brost)
- Continuation of locking refactoring around VMA and backing store handling (Maarten Lankhorst)
- Improve performance of reading GuC log from debugfs (John Harrison)
- Log when GuC fails to reset an engine (John Harrison)
- Speed up GuC/HuC firmware loading by requesting RP0 (Vinay Belgaumkar)
- Further work on asynchronous VMA unbinding (Thomas Hellström, Christian König)
- Refactor GuC/HuC firmware handling to prepare for future platforms (John Harrison)
- Prepare for future different GuC/HuC firmware signing key sizes (Daniele Ceraolo Spurio, Michal Wajdeczko)
- Add noreclaim annotations (Matthew Auld)
- Remove racey GEM_BUG_ON between GPU reset and GuC communication handling (Matthew Brost)
- Refactor i915->gt with to_gt(i915) to prepare for future platforms (Michał Winiarski, Andi Shyti)
- Increase GuC log size for CONFIG_DEBUG_GEM (John Harrison)
- Fixed engine busyness in selftests when in GuC mode (Umesh Nerlige Ramappa)
- Make engine parking work with PREEMPT_RT (Sebastian Andrzej Siewior)
- Replace X86_FEATURE_PAT with pat_enabled() (Lucas De Marchi)
- Selftest for stealing of guc ids (Matthew Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YcRvKO5cyPvIxVCi@tursulin-mobl2
By default, GT (and GuC) run at RPn. Requesting for RP0
before firmware load can speed up DMA and HuC auth as well.
In addition to writing to 0xA008, we also need to enable
swreq in 0xA024 so that Punit will pay heed to our request.
SLPC will restore the frequency back to RPn after initialization,
but we need to manually do that for the non-SLPC path.
We don't need a manual override in the SLPC disabled case, just
use the intel_rps_set function to ensure consistent RPS state.
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216233022.21351-1-vinay.belgaumkar@intel.com
drm/i915 feature pull #2 for v5.17:
Features and functionality:
- Add eDP privacy screen support (Hans)
- Add Raptor Lake S (RPL-S) support (Anusha)
- Add CD clock squashing support (Mika)
- Properly support ADL-P without force probe (Clint)
- Enable pipe color support (10 bit gamma) for display 13 platforms (Uma)
- Update ADL-P DMC firmware to v2.14 (Madhumitha)
Refactoring and cleanups:
- More FBC refactoring preparing for multiple FBC instances (Ville)
- Plane register cleanups (Ville)
- Header refactoring and include cleanups (Jani)
- Crtc helper and vblank wait function cleanups (Jani, Ville)
- Move pipe/transcoder/abox masks under intel_device_info.display (Ville)
Fixes:
- Add a delay to let eDP source OUI write take effect (Lyude)
- Use div32 version of MPLLB word clock for UHBR on SNPS PHY (Jani)
- Fix DMC firmware loader overflow check (Harshit Mogalapalli)
- Fully disable FBC on FIFO underruns (Ville)
- Disable FBC with double wide pipe as mutually exclusive (Ville)
- DG2 workarounds (Matt)
- Non-x86 build fixes (Siva)
- Fix HDR plane max width for NV12 (Vidya)
- Disable IRQ for selftest timestamp calculation (Anshuman)
- ADL-P VBT DDC pin mapping fix (Tejas)
Merges:
- Backmerge drm-next for privacy screen plumbing (Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87ee6f5h9u.fsf@intel.com
Set CD clock squashing registers based on selected CD clock.
v2: use slk_cdclk_decimal() to compute decimal values instead of a
specific table (Ville)
Set waveform based on CD clock table (Ville)
Drop unnecessary local variable (Ville)
v3: Correct function naming (Ville)
Correct if-else structure (Ville)
[v4: vsyrjala: Fix spaces vs. tabs]
[v5: vsyrjala: Fix cd2x divider calculation (Uma),
Add warn to waveform lookup (Uma),
Handle bypass freq in waveform lookup,
Generalize waveform handling in bxt_set_cdclk()]
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211119131348.725220-4-mika.kahola@intel.com
Let's just stick to 32bit mmio accesses so we can get rid
of the bare "uncore" reg access in display code. The register
are defined as 32bit in the spec anyway.
We could define a 64bit "de" variant I suppose, but doesn't
really make much sense just for this one case, and when we
start to use the DSB for this stuff we'd also need another
64bit variant for that. Just easier to do 32bit always.
While at it we can reorder stuff a bit so that we write the
registers in order of increasing offset (more or less).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211201152552.7821-2-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
This workaround is documented a bit strangely in the bspec; it's listed
as an A0 workaround, but the description clarifies that the workaround
is implicitly handled by the hardware and what the driver really needs
to do is program a chicken bit to reenable some internal behavior.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211116174818.2128062-3-matthew.d.roper@intel.com
TileF(Tile4 in bspec) format is 4K tile organized into
64B subtiles with same basic shape as for legacy TileY
which will be supported by Display13.
v2: - Fixed wrong case condition(Jani Nikula)
- Increased I915_FORMAT_MOD_F_TILED up to 12(Imre Deak)
v3: - s/I915_TILING_F/TILING_4/g
- s/I915_FORMAT_MOD_F_TILED/I915_FORMAT_MOD_4_TILED/g
- Removed unneeded fencing code
v4: - Rebased, fixed merge conflict with new table-oriented
format modifier checking(Stan)
- Replaced the rest of "Tile F" mentions to "Tile 4"(Stan)
v5: - Still had to remove some Tile F mentionings
- Moved has_4tile from adlp to DG2(Ramalingam C)
- Check specifically for DG2, but not the Display13(Imre)
v6: - Moved Tile4 associating struct for modifier/display to
the beginning(Imre Deak)
- Removed unneeded case I915_FORMAT_MOD_4_TILED modifier
checks(Imre Deak)
- Fixed I915_FORMAT_MOD_4_TILED to be 9 instead of 12
(Imre Deak)
v7: - Fixed display_ver to { 13, 13 }(Imre Deak)
- Removed redundant newline(Imre Deak)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122211420.31584-1-stanislav.lisovskiy@intel.com
The bspec's performance guide suggests programming specific values into
a few registers for optimal performance. Although these aren't
workarounds, it's easiest to handle them inside the GT workaround
functions (which will also ensure that the values set here are properly
melded with other bits in the same registers that _are_ set by
workarounds).
Bspec: 68331, 45395
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Siddiqui Ayaz A <ayaz.siddiqui@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211102222511.534310-4-matthew.d.roper@intel.com
Add the initial set of workarounds for Xe_HP SDV.
There are some additional workarounds specific to the compute engines
that we're holding back for now. Those will be added later, after
general compute engine support lands.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211102222511.534310-2-matthew.d.roper@intel.com
In the case of FBC_LLC_READ_CTRL the "FBC" stands for
frame buffer _caching_, not frame buffer compression.
Move the register definition out from the middle of the
frame buffer compression register definitions. Let's
just stick it somewhere with similar looking register
offsets.
And while at it switch it over to REG_BIT().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-15-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
The FBC register defines are a mess:
- namespace changes between DPFC_, FBC_, and some platform
specific prefix at a whim
- ilk+ reuses most g4x bits but still has some separate bit
defines elsewhere
- it's not clear from the defines that the bit defines are
shared
So let's clean it up:
- both g4x and ilk register share the same defines now
- only defines which conflict have a _PLATFORM suffix, everyone
else just gets comments to indicate which platforms do what
- namespace is consistent DPFC_ now
- SNB system agent fence registers also get a consistent namespace
- REG_BIT() & co. for everything
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-13-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Just use a same mask for ivb/hsw as for bdw+. The extra bit
in the bdw mask is mbz on ivb/hsw anyway so this is just
pointless complexity.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-12-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
XE_LPD display adds support for display audio codec keepalive feature.
This feature works also when display codec is in D3 state and the audio
link is off (BCLK off). To enable this functionality, display driver
must update the AUD_TS_CDCLK_M/N registers whenever CDCLK is changed.
Actual timestamps are generated only when the audio codec driver
specifically enables the KeepAlive (KAE) feature.
This patch adds new hooks to intel_set_cdclk() in order to inform
display audio driver when CDCLK change is started and when it is
complete.
Bspec: 53679
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211021105915.4128635-1-kai.vehmanen@linux.intel.com
With GuC handling scheduling, i915 is not aware of the time that a
context is scheduled in and out of the engine. Since i915 pmu relies on
this info to provide engine busyness to the user, GuC shares this info
with i915 for all engines using shared memory. For each engine, this
info contains:
- total busyness: total time that the context was running (total)
- id: id of the running context (id)
- start timestamp: timestamp when the context started running (start)
At the time (now) of sampling the engine busyness, if the id is valid
(!= ~0), and start is non-zero, then the context is considered to be
active and the engine busyness is calculated using the below equation
engine busyness = total + (now - start)
All times are obtained from the gt clock base. For inactive contexts,
engine busyness is just equal to the total.
The start and total values provided by GuC are 32 bits and wrap around
in a few minutes. Since perf pmu provides busyness as 64 bit
monotonically increasing values, there is a need for this implementation
to account for overflows and extend the time to 64 bits before returning
busyness to the user. In order to do that, a worker runs periodically at
frequency = 1/8th the time it takes for the timestamp to wrap. As an
example, that would be once in 27 seconds for a gt clock frequency of
19.2 MHz.
Note:
There might be an over-accounting of busyness due to the fact that GuC
may be updating the total and start values while kmd is reading them.
(i.e kmd may read the updated total and the stale start). In such a
case, user may see higher busyness value followed by smaller ones which
would eventually catch up to the higher value.
v2: (Tvrtko)
- Include details in commit message
- Move intel engine busyness function into execlist code
- Use union inside engine->stats
- Use natural type for ping delay jiffies
- Drop active_work condition checks
- Use for_each_engine if iterating all engines
- Drop seq locking, use spinlock at GuC level to update engine stats
- Document worker specific details
v3: (Tvrtko/Umesh)
- Demarcate GuC and execlist stat objects with comments
- Document known over-accounting issue in commit
- Provide a consistent view of GuC state
- Add hooks to gt park/unpark for GuC busyness
- Stop/start worker in gt park/unpark path
- Drop inline
- Move spinlock and worker inits to GuC initialization
- Drop helpers that are called only once
v4: (Tvrtko/Matt/Umesh)
- Drop addressed opens from commit message
- Get runtime pm in ping, remove from the park path
- Use cancel_delayed_work_sync in disable_submission path
- Update stats during reset prepare
- Skip ping if reset in progress
- Explicitly name execlists and GuC stats objects
- Since disable_submission is called from many places, move resetting
stats to intel_guc_submission_reset_prepare
v5: (Tvrtko)
- Add a trylock helper that does not sleep and synchronize PMU event
callbacks and worker with gt reset
v6: (CI BAT failures)
- DUTs using execlist submission failed to boot since __gt_unpark is
called during i915 load. This ends up calling the GuC busyness unpark
hook and results in kick-starting an uninitialized worker. Let
park/unpark hooks check if GuC submission has been initialized.
- drop cant_sleep() from trylock helper since rcu_read_lock takes care
of that.
v7: (CI) Fix igt@i915_selftest@live@gt_engines
- For GuC mode of submission the engine busyness is derived from gt time
domain. Use gt time elapsed as reference in the selftest.
- Increase busyness calculation to 10ms duration to ensure batch runs
longer and falls within the busyness tolerances in selftest.
v8:
- Use ktime_get in selftest as before
- intel_reset_trylock_no_wait results in a lockdep splat that is not
trivial to fix since the PMU callback runs in irq context and the
reset paths are tightly knit into the driver. The test that uncovers
this is igt@perf_pmu@faulting-read. Drop intel_reset_trylock_no_wait,
instead use the reset_count to synchronize with gt reset during pmu
callback. For the ping, continue to use intel_reset_trylock since ping
is not run in irq context.
- GuC PM timestamp does not tick when GuC is idle. This can potentially
result in wrong busyness values when a context is active on the
engine, but GuC is idle. Use the RING TIMESTAMP as GPU timestamp to
process the GuC busyness stats. This works since both GuC timestamp and
RING timestamp are synced with the same clock.
- The busyness stats may get updated after the batch starts running.
This delay causes the busyness reported for 100us duration to fall
below 95% in the selftest. The only option at this time is to wait for
GuC busyness to change from idle to active before we sample busyness
over a 100us period.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-2-umesh.nerlige.ramappa@intel.com
Alderlake-P was getting 'max time under evasion' messages when PSR2
is enabled, this is due PIPE_SCANLINE/PIPEDSL returning 0 over a
period of time longer than VBLANK_EVASION_TIME_US.
For PSR1 we had the same issue so intel_psr_wait_for_idle() was
implemented to wait for PSR1 to get into idle state but nothing was
done for PSR2.
For PSR2 we can't only wait for idle state as PSR2 tends to keep
into sleep state(ready to send selective updates).
Waiting for any state below deep sleep proved to be effective in
avoiding the evasion messages and also not wasted a lot of time.
v2:
- dropping the additional wait_for loops, only the _wait_for_atomic()
is necessary
- waiting for states below EDP_PSR2_STATUS_STATE_DEEP_SLEEP
v3:
- dropping intel_wait_for_condition_atomic() function
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005231851.67698-1-jose.souza@intel.com
Nuke the hsw_get_ddi_port_state() eyesore by putting the
readout code into intel_pch_display.c, and calling it directly
from hsw_crt_get_config().
Note that the nuked TRANS_DDI_FUNC_CTL readout from
hsw_get_ddi_port_state() is now etirely redundant since we
get called from the encoder->get_config() so we already know
we're dealing with the correct DDI port. Previously the
code was called from a place where that wasn't known so
it had to checked manually.
v2: Clarify the TRANS_DDI_FUNC_CTL change (Dave)
Nuke the now unused *TRANS_DDI_FUNC_CTL_VAL_TO_PORT() (Dave)
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018153525.21597-1-ville.syrjala@linux.intel.com
Reviewed-by: Dave Airlie <airlied@redhat.com>
There's no such thing as gen13. It is either display 13
or graphics 13. Don't propagate the gen12 confusion
further.
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211015091650.87270-1-rodrigo.vivi@intel.com
This memory frequency calculated is only used to check if it is zero,
what is not useful as it will never actually be zero.
Also the calculation is wrong, we should be checking other bit to
select the appropriate frequency multiplier while this code is stuck
with a fixed multiplier.
So here dropping it as whole.
v2:
- Also remove memory frequency calculation for gen9 LP platforms
Cc: Yakui Zhao <yakui.zhao@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: 5d0c938ec9 ("drm/i915/gen11+: Only load DRAM information from pcode")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211013010046.91858-1-jose.souza@intel.com
DKL_TX_LOADGEN_SHARING_PMD_DISABLE doesn't even seem to exist,
also the spec says to skip all loadgen stuff.
The code was dead anyway since it wasn't actually writing the value
anywhere.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211006204937.30774-6-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Need to resync drm-intel-next with TTM and PXP stuff from
drm-intel-gt-next that is now in drm/drm-next.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
UAPI Changes:
- Add uAPI for using PXP protected objects
Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8064
- Add PCI IDs and LMEM discovery/placement uAPI for DG1
Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584
- Disable engine bonding on Gen12+ except TGL, RKL and ADL-S
Cross-subsystem Changes:
- Merges 'tip/locking/wwmutex' branch (core kernel tip)
- "mei: pxp: export pavp client to me client bus"
Core Changes:
- Update ttm_move_memcpy for async use (Thomas)
Driver Changes:
- Enable GuC submission by default on DG1 (Matt B)
- Add PXP (Protected Xe Path) support for Gen12 integrated (Daniele,
Sean, Anshuman)
See "drm/i915/pxp: add PXP documentation" for details!
- Remove force_probe protection for ADL-S (Raviteja)
- Add base support for XeHP/XeHP SDV (Matt R, Stuart, Lucas)
- Handle DRI_PRIME=1 on Intel igfx + Intel dgfx hybrid graphics setup (Tvrtko)
- Use Transparent Hugepages when IOMMU is enabled (Tvrtko, Chris)
- Implement LMEM backup and restore for suspend / resume (Thomas)
- Report INSTDONE_GEOM values in error state for DG2 (Matt R)
- Add DG2-specific shadow register table (Matt R)
- Update Gen11/Gen12/XeHP shadow register tables (Matt R)
- Maintain backward-compatible nested batch behavior on TGL+ (Matt R)
- Add new LRI reg offsets for DG2 (Akeem)
- Initialize unused MOCS entries to device specific values (Ayaz)
- Track and use the correct UC MOCS index on Gen12 (Ayaz)
- Add separate MOCS table for Gen12 devices other than TGL/RKL (Ayaz)
- Simplify the locking and eliminate some RCU usage (Daniel)
- Add some flushing for the 64K GTT path (Matt A)
- Mark GPU wedging on driver unregister unrecoverable (Janusz)
- Major rework in the GuC codebase, simplify locking and add docs (Matt B)
- Add DG1 GuC/HuC firmwares (Daniele, Matt B)
- Remember to call i915_sw_fence_fini on guc_state.blocked (Matt A)
- Use "gt" forcewake domain name for error messages instead of "blitter" (Matt R)
- Drop now duplicate LMEM uAPI RFC kerneldoc section (Daniel)
- Fix early tracepoints for requests (Matt A)
- Use locked access to ctx->engines in set_priority (Daniel)
- Convert gen6/gen7/gen8 read operations to fwtable (Matt R)
- Drop gen11/gen12 specific mmio write handlers (Matt R)
- Drop gen11 specific mmio read handlers (Matt R)
- Use designated initializers for init/exit table (Kees)
- Fix syncmap memory leak (Matt B)
- Add pretty printing for buddy allocator state debug (Matt A)
- Fix potential error pointer dereference in pinned_context() (Dan)
- Remove IS_ACTIVE macro (Lucas)
- Static code checker fixes (Nathan)
- Clean up disabled warnings (Nathan)
- Increase timeout in i915_gem_contexts selftests 5x for GuC submission (Matt B)
- Ensure wa_init_finish() is called for ctx workaround list (Matt R)
- Initialize L3CC table in mocs init (Sreedhar, Ayaz, Ram)
- Get PM ref before accessing HW register (Vinay)
- Move __i915_gem_free_object to ttm_bo_destroy (Maarten)
- Deduplicate frequency dump on debugfs (Lucas)
- Make wa list per-gt (Venkata)
- Do not define dummy vma in stack (Venkata)
- Take pinning into account in __i915_gem_object_is_lmem (Matt B, Thomas)
- Do not report currently active engine when describing objects (Tvrtko)
- Fix pdfdocs build error by removing nested grid from GuC docs (Akira)
- Remove false warning from the rps worker (Tejas)
- Flush buffer pools on driver remove (Janusz)
- Fix runtime pm handling in i915_gem_shrink (Maarten)
- Rework TTM object initialization slightly (Thomas)
- Use fixed offset for PTEs location (Michal Wa)
- Verify result from CTB (de)register action and improve error messages (Michal Wa)
- Fix bug in user proto-context creation that leaked contexts (Matt B)
- Re-use Gen11 forcewake read functions on Gen12 (Matt R)
- Make shadow tables range-based (Matt R)
- Ditch the i915_gem_ww_ctx loop member (Thomas, Maarten)
- Use NULL instead of 0 where appropriate (Ville)
- Rename pci/debugfs functions to respect file prefix (Jani, Lucas)
- Drop guc_communication_enabled (Daniele)
- Selftest fixes (Thomas, Daniel, Matt A, Maarten)
- Clean up inconsistent indenting (Colin)
- Use direction definition DMA_BIDIRECTIONAL instead of
PCI_DMA_BIDIRECTIONAL (Cai)
- Add "intel_" as prefix in set_mocs_index() (Ayaz)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YWAO80MB2eyToYoy@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Looks like skl/bxt/derivatives also need the plane stride
stretch w/a when using async flips and VT-d is enabled, or
else we get corruption on screen. To my surprise this was
even documented in bspec, but only as a note on the
CHICHKEN_PIPESL register description rather than on the
w/a list.
So very much the same thing as on HSW/BDW, except the bits
moved yet again.
Cc: stable@vger.kernel.org
Cc: Karthik B S <karthik.b.s@intel.com>
Fixes: 55ea1cb178 ("drm/i915: Enable async flips in i915")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930190943.17547-1-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
When protected sufaces has flipped and pxp session is disabled,
display black pixels by using plane color CTM correction.
v2:
- Display black pixels in async flip too.
v3:
- Removed the black pixels logic for async flip. [Ville]
- Used plane state to force black pixels. [Ville]
v4 (Daniele): update pxp_is_borked check.
v5: rebase on top of v9 plane decryption moving the decrypt check
(Juston)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Gaurav Kumar <kumar.gaurav@intel.com>
Cc: Shankar Uma <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Juston Li <juston.li@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-15-alan.previn.teres.alexis@intel.com
Add support to enable/disable PLANE_SURF Decryption Request bit.
It requires only to enable plane decryption support when following
condition met.
1. PXP session is enabled.
2. Buffer object is protected.
v2:
- Used gen fb obj user_flags instead gem_object_metadata. [Krishna]
v3:
- intel_pxp_gem_object_status() API changes.
v4: use intel_pxp_is_active (Daniele)
v5: rebase and use the new protected object status checker (Daniele)
v6: used plane state for plane_decryption to handle async flip
as suggested by Ville.
v7: check pxp session while plane decrypt state computation. [Ville]
removed pointless code. [Ville]
v8 (Daniele): update PXP check
v9: move decrypt check after icl_check_nv12_planes() when overlays
have fb set (Juston)
v10 (Daniele): update PXP check again to match rework in earlier
patches and don't consider protection valid if the object has not
been used in an execbuf beforehand.
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Huang Sean Z <sean.z.huang@intel.com>
Cc: Gaurav Kumar <kumar.gaurav@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Juston Li <juston.li@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com> #v9
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-14-alan.previn.teres.alexis@intel.com
The HW will generate a teardown interrupt when session termination is
required, which requires i915 to submit a terminating batch. Once the HW
is done with the termination it will generate another interrupt, at
which point it is safe to re-create the session.
Since the termination and re-creation flow is something we want to
trigger from the driver as well, use a common work function that can be
called both from the irq handler and from the driver set-up flows, which
has the addded benefit of allowing us to skip any extra locks because
the work itself serializes the operations.
v2: use struct completion instead of bool (Chris)
v3: drop locks, clean up functions and improve comments (Chris),
move to common work function.
v4: improve comments, simplify wait logic (Rodrigo)
v5: unconditionally set interrupts, rename state_attacked var (Rodrigo)
v10: remove inclusion of intel_gt_types.h from intel_pxp.h (Jani)
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Huang, Sean Z <sean.z.huang@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-10-alan.previn.teres.alexis@intel.com
Display underrun in HDR mode when cursor is enabled.
RTL fix will be implemented CLKGATE_DIS_PSL_A bit 28-46520h.
As per W/A 1604331009, Disable cursor clock gating in HDR mode.
Bspec : 33451
Changes since V6:
- Address checkpatch warnings
- Bit ordering
Changes since V5:
- replace intel_de_read with intel_de_rmw - Jani
Changes since V4:
- Added WA needed check - Ville
- Replace BIT with REG_BIT - Ville
- Add WA enable/disable support back which was
added in V1 - Ville
Changes since V3:
- Disable WA when not in HDR mode or cursor plane
not active - Ville
- Extract required args from crtc_state - Ville
- Create HDR mode API using bdw_set_pipemisc ref - Ville
- Tested with HDR video as well full setmode, WA
applies and disables
Changes since V2:
- Made it general gen11 WA
- Removed WA needed check
- Added cursor plane active check
- Once WA enable, software will not disable
Changes since V1:
- Modified way CLKGATE_DIS_PSL bit 28 was modified
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210929052442.2543054-1-tejaskumarx.surendrakumar.upadhyay@intel.com
Apply the same 512 byte FBC segment alignment to glk+ as we use
on skl+. The only real difference is that we now have a dedicated
register for the FBC override stride. Not 100% sure which
platforms really need the 512B alignment, but it's easiest
to just do it on everything.
Also the hardware no longer seems to misclaculate the CFB stride
for linear, so we can omit the use of the override stride for
linear unless the stride is misaligned.
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210921152517.803-3-ville.syrjala@linux.intel.com
Xe_HP adds some new bits to the FUSE1 register to let us know whether a
given SFC unit is present. We should take this into account while
initializing SFC availability to our VCS and VECS engines.
While we're at it, update the FUSE1 register definition to use
REG_GENMASK / REG_FIELD_GET notation.
Note that, the bspec confusingly names the fuse bits "disable" despite
the register reflecting the *enable* status of the SFC units. The
original architecture documents which the bspec is based on do properly
name this field "SFC_ENABLE."
Bspec: 52543
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210917161203.812251-2-matthew.d.roper@intel.com
Wa_16014451276 fixes the starting coordinate for PSR2 selective
updates. CHICKEN_TRANS definition of the workaround bit has a wrong
name based on workaround definition and HSD.
Wa_14014971508 allows the screen to continue to be updated when
coming back from DC5/DC6 and SF_SINGLE_FULL_FRAME bit is not kept
set in PSR2_MAN_TRK_CTL.
Wa_16012604467 fixes underruns when exiting PSR2 when it is in one
of its internal states.
Wa_14014971508 is still in pending status in BSpec but by
the time this is reviewed and ready to be merged it will be finalized.
v2:
- renamed register to ADLP_1_BASED_X_GRANULARITY
- added comment about all ADL-P supported panels being 1 based X
granularity
BSpec: 54369
BSpec: 50054
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210914212507.177511-5-jose.souza@intel.com
Close the divergence which has caused patches not to apply and
have a solid baseline for the PXP patches that Rodrigo will send
a topic branch PR for.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Blitter commands which do not have MOCS fields rely on
cacheability of BlitterCacheControlRegister which was mapped
to index 0 by default.Once we changed the MOCS value of
index 0 to L3 WB, tests like gem_linear_blits started failing
due to a change in cacheability from UC to WB.
Program and place the BlitterCacheControlRegister in
build_aux_regs().
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903092153.535736-4-ayaz.siddiqui@intel.com