Commit Graph

2390 Commits

Author SHA1 Message Date
Dave Airlie
417c1c1963 Merge tag 'drm-intel-gt-next-2022-07-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver uAPI changes:
- All related to the Small BAR support: (and all by Matt Auld)
 * add probed_cpu_visible_size
 * expose the avail memory region tracking
 * apply ALLOC_GPU only by default
 * add NEEDS_CPU_ACCESS hint
 * tweak error capture on recoverable contexts

Driver highlights:
- Add Small BAR support (Matt)
- Add MeteorLake support (RK)
- Add support for LMEM PCIe resizable BAR (Akeem)

Driver important fixes:
- ttm related fixes (Matt Auld)
- Fix a performance regression related to waitboost (Chris)
- Fix GT resets (Chris)

Driver others:
- Adding GuC SLPC selftest (Vinay)
- Fix ADL-N GuC load (Daniele)
- Add platform workaround (Gustavo, Matt Roper)
- DG2 and ATS-M device ID updates (Matt Roper)
- Add VM_BIND doc rfc with uAPI documentation (Niranjana)
- Fix user-after-free in vma destruction (Thomas)
- Async flush of GuC log regions (Alan)
- Fixes in selftests (Chris, Dan, Andrzej)
- Convert to drm_dbg (Umesh)
- Disable OA sseu config param for newer hardware (Umesh)
- Multi-cast register steering changes (Matt Roper)
- Add lmem_bar_size modparam (Priyanka)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Ys85pcMYLkqF/HtB@intel.com
2022-07-22 15:51:31 +10:00
Priyanka Dandamudi
17cd10a44a drm/i915: Add lmem_bar_size modparam
For testing purposes, support forcing the lmem_bar_size through a new
modparam. In CI we only have a limited number of configurations for DG2,
but we still need to be reasonably sure we get a usable device (also
verifying we report the correct values for things like
probed_cpu_visible_size etc) with all the potential lmem_bar sizes that
we might expect see in the wild.

v2: Update commit message and a minor modification.(Matt)

v3: Optimised lmem bar size code and modified code to resize
bar maximum upto lmem_size instead of maximum supported size.(Nirmoy)

v4: Optimised lmem bar size code.(Nirmoy)

Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220713130209.2573233-3-priyanka.dandamudi@intel.com
2022-07-13 17:47:51 +01:00
Akeem G Abodunrin
a91d1a17cd drm/i915: Add support for LMEM PCIe resizable bar
Add support for the local memory PICe resizable bar, so that
local memory can be resized to the maximum size supported by the device,
and mapped correctly to the PCIe memory bar. It is usual that GPU
devices expose only 256MB BARs primarily to be compatible with 32-bit
systems. So, those devices cannot claim larger memory BAR windows size due
to the system BIOS limitation. With this change, it would be possible to
reprogram the windows of the bridge directly above the requesting device
on the same BAR type.

v2:Moved code to gt/intel_region_lmem.c and used only
single underscore for function names.(Jani)

v3: Optimised code.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Michael J Ruhl <michael.j.ruhl@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220713130209.2573233-2-priyanka.dandamudi@intel.com
2022-07-13 17:47:51 +01:00
Matt Roper
a5e4a53818 drm/i915: Correct ss -> steering calculation for pre-Xe_HP platforms
Accidental use of a "SLICE" macro where a "SUBSLICE" macro was intended
causes the group ID for steering to be calculated incorrectly on
pre-Xe_HP platforms.

Fixes: 9a92732f04 ("drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220712220513.3451794-1-matthew.d.roper@intel.com
2022-07-13 09:22:17 -07:00
Chris Wilson
c877bed82e drm/i915/gt: Only kick the signal worker if there's been an update
One impact of commit 047a1b877e ("dma-buf & drm/amdgpu: remove
dma_resv workaround") is that it stores many, many more fences. Whereas
adding an exclusive fence used to remove the shared fence list, that
list is now preserved and the write fences included into the list. Not
just a single write fence, but now a write/read fence per context. That
causes us to have to track more fences than before (albeit half of those
are redundant), and we trigger more interrupts for multi-engine
workloads.

As part of reducing the impact from handling more signaling, we observe
we only need to kick the signal worker after adding a fence iff we have
good cause to believe that there is work to be done in processing the
fence i.e. we either need to enable the interrupt or the request is
already complete but we don't know if we saw the interrupt and so need
to check signaling.

References: 047a1b877e ("dma-buf & drm/amdgpu: remove dma_resv workaround")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d7b953c7a4ba747c8196a164e2f8c5aef468d048.1657289332.git.karolina.drobnik@intel.com
2022-07-12 17:44:43 -04:00
Chris Wilson
33da978947 drm/i915/gt: Serialize TLB invalidates with GT resets
Avoid trying to invalidate the TLB in the middle of performing an
engine reset, as this may result in the reset timing out. Currently,
the TLB invalidate is only serialised by its own mutex, forgoing the
uncore lock, but we can take the uncore->lock as well to serialise
the mmio access, thereby serialising with the GDRST.

Tested on a NUC5i7RYB, BIOS RYBDWi35.86A.0380.2019.0517.1530 with
i915 selftest/hangcheck.

Cc: stable@vger.kernel.org  # v4.4 and upper
Fixes: 7938d61591 ("drm/i915: Flush TLBs before releasing backing store")
Reported-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Tested-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1e59a7c45dd919a530256b9ac721ac6ea86c0677.1657639152.git.mchehab@kernel.org
2022-07-12 17:38:01 -04:00
Chris Wilson
336561a914 drm/i915/gt: Serialize GRDOM access between multiple engine resets
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.

Cc: stable@vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7dbac3011729.1657639152.git.mchehab@kernel.org
2022-07-12 17:37:59 -04:00
Matt Roper
b7580e669c drm/i915/dg2: Add Wa_15010599737
This workaround may need to be extended to other platforms soon, but for
now it's marked as DG2-specific.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708215804.2889246-1-matthew.d.roper@intel.com
2022-07-12 08:58:19 -07:00
Dan Carpenter
d50f5a109c drm/i915/selftests: fix a couple IS_ERR() vs NULL tests
The shmem_pin_map() function doesn't return error pointers, it returns
NULL.

Fixes: be1cb55a07 ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708094104.GL2316@kadam
2022-07-11 10:14:51 +01:00
Matt Roper
9a92732f04 drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr
Although all DSS belong to a single pool on Xe_HP platforms (i.e.,
they're not organized into slices from a topology point of view), we do
still need to pass 'group' and 'instance' targets when steering register
accesses to a specific instance of a per-DSS multicast register.  The
rules for how to determine group and instance IDs (which previously used
legacy terms "slice" and "subslice") varies by platform.  Some platforms
determine steering by gslice membership, some platforms by cslice
membership, and future platforms may have other rules.

Since looping over each DSS and performing steered unicast register
accesses is a relatively common pattern, let's add a dedicated iteration
macro to handle this (and replace the platform-specific "instdone" loop
we were using previously.  This will avoid the calling code needing to
figure out the details about how to obtain steering IDs for a specific
DSS.

Most of the places where we use this new loop are in the GPU errorstate
code at the moment, but we do have some additional features coming in
the future that will also need to loop over each DSS and steer some
register accesses accordingly.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220701232006.1016135-1-matthew.d.roper@intel.com
2022-07-08 09:32:57 -07:00
Alan Previn
b94a1a207d drm/i915/guc: Asynchronous flush of GuC log regions
Both error-capture and relay-logging mechanism use the GuC
log infrastructure. That means the KMD must send a log flush
complete notification back to GuC after reading the data out.
This call is currently being sent synchronously.
However, synchronous H2Gs cause problems when the system is
backed up. There is no need for this to be synchronous. The
KMD wasn't even looking at the return status from it. So make
it asynchronous and then there is no issue about time outs.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607002314.1451656-2-alan.previn.teres.alexis@intel.com
2022-07-06 14:38:56 -07:00
Gustavo Sousa
3b05c96078 drm/i915/pvc: Implement w/a 16016694945
A new PVC-specific workaround has just been added to the BSpec.

BSpec: 64027

Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220630201407.16770-1-gustavo.sousa@intel.com
2022-07-01 08:29:02 -07:00
Matthew Auld
eb1c535f0d drm/i915: turn on small BAR support
With the uAPI in place we should now have enough in place to ensure a
working system on small BAR configurations.

v2: (Nirmoy & Thomas):
  - s/full BAR/Resizable BAR/ which is hopefully more easily
    understood by users.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-13-matthew.auld@intel.com
2022-07-01 08:30:47 +01:00
Dave Airlie
c6a3d73592 Merge tag 'drm-intel-gt-next-2022-06-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Document memory residency and Flat-CCS capability of obj (Ramalingam C)
- Disable GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK on Xe_HP+ (Matt Roper)

Cross-subsystem Changes:

- Rename intel-gtt symbols (Lucas De Marchi)

Core Changes:

Driver Changes:

- Support programming the EU priority in the GuC descriptor (DG2) (Matthew Brost)
- DG2 HuC loading support (Daniele Ceraolo Spurio)
- Fix build error without CONFIG_PM (YueHaibing)
- Enable THP on Icelake and beyond (Tvrtko Ursulin)
- Only setup private tmpfs mount when needed and fix logging (Tvrtko Ursulin)
- Make __guc_reset_context aware of guilty engines (Umesh Nerlige Ramappa)
- DG2 small bar memory probing fixes (Nirmoy Das)
- Remove unnecessary GuC err capture noise (Alan Previn)
- Fix i915_gem_object_ggtt_pin_ww regression on old platforms (Maarten Lankhorst)
- Fix undefined behavior in GuC backend due to shift overflowing the constant (Borislav Petkov)
- New DG2 workarounds (Swathi Dhanavanthri, Anshuman Gupta)
- Report no hwconfig support on ADL-N (Balasubramani Vivekanandan)
- Fix error_state_read ptr + offset use (Alan Previn)
- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Fix memory leaks in per-gt sysfs (Ashutosh Dixit)
- Fix dma_resv fence handling in multi-batch execbuf (Nirmoy Das)
- Add extra registers to GPU error dump on Gen11+ (Stuart Summers)
- More PVC+DG2 workarounds (Matt Roper)
- Improve user experience and driver robustness under SIGINT or similar (Tvrtko Ursulin)
- Don't show engine classes not present (Tvrtko Ursulin)
- Improve on suspend / resume time with VT-d enabled (Thomas Hellström)
- Add missing else (katrinzhou)
- Don't leak lmem mapping in vma_evict (Juha-Pekka Heikkila)
- Add smem fallback allocation for dpt (Juha-Pekka Heikkila)
- Tweak the ordering in cpu_write_needs_clflush (Matthew Auld)
- Do not access rq->engine without a reference (Niranjana Vishwanathapura)
- Revert "drm/i915: Hold reference to intel_context over life of i915_request" (Niranjana Vishwanathapura)
- Don't update engine busyness stats too frequently (Alan Previn)
- Add additional steps for Wa_22011802037 for execlist backend (Umesh Nerlige Ramappa)
- Fix a lockdep warning at error capture (Nirmoy Das)

- Ponte Vecchio prep work and new blitter engines (Matt Roper, John Harrison, Lucas De Marchi)
- Read correct RP_STATE_CAP register (PVC) (Matt Roper)
- Define MOCS table for PVC (Ayaz A Siddiqui)
- Driver refactor and support Ponte Vecchio forcewake handling (Matt Roper)
- Remove additional 3D flags from PIPE_CONTROL (Ponte Vecchio) (Stuart Summers)
- XEHPSDV and PVC do not use HuC (Daniele Ceraolo Spurio)
- Extract stepping information from PCI revid (Ponte Vecchio) (Matt Roper)
- Add initial PVC workarounds (Stuart Summers)
- SSEU handling driver refactor and Ponte Vecchio support (Matt Roper)
- GuC depriv applies to PVC (Matt Roper)
- Add register steering (Ponte Vecchio) (Matt Roper)
- Add recommended MMIO setting (Ponte Vecchio) (Matt Roper)

- Move multicast register handling to a dedicated file (Matt Roper)
- Cleanup interface for MCR operations (Matt Roper)
- Extend i915_vma_pin_iomap() (CQ Tang)
- Re-do the intel-gtt split (Lucas De Marchi)
- Correct duplicated/misplaced GT register definitions (Matt Roper)
- Prefer "XEHP_" prefix for registers (Matt Roper)

- Don't use DRM_DEBUG_WARN_ON for unexpected l3bank/mslice config (Tvrtko Ursulin)
- Don't use DRM_DEBUG_WARN_ON for ring unexpectedly not idle (Tvrtko Ursulin)
- Make drop_pages() return bool (Lucas De Marchi)
- Fix CFI violation with show_dynamic_id() (Nathan Chancellor)
- Use i915_probe_error instead of drm_error in GuC code (Vinay Belgaumkar)
- Fix use of static in macro mismatch (Andi Shyti)
- Update tiled blits selftest (Bommu Krishnaiah)
- Future-proof platform checks (Matt Roper)
- Only include what's needed (Jani Nikula)
- remove accidental static from a local variable (Jani Nikula)
- Add global forcewake request to drpc (Vinay Belgaumkar)
- Fix spelling typo in comment (pengfuyuan)
- Increase timeout for live_parallel_switch selftest (Akeem G Abodunrin)
- Use non-blocking H2G for waitboost (Vinay Belgaumkar)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YrwtLM081SQUG1Dc@tursulin-desk
2022-07-01 14:14:52 +10:00
Daniele Ceraolo Spurio
971e4a9781 drm/i915/guc: ADL-N should use the same GuC FW as ADL-S
The only difference between the ADL S and P GuC FWs is the HWConfig
support. ADL-N does not support HWConfig, so we should use the same
binary as ADL-S, otherwise the GuC might attempt to fetch a config
table that does not exist. ADL-N is internally identified as an ADL-P,
so we need to special-case it in the FW selection code.

Fixes: 7e28d0b267 ("drm/i915/adl-n: Enable ADL-N platform")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621233005.3952293-1-daniele.ceraolospurio@intel.com
2022-06-30 14:25:09 -07:00
Vinay Belgaumkar
79398d24da drm/i915/guc/slpc: Add a new SLPC selftest
This test will validate we can achieve actual frequency of RP0. Pcode
grants frequencies based on what GuC is requesting. However, thermal
throttling can limit what is being granted. Add a test to request for
max, but don't fail the test if RP0 is not granted due to throttle
reasons.

Also optimize the selftest by using a common run_test function to avoid
code duplication. Rename the "clamp" tests to vary_max_freq and
vary_min_freq.

v2: Fix compile warning
v3: Review comments (Ashutosh). Added a FIXME for the media RP0 case.
v4: Checkpatch (strict) fixes, remove FIXME and other comments (Ashutosh)

Fixes commit 8ee2c22782 ("drm/i915/guc/slpc: Add SLPC selftest")

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220627230346.27720-1-vinay.belgaumkar@intel.com
2022-06-29 13:50:23 -07:00
Nirmoy Das
a069685637 drm/i915: Fix a lockdep warning at error capture
For some platfroms we use stop_machine version of
gen8_ggtt_insert_page/gen8_ggtt_insert_entries to avoid a
concurrent GGTT access bug but this causes a circular locking
dependency warning:

  Possible unsafe locking scenario:
        CPU0                    CPU1
        ----                    ----
   lock(&ggtt->error_mutex);
                                lock(dma_fence_map);
                                lock(&ggtt->error_mutex);
   lock(cpu_hotplug_lock);

Fix this by calling gen8_ggtt_insert_page/gen8_ggtt_insert_entries
directly at error capture which is concurrent GGTT access safe because
reset path make sure of that.

v2: Fix rebase conflict and added a comment.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5595
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624110821.29190-1-nirmoy.das@intel.com
2022-06-29 14:52:50 +05:30
Vinay Belgaumkar
58eaa6b3fb drm/i915/guc/slpc: Use non-blocking H2G for waitboost
SLPC min/max frequency updates require H2G calls. We are seeing
timeouts when GuC channel is backed up and it is unable to respond
in a timely fashion causing warnings and affecting CI.

This is seen when waitboosting happens during a stress test.
this patch updates the waitboost path to use a non-blocking
H2G call instead, which returns as soon as the message is
successfully transmitted.

v2: Use drm_notice to report any errors that might occur while
sending the waitboost H2G request (Tvrtko)
v3: Add drm_notice inside force_min_freq (Ashutosh)

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220623003225.23301-1-vinay.belgaumkar@intel.com
2022-06-27 15:08:58 -07:00
Umesh Nerlige Ramappa
0667429ce6 drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend
For execlists backend, current implementation of Wa_22011802037 is to
stop the CS before doing a reset of the engine. This WA was further
extended to wait for any pending MI FORCE WAKEUPs before issuing a
reset. Add the extended steps in the execlist path of reset.

In addition, extend the WA to gen11.

v2: (Tvrtko)
- Clarify comments, commit message, fix typos
- Use IS_GRAPHICS_VER for gen 11/12 checks

v3: (Daneile)
- Drop changes to intel_ring_submission since WA does not apply to it
- Log an error if MSG IDLE is not defined for an engine

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: f6aa0d713c ("drm/i915: Add Wa_22011802037 force cs halt")
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com
2022-06-27 12:21:22 -07:00
Alan Previn
59bcdb564b drm/i915/guc: Don't update engine busyness stats too frequently
Using two different types of workoads, it was observed that
guc_update_engine_gt_clks was being called too frequently and/or
causing a CPU-to-lmem bandwidth hit over PCIE. Details on
the workloads and numbers are in the notes below.

Background: At the moment, guc_update_engine_gt_clks can be invoked
via one of 3 ways. #1 and #2 are infrequent under normal operating
conditions:
     1.When a predefined "ping_delay" timer expires so that GuC-
       busyness can sample the GTPM clock counter to ensure it
       doesn't miss a wrap-around of the 32-bits of the HW counter.
       (The ping_delay is calculated based on 1/8th the time taken
       for the counter go from 0x0 to 0xffffffff based on the
       GT frequency. This comes to about once every 28 seconds at a
       GT frequency of 19.2Mhz).
     2.In preparation for a gt reset.
     3.In response to __gt_park events (as the gt power management
       puts the gt into a lower power state when there is no work
       being done).

Root-cause: For both the workloads described farther below, it was
observed that when user space calls IOCTLs that unparks the
gt momentarily and repeats such calls many times in quick succession,
it triggers calling guc_update_engine_gt_clks as many times. However,
the primary purpose of guc_update_engine_gt_clks is to ensure we don't
miss the wraparound while the counter is ticking. Thus, the solution
is to ensure we skip that check if gt_park is calling this function
earlier than necessary.

Solution: Snapshot jiffies when we do actually update the busyness
stats. Then get the new jiffies every time intel_guc_busyness_park
is called and bail if we are being called too soon. Use half of the
ping_delay as a safe threshold.

NOTE1: Workload1: IGTs' gem_create was modified to create a file handle,
allocate memory with sizes that range from a min of 4K to the max supported
(in power of two step-sizes). Its maps, modifies and reads back the
memory. Allocations and modification is repeated until total memory
allocation reaches the max. Then the file handle is closed. With this
workload, guc_update_engine_gt_clks was called over 188 thousand times
in the span of 15 seconds while this test ran three times. With this patch,
the number of calls reduced to 14.

NOTE2: Workload2: 30 transcode sessions are created in quick succession.
While these sessions are created, pcm-iio tool was used to measure I/O
read operation bandwidth consumption sampled at 100 milisecond intervals
over the course of 20 seconds. The total bandwidth consumed over 20 seconds
without this patch was measured at average at 311KBps per sample. With this
patch, the number went down to about 175Kbps which is about a 43% savings.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220623023157.211650-2-alan.previn.teres.alexis@intel.com
2022-06-27 11:35:54 -07:00
Matt Roper
7d8097073c drm/i915: Prefer "XEHP_" prefix for registers
We've been introducing new registers with a mix of "XEHP_"
(architecture) and "XEHPSDV_" (platform) prefixes.  For consistency,
let's settle on "XEHP_" as the preferred form.

XEHPSDV_RP_STATE_CAP stays with its current name since that's truly a
platform-specific register and not something that applies to the Xe_HP
architecture as a whole.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Caz Yokoyama <caz@caztech.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624210328.308630-2-matthew.d.roper@intel.com
2022-06-27 07:44:25 -07:00
Matt Roper
8524bb6714 drm/i915: Correct duplicated/misplaced GT register definitions
XEHPSDV_FLAT_CCS_BASE_ADDR, GEN8_L3_LRA_1_GPGPU, and MMCD_MISC_CTRL were
duplicated between i915_reg.h and intel_gt_regs.h.  These are all GT
registers, so we should drop the copy from i915_reg.h.

XEHPSDV_TILE0_ADDR_RANGE was defined in i915_reg.h, but really belongs
in intel_gt_regs.h.  Move it.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624210328.308630-1-matthew.d.roper@intel.com
2022-06-27 07:44:17 -07:00
Dave Airlie
805ada63ba Merge tag 'drm-intel-next-2022-06-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- General driver clean-up (Jani, Ville, Julia)
- DG2 enabling (Anusha, Vandita)
- Fix sparse warnings (Imre, Jani)
- DMC MMIO range checks (Anusha)
- Audio related fixes (Jani)
- Runtime PM fixes (Anshuman)
- PSR fixes (Jouni, Jose)
- Media freq factor and per-gt enhancements (Ashutosh, Dale)
- DSI fixes for ICL+ (Jani)
- Disable DMC flip queue handlers (Imre)
- ADL_P voltage swing updates (Balasubramani)
- Use more the VBT for panel information (Ville, Animesh)
- Fix on Type-C ports with TBT mode (Vivek)
- Improve fastset and allow seamless M/N changes (Ville)
- Accept more fixed modes with VRR/DMRRS panels (Ville)
- FBC fix (Jose)
- Remove noise logs (Luca)
- Disable connector polling for a headless SKU (Jouni)
- Sanitize display underrun reporting (Ville)
- ADL-S display PLL w/a (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YrNzP2WTf3WBvpvd@intel.com
2022-06-24 12:07:47 +10:00
Lucas De Marchi
9ce07d94c9 drm/i915/gt: Re-do the intel-gtt split
Re-do what was attempted in commit 7a5c922377 ("drm/i915/gt: Split
intel-gtt functions by arch"). The goal of that commit was to split the
handlers for older hardware that depend on intel-gtt.ko so i915 can
be built for non-x86 archs, after some more patches. Other archs do not
need intel-gtt.ko.

Main issue with the previous approach: it moved all the hooks, including
the gen8, which is used by all platforms gen8 and newer.  Re-do the
split moving only the handlers for gen < 6, which are the only ones
calling out to the separate module.

While at it do some minor cleanups:
  - Rename the prefix s/gen5_/gmch_/ to be more accurate what platforms
    are covered by intel_ggtt_gmch.c
  - Remove dead code for gen12 out of needs_idle_maps()
  - Remove TODO comment leftover
  - Re-order if/else ladder in ggtt_probe_hw() to keep newest platforms
    first

v2: Add minor cleanups (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220617230559.2109427-2-lucas.demarchi@intel.com
2022-06-22 15:52:56 -07:00
Lucas De Marchi
64e06652e3 agp/intel: Rename intel-gtt symbols
Exporting the symbols like intel_gtt_* creates some confusion inside
i915 that has symbols named similarly. In an attempt to isolate
platforms needing intel-gtt.ko, commit 7a5c922377 ("drm/i915/gt: Split
intel-gtt functions by arch") moved way too much
inside gt/intel_gt_gmch.c, even the functions that don't callout to this
module. Rename the symbols to make the separation clear.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220617230559.2109427-1-lucas.demarchi@intel.com
2022-06-22 15:52:55 -07:00
Vinay Belgaumkar
fc98eb494c drm/i915: Add global forcewake request to drpc
We have seen multiple RC6 issues where it is useful to know
which global forcewake bits are set. Add this to the 'drpc'
debugfs output.

v2: Review comments (Ashutosh)

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220617212032.34577-1-vinay.belgaumkar@intel.com
2022-06-21 11:02:49 -07:00
Thomas Hellström
2ef6efa79f drm/i915: Improve on suspend / resume time with VT-d enabled
When DMAR / VT-d is enabled, the display engine uses overfetching,
presumably to deal with the increased latency. To avoid display engine
errors and DMAR faults, as a workaround the GGTT is populated with scatch
PTEs when VT-d is enabled. However starting with gen10, Write-combined
writing of scratch PTES is no longer possible and as a result, populating
the full GGTT with scratch PTEs like on resume becomes very slow as
uncached access is needed.

Therefore, on integrated GPUs utilize the fact that the PTEs are stored in
stolen memory which retain content across S3 suspend. Don't clear the PTEs
on suspend and resume. This improves on resume time with around 100 ms.
While 100+ms might appear like a short time it's 10% to 20% of total resume
time and important in some applications.

One notable exception is Intel Rapid Start Technology which may cause
stolen memory to be lost across what the OS percieves as S3 suspend.
If IRST is enabled or if we can't detect whether IRST is enabled, retain
the old workaround, clearing and re-instating PTEs.

As an additional measure, if we detect that the last ggtt pte was lost
during suspend, print a warning and re-populate the GGTT ptes

On discrete GPUs, the display engine scans out from LMEM which isn't
subject to DMAR, and presumably the workaround is therefore not needed,
but that needs to be verified and disabling the workaround for dGPU,
if possible, will be deferred to a follow-up patch.

v2:
- Rely on retained ptes to also speed up suspend and resume re-binding.
- Re-build GGTT ptes if Intel rst is enabled.
v3:
- Re-build GGTT ptes also if we can't detect whether Intel rst is enabled,
  and if the guard page PTE and end of GGTT was lost.
v4:
- Fix some kerneldoc issues (Matthew Auld), rebase.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220617152856.249295-1-thomas.hellstrom@linux.intel.com
2022-06-20 10:50:01 +02:00
Matt Roper
3fe6c7f53e drm/i915/gt: Cleanup interface for MCR operations
Let's replace the assortment of intel_gt_* and intel_uncore_* functions
that operate on MCR registers with a cleaner set of interfaces:

  * intel_gt_mcr_read -- unicast read from specific instance
  * intel_gt_mcr_read_any[_fw] -- unicast read from any non-terminated
    instance
  * intel_gt_mcr_unicast_write -- unicast write to specific instance
  * intel_gt_mcr_multicast_write[_fw] -- multicast write to all instances

We'll also replace the historic "slice" and "subslice" terminology with
"group" and "instance" to match the documentation for more recent
platforms; these days MCR steering applies to more types of replication
than just slice/subslice.

v2:
 - Reference the new kerneldoc from i915.rst.  (Jani)
 - Tweak the wording of the documentation for a couple functions to
   clarify the difference between "_fw" and non-"_fw" forms.

v3:
 - s/read/write/ to fix copy-paste mistake in a couple comments.
   (Harish)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220615001019.1821989-3-matthew.d.roper@intel.com
2022-06-17 08:05:40 -07:00
Matt Roper
e7858254f9 drm/i915/gt: Move multicast register handling to a dedicated file
Handling of multicast/replicated registers is spread across intel_gt.c
and intel_uncore.c today.  As multicast handling and the related
steering logic gets more complicated with the addition of new platforms
and new rules it makes sense to centralize it all in one place.

For now the existing functions have been moved to the new .c/.h as-is.
Function renames and updates to operate in a more consistent manner will
be done in subsequent patches.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220615001019.1821989-2-matthew.d.roper@intel.com
2022-06-17 08:05:12 -07:00
Tvrtko Ursulin
45c64ecf97 drm/i915: Improve user experience and driver robustness under SIGINT or similar
We have long standing customer complaints that pressing Ctrl-C (or to the
effect of) causes engine resets with otherwise well behaving programs.

Not only is logging engine resets during normal operation not desirable
since it creates support incidents, but more fundamentally we should avoid
going the engine reset path when we can since any engine reset introduces
a chance of harming an innocent context.

Reason for this undesirable behaviour is that the driver currently does
not distinguish between banned contexts and non-persistent contexts which
have been closed.

To fix this we add the distinction between the two reasons for revoking
contexts, which then allows the strict timeout only be applied to banned,
while innocent contexts (well behaving) can preempt cleanly and exit
without triggering the engine reset path.

Note that the added context exiting category applies both to closed non-
persistent context, and any exiting context when hangcheck has been
disabled by the user.

At the same time we rename the backend operation from 'ban' to 'revoke'
which more accurately describes the actual semantics. (There is no ban at
the backend level since banning is a concept driven by the scheduling
frontend. Backends are simply able to revoke a running context so that
is the more appropriate name chosen.)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220527072452.2225610-1-tvrtko.ursulin@linux.intel.com
2022-06-17 09:03:11 +01:00
Matt Roper
1556c3b4c7 drm/i915/pvc: Add recommended MMIO setting
As with past platforms, the bspec's performance tuning guide provides
recommended MMIO settings.  Although not technically "workarounds" we
apply these through the workaround framework to ensure that they're
re-applied at the proper times (e.g., on engine resets) and that any
conflicts with real workarounds are flagged.

Bspec: 72161
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220613165314.862029-1-matthew.d.roper@intel.com
2022-06-15 10:54:35 -07:00
Matt Roper
9affc1b87e drm/i915/pvc: Adjust EU per SS according to HAS_ONE_EU_PER_FUSE_BIT()
If we're treating each bit in the EU fuse register as a single EU
instead of a pair of EUs, then that also cuts the number of potential
EUs per subslice in half.

Fixes: 5ac342ef84 ("drm/i915/pvc: Add SSEU changes")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220610230801.459577-1-matthew.d.roper@intel.com
2022-06-14 14:52:13 -07:00
Matt Roper
e0d7371b46 drm/i915/pvc: Add register steering
Ponte Vecchio no longer has MSLICE or LNCF steering, but the bspec does
document several new types of multicast register ranges.  Fortunately,
most of the different MCR types all provide valid values at instance
(0,0) so there's no need to read fuse registers and calculate a
non-terminated instance.  We'll lump all of those range types (BSLICE,
HALFBSLICE, TILEPSMI, CC, and L3BANK) into a single category called
"INSTANCE0" to keep things simple.  We'll also perform explicit steering
for each of these multicast register types, even if the implicit
steering setup for COMPUTE/DSS ranges would have worked too; this is
based on guidance from our hardware architects who suggested that we
move away from implicit steering and start explicitly steer all MCR
register accesses on modern platforms (we'll work on transitioning
COMPUTE/DSS to explicit steering in the future).

Note that there's one additional MCR range type defined in the bspec
(SQIDI) that we don't handle here.  Those ranges use a different
steering control register that we never touch; since instance 0 is also
always a valid setting there, we can just ignore those ranges.

Finally, we'll rename the HAS_MSLICES() macro to HAS_MSLICE_STEERING().
PVC hardware still has units referred to as mslices, but there's no
register steering based on mslice for this platform.

v2:
 - Rebase on other recent changes
 - Swap two table rows to keep table sorted & easy to read.  (Harish)

Bspec: 67609
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220608170700.4026648-1-matthew.d.roper@intel.com
2022-06-09 07:49:30 -07:00
Matt Roper
17f65658c8 drm/i915/xehp: Correct steering initialization
Another mistake during the conversion to DSS bitmaps:  after retrieving
the DSS ID intel_sseu_find_first_xehp_dss() we forgot to modulo it down
to obtain which ID within the current gslice it is.

Fixes: b87d390196 ("drm/i915/sseu: Disassociate internal subslice mask representation from uapi")
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607175716.3338661-1-matthew.d.roper@intel.com
2022-06-08 07:25:58 -07:00
Matt Roper
c5cb0002d1 drm/i915: More PVC+DG2 workarounds
A new PVC+DG2 workaround has appeared recently:
 - Wa_16015675438

And a couple existing DG2 workarounds have been extended to PVC:
 - Wa_14015795083
 - Wa_18018781329

Note that Wa_16015675438 asks us to program a register that is in the
0x2xxx range typically associated with the RCS engine, even though PVC
does not have an RCS.  By default the GuC will think we've made a
mistake and throw an exception when it sees this register on a CCS
engine's save/restore list, so we need to pass an extra GuC control flag
to tell it that this is expected and not a problem.

Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220608005108.3717895-1-matthew.d.roper@intel.com
2022-06-08 07:25:03 -07:00
Jani Nikula
5821a0bbb4 drm/i915/uc: remove accidental static from a local variable
The arrays are static const, but the pointer shouldn't be static.

Fixes: 3d832f370d ("drm/i915/uc: Allow platforms to have GuC but not HuC")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220511094619.27889-1-jani.nikula@intel.com
2022-06-08 11:47:57 +03:00
Matt Roper
81298056a7 drm/i915/dg2: Correct DSS check for Wa_1308578152
When converting our DSS masks to bitmaps, we fumbled the condition used
to check whether any DSS are present in the first gslice.  Since
intel_sseu_find_first_xehp_dss() returns a 0-based number, we need a >=
condition rather than >.

Fixes: b87d390196 ("drm/i915/sseu: Disassociate internal subslice mask representation from uapi")
Reported-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607154724.3155521-1-matthew.d.roper@intel.com
2022-06-07 17:51:54 -07:00
Anshuman Gupta
c6e3806705 drm/i915/dg2: Add Wa_14015795083
i915 must disable Render DOP clock gating globally.

v2:
- Addressed cosmetic review comments.

Bspec: 52621
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607104542.8559-1-anshuman.gupta@intel.com
2022-06-07 13:54:24 -07:00
Stuart Summers
b729cfee70 drm/i915: Add extra registers to GPU error dump
Our internal teams have identified a few additional engine registers
that are worth inspecting in error state dumps during development &
debug.  Let's capture and print them as part of our error dump.

For simplicity we'll just dump these registers on gen11 and beyond.
Most of these registers have existed since earlier platforms (e.g., gen6
or gen7) but were initially introduced only for a subset of the
platforms' engines; gen11 seems to be where they became available on all
engines.

Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601210646.615946-1-matthew.d.roper@intel.com
2022-06-02 09:14:36 -07:00
Matt Roper
5ac342ef84 drm/i915/pvc: Add SSEU changes
PVC splits the mask of enabled DSS over two registers.  It also changes
the meaning of the EU fuse register such that each bit represents a
single EU rather than a pair of EUs.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-7-matthew.d.roper@intel.com
2022-06-02 07:21:09 -07:00
Matt Roper
b87d390196 drm/i915/sseu: Disassociate internal subslice mask representation from uapi
As with EU masks, it's easier to store subslice/DSS masks internally in
a format that's more natural for the driver to work with, and then only
covert into the u8[] uapi form when the query ioctl is invoked.  Since
the hardware design changed significantly with Xe_HP, we'll use a union
to choose between the old "hsw-style" subslice masks or the newer xehp
mask.  HSW-style masks will be stored in an array of u8's, indexed by
slice (there's never more than 6 subslices per slice on older
platforms).  For Xe_HP and beyond where slices no longer exist, we only
need a single bitmask.  However we already know that this mask is
eventually going to grow too large for a simple u64 to hold, so we'll
represent it in a manner that can be operated on by the utilities in
linux/bitmap.h.

v2:
 - Fix typo: BIT(s) -> BIT(ss) in gen9_sseu_device_status()

v3:
 - Eliminate sseu->ss_stride and just calculate the stride while
   specifically handling uapi.  (Tvrtko)
 - Use BITMAP_BITS() macro to refer to size of masks rather than
   passing I915_MAX_SS_FUSE_BITS directly.  (Tvrtko)
 - Report compute/geometry DSS masks separately when dumping Xe_HP SSEU
   info.  (Tvrtko)
 - Restore dropped range checks to intel_sseu_has_subslice().  (Tvrtko)

v4:
 - Make the bitmap size macro check the size of the .xehp field rather
   than the containing union.  (Tvrtko)
 - Don't add GEM_BUG_ON() intel_sseu_has_subslice()'s check for whether
   slice or subslice ID exceed sseu->max_[sub]slices; various loops
   in the driver are expected to exceed these, so we should just
   silently return 'false.'

v5:
 - Move XEHP_BITMAP_BITS() to the header so that we can also replace a
   usage of I915_MAX_SS_FUSE_BITS in one of the inline functions.
   (Bala)
 - Change the local variable in intel_slicemask_from_xehp_dssmask() from
   u16 to 'unsigned long' to make it a bit more future-proof.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-6-matthew.d.roper@intel.com
2022-06-02 07:20:59 -07:00
Matt Roper
bc3c5e0809 drm/i915/sseu: Don't try to store EU mask internally in UAPI format
Storing the EU mask internally in the same format the I915_QUERY
topology queries use makes the final copy_to_user() a bit simpler, but
makes the rest of the driver's SSEU more complicated and harder to
follow.  Let's switch to an internal representation that's more natural:
Xe_HP platforms will be a simple array of u16 masks, whereas pre-Xe_HP
platforms will be a two-dimensional array, indexed by [slice][subslice].
We'll convert to the uapi format only when the query uapi is called.

v2:
 - Drop has_common_ss_eumask.  We waste some space repeating identical
   EU masks for every single DSS, but the code is simpler without it.
   (Tvrtko)

v3:
 - Mask down EUs passed to sseu_set_eus at the callsite rather than
   inside the function.  (Tvrtko)
 - Eliminate sseu->eu_stride and calculate it when needed.  (Tvrtko)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-5-matthew.d.roper@intel.com
2022-06-02 07:19:20 -07:00
Matt Roper
4cfd166596 drm/i915/sseu: Simplify gen11+ SSEU handling
Although gen11 and gen12 architectures supported the concept of multiple
slices, in practice all the platforms that were actually designed only
had a single slice (i.e., note the parameters to 'intel_sseu_set_info'
that we pass for each platform).  We can simplify the code slightly by
dropping the multi-slice logic from gen11+ platforms.

v2:
 - Promote drm_dbg to drm_WARN_ON if the slice fuse register reports
   unexpected fusing.  (Tvrtko)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-4-matthew.d.roper@intel.com
2022-06-02 07:19:14 -07:00
Matt Roper
935a3c66eb drm/i915/xehp: Use separate sseu init function
Xe_HP has enough fundamental differences from previous platforms that it
makes sense to use a separate SSEU init function to keep things
straightforward and easy to understand.  We'll also add a has_xehp_dss
flag to the SSEU structure that will be used by other upcoming changes.

v2:
 - Add has_xehp_dss flag

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-2-matthew.d.roper@intel.com
2022-06-02 07:18:45 -07:00
Stuart Summers
ce581ae142 drm/i915/pvc: Add initial PVC workarounds
Bspec: 64027
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220527163348.1936146-3-matthew.d.roper@intel.com
2022-05-31 14:44:58 -07:00
Ashutosh Dixit
69d6bf5c37 drm/i915/gt: Fix memory leaks in per-gt sysfs
All kmalloc'd kobjects need a kobject_put() to free memory. For example in
previous code, kobj_gt_release() never gets called. The requirement of
kobject_put() now results in a slightly different code organization.

v2: s/gtn/gt/ (Andi)

Fixes: b770bcfae9 ("drm/i915/gt: create per-tile sysfs interface")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a6f6686517c85fba61a0c45097f5bb4fe7e257fb.1653484574.git.ashutosh.dixit@intel.com
2022-05-26 09:37:05 +01:00
Dale B Stimson
9d15dd1bb3 drm/i915/gt: Add media RP0/RPn to per-gt sysfs
Retrieve RP0 and RPn freq for media IP from PCODE and display in per-gt
sysfs. This patch adds the following files to gt/gtN sysfs:
* media_RP0_freq_mhz
* media_RPn_freq_mhz

v2: Fixed commit author (Rodrigo)
v3: Convert to new uncore interface for pcode functions
v4: Adapt to intel_pcode.* function rename
v5: #include "intel_pcode.h" in alphabetical order (Tvrtko)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Dale B Stimson <dale.b.stimson@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/45e34127a79e808f6582db8afb77f2f728a446e6.1653484574.git.ashutosh.dixit@intel.com
2022-05-26 09:36:59 +01:00
Ashutosh Dixit
26be7cd8aa drm/i915/gt: Add media freq factor to per-gt sysfs
Expose new sysfs to program and retrieve media freq factor. Factor values
of 0 (dynamic), 0.5 and 1.0 are supported via a u8.8 fixed point
representation (corresponding to integer values of 0, 128 and 256
respectively).

Media freq factor is converted to media_ratio_mode for GuC. It is
programmed into GuC using H2G SLPC interface. It is retrieved from GuC
through a register read. A cached media_ratio_mode is maintained to
preserve set values across GuC resets.

This patch adds the following sysfs files to gt/gtN sysfs:
* media_freq_factor
* media_freq_factor.scale

v2: Minor wording change in drm_warn (Tvrtko)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Dale B Stimson <dale.b.stimson@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7ad7578335d8af9cba047b4bcf33d1887453d2e1.1653484574.git.ashutosh.dixit@intel.com
2022-05-26 09:36:57 +01:00
Linus Torvalds
2518f226c6 Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
 "Intel have enabled DG2 on certain SKUs for laptops, AMD has started
  some new GPU support, msm has user allocated VA controls

  dma-buf:
   - add dma_resv_replace_fences
   - add dma_resv_get_singleton
   - make dma_excl_fence private

  core:
   - EDID parser refactorings
   - switch drivers to drm_mode_copy/duplicate
   - DRM managed mutex initialization

  display-helper:
   - put HDMI, SCDC, HDCP, DSC and DP into new module

  gem:
   - rework fence handling

  ttm:
   - rework bulk move handling
   - add common debugfs for resource managers
   - convert to kvcalloc

  format helpers:
   - support monochrome formats
   - RGB888, RGB565 to XRGB8888 conversions

  fbdev:
   - cfb/sys_imageblit fixes
   - pagelist corruption fix
   - create offb platform device
   - deferred io improvements

  sysfb:
   - Kconfig rework
   - support for VESA mode selection

  bridge:
   - conversions to devm_drm_of_get_bridge
   - conversions to panel_bridge
   - analogix_dp - autosuspend support
   - it66121 - audio support
   - tc358767 - DSI to DPI support
   - icn6211 - PLL/I2C fixes, DT property
   - adv7611 - enable DRM_BRIDGE_OP_HPD
   - anx7625 - fill ELD if no monitor
   - dw_hdmi - add audio support
   - lontium LT9211 support, i.MXMP LDB
   - it6505: Kconfig fix, DPCD set power fix
   - adv7511 - CEC support for ADV7535

  panel:
   - ltk035c5444t, B133UAN01, NV3052C panel support
   - DataImage FG040346DSSWBG04 support
   - st7735r - DT bindings fix
   - ssd130x - fixes

  i915:
   - DG2 laptop PCI-IDs ("motherboard down")
   - Initial RPL-P PCI IDs
   - compute engine ABI
   - DG2 Tile4 support
   - DG2 CCS clear color compression support
   - DG2 render/media compression formats support
   - ATS-M platform info
   - RPL-S PCI IDs added
   - Bump ADL-P DMC version to v2.16
   - Support static DRRS
   - Support multiple eDP/LVDS native mode refresh rates
   - DP HDR support for HSW+
   - Lots of display refactoring + fixes
   - GuC hwconfig support and query
   - sysfs support for multi-tile
   - fdinfo per-client gpu utilisation
   - add geometry subslices query
   - fix prime mmap with LMEM
   - fix vm open count and remove vma refcounts
   - contiguous allocation fixes
   - steered register write support
   - small PCI BAR enablement
   - GuC error capture support
   - sunset igpu legacy mmap support for newer devices
   - GuC version 70.1.1 support

  amdgpu:
   - Initial SoC21 support
   - SMU 13.x enablement
   - SMU 13.0.4 support
   - ttm_eu cleanups
   - USB-C, GPUVM updates
   - TMZ fixes for RV
   - RAS support for VCN
   - PM sysfs code cleanup
   - DC FP rework
   - extend CG/PG flags to 64-bit
   - SI dpm lockdep fix
   - runtime PM fixes

  amdkfd:
   - RAS/SVM fixes
   - TLB flush fixes
   - CRIU GWS support
   - ignore bogus MEC signals more efficiently

  msm:
   - Fourcc modifier for tiled but not compressed layouts
   - Support for userspace allocated IOVA (GPU virtual address)
   - DPU: DSC (Display Stream Compression) support
   - DP: eDP support
   - DP: conversion to use drm_bridge and drm_bridge_connector
   - Merge DPU1 and MDP5 MDSS driver
   - DPU: writeback support

  nouveau:
   - make some structures static
   - make some variables static
   - switch to drm_gem_plane_helper_prepare_fb

  radeon:
   - misc fixes/cleanups

  mxsfb:
   - rework crtc mode setting
   - LCDIF CRC support

  etnaviv:
   - fencing improvements
   - fix address space collisions
   - cleanup MMU reference handling

  gma500:
   - GEM/GTT improvements
   - connector handling fixes

  komeda:
   - switch to plane reset helper

  mediatek:
   - MIPI DSI improvements

  omapdrm:
   - GEM improvements

  qxl:
   - aarch64 support

  vc4:
   - add a CL submission tracepoint
   - HDMI YUV support
   - HDMI/clock improvements
   - drop is_hdmi caching

  virtio:
   - remove restriction of non-zero blob types

  vmwgfx:
   - support for cursormob and cursorbypass 4
   - fence improvements

  tidss:
   - reset DISPC on startup

  solomon:
   - SPI support
   - DT improvements

  sun4i:
   - allwinner D1 support
   - drop is_hdmi caching

  imx:
   - use swap() instead of open-coding
   - use devm_platform_ioremap_resource
   - remove redunant initializations

  ast:
   - Displayport support

  rockchip:
   - Refactor IOMMU initialisation
   - make some structures static
   - replace drm_detect_hdmi_monitor with drm_display_info.is_hdmi
   - support swapped YUV formats,
   - clock improvements
   - rk3568 support
   - VOP2 support

  mediatek:
   - MT8186 support

  tegra:
   - debugabillity improvements"

* tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm: (1740 commits)
  drm/i915/dsi: fix VBT send packet port selection for ICL+
  drm/i915/uc: Fix undefined behavior due to shift overflowing the constant
  drm/i915/reg: fix undefined behavior due to shift overflowing the constant
  drm/i915/gt: Fix use of static in macro mismatch
  drm/i915/audio: fix audio code enable/disable pipe logging
  drm/i915: Fix CFI violation with show_dynamic_id()
  drm/i915: Fix 'mixing different enum types' warnings in intel_display_power.c
  drm/i915/gt: Fix build error without CONFIG_PM
  drm/msm/dpu: handle pm_runtime_get_sync() errors in bind path
  drm/msm/dpu: add DRM_MODE_ROTATE_180 back to supported rotations
  drm/msm: don't free the IRQ if it was not requested
  drm/msm/dpu: limit writeback modes according to max_linewidth
  drm/amd: Don't reset dGPUs if the system is going to s2idle
  drm/amdgpu: Unmap legacy queue when MES is enabled
  drm: msm: fix possible memory leak in mdp5_crtc_cursor_set()
  drm/msm: Fix fb plane offset calculation
  drm/msm/a6xx: Fix refcount leak in a6xx_gpu_init
  drm/msm/dsi: don't powerup at modeset time for parade-ps8640
  drm/rockchip: Change register space names in vop2
  dt-bindings: display: rockchip: make reg-names mandatory for VOP2
  ...
2022-05-25 16:18:27 -07:00
Matt Roper
16e214d4ae drm/i915/hwconfig: Future-proof platform checks
PVC also has a hwconfig table.  Actually the current expectation is that
all future platforms will have hwconfig, so let's just change the
condition to an IP version check so that we don't need to keep updating
this for each new platform that shows up.

Cc: John Harrison <john.c.harrison@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220524235906.529771-1-matthew.d.roper@intel.com
2022-05-25 12:38:22 -07:00