drm/i915 features for v5.11:
Highlights:
- Enable big joiner to join two pipes to one port to overcome pipe restrictions
(Manasi, Ville, Maarten)
Display:
- More DG1 enabling (Lucas, Aditya)
- Fixes to cases without display (Lucas, José, Jani)
- Initial PSR state improvements (José)
- JSL eDP vswing updates (Tejas)
- Handle EDID declared max 16 bpc (Ville)
- Display refactoring (Ville)
Other:
- GVT features
- Backmerge
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87czzzkk1s.fsf@intel.com
After having written the entire OA buffer with reports, the HW will
write again at the beginning of the OA buffer. It'll indicate it by
setting the WRAP bits in the OASTATUS register.
When a wrap happens and that at the end of the read vfunc we write the
OASTATUS register back to clear the REPORT_LOST bit, we sometimes see
that the OATAILPTR register is reset to a previous position on Gen8/9
(apparently not the case on Gen11+). This leads the next call to the
read vfunc to process reports we've already read. Because we've marked
those as read by clearing the reason & timestamp dwords, they're
discarded and a "Skipping spurious, invalid OA report" message is
emitted.
The workaround to avoid this OATAILPTR value reset seems to be to set
the wrap bits when writing back OASTATUS.
This change has no impact on userspace, it only avoids a bunch of
DRM_NOTE("Skipping spurious, invalid OA report\n") messages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 19f81df285 ("drm/i915/perf: Add OA unit support for Gen 8+")
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117130124.829979-1-lionel.g.landwerlin@intel.com
Cross-subsystem Changes:
- DMA mapped scatterlist fixes in i915 to unblock merging of
https://lkml.org/lkml/2020/9/27/70 (Tvrtko, Tom)
Driver Changes:
- Fix for user reported issue #2381 (Graphical output stops with "switching to inteldrmfb from simple"):
Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init (Ville, Chris)
- Fix for Tigerlake (and earlier) to avoid spurious empty CSB events leading to hang (Chris, Bruce)
- Delay execlist processing for Tigerlake to avoid hang (Chris)
- Fix for Tigerlake RCS engine health check through heartbeat (Chris)
- Fix for Tigerlake reserved MOCS entries (Ayaz, Chris)
- Fix Media power gate sequence on Tigerlake (Rodrigo)
- Enable eLLC caching of display buffers for SKL+ (Ville)
- Support parsing of oversize batches on Gen9 (Matt, Chris)
- Exclude low pages (128KiB) of stolen from use to avoid thrashing during reset (Chris)
- Flush engines before Tigerlake breadcrumbs (Chris)
- Use the local HWSP offset during submission (Chris)
- Flush coherency domains on first set-domain-ioctl (Chris, Zbigniew)
- Use the active reference on the vma while capturing to avoid use-after-free (Chris)
- Fix MOCS PTE setting for gen9+ (Ville)
- Avoid NULL dereference on IPS driver callback while unbinding i915 (Chris)
- Avoid NULL dereference from PT/PD stash allocation error (Matt)
- Hold request reference for canceling an active context (Chris)
- Avoid infinite loop on x86-32 when mapping a lot of objects (Chris)
- Disallow WC mappings when processor doesn't support them (Chris)
- Return correct error in i915_gem_object_copy_blt() error path (Dan)
- Return correct error in intel_context_create_request() error path (Maarten)
- Tune down GuC communication enabled/disabled messages to debug (Jani)
- Fix rebased commit "Remove i915_request.lock requirement for execution callbacks" (Chris)
- Cancel outstanding work after disabling heartbeats on an engine (Chris)
- Signal cancelled requests (Chris)
- Retire cancelled requests on unload (Chris)
- Scrub HW state on driver remove (Chris)
- Undo forced context restores after trivial preemptions (Chris)
- Handle PCI unbind in PMU code (Tvrtko)
- Fix CPU hotplug with multiple GPUs in PMU code (Trtkko)
- Correctly set SFC capability for video engines (Venkata)
- Update GuC code to use firmware v49.0.1 (John, Matthew B., Daniele, Oscar, Michel, Rodrigo, Michal)
- Improve GuC warnings on loading failure (John)
- Avoid ownership race in buffer pool by clearing age (Chris)
- Use MMIO to read CSB in case of failure (Chris, Mika)
- Show engine properties in engine state dump to indicate changes (Chris, Joonas)
- Break up error capture compression loops with cond_resched() (Chris)
- Reduce GPU error capture mutex hold time to avoid khungtaskd (Chris)
- Serialise debugfs i915_gem_objects with ctx->mutex (Chris)
- Always test execution status on closing the context and close if not persistent (Chris)
- Avoid mixing integer types during batch copies (Chris, Jared)
- Skip over MI_NOOP when parsing to avoid overhead (Chris)
- Hold onto an explicit ref to i915_vma_work.pinned (Chris)
- Perform all asynchronous waits prior to marking payload start (Chris)
- Pull phys pread/pwrite implementations to the backend (Matt)
- Improve record of hung engines in error state (Tvrtko)
- Allow backends to override pread implementation (Matt)
- Reinforce LRC poisoning checks to confirm context survives execution (Chris)
- Fix memory region max size calculation (Matt)
- Fix order when adding blocks to memory region (Matt)
- Eliminate unused intel_virtual_engine_get_sibling func (Chris)
- Cleanup kasan warning for on-stack (unsigned long) casting (Chris)
- Onion unwind for scratch page allocation failure (Chris)
- Poison stolen pages before use (Chris)
- Selftest improvements (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112163407.GA20320@jlahtine-mobl.ger.corp.intel.com
DG1 uses 2 registers for the ddi clock mapping, with PHY A and B using
DPCLKA_CFGCR0 and PHY C and D using DPCLKA1_CFGCR0. Hide this behind a
single macro that chooses the correct register according to the phy
being accessed, use the correct bitfields for each pll/phy and implement
separate functions for DG1 since it doesn't share much with ICL/TGL
anymore.
The previous values were correct for PHY A and B since they were using
the same register as before and the bitfields were matching.
v2: Add comment and try to simplify DG1_DPCLKA* macros by reusing
previous ones
v3:
- Fix DG1_DPCLKA_CFGCR0_DDI_CLK_SEL_MASK() after wrong macro reuse
- Move phy -> id map to a separate macro (Aditya)
- Remove DG1_DPCLKA_CFGCR0_DDI_CLK_SEL_MASK where not required
(Aditya)
- Use drm_WARN_ON
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Aditya Swarup <aditya.swarup@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106210006.837953-1-lucas.demarchi@intel.com
The power well that we've been referring to as the 'blitter' well is
actually more of a general GT power well which contains a lot of things
other than the blitter engine registers. The FORCEWAKE_BLITTER name in
the code was used for historic reasons, but no longer matches how the
bspec describes this power well and just causes confusion for people not
familiar with this area of the code. Let's rename it to FORCEWAKE_GT to
more accurately describe the role of the power well and match how the
modern bspec refers to it.
v2:
- Add a comment noting that the GT power well includes the blitter
engine. (Jose)
Bspec: 66696, 66534, 67609
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009194442.3668677-2-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
The BIOS of at least one ASUS-Z170M system with an SKL I have programs
the 101b WRPLL PDIV divider value, which is the encoding for PDIV=7 with
bit#0 incorrectly set.
This happens with the
"3840x2160": 30 262750 3840 3888 3920 4000 2160 2163 2168 2191 0x48 0x9
HDMI mode (scaled from a 1024x768 src fb) set by BIOS and the
ref_clock=24000, dco_integer=383, dco_fraction=5802, pdiv=7, qdiv=1, kdiv=1
WRPLL parameters (assuming PDIV=7 was the intended setting). This
corresponds to 262749 PLL frequency/port clock.
Later the driver sets the same mode for which it calculates the same
dco_int/dco_frac/div WRPLL parameters (with the correct PDIV=7 encoding).
Based on the above, let's assume that PDIV=7 was intended and the HW
just ignores bit#0 in the PDIV register field for this setting, treating
100b and 101b encodings the same way.
While at it add the MISSING_CASE() for the p0,p2 divider decodings.
v2: (Ville)
- Add a define for the incorrect divider value.
- Emit only a debug message when detecting the incorrect divider value.
- Use fallthrough from the incorrect divider value case.
- Add the MISSING_CASE()s.
v3: Return 0 freq for incorrect divider values. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006013555.1488262-1-imre.deak@intel.com
DG1 does some additional pcode/uncore handshaking at
boot time; this handshaking must complete before various other pcode
commands are effective and before general work is submitted to the GPU.
We need to poll a new pcode mailbox during startup until it reports that
this handshaking is complete.
The bspec doesn't give guidance on how long we may need to wait for this
handshaking to complete. For now, let's just set a really long timeout;
if we still don't get a completion status by the end of that timeout,
we'll just continue on and hope for the best.
v2 (Lucas): Rename macros to make clear the relation between command and
result (requested by José)
Bspec: 52065
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001063917.3133475-2-lucas.demarchi@intel.com
The command register is the PCODE MBOX low register not the high one as
described by the spec. This left the system with the TC-cold power state
being blocked all the time. Fix things by using the correct register.
Also to make sure we retry a request for at least 600usec, when the
PCODE MBOX command itself succeeded, but the TC-cold block command
failed, sleep for 1msec unconditionally after any fail.
The change was tested with JTAG register read of the HW/FW's actual
TC-cold state, which reported the expected states after this change.
Tested-by: Nivedita Swaminathan <nivedita.swaminathan@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200805150056.24248-1-imre.deak@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
After doing normal PHY-B initialization on Rocket Lake, we need to
manually copy some additional PHY-A register values into PHY-B
registers.
Note that the bspec's combo phy page doesn't specify that this
workaround is restricted to specific platform steppings (and doesn't
even do a very good job of specifying that RKL is the only platform this
is needed on), but the RKL workaround page lists this as relevant only
for A and B steppings, so I'm trusting that information for now.
v2: Make rkl_combo_phy_b_init_wa() static
v3:
- Minimize variables in WA function. (Jose)
- Fix timeout duration (usec vs msec). (Jose)
- Add verification of workaround. (Jose)
- Fix stepping bounds in comment.
Bspec: 49291
Bspec: 53273
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716220551.2730644-6-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If HTI (also sometimes called HDPORT) is enabled at startup, it may be
using some of the PHYs and DPLLs making them unavailable for general
usage. Let's read out the HDPORT_STATE register and avoid making use of
resources that HTI is already using.
v2:
- Fix minor checkpatch warnings
v3:
- Just readout HDPORT_STATE register once during init and then parse it
later as needed.
- Add a 'has_hti' device info flag to track whether we should readout
HDPORT_STATE or not. We can skip the platform/flag tests later since
the hti_state in dev_priv will remain 0 for platforms it does not
apply to.
- Move PLL masking into icl_get_combo_phy_dpll() since at the moment
RKL is the only platform that has HTI. (Jose)
Bspec: 49189
Bspec: 53707
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716220551.2730644-5-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Rocket Lake has a third DPLL (called 'DPLL4') that must be used to
enable a third display. Unlike EHL's variant of DPLL4, the RKL variant
behaves the same as DPLL0/1. And despite its name, the DPLL4 registers
are offset as if it were DPLL2.
v2:
- Add new .update_ref_clks() hook.
v3:
- Renumber TBT PLL to '3' and switch _MMIO_PLL3 to _MMIO_PLL (Lucas)
v4:
- Don't drop _MMIO_PLL3; although it's now unused, we're going to need
it very soon again for upcoming DG1 patches. (Lucas)
v5:
- Don't re-number TBT PLL and beyond, just use new RKL_DPLL_CFGCR
macros to lookup the proper registers instead. Although renumbering
the PLLs might be something we want to consider down the road, it
opens a big can of worms right now since a bunch of places in the
code have an assumption that the PLL table has idx==id and no holes.
Renumbering creates a hole for TGL, so we'd either need to allow
holes in the table or break the idx==id invariant, both of which are
somewhat invasive changes to the design.
Bspec: 49202
Bspec: 49443
Bspec: 50288
Bspec: 50289
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716220551.2730644-4-matthew.d.roper@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Hours Of Battery Life is a new GEN12+ power-saving feature that allows
supported motherboards to use a special voltage swing table for eDP
panels that uses less power.
So here if supported by HW, OEM will set it in VBT and i915 will try
to train link with HOBL vswing table if link training fails it fall
back to the original table.
intel_ddi_dp_preemph_max() was optimized to only check the HOBL flag
instead of do something like is done in intel_ddi_dp_voltage_max()
because it is only called after the first entry of the voltage swing
table was loaded so the HOBL flag is valid at that point.
v3:
- removed a few parameters of icl_ddi_combo_vswing_program() that
can be taken from encoder
v4:
- using the HOBL vswing table until training fails completely (Ville)
v5:
- not reducing lane or link rate when link training fails with HOBL
active
- duplicated the HOBL voltage swing entry to match DP spec requirement
v6:
- removed the optional VS 3 & pre-emp 0 from HOBL table
- changed from u8:1 to bool to store hobl_failed/active
BSpec: 49291
BSpec: 49399
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200715175637.33763-1-jose.souza@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Normally i85x/i865 3D activity will block FBC until a 2D blit
occurs. I suppose this was meant to avoid recompression while
3D activity is still going on but the frame hasn't yet been
presented. Unfortunately that also means that a page flipped
3D workload will permanently block FBC even if it only renders
a single frame and then does nothing.
Since we are using software render tracking anyway we might as
well flip the chicken bit so that 3D does not block FBC. This
will avoid the permament FBC blockage in the aforemention use
case, but thanks to the software tracking the compressor will
not disturb 3D rendering activity.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702153723.24327-5-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Catch up with upstream, in particular to get c1e8d7c6a7 ("mmap locking
API: convert mmap_sem comments").
Signed-off-by: Jani Nikula <jani.nikula@intel.com>