This fixes the following build error with GCC 11:
In function ‘snb_wm_latency_quirk’,
inlined from ‘ilk_setup_wm_latency’ at drivers/gpu/drm/i915/intel_pm.c:3109:3,
inlined from ‘intel_init_pm’ at drivers/gpu/drm/i915/intel_pm.c:7695:3:
drivers/gpu/drm/i915/intel_pm.c:3058:9: error: ‘intel_print_wm_latency’ reading 16 bytes from a region of size 10 [-Werror=stringop-overread]
3058 | intel_print_wm_latency(dev_priv, "Primary", dev_priv->wm.pri_latency);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/intel_pm.c: In function ‘intel_init_pm’:
drivers/gpu/drm/i915/intel_pm.c:3058:9: note: referencing argument 3 of type ‘const u16 *’ {aka ‘const short unsigned int *’}
drivers/gpu/drm/i915/intel_pm.c:2995:13: note: in a call to function ‘intel_print_wm_latency’
2995 | static void intel_print_wm_latency(struct drm_i915_private *dev_priv,
| ^~~~~~~~~~~~~~~~~~~~~~
As far as I can tell, we don't actually need 8 elements except on SKL
and that uses dev_priv->wm.skl_latency which has enough.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210413173259.472405-1-jason@jlekstrand.net
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Gen to ver conversions across the driver
The main change is Lucas' series [1], with Ville's GLK fixes [2] and a
cherry-pick of Matt's commit [3] from drm-intel-next as a base to avoid
conflicts.
[1] https://patchwork.freedesktop.org/series/88825/
[2] https://patchwork.freedesktop.org/series/88938/
[3] 70bfb30743 ("drm/i915/display: Eliminate IS_GEN9_{BC,LP}")
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# Conflicts:
# drivers/gpu/drm/i915/display/intel_bios.c
# drivers/gpu/drm/i915/display/intel_cdclk.c
# drivers/gpu/drm/i915/display/intel_ddi.c
# drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c
# drivers/gpu/drm/i915/display/intel_display.c
# drivers/gpu/drm/i915/display/intel_display_power.c
# drivers/gpu/drm/i915/display/intel_dp.c
# drivers/gpu/drm/i915/display/intel_dpll_mgr.c
# drivers/gpu/drm/i915/display/intel_fbc.c
# drivers/gpu/drm/i915/display/intel_gmbus.c
# drivers/gpu/drm/i915/display/intel_hdcp.c
# drivers/gpu/drm/i915/display/intel_hdmi.c
# drivers/gpu/drm/i915/display/intel_pps.c
# drivers/gpu/drm/i915/intel_pm.c
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/878s5ebny0.fsf@intel.com
While converting the rest of the driver to use GRAPHICS_VER() and
MEDIA_VER(), following what was done for display, some discussions went
back on what we did for display:
1) Why is the == comparison special that deserves a separate
macro instead of just getting the version and comparing directly
like is done for >, >=, <=?
2) IS_DISPLAY_RANGE() is weird in that it omits the "_VER" for
brevity. If we remove the current users of IS_DISPLAY_VER(), we
could actually repurpose it for a range check
With (1) there could be an advantage if we used gen_mask since multiple
conditionals be combined by the compiler in a single and instruction and
check the result. However a) INTEL_GEN() doesn't use the mask since it
would make the code bigger everywhere else and b) in the cases it made
sense, it also made sense to convert to the _RANGE() variant.
So here we repurpose IS_DISPLAY_VER() to work with a [ from, to ] range
like was the IS_DISPLAY_RANGE() and convert the current IS_DISPLAY_VER()
users to use == and != operators. Aside from the definition changes,
this was done by the following semantic patch:
@@ expression dev_priv, E1; @@
- !IS_DISPLAY_VER(dev_priv, E1)
+ DISPLAY_VER(dev_priv) != E1
@@ expression dev_priv, E1; @@
- IS_DISPLAY_VER(dev_priv, E1)
+ DISPLAY_VER(dev_priv) == E1
@@ expression dev_priv, from, until; @@
- IS_DISPLAY_RANGE(dev_priv, from, until)
+ IS_DISPLAY_VER(dev_priv, from, until)
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
[Jani: Minor conflict resolve while applying.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210413051002.92589-4-lucas.demarchi@intel.com
Although most of the code in this file is display-related (watermarks),
there's some functions that are not (e.g., clock gating). Thus we need
to do the conversions to DISPLAY_VER() manually here rather than using
Coccinelle.
In the near-future we'll probably want to think about moving watermark
logic out of intel_pm.c and into watermark-specific files under the
display/ directory.
v2:
- Use new IS_DISPLAY_VER macro where appropriate.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210320044245.3920043-5-matthew.d.roper@intel.com
On HSW/BDW with VT-d active the first tile row scanned out
after the first async flip of the frame often ends up corrupted.
Whether the corruption happens or not depends on the scanline
on which the async flip happens, but the behaviour seems very
consistent. Ie. the same set of scanlines (which are most scanlines)
always show the corruption. And another set of scanlines (far less
of them) never shows the corruption.
I discovered that disabling the fetch-stride stretching
feature cures the corruption. This is some kind of TLB related
prefetch thing AFAIK. We already disable it on SNB primary
planes due to a documented workaround. The hardware folks
indicated that disabling this should be fine, so let's go
with that.
And while we're here, let's document the relevant bits on all
pre-skl platforms.
Fixes: 2a636e240c ("drm/i915: Implement async flip for ivb/hsw")
Fixes: cda195f13a ("drm/i915: Implement async flips for bdw")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210220103303.3448-1-ville.syrjala@linux.intel.com
Reviewed-by: Karthik B S <karthik.b.s@intel.com>
(cherry picked from commit b7a7053ab2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Say we have two planes enabled with watermarks configured
as follows:
plane A: wm0=enabled/can_sagv=false, wm1=enabled/can_sagv=true
plane B: wm0=enabled/can_sagv=true, wm1=disabled
This is possible since the latency we use to calculate
can_sagv may not be the same for both planes due to
skl_needs_memory_bw_wa().
In this case skl_crtc_can_enable_sagv() will see that
both planes have enabled at least one watermark level
with can_sagv==true, and thus proceeds to allow SAGV.
However, since plane B does not have wm1 enabled
plane A can't actually use it either. Thus we are
now running with SAGV enabled, but plane A can't
actually tolerate the extra latency it imposes.
To remedy this only allow SAGV on if the highest common
enabled watermark level for all active planes can tolerate
the extra SAGV latency.
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210305153610.12177-3-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
On HSW/BDW with VT-d active the first tile row scanned out
after the first async flip of the frame often ends up corrupted.
Whether the corruption happens or not depends on the scanline
on which the async flip happens, but the behaviour seems very
consistent. Ie. the same set of scanlines (which are most scanlines)
always show the corruption. And another set of scanlines (far less
of them) never shows the corruption.
I discovered that disabling the fetch-stride stretching
feature cures the corruption. This is some kind of TLB related
prefetch thing AFAIK. We already disable it on SNB primary
planes due to a documented workaround. The hardware folks
indicated that disabling this should be fine, so let's go
with that.
And while we're here, let's document the relevant bits on all
pre-skl platforms.
Fixes: 2a636e240c ("drm/i915: Implement async flip for ivb/hsw")
Fixes: cda195f13a ("drm/i915: Implement async flips for bdw")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210220103303.3448-1-ville.syrjala@linux.intel.com
Reviewed-by: Karthik B S <karthik.b.s@intel.com>
In order to make the dbuf state computation less fragile
let's make it stand on its own feet by not requiring someone
to peek into a crystall ball ahead of time to figure out
which pipes need to be added to the state under which potential
future conditions. Instead we compute each piece of the state
as we go along, and if any fallout occurs that affects more than
the current set of pipes we add the affected pipes to the state
naturally.
That requires that we track a few extra thigns in the global
dbuf state: dbuf slices for each pipe, and the weight each
pipe has when distributing the same set of slice(s) between
multiple pipes. Easy enough.
We do need to follow a somewhat careful sequence of computations
though as there are several steps involved in cooking up the dbuf
state. Thoguh we could avoid some of that by computing more things
on demand instead of relying on earlier step of the algorithm to
have filled it out. I think the end result is still reasonable
as the entire sequence is pretty much consolidated into a single
function instead of being spread around all over.
The rough sequence is this:
1. calculate active_pipes
2. calculate dbuf slices for every pipe
3. calculate total enabled slices
4. calculate new dbuf weights for any crtc in the state
5. calculate new ddb entry for every pipe based on the sets of
slices and weights, and add any affected crtc to the state
6. calculate new plane ddb entries for all crtcs in the state,
and add any affected plane to the state so that we'll perform
the requisite hw reprogramming
And as a nice bonus we get to throw dev_priv->wm.distrust_bios_wm
out the window.
v2: Keep crtc_state->wm.skl.ddb
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210122205633.18492-8-ville.syrjala@linux.intel.com
The dbuf state will be where we collect all the inter-pipe dbuf
allocation stuff. Start by adding the actual per-pipe ddb entries
there.
Originally the plan was to move them there outright, but that no longer
works as we're no longer guaranteed to have a dbuf state when it comes
time to sanity check the ddb overlaps in skl_commit_modeset_enables().
I think when I wrote this originally we did the watermark/ddb
calculation last, and so we couldn't have any crtcs in the state w/o
also having the dbuf state. But that has since changed and we do the
watermark/ddb calculation much earlier, and thus it is now possible
to commit crtcs w/o a dbuf state. So we keep another copy of the
information in the crtc state.
v2: Rebase
v3: Duplicate the entries instead of moving
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210122205633.18492-6-ville.syrjala@linux.intel.com
drm/i915 features for v5.11:
Highlights:
- Enable big joiner to join two pipes to one port to overcome pipe restrictions
(Manasi, Ville, Maarten)
Display:
- More DG1 enabling (Lucas, Aditya)
- Fixes to cases without display (Lucas, José, Jani)
- Initial PSR state improvements (José)
- JSL eDP vswing updates (Tejas)
- Handle EDID declared max 16 bpc (Ville)
- Display refactoring (Ville)
Other:
- GVT features
- Backmerge
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87czzzkk1s.fsf@intel.com
Store the relative data rate for planes in the crtc state
so that we don't have to use
intel_atomic_crtc_state_for_each_plane_state() to compute
it even for the planes that are no part of the current state.
Should probably just nuke this stuff entirely an use the normal
plane data rate instead. The two are slightly different since this
relative data rate doesn't factor in the actual pixel clock, so
it's a bit odd thing to even call a "data rate". And since the
watermarks are computed based on the actual data rate anyway
I don't really see what the point of this relative data rate
is. But that's for the future...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-6-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
In order to remove intel_atomic_crtc_state_for_each_plane_state()
from skl_crtc_can_enable_sagv() we can simply precompute whether
each wm level can tolerate the SAGV block time latency or not.
This has the nice side benefit that we remove the duplicated
wm level latency calculation. In fact the copy of that code
we had in skl_crtc_can_enable_sagv() didn't even handle
WaIncreaseLatencyIPCEnabled/Display WA #1141 whereas the copy
in skl_compute_plane_wm() did. So now we just have the one
copy which handles all the w/as.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-5-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
intel_atomic_crtc_state_for_each_plane_state() peeks at the
plane's current state without holding the plane's mutex, trusting
that the crtc's mutex will protect it. In practice that does work
since our planes can't move between pipes, but it sets a bad
example. intel_atomic_crtc_state_for_each_plane_state() also
relies on crtc_state.uapi.plane_mask which may be full of lies
when it comes to the bigjoiner stuff, so soon we can't use it as
is anyway. So best to just get rid of it entirely. Which we can
easily do by switching to the g4x/vlv "raw" watermark approach.
Later on we should even be able to move the "raw" watermark
computation into the normal .plane_check() code, leaving only
the merging/clamping of the final watermarks to the later
stages. But that will require adjusting the ilk+ wm code
similarly as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-3-ville.syrjala@linux.intel.com
With bigjoiner, there will be 2 pipes driving 2 halves of 1 transcoder,
because of this, we need a pipe_mode for various calculations, including
for example watermarks, plane clipping, etc.
v10:
* remove redundant pipe_mode assignment (Ville)
v9:
* pipe_mode in state dump nd state check (Ville)
v8:
* Add pipe_mode in readout in verify_crtc_state (Ville)
v7:
* Remove redundant comment (Ville)
* Just keep mode instead of pipe_mode (Ville)
v6:
* renaming in separate function, only pipe_mode here (Ville)
* Add description (Maarten)
v5:
* Rebase (Manasi)
v4:
* Manual rebase (Manasi)
v3:
* Change state to crtc_state, fix rebase err (Manasi)
v2:
* Manual Rebase (Manasi)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
[vsyrjala:
* Fix state checker
* Fix state dump
* Use pipe_mode for linetime watermarks
* Make sure pipe_mode normal timings are correct since the
silly ddb code uses them
* Drop the redundant pipe_mode copies from intel_modeset_pipe_config()
and intel_crtc_copy_uapi_to_hw_state()
* Use drm_mode_copy() all over]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-7-ville.syrjala@linux.intel.com
Cross-subsystem Changes:
- DMA mapped scatterlist fixes in i915 to unblock merging of
https://lkml.org/lkml/2020/9/27/70 (Tvrtko, Tom)
Driver Changes:
- Fix for user reported issue #2381 (Graphical output stops with "switching to inteldrmfb from simple"):
Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init (Ville, Chris)
- Fix for Tigerlake (and earlier) to avoid spurious empty CSB events leading to hang (Chris, Bruce)
- Delay execlist processing for Tigerlake to avoid hang (Chris)
- Fix for Tigerlake RCS engine health check through heartbeat (Chris)
- Fix for Tigerlake reserved MOCS entries (Ayaz, Chris)
- Fix Media power gate sequence on Tigerlake (Rodrigo)
- Enable eLLC caching of display buffers for SKL+ (Ville)
- Support parsing of oversize batches on Gen9 (Matt, Chris)
- Exclude low pages (128KiB) of stolen from use to avoid thrashing during reset (Chris)
- Flush engines before Tigerlake breadcrumbs (Chris)
- Use the local HWSP offset during submission (Chris)
- Flush coherency domains on first set-domain-ioctl (Chris, Zbigniew)
- Use the active reference on the vma while capturing to avoid use-after-free (Chris)
- Fix MOCS PTE setting for gen9+ (Ville)
- Avoid NULL dereference on IPS driver callback while unbinding i915 (Chris)
- Avoid NULL dereference from PT/PD stash allocation error (Matt)
- Hold request reference for canceling an active context (Chris)
- Avoid infinite loop on x86-32 when mapping a lot of objects (Chris)
- Disallow WC mappings when processor doesn't support them (Chris)
- Return correct error in i915_gem_object_copy_blt() error path (Dan)
- Return correct error in intel_context_create_request() error path (Maarten)
- Tune down GuC communication enabled/disabled messages to debug (Jani)
- Fix rebased commit "Remove i915_request.lock requirement for execution callbacks" (Chris)
- Cancel outstanding work after disabling heartbeats on an engine (Chris)
- Signal cancelled requests (Chris)
- Retire cancelled requests on unload (Chris)
- Scrub HW state on driver remove (Chris)
- Undo forced context restores after trivial preemptions (Chris)
- Handle PCI unbind in PMU code (Tvrtko)
- Fix CPU hotplug with multiple GPUs in PMU code (Trtkko)
- Correctly set SFC capability for video engines (Venkata)
- Update GuC code to use firmware v49.0.1 (John, Matthew B., Daniele, Oscar, Michel, Rodrigo, Michal)
- Improve GuC warnings on loading failure (John)
- Avoid ownership race in buffer pool by clearing age (Chris)
- Use MMIO to read CSB in case of failure (Chris, Mika)
- Show engine properties in engine state dump to indicate changes (Chris, Joonas)
- Break up error capture compression loops with cond_resched() (Chris)
- Reduce GPU error capture mutex hold time to avoid khungtaskd (Chris)
- Serialise debugfs i915_gem_objects with ctx->mutex (Chris)
- Always test execution status on closing the context and close if not persistent (Chris)
- Avoid mixing integer types during batch copies (Chris, Jared)
- Skip over MI_NOOP when parsing to avoid overhead (Chris)
- Hold onto an explicit ref to i915_vma_work.pinned (Chris)
- Perform all asynchronous waits prior to marking payload start (Chris)
- Pull phys pread/pwrite implementations to the backend (Matt)
- Improve record of hung engines in error state (Tvrtko)
- Allow backends to override pread implementation (Matt)
- Reinforce LRC poisoning checks to confirm context survives execution (Chris)
- Fix memory region max size calculation (Matt)
- Fix order when adding blocks to memory region (Matt)
- Eliminate unused intel_virtual_engine_get_sibling func (Chris)
- Cleanup kasan warning for on-stack (unsigned long) casting (Chris)
- Onion unwind for scratch page allocation failure (Chris)
- Poison stolen pages before use (Chris)
- Selftest improvements (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112163407.GA20320@jlahtine-mobl.ger.corp.intel.com