Driver code has no business with the internals of the irq descriptor.
Aside of that the count is per interrupt line and therefore takes
interrupts from other devices into account which share the interrupt line
and are not handled by the graphics driver.
Replace it with a pmu private count which only counts interrupts which
originate from the graphics card.
To avoid atomics or heuristics of some sort make the counter field
'unsigned long'. That limits the count to 4e9 on 32bit which is a lot and
postprocessing can easily deal with the occasional wraparound.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://lore.kernel.org/r/20201210194043.957046529@linutronix.de
Nothing uses the result and nothing should ever use it in driver code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20201210194043.862572239@linutronix.de
To support legacy gamma ioctls the drivers need to set
drm_crtc_funcs.gamma_set either to a custom implementation or to
drm_atomic_helper_legacy_gamma_set. Most of the atomic drivers do the
latter.
We can simplify this by making the core handle it automatically.
Move the drm_atomic_helper_legacy_gamma_set() functionality into
drm_color_mgmt.c to make drm_mode_gamma_set_ioctl() use
drm_crtc_funcs.gamma_set if set or GAMMA_LUT property if not.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Philippe Cornu <philippe.cornu@st.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201211114237.213288-2-tomi.valkeinen@ti.com
- migrate_disable/enable() support which originates from the RT tree and
is now a prerequisite for the new preemptible kmap_local() API which aims
to replace kmap_atomic().
- A fair amount of topology and NUMA related improvements
- Improvements for the frequency invariant calculations
- Enhanced robustness for the global CPU priority tracking and decision
making
- The usual small fixes and enhancements all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XwK4THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoX28D/9cVrvziSQGfBfuQWnUiw8iOIq1QBa2
Me+Tvenhfrlt7xU6rbP9ciFu7eTN+fS06m5uQPGI+t22WuJmHzbmw1bJVXfkvYfI
/QoU+Hg7DkDAn1p7ZKXh0dRkV0nI9ixxSHl0E+Zf1ATBxCUMV2SO85flg6z/4qJq
3VWUye0dmR7/bhtkIjv5rwce9v2JB2g1AbgYXYTW9lHVoUdGoMSdiZAF4tGyHLnx
sJ6DMqQ+k+dmPyYO0z5MTzjW/fXit4n9w2e3z9TvRH/uBu58WSW1RBmQYX6aHBAg
dhT9F4lvTs6lJY23x5RSFWDOv6xAvKF5a0xfb8UZcyH5EoLYrPRvm42a0BbjdeRa
u0z7LbwIlKA+RFdZzFZWz8UvvO0ljyMjmiuqZnZ5dY9Cd80LSBuxrWeQYG0qg6lR
Y2povhhCepEG+q8AXIe2YjHKWKKC1s/l/VY3CNnCzcd21JPQjQ4Z5eWGmHif5IED
CntaeFFhZadR3w02tkX35zFmY3w4soKKrbI4EKWrQwd+cIEQlOSY7dEPI/b5BbYj
MWAb3P4EG9N77AWTNmbhK4nN0brEYb+rBbCA+5dtNBVhHTxAC7OTWElJOC2O66FI
e06dREjvwYtOkRUkUguWwErbIai2gJ2MH0VILV3hHoh64oRk7jjM8PZYnjQkdptQ
Gsq0rJW5iiu/OQ==
=Oz1V
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Thomas Gleixner:
- migrate_disable/enable() support which originates from the RT tree
and is now a prerequisite for the new preemptible kmap_local() API
which aims to replace kmap_atomic().
- A fair amount of topology and NUMA related improvements
- Improvements for the frequency invariant calculations
- Enhanced robustness for the global CPU priority tracking and decision
making
- The usual small fixes and enhancements all over the place
* tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits)
sched/fair: Trivial correction of the newidle_balance() comment
sched/fair: Clear SMT siblings after determining the core is not idle
sched: Fix kernel-doc markup
x86: Print ratio freq_max/freq_base used in frequency invariance calculations
x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC
x86, sched: Calculate frequency invariance for AMD systems
irq_work: Optimize irq_work_single()
smp: Cleanup smp_call_function*()
irq_work: Cleanup
sched: Limit the amount of NUMA imbalance that can exist at fork time
sched/numa: Allow a floating imbalance between NUMA nodes
sched: Avoid unnecessary calculation of load imbalance at clone time
sched/numa: Rename nr_running and break out the magic number
sched: Make migrate_disable/enable() independent of RT
sched/topology: Condition EAS enablement on FIE support
arm64: Rebuild sched domains on invariance status changes
sched/topology,schedutil: Wrap sched domains rebuild
sched/uclamp: Allow to reset a task uclamp constraint value
sched/core: Fix typos in comments
Documentation: scheduler: fix information on arch SD flags, sched_domain and sched_debug
...
core:
- documentation updates
- deprecate DRM_FORMAT_MOD_NONE
- atomic crtc enable/disable rework
- GEM convert drivers to gem object functions
- remove SCATTER_LIST_MAX_SEGMENT
sched:
- avoid infinite waits
ttm:
- remove AGP support
- don't modify caching for swapout
- ttm pinning rework
- major TTM reworks
- new backend allocator
- multihop support
vram-helper:
- top down BO placement fix
- TTM changes
- GEM object support
displayport:
- DP 2.0 DPCD prep work
- DP MST extended DPCD caps
fbdev:
- mark as orphaned
amdgpu:
- Initial Vangogh support
- Green Sardine support
- Dimgrey Cavefish support
- SG display support for renoir
- SMU7 improvements
- gfx9+ modiifier support
- CI BACO fixes
radeon:
- expose voltage via hwmon on SUMO
amdkfd:
- fix unique id handling
i915:
- more DG1 enablement
- bigjoiner support
- integer scaling filter support
- async flip support
- ICL+ DSI command mode
- Improve display shutdown
- Display refactoring
- eLLC machine fbdev loading fix
- dma scatterlist fixes
- TGL hang fixes
- eLLC display buffer caching on SKL+
- MOCS PTE seeting for gen9+
msm:
- Shutdown hook
- GPU cooling device support
- DSI 7nm and 10nm phy/pll updates
- sm8150/sm2850 DPU support
- GEM locking re-work
- LLCC system cache support
aspeed:
- sysfs output config support
ast:
- LUT fix
- new display mode
gma500:
- remove 2d framebuffer accel
panfrost:
- move gpu reset to a worker
exynos:
- new HDMI mode support
mediatek:
- MT8167 support
- yaml bindings
- MIPI DSI phy code moved
etnaviv:
- new perf counter
- more lockdep annotation
hibmc:
- i2c DDC support
ingenic:
- pixel clock reset fix
- reserved memory support
- allow both DMA channels at once
- different pixel format support
- 30/24/8-bit palette modes
tilcdc:
- don't keep vblank irq enabled
vc4:
- new maintainer added
- DSI registration fix
virtio:
- blob resource support
- host visible and cross-device support
- uuid api support
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJf0upGAAoJEAx081l5xIa+1EoP/2OkZnl5d9S26qPja15EoRFl
S69OjNci331Br9Y111jD2OCtyqA7w3ppnvCmzpHOBK1IZjhkxOVNC6PSUFSV4M3V
oVOxZK0KaMHpLU2p90NbURWHa2TOktj7IWb9FrhPaEeBECbFuORZ2TbloFhaoyyt
9auEAwqYRPgF8CSYOjQGGZJ85MQN4ImExTdY13+BZgQlGLiSPHfpnLVJ1Q5TPt6A
BLgcU/DFcqOZqyjeu+CuA+LZSHjHeVJxTOGRX65PoTtU3Xus8TRZ/qL4r8e6mAI1
boFLmsevvQlzaQ9GFohc+l9QR/dtnm6SpZxuEelewh7sQvsz2GI+SNF+OHcwHCph
TYIEtyZNaz1bf7ip75FGbhEVaWh2PUMn3zkGlYt+zqAtznYB+dFPc31hhuVn3o5X
c8UwLDUUJLzTePKPZ0UtzIu4Gm2RYTyRsnUAP0OKP/0WaZRyxnoQMYm5Llg7RBe0
5ZJSWjJPBlv1YMWAHQ0YMZ+MhnFE8k4eV/8WfBQnb2INosgzKfJXEmu6ffAkPqSq
jxBsrVQwtOMF2P9VEfdQDv3fs0GKDuZN5ezTFuW59Dt4VYfCUe2FTssSwFBIp5X9
erPJ/nk883rcI6F0PdArNYvWpwPlVSDJyfTxQbYYxVAf8X1ARJCU3PT6iBnGO3i4
d5tveSc8HoOXr4W3eIjn
=c9rl
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-12-11' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Not a huge amount of big things here, AMD has support for a few new HW
variants (vangogh, green sardine, dimgrey cavefish), Intel has some
more DG1 enablement. We have a few big reworks of the TTM layers and
interfaces, GEM and atomic internal API reworks cross tree. fbdev is
marked orphaned in here as well to reflect the current reality.
core:
- documentation updates
- deprecate DRM_FORMAT_MOD_NONE
- atomic crtc enable/disable rework
- GEM convert drivers to gem object functions
- remove SCATTER_LIST_MAX_SEGMENT
sched:
- avoid infinite waits
ttm:
- remove AGP support
- don't modify caching for swapout
- ttm pinning rework
- major TTM reworks
- new backend allocator
- multihop support
vram-helper:
- top down BO placement fix
- TTM changes
- GEM object support
displayport:
- DP 2.0 DPCD prep work
- DP MST extended DPCD caps
fbdev:
- mark as orphaned
amdgpu:
- Initial Vangogh support
- Green Sardine support
- Dimgrey Cavefish support
- SG display support for renoir
- SMU7 improvements
- gfx9+ modiifier support
- CI BACO fixes
radeon:
- expose voltage via hwmon on SUMO
amdkfd:
- fix unique id handling
i915:
- more DG1 enablement
- bigjoiner support
- integer scaling filter support
- async flip support
- ICL+ DSI command mode
- Improve display shutdown
- Display refactoring
- eLLC machine fbdev loading fix
- dma scatterlist fixes
- TGL hang fixes
- eLLC display buffer caching on SKL+
- MOCS PTE seeting for gen9+
msm:
- Shutdown hook
- GPU cooling device support
- DSI 7nm and 10nm phy/pll updates
- sm8150/sm2850 DPU support
- GEM locking re-work
- LLCC system cache support
aspeed:
- sysfs output config support
ast:
- LUT fix
- new display mode
gma500:
- remove 2d framebuffer accel
panfrost:
- move gpu reset to a worker
exynos:
- new HDMI mode support
mediatek:
- MT8167 support
- yaml bindings
- MIPI DSI phy code moved
etnaviv:
- new perf counter
- more lockdep annotation
hibmc:
- i2c DDC support
ingenic:
- pixel clock reset fix
- reserved memory support
- allow both DMA channels at once
- different pixel format support
- 30/24/8-bit palette modes
tilcdc:
- don't keep vblank irq enabled
vc4:
- new maintainer added
- DSI registration fix
virtio:
- blob resource support
- host visible and cross-device support
- uuid api support"
* tag 'drm-next-2020-12-11' of git://anongit.freedesktop.org/drm/drm: (1754 commits)
drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs
drm/amdgpu: fix size calculation with stolen vga memory
drm/amdgpu: remove amdgpu_ttm_late_init and amdgpu_bo_late_init
drm/amdgpu: free the pre-OS console framebuffer after the first modeset
drm/amdgpu: enable runtime pm using BACO on CI dGPUs
drm/amdgpu/cik: enable BACO reset on Bonaire
drm/amd/pm: update smu10.h WORKLOAD_PPLIB setting for raven
drm/amd/pm: remove one unsupported smu function for vangogh
drm/amd/display: setup system context for APUs
drm/amd/display: add S/G support for Vangogh
drm/amdkfd: Fix leak in dmabuf import
drm/amdgpu: use AMDGPU_NUM_VMID when possible
drm/amdgpu: fix sdma instance fw version and feature version init
drm/amd/pm: update driver if version for dimgrey_cavefish
drm/amd/display: 3.2.115
drm/amd/display: [FW Promotion] Release 0.0.45
drm/amd/display: Revert DCN2.1 dram_clock_change_latency update
drm/amd/display: Enable gpu_vm_support for dcn3.01
drm/amd/display: Fixed the audio noise during mode switching with HDCP mode on
drm/amd/display: Add wm table for Renoir
...
Chris spotted that since 16ffe73c18 ("drm/i915/pmu: Use GT parked for
estimating RC6 while asleep") we don't rely on runtime pm internals when
estimating RC6 while asleep. We can remove the ifdef code to simplify and
at the same time wake up the device less when querying RC6 if CONFIG_PM is
not compiled in.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 16ffe73c18 ("drm/i915/pmu: Use GT parked for estimating RC6 while asleep")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-3-tvrtko.ursulin@linux.intel.com
Chris found a CI report which points out calling intel_runtime_pm_get from
inside i915_pmu_enable hook is not allowed since it can be invoked from
hard irq context. This is something we knew but forgot, so lets fix it
once again.
We do this by syncing the internal book keeping with hardware rc6 counter
on driver load.
v2:
* Always sync on parking and fully sync on init.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: f4e9894b69 ("drm/i915/pmu: Correct the rc6 offset upon enabling")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-1-tvrtko.ursulin@linux.intel.com
Reduce the module/device probe error into a mere debug to hide issues
where the initial modeset is failing (after lies told by hw probe) and
the system hangs with a livelock in cleaning up the failed commit.
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=210619
Fixes: b3bf99daae ("drm/i915/display: Defer initial modeset until after GGTT is initialised")
Fixes: ccc9e67ab2 ("drm/i915/display: Defer initial modeset until after GGTT is initialised")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210230741.17140-1-chris@chris-wilson.co.uk
When we reset the legacy ring context, due to potential corruption over
suspend/resume, remove the valid bit so that we avoid loading garbage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210080240.24529-1-chris@chris-wilson.co.uk
The above workaround was added as an engine workaround not a GT
workaround. Moved it to the correct location.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210170615.3107266-1-lucas.demarchi@intel.com
For an enabled DSC during HW readout the corresponding power reference
is taken along the CRTC power domain references in
get_crtc_power_domains(). Remove the incorrect get ref from the DSI
encoder hook.
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209153952.3397959-1-imre.deak@intel.com
Move the initialization of the rc_model_size from the common code into
encoder code, allowing different encoders to specify the size according
to their needs. Keep using the hard coded value in the encoders for now
to make this a non-functional change.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6843c4f6958619f7389180aa92fded7b9fdbb4ba.1607429866.git.jani.nikula@intel.com
The rc_model_size is specified in the DSC config, and the hardware
programming should respect that instead of hard coding a value of 8192.
Regardless, the rc_model_size in DSC config is currently hard coded to
the same value, so this should have no impact, other than allowing the
use of other sizes as needed.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/27d86ad25832bbb985f6e996f3d02dca01a66895.1607429866.git.jani.nikula@intel.com
These functions are independent from the backend used and can therefore
be split out of the exelists submission file, so they can be re-used by
the upcoming GuC submission backend.
Based on a patch by Chris Wilson.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-3-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We want to separate the utility functions for controlling the logical
ring context from the execlists submission mechanism (which is an
overgrown scheduler).
This is similar to Daniele's work to split up the files, but being
selfish I wanted to base it after my own changes to intel_lrc.c petered
out.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-2-chris@chris-wilson.co.uk
Cleanup intel_lrc.h by moving some of the residual common register
definitions into intel_lrc_reg.h, prior to rebranding and splitting off
the submission backends.
v2: keep the SCHEDULE enum in the old file, since it is specific to the
gvt usage of the execlists submission backend (John)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-1-chris@chris-wilson.co.uk
Tigerlake is plagued by spontaneous DMAR faults [reason 7, next page
table ptr is invalid] which lead to GPU hangs. These faults occur when
an iommu map is immediately reused. Adding further clflushes and
barriers around either the GTT PTE or iommu PTE updates do not prevent
the faults. So far the only effect has been from inducing a delay
between reuse of the iommu on the GPU, and applying the delay at the
iommu map allows for the smallest stable delay.
Note that such a delay is hideous and clearly does not fix the root cause,
and so should only be a bandaid until a complete solution is found. The
delay was determined by running igt/gem_exec_fence/parallel in a loop for
a few hours (unpatched MTBF is about 10s).
We have also seen such DMAR fault [reason 7] errors on other platforms,
notably gen9-gen11, but so far it has only been trivially and
consistently reproduced on Tigerlake.
v2: Leave a tell-tale to know when we apply the vt'd quirk, and as a
reminder to remove it again. Hopefully.
Testcase: igt/gem_exec_fence/parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-2-chris@chris-wilson.co.uk
A call to wait for the GT to idle from inside the put_pages fallback is
prone to cause an uninterruptible livelock. As it does not provide
adequate serialisation with new requests, simply fallback to a trivial
sleep.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-1-chris@chris-wilson.co.uk
Document what a masked register is according to bspec so we avoid
developers using the wrong functions to implement WAs.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-3-lucas.demarchi@intel.com
The use of "masked" in this function is due to its history. Once upon a
time it received a mask and a value as parameter. Since
commit eeec73f8a4 ("drm/i915/gt: Skip rmw for masked registers")
that is not true anymore and now there is a clear and a set parameter.
Depending on the case, that can still be thought as a mask and value,
but there are some subtle differences: what we clear doesn't need to be
the same bits we are setting, particularly when we are using masked
registers.
The fact that we also have "masked registers", i.e. registers whose mask
is stored in the upper 16 bits of the register, makes it even more
confusing, because "masked" in wa_write_masked_or() has little to do
with masked registers, but rather refers to the old mask parameter the
function received (that can also, but not exclusively, be used to write
to masked register).
Avoid the ambiguity and misnomer by renaming it to something else,
hopefully less confusing: wa_write_clr_set(), to designate that we are
doing both clr and set operations in the register.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-2-lucas.demarchi@intel.com
When using masked registers, there is nothing to clear since a masked
register has the mask in the upper 16b: we can just write to the
location we want and use the mask to control what bits we are writing
to.
However that doesn't mean we don't want to read back the register and
check the value actually matched what we wanted to write, i.e. that
the WA stick. That should be an explicit opt-out for registers that are
either write-only or that are affected by hardware misbehavior.
Moreover both wa_masked_en() and wa_masked_dis() check the WA stick, so
skipping the check just because the field is more than 1 bit is
surprising and error-prone.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-1-lucas.demarchi@intel.com
We checked the table size against a hardcoded number of entries, and
that number was excluding the special mocs registers at the end.
Fixes: 777a7717d6 ("drm/i915/gt: Program mocs:63 for cache eviction on gen9")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127102540.13117-1-chris@chris-wilson.co.uk
(cherry picked from commit 444fbf5d70)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[backported and updated the Fixes sha]
This patch fixes the slice count computation algorithm
for calculating the slice count based on Peak pixel rate
and the max slice width allowed on the DSC engines.
We need to ensure slice count > min slice count req
as per DP spec based on peak pixel rate and that it is
greater than min slice count based on the max slice width
advertised by DPCD. So use max of these two.
In the prev patch we were using min of these 2 causing it
to violate the max slice width limitation causing a blank
screen on 8K@60.
Fixes: d9218c8f6c ("drm/i915/dp: Add helpers for Compressed BPP and Slice Count for DSC")
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: <stable@vger.kernel.org> # v5.0+
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204205804.25225-1-manasi.d.navare@intel.com
(cherry picked from commit d371d6ea92)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently the check that the unsigned size_t variable i is >= 0
is always true because the unsigned variable will never be negative,
causing the loop to run forever. Fix this by changing the
pre-decrement check to a zero check on i followed by a decrement of i.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes: bfed6708d6 ("drm/i915: use vmap in shmem_pin_map")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002170354.94627-1-colin.king@canonical.com
(cherry picked from commit e70956a249)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We currently presume that the engine reset is successful, cancelling the
expired preemption timer in the process. However, engine resets can
fail, leaving the timeout still pending and we will then respond to the
timeout again next time the tasklet fires. What we want is for the
failed engine reset to be promoted to a full device reset, which is
kicked by the heartbeat once the engine stops processing events.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-2-chris@chris-wilson.co.uk
(cherry picked from commit d997e240ce)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Before reseting the engine, we suspend the execution of the guilty
request, so that we can continue execution with a new context while we
slowly compress the captured error state for the guilty context. However,
if the reset fails, we will promptly attempt to reset the same request
again, and discover the ongoing capture. Ignore the second attempt to
suspend and capture the same request.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 32ff621fd7 ("drm/i915/gt: Allow temporary suspension of inflight requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-1-chris@chris-wilson.co.uk
(cherry picked from commit b969540500)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In the course of discovering and closing many races with context closure
and execbuf submission, since commit 61231f6bd0 ("drm/i915/gem: Check
that the context wasn't closed during setup") we started checking that
the context was not closed by another userspace thread during the execbuf
ioctl. In doing so we cancelled the inflight request (by telling it to be
skipped), but kept reporting success since we do submit a request, albeit
one that doesn't execute. As the error is known before we return from the
ioctl, we can report the error we detect immediately, rather than leave
it on the fence status. With the immediate propagation of the error, it
is easier for userspace to handle.
Fixes: 61231f6bd0 ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
(cherry picked from commit ba38b79eae)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There is a copy and paste bug in this code. It's supposed to check
"obj2" instead of checking "obj" a second time.
Fixes: 80f0b679d6 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/8ilneOcJAjwqU4t@mwand
(cherry picked from commit 14f2d7604f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Closed vma are protected by the GT wakeref held as we lookup the vma, so
we know that the vma will not be freed as we process it for the execbuf.
Instead we expect to catch the closed status of the context, and simply
allow the close-race on an individual vma to be washed away.
Longer term, the GT wakeref protection will be removed by explicit
vma.kref tracking.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2245
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207193824.18114-1-chris@chris-wilson.co.uk
This patch fixes the slice count computation algorithm
for calculating the slice count based on Peak pixel rate
and the max slice width allowed on the DSC engines.
We need to ensure slice count > min slice count req
as per DP spec based on peak pixel rate and that it is
greater than min slice count based on the max slice width
advertised by DPCD. So use max of these two.
In the prev patch we were using min of these 2 causing it
to violate the max slice width limitation causing a blank
screen on 8K@60.
Fixes: d9218c8f6c ("drm/i915/dp: Add helpers for Compressed BPP and Slice Count for DSC")
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: <stable@vger.kernel.org> # v5.0+
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204205804.25225-1-manasi.d.navare@intel.com
The Bspec does not mention polling the FEC Enable
Live status bit. That is only there for debug purposes.
So remove the polling from driver.
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201125072634.27664-1-manasi.d.navare@intel.com
When we fail to find a single block large enough to require splitting,
report the largest block we did find.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207130346.11849-1-chris@chris-wilson.co.uk
Currently the check that the unsigned size_t variable i is >= 0
is always true because the unsigned variable will never be negative,
causing the loop to run forever. Fix this by changing the
pre-decrement check to a zero check on i followed by a decrement of i.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes: bfed6708d6 ("drm/i915: use vmap in shmem_pin_map")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002170354.94627-1-colin.king@canonical.com
Remove the last macro and implement it as a function like the rest of
the operations that don't assume there is a `wal` list, but rather
receive it as argument.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-4-lucas.demarchi@intel.com
Just ommitting the list it's operating on doesn't save much typing
and adds another way to do the same thing. Just replace it with
wa_masked_dis().
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-3-lucas.demarchi@intel.com
Just ommitting the list it's operating on doesn't save much typing and
adds another way to do the same thing. Just replace it with
wa_masked_en().
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-2-lucas.demarchi@intel.com
Set GS Timer to 224 to prevent a HS/DS hang.
Bspec: 53508
v2: reword commit message and add comment explaining why read
verification is ignored (Chris)
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-1-lucas.demarchi@intel.com
Switch off the scanout during driver unregister, so we can shutdown the
HW immediately for unbind.
v2: Remove the old shutdown from remove, it should now be redundant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204161601.20897-1-chris@chris-wilson.co.uk
Let's do the kill_bigjoiner_slave() thing from
intel_bigjoiner_add_affected_crtcs() since it's related to
what we do there. This cleans up the logic in the
compute_config() loop a bit.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201124201156.17095-4-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
If either of the bigjoiner pipes needs a modeset then we need
a modeset on both pipes. Make it so.
v2: Split out the kill_bigjoiner_slave() change (Manasi)
Add affected connectors/planes
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201124201156.17095-3-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
drm_atomic_add_affected_planes() only considers planes which
are logically enabled in the uapi state. For bigjoiner we need
to consider planes logically enabled in the hw state. Add a
helper for that.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201124201156.17095-2-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Currently crtc_state->uapi.plane_mask only tracks logically
enabled planes on the uapi level. For bigjoiner purposes
we want to do the same for the hw state. Let's follow the
pattern established by active_planes & co. here.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201124201156.17095-1-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Across a reset, we stop the engine but not the timers. This leaves a
window where the timers have inconsistent state with the engine, but
should only result in a spurious timeout. As we cancel the outstanding
events, also cancel their timers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-4-chris@chris-wilson.co.uk
We currently presume that the engine reset is successful, cancelling the
expired preemption timer in the process. However, engine resets can
fail, leaving the timeout still pending and we will then respond to the
timeout again next time the tasklet fires. What we want is for the
failed engine reset to be promoted to a full device reset, which is
kicked by the heartbeat once the engine stops processing events.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-2-chris@chris-wilson.co.uk
Before reseting the engine, we suspend the execution of the guilty
request, so that we can continue execution with a new context while we
slowly compress the captured error state for the guilty context. However,
if the reset fails, we will promptly attempt to reset the same request
again, and discover the ongoing capture. Ignore the second attempt to
suspend and capture the same request.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 32ff621fd7 ("drm/i915/gt: Allow temporary suspension of inflight requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-1-chris@chris-wilson.co.uk
In the course of discovering and closing many races with context closure
and execbuf submission, since commit 61231f6bd0 ("drm/i915/gem: Check
that the context wasn't closed during setup") we started checking that
the context was not closed by another userspace thread during the execbuf
ioctl. In doing so we cancelled the inflight request (by telling it to be
skipped), but kept reporting success since we do submit a request, albeit
one that doesn't execute. As the error is known before we return from the
ioctl, we can report the error we detect immediately, rather than leave
it on the fence status. With the immediate propagation of the error, it
is easier for userspace to handle.
Fixes: 61231f6bd0 ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
Throw away all the debugfs entries that are not being actively used for
debugging/developing IGT. Note that a couple of these are already and
will remain available under the gt/
Files removed:
i915_gem_fence_regs
i915_gem_interrupt
i915_ring_freq_table
i915_context_status
i915_llc
i915_shrinker_info
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201202112140.16759-1-chris@chris-wilson.co.uk
All the display power domain references are wakeref tracked now, so we
can mark intel_display_power_put_unchecked() as an internal function
(for suppressing wakeref tracking in non-debug builds).
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130212200.2811939-10-imre.deak@intel.com
Add wakeref tracking for the display power domain reference taken to
keep the display power well functionality disabled.
v2: Add missing wakeref zeroing to intel_power_domains_driver_remove()
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201201161340.2879202-2-imre.deak@intel.com
Rename power_domains.wakeref to power_domains.init_wakeref to make the
use of this reference clearer. The next patch adds tracking for another
power reference user of the power_domains functionality.
While at it add a missing zero wakeref assert when setting the wakeref.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130212200.2811939-8-imre.deak@intel.com
Factor out helper functions to get/put a set of power domains that are
tracked using their wakeref handles. The same is needed by the next
patch adding tracking for enabled CRTC power domains.
v2: s/uint64_t/u64/ (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201201161340.2879202-1-imre.deak@intel.com
The for_each_oldnew_intel_crtc_in_state() iterator index does match
crtc->pipe, but using the same thing as array index when getting and
putting CRTC power domains makes things clearer.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130212200.2811939-2-imre.deak@intel.com
There is a copy and paste bug in this code. It's supposed to check
"obj2" instead of checking "obj" a second time.
Fixes: 80f0b679d6 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/8ilneOcJAjwqU4t@mwand
This moves the functions into static const instead of having
funcs and data in the same struct.
It leaves the power callback alone, as it is used in a different
manner.
v2: leave power callback alone (Jani)
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130210945.31850-1-airlied@gmail.com
In most cases, we are better off letting the compiler decide whether to
inline static functions in .c files or not. In this case, the inline
will be ignored anyway as mmio_pm_restore_handler() is passed as a
function pointer.
Fixes: 5f60b12edc ("drm/i915/gvt: Save/restore HW status to support GVT suspend/resume")
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Hang Yuan <hang.yuan@linux.intel.com>
Cc: Colin Xu <colin.xu@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: intel-gvt-dev@lists.freedesktop.org
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201130111353.25406-1-jani.nikula@intel.com
drm/i915 features for v5.11:
Highlights:
- Enable big joiner to join two pipes to one port to overcome pipe restrictions
(Manasi, Ville, Maarten)
Display:
- More DG1 enabling (Lucas, Aditya)
- Fixes to cases without display (Lucas, José, Jani)
- Initial PSR state improvements (José)
- JSL eDP vswing updates (Tejas)
- Handle EDID declared max 16 bpc (Ville)
- Display refactoring (Ville)
Other:
- GVT features
- Backmerge
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87czzzkk1s.fsf@intel.com
!HAS_DISPLAY() implies !HAS_OVERLAY(), skipping overlay setup anyway, so
return earlier from intel_modeset_init() for clarity.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-4-lucas.demarchi@intel.com
(cherry picked from commit 71c8415d0d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We treat idling the GT (intel_rps_park) as a downclock event, and reduce
the frequency we intend to restart the GT with. Since the two workloads
are likely related (e.g. a compositor rendering every 16ms), we want to
carry the frequency and load information from across the idling.
However, we do also need to update the frequencies so that workloads
that run for less than 1ms are autotuned by RPS (otherwise we leave
compositors running at max clocks, draining excess power). Conversely,
if we try to run too slowly, the next workload has to run longer. Since
there is a hysteresis in the power graph, below a certain frequency
running a short workload for longer consumes more energy than running it
slightly higher for less time. The exact balance point is unknown
beforehand, but measurements with 30fps media playback indicate that RPe
is a better choice.
Reported-by: Edward Baker <edward.baker@intel.com>
Tested-by: Edward Baker <edward.baker@intel.com>
Fixes: 043cd2d14e ("drm/i915/gt: Leave rps->cur_freq on unpark")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201124183521.28623-1-chris@chris-wilson.co.uk
(cherry picked from commit f7ed83cc19)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As we use a shmemfs file to hold the context state, when not in use it
may be swapped out, such as across suspend. Since we wrote into the
shmemfs without marking the pages as dirty, the contents may be dropped
instead of being written back to swap. On re-using the shmemfs file,
such as creating a new context after resume, the contents of that file
were likely garbage and so the new context could then hang the GPU.
Simply mark the page as being written when copying into the shmemfs
file, and it the new contents will be retained across swapout.
Fixes: be1cb55a07 ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.8+
Link: https://patchwork.freedesktop.org/patch/msgid/20201127120718.454037-161-matthew.auld@intel.com
(cherry picked from commit a9d71f76cc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Allow a brief period for continued access to a dead intel_context by
deferring the release of the struct until after an RCU grace period.
As we are using a dedicated slab cache for the contexts, we can defer
the release of the slab pages via RCU, with the caveat that individual
structs may be reused from the freelist within an RCU grace period. To
handle that, we have to avoid clearing members of the zombie struct.
This is required for a later patch to handle locking around virtual
requests in the signaler, as those requests may want to move between
engines and be destroyed while we are holding b->irq_lock on a physical
engine.
v2: Drop mutex_reinit(), if we never mark the mutex as destroyed we
don't need to reset the debug code, at the loss of having the mutex
debug code spot us attempting to destroy a locked mutex.
v3: As the intended use will remain strongly referenced counted, with
very little inflight access across reuse, drop the ctor.
v4: Drop the unrequired change to remove the temporary reference around
dropping the active context, and add back some more missing ctor
operations.
v5: The ctor is back. Tvrtko spotted that ce->signal_lock [introduced
later] maybe accessed under RCU and so needs special care not to be
reinitialised.
v6: Don't mix SLAB_TYPESAFE_BY_RCU and RCU list iteration.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-3-chris@chris-wilson.co.uk
(cherry picked from commit 14d1eaf088)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Since we try and estimate how long we require to update the registers to
perform a plane update, it is of vital importance that we measure the
distribution of plane updates to better guide our estimate. If we
underestimate how long it takes to perform the plane update, we may
slip into the next scanout frame causing a tear. If we overestimate, we
may unnecessarily delay the update to the next frame, causing visible
jitter.
Replace the warning that we exceed some arbitrary threshold for the
vblank update with a histogram for debugfs.
v2: Add a per-crtc debugfs entry so that the information is easier to
extract when testing individual CRTC, and so that it can be reset before
a test.
v3: Flip the graph on its side; creates space to label the time axis.
Updates: 4684
|
1us |
|
4us |********
|**********
16us |***********
|*****
66us |
|
262us |
|
1ms |
|
4ms |
|
17ms |
|
Min update: 5918ns
Max update: 54781ns
Average update: 16628ns
Overruns > 250us: 0
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1982
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20201202212814.26320-1-chris@chris-wilson.co.uk
Mixing I915_ALLOC_CONTIGUOUS and I915_ALLOC_MAX_SEGMENT_SIZE fared
badly. The two directives conflict, with the contiguous request setting
the min_order to the full size of the object, and the max-segment-size
setting the max_order to the limit of the DMA mapper. This results in a
situation where max_order < min_order, causing our sanity checks to
fail.
Instead of limiting the buddy block size, in the previous patch we split
the oversized buddy into multiple scatterlist elements.
Fixes: d2cf0125d4 ("drm/i915/lmem: Limit block size to 4G")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201202173444.14903-2-chris@chris-wilson.co.uk
Adhere to the i915_sg_max_segment() limit on the lengths of individual
scatterlist elements, and in doing so split up very large chunks of lmem
into manageable pieces for the dma-mapping backend.
Reported-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201202173444.14903-1-chris@chris-wilson.co.uk
Good riddance! Remove the macros and their remaining references in
comments.
The following functions should be used instead, depending on the use
case:
- intel_uncore_read(), intel_uncore_write(), intel_uncore_posting_read()
- intel_de_read(), intel_de_write(), intel_de_posting_read()
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-10-jani.nikula@intel.com
There are some corner cases wrt underrun when we enable
FBC with PSR2 on TGL. Recommendation from hardware is to
keep this combination disabled.
Bspec: 50422 HSD: 14010260002
v2: Added psr2 enabled check from crtc_state (Anshuman)
Added Bspec link and HSD referneces (Jose)
v3: Moved the logic to disable fbc to intel_fbc_update_state_cache
and removed the crtc->config usages, as per Ville's recommendation.
v4: Introduced a variable in fbc state_cache instead of the earlier
plane.visible WA, as suggested by Jose.
v5: Dropped an extra check for fbc in intel_fbc_enable and addressed
review comments by Jose.
v6: Move WA to end of function and added Jose's RB.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201201190406.1752-2-uma.shankar@intel.com
Adding any kinds of "last" abi markers is usually a mistake which I
repeated when implementing the PMU because it felt convenient at the time.
This patch marks I915_PMU_LAST as deprecated and stops the internal
implementation using it for sizing the event status bitmask and array.
New way of sizing the fields is a bit less elegant, but it omits reserving
slots for tracking events we are not interested in, and as such saves some
runtime space. Adding sampling events is likely to be a special event and
the new plumbing needed will be easily detected in testing. Existing
asserts against the bitfield and array sizes are keeping the code safe.
First event which gets the new treatment in this new scheme are the
interrupts - which neither needs any tracking in i915 pmu nor needs
waking up the GPU to read it.
v2:
* Streamline helper names. (Chris)
v3:
* Comment which events need tracking. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201201131757.206367-1-tvrtko.ursulin@linux.intel.com
Let's avoid adding new I915_WRITE uses while we try to get rid of them.
Fixes: 5f60b12edc ("drm/i915/gvt: Save/restore HW status to support GVT suspend/resume")
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Hang Yuan <hang.yuan@linux.intel.com>
Cc: Colin Xu <colin.xu@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: intel-gvt-dev@lists.freedesktop.org
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-9-jani.nikula@intel.com
Implement Read back of HDR metadata infoframes i.e Dynamic Range
and Mastering Infoframe for LSPCON devices.
v2: Added proper bitmask of enabled infoframes as per Ville's
recommendation.
v3: Dropped a redundant wrapper as per Ville's comment.
v4: Dropped a redundant print, added Ville's RB.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-14-uma.shankar@intel.com
Implemented Infoframes enabled readback for LSPCON devices.
This will help align the implementation with state readback
infrastructure.
v2: Added proper bitmask of enabled infoframes as per Ville's
recommendation.
v3: Added pcon specific infoframe types instead of using the HSW
one's, as recommended by Ville.
v4: Addressed Ville's review comment by adding HDMI infoframe
versions directly instead of DIP wrappers.
v5: Re-ordered the patches to avoid potential break in usage,
as suggested by Ville.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-13-uma.shankar@intel.com
Lspcon has Infoframes as well as DIP for HDR metadata(DRM Infoframe).
Create a separate mechanism for lspcon compared to HDMI in order to
address the same and ensure future scalability.
v2: Streamlined this as per Ville's suggestions, making sure that
HDMI infoframe versions are directly returned instead of a redundant
and confusing DIP overhead.
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-12-uma.shankar@intel.com
Enable HDR for LSPCON based on Parade along with MCA.
v2: Added a helper for status reg as suggested by Ville.
v3: Removed a redundant variable, added Ville's RB.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Vipin Anand <vipin.anand@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-11-uma.shankar@intel.com
Enable HDMI Colorspace for LSPCON based devices. Sending Colorimetry
data for HDR using AVI infoframe. LSPCON firmware expects this and though
SOC drives DP, for HDMI panel AVI infoframe is sent to the LSPCON device
which transfers the same to HDMI sink.
v2: Dropped state managed in drm core as per Jani Nikula's suggestion.
v3: Aligned colorimetry handling for lspcon as per compute_avi_infoframes,
as suggested by Ville.
v4: Finally fixed this with Ville's help, re-phrased the commit header
and description.
v5: Register HDMI colorspace for lspcon and move this to
intel_dp_add_properties as we can't create property at late_register.
Credits-to: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-9-uma.shankar@intel.com
With LSPCON we use the AVI infoframe to convey the colorimetry
information (as opposed to DP MSA/SDP), so the property we expose
should match the values we can stuff into the infoframe. Ie. we
must use the HDMI variant of the property, even though we drive
LSPCON in PCON mode. To that end just split
intel_attach_colorspace_property() into HDMI and DP variants
and let the caller worry about which one it wants to use.
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-8-uma.shankar@intel.com
Content type is supported on HDMI sink devices. Attached the
property for the same for LSPCON based devices.
v2: Added the content type programming when we are attaching
the property to connector, as suggested by Ville.
v3: Need to attach content type on intel_dp_add_properties
as creating of new properties is not possible at late_register.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-7-uma.shankar@intel.com
Attach HDR property for Gen9 devices with MCA LSPCON
chips.
v2: Cleaned HDR property attachment logic based on capability
as per Jani Nikula's suggestion.
v3: Fixed the HDR property attachment logic as per the new changes
by Kai-Feng to align with lspcon detection failure on some devices.
v4: Add HDR proprty in late_register to handle lspcon detection,
as suggested by Ville.
v5: Init Lspcon only if advertized from BIOS.
v6: Added a Todo to plan a cleanup later, added Ville's RB.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-4-uma.shankar@intel.com
Gen9 hardware supports HDMI2.0 through LSPCON chips.
Extending HDR support for MCA LSPCON based GEN9 devices.
SOC will drive LSPCON as DP and send HDR metadata as standard
DP SDP packets. LSPCON will be set to operate in PCON mode,
will receive the metadata and create Dynamic Range and
Mastering Infoframe (DRM packets) and send it to HDR capable
HDMI sink devices.
v2: Re-used hsw infoframe write implementation for HDR metadata
for LSPCON as per Ville's suggestion.
v3: Addressed Jani Nikula's review comments.
v4: Addressed Ville's review comments, removed redundant wrapper
and checks, passed arguments instead of hardcodings.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-3-uma.shankar@intel.com
LSPCON firmware exposes HDR capability through LPCON_CAPABILITIES
DPCD register. LSPCON implementations capable of supporting
HDR set HDR_CAPABILITY bit in LSPCON_CAPABILITIES to 1. This patch
reads the same, detects the HDR capability and adds this to
intel_lspcon struct.
v2: Addressed Jani Nikula's review comment and fixed the HDR
capability detection logic
v3: Deferred HDR detection from lspcon_init (Ville)
v4: Addressed Ville's minor review comments, added his RB.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130204738.2443-2-uma.shankar@intel.com
Add the calculations to set plane selective fetch registers depending
in the value of the area damaged.
It is still using the whole plane area as damaged but that will change
in next patches.
v2:
- fixed new_plane_state->uapi.dst.y2 typo in
intel_psr2_sel_fetch_update()
- do not shifthing new_plane_state->uapi.dst only src is in 16.16 format
BSpec: 55229
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Tested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130125750.17820-1-jose.souza@intel.com
Arguably some of these should use intel_de_read() or intel_de_write(),
however not all. Prioritize I915_READ() and I915_WRITE() removal in
general over migrating to the pedantically correct replacements right
away.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-8-jani.nikula@intel.com
Arguably some of these should use intel_de_read() or intel_de_write(),
however not all. Prioritize I915_READ() and I915_WRITE() removal in
general over migrating to the pedantically correct replacements right
away.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-7-jani.nikula@intel.com
Another straggler with I915_READ() uses gone.
Arguably some of these should use intel_de_read(), however not
all. Prioritize I915_READ() removal in general over migrating to the
pedantically correct replacement right away.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-5-jani.nikula@intel.com
The i915_cache_sharing file is a debugfs interface for gen 6-7 with no
validation or user. Remove it.
This also removes the last I915_WRITE() use in i915_debugfs.c.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-4-jani.nikula@intel.com
Good riddance! Remove the macros and their remaining references in
comments.
intel_uncore_read_fw() and intel_uncore_write_fw() should be used
instead.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-2-jani.nikula@intel.com
The information is no longer relevant, so remove it. This also removes
the last users of I915_READ_FW()
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130111601.2817-1-jani.nikula@intel.com
Ville noticed that the last mocs entry is used unconditionally by the HW
when it performs cache evictions, and noted that while the value is not
meant to be writable by the driver, we should program it to a reasonable
value nevertheless.
As it turns out, we can change the value of mocs:63 and the value we
were programming into it would cause hard hangs in conjunction with
atomic operations.
v2: Add details from bspec about how it is used by HW
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2707
Fixes: 3bbaba0cea ("drm/i915: Added Programming of the MOCS")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140841.1982-1-chris@chris-wilson.co.uk
(cherry picked from commit 977933b5da)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
After a cursory check on the parameters to i915_gem_object_pin_map(),
where we return a precise error, if the backend rejects the mapping we
always return PTR_ERR(-ENOMEM). Let us also return a more precise error
here so we can differentiate between running out of memory and
programming errors (or situations where we may be trying different paths
and looking for an error from an unsupported map).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127195334.13134-1-chris@chris-wilson.co.uk
Block sizes are only limited by the largest power-of-two that will fit
in the region size, but to construct an object we also require feeding
it into an sg list, where the upper limit of the sg entry is at most
UINT_MAX. Therefore to prevent issues with allocating blocks that are
too large, add the flag I915_ALLOC_MAX_SEGMENT_SIZE which should limit
block sizes to the i915_sg_segment_size().
v2: (matt)
- query the max segment.
- prefer flag to limit block size to 4G, since it's best not to assume
the user will feed the blocks into an sg list.
- simple selftest so we don't have to guess.
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130134721.54457-1-matthew.auld@intel.com
For the LMEM case if we have suitable alignment and 2M physical pages we
should always get 2M GTT pages within the constraints of the hugepages
selftest. If we don't then something might be wrong in our construction
of the backing pages.
References: 330b7d3305 ("drm/i915/region: fix order when adding blocks")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130141809.65330-2-matthew.auld@intel.com
In igt_ppgtt_sanity_check we should also exercise the non-contiguous
option for LMEM, since this will give us slightly different sg layouts
and alignment.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130141809.65330-1-matthew.auld@intel.com
We treat idling the GT (intel_rps_park) as a downclock event, and reduce
the frequency we intend to restart the GT with. Since the two workloads
are likely related (e.g. a compositor rendering every 16ms), we want to
carry the frequency and load information from across the idling.
However, we do also need to update the frequencies so that workloads
that run for less than 1ms are autotuned by RPS (otherwise we leave
compositors running at max clocks, draining excess power). Conversely,
if we try to run too slowly, the next workload has to run longer. Since
there is a hysteresis in the power graph, below a certain frequency
running a short workload for longer consumes more energy than running it
slightly higher for less time. The exact balance point is unknown
beforehand, but measurements with 30fps media playback indicate that RPe
is a better choice.
Reported-by: Edward Baker <edward.baker@intel.com>
Tested-by: Edward Baker <edward.baker@intel.com>
Fixes: 043cd2d14e ("drm/i915/gt: Leave rps->cur_freq on unpark")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201124183521.28623-1-chris@chris-wilson.co.uk
idr_init() uses base 0 which is an invalid identifier. The new function
idr_init_base allows IDR to set the ID lookup from base 1. This avoids
all lookups that otherwise starts from 0 since 0 is always unused.
References: commit 6ce711f275 ("idr: Make 1-based IDRs more efficient")
Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104150339.GA68663@localhost
We know a problem exists in the ifwi shipped with the early
pre-production Tigerlake and DG1 prototypes, later revisions are fine.
However, CI still relies on the earlier ifwi and we grow tired of
the volume of warnings as we wait for replacements.
Since the warning is a bug, we do not want to lose the warning in its
entirety, so only suppress the warning for the platforms currently
exhibiting the issue.
Suggested-by: José Roberto de Souza <gitlab@gitlab.freedesktop.org>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2411
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127210059.10702-1-chris@chris-wilson.co.uk
As we use a shmemfs file to hold the context state, when not in use it
may be swapped out, such as across suspend. Since we wrote into the
shmemfs without marking the pages as dirty, the contents may be dropped
instead of being written back to swap. On re-using the shmemfs file,
such as creating a new context after resume, the contents of that file
were likely garbage and so the new context could then hang the GPU.
Simply mark the page as being written when copying into the shmemfs
file, and it the new contents will be retained across swapout.
Fixes: be1cb55a07 ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.8+
Link: https://patchwork.freedesktop.org/patch/msgid/20201127120718.454037-161-matthew.auld@intel.com
We now use ilk_hpd_irq_setup for all GMCH platforms that do not have
hotplug. These are early gen3 and gen2 devices that now explode on boot
as they try to access non-existent registers.
Fixes: 794d61a190 ("drm/i915: re-order if/else ladder for hpd_irq_setup")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127145748.29491-1-chris@chris-wilson.co.uk
We checked the table size against a hardcoded number of entries, and
that number was excluding the special mocs registers at the end.
Fixes: 977933b5da ("drm/i915/gt: Program mocs:63 for cache eviction on gen9")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127102540.13117-1-chris@chris-wilson.co.uk
If while we are cancelling the breadcrumb signaling, we find that the
request is already completed, move it to the irq signaler and let it be
signaled.
v2: Tweak reference counting so that we only acquire a new reference on
adding to a signal list, as opposed to a hidden i915_request_put of the
caller's reference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-5-chris@chris-wilson.co.uk
Allow a brief period for continued access to a dead intel_context by
deferring the release of the struct until after an RCU grace period.
As we are using a dedicated slab cache for the contexts, we can defer
the release of the slab pages via RCU, with the caveat that individual
structs may be reused from the freelist within an RCU grace period. To
handle that, we have to avoid clearing members of the zombie struct.
This is required for a later patch to handle locking around virtual
requests in the signaler, as those requests may want to move between
engines and be destroyed while we are holding b->irq_lock on a physical
engine.
v2: Drop mutex_reinit(), if we never mark the mutex as destroyed we
don't need to reset the debug code, at the loss of having the mutex
debug code spot us attempting to destroy a locked mutex.
v3: As the intended use will remain strongly referenced counted, with
very little inflight access across reuse, drop the ctor.
v4: Drop the unrequired change to remove the temporary reference around
dropping the active context, and add back some more missing ctor
operations.
v5: The ctor is back. Tvrtko spotted that ce->signal_lock [introduced
later] maybe accessed under RCU and so needs special care not to be
reinitialised.
v6: Don't mix SLAB_TYPESAFE_BY_RCU and RCU list iteration.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-3-chris@chris-wilson.co.uk
Pull the repeated check for the last active request being completed to a
single spot, when deciding whether or not execlist preemption is
required.
In doing so, we remove the tasklet kick, introduced with the completion
checks in commit 35f3fd8182 ("drm/i915/execlists: Workaround switching
back to a completed context"), if we find the request was completed but
have not yet seen the corresponding CS event. This was devolving into a
busy spin of the tasklet while we waited for the event as the delivery
was not as instantaneous as expected. Under load this is sufficient to
exhaust the tasklet softirq timeslice, and force ksoftirqd. Quite
noticeable overhead for no apparent improvement in latency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-2-chris@chris-wilson.co.uk
Since the introduction of preempt-to-busy, requests can complete in the
background, even while they are not on the engine->active.requests list.
As such, the engine->active.request list itself is not in strict
retirement order, and we have to scan the entire list while unwinding to
not miss any. However, if the request is completed we currently leave it
on the list [until retirement], but we could just as simply remove it
and stop treating it as active. We would only have to then traverse it
once while unwinding in quick succession.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-1-chris@chris-wilson.co.uk
Ville noticed that the last mocs entry is used unconditionally by the HW
when it performs cache evictions, and noted that while the value is not
meant to be writable by the driver, we should program it to a reasonable
value nevertheless.
As it turns out, we can change the value of mocs:63 and the value we
were programming into it would cause hard hangs in conjunction with
atomic operations.
v2: Add details from bspec about how it is used by HW
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2707
Fixes: 3bbaba0cea ("drm/i915: Added Programming of the MOCS")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140841.1982-1-chris@chris-wilson.co.uk
Since preempt-to-busy, we may unsubmit a request while it is still on
the HW and completes asynchronously. That means it may be retired and in
the process destroy the virtual engine (as the user has closed their
context), but that engine may still be holding onto the unsubmitted
compelted request. Therefore we need to potentially cleanup the old
request on destroying the virtual engine. We also have to keep the
virtual_engine alive until after the sibling's execlists_dequeue() have
finished peeking into the virtual engines, for which we serialise with
RCU.
v2: Be paranoid and flush the tasklet as well.
v3: And flush the tasklet before the engines, as the tasklet may
re-attach an rb_node after our removal from the siblings.
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-4-chris@chris-wilson.co.uk
(cherry picked from commit 46eecfccb4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We currently want to keep the interrupt enabled until the interrupt after
which we have no more work to do. This heuristic was broken by us
kicking the irq-work on adding a completed request without attaching a
signaler -- hence it appearing to the irq-worker that an interrupt had
fired when we were idle.
Fixes: 2854d86632 ("drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-3-chris@chris-wilson.co.uk
(cherry picked from commit 3aef910d26)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Move the register slow register write and readback from out of the
critical path for execlists submission and delay it until the following
worker, shaving off around 200us. Note that the same signal_irq_work() is
allowed to run concurrently on each CPU (but it will only be queued once,
once running though it can be requeued and reexecuted) so we have to
remember to lock the global interactions as we cannot rely on the
signal_irq_work() itself providing the serialisation (in constrast to a
tasklet).
By pushing the arm/disarm into the central signaling worker we can close
the race for disarming the interrupt (and dropping its associated
GT wakeref) on parking the engine. If we loose the race, that GT wakeref
may be held indefinitely, preventing the machine from sleeping while
the GPU is ostensibly idle.
v2: Move the self-arming parking of the signal_irq_work to a flush of
the irq-work from intel_breadcrumbs_park().
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2271
Fixes: e23005604b ("drm/i915/gt: Hold context/request reference while breadcrumbs are active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-1-chris@chris-wilson.co.uk
(cherry picked from commit 9d5612ca16)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
After having written the entire OA buffer with reports, the HW will
write again at the beginning of the OA buffer. It'll indicate it by
setting the WRAP bits in the OASTATUS register.
When a wrap happens and that at the end of the read vfunc we write the
OASTATUS register back to clear the REPORT_LOST bit, we sometimes see
that the OATAILPTR register is reset to a previous position on Gen8/9
(apparently not the case on Gen11+). This leads the next call to the
read vfunc to process reports we've already read. Because we've marked
those as read by clearing the reason & timestamp dwords, they're
discarded and a "Skipping spurious, invalid OA report" message is
emitted.
The workaround to avoid this OATAILPTR value reset seems to be to set
the wrap bits when writing back OASTATUS.
This change has no impact on userspace, it only avoids a bunch of
DRM_NOTE("Skipping spurious, invalid OA report\n") messages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 19f81df285 ("drm/i915/perf: Add OA unit support for Gen 8+")
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117130124.829979-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 059a0beb48)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Get rid of the __call_single_node union and clean up the API a little
to avoid external code relying on the structure layout as much.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
CT event handler is called under the gt->irq_lock from the interrupt
handling paths so make it the same from the init path. I don't think this
mismatch caused any functional issue but we need to wean the code of the
global i915->irq_lock.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120095636.1987395-2-tvrtko.ursulin@linux.intel.com
Guc->mmio_msg is set under the guc->irq_lock in guc_get_mmio_msg so it
should be consumed under the same lock from guc_handle_mmio_msg.
I am not sure if the overall flow here makes complete sense but at least
the correct lock is now used.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120095636.1987395-1-tvrtko.ursulin@linux.intel.com
Since preempt-to-busy, we may unsubmit a request while it is still on
the HW and completes asynchronously. That means it may be retired and in
the process destroy the virtual engine (as the user has closed their
context), but that engine may still be holding onto the unsubmitted
compelted request. Therefore we need to potentially cleanup the old
request on destroying the virtual engine. We also have to keep the
virtual_engine alive until after the sibling's execlists_dequeue() have
finished peeking into the virtual engines, for which we serialise with
RCU.
v2: Be paranoid and flush the tasklet as well.
v3: And flush the tasklet before the engines, as the tasklet may
re-attach an rb_node after our removal from the siblings.
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-4-chris@chris-wilson.co.uk
We currently want to keep the interrupt enabled until the interrupt after
which we have no more work to do. This heuristic was broken by us
kicking the irq-work on adding a completed request without attaching a
signaler -- hence it appearing to the irq-worker that an interrupt had
fired when we were idle.
Fixes: 2854d86632 ("drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-3-chris@chris-wilson.co.uk
Move the register slow register write and readback from out of the
critical path for execlists submission and delay it until the following
worker, shaving off around 200us. Note that the same signal_irq_work() is
allowed to run concurrently on each CPU (but it will only be queued once,
once running though it can be requeued and reexecuted) so we have to
remember to lock the global interactions as we cannot rely on the
signal_irq_work() itself providing the serialisation (in constrast to a
tasklet).
By pushing the arm/disarm into the central signaling worker we can close
the race for disarming the interrupt (and dropping its associated
GT wakeref) on parking the engine. If we loose the race, that GT wakeref
may be held indefinitely, preventing the machine from sleeping while
the GPU is ostensibly idle.
v2: Move the self-arming parking of the signal_irq_work to a flush of
the irq-work from intel_breadcrumbs_park().
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2271
Fixes: e23005604b ("drm/i915/gt: Hold context/request reference while breadcrumbs are active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201123113717.20500-1-chris@chris-wilson.co.uk
The current interface of intel_gvt_register_hypervisor() expects a
non-const pointer to struct intel_gvt_mpt, even though the mediator
never modifies (or should modifiy) the content of this struct.
Change the function signature and relevant struct members to const to
properly express the API's intent and allow instances of intel_gvt_mpt
to be allocated as const.
While I was here, I also made KVM's instance of this struct const to
reduce the number of writable function pointers in the kernel.
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: intel-gvt-dev@lists.freedesktop.org
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201111172811.558443-1-julian.stecklina@cyberus-technology.de
The old IPS interface did not match the RPS interface that we tried to
plug it into (bool vs int return). Once repaired, our minimal
selftesting is finally happy!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201121190352.15996-1-chris@chris-wilson.co.uk
If we run out of ring space, or exceed the desired runtime, we wish to
stop the subtest. Put these checks together, so that we always keep the
requests flushed on completion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120140314.24749-3-chris@chris-wilson.co.uk
We print out the "logical" context support before we discover whether or
not the engines have logical contexts. No one, except Tvrtko, seems to
have noticed the error, so the debug message must not be useful to
anyone.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120140314.24749-1-chris@chris-wilson.co.uk
For DG1 we have a little of mix up wrt to DDI/port names and indexes.
Bspec refers to the ports as DDIA, DDIB, DDI USBC1 and DDI USBC2
(besides the DDIA, DDIB, DDIC, DDID), but the previous naming is the
most unambiguous one. This means that for any register on Display Engine
we should use the index of A, B, D and E. However in some places this is
not true:
- VBT: uses C and D and have to be mapped to D/E
- IO/Combo: uses C and D, but we already differentiate those when
we created the phy vs port distinction.
This additional mapping for VBT and phy are already covered in previous
patches, so now we can initialize all the DDIs as A, B, D and E.
v2: Squash previous patch enabling just ports A and B since most of the
pumbling code is already merged now
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117084836.2318234-1-lucas.demarchi@intel.com
Include the signalers each request in the timeline is waiting on, as a
means to try and identify the cause of a stall. This can be quite
verbose, even as for now we only show each request in the timeline and
its immediate antecedents.
This generates output like:
Timeline 886: { count 1, ready: 0, inflight: 0, seqno: { current: 664, last: 666 }, engine: rcs0 }
U 886:29a- prio=0 @ 134ms: gem_exec_parall<4621>
U bc1:27a- prio=0 @ 134ms: gem_exec_parall[4917]
Timeline 825: { count 1, ready: 0, inflight: 0, seqno: { current: 802, last: 804 }, engine: vcs0 }
U 825:324 prio=0 @ 107ms: gem_exec_parall<4518>
U b75:140- prio=0 @ 110ms: gem_exec_parall<5486>
Timeline b46: { count 1, ready: 0, inflight: 0, seqno: { current: 782, last: 784 }, engine: vcs0 }
U b46:310- prio=0 @ 70ms: gem_exec_parall<5428>
U c11:170- prio=0 @ 70ms: gem_exec_parall[5501]
Timeline 96b: { count 1, ready: 0, inflight: 0, seqno: { current: 632, last: 634 }, engine: vcs0 }
U 96b:27a- prio=0 @ 67ms: gem_exec_parall<4878>
U b75:19e- prio=0 @ 67ms: gem_exec_parall<5486>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-6-chris@chris-wilson.co.uk
Include the active timelines for debugfs/i915_engine_info, so that we
can see which have unready requests inflight which are not shown
otherwise.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-4-chris@chris-wilson.co.uk
We plan to expand upon the number of available statuses for when we
pretty-print the requests along the timelines, and so need a new set of
flags. We have settled upon:
Unready [U]
- initial status after being submitted, the request is not
ready for execution as it is waiting for external fences
Ready [R]
- all fences the request was waiting on have been signaled,
and the request is now ready for execution and will be
in a backend queue
- a ready request may still need to wait on semaphores
[internal fences]
Ready/virtual [V]
- same as ready, but queued over multiple backends
Executing [E]
- the request has been transferred from the backend queue and
submitted for execution on HW
- a completed request may still be regarded as executing, its
status may not be updated until it is retired and removed
from the lists
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-3-chris@chris-wilson.co.uk
Extract i915_request_show for reuse in other request chain pretty
printers.
For a bonus point, quietly change the seqno format from %llx to %lld to
match everywhere else.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-2-chris@chris-wilson.co.uk
Forcing mocs:1 [used for our winsys follows-pte mode] to be cached
caused display glitches. Though it is documented as deprecated (and so
likely behaves as uncached) use the follow-pte bit and force it out of
L3 cache.
Testcase: igt/kms_frontbuffer_tracking
Testcase: igt/kms_big_fb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-4-chris@chris-wilson.co.uk
(cherry picked from commit a04ac82736)
Fixes: 849c0fe9e8 ("drm/i915/gt: Initialize reserved and unspecified MOCS indices")
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo: Updated Fixes tag]
After having written the entire OA buffer with reports, the HW will
write again at the beginning of the OA buffer. It'll indicate it by
setting the WRAP bits in the OASTATUS register.
When a wrap happens and that at the end of the read vfunc we write the
OASTATUS register back to clear the REPORT_LOST bit, we sometimes see
that the OATAILPTR register is reset to a previous position on Gen8/9
(apparently not the case on Gen11+). This leads the next call to the
read vfunc to process reports we've already read. Because we've marked
those as read by clearing the reason & timestamp dwords, they're
discarded and a "Skipping spurious, invalid OA report" message is
emitted.
The workaround to avoid this OATAILPTR value reset seems to be to set
the wrap bits when writing back OASTATUS.
This change has no impact on userspace, it only avoids a bunch of
DRM_NOTE("Skipping spurious, invalid OA report\n") messages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 19f81df285 ("drm/i915/perf: Add OA unit support for Gen 8+")
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117130124.829979-1-lionel.g.landwerlin@intel.com
Add the new vma_set_file() function to allow changing
vma->vm_file with the necessary refcount dance.
v2: add more users of this.
v3: add missing EXPORT_SYMBOL, rebase on mmap cleanup,
add comments why we drop the reference on two occasions.
v4: make it clear that changing an anonymous vma is illegal.
v5: move vma_set_file to mm/util.c
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://patchwork.freedesktop.org/patch/399360/
Just a normal comment, not a kerneldoc function description.
drivers/gpu/drm/i915/gvt/handlers.c:1666: warning: Function parameter or member 'vgpu' not described in 'bxt_ppat_low_write'
drivers/gpu/drm/i915/gvt/handlers.c:1666: warning: Function parameter or member 'offset' not described in 'bxt_ppat_low_write'
drivers/gpu/drm/i915/gvt/handlers.c:1666: warning: Function parameter or member 'p_data' not described in 'bxt_ppat_low_write'
drivers/gpu/drm/i915/gvt/handlers.c:1666: warning: Function parameter or member 'bytes' not described in 'bxt_ppat_low_write'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201103204307.15723-1-chris@chris-wilson.co.uk
Since we allocate some breadcrumbs for the virtual engine, and the
virtual engine has a custom destructor, we also need to free the
breadcrumbs after use.
Fixes: b3786b2937 ("drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201118133839.1783-1-chris@chris-wilson.co.uk
(cherry picked from commit 45e50f48b7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
EDID can declare the maximum supported bpc up to 16,
and apparently there are displays that do so. Currently
we assume 12 bpc is tha max. Fix the assumption and
toss in a MISSING_CASE() for any other value we don't
expect to see.
This fixes modesets with a display with EDID max bpc > 12.
Previously any modeset would just silently fail on platforms
that didn't otherwise limit this via the max_bpc property.
In particular we don't add the max_bpc property to HDMI
ports on gmch platforms, and thus we would see the raw
max_bpc coming from the EDID.
I suppose we could already adjust this to also allow 16bpc,
but seeing as no current platform supports that there is
little point.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2632
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110210447.27454-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 2ca5a7b85b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Since we allocate some breadcrumbs for the virtual engine, and the
virtual engine has a custom destructor, we also need to free the
breadcrumbs after use.
Fixes: b3786b2937 ("drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201118133839.1783-1-chris@chris-wilson.co.uk
We can't call drm_plane_state_src() this late for the slave plane since
it would consult the wrong uapi state. We've alreayd done the correct
uapi->hw copy earlier, so let's just preserve the unclipped src/dst
rects using a temp copy across the intel_atomic_plane_check_clipping()
call.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-14-manasi.d.navare@intel.com
Dump debugfs and planar links as well, this will make it easier to debug
when things go wrong.
v4:
* Rebase
Changes since v1:
- Report planar slaves as such, now that we have the plane_state switch.
Changes since v2:
- Rebase on top of the new plane format dumping
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-12-manasi.d.navare@intel.com
We need to look at hw.fb for the framebuffer, and add the translation
for the slave_plane_state. With these changes we set the correct
rectangle on the bigjoiner slave, and don't set incorrect
src/dst/visibility on the slave plane.
v2:
* Manual rebase (Manasi)
v3:
* hw.rotation instead of uapi.rotation (Ville)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-11-manasi.d.navare@intel.com
When using bigjoiner userspace is only controlling the "master"
plane, so use its uapi state for the "slave" plane as well.
hw.crtc needs a bit of magic since we don't want to copy that from
the uapi state (as it points to the wrong pipe for the "slave
" plane). Instead we pass the right crtc in explicitly but only
assign it when the uapi state indicates the plane to be logically
enabled (ie. uapi.crtc != NULL).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-10-manasi.d.navare@intel.com
Make sure both the bigjoiner "master" and "slave" plane are
in the state whenever either of them is in the state.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-9-manasi.d.navare@intel.com
Skip iterating over bigjoiner slaves, only the master has the state we
care about.
Add the width of the bigjoiner slave to the reconstructed fb.
Hide the bigjoiner slave to userspace, and double the mode on bigjoiner
master.
And last, disable bigjoiner slave from primary if reconstruction fails.
v3:
* Fix the ddi_get_config slave error (Ankit Nautiyal)
v2:
* Unsupported bigjoiner config for initial fb (Ville)
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala:
* Don't do any hw->uapi state copy for bigjoiner slave
* We still have hw.mode so no need to pass it in
* Appease checkpatch]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-7-manasi.d.navare@intel.com
Enabling is done in a special sequence and so should plane updates
be. Ideally the end user never notices the second pipe is used.
This way ideally everything will be tear free, and updates are
really atomic as userspace expects it.
This uses generic modeset_enables() calls like trans port sync
but still has special handling for disable since for slave we
should not disable things like encoder, plls that are not enabled
for slave.
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala: Appease checkpatch]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-6-manasi.d.navare@intel.com
Make vdsc work when no output is enabled. The big joiner needs VDSC
on the slave, so enable it and set the appropriate bits.
So remove encoder usage from dsc functions.
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-5-manasi.d.navare@intel.com
When the clock is higher than the dotclock, try with 2 pipes enabled.
If we can enable 2, then we will go into big joiner mode, and steal
the adjacent crtc.
This only links the crtc's in software, no hardware or plane
programming is done yet. Blobs are also copied from the master's
crtc_state, so it doesn't depend at commit time on the other
crtc_state.
v6:
* Enable dSC for any mode->hdisplay > 5120
v5:
* Remove intel_dp_max_dotclock (Manasi)
v4:
* Fixes in intel_crtc_compute_config (Ville)
v3:
* Manual Rebase (Manasi)
Changes since v1:
- Rename pipe timings to transcoder timings, as they are now different.
Changes since v2:
- Rework bigjoiner checks; always disable slave when recalculating
master. No need to have a separate bigjoiner pass any more.
- Use pipe_mode instead of transcoder_mode, to clean up the code.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala:
* hskew isn't a thing
* Do the dsc compute if bigjoiner is enabled, not the other way around]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-4-manasi.d.navare@intel.com
Small changes to intel_dp_mode_valid(), allow listing modes that
can only be supported in the bigjoiner configuration, which is
not supported yet.
v13:
* Allow bigjoiner if hdisplay >5120
v12:
* slice_count logic simplify (Ville)
* Fix unnecessary changes in downstream_mode_valid (Ville)
v11:
* Make intel_dp_can_bigjoiner non static
so it can be used in intel_display (Manasi)
v10:
* Simplify logic (Ville)
* Allow bigjoiner on edp (Ville)
v9:
* Restric Bigjoiner on PORT A (Ville)
v8:
* use source dotclock for max dotclock (Manasi)
v7:
* Add can_bigjoiner() helper (Ville)
* Pass bigjoiner to plane_size validation (Ville)
v6:
* Rebase after dp_downstream mode valid changes (Manasi)
v5:
* Increase max plane width to support 8K with bigjoiner (Maarten)
v4:
* Rebase (Manasi)
Changes since v1:
- Disallow bigjoiner on eDP.
Changes since v2:
- Rename intel_dp_downstream_max_dotclock to intel_dp_max_dotclock,
and split off the downstream and source checking to its own function.
(Ville)
v3:
* Rebase (Manasi)
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[vsyrjala:
* Keep bigjoiner disabled until everything is ready
* Appease checkpatch]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-3-manasi.d.navare@intel.com
When doing the plane state copy from the UV plane to the Y plane
let's just copy the hw state directly instead of using the original
uapi state. The UV plane has already had its uapi state copied into
its hw state, so this extra detour via the uapi state for the Y plane
is pointless.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117194718.11462-2-manasi.d.navare@intel.com
I totally fumbled the ?: usage when generating the DDI encoder
names. Reverse the things that need reversing, and to make it
a bit less messy add a few macros to hide the arithmetic on the
port enums.
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Fixes: 2d709a5a62 ("drm/i915: Give DDI encoders even better names")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117154028.8516-1-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
The presumption was that some time would always elapse between recording
the start and the finish of a context switch. This turns out to be a
regular occurrence and emitting a debug statement superfluous.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117113103.21480-4-chris@chris-wilson.co.uk
The WA specifies that we need to toggle a SDE chicken bit on and then
off as the final step in preparation for s0ix entry.
Bspec: 33450
Bspec: 8402
However, something is happening after we toggle the bit that causes
the WA to be invalidated. This makes dispcnlunit1_cp_xosc_clkreq
active being already in s0ix state i.e SLP_S0 counter incremented.
Tweaking the Wa_14010685332 by setting the bit on suspend and clearing
it on resume turns down the dispcnlunit1_cp_xosc_clkreq.
B.Spec has Documented this tweaked sequence of WA as an alternative.
Let keep this tweaked WA for Gen11 platforms and keep untweaked WA for
other platforms which never observed this issue.
v2 (MattR):
- Change the comment on the workaround to give PCH names rather than
platform names. Although the bspec is setup to list workarounds by
platform, the hardware team has confirmed that the actual issue being
worked around here is something that was introduced back in the
Cannon Lake PCH and carried forward to subsequent PCH's.
- Extend the untweaked version of the workaround to include PCH_CNP as
well. Note that since PCH_CNP is used to represent CMP, this will
apply on CML and some variants of RKL too.
- Cap the untweaked version of the workaround so that it won't apply to
"fake" PCH's (i.e., DG1). The issue we're working around really is
an issue in the PCH itself, not the South Display, so it shouldn't
apply when there isn't a real PCH.
v3:
- use intel_de_rmw(). [Rodrigo]
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110121700.4338-1-anshuman.gupta@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
EDID can declare the maximum supported bpc up to 16,
and apparently there are displays that do so. Currently
we assume 12 bpc is tha max. Fix the assumption and
toss in a MISSING_CASE() for any other value we don't
expect to see.
This fixes modesets with a display with EDID max bpc > 12.
Previously any modeset would just silently fail on platforms
that didn't otherwise limit this via the max_bpc property.
In particular we don't add the max_bpc property to HDMI
ports on gmch platforms, and thus we would see the raw
max_bpc coming from the EDID.
I suppose we could already adjust this to also allow 16bpc,
but seeing as no current platform supports that there is
little point.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2632
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110210447.27454-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Just like for rkl and tgl, this should be permanent as well for dg1
instead just for A0. The commit making it permanent for those platforms
ended up "racing" with the commit adding the DG1 WAs, so now fix that up.
v2: Add "tgl,dg1" to WA comment (Matt)
Cc: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027043228.696518-3-lucas.demarchi@intel.com
If intel context create failed, the perf_request_latency() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: 25c26f18ea ("drm/i915/selftests: Measure dispatch latency")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116143540.3648870-1-zhangxiaoxu5@huawei.com
(cherry picked from commit 1938445205)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If intel context create failed, the perf_series_engines() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: cbfd3a0c5a ("drm/i915/selftests: Add request throughput measurement to perf")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116144112.3673011-1-zhangxiaoxu5@huawei.com
(cherry picked from commit 01d708840c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
I forgot to free the old list when growing past 16 entries.
Luckily, as much as I checked, none of the current platforms has more than
16 workarounds on a single list.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 452420d22d ("drm/i915: Fuse per-context workaround handling with the common framework")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201113132510.2298483-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 77c296966e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Some media power gates are disabled by default. commit 5d86923060
("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
tried to enable it, but it duplicated an existent register.
So, the main PG setup sequences ended up overwriting it.
So, let's now merge this to the main PG setup sequence.
v2: (Chris): s/BIT/REG_BIT, remove useless comment,
remove useless =0, use the right gt,
remove rc6 sequence doubt from commit message.
Fixes: 5d86923060 ("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: stable@vger.kernel.org#v5.5+
Cc: Dale B Stimson <dale.b.stimson@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201111072859.1186070-1-rodrigo.vivi@intel.com
(cherry picked from commit 695dc55b57)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently the DPLL .get_freq() uses pll->state.hw_state which
is not the thing we actually read out (except during driver
load/resume). Outside of that pll->state.hw_state is just the
thing we committed last time around. During state check we
just read the thing into crtc_state->dpll_hw_state, so that
is what we should use for calculating the DPLL output frequency.
I think we used to do this so that the results of the readout
were actually used, but somehow it got changed when the
.get_freq() refactoring happened.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201109231239.17002-3-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
On icl+ we want to populate both crtc_state.{shared_dpll,dpll_hw_state}
and crtc_state.port_dplls[] during readout, whereas on pre-icl we
want to leave the latter stuff untouched. Rather than adding more ifs
into hsw_get_ddi_port_state() to copy the DPLL hw state around let's
just move the whole dpll readout into hsw_get_ddi_dpll() & co.
Slightly repetitive, but meh.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201109231239.17002-2-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Store the relative data rate for planes in the crtc state
so that we don't have to use
intel_atomic_crtc_state_for_each_plane_state() to compute
it even for the planes that are no part of the current state.
Should probably just nuke this stuff entirely an use the normal
plane data rate instead. The two are slightly different since this
relative data rate doesn't factor in the actual pixel clock, so
it's a bit odd thing to even call a "data rate". And since the
watermarks are computed based on the actual data rate anyway
I don't really see what the point of this relative data rate
is. But that's for the future...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-6-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
In order to remove intel_atomic_crtc_state_for_each_plane_state()
from skl_crtc_can_enable_sagv() we can simply precompute whether
each wm level can tolerate the SAGV block time latency or not.
This has the nice side benefit that we remove the duplicated
wm level latency calculation. In fact the copy of that code
we had in skl_crtc_can_enable_sagv() didn't even handle
WaIncreaseLatencyIPCEnabled/Display WA #1141 whereas the copy
in skl_compute_plane_wm() did. So now we just have the one
copy which handles all the w/as.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-5-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
If intel context create failed, the perf_request_latency() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: 25c26f18ea ("drm/i915/selftests: Measure dispatch latency")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116143540.3648870-1-zhangxiaoxu5@huawei.com
If intel context create failed, the perf_series_engines() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: cbfd3a0c5a ("drm/i915/selftests: Add request throughput measurement to perf")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116144112.3673011-1-zhangxiaoxu5@huawei.com
I forgot to free the old list when growing past 16 entries.
Luckily, as much as I checked, none of the current platforms has more than
16 workarounds on a single list.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 452420d22d ("drm/i915: Fuse per-context workaround handling with the common framework")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201113132510.2298483-1-tvrtko.ursulin@linux.intel.com
intel_atomic_crtc_state_for_each_plane_state() peeks at the
plane's current state without holding the plane's mutex, trusting
that the crtc's mutex will protect it. In practice that does work
since our planes can't move between pipes, but it sets a bad
example. intel_atomic_crtc_state_for_each_plane_state() also
relies on crtc_state.uapi.plane_mask which may be full of lies
when it comes to the bigjoiner stuff, so soon we can't use it as
is anyway. So best to just get rid of it entirely. Which we can
easily do by switching to the g4x/vlv "raw" watermark approach.
Later on we should even be able to move the "raw" watermark
computation into the normal .plane_check() code, leaving only
the merging/clamping of the final watermarks to the later
stages. But that will require adjusting the ilk+ wm code
similarly as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-3-ville.syrjala@linux.intel.com
Pass the whole intel_atomic_state to skl_build_pipe_wm()
and skl_allocate_pipe_ddb() so we can start to iterate
stuff containerd in the commit.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106173042.7534-2-ville.syrjala@linux.intel.com
No functional changes here, just adds a from_crtc_state
as a prep for bigjoiner
v2:
* More prep with intel_atomic_state (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201113155656.17630-2-manasi.d.navare@intel.com
No functional changes, to align with previous cleanups pass
intel_atomic_state instead of drm_atomic_state.
Also pass this intel_atomic_state with crtc_state to
some of the atomic_check functions.
v2:
* Squash some changes from next patch (Ville)
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201113155656.17630-1-manasi.d.navare@intel.com
With bigjoiner, there will be 2 pipes driving 2 halves of 1 transcoder,
because of this, we need a pipe_mode for various calculations, including
for example watermarks, plane clipping, etc.
v10:
* remove redundant pipe_mode assignment (Ville)
v9:
* pipe_mode in state dump nd state check (Ville)
v8:
* Add pipe_mode in readout in verify_crtc_state (Ville)
v7:
* Remove redundant comment (Ville)
* Just keep mode instead of pipe_mode (Ville)
v6:
* renaming in separate function, only pipe_mode here (Ville)
* Add description (Maarten)
v5:
* Rebase (Manasi)
v4:
* Manual rebase (Manasi)
v3:
* Change state to crtc_state, fix rebase err (Manasi)
v2:
* Manual Rebase (Manasi)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
[vsyrjala:
* Fix state checker
* Fix state dump
* Use pipe_mode for linetime watermarks
* Make sure pipe_mode normal timings are correct since the
silly ddb code uses them
* Drop the redundant pipe_mode copies from intel_modeset_pipe_config()
and intel_crtc_copy_uapi_to_hw_state()
* Use drm_mode_copy() all over]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-7-ville.syrjala@linux.intel.com
Collect up a bunch of derived state "readout" into
a common helper, which we can call from both
intel_encoder_get_config() and intel_crtc_get_pipe_config().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-6-ville.syrjala@linux.intel.com
Generalize intel_mode_from_pipe_config() to work on any two
arbitrary modes. Also relocate the code for the future, and
make it static since it's not needed elsewhere.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-5-ville.syrjala@linux.intel.com
No reason to make the callers of intel_crtc_get_pipe_config()
populate hw.active. Let's do it in intel_crtc_get_pipe_config()
itself. hw.enable we leave up to the callers since it's slightly
different for readout vs. state check.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-4-ville.syrjala@linux.intel.com
Create a new function intel_crtc_get_pipe_config()
that calls platform specific hooks for get_pipe_config()
No functional change here.
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala: Conform to modern i915 coding style, fix patch subject]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-3-ville.syrjala@linux.intel.com
No functional changes, create a separate intel_encoder_get_config()
function that calls encoder->get_config hook.
This is needed so that later we can add beigjoienr related
readout here.
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
[vsyrjala: Move the code around for the future]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112191718.16683-2-ville.syrjala@linux.intel.com
Cross-subsystem Changes:
- DMA mapped scatterlist fixes in i915 to unblock merging of
https://lkml.org/lkml/2020/9/27/70 (Tvrtko, Tom)
Driver Changes:
- Fix for user reported issue #2381 (Graphical output stops with "switching to inteldrmfb from simple"):
Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init (Ville, Chris)
- Fix for Tigerlake (and earlier) to avoid spurious empty CSB events leading to hang (Chris, Bruce)
- Delay execlist processing for Tigerlake to avoid hang (Chris)
- Fix for Tigerlake RCS engine health check through heartbeat (Chris)
- Fix for Tigerlake reserved MOCS entries (Ayaz, Chris)
- Fix Media power gate sequence on Tigerlake (Rodrigo)
- Enable eLLC caching of display buffers for SKL+ (Ville)
- Support parsing of oversize batches on Gen9 (Matt, Chris)
- Exclude low pages (128KiB) of stolen from use to avoid thrashing during reset (Chris)
- Flush engines before Tigerlake breadcrumbs (Chris)
- Use the local HWSP offset during submission (Chris)
- Flush coherency domains on first set-domain-ioctl (Chris, Zbigniew)
- Use the active reference on the vma while capturing to avoid use-after-free (Chris)
- Fix MOCS PTE setting for gen9+ (Ville)
- Avoid NULL dereference on IPS driver callback while unbinding i915 (Chris)
- Avoid NULL dereference from PT/PD stash allocation error (Matt)
- Hold request reference for canceling an active context (Chris)
- Avoid infinite loop on x86-32 when mapping a lot of objects (Chris)
- Disallow WC mappings when processor doesn't support them (Chris)
- Return correct error in i915_gem_object_copy_blt() error path (Dan)
- Return correct error in intel_context_create_request() error path (Maarten)
- Tune down GuC communication enabled/disabled messages to debug (Jani)
- Fix rebased commit "Remove i915_request.lock requirement for execution callbacks" (Chris)
- Cancel outstanding work after disabling heartbeats on an engine (Chris)
- Signal cancelled requests (Chris)
- Retire cancelled requests on unload (Chris)
- Scrub HW state on driver remove (Chris)
- Undo forced context restores after trivial preemptions (Chris)
- Handle PCI unbind in PMU code (Tvrtko)
- Fix CPU hotplug with multiple GPUs in PMU code (Trtkko)
- Correctly set SFC capability for video engines (Venkata)
- Update GuC code to use firmware v49.0.1 (John, Matthew B., Daniele, Oscar, Michel, Rodrigo, Michal)
- Improve GuC warnings on loading failure (John)
- Avoid ownership race in buffer pool by clearing age (Chris)
- Use MMIO to read CSB in case of failure (Chris, Mika)
- Show engine properties in engine state dump to indicate changes (Chris, Joonas)
- Break up error capture compression loops with cond_resched() (Chris)
- Reduce GPU error capture mutex hold time to avoid khungtaskd (Chris)
- Serialise debugfs i915_gem_objects with ctx->mutex (Chris)
- Always test execution status on closing the context and close if not persistent (Chris)
- Avoid mixing integer types during batch copies (Chris, Jared)
- Skip over MI_NOOP when parsing to avoid overhead (Chris)
- Hold onto an explicit ref to i915_vma_work.pinned (Chris)
- Perform all asynchronous waits prior to marking payload start (Chris)
- Pull phys pread/pwrite implementations to the backend (Matt)
- Improve record of hung engines in error state (Tvrtko)
- Allow backends to override pread implementation (Matt)
- Reinforce LRC poisoning checks to confirm context survives execution (Chris)
- Fix memory region max size calculation (Matt)
- Fix order when adding blocks to memory region (Matt)
- Eliminate unused intel_virtual_engine_get_sibling func (Chris)
- Cleanup kasan warning for on-stack (unsigned long) casting (Chris)
- Onion unwind for scratch page allocation failure (Chris)
- Poison stolen pages before use (Chris)
- Selftest improvements (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112163407.GA20320@jlahtine-mobl.ger.corp.intel.com
When we fail to take the module reference, we go to the 'undo*' branch and
return. But the returned variable 'ret' has been set as zero by the
above code. Change 'ret' to '-ENODEV' in this situation.
Fixes: 9bdb073464 ("drm/i915/gvt: Change KVMGT as self load module")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1605187352-51761-1-git-send-email-wangxiongfeng2@huawei.com
Move the specialised interactions with the physical GEM object from the
pread/pwrite ioctl handler into the phys backend.
Currently, if one is able to exhaust the entire aperture and then try to
pwrite into an object not backed by struct page, we accidentally invoked
the phys pwrite handler on a non-phys object; calamitous.
Fixes: c6790dc223 ("drm/i915: Wean off drm_pci_alloc/drm_pci_free")
Testcase: igt/gem_pwrite/exhaustion
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201105154934.16022-2-chris@chris-wilson.co.uk
(cherry picked from commit 852e1b3644)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As there are more and more complicated interactions between the different
backing stores and userspace, push the control into the backends rather
than accumulate them all inside the ioctl handlers.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105154934.16022-1-chris@chris-wilson.co.uk
(cherry picked from commit 0049b68845)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
No functional changes. This patch just moves some mode checks
around to prepare for adding bigjoiner related mode validation
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112023954.12301-1-manasi.d.navare@intel.com
Just following what we do in many other places, DG1 is a exception so
move it to the top instead of add it inside of INTEL_GEN(dev_priv) >= 12.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201111162408.98002-2-jose.souza@intel.com
DC9 has a separate HW flow from the rest of the DC states and it is
available in GEN9 LP platforms and on GEN11 and newer, so here
moving the assignment of the mask to a single conditional block to
simplifly code.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201111162408.98002-1-jose.souza@intel.com
Specification says the bit7 of the DPCD MAX_LANE_COUNT (offset 0x02) must
be set to 1 when comes to the displayport version 1.2. This patch respects
the definition.
W/o this patch, guest i915 driver can only set the resolution to 1024*768,
and complains about the unsuccessful link training:
[ 5.692193] i915 0000:00:02.0: [drm] *ERROR* index 0, lane_count 1 Link Training Unsuccessful
Fixes: e2e02cbb5b ("drm/i915/gvt: make dpcd_fix_data supports DP1.2")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200921065807.247847-1-tina.zhang@intel.com
Now that hpd/display related calls are split from the rest in
intel_irq_init(), skip all of that in case we don't have display.
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-8-lucas.demarchi@intel.com
!HAS_DISPLAY() implies !HAS_OVERLAY(), skipping overlay setup anyway, so
return earlier from intel_modeset_init() for clarity.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-4-lucas.demarchi@intel.com
Display is always disabled and enabled when resetting any engine, but if
there is no display it should not do anything with display and only
reset the needed engines.
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-3-lucas.demarchi@intel.com
Some media power gates are disabled by default. commit 5d86923060
("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
tried to enable it, but it duplicated an existent register.
So, the main PG setup sequences ended up overwriting it.
So, let's now merge this to the main PG setup sequence.
v2: (Chris): s/BIT/REG_BIT, remove useless comment,
remove useless =0, use the right gt,
remove rc6 sequence doubt from commit message.
Fixes: 5d86923060 ("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: stable@vger.kernel.org#v5.5+
Cc: Dale B Stimson <dale.b.stimson@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201111072859.1186070-1-rodrigo.vivi@intel.com
idr_init() uses base 0 which is an invalid identifier. The new function
idr_init_base allows IDR to set the ID lookup from base 1. This avoids
all lookups that otherwise starts from 0 since 0 is always unused.
References: commit 6ce711f275 ("idr: Make 1-based IDRs more efficient")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201104121532.GA48202@localhost
Reduce this maintenance nightmare a bit by converting the plane
min/max width/height stuff into vfuncs.
Now, if I could just think of a nice way to also use this for
intel_mode_valid_max_plane_size()...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924185113.30849-1-ville.syrjala@linux.intel.com
Reviewed-by: Aditya Swarup <aditya.swarup@intel.com>
We need commit f8f6ae5d07 ("mm: always have io_remap_pfn_range() set
pgprot_decrypted()") to be able to merge Jason's cleanup patch.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl+oiOgeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGKBQIAJw6oad/FA7j9OO2
dMoaXb8UaBehGWgW2rdfWrFPV5v0DBnp/GkdRpLoZIjV3W4mBfnog7bIa4Eswlxo
Y8sZxo5/3JlgJQUkHvzR1TYk5z61lHkUw9Kj/cCyx6YdbjSl19AfFsnhQVVMuyp9
TXL2c7hxkHlw8eBGrymVu0Ip7Zq0x8O2g+8nQpmRcvaR6SBuSHdikDF/iWCtU1YW
wpk5eWEVaAO67keZOz6b+aCFHqjFX+1dUBBuPnslucYLR73Qi16hfaU9pebe97Gb
lX/MJ1bR9BeRp314cF0PYbm4WhKjRLudHOFJH8x3dj/BiYNrFK3SJGLiiTwsrAZ8
kytU0Xs=
=Ke/D
-----END PGP SIGNATURE-----
Merge v5.10-rc3 into drm-next
We need commit f8f6ae5d07 ("mm: always have io_remap_pfn_range() set
pgprot_decrypted()") to be able to merge Jason's cleanup patch.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some disply regs are not setup correctly during HPD for BXT/APL thus
vfio_edid still not working. Temporarily disable the vfio_edid dynamic
update until issue fixed.
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201109073939.758302-1-colin.xu@intel.com
Program display related vregs to proper value at initialization, setup
virtual monitor and hotplug.
vGPU virtual display vregs inherit the value from pregs. The virtual DP
monitor is always setup on PORT_B for BXT/APL. However the host may
connect monitor on other PORT or without any monitor connected. Without
properly setup PIPE/DDI/PLL related vregs, guest driver may not setup
the virutal display as expected, and the guest desktop may not be
created.
Since only one virtual display is supported, enable PIPE_A only. And
enable transcoder/DDI/PLL based on which port is setup for BXT/APL.
V2:
Revise commit message.
V3:
set_edid should on PORT_B for BXT.
Inject hpd event for BXT.
V4:
Temporarily disable vfio edid on BXT/APL until issue fixed.
V5:
Rebase to use new HPD define GEN8_DE_PORT_HOTPLUG for BXT.
Put vfio edid disabling on BXT/APL to a separate patch.
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201109073922.757759-1-colin.xu@intel.com
This patch add gvt resume wrapper into i915_drm_resume().
GVT relies on i915 so resume gvt at last.
V2:
- Direct call into gvt suspend/resume wrapper in intel_gvt.h/intel_gvt.c.
The wrapper and implementation will check and call gvt routine. (zhenyu)
V3:
Refresh.
V4:
Rebase.
V5:
Fail intel_gvt_suspend() if fail to save GGTT.
V6:
Save host entry to per-vGPU gtt.ggtt_mm on each host_entry update so
only need the resume routine.
V7:
Refresh.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201027045406.159566-1-colin.xu@intel.com
This patch save/restore necessary GVT info during i915 suspend/resume so
that GVT enabled QEMU VM can continue running.
Only GGTT and fence regs are saved/restored now. GVT will save GGTT
entries on each host_entry update, restore the saved dirty entries
and re-init fence regs in resume routine.
V2:
- Change kzalloc/kfree to vzalloc/vfree since the space allocated
from kmalloc may not enough for all saved GGTT entries.
- Keep gvt suspend/resume wrapper in intel_gvt.h/intel_gvt.c and
move the actual implementation to gvt.h/gvt.c. (zhenyu)
- Check gvt config on and active with intel_gvt_active(). (zhenyu)
V3: (zhenyu)
- Incorrect copy length. Should be num entries * entry size.
- Use memcpy_toio()/memcpy_fromio() instead of memcpy for iomem.
- Add F_PM_SAVE flags to indicate which MMIOs to save/restore for PM.
V4:
Rebase.
V5:
Fail intel_gvt_save_ggtt as -ENOMEM if fail to alloc memory to save
ggtt. Free allocated ggtt_entries on failure.
V6:
Save host entry to per-vGPU gtt.ggtt_mm on each host_entry update.
V7:
Restore GGTT entry based on present bit.
Split fence restore and mmio restore in different functions.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201027045308.158955-1-colin.xu@intel.com
JSL has update in vswing table for eDP.
BSpec: 21257
Changes since V1:
- Fixed few checkpatch errors
Cc: Souza Jose <jose.souza@intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020053657.99890-1-tejaskumarx.surendrakumar.upadhyay@intel.com
DG1 uses 2 registers for the ddi clock mapping, with PHY A and B using
DPCLKA_CFGCR0 and PHY C and D using DPCLKA1_CFGCR0. Hide this behind a
single macro that chooses the correct register according to the phy
being accessed, use the correct bitfields for each pll/phy and implement
separate functions for DG1 since it doesn't share much with ICL/TGL
anymore.
The previous values were correct for PHY A and B since they were using
the same register as before and the bitfields were matching.
v2: Add comment and try to simplify DG1_DPCLKA* macros by reusing
previous ones
v3:
- Fix DG1_DPCLKA_CFGCR0_DDI_CLK_SEL_MASK() after wrong macro reuse
- Move phy -> id map to a separate macro (Aditya)
- Remove DG1_DPCLKA_CFGCR0_DDI_CLK_SEL_MASK where not required
(Aditya)
- Use drm_WARN_ON
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Aditya Swarup <aditya.swarup@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106210006.837953-1-lucas.demarchi@intel.com
When performing an allocation we try split it down into the largest
possible power-of-two blocks/pages-sizes, and for the common case we
expect to allocate the blocks in descending order. This also naturally
fits with our GTT alignment tricks(including the hugepages selftest),
where we sometimes try to align to the largest possible GTT page-size
for the allocation, in the hope that translates to bigger GTT
page-sizes. Currently, we seem to incorrectly add the blocks in the
opposite order, which is definitely not the intended behaviour.
Reported-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201109111249.109365-1-matthew.auld@intel.com
Instead of printing out the internal engine mask, which can change between
kernel versions making it difficult to map to actual engines, present a
bitmask of hanging engines ABI classes. For example:
[drm] GPU HANG: ecode 9:8:24dffffd, in gem_exec_schedu [1334]
Engine ABI class is useful to quickly categorize render vs media etc hangs
in bug reports. Considering virtual engine even more so than the current
scheme.
v2:
* Do not re-order fields. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105113842.1395391-1-tvrtko.ursulin@linux.intel.com
Between events which trigger engine and GPU resets and capturing the error
state we lose information on which engine triggered the reset. Improve
this by passing in the hung engine mask down to error capture.
Result is that the list of engines in user visible "GPU HANG: ecode
<gen>:<engines>:<ecode>, <process>" is now a list of hanging and not just
active engines. Most importantly the displayed process is now the one
which was actually hung.
v2:
* Stub prototype. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104134743.916027-1-tvrtko.ursulin@linux.intel.com
SFC capability of video engines is not set correctly because i915
is testing for incorrect bits.
Fixes: c5d3e39caa ("drm/i915: Engine discovery query")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniele.ceraolospurio@intel.com
Only the following drivers aren't converted:
- amdgpu, because of the driver_feature mangling due to virt support.
Subsequent patch will address this.
- nouveau, because DRIVER_ATOMIC uapi is still not the default on the
platforms where it's supported (i.e. again driver_feature mangling)
- vc4, again because of driver_feature mangling
- qxl, because the ioctl table is somewhere else and moving that is
maybe a bit too much, hence the num_ioctls assignment prevents a
const driver structure.
- arcpgu, because that is stuck behind a pending tiny-fication series
from me.
- legacy drivers, because legacy requires non-const drm_driver.
Note that for armada I also went ahead and made the ioctl array const.
Only cc'ing the driver people who've not been converted (everyone else
is way too much).
v2: Fix one misplaced const static, should be static const (0day)
v3:
- Improve commit message (Sam)
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: kernel test robot <lkp@intel.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104100425.1922351-5-daniel.vetter@ffwll.ch
UAPI Changes:
Cross-subsystem Changes:
- arch/arm64: Describe G12b GPU as coherent
- iommu: Support coherency for Mali LPAE
Core Changes:
- atomic: Pass full state to CRTC atomic_{check, begin, flush}(); Use
atomic-state pointers
- drm: Remove SCATTER_LIST_MAX_SEGMENT; Cleanups
- doc: Document legacy_cursor_update better; cleanups
- edid: Don't warn n EDIDs of zero
- ttm: New backend allocation pool; Remove old page allocator; Rework
no_retry handling; Replace flags with booleans in struct ttm_operation_ctx
- vram-helper: Cleanups
- fbdev: Cleanups
- console: Store font size as unsigned value
Driver Changes:
- ast: Support new display mode
- amdgpu: Switch to new TTM allocator
- hisilicon: Cleanups
- nouveau: Switch to new TTM allocator; Fix include of swiotbl.h and
limits.h; Use state helper instead of CRTC state pointer
- panfrost: Support cache-coherent integrations; Fix mutex corruption on
open/close; Cleanupse
- qxl: Cleanups
- radeon: Switch to new TTM allocator
- ticdc: Fix build failure
- vmwgfx: Switch to new TTM allocator
- xlnx: Use dma_request_chan
- fbdev/sh_mobile: Cleanups
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAl+j0DcACgkQaA3BHVML
eiNdAQf/fadZLQeKHuUYTlvyCkjxcWnpT2gXpcX8pq/7AMv1dW2+R7ZEs1SfuFD7
/4LPd9+prFgYBB5T7GAoUN46Ni6fO3/hfP7KfQNW3o1GC4a0+kvgbgEp+bzRQotX
6AAH0nNA0ADYQWSWpouOEJOQVrAtLFe7eQcieMna8/SYMvKDhxch5sxP5HYONbsU
eF9wPzGVlcVFf2wdow2lllp3pKU40etxyNhdJuwbuqBtudJZgr7yWFKC4XoW0V2w
kWmbY5nZfAIPgb9DwD9FhWnhOd2jVVpeea1ZUjhCAqt1E8T7szzJI1BTl4vs8npT
O8KIG+RAI/l5WPrQ1AbMjbknIQTQVw==
=mzXQ
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-11-05' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.11:
UAPI Changes:
Cross-subsystem Changes:
- arch/arm64: Describe G12b GPU as coherent
- iommu: Support coherency for Mali LPAE
Core Changes:
- atomic: Pass full state to CRTC atomic_{check, begin, flush}(); Use
atomic-state pointers
- drm: Remove SCATTER_LIST_MAX_SEGMENT; Cleanups
- doc: Document legacy_cursor_update better; cleanups
- edid: Don't warn n EDIDs of zero
- ttm: New backend allocation pool; Remove old page allocator; Rework
no_retry handling; Replace flags with booleans in struct ttm_operation_ctx
- vram-helper: Cleanups
- fbdev: Cleanups
- console: Store font size as unsigned value
Driver Changes:
- ast: Support new display mode
- amdgpu: Switch to new TTM allocator
- hisilicon: Cleanups
- nouveau: Switch to new TTM allocator; Fix include of swiotbl.h and
limits.h; Use state helper instead of CRTC state pointer
- panfrost: Support cache-coherent integrations; Fix mutex corruption on
open/close; Cleanupse
- qxl: Cleanups
- radeon: Switch to new TTM allocator
- ticdc: Fix build failure
- vmwgfx: Switch to new TTM allocator
- xlnx: Use dma_request_chan
- fbdev/sh_mobile: Cleanups
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105101641.GA13099@linux-uq9g
Move the specialised interactions with the physical GEM object from the
pread/pwrite ioctl handler into the phys backend.
Currently, if one is able to exhaust the entire aperture and then try to
pwrite into an object not backed by struct page, we accidentally invoked
the phys pwrite handler on a non-phys object; calamitous.
Fixes: c6790dc223 ("drm/i915: Wean off drm_pci_alloc/drm_pci_free")
Testcase: igt/gem_pwrite/exhaustion
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201105154934.16022-2-chris@chris-wilson.co.uk
As there are more and more complicated interactions between the different
backing stores and userspace, push the control into the backends rather
than accumulate them all inside the ioctl handlers.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105154934.16022-1-chris@chris-wilson.co.uk
As per W/A implemented for TGL to program half of the nominal
DCO divider fraction value which is also applicable on EHL.
Changes since V2:
- Apply stepping B0 till FOREVER
- B0 - revid update as per Bspec 29153
Changes since V1:
- ehl_ used as to keep earliest platform prefix
- WA required B0 stepping onwards
Cc: Deak Imre <imre.deak@intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104050655.171185-1-tejaskumarx.surendrakumar.upadhyay@intel.com
The kernel test robot is not happy with that.
This reverts commit 2b5b95b1ff.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/394773/
Replace the previous approach to force compute the initial PSR state
after i915 take over from firmware by the better and recently added
initial_fastset_check() hook.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102221048.104294-1-jose.souza@intel.com
Add the new vma_set_file() function to allow changing
vma->vm_file with the necessary refcount dance.
v2: add more users of this.
v3: add missing EXPORT_SYMBOL, rebase on mmap cleanup,
add comments why we drop the reference on two occasions.
v4: make it clear that changing an anonymous vma is illegal.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/394773/
Fix a typo that led to some MST short pulse event handling issue (the
short pulse event was handled for both encoder instances, each having
its own state).
Fixes: 1d8ca00245 ("drm/i915: Add PORT_TCn aliases to enum port")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104010000.4165574-1-imre.deak@intel.com
Since __vma_release is run by a kworker after the fence has been
signaled, it is no longer protected by the active reference on the vma,
and so the alias of vw->pinned to vma->obj is also not protected by a
reference on the object. Add an explicit reference for vw->pinned so it
will always be safe.
Found by inspection.
Fixes: 54d7195f8c ("drm/i915: Unpin vma->obj on early error")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102161931.30031-1-chris@chris-wilson.co.uk
(cherry picked from commit bc73e5d330)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In a simple test case that writes to scratch and then busy-waits for the
batch to be signaled, we observe that the signal is before the write is
posted. That is bad news.
Splitting the flush + write_dword into two separate flush_dw prevents
the issue from being reproduced, we can presume the post-sync op is not
so post-sync.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/216
Testcase: igt/gem_exec_fence/parallel
Testcase: igt/i915_selftest/live/gt_timelines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102221057.29626-2-chris@chris-wilson.co.uk
(cherry picked from commit 09212e81e5)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add another lower level to emit_ggtt_write so that the GGTT nature of
the write is not hardcoded into the emitter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102221057.29626-1-chris@chris-wilson.co.uk
(cherry picked from commit 2739d8cfc5)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We wrap the timeline on construction of the next request, but there may
still be requests in flight that have not yet finalized the breadcrumb.
(The breadcrumb is delayed as we need engine-local offsets, and for the
virtual engine that is not known until execution.) As such, by the time
we write to the timeline's HWSP offset it may have changed, and we
should use the value we preserved in the request instead.
Though the window is small and infrequent (at full flow we can expect a
timeline's seqno to wrap once every 30 minutes), the impact of writing
the old seqno into the new HWSP is severe: the old requests are never
completed, and the new requests are completed before they are even
submitted.
Fixes: ebece75392 ("drm/i915: Keep timeline HWSP allocated until idle across the system")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022064127.10159-1-chris@chris-wilson.co.uk
(cherry picked from commit c10f6019d0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Avoid skipping what appears to be a no-op set-domain-ioctl if the cache
coherency state is inconsistent with our target domain. This also has
the utility of using the population of the pages to validate the backing
store.
The danger in skipping the first set-domain is leaving the cache
inconsistent and submitting stale data, or worse leaving the clean data
in the cache and not flushing it to the GPU. The impact should be small
as it requires a no-op set-domain as the very first ioctl in a
particular sequence not found in typical userspace.
Reported-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Fixes: 754a254427 ("drm/i915: Skip object locking around a no-op set-domain ioctl")
Testcase: igt/gem_mmap_offset/blt-coherency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019203825.10966-1-chris@chris-wilson.co.uk
(cherry picked from commit 44c2200afc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The initial breadcrumb marks the transition from context wait and setup
into the request payload. We use the marker to determine if the request
is merely waiting to begin, or is inside the payload and hung.
Forgetting to include a breadcrumb before the user payload would mean we
do not reset the guilty user request, and conversely if the initial
breadcrumb is too early we blame the user for a problem elsewhere.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007090947.19950-1-chris@chris-wilson.co.uk
Since __vma_release is run by a kworker after the fence has been
signaled, it is no longer protected by the active reference on the vma,
and so the alias of vw->pinned to vma->obj is also not protected by a
reference on the object. Add an explicit reference for vw->pinned so it
will always be safe.
Found by inspection.
Fixes: 54d7195f8c ("drm/i915: Unpin vma->obj on early error")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102161931.30031-1-chris@chris-wilson.co.uk
In a simple test case that writes to scratch and then busy-waits for the
batch to be signaled, we observe that the signal is before the write is
posted. That is bad news.
Splitting the flush + write_dword into two separate flush_dw prevents
the issue from being reproduced, we can presume the post-sync op is not
so post-sync.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/216
Testcase: igt/gem_exec_fence/parallel
Testcase: igt/i915_selftest/live/gt_timelines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102221057.29626-2-chris@chris-wilson.co.uk
Since commit 9a40401cfa ("lib/scatterlist: Do not limit max_segment to
PAGE_ALIGNED values") the max_segment input to sg_alloc_table_from_pages()
does not have to be any special value. The new algorithm will always
create something less than what the user provides. Thus eliminate this
confusing constant.
- vmwgfx should use the HW capability, not mix in the OS page size for
calling dma_set_max_seg_size()
- i915 uses i915_sg_segment_size() both for sg_alloc_table_from_pages
and for some open coded sgl construction. This doesn't change the value
since rounddown(size, UINT_MAX) == SCATTERLIST_MAX_SEGMENT
- drm_prime_pages_to_sg uses it as a default if max_segment is zero,
UINT_MAX is fine to use directly.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Qian Cai <cai@lca.pw>
Cc: "Ursulin, Tvrtko" <tvrtko.ursulin@intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v1-44733fccd781+13d-rm_scatterlist_max_jgg@nvidia.com
If the VBT assigned tc->legacy_port mismatches the live_status indicator
for the connector, we ignore the VBT directive and switch over to the HW
setting. This is not a driver error, unless we happen to misparse the
VBT or the live_status registers. However, for the system in CI where
the error is only reported on 1 port out of 4, the evidence indicates
the VBT is wrong. Stop flaging this as an error since the cause is
beyond our control, fixup the mistake and continue on.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201030153209.14808-1-chris@chris-wilson.co.uk
ibx_irq_pre_postinstall() looks totally pointless. We can just
init both SDEIMR and SDEIER at the same time before enabling the
master interrupt. It's equally racy as the other order due
to doing all of this from the postinstall stage with the interrupt
handler already in place. That is, safe with MSI but racy with
shared legacy interrupts. Fortunately we should have MSI on all ilk+.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-20-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Let's enable the hardware hpd logic only for the ports we
can actually use.
In theory this may save some miniscule amounts of power,
and more importantly it eliminates a lot if platform specific
codepaths since the generic thing can now deal with any
combination of ports being present on each SKU.
v2: Deal with DG1
v3: Deal with DG1 some more
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-18-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
We no longer unmask all HPD irqs, so we can drop the ugly per-platform
HPD IIR masking. IMR will prevent unsupported bits from appearing in
IIR.
v2: Deal with DG1
Include "HOTPLUG" in the mask names (Lucas)
v3: Fix typos in subject
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-17-ville.syrjala@linux.intel.com
No reason that I can see why we should enable the hpd detection logic
already during irq postinstall phase. We don't even do this on all
the platforms. We just need it before we actually enable the hotplug
interrupts in .hpd_irq_setup(), and in fact we already do it there as
well. Let's just eliminate the redundant early setup.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-15-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Use hpd_pin to parametrize BXT_DE_PORT_HP_DDI() to make it clear
these have nothing to do with DDI ports or PHYs as such. The only
thing that matters is the HPD pin assignment.
v2: Remember the gvt
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-8-ville.syrjala@linux.intel.com
Let's make the AUX CH names match the spec (AUX A-F for pre-tgl,
AUX A-C or AUX USBC1-6 for tgl+). And while at it let's include
the full encoder name in the AUX CH name as well (as opposed to
just using port_name() which wouldn't give us the right thing on
tgl+).
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-6-ville.syrjala@linux.intel.com
Let's pimp the DDI encoder->name to reflect what the spec calls them.
Ie. on pre-tgl DDI A-F, on tgl+ DDI A-C or DDI TC1-6.
Also since each encoder is really a combination of the DDI and the PHY
we include the PHY name as well.
ICL is a bit special since it already has the two different types
of DDIs (combo or TC) but it still calls them just DDI A-F regarless
of the type. For that let's add an extra "(TC)" note to remind
is which type of DDI it really is.
The code is darn ugly, but not sure there's much we can do about it.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028213323.5423-4-ville.syrjala@linux.intel.com
- Remove dup mmio handler for BXT/APL. Otherwise mmio handler will fail
to init.
- Add engine GPR with F_CMD_ACCESS since BXT/APL will load them via
LRI. Otherwise, guest will enter failsafe mode.
V2:
Use RCS/BCS GPR macros instead of offset.
Revise commit message.
V3:
Use GEN8_RING_CS_GPR macros on ring base.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201016052913.209248-1-colin.xu@intel.com
One issue exposed after below commit with which the system will freeze
at suspend after vGPU is created (no need to activate the vGPU).
commit e6ba764802 ("drm/i915: Remove i915->kernel_context")
Old implementation pin the intel_context at setup_submission and
unpin it at clean_submission. So after some vGPU is created, the
intel_context is always pinned there although no workload using it.
It will then block i915 enter suspend state.
There is no need to pin it all the time. Pin/unpin it around workload
lifecycle is more reasonable. After GVT enabled suspend/resume, the
pinned intel_context will also get unpined when userspace put VM process
into suspend state since all workloads are retired, then it's safe to
unpin all intel_context for workloads created. So move the pin/unpin to
create_workload and destroy_workload, while still keep the
create/destroy in old place.
V2:
Rebase.
Fixes: e6ba764802 ("drm/i915: Remove i915->kernel_context")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201016054059.238371-1-colin.xu@intel.com
Restore RPS for ILK-M. We lost it when an extra HAS_RPS()
check appeared in intel_rps_enable().
Unfortunaltey this just makes the performance worse on my
ILK because intel_ips insists on limiting the GPU freq to
the minimum. If we don't do the RPS init then intel_ips will
not limit the frequency for whatever reason. Either it can't
get at some required information and thus makes wrong decisions,
or we mess up some weights/etc. and cause it to make the wrong
decisions when RPS init has been done, or the entire thing is
just wrong. Would require a bunch of reverse engineering to
figure out what's going on.
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 9c878557b1 ("drm/i915/gt: Use the RPM config register to determine clk frequencies")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 2bf06370bc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We are incorrectly limiting the max allocation size as per the mm
max_order, which is effectively the largest power-of-two that we can fit
in the region size. However, it's normal to setup the region or
allocator with a non-power-of-two size(for example 3G), which we should
already handle correctly, except it seems for the early too-big-check.
v2: make sure we also exercise the I915_BO_ALLOC_CONTIGUOUS path, which
is quite different, since for that we are actually limited by the
largest power-of-two that we can fit within the region size. (Chris)
Fixes: b908be543e ("drm/i915: support creating LMEM objects")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021103606.241395-1-matthew.auld@intel.com
(cherry picked from commit 83ebef47f8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This patch fixes below warnings reported by coccicheck
./drivers/gpu/drm/i915/i915_debugfs.c:789:5-8: Unneeded variable: "ret". Return "0" on line 1012
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1603937925-53176-1-git-send-email-zou_wei@huawei.com
Clear out some pointers when objects have been de-allocated. This
makes it much easier to track down use-after-free type issues.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-4-John.C.Harrison@Intel.com
Rather than just saying 'GuC failed to load: -110', actually print out
the GuC status register and break it down into the individual fields.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-3-John.C.Harrison@Intel.com
The latest GuC firmware includes a number of interface changes that
require driver updates to match.
* Starting from Gen11, the ID to be provided to GuC needs to contain
the engine class in bits [0..2] and the instance in bits [3..6].
NOTE: this patch breaks pointer dereferences in some existing GuC
functions that use the guc_id to dereference arrays but these functions
are not used for now as we have GuC submission disabled and we will
update these functions in follow up patch which requires new IDs.
* The new GuC requires the additional data structure (ADS) and associated
'private_data' pointer to be setup. This is basically a scratch area
of memory that the GuC owns. The size is read from the CSS header.
* There is now a physical to logical engine mapping table in the ADS
which needs to be configured in order for the firmware to load. For
now, the table is initialised with a 1 to 1 mapping.
* GUC_CTL_CTXINFO has been removed from the initialization params.
* reg_state_buffer is maintained internally by the GuC as part of
the private data.
* The ADS layout has changed significantly. This patch updates the
shared structure and also adds better documentation of the layout.
* While i915 does not use GuC doorbells, the firmware now requires
that some initialisation is done.
* The number of engine classes and instances supported in the ADS has
been increased.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-2-John.C.Harrison@Intel.com
On bxt, we require a VT'd w/a to serialise all GGTT updates with memory
transfers, and use stop_machine() for this purpose. stop_machine() is a
global serialisation barrier and so dangerous to use from within
critical sections, as the stop_machine() will wait for all cpus to enter
the stop_machine callback, and those cpus may be waiting for the
critical section already held.
Fixes: d7085b0faa ("drm/i915/gem: Poison stolen pages before use")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027184759.29888-1-chris@chris-wilson.co.uk
First check in the function is if swsci() is supported. All the error
paths are easy to figure out the reason, so remove the extra debug
message: it's normal not to support swsci() e.g. in dgfx.
v2: Rather than special case dgfx, just remove the debug message
(from Ville)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027044618.719064-2-lucas.demarchi@intel.com
Do not create the display debugfs files when we don't have display.
Based on previous patch by José Souza.
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027044618.719064-1-lucas.demarchi@intel.com
HPD pins are inverted for DG1 platform.
Bspec: 49956
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Clinton A Taylor <clinton.a.taylor@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021082034.3170478-3-lucas.demarchi@intel.com
DG1 has one more combo phy port, no TC and all irq handling goes through
SDE, like for MCC.
v2: Also change intel_hpd_pin_default() to include DG1 mapping
v3, v4: Rebase on hpd refactor
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021082034.3170478-2-lucas.demarchi@intel.com
Writes to CURSURFLIVE in TGL are causing IOMMU errors and visual
glitches that are often reproduced when executing CPU intensive
workloads while a eDP 4K panel is attached.
Manually exiting PSR causes the frontbuffer to be updated without
glitches and the IOMMU errors are also gone but this comes at the cost
of less time with PSR active.
So using this workaround until this issue is root caused and a better
fix is found.
The current code is already ready to enable PSR after this exit if
there is not other frontbuffer modifications.
Adding a new if block in psr_force_hw_tracking_exit() instead of reuse
the else/gen8- block because the plan is to revert this workaround
as soon as a better solution is found.
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002231627.24528-1-jose.souza@intel.com
fbcon/fonts:
- Two patches to prevent OOB access
ttm:
- fix for evicition value range check
amdgpu:
- Sienna Cichlid fixes
- MST manager resource leak fix
- GPU reset fix
amdkfd:
- Luxmark fix for Navi1x
i915:
- Tweak initial DPCD backlight.enabled value (Sean)
- Initialize reserved MOCS indices (Ayaz)
- Mark initial fb obj as WT on eLLC machines to avoid rcu lockup (Ville)
- Support parsing of oversize batches (Chris)
- Delay execlists processing for TGL (Chris)
- Use the active reference on the vma during error capture (Chris)
- Widen CSB pointer (Chris)
- Wait for CSB entries on TGL (Chris)
- Fix unwind for scratch page allocation (Chris)
- Exclude low patches of stolen memory (Chris)
- Force VT'd workarounds when running as a guest OS (Chris)
- Drop runtime-pm assert from vpgu io accessors (Chris)
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJfkhyvAAoJEAx081l5xIa+3qIQAKjvgIvuoLix+bl/gkOHwymv
QHTXVA3Gi4Rup778L3k1GNppQX4/LE42Of9wnsZIELxG0vxKl59JGzlWJVld/rtJ
ZXfoT5kh7Hh7kkiXnC2s11aqQSsgr25lfsY8gOWSPIGfwOn07JSMTkqPWlbwrOz2
il6qlOlgfSNfXwn2NxSTzGxZMrgOyUvKNZczRXU9gSuuoTsEo4bAvS7/vEN4hazX
DFQAZfd82PfcdAIkVzk/gOoaCQ6a9YgjOzg1RQ4gKhrj8UaWu4gUyJwWPjSMnLrh
uP2RM3gU54MhVF3jHt+D0Trv2ti2zStD9wc4AEJwOVQZtcDSsgGduOdUs5Xc/1l5
dBOvpumBmNxsYbVvvThijNeSx6Y5ybzI3iUp8SLDftiRZtwsXds2aRaskuVgMqj4
MSBdOiZqoJudLjwCBHWKe328+r00X+f14Vi30Y0cy4VW59NxLG5D7qjc5BGqJw1q
J3FQ/9uDh0lWbeKT/grapjP43IWcLApykZa3Rn6p2w0mW2+8Wht/WbrFYyNYGlzf
aNS9RnknTaMYWpvZUZLVG83dJpn6Y9ooHa9L/blMzfCxpF6ftEYf+Iq2x8s0gprz
tIq0xsGvBacsnQIOWRuHjuF87zibVbDp9ba+x78F/woyUqEhip+lBXaPofp132gQ
HtIdQewvG9KcLZcoluO4
=rAUM
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-10-23' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
"This should be the last round of things for rc1, a bunch of i915
fixes, some amdgpu, more font OOB fixes and one ttm fix just found
reading code:
fbcon/fonts:
- Two patches to prevent OOB access
ttm:
- fix for evicition value range check
amdgpu:
- Sienna Cichlid fixes
- MST manager resource leak fix
- GPU reset fix
amdkfd:
- Luxmark fix for Navi1x
i915:
- Tweak initial DPCD backlight.enabled value (Sean)
- Initialize reserved MOCS indices (Ayaz)
- Mark initial fb obj as WT on eLLC machines to avoid rcu lockup (Ville)
- Support parsing of oversize batches (Chris)
- Delay execlists processing for TGL (Chris)
- Use the active reference on the vma during error capture (Chris)
- Widen CSB pointer (Chris)
- Wait for CSB entries on TGL (Chris)
- Fix unwind for scratch page allocation (Chris)
- Exclude low patches of stolen memory (Chris)
- Force VT'd workarounds when running as a guest OS (Chris)
- Drop runtime-pm assert from vpgu io accessors (Chris)"
* tag 'drm-next-2020-10-23' of git://anongit.freedesktop.org/drm/drm: (31 commits)
drm/amdgpu: correct the cu and rb info for sienna cichlid
drm/amd/pm: remove the average clock value in sysfs
drm/amd/pm: fix pp_dpm_fclk
Revert drm/amdgpu: disable sienna chichlid UMC RAS
drm/amd/pm: fix pcie information for sienna cichlid
drm/amdkfd: Use same SQ prefetch setting as amdgpu
drm/amd/swsmu: correct wrong feature bit mapping
drm/amd/psp: Fix sysfs: cannot create duplicate filename
drm/amd/display: Avoid MST manager resource leak.
drm/amd/display: Revert "drm/amd/display: Fix a list corruption"
drm/amdgpu: update golden setting for sienna_cichlid
drm/amd/swsmu: add missing feature map for sienna_cichlid
drm/amdgpu: correct the gpu reset handling for job != NULL case
drm/amdgpu: add rlc iram and dram firmware support
drm/amdgpu: add function to program pbb mode for sienna cichlid
drm/i915: Drop runtime-pm assert from vgpu io accessors
drm/i915: Force VT'd workarounds when running as a guest OS
drm/i915: Exclude low pages (128KiB) of stolen from use
drm/i915/gt: Onion unwind for scratch page allocation failure
drm/ttm: fix eviction valuable range check.
...
intel_timeline_read_hwsp() is used to support semaphore waits between
engines, that may themselves be deferred for arbitrary periods -- that
is the read of the target request's HWSP is at an indeterminant point in
the future. To support this, we need to prevent overwriting a HWSP that
is being watched across a seqno wrap (otherwise the next request will
write its value into the old HWSP preventing the watcher from making
progress, ad infinitum.) To simulate the observer across a wrap, let's
create a request that reads from the HWSP and dispatch it at different
points around a wrap to see if the value is lost.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021220411.5777-2-chris@chris-wilson.co.uk
We wrap the timeline on construction of the next request, but there may
still be requests in flight that have not yet finalized the breadcrumb.
(The breadcrumb is delayed as we need engine-local offsets, and for the
virtual engine that is not known until execution.) As such, by the time
we write to the timeline's HWSP offset it may have changed, and we
should use the value we preserved in the request instead.
Though the window is small and infrequent (at full flow we can expect a
timeline's seqno to wrap once every 30 minutes), the impact of writing
the old seqno into the new HWSP is severe: the old requests are never
completed, and the new requests are completed before they are even
submitted.
Fixes: ebece75392 ("drm/i915: Keep timeline HWSP allocated until idle across the system")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022064127.10159-1-chris@chris-wilson.co.uk
Since Ironlake uses intel_ips.ko for its dynamic frequency adjustment,
we do not have direct control over the frequency management so such
tests are defunct. Similarly, we can't check the gen6+ RPS registers on
Ironlake.
Hopefully this catches all the invalid tests now that Ironlake has
rejoined the dynamic GPU frequency club. There is an opportunity for the
reader to add tests to exercise MEMINTRSTS and co.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022210814.23004-1-chris@chris-wilson.co.uk
- Initialize reserved MOCS indices (Ayaz)
- Mark initial fb obj as WT on eLLC machines to avoid rcu lockup (Ville)
- Support parsing of oversize batches (Chris)
- Delay execlists processing for TGL (Chris)
- Use the active reference on the vma during error capture (Chris)
- Widen CSB pointer (Chris)
- Wait for CSB entries on TGL (Chris)
- Fix unwind for scratch page allocation (Chris)
- Exclude low patches of stolen memory (Chris)
- Force VT'd workarounds when running as a guest OS (Chris)
- Drop runtime-pm assert from vpgu io accessors (Chris)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAl+R8a4ACgkQ+mJfZA7r
E8oazwgAu+NP5GlIHTcG6bV4vxRaEDdlDf3uH+ENOGfRT0PoX6qVNpkdc/nmTZC/
RFzHVsKJi1zXS9XetP33YKZ7KFPZxJMXdhGSu+f08Q6nf0jBdFUm3mhnrDWoLO9+
fS88L4HqXREvBLDctN89akuZ1/pZ3p0KsBqAOuP+LjRAsquHgIP3eUzwZMVbTgTZ
KLojKMHHrZxRtynMFrW6JO+gBok+mGydpOp7F1bK4jSmOIlYA5OsIW3xn9+Ytveu
9Od/yIjKeCyPD64ncz8wcbcPxZhnXENzZElBYT6sQSHDZ9wNCOUhfJT0v8t8G5So
vzOvf2YXa+2IKiGY2sd2ov9svz3Wgg==
=oidT
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-fixes-2020-10-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Tweak initia DPCD backlight.enabled value (Sean)
- Initialize reserved MOCS indices (Ayaz)
- Mark initial fb obj as WT on eLLC machines to avoid rcu lockup (Ville)
- Support parsing of oversize batches (Chris)
- Delay execlists processing for TGL (Chris)
- Use the active reference on the vma during error capture (Chris)
- Widen CSB pointer (Chris)
- Wait for CSB entries on TGL (Chris)
- Fix unwind for scratch page allocation (Chris)
- Exclude low patches of stolen memory (Chris)
- Force VT'd workarounds when running as a guest OS (Chris)
- Drop runtime-pm assert from vpgu io accessors (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022205613.GA3469192@intel.com
As we disable the interrupt during suspend, also reset the irq_mask to
short-circuit subsystems that later try to turn off their interrupt
source.
<4>[ 101.816730] i915 0000:00:02.0: drm_WARN_ON(!intel_irqs_enabled(dev_priv))
<4>[ 101.816853] WARNING: CPU: 3 PID: 4241 at drivers/gpu/drm/i915/i915_irq.c:343 ilk_update_display_irq+0xb3/0x130 [i915]
v2: Reset irq_mask for i8xx_irq_reset as well, and split patch to focus
on only i915->irq_mask
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022114246.28566-1-chris@chris-wilson.co.uk
Since we keep a driver global mask of online CPUs and base the decision
whether PMU needs to be migrated upon it, we need to make sure the
migration is done for all registered PMUs (so GPUs).
To do this we need to track the current CPU for each PMU and base the
decision on whether to migrate on a comparison between global and local
state.
At the same time, since dynamic CPU hotplug notification slots are a
scarce resource and given how we already register the multi instance type
state, we can and should add multiple instance of the i915 PMU to this
same state and not allocate a new one for every GPU.
v2:
* Use pr_notice. (Chris)
v3:
* Handle a nasty interaction where unregistration which triggers a false
CPU offline event. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Daniel Vetter <daniel.vetter@intel.com> # dynamic slot optimisation
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161144.678668-1-tvrtko.ursulin@linux.intel.com
Mark the device as closed and keep references to driver data alive to
allow for safe driver unbind with active PMU clients. Perf core does not
otherwise handle this case so we have to do it manually like this.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020100822.543332-1-tvrtko.ursulin@linux.intel.com
Avoid skipping what appears to be a no-op set-domain-ioctl if the cache
coherency state is inconsistent with our target domain. This also has
the utility of using the population of the pages to validate the backing
store.
The danger in skipping the first set-domain is leaving the cache
inconsistent and submitting stale data, or worse leaving the clean data
in the cache and not flushing it to the GPU. The impact should be small
as it requires a no-op set-domain as the very first ioctl in a
particular sequence not found in typical userspace.
Reported-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Fixes: 754a254427 ("drm/i915: Skip object locking around a no-op set-domain ioctl")
Testcase: igt/gem_mmap_offset/blt-coherency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019203825.10966-1-chris@chris-wilson.co.uk
The block comment for cnl_program_nearest_filter_coefs() has a wonderful
diagram, but although it is marked up as kerneldoc does not use the
markup for providing the function definition.
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'dev_priv' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'pipe' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'id' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'set' not described in 'cnl_program_nearest_filter_coefs'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021185649.17759-1-chris@chris-wilson.co.uk
Let's unmask the PCU event irq _after_ we've set up the
hardware and software to deal with the fallout. We can
also drop the PCU event bit from DEIER except when we
need it for rps.
And on the disable side we replace the hand rolled (and
unlocked) DEIER/IIR/IMR frobbing with ilk_disable_display_irq().
Ocd does require me to reorder it to be symmetric with
the enable path however.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-5-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
There is no GEN6_RPSTAT1 on ILK. Instead of reading that let's
try to get the same information from MEMSTAT_ILK. At least it
seems to track MEMSWCTL frequency request perfectly on my ILK.
It needs the same invert trick as the request value.
We don't want to put the invert thing into intel_gpu_freq()
and intel_freq_opcode() because that would incorrectly invert
the min/max/etc frequencies also.
One day someone might want to reverse engineer the formula for
converting these numbers to Hz, but for now we'll just report
them raw.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Restore RPS for ILK-M. We lost it when an extra HAS_RPS()
check appeared in intel_rps_enable().
Unfortunaltey this just makes the performance worse on my
ILK because intel_ips insists on limiting the GPU freq to
the minimum. If we don't do the RPS init then intel_ips will
not limit the frequency for whatever reason. Either it can't
get at some required information and thus makes wrong decisions,
or we mess up some weights/etc. and cause it to make the wrong
decisions when RPS init has been done, or the entire thing is
just wrong. Would require a bunch of reverse engineering to
figure out what's going on.
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 9c878557b1 ("drm/i915/gt: Use the RPM config register to determine clk frequencies")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
We are incorrectly limiting the max allocation size as per the mm
max_order, which is effectively the largest power-of-two that we can fit
in the region size. However, it's normal to setup the region or
allocator with a non-power-of-two size(for example 3G), which we should
already handle correctly, except it seems for the early too-big-check.
v2: make sure we also exercise the I915_BO_ALLOC_CONTIGUOUS path, which
is quite different, since for that we are actually limited by the
largest power-of-two that we can fit within the region size. (Chris)
Fixes: b908be543e ("drm/i915: support creating LMEM objects")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021103606.241395-1-matthew.auld@intel.com
In order to test how fast the heartbeat can respond, we measure with the
interval set to its minimum. Before we measure though, we want to be
sure we start with a fresh pulse, and so wait until any old one is
complete. During that wait though, we were continually flushing the
work, and so continually re-evaluating to see if the pulse was complete,
and each attempt would count as an unresponsive system. If the engine
did not complete the request in the couple of busy-spins, it would flag
an error. This is unfortunate, so let's not busy-spin waiting for the
old heartbeat, but terminate it and start afresh.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019142841.32273-1-chris@chris-wilson.co.uk
The "mmio" writes into vgpu registers are simple memory traps from the
guest into the host. We do not need to assert in the guest that the
device is awake for the io as we do not write to the device itself.
However, over time we have refactored all the mmio accessors with the
result that the vgpu reuses the gen2 accessors and so inherits the
assert for runtime-pm of the native device. The assert though has
actually been there since commit 3be0bf5acc ("drm/i915: Create vGPU
specific MMIO operations to reduce traps").
References: 3be0bf5acc ("drm/i915: Create vGPU specific MMIO operations to reduce traps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200811092532.13753-1-chris@chris-wilson.co.uk
(cherry picked from commit 0e65ce24a3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If i915.ko is being used as a passthrough device, it does not know if
the host is using intel_iommu. Mixing the iommu and gfx causes a few
issues (such as scanout overfetch) which we need to workaround inside
the driver, so if we detect we are running under a hypervisor, also
assume the device access is being virtualised.
Reported-by: Stefan Fritsch <sf@sfritsch.de>
Suggested-by: Stefan Fritsch <sf@sfritsch.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Stefan Fritsch <sf@sfritsch.de>
Cc: stable@vger.kernel.org
Tested-by: Stefan Fritsch <sf@sfritsch.de>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019101523.4145-1-chris@chris-wilson.co.uk
(cherry picked from commit f566fdcd6c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The GPU is trashing the low pages of its reserved memory upon reset. If
we are using this memory for ringbuffers, then we will dutiful resubmit
the trashed rings after the reset causing further resets, and worse. We
must exclude this range from our own use. The value of 128KiB was found
by empirical measurement (and verified now with a selftest) on gen9.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019165005.18128-2-chris@chris-wilson.co.uk
(cherry picked from commit d3606757e6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In switching to using objects for our ppGTT scratch pages, care was not
taken to avoid trying to unref NULL objects on failure. And for gen6
ppGTT, it appears we forgot entirely to unwind after a partial allocation
failure.
Fixes: 89351925a4 ("drm/i915/gt: Switch to object allocations for page directories")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019083444.1286-1-chris@chris-wilson.co.uk
(cherry picked from commit fa812ce96a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
GEN >= 10 hardware supports the programmable scaler filter.
Attach scaling filter property for CRTC and plane for GEN >= 10
hardwares and program scaler filter based on the selected filter
type.
changes since v3:
* None
changes since v2:
* Use updated functions
* Add ps_ctrl var to contain the full PS_CTRL register value (Ville)
* Duplicate the scaling filter in crtc and plane hw state (Ville)
changes since v1:
* None
Changes since RFC:
* Enable properties for GEN >= 10 platforms (Ville)
* Do not round off the crtc co-ordinate (Danial Stone, Ville)
* Add new functions to handle scaling filter setup (Ville)
* Remove coefficient set 0 hardcoding.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-5-pankaj.laxminarayan.bharadiya@intel.com
Integer scaling (IS) is a nearest-neighbor upscaling technique that
simply scales up the existing pixels by an integer
(i.e., whole number) multiplier.Nearest-neighbor (NN) interpolation
works by filling in the missing color values in the upscaled image
with that of the coordinate-mapped nearest source pixel value.
Both IS and NN preserve the clarity of the original image. Integer
scaling is particularly useful for pixel art games that rely on
sharp, blocky images to deliver their distinctive look.
Introduce functions to configure the scaler filter coefficients to
enable nearest-neighbor filtering.
Bspec: 49247
changes since v6:
* Trust compiler, remove pointless inline keyword from cnl_coef_tap()
& cnl_nearest_filter_coef() functions (Ville)
changes since v4:
* Make cnl_coef_tap(), cnl_nearest_filter_coef() inline (Uma)
changes since v3:
* None
changes since v2:
* Move APIs from 5/5 into this patch.
* Change filter programming related function names to cnl_*, move
filter select bits related code into inline function (Ville)
changes since v1:
* Rearrange skl_scaler_setup_nearest_neighbor_filter() to iterate the
registers directly instead of the phases and taps (Ville)
changes since RFC:
* Refine the skl_scaler_setup_nearest_neighbor_filter() logic (Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-4-pankaj.laxminarayan.bharadiya@intel.com
Introduce scaler registers and bit fields needed to configure the
scaling filter in prgrammed mode and configure scaling filter
coefficients.
changes since v3:
* None
changes since v2:
* Change macro names to CNL_* and use +(set)*8 instead of adding
another trip through _PICK_EVEN (Ville).
changes since v1:
* None
changes since RFC:
* Parametrize scaler coeffient macros by 'set' (Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-3-pankaj.laxminarayan.bharadiya@intel.com
No functional changes in this patch.
With Bigjoiner, there are 2 pipes driving 2 halfs of 1
transcoder. The transcoder_mode has the full timings, and is used
for configuring the transcoder with the intended mode after
joining the 2 halves.
To clear the confusion, we rename intel_set_pipe_timings to
intel_set_transcoder_timings
v2:
* Split the renaming into separate patch (Ville)
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201008214535.22942-2-manasi.d.navare@intel.com
Move the DSC stuff out from the middle of the ICP HPD register
definitions. The location seems to have been selected by a
dice roll.
SHPD_FILTER_CNT addition also went astray due to the DSC
mess, so we also fix that vs. ICP_TC_HPD_{SHORT,LONG}_DETECT().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006143349.5561-2-ville.syrjala@linux.intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
i915:
- Set all unused color plane offsets to ~0xfff again (Ville)
- Fix TGL DKL PHY DP vswing handling (Ville)
amdgpu:
- DCN clang warning fix
- eDP fix
- BACO fix
- Kernel documentation fixes
- SMU7 mclk fix
- VCN1 hw bug workaround
amdkfd:
- kvfree vs kfree fix
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJfjSHWAAoJEAx081l5xIa+uFsP/2uljZFRr2IsiEsB7pI4cmpr
lZMRRA6SdCvSbSIF0Lu2Ndi3LVDM0TsLezsy0uoQWHPUB/TTI6uU+FcRCHevSOAs
JCyfp+DsFgJr5OIWiQzgP6qk67ygPLeSpzCr+Lr0HwXdlfuMQi/zo1Flp2srndLk
M1FwTb6WYGWfBB77q9qYzO9sJb8lnykd+cyOkvgYJsEcJUy/XCKyYi4IG21qaSCH
louciBMme9TbuE4IuIvQjQMFBVxCkE0ZTVrLPLC4VIBsQEH9Ld3PSxHIiCZmyo3k
nHRIxuxy4FnbB6bulToyxG4w94HoRtRbtCh6aBdRDpSNuGO9j1hTZhfR9Pbchyph
eI4BF4JpS4K5BoSYVqM/uviB0Ck6I0acr415p0guDI0BdeQCCjDZkZRnou3dW27p
FNWRaFlMCMr9n2elYoB4saKHd8hSjVYTFyaP/ftPZOYiO9IeZg8VrOC2QJcHirVG
4M77pixjCzUNZLGSvg55liLhmt2YsRWqrYABuv20MkeZUEqc329wjPjyeJFB1fBn
msq7dup37pNttD0XlU5x6Goabbcg3BeAyTAuMVWLCf0mQPOo5yfTUoRuyE4qJsfp
JSNe7wDN8U2N1uze5pIO1QriGcWb2++QGm9mXcoDJ0dbdGW4giZ+tVzssDloqb0X
/mQN0Af4HQj0R/Sh4jGx
=/+Vb
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-10-19' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Some fixes queued up already for i915 and amdgpu, I've also included
the fix for the clang warning you've seen.
i915:
- set all unused color plane offsets to ~0xfff again (Ville)
- fix TGL DKL PHY DP vswing handling (Ville)
amdgpu:
- DCN clang warning fix
- eDP fix
- BACO fix
- kernel documentation fixes
- SMU7 mclk fix
- VCN1 hw bug workaround
amdkfd:
- kvfree vs kfree fix"
* tag 'drm-next-2020-10-19' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Fix incorrect dsc force enable logic
drm/amdkfd: Use kvfree in destroy_crat_image
drm/amdgpu: vcn and jpeg ring synchronization
drm/amd/pm: increase mclk switch threshold to 200 us
docs: amdgpu: fix a warning when building the documentation
drm/amd/display: kernel-doc: document force_timing_sync
drm/amdgpu/swsmu: init the baco mutex in early_init
drm/amd/display: Fix module load hangs when connected to an eDP
drm/i915: Set all unused color plane offsets to ~0xfff again
drm/i915: Fix TGL DKL PHY DP vswing handling
Currently we call .hpd_irq_setup() directly just before display
resume, and follow it with another call via intel_hpd_init()
just afterwards. Assuming the hpd pins are marked as enabled
during the open-coded call these two things do exactly the
same thing (ie. enable HPD interrupts). Which even makes sense
since we definitely need working HPD interrupts for MST sideband
during the display resume.
So let's nuke the open-coded call and move the intel_hpd_init()
call earlier. However we need to leave the poll_init_work stuff
behind after the display resume as that will trigger display
detection while we're resuming. We don't want that trampling over
the display resume process. To make this a bit more symmetric
we turn this into a intel_hpd_poll_{enable,disable}() pair.
So we end up with the following transformation:
intel_hpd_poll_init() -> intel_hpd_poll_enable()
lone intel_hpd_init() -> intel_hpd_init()+intel_hpd_poll_disable()
.hpd_irq_setup()+resume+intel_hpd_init() -> intel_hpd_init()+resume+intel_hpd_poll_disable()
If we really would like to prevent all *long* HPD processing during
display resume we'd need some kind of software mechanism to simply
ignore all long HPDs. Currently we appear to have that just for
fbdev via ifbdev->hpd_suspended. Since we aren't exploding left and
right all the time I guess that's mostly sufficient.
For a bit of history on this, we first got a mechanism to block
hotplug processing during suspend in commit 15239099d7 ("drm/i915:
enable irqs earlier when resuming") on account of moving the irq enable
earlier. This then got removed in commit 50c3dc970a ("drm/fb-helper:
Fix hpd vs. initial config races") because the fdev initial config
got pushed to a later point. The second ad-hoc hpd_irq_setup() for
resume was added in commit 0e32b39cee ("drm/i915: add DP 1.2 MST
support (v0.7)") to be able to do MST sideband during the resume.
And finally we got a partial resurrection of the hpd blocking
mechanism in commit e8a8fedd57 ("drm/i915: Block fbdev HPD
processing during suspend"), but this time it only prevent fbdev
from handling hpd while resuming.
v2: Leave the poll_init_work behind
v3: Remove the extra intel_hpd_poll_disable() from display reset (Lyude)
Add the missing intel_hpd_poll_disable() to display init (Imre)
Cc: Lyude Paul <lyude@redhat.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201013181137.30560-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Rename intel_dp_sink_dpms() to intel_dp_set_power()
so one doesn't always have to convert from the DPMS
enum values to the actual DP D-states.
Also when dealing with a branch device this has nothing to
do with any sink, so the old name was nonsense anyway.
Also adjust the debug message accordingly, and pimp it
with the standard encoder id+name thing.
Trivial bits done with cocci:
@@
expression DP;
@@
(
- intel_dp_sink_dpms(DP, DRM_MODE_DPMS_OFF)
+ intel_dp_set_power(DP, DP_SET_POWER_D3)
|
- intel_dp_sink_dpms(DP, DRM_MODE_DPMS_ON)
+ intel_dp_set_power(DP, DP_SET_POWER_D0)
)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201016194800.25581-2-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Rather that try to trick LSPCON back into PCON mode from the .reset()
hook let's just do that as a regular part of the normal modeset
sequence, which is going to take care of the system resume case. During
a normal modeset this should normally be a nop as the mode should have
already been switched by .detect().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201016194800.25581-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The "mmio" writes into vgpu registers are simple memory traps from the
guest into the host. We do not need to assert in the guest that the
device is awake for the io as we do not write to the device itself.
However, over time we have refactored all the mmio accessors with the
result that the vgpu reuses the gen2 accessors and so inherits the
assert for runtime-pm of the native device. The assert though has
actually been there since commit 3be0bf5acc ("drm/i915: Create vGPU
specific MMIO operations to reduce traps").
References: 3be0bf5acc ("drm/i915: Create vGPU specific MMIO operations to reduce traps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200811092532.13753-1-chris@chris-wilson.co.uk
If i915.ko is being used as a passthrough device, it does not know if
the host is using intel_iommu. Mixing the iommu and gfx causes a few
issues (such as scanout overfetch) which we need to workaround inside
the driver, so if we detect we are running under a hypervisor, also
assume the device access is being virtualised.
Reported-by: Stefan Fritsch <sf@sfritsch.de>
Suggested-by: Stefan Fritsch <sf@sfritsch.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Stefan Fritsch <sf@sfritsch.de>
Cc: stable@vger.kernel.org
Tested-by: Stefan Fritsch <sf@sfritsch.de>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019101523.4145-1-chris@chris-wilson.co.uk
The GPU is trashing the low pages of its reserved memory upon reset. If
we are using this memory for ringbuffers, then we will dutiful resubmit
the trashed rings after the reset causing further resets, and worse. We
must exclude this range from our own use. The value of 128KiB was found
by empirical measurement (and verified now with a selftest) on gen9.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019165005.18128-2-chris@chris-wilson.co.uk
Underruns happens when plane height + y offset is not a modulo of 4
when FBC is enabled. It happens when scanline is at vactive - 10 but
that is not feasible to do from the software side so here completely
disabling FBC when height + y offset matches to avoid visual glitches.
Specification says that it only affects TGL display C stepping and
newer but to simply the check and as TGL is already in final costumers
hands, pre-production display stepping A and B was also included.
BSpec: 52887 ICL
BSpec: 52888 EHL/JSL
BSpec: 52890/55378 TGL
BSpec: 53508 DG1
BSpec: 53273 RKL
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019175609.28715-1-jose.souza@intel.com
This sequence is not part of "Sequences to Initialize Display" but
as noted in the MBus page the DBUF_CTL.Tracker_state_service needs
to be set to 8.
BSpec: 49213
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019173906.18892-1-jose.souza@intel.com
On Tigerlake, we are seeing a repeat of commit d8f5053117 ("drm/i915/icl:
Forcibly evict stale csb entries") where, presumably, due to a missing
Global Observation Point synchronisation, the write pointer of the CSB
ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
we see the GPU report more context-switch entries for us to parse, but
those entries have not been written, leading us to process stale events,
and eventually report a hung GPU.
However, this effect appears to be much more severe than we previously
saw on Icelake (though it might be best if we try the same approach
there as well and measure), and Bruce suggested the good idea of resetting
the CSB entry after use so that we can detect when it has been updated by
the GPU. By instrumenting how long that may be, we can set a reliable
upper bound for how long we should wait for:
513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
References: d8f5053117 ("drm/i915/icl: Forcibly evict stale csb entries")
References: HSDES#22011327657, HSDES#1508287568
Suggested-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org # v5.4
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-2-chris@chris-wilson.co.uk
(cherry picked from commit 233c1ae3c8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
A CSB entry is 64b, and it is simpler for us to treat it as an array of
64b entries than as an array of pairs of 32b entries.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-1-chris@chris-wilson.co.uk
(cherry picked from commit f24a44e52f)
(cherry picked from commit 3d4dbe0e0f0d04ebcea917b7279586817da8cf46)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We may try to preempt the currently executing request, only to find that
after unravelling all the dependencies that the original executing
context is still the earliest in the topological sort and re-submitted
back to HW (if we do detect some change in the ELSP that requires
re-submission). However, due to the way we check for wrap-around during
the unravelling, we mark any context that has been submitted just once
(i.e. with the rq->wa_tail set, but the ring->tail earlier) as
potentially wrapping and requiring a forced restore on resubmission.
This was expected to be not a problem, as it was anticipated that most
unwinding for preemption would result in a context switch and the few
that did not would be lost in the noise. It did not take long for
someone to find one particular workload where the cost of those extra
context restores was measurable.
However, since we know the wa_tail is of fixed size, and we know that a
request must be larger than the wa_tail itself, we can safely maintain
the check for request wrapping and check against a slightly future point
in the ring that includes an expected wa_tail. (That is if the
ring->tail is already set to rq->wa_tail, including another 8 bytes in
the check does not invalidate the incremental wrap detection.)
Fixes: 8ab3a3812a ("drm/i915/gt: Incrementally check for rewinding")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002083425.4605-1-chris@chris-wilson.co.uk
(cherry picked from commit bb65548e3c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
When running gem_exec_nop, it floods the system with many requests (with
the goal of userspace submitting faster than the HW can process a single
empty batch). This causes the driver to continually resubmit new
requests onto the end of an active context, a flood of lite-restore
preemptions. If we time this just right, Tigerlake hangs.
Inserting a small delay between the processing of CS events and
submitting the next context, prevents the hang. Naturally it does not
occur with debugging enabled. The suspicion then is that this is related
to the issues with the CS event buffer, and inserting an mmio read of
the CS pointer status appears to be very successful in preventing the
hang. Other registers, or uncached reads, or plain mb, do not prevent
the hang, suggesting that register is key -- but that the hang can be
prevented by a simple udelay, suggests it is just a timing issue like
that encountered by commit 233c1ae3c8 ("drm/i915/gt: Wait for CSB
entries on Tigerlake"). Also note that the hang is not prevented by
applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
between requests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015195023.32346-1-chris@chris-wilson.co.uk
(cherry picked from commit 6ca7217dff)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 435e8fc059 ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk
(cherry picked from commit 57b2d834bf)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently we leave the cache_level of the initial fb obj
set to NONE. This means on eLLC machines the first pin_to_display()
will try to switch it to WT which requires a vma unbind+bind.
If that happens during the fbdev initialization rcu does not
seem operational which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level
as WT when creating it. We still do an excplicit ggtt pin
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.
Cc: <stable@vger.kernel.org> # v5.7+
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@chris-wilson.co.uk
(cherry picked from commit d46b60a2e8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In order to avoid functional breakage of mis-programmed applications that
have grown to depend on unused MOCS entries, we are programming
those entries to be equal to fully cached ("L3 + LLC") entry.
These reserved and unspecified entries should not be used as they may be
changed to less performant variants with better coherency in the future
if more entries are needed.
v2: As suggested by Lucas De Marchi to utilise __init_mocs_table for
programming default value, setting I915_MOCS_PTE index of tgl_mocs_table
with desired value.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Mathew Alwin <alwin.mathew@intel.com>
Cc: Mcguire Russell W <russell.w.mcguire@intel.com>
Cc: Spruit Neil R <neil.r.spruit@intel.com>
Cc: Zhou Cheng <cheng.zhou@intel.com>
Cc: Benemelis Mike G <mike.g.benemelis@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729102539.134731-2-ayaz.siddiqui@intel.com
Cc: stable@vger.kernel.org
(cherry picked from commit 4d8a5cfe3b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In commit 7994672309 ("drm/i915: Assume 100% brightness when not in
DPCD control mode"), we fixed the brightness level when DPCD control was
not active to max brightness. This is as good as we can guess since most
backlights go on full when uncontrolled.
However in doing so we changed the semantics of the initial
'backlight.enabled' value. At least on Pixelbooks, they were relying
on the brightness level in DP_EDP_BACKLIGHT_BRIGHTNESS_MSB to be 0 on
boot such that enabled would be false. This causes the device to be
enabled when the brightness is set. Without this, brightness control
doesn't work. So by changing brightness to max, we also flipped enabled
to be true on boot.
To fix this, make enabled a function of brightness and backlight control
mechanism.
Fixes: 7994672309 ("drm/i915: Assume 100% brightness when not in DPCD control mode")
Cc: Lyude Paul <lyude@redhat.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Kevin Chowski <chowski@chromium.org>>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918002845.32766-1-sean@poorly.run
(cherry picked from commit 4ade8f31c2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In switching to using objects for our ppGTT scratch pages, care was not
taken to avoid trying to unref NULL objects on failure. And for gen6
ppGTT, it appears we forgot entirely to unwind after a partial allocation
failure.
Fixes: 89351925a4 ("drm/i915/gt: Switch to object allocations for page directories")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019083444.1286-1-chris@chris-wilson.co.uk
If guest fills non-priv bb on ApolloLake/Broxton as Mesa i965 does in:
717e7539124d (i965: Use a WC map and memcpy for the batch instead of pw-)
Due to the missing flush of bb filled by VM vCPU, host GPU hangs on
executing these MI_BATCH_BUFFER.
Temporarily workaround this by setting SNOOP bit for PAT3 used by PPGTT
PML4 PTE: PAT(0) PCD(1) PWT(1).
The performance is still expected to be low, will need further improvement.
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201012045231.226748-1-colin.xu@intel.com
Guest driver may reset HWSP to 0 as init value during D3->D0:
The full sequence is:
- Boot ->D0
- Update HWSP
- D0->D3
- ...In D3 state...
- D3->D0
- DMLR reset.
- Set engine HWSP to 0.
- Set engine ring mode to 0.
- Set engine HWSP to correct value.
- Set engine ring mode to correct value.
Ring mode is masked register so set 0 won't take effect.
However HWPS addr 0 is considered as invalid GGTT address which will
report error like:
gvt: vgpu 1: write invalid HWSP address, reg:0x2080, value:0x0
gvt: vgpu 1: fail to emulate MMIO write 00002080 len 4
Detected your guest driver doesn't support GVT-g.
Now vgpu 2 will enter failsafe mode.
Zero out HWSP addr is considered as a valid setting from device driver
so don't treat it as invalid HWSP addr.
V2:
Treat HWSP addr 0 as valid. (zhenyu)
V3:
Change patch title.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200911065239.147789-1-colin.xu@intel.com
i915_gem_object_map implements fairly low-level vmap functionality in a
driver. Split it into two helpers, one for remapping kernel memory which
can use vmap, and one for I/O memory that uses vmap_pfn.
The only practical difference is that alloc_vm_area prefeaults the vmalloc
area PTEs, which doesn't seem to be required here for the kernel memory
case (and could be added to vmap using a flag if actually required).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Link: https://lkml.kernel.org/r/20201002122204.1534411-9-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kmap for !PageHighmem is just a convoluted way to say page_address, and
kunmap is a no-op in that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Link: https://lkml.kernel.org/r/20201002122204.1534411-8-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
shmem_pin_map somewhat awkwardly reimplements vmap using alloc_vm_area and
manual pte setup. The only practical difference is that alloc_vm_area
prefeaults the vmalloc area PTEs, which doesn't seem to be required here
(and could be added to vmap using a flag if actually required). Switch to
use vmap, and use vfree to free both the vmalloc mapping and the page
array, as well as dropping the references to each page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Link: https://lkml.kernel.org/r/20201002122204.1534411-7-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The typical set of driver updates across the subsystem:
- Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma, hns,
usnic, qib, qedr, cxgb4, hns, bnxt_re
- Various rtrs fixes and updates
- Bug fix for mlx4 CM emulation for virtualization scenarios where MRA
wasn't working right
- Use tracepoints instead of pr_debug in the CM code
- Scrub the locking in ucma and cma to close more syzkaller bugs
- Use tasklet_setup in the subsystem
- Revert the idea that 'destroy' operations are not allowed to fail at
the driver level. This proved unworkable from a HW perspective.
- Revise how the umem API works so drivers make fewer mistakes using it
- XRC support for qedr
- Convert uverbs objects RWQ and MW to new the allocation scheme
- Large queue entry sizes for hns
- Use hmm_range_fault() for mlx5 On Demand Paging
- uverbs APIs to inspect the GID table instead of sysfs
- Move some of the RDMA code for building large page SGLs into
lib/scatterlist
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl+J37MACgkQOG33FX4g
mxrKfRAAnIecwdE8df0yvVU5k0Eg6qVjMy9MMHq4va9m7g6GpUcNNI0nIlOASxH2
l+9vnUQS3ebgsPeECaDYzEr0hh/u53+xw2g4WV5ts/hE8KkQ6erruXb9kasCe8yi
5QWJ9K36T3c03Cd3EeH6JVtytAxuH42ombfo9BkFLPVyfG/R2tsAzvm5pVi73lxk
46wtU1Bqi4tsLhyCbifn1huNFGbHp08OIBPAIKPUKCA+iBRPaWS+Dpi+93h3g3Bp
oJwDhL9CBCGcHM+rKWLzek3Dy87FnQn7R1wmTpUFwkK+4AH3U/XazivhX035w1vL
YJyhakVU0kosHlX9hJTNKDHJGkt0YEV2mS8dxAuqilFBtdnrVszb5/MirvlzC310
/b5xCPSEusv9UVZV0G4zbySVNA9knZ4YaRiR3VDVMLKl/pJgTOwEiHIIx+vs3ejk
p8GRWa1SjXw5LfZEQcq39J689ljt6xjCTonyuBSv7vSQq5v8pjBxvHxiAe2FIa2a
ZyZeSCYoSh0SwJQukO2VO7aprhHP3TcCJ/987+X03LQ8tV2VWPktHqm62YCaDcOl
fgiQuQdPivRjDDkJgMfDWDGKfZeHoWLKl5XsJhWByt0lablVrsvc+8ylUl1UI7gI
16hWB/Qtlhfwg10VdApn+aOFpIS+s5P4XIp8ik57MZO+VeJzpmE=
=LKpl
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"A usual cycle for RDMA with a typical mix of driver and core subsystem
updates:
- Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
hns, usnic, qib, qedr, cxgb4, hns, bnxt_re
- Various rtrs fixes and updates
- Bug fix for mlx4 CM emulation for virtualization scenarios where
MRA wasn't working right
- Use tracepoints instead of pr_debug in the CM code
- Scrub the locking in ucma and cma to close more syzkaller bugs
- Use tasklet_setup in the subsystem
- Revert the idea that 'destroy' operations are not allowed to fail
at the driver level. This proved unworkable from a HW perspective.
- Revise how the umem API works so drivers make fewer mistakes using
it
- XRC support for qedr
- Convert uverbs objects RWQ and MW to new the allocation scheme
- Large queue entry sizes for hns
- Use hmm_range_fault() for mlx5 On Demand Paging
- uverbs APIs to inspect the GID table instead of sysfs
- Move some of the RDMA code for building large page SGLs into
lib/scatterlist"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
RDMA/ucma: Fix use after free in destroy id flow
RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
RDMA: Explicitly pass in the dma_device to ib_register_device
lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
IB/mlx4: Convert rej_tmout radix-tree to XArray
RDMA/rxe: Fix bug rejecting all multicast packets
RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
RDMA/rxe: Remove duplicate entries in struct rxe_mr
IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
IB/rdmavt: Fix sizeof mismatch
MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
RDMA/umem: Move to allocate SG table from pages
lib/scatterlist: Add support in dynamic allocation of SG table from pages
tools/testing/scatterlist: Show errors in human readable form
tools/testing/scatterlist: Rejuvenate bit-rotten test
RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
RDMA/uverbs: Expose the new GID query API to user space
...
intel_dp_ycbcr420_config() is rather pointless. Just inline it
directly into intel_dp_compute_config(). This gets rid of the
ugly double assignment of output_format.
Not really sure what the best policy would be when the user
supplies a mode classified by the display as "YCbCr 4:2:0
only", but we know that we can't do YCbCr 4:2:0 output. For
now keep the current behaviour of just silently upgrade
it to RGB 4:4:4.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924184156.24491-3-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Remove the lspcon special case from intel_dp_compute_config() and
just treat it like any other DFP than can do 4:4:4->4:2:0 conversion.
The only difference between the two codepaths was that the lspcon
code tried to already halve port_clock. That was just total nonsense
as we hadn't even computed the base port_clock at that time.
All that stuff happens intel_dp_compute_link_config*() and it
already takes care of the 4:2:0 clock reduction.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924184156.24491-2-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
crtc_state->lspcon_downsampling isn't particularly useful at
the moment since we can't even do proper readout for it.
Let's get rid of it. Will help with unifying the LSPCON with
the regular DFP YCbCr output support.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924184156.24491-1-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Currently we leave the cache_level of the initial fb obj
set to NONE. This means on eLLC machines the first pin_to_display()
will try to switch it to WT which requires a vma unbind+bind.
If that happens during the fbdev initialization rcu does not
seem operational which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level
as WT when creating it. We still do an excplicit ggtt pin
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.
Cc: <stable@vger.kernel.org> # v5.7+
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
WAC6entrylatency is trying to fix excessive rc6 entry latency caused
by the extra delay from FBC_LLC_READ_CTRL, which is there for some
extra sync with uncore for frame buffer caching in LLC.
Reading through the hsd the recommendation was to set the FBC_LLC_FULLY_OPEN
bit to disable this extra delay entirely. This can be done whenever fb LLC
caching is not used. The alternative suggestion was to reduce the delay to
eg. 0x5 via updated BIOS programming instructions. But all the kbl/cfl
machines I've seen still have the default 0xff programmed. As we never use
fb LLC caching let's just apply the w/a to all skl derivatives to get
consistent rc6 latencies.
I was able to measure the effect of FBC_LLC_READ_CTRL to rc6 latency
via forcewake. Here's a graph of some of the results:
sleep;fw_req=1;wait fw_ack==1;sleep;fw_req=0;wait fw_ack==0
fw_ack==1 duration
160us +----------------------------------------------------------------+
| + + $$+ + + |
| $$ $ $ ******$$ ** $ $**$* #########$$######|
140us |-$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$*$$$$$$$$$$$$$$$$ $$$$$$|
| $ * # |
| $ * # |
120us |$+ * # +-|
|$ * # |
|$ * # # |
100us |$+ ************######################## +-|
|$ * *# |
|$ ***** ######### |
80us |$+ * # #### ## +-|
|$ **** ### # # |
| ** #### FBC_LLC_READ_CTRL: 0x8000 ******* |
60us |-###### FBC_LLC_READ_CTRL: 0xffff #######-|
|## + + FBC_LLC_READ_CTRL: 0x400000ff $$$$$$$ |
+----------------------------------------------------------------+
0ms 10ms 20ms 30ms 40ms 50ms 60ms
sleep duration
The default FBC_LLC_READ_CTRL value of 0xff is documented to give us
a 170usec delay. That tracks well with the knees at 0xffff->~44msec and
0x8000->~22msec we see in the graph.
We can see that if we sleep longer than the FBC_LLC_READ_CTRL delay
we always observe the full (~145usec) rc6 wakeup latency. But if we sleep
for less than the FBC_LLC_READ_CTRL delay we see a quicker fw wakeup,
presumably due the hardware not having yet entered rc6 fully.
The other plateaus in the graph I suspect correspond to some shallower
internal rc states.
v2: s/usec/msec/ typo in commit msg
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716190426.17047-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
A recent bspec update has provided a new cdclk table for RKL. All of
the cdclk values are the same as those we've been using on ICL, TGL,
etc., but we obtain them by doubling both the PLL ratio and CD2X divider
numbers.
Bspec: 49202
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015220038.271740-1-matthew.d.roper@intel.com
Repeat our sanitychecks from before execution to after execution. One
expects that if we were to see these, the gpu would already be on fire,
but the timing may be informative.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015190816.31763-1-chris@chris-wilson.co.uk
We may try to preempt the currently executing request, only to find that
after unravelling all the dependencies that the original executing
context is still the earliest in the topological sort and re-submitted
back to HW (if we do detect some change in the ELSP that requires
re-submission). However, due to the way we check for wrap-around during
the unravelling, we mark any context that has been submitted just once
(i.e. with the rq->wa_tail set, but the ring->tail earlier) as
potentially wrapping and requiring a forced restore on resubmission.
This was expected to be not a problem, as it was anticipated that most
unwinding for preemption would result in a context switch and the few
that did not would be lost in the noise. It did not take long for
someone to find one particular workload where the cost of those extra
context restores was measurable.
However, since we know the wa_tail is of fixed size, and we know that a
request must be larger than the wa_tail itself, we can safely maintain
the check for request wrapping and check against a slightly future point
in the ring that includes an expected wa_tail. (That is if the
ring->tail is already set to rq->wa_tail, including another 8 bytes in
the check does not invalidate the incremental wrap detection.)
Fixes: 8ab3a3812a ("drm/i915/gt: Incrementally check for rewinding")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002083425.4605-1-chris@chris-wilson.co.uk
When running gem_exec_nop, it floods the system with many requests (with
the goal of userspace submitting faster than the HW can process a single
empty batch). This causes the driver to continually resubmit new
requests onto the end of an active context, a flood of lite-restore
preemptions. If we time this just right, Tigerlake hangs.
Inserting a small delay between the processing of CS events and
submitting the next context, prevents the hang. Naturally it does not
occur with debugging enabled. The suspicion then is that this is related
to the issues with the CS event buffer, and inserting an mmio read of
the CS pointer status appears to be very successful in preventing the
hang. Other registers, or uncached reads, or plain mb, do not prevent
the hang, suggesting that register is key -- but that the hang can be
prevented by a simple udelay, suggests it is just a timing issue like
that encountered by commit 233c1ae3c8 ("drm/i915/gt: Wait for CSB
entries on Tigerlake"). Also note that the hang is not prevented by
applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
between requests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015195023.32346-1-chris@chris-wilson.co.uk
While we do lack the faster shared LLC, we should still have support
for snooping over PCIe.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-11-lucas.demarchi@intel.com
Update the DMC_DEBUG_DC5 register to its new location and do not try
reading the DC6 counter since DG1 doesn't support DC6.
v2: Use IS_DGFX() instead of IS_DG1(). Even if not having DC6 is not
directly related to DGFX, the register move to a new location is. So in
future, if there is one supporting DC6, it would just need to add the
other register rather than fixing the case of a wrong register being
read (Matt)
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-10-lucas.demarchi@intel.com
DC6 is not supported on DG1, so change the allowed DC mask for DG1.
This is not yet on bspec, but it has been confirmed by HW engineers.
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-9-lucas.demarchi@intel.com
DG1 shares some workarounds with TGL and RKL and also has some
additional workarounds of its own.
v2: Correct location of Wa_1408615072 (JohnH).
v3: Apply WAs 1606700617, 18011464164 and 22010931296 to DG1 (José)
v4 (Anusha)
- Add Wa_22010271021
- s/Wa_14010096844/Wa_1409836686
v5:
- Extend Wa_14010919138 to all revs (Matt Atwood)
- Power gate media is global gen12 design. (Rodrigo)
- Rebase (Lucas)
v6: use REG_BIT() to fix checkpatch warning (Lucas)
BSpec: 53508
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-8-lucas.demarchi@intel.com
Add support to load DMC v2.0.2 on DG1
While we're at it, make TGL use the same GEN12 firmware size definition
and remove obsolete comment.
Bpec: 49230
v2: do not replace GEN12_CSR_MAX_FW_SIZE (from José)
and replace stale comment
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-7-lucas.demarchi@intel.com
Add DG1 DPLL Enable register macro and use the macro to enable the
correct DPLL based on PLL id. Although we use
_MG_PLL1_ENABLE/_MG_PLL2_ENABLE these are rather combo phys.
While at it, fix coding style: wrong newlines and use if/else chain
v2: Rewrite original patch from Aditya Swarup based on refactors
upstream
Bspec: 49443, 49206
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Aditya Swarup <aditya.swarup@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-6-lucas.demarchi@intel.com
Add entries for dg1 plls and setup dg1_pll_mgr to reuse ICL callbacks.
Initial setup for shared dplls DPLL0/1 for DDIA/DDIB and DPLL2/3 for
DDI-TC1/DDI-TC2. Configure dpll cfgcrx registers to drive the plls on
DG1.
v2 (Lucas): Reword commit message and add missing update_ref_clks hook
(requested by Matt Roper)
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-5-lucas.demarchi@intel.com
DG1 has 4 DPLLs where DPLL0 and DPLL1 drive DDIA/B and
DPLL2 and DPLL3 drive DDI-TC1/DDI-TC2.
Introduce DG1_DPLL_CFCRx() helper macros to configure
DPLL registers.
Bspec: 50288, 50299
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-4-lucas.demarchi@intel.com
TGL power wells can be re-used for DG1 with the exception of the fake
power well for TC_COLD.
v2: use logic to skip power wells while copying instead of duplicating
the definition of TGL power wells (Matt Roper)
Bspec: 49182
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-3-lucas.demarchi@intel.com
The skus guarded by IS_CNL_WITH_PORT_F() have port F and thus they need
those power wells. The others don't have those. Up to now we were
just overriding the number of power wells on !IS_CNL_WITH_PORT_F(),
relying on those power wells to be the last ones. Now that we have logic
in place to skip power wells by id, use it instead.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-2-lucas.demarchi@intel.com
This allows us to skip power wells on a platform allowing it to re-use
the table from another one instead of having to create a new table from
scratch that is basically a copy with a few removals.
Cc: Imre Deak <imre.deak@intel.com>
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
[ Adapt ignore logic to be based on pw id rather than adding a new
field, as suggested by Imre ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201014191937.1266226-1-lucas.demarchi@intel.com
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 435e8fc059 ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk
New driver:
Cadence MHDP8546 DisplayPort bridge driver
core:
- cross-driver scatterlist cleanups
- devm_drm conversions
- remove drm_dev_init
- devm_drm_dev_alloc conversion
ttm:
- lots of refactoring and cleanups
bridges:
- chained bridge support in more drivers
panel:
- misc new panels
scheduler:
- cleanup priority levels
displayport:
- refactor i915 code into helpers for nouveau
i915:
- split into display and GT trees
- WW locking refactoring in GEM
- execbuf2 extension mechanism
- syncobj timeline support
- GEN 12 HOBL display powersaving
- Rocket Lake display additions
- Disable FBC on Tigerlake
- Tigerlake Type-C + DP improvements
- Hotplug interrupt refactoring
amdgpu:
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support for DC
- Plane rotation enabled
- TMZ state info ioctl
- PCIe DPC recovery support
- DC interrupt handling refactor
- OLED panel fixes
amdkfd:
- add SMI events for thermal throttling
- SMI interface events ioctl update
- process eviction counters
radeon:
- move to dma_ for allocations
- expose sclk via sysfs
msm:
- DSI support for sm8150/sm8250
- per-process GPU pagetable support
- Displayport support
mediatek:
- move HDMI phy driver to PHY
- convert mtk-dpi to bridge API
- disable mt2701 tmds
tegra:
- bridge support
exynos:
- misc cleanups
vc4:
- dual display cleanups
ast:
- cleanups
gma500:
- conversion to GPIOd API
hisilicon:
- misc reworks
ingenic:
- clock handling and format improvements
mcde:
- DSI support
mgag200:
- desktop g200 support
mxsfb:
- i.MX7 + i.MX8M
- alpha plane support
panfrost:
- devfreq support
- amlogic SoC support
ps8640:
- EDID from eDP retrieval
tidss:
- AM65xx YUV workaround
virtio:
- virtio-gpu exported resources
rcar-du:
- R8A7742, R8A774E1 and R8A77961 support
- YUV planar format fixes
- non-visible plane handling
- VSP device reference count fix
- Kconfig fix to avoid displaying disabled options in .config
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJfh579AAoJEAx081l5xIa+GqoP/0amz+ZN7y/L7+f32CRinJ7/
3e4xjXNDmtWG4Whe/WKjlYmbAcvSdWV/4HYpurW2BFJnOAB/5lIqYcS/PyqErPzA
w4EpRoJ+ZdFgmlDH0vdsDwPLT/HFmhUN9AopNkoZpbSMxrManSj5QgmePXyiKReP
Q+ZAK5UW5AdOVY4bgXUSEkVq2eilCLXf+bSBR/LrVQuNgu7GULX8SIy/Y1CuMtv8
LgzzjLKfIZaIWC+F/RU7BxJ7YnrVq7z7yXnUx8j2416+k/Wwe+BeSUCSZstT7q9G
UkX8jWfR7ZKqhwP+UQeSwDbHkALz7lv88nyjQdxJZ3SrXRe4hy14YjxnR4maeNAj
3TAYSdcAMWyRHqeEZIZ7Hj5sQtTq5OZAoIjxzH3vpVdAnnAkcWoF77pqxV8XPqTC
nw40DihAxQOshGwMkjd5DqkEwnMv43Hs1WTVYu9dPTOfOdqPNt+Vqp7Xl9Z46+kV
k6PDcx60T9ayDW1QZ6MoIXHta9E7ixzu7gYBL3vP4LuporY0uNG3bzF3CMvof1BK
sHYcYTdZkqbTD2d6rHV+TbpPQXgTtlej9qVlQM4SeX37Xtc7LxCYpnpUHKz2S/fK
1vyeGPgdytHblwlxwZOPZ4R2I/HTfnITdr4kMcJHhxAsEewfW1Rd4+stQqVJ2Mph
Vz+CFP2BngivGFz5vuky
=4H8J
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Not a major amount of change, the i915 trees got split into display
and gt trees to better facilitate higher level review, and there's a
major refactoring of i915 GEM locking to use more core kernel concepts
(like ww-mutexes). msm gets per-process pagetables, older AMD SI cards
get DC support, nouveau got a bump in displayport support with common
code extraction from i915.
Outside of drm this contains a couple of patches for hexint
moduleparams which you've acked, and a virtio common code tree that
you should also get via it's regular path.
New driver:
- Cadence MHDP8546 DisplayPort bridge driver
core:
- cross-driver scatterlist cleanups
- devm_drm conversions
- remove drm_dev_init
- devm_drm_dev_alloc conversion
ttm:
- lots of refactoring and cleanups
bridges:
- chained bridge support in more drivers
panel:
- misc new panels
scheduler:
- cleanup priority levels
displayport:
- refactor i915 code into helpers for nouveau
i915:
- split into display and GT trees
- WW locking refactoring in GEM
- execbuf2 extension mechanism
- syncobj timeline support
- GEN 12 HOBL display powersaving
- Rocket Lake display additions
- Disable FBC on Tigerlake
- Tigerlake Type-C + DP improvements
- Hotplug interrupt refactoring
amdgpu:
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support for DC
- Plane rotation enabled
- TMZ state info ioctl
- PCIe DPC recovery support
- DC interrupt handling refactor
- OLED panel fixes
amdkfd:
- add SMI events for thermal throttling
- SMI interface events ioctl update
- process eviction counters
radeon:
- move to dma_ for allocations
- expose sclk via sysfs
msm:
- DSI support for sm8150/sm8250
- per-process GPU pagetable support
- Displayport support
mediatek:
- move HDMI phy driver to PHY
- convert mtk-dpi to bridge API
- disable mt2701 tmds
tegra:
- bridge support
exynos:
- misc cleanups
vc4:
- dual display cleanups
ast:
- cleanups
gma500:
- conversion to GPIOd API
hisilicon:
- misc reworks
ingenic:
- clock handling and format improvements
mcde:
- DSI support
mgag200:
- desktop g200 support
mxsfb:
- i.MX7 + i.MX8M
- alpha plane support
panfrost:
- devfreq support
- amlogic SoC support
ps8640:
- EDID from eDP retrieval
tidss:
- AM65xx YUV workaround
virtio:
- virtio-gpu exported resources
rcar-du:
- R8A7742, R8A774E1 and R8A77961 support
- YUV planar format fixes
- non-visible plane handling
- VSP device reference count fix
- Kconfig fix to avoid displaying disabled options in .config"
* tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm: (1494 commits)
drm/ingenic: Fix bad revert
drm/amdgpu: Fix invalid number of character '{' in amdgpu_acpi_init
drm/amdgpu: Remove warning for virtual_display
drm/amdgpu: kfd_initialized can be static
drm/amd/pm: setup APU dpm clock table in SMU HW initialization
drm/amdgpu: prevent spurious warning
drm/amdgpu/swsmu: fix ARC build errors
drm/amd/display: Fix OPTC_DATA_FORMAT programming
drm/amd/display: Don't allow pstate if no support in blank
drm/panfrost: increase readl_relaxed_poll_timeout values
MAINTAINERS: Update entry for st7703 driver after the rename
Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"
drm/amd/display: HDMI remote sink need mode validation for Linux
drm/amd/display: Change to correct unit on audio rate
drm/amd/display: Avoid set zero in the requested clk
drm/amdgpu: align frag_end to covered address space
drm/amdgpu: fix NULL pointer dereference for Renoir
drm/vmwgfx: fix regression in thp code due to ttm init refactor.
drm/amdgpu/swsmu: add interrupt work handler for smu11 parts
drm/amdgpu/swsmu: add interrupt work function
...
Forcing mocs:1 [used for our winsys follows-pte mode] to be cached
caused display glitches. Though it is documented as deprecated (and so
likely behaves as uncached) use the follow-pte bit and force it out of
L3 cache.
Fixes: 4d8a5cfe3b ("drm/i915/gt: Initialize reserved and unspecified MOCS indices")
Testcase: igt/kms_frontbuffer_tracking
Testcase: igt/kms_big_fb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-4-chris@chris-wilson.co.uk
Since SKL the eLLC has been sitting on the far side of the system
agent, meaning the display engine can utilize it. Let's enable that.
I chose WB for the caching mode, because my numbers are indicating
that WT might actually be WB and WC might actually be UC. I'm not
100% sure that is indeed the case but at least my simple rendercopy
based benchmark didn't see any difference in performance.
Also if I configure things to do LLCeLLC+WT I still get cache dirt
on my screen, suggesting that is in fact operating in WB mode
anyway. This is also the reason I had to fix the MOCS target cache
to really say PTE rather than LLC+eLLC.
Since SKL the eLLC has been sitting on the far side of the system agent,
meaning the display engine can utilize it. Let's enable that.
Eero's earlier benchmarks numbers:
"* Results in GfxBench and Unigine (Valley/Heaven) tests were within daily
variation on the tested SKL machines
* SKL GT4e (128MB eLLC) / Wayland / Weston:
+15-20% SynMark TexMem512 (512MB of textures)
+4-6% SynMark TerrainFly*, CSCloth, ShMapVsm
-5-10% SynMark TexMem128 (128MB of textures)
* SKL GT3e (64MB eLLC) / Xorg / Unity:
+4-8% GpuTest Triangle fullscreen (FullHD)
-5-10% GpuTest Triangle windowed (1/2 screen)
* SKL GT2 (no eLLC) / Xorg / Unity:
* Some of the higher FPS SynMark pixel and vertex shader tests
are few percent higher, more than daily variance
=> Do you see any reason why this machine would be impacted
although it doesn't eLLC?"
Caveats:
- Still haven't tested with a prime setup
- Still not entirely sure this a good idea, but I've been
using it on my cfl anyway :)
v2: Split the MOCS PTE change out
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-3-ville.syrjala@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-3-chris@chris-wilson.co.uk
Currently we leave the cache_level of the initial fb obj
set to NONE. This means on eLLC machines the first pin_to_display()
will try to switch it to WT which requires a vma unbind+bind.
If that happens during the fbdev initialization rcu does not
seem operational which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level
as WT when creating it. We still do an excplicit ggtt pin
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.
Cc: <stable@vger.kernel.org> # v5.7+
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@chris-wilson.co.uk
Recently we came across requirement to identify EHL and JSL
platform to program them differently. Thus Split the basic
platform definition, macros, and PCI IDs to differentiate
between EHL and JSL platforms. Also, IS_ELKHARTLAKE is replaced
with IS_JSL_EHL everywhere.
Changes since V1 :
- Rebased to avoid merge conflicts
- Added missed check for jasperlake in intel_uc_fw.c
Cc : Matt Roper <matthew.d.roper@intel.com>
Cc : Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201013192948.63470-1-tejaskumarx.surendrakumar.upadhyay@intel.com
i915 does not want to see value entries. Switch it to use
find_lock_page() instead, and remove the export of find_lock_entry().
Move find_lock_entry() and find_get_entry() to mm/internal.h to discourage
any future use.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Link: https://lkml.kernel.org/r/20200910183318.20139-6-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order to avoid functional breakage of mis-programmed applications that
have grown to depend on unused MOCS entries, we are programming
those entries to be equal to fully cached ("L3 + LLC") entry.
These reserved and unspecified entries should not be used as they may be
changed to less performant variants with better coherency in the future
if more entries are needed.
v2: As suggested by Lucas De Marchi to utilise __init_mocs_table for
programming default value, setting I915_MOCS_PTE index of tgl_mocs_table
with desired value.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Mathew Alwin <alwin.mathew@intel.com>
Cc: Mcguire Russell W <russell.w.mcguire@intel.com>
Cc: Spruit Neil R <neil.r.spruit@intel.com>
Cc: Zhou Cheng <cheng.zhou@intel.com>
Cc: Benemelis Mike G <mike.g.benemelis@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729102539.134731-2-ayaz.siddiqui@intel.com
Cc: stable@vger.kernel.org
BOE panel with ID 2270 claims both PWM_PIN_CAP and AUX_SET_CAP backlight
control bits, but default chip backlight failed to control brightness.
Check AUX_SET_CAP and proceed to check quirks or VBT backlight type.
DPCD can control the brightness of this pannel.
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009085750.88490-1-aaron.ma@canonical.com
In commit 7994672309 ("drm/i915: Assume 100% brightness when not in
DPCD control mode"), we fixed the brightness level when DPCD control was
not active to max brightness. This is as good as we can guess since most
backlights go on full when uncontrolled.
However in doing so we changed the semantics of the initial
'backlight.enabled' value. At least on Pixelbooks, they were relying
on the brightness level in DP_EDP_BACKLIGHT_BRIGHTNESS_MSB to be 0 on
boot such that enabled would be false. This causes the device to be
enabled when the brightness is set. Without this, brightness control
doesn't work. So by changing brightness to max, we also flipped enabled
to be true on boot.
To fix this, make enabled a function of brightness and backlight control
mechanism.
Fixes: 7994672309 ("drm/i915: Assume 100% brightness when not in DPCD control mode")
Cc: Lyude Paul <lyude@redhat.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Kevin Chowski <chowski@chromium.org>>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918002845.32766-1-sean@poorly.run
When the number of potential color planes grew to 4 we stopped
setting all unused color plane offsets to ~0xfff. The code
still tries to do this, but actually does nothing since the
loop limits are bogus.
skl_check_main_surface() actually depends on this ~0xfff
behaviour as it will make sure to move the main surface
offset below the aux surface offset because the hardware
AUX_DIST must be a non-negative value [1], and for simplicity
it doesn't bother checking if the AUX plane is actually
needed or not. So currently it may end up shuffling the
main surface around based on some stale leftover AUX offset.
The skl+ plane code also just blindly calculates the AUX_DIST
whether or not the AUX plane is actually needed by the hw or
not, and that too will now potentially use some stale AUX
surface offset in the calculation. Would seem nicer to
guarantee a consistent non-negative AUX_DIST always.
So bring back the original ~0xfff offset behaviour for
unused color planes. Though it doesn't seem super likely
that this inconsistency would cause any real issues.
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Fixes: 2dfbf9d287 ("drm/i915/tgl: Gen-12 display can decompress surfaces compressed by the media engine")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201008101608.8652-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
(cherry picked from commit 79148ce4b2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The HDMI vs. not-HDMI check got inverted whem the bogus encoder->type
checks were eliminated. So now we're using 0 as the link rate on DP
and potentially non-zero on HDMI, which is exactly the opposite of
what we want. The original bogus check actually worked more correctly
by accident since if would always evaluate to true. Due to this we
now always use the RBR/HBR1 vswing table and never ever the HBR2+
vswing table. That is probably not a good way to get a high quality
signal at HBR2+ rates. Fix the check so we pick the right table.
Cc: stable@vger.kernel.org
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Fixes: 94641eb6c6 ("drm/i915/display: Fix the encoder type check")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930223642.28565-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
(cherry picked from commit 945b18fb48)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
- Make all debug object descriptors constant. There is no reason to have
them writeable.
- Free the per CPU object pool after CPU unplug to avoid memory waste.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+EMM8THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYobvLD/95P1eqk5Hku2uHeHowj9t4PVqk68Gq
71rJTj/x32JnGVsNFfwoqm0u8CueAitT1Id5aIjiMAGK2eH83StwdfqFjdHsitwl
4zH6qywjjgq86jOSXXUQ2TSyYjlL7l2Pr8jwEZkTXUrShZyHAOvD9oW3KwlyV88H
QXiTGFAm4aAaFctgGlbVafxtp1IJkS+FzFT1/cyPV064d3uGjX6maKwKjssO/8Bv
WLzPuyOKh7IoEMfyFEXw3Ks3GK3Fo9Rkm4+CNLxjWy7DX7tY2Xvu4kuPqQ0dyn31
zzxJHOESLOjtw8w0vSEKpuNyIfL/66/MF8p4CXzkUWQTa9h26yos+mDnSzxth++d
ZERCpx7udTG3BbteJ8ZEf3/QyH6kJw9RccDET180/CvhBTHELEsroy9qOfDMJkgt
RODtwh+2+O0zqUGjbqEd+PTmkU5p0UwIWAI9t7pic5Dntse7stRwotDtI0FGNnF1
aj/4ZkTb+lq3EF/x9xh6Hw/SfcVGtmySeMaPlaj9fgq39dtTPIspYeI5GieZlE0s
1ZfoRSgWA+/K9+4iTsuGvTwXnLhsRYVZl+GIkzeG2FL3euK0t+vaIvIbE9IDHdsS
mgEFaUXwIosHOfn+8f9KWl6YupBtT/4PyPVu5DseREl4up0W6s26aamM1Ml16A8x
JUNS/oa+cGpI4A==
=BP3+
-----END PGP SIGNATURE-----
Merge tag 'core-debugobjects-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects updates from Thomas Gleixner:
"A small set of updates for debug objects:
- Make all debug object descriptors constant. There is no reason to
have them writeable.
- Free the per CPU object pool after CPU unplug to avoid memory
waste"
* tag 'core-debugobjects-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debugobjects: Free per CPU pool after CPU unplug
treewide: Make all debug_obj_descriptors const
debugobjects: Allow debug_obj_descr to be const