Commit Graph

10183 Commits

Author SHA1 Message Date
Graham Sider
4056b03377 drm/amdkfd: replace kgd_dev in static gfx v10 funcs
Static funcs in amdgpu_amdkfd_gfx_v10.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:00 -05:00
Graham Sider
9a17c9b79b drm/amdkfd: replace kgd_dev in static gfx v9 funcs
Static funcs in amdgpu_amdkfd_gfx_v9.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:00 -05:00
Graham Sider
1cca608742 drm/amdkfd: replace kgd_dev in static gfx v8 funcs
Static funcs in amdgpu_amdkfd_gfx_v8.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:57:59 -05:00
Graham Sider
9365fbf3d7 drm/amdkfd: replace kgd_dev in static gfx v7 funcs
Static funcs in amdgpu_amdkfd_gfx_v7.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:57:59 -05:00
Christian König
fa78e367a2 drm/amdgpu: stop getting excl fence separately
Just grab all fences for the display flip in one go.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211028132630.2330-2-christian.koenig@amd.com
2021-11-17 14:32:27 +01:00
Linus Torvalds
304ac8032d drm next/fixes for 5.16-rc1
bridge:
 - HPD improvments for lt9611uxc
 - eDP aux-bus support for ps8640
 - LVDS data-mapping selection support
 
 ttm:
 - remove huge page functionality (needs reworking)
 - fix a race condition during BO eviction
 
 panels:
 - add some new panels
 
 fbdev:
 - fix double-free
 - remove unused scrolling acceleration
 - CONFIG_FB dep improvements
 
 locking:
 - improve contended locking logging
 - naming collision fix
 
 dma-buf:
 - add dma_resv_for_each_fence iterator
 - fix fence refcounting bug
 - name locking fixesA
 
 prime:
 - fix object references during mmap
 
 nouveau:
 - various code style changes
 - refcount fix
 - device removal fixes
 - protect client list with a mutex
 - fix CE0 address calculation
 
 i915:
 - DP rates related fixes
 - Revert disabling dual eDP that was causing state readout problems
 - put the cdclk vtables in const data
 - Fix DVO port type for older platforms
 - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
 - CCS FBs related fixes
 - Fix recursive lock in GuC submission
 - Revert guc_id from i915_request tracepoint
 - Build fix around dmabuf
 
 amdgpu:
 - GPU reset fix
 - Aldebaran fix
 - Yellow Carp fixes
 - DCN2.1 DMCUB fix
 - IOMMU regression fix for Picasso
 - DSC display fixes
 - BPC display calculation fixes
 - Other misc display fixes
 - Don't allow partial copy from user for DC debugfs
 - SRIOV fixes
 - GFX9 CSB pin count fix
 - Various IP version check fixes
 - DP 2.0 fixes
 - Limit DCN1 MPO fix to DCN1
 
 amdkfd:
 - SVM fixes
 - Fix gfx version for renoir
 - Reset fixes
 
 udl:
 - timeout fix
 
 imx:
 - circular locking fix
 
 virtio:
 - NULL ptr deref fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGN3YwACgkQDHTzWXnE
 hr6aZQ/+Pobf1VE7V3wPUcopxccJYmgBvG/uY8EDyjA8qaxHs2pQqGN2IooOGxr6
 F8G1N94Hem/PCDn3T8JI2Tqw5z4sy4UwLahEWISurFCen1IMAfA7hYfutp9X3O7X
 8h7b+PgkvVruEAHF7z0kqnWGPHmcro29cIHNkXVRjnJuz+Gmn1XRfo6Jj65n6D7u
 NfMeU4/lWRR3767oJQzTqyAYtGxsKaZT3/tBD5WggZBzEKC7hqhAl8EUoOLWwojo
 fDqwiEpLXpraPRIQH8trkXVHhzPeLAmG916WwS8JG3CEk9mUQ+I7Jshhd8cw+bsQ
 XPuk3OBfU9mtuiGgNzrLP3xXJZs/QN3EkpKZWLefTnJY+C4BgiP2RifTnghmwV31
 6/7Pr83CX/cn3BRd7r0xaeBZYvVYBZmwoZcsZFJBM8SVjd/ofKUfAmCzZZKheio2
 5qa6bj9DQoyjEoFAULh23plcX6hvATGP7wzfRTnJ9AlAJ0KyEjVJ3r0qE6jHMDc/
 uzcTAnKIWCxt9kSgE5qwLQtxLBaBpr/iOniZbCqGkPjiZeMzqP/ug1AKVP7kk39x
 FxZVT8ZOKk8Xt4iLZx8jmHi2KKheXYZi9LqieoTrJd44qMXDOmR9DCtQX9FZuWJS
 EJAlMj6sCowAZdODPZMVpoMc3Gti9nZ2Fpu7mLrRcMk1gKfjKwo=
 =qMNk
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm

Pull more drm updates from Dave Airlie:
 "I missed a drm-misc-next pull for the main pull last week. It wasn't
  that major and isn't the bulk of this at all. This has a bunch of
  fixes all over, a lot for amdgpu and i915.

  bridge:
   - HPD improvments for lt9611uxc
   - eDP aux-bus support for ps8640
   - LVDS data-mapping selection support

  ttm:
   - remove huge page functionality (needs reworking)
   - fix a race condition during BO eviction

  panels:
   - add some new panels

  fbdev:
   - fix double-free
   - remove unused scrolling acceleration
   - CONFIG_FB dep improvements

  locking:
   - improve contended locking logging
   - naming collision fix

  dma-buf:
   - add dma_resv_for_each_fence iterator
   - fix fence refcounting bug
   - name locking fixesA

  prime:
   - fix object references during mmap

  nouveau:
   - various code style changes
   - refcount fix
   - device removal fixes
   - protect client list with a mutex
   - fix CE0 address calculation

  i915:
   - DP rates related fixes
   - Revert disabling dual eDP that was causing state readout problems
   - put the cdclk vtables in const data
   - Fix DVO port type for older platforms
   - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
   - CCS FBs related fixes
   - Fix recursive lock in GuC submission
   - Revert guc_id from i915_request tracepoint
   - Build fix around dmabuf

  amdgpu:
   - GPU reset fix
   - Aldebaran fix
   - Yellow Carp fixes
   - DCN2.1 DMCUB fix
   - IOMMU regression fix for Picasso
   - DSC display fixes
   - BPC display calculation fixes
   - Other misc display fixes
   - Don't allow partial copy from user for DC debugfs
   - SRIOV fixes
   - GFX9 CSB pin count fix
   - Various IP version check fixes
   - DP 2.0 fixes
   - Limit DCN1 MPO fix to DCN1

  amdkfd:
   - SVM fixes
   - Fix gfx version for renoir
   - Reset fixes

  udl:
   - timeout fix

  imx:
   - circular locking fix

  virtio:
   - NULL ptr deref fix"

* tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm: (126 commits)
  drm/ttm: Double check mem_type of BO while eviction
  drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64)
  drm/amdgpu: drop jpeg IP initialization in SRIOV case
  drm/amd/display: reject both non-zero src_x and src_y only for DCN1x
  drm/amd/display: Add callbacks for DMUB HPD IRQ notifications
  drm/amd/display: Don't lock connection_mutex for DMUB HPD
  drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends
  drm/amdkfd: Fix retry fault drain race conditions
  drm/amdkfd: lower the VAs base offset to 8KB
  drm/amd/display: fix exit from amdgpu_dm_atomic_check() abruptly
  drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
  drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
  drm/i915/adlp/fb: Prevent the mapping of redundant trailing padding NULL pages
  drm/i915/fb: Fix rounding error in subsampled plane size calculation
  drm/i915/hdmi: Turn DP++ TMDS output buffers back on in encoder->shutdown()
  drm/locking: fix __stack_depot_* name conflict
  drm/virtio: Fix NULL dereference error in virtio_gpu_poll
  drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()
  drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs
  drm/amd/amdkfd: Don't sent command to HWS on kfd reset
  ...
2021-11-12 12:11:07 -08:00
Dave Airlie
951bad0bd9 Merge tag 'amd-drm-fixes-5.16-2021-11-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-fixes-5.16-2021-11-10:

amdgpu:
- Don't allow partial copy from user for DC debugfs
- SRIOV fixes
- GFX9 CSB pin count fix
- Various IP version check fixes
- DP 2.0 fixes
- Limit DCN1 MPO fix to DCN1

amdkfd:
- SVM fixes
- Reset fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110222536.7527-1-alexander.deucher@amd.com
2021-11-11 10:14:44 +10:00
Dave Airlie
f8ca7b7419 Removed the TTM Huge Page functionnality to address a crash, a timeout
fix for udl, CONFIG_FB dependency improvements, a fix for a circular
 locking depency in imx, a NULL pointer dereference fix for virtio, and a
 naming collision fix for drm/locking.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYYuA2wAKCRDj7w1vZxhR
 xQdZAP4oqiWpCO1JfnZdvEJ/lOULqvdzYkbUZexshGLdbb4ECwEA83TzIbQvXP8p
 jsC1hPNAIsOBkQ+nGZwJkTOtDcpEAQ4=
 =N2pK
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-fixes-2021-11-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

Removed the TTM Huge Page functionnality to address a crash, a timeout
fix for udl, CONFIG_FB dependency improvements, a fix for a circular
locking depency in imx, a NULL pointer dereference fix for virtio, and a
naming collision fix for drm/locking.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110082114.vfpkpnecwdfg27lk@gilmour
2021-11-11 08:14:19 +10:00
Guchun Chen
4d395f938a drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64)
Fixes: 96b8dd4423 ("drm/amdgpu/amdgpu_vcn: convert to IP version checking")
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-10 12:03:41 -05:00
Guchun Chen
b45a36032d drm/amdgpu: drop jpeg IP initialization in SRIOV case
Fixes: b05b9c591f ("drm/amdgpu: clean up set IP function")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-10 12:00:08 -05:00
shaoyunl
9f4f2c1a35 drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
The KFD pre_reset should be called before reset been executed, it will
hold the lock to prevent other rocm process to sent the packlage to hiq
during host execute the real reset on the HW

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
Evan Quan
4fc30ea780 drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
There was a change(below) target for such issue:
d82e2c249c ("drm/amdgpu: Fix crash on device remove/driver unload")
But the fix for VI ASICs was missing there. This is a supplement for
that.

Fixes: d82e2c249c ("drm/amdgpu: Fix crash on device remove/driver unload")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:06:15 -05:00
Alex Deucher
2d32ffd6e9 drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()
Properly handle SI DC support when CONFIG_DRM_AMD_DC_SI is not
set.

Fixes: f7f12b2582 ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:53 -04:00
Felix Kuehling
5702d05295 drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs
If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be
freed when the amdgpu_bo still lives on. Free the kfd_bo struct in the
release_notify callback then the amdgpu_bo is freed.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-By: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:45 -04:00
Evan Quan
e6ef9b396b drm/amdgpu: correctly toggle gfx on/off around RLC_SPM_* register access
As part of the ib padding process, accessing the RLC_SPM_* register may
trigger gfx hang. Since gfxoff may be already kicked during the whole period.
To address that, we manually toggle gfx on/off around the RLC_SPM_*
register access.

This can resolve the gfx hang issue observed on running Talos with RDP launched
in parallel.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:29 -04:00
Tao Zhou
7513c9ff44 drm/amdgpu: correct xgmi ras error count reset
The error count reset for xgmi3x16 pcs is missed.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:21 -04:00
YuBiao Wang
6ddc0eb7a2 drm/amd/amdgpu: Fix csb.bo pin_count leak on gfx 9
[Why]
csb bo is not unpinned in gfx 9. It will lead to pin_count leak on
driver unload.

[How]
Call bo_free_kernel corresponding to bo_create_kernel in
gfx_rlc_init_csb. This will also unify the code path with other gfx
versions.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:32 -04:00
YuBiao Wang
c4fc13b581 drm/amd/amdgpu: Avoid writing GMC registers under sriov in gmc9
[Why]
For Vega10, disabling gart of gfxhub could mess up KIQ and PSP
under sriov mode, and lead to DMAR on host side.

[How]
Skip writing GMC registers under sriov.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:20 -04:00
Kent Russell
7ef6b7f844 drm/amdgpu: Make sure to reserve BOs before adding or removing
BOs need to be reserved before they are added or removed, so ensure that
they are reserved during kfd_mem_attach and kfd_mem_detach

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:05 -04:00
Jason Gunthorpe
0d97950953 drm/ttm: remove ttm_bo_vm_insert_huge()
The huge page functionality in TTM does not work safely because PUD and
PMD entries do not have a special bit.

get_user_pages_fast() considers any page that passed pmd_huge() as
usable:

	if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
		     pmd_devmap(pmd))) {

And vmf_insert_pfn_pmd_prot() unconditionally sets

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));

eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE.

As such gup_huge_pmd() will try to deref a struct page:

	head = try_grab_compound_head(pmd_page(orig), refs, flags);

and thus crash.

Thomas further notices that the drivers are not expecting the struct page
to be used by anything - in particular the refcount incr above will cause
them to malfunction.

Thus everything about this is not able to fully work correctly considering
GUP_fast. Delete it entirely. It can return someday along with a proper
PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast.

Fixes: 314b6580ad ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
[danvet: Update subject per Thomas' &Christian's review]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com
2021-11-05 11:13:19 +01:00
Linus Torvalds
5c904c66ed Char/Misc driver update for 5.16-rc1
Here is the big set of char and misc and other tiny driver subsystem
 updates for 5.16-rc1.
 
 Loads of things in here, all of which have been in linux-next for a
 while with no reported problems (except for one called out below.)
 
 Included are:
 	- habanana labs driver updates, including dma_buf usage,
 	  reviewed and acked by the dma_buf maintainers
 	- iio driver update (going through this tree not staging as they
 	  really do not belong going through that tree anymore)
 	- counter driver updates
 	- hwmon driver updates that the counter drivers needed, acked by
 	  the hwmon maintainer
 	- xillybus driver updates
 	- binder driver updates
 	- extcon driver updates
 	- dma_buf module namespaces added (will cause a build error in
 	  arm64 for allmodconfig, but that change is on its way through
 	  the drm tree)
 	- lkdtm driver updates
 	- pvpanic driver updates
 	- phy driver updates
 	- virt acrn and nitr_enclaves driver updates
 	- smaller char and misc driver updates
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPX2A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymUUgCbB4EKysgLuXYdjUalZDx+vvZO4k0AniS14O4k
 F+2dVSZ5WX6wumUzCaA6
 =bXQM
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char and misc and other tiny driver subsystem
  updates for 5.16-rc1.

  Loads of things in here, all of which have been in linux-next for a
  while with no reported problems (except for one called out below.)

  Included are:

   - habanana labs driver updates, including dma_buf usage, reviewed and
     acked by the dma_buf maintainers

   - iio driver update (going through this tree not staging as they
     really do not belong going through that tree anymore)

   - counter driver updates

   - hwmon driver updates that the counter drivers needed, acked by the
     hwmon maintainer

   - xillybus driver updates

   - binder driver updates

   - extcon driver updates

   - dma_buf module namespaces added (will cause a build error in arm64
     for allmodconfig, but that change is on its way through the drm
     tree)

   - lkdtm driver updates

   - pvpanic driver updates

   - phy driver updates

   - virt acrn and nitr_enclaves driver updates

   - smaller char and misc driver updates"

* tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (386 commits)
  comedi: dt9812: fix DMA buffers on stack
  comedi: ni_usb6501: fix NULL-deref in command paths
  arm64: errata: Enable TRBE workaround for write to out-of-range address
  arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  coresight: trbe: Work around write to out of range
  coresight: trbe: Make sure we have enough space
  coresight: trbe: Add a helper to determine the minimum buffer size
  coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  coresight: trbe: Add infrastructure for Errata handling
  coresight: trbe: Allow driver to choose a different alignment
  coresight: trbe: Decouple buffer base from the hardware base
  coresight: trbe: Add a helper to pad a given buffer area
  coresight: trbe: Add a helper to calculate the trace generated
  coresight: trbe: Defer the probe on offline CPUs
  coresight: trbe: Fix incorrect access of the sink specific data
  coresight: etm4x: Add ETM PID for Kryo-5XX
  coresight: trbe: Prohibit trace before disabling TRBE
  coresight: trbe: End the AUX handle on truncation
  coresight: trbe: Do not truncate buffer on IRQ
  coresight: trbe: Fix handling of spurious interrupts
  ...
2021-11-04 08:21:47 -07:00
James Zhu
93cec18478 drm/amdgpu: remove duplicated kfd_resume_iommu
Remove duplicated kfd_resume_iommu which already runs
in mdgpu_amdkfd_device_init.

Tested-By: Ken Moffat <zarniwhoop@ntlworld.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:08 -04:00
Aaron Liu
e8a423c589 drm/amdgpu: update RLC_PG_DELAY_3 Value to 200us for yellow carp
For yellow carp, the desired CGPG hysteresis value is 0x4E20.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:08 -04:00
Mario Limonciello
c92f909614 drm/amdgpu: Convert SMU version to decimal in debugfs
This is more useful when talking to the SMU team to have the information
in this format, save one less step to manually do it.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:07 -04:00
Oak Zeng
72c148d776 drm/amdgpu: use correct register mask to extract field
Aldebaran has different register mask definitions for
regiter MC_VM_XGMI_LFB_CNTL. Use the correct masks
to interpret fields of this register.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:07 -04:00
Jingwen Chen
38d4e4638e drm/amd/amdgpu: fix bad job hw_fence use after free in advance tdr
[Why]
In advance tdr mode, the real bad job will be resubmitted twice, while
in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job
is put one more time than other jobs.

[How]
Adding dma_fence_get before resbumit job in
amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:07 -04:00
Linus Torvalds
56d3375448 drm for 5.16-rc1
core:
 - improve dma_fence, lease and resv documentation
 - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin
 - sched fixes/improvements
 - allow empty drm leases
 - add dma resv iterator
 - add more DP 2.0 headers
 - DP MST helper improvements for DP2.0
 
 dma-buf:
 - avoid warnings, remove fence trace macros
 
 bridge:
 - new helper to get rid of panels
 - probe improvements for it66121
 - enable DSI EOTP for anx7625
 
 fbdev:
 - efifb: release runtime PM on destroy
 
 ttm:
 - kerneldoc switch
 - helper to clear all DMA mappings
 - pool shrinker optimizaton
 - remove ttm_tt_destroy_common
 - update ttm_move_memcpy for async use
 
 panel:
 - add new panel-edp driver
 
 amdgpu:
  - Initial DP 2.0 support
  - Initial USB4 DP tunnelling support
  - Aldebaran MCE support
  - Modifier support for DCC image stores for GFX 10.3
  - Display rework for better FP code handling
  - Yellow Carp/Cyan Skillfish updates
  - Cyan Skillfish display support
  - convert vega/navi to IP discovery asic enumeration
  - validate IP discovery table
  - RAS improvements
  - Lots of fixes
 
  i915:
  - DG1 PCI IDs + LMEM discovery/placement
  - DG1 GuC submission by default
  - ADL-S PCI IDs updated + enabled by default
  - ADL-P (XE_LPD) fixed and updates
  - DG2 display fixes
  - PXP protected object support for Gen12 integrated
  - expose multi-LRC submission interface for GuC
  - export logical engine instance to user
  - Disable engine bonding on Gen12+
  - PSR cleanup
  - PSR2 selective fetch by default
  - DP 2.0 prep work
  - VESA vendor block + MSO use of it
  - FBC refactor
  - try again to fix fast-narrow vs slow-wide eDP training
  - use THP when IOMMU enabled
  - LMEM backup/restore for suspend/resume
  - locking simplification
  - GuC major reworking
  - async flip VT-D workaround changes
  - DP link training improvements
  - misc display refactorings
 
 bochs:
 - new PCI ID
 
 rcar-du:
 - Non-contiguious buffer import support for rcar-du
 - r8a779a0 support prep
 
 omapdrm:
 - COMPILE_TEST fixes
 
 sti:
 - COMPILE_TEST fixes
 
 msm:
 - fence ordering improvements
 - eDP support in DP sub-driver
 - dpu irq handling cleanup
 - CRC support for making igt happy
 - NO_CONNECTOR bridge support
 - dsi: 14nm phy support for msm8953
 - mdp5: msm8x53, sdm450, sdm632 support
 
 stm:
 - layer alpha + zpo support
 
 v3d:
 - fix Vulkan CTS failure
 - support multiple sync objects
 
 gud:
 - add R8/RGB332/RGB888 pixel formats
 
 vc4:
 - convert to new bridge helpers
 
 vgem:
 - use shmem helpers
 
 virtio:
 - support mapping exported vram
 
 zte:
 - remove obsolete driver
 
 rockchip:
 - use bridge attach no connector for LVDS/RGB
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGByPYACgkQDHTzWXnE
 hr6fxA//cXUvTHlEtF7UJDBRAYv+9lXH39NbGYU4aLJuBNlZztCuUi5JOSyDFDH1
 N9VI5biVseev2PEnCzJUubWxTqbUO7FBQTw0TyvZ4Eqn+UZMuFeo0dvdKZRAkvjV
 VHSUc0fm0+WSYanKUK7XK0fwG8aE6JVyYngzgKPSjifhszTdiiRsbU21iTinFhkS
 rgh3HEVELp+LqfoG4qzAYqFUjYqUjvCjd/hX/UkzCII8ZXKr38/4127e95443WOk
 +jes0gWGJe9TvSDrqo9TMx4qukcOniINFUvnzoD2RhOS+Jzr/i5rBh51Xy92g3NO
 Q7hy6byZdk/ZO/MXCDQ2giUOkBiqn5fQjlRGQp4iAZYw9pb3HU+/xrTq0BWVWd8o
 /vmzZYEKKU/sCGpxVDMZxsHV3mXIuVBvuZq6bjmSGcybgOBCiDx5F/Rum4nY2yHp
 lr3cuc0HP3m3f4b/HVvACO4tGd1nDDpVcon7CuhBB7HB7t6Zl9u18qc/qFw0tCTh
 3sgAhno6XFXtPFcSX2KAeeg0mhKDKKrsOnq5y3bDRr05Z0jLocJk95aXEKs6em4j
 gbyHwNaX3CHtiCnFn2/5169+n1K7zqHBtVSGmQlmFDv55rcdx7L3Spk7tCahQeSQ
 ur24r+sEggm8d5Wjl+MYq6wW3oP31s04JFaeV6oCkaSp1wS+alg=
 =jdhH
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Summary below. i915 starts to add support for DG2 GPUs, enables DG1
  and ADL-S support by default, lots of work to enable DisplayPort 2.0
  across drivers. Lots of documentation updates and fixes across the
  board.

  core:
   - improve dma_fence, lease and resv documentation
   - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin
   - sched fixes/improvements
   - allow empty drm leases
   - add dma resv iterator
   - add more DP 2.0 headers
   - DP MST helper improvements for DP2.0

  dma-buf:
   - avoid warnings, remove fence trace macros

  bridge:
   - new helper to get rid of panels
   - probe improvements for it66121
   - enable DSI EOTP for anx7625

  fbdev:
   - efifb: release runtime PM on destroy

  ttm:
   - kerneldoc switch
   - helper to clear all DMA mappings
   - pool shrinker optimizaton
   - remove ttm_tt_destroy_common
   - update ttm_move_memcpy for async use

  panel:
   - add new panel-edp driver

  amdgpu:
   - Initial DP 2.0 support
   - Initial USB4 DP tunnelling support
   - Aldebaran MCE support
   - Modifier support for DCC image stores for GFX 10.3
   - Display rework for better FP code handling
   - Yellow Carp/Cyan Skillfish updates
   - Cyan Skillfish display support
   - convert vega/navi to IP discovery asic enumeration
   - validate IP discovery table
   - RAS improvements
   - Lots of fixes

  i915:
   - DG1 PCI IDs + LMEM discovery/placement
   - DG1 GuC submission by default
   - ADL-S PCI IDs updated + enabled by default
   - ADL-P (XE_LPD) fixed and updates
   - DG2 display fixes
   - PXP protected object support for Gen12 integrated
   - expose multi-LRC submission interface for GuC
   - export logical engine instance to user
   - Disable engine bonding on Gen12+
   - PSR cleanup
   - PSR2 selective fetch by default
   - DP 2.0 prep work
   - VESA vendor block + MSO use of it
   - FBC refactor
   - try again to fix fast-narrow vs slow-wide eDP training
   - use THP when IOMMU enabled
   - LMEM backup/restore for suspend/resume
   - locking simplification
   - GuC major reworking
   - async flip VT-D workaround changes
   - DP link training improvements
   - misc display refactorings

  bochs:
   - new PCI ID

  rcar-du:
   - Non-contiguious buffer import support for rcar-du
   - r8a779a0 support prep

  omapdrm:
   - COMPILE_TEST fixes

  sti:
   - COMPILE_TEST fixes

  msm:
   - fence ordering improvements
   - eDP support in DP sub-driver
   - dpu irq handling cleanup
   - CRC support for making igt happy
   - NO_CONNECTOR bridge support
   - dsi: 14nm phy support for msm8953
   - mdp5: msm8x53, sdm450, sdm632 support

  stm:
   - layer alpha + zpo support

  v3d:
   - fix Vulkan CTS failure
   - support multiple sync objects

  gud:
   - add R8/RGB332/RGB888 pixel formats

  vc4:
   - convert to new bridge helpers

  vgem:
   - use shmem helpers

  virtio:
   - support mapping exported vram

  zte:
   - remove obsolete driver

  rockchip:
   - use bridge attach no connector for LVDS/RGB"

* tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm: (1259 commits)
  drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
  drm/amd/display: MST support for DPIA
  drm/amdgpu: Fix even more out of bound writes from debugfs
  drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts
  drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts
  drm/amdgpu/UAPI: rearrange header to better align related items
  drm/amd/display: Enable dpia in dmub only for DCN31 B0
  drm/amd/display: Fix USB4 hot plug crash issue
  drm/amd/display: Fix deadlock when falling back to v2 from v3
  drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
  drm/amd/display: move FPU associated DCN301 code to DML folder
  drm/amd/display: fix link training regression for 1 or 2 lane
  drm/amd/display: add two lane settings training options
  drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings
  drm/amd/display: implement decide lane settings
  drm/amd/display: adopt DP2.0 LT SCR revision 8
  drm/amd/display: FEC configuration for dpia links in MST mode
  drm/amd/display: FEC configuration for dpia links
  drm/amd/display: Add workaround flag for EDID read on certain docks
  drm/amd/display: Set phy_mux_sel bit in dmub scratch register
  ...
2021-11-02 16:47:49 -07:00
Linus Torvalds
6e5772c8d9 Add an interface called cc_platform_has() which is supposed to be used
by confidential computing solutions to query different aspects of the
 system. The intent behind it is to unify testing of such aspects instead
 of having each confidential computing solution add its own set of tests
 to code paths in the kernel, leading to an unwieldy mess.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF/uLUACgkQEsHwGGHe
 VUqGbQ/+LOmz8hmL5vtbXw/lVonCSBRKI2KVefnN2VtQ3rjtCq8HlNoq/hAdi15O
 WntABFV8u4daNAcssp+H/p+c8Mt/NzQa60TRooC5ZIynSOCj4oZQxTWjcnR4Qxrf
 oABy4sp09zNW31qExtTVTwPC/Ejzv4hA0Vqt9TLQOSxp7oYVYKeDJNp79VJK64Yz
 Ky7epgg8Pauk0tAT76ATR4kyy9PLGe4/Ry0bOtAptO4NShL1RyRgI0ywUmptJHSw
 FV/MnoexdAs4V8+4zPwyOkf8YMDnhbJcvFcr7Yd9AEz2q9Z1wKCgi1M3aZIoW8lV
 YMXECMGe9DfxmEJbnP5zbnL6eF32x+tbq+fK8Ye4V2fBucpWd27zkcTXjoP+Y+zH
 NLg+9QykR9QCH75YCOXcAg1Q5hSmc4DaWuJymKjT+W7MKs89ywjq+ybIBpLBHbQe
 uN9FM/CEKXx8nQwpNQc7mdUE5sZeCQ875028RaLbLx3/b6uwT6rBlNJfxl/uxmcZ
 iF1kG7Cx4uO+7G1a9EWgxtWiJQ8GiZO7PMCqEdwIymLIrlNksAk7nX2SXTuH5jIZ
 YDuBj/Xz2UUVWYFm88fV5c4ogiFlm9Jeo140Zua/BPdDJd2VOP013rYxzFE/rVSF
 SM2riJxCxkva8Fb+8TNiH42AMhPMSpUt1Nmd1H2rcEABRiT83Ow=
 =Na0U
 -----END PGP SIGNATURE-----

Merge tag 'x86_cc_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull generic confidential computing updates from Borislav Petkov:
 "Add an interface called cc_platform_has() which is supposed to be used
  by confidential computing solutions to query different aspects of the
  system.

  The intent behind it is to unify testing of such aspects instead of
  having each confidential computing solution add its own set of tests
  to code paths in the kernel, leading to an unwieldy mess"

* tag 'x86_cc_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  treewide: Replace the use of mem_encrypt_active() with cc_platform_has()
  x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
  x86/sev: Replace occurrences of sev_active() with cc_platform_has()
  x86/sme: Replace occurrences of sme_active() with cc_platform_has()
  powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
  x86/sev: Add an x86 version of cc_platform_has()
  arch/cc: Introduce a function to check for confidential computing features
  x86/ioremap: Selectively build arch override encryption functions
2021-11-01 15:16:52 -07:00
Alex Deucher
403475be6d drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
The DMA mask on SI parts is 40 bits not 44.  Copy
paste typo.

Fixes: 244511f386 ("drm/amdgpu: simplify and cleanup setting the dma mask")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:27:00 -04:00
Alex Deucher
58f8c7fa88 drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts
Add secondary instance version info for soc15 parts.

Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Alex Deucher
074b2092d9 drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts
Add secondary instance version info for vega20, arcturure, and
aldebaran.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Tao Zhou
3d1a8d950d drm/amdgpu: remove GPRs init for ALDEBARAN in gpu reset (v3)
Remove GPRs init for ALDEBARAN in gpu reset temporarily, will add the init once the
algorithm is stable.

v2: Only remove GPRs init in gpu reset.
v3: Suspend needs it, only skip it in gpu reset.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:13 -04:00
Tao Zhou
f7e053435c drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN
Skip GPRs init in specific condition since current GPRs init algorithm only works for some CU settings.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:13 -04:00
Candice Li
4320e6f86d drm/amdgpu: Update TA version output in driver
TA version should only be displayed in firmware version column.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Lang Yu
a5c5d8d50e drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()
amdgpu_fence_driver_sw_fini() should be executed before
amdgpu_device_ip_fini(), otherwise fence driver resource
won't be properly freed as adev->rings have been tore down.

Fixes: 72c8c97b15 ("drm/amdgpu: Split amdgpu_device_fini into early and late")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Lang Yu
68df0f195a drm/amdkfd: Separate pinned BOs destruction from general routine
Currently, all kfd BOs use same destruction routine. But pinned
BOs are not unpinned properly. Separate them from general routine.

v2 (Felix):
Add safeguard to prevent user space from freeing signal BO.
Kunmap signal BO in the event of setting event page error.
Just kunmap signal BO to avoid duplicating the code.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Philip Yang
3b8a23ae52 drm/amdkfd: restore userptr ignore bad address error
The userptr can be unmapped by application and still registered to
driver, restore userptr work return user pages will get -EFAULT bad
address error. Pretend this error as succeed. GPU access this userptr
will have VM fault later, it is better than application soft hangs with
stalled user mode queues.

v2: squash in warning fix (Alex)

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Kent Russell
68daadf3d6 drm/amdgpu: Add kernel parameter support for ignoring bad page threshold
When a GPU hits the bad_page_threshold, it will not be initialized by
the amdgpu driver. This means that the table cannot be cleared, nor can
information gathering be performed (getting serial number, BDF, etc).

If the bad_page_threshold kernel parameter is set to -2,
continue to initialize the GPU, while printing a warning to dmesg that
this action has been done

v2: squash in Luben's fix to restore RAS info reporting

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mukul Joshi <Mukul.Joshi@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Kent Russell
8483fdfea7 drm/amdgpu: Warn when bad pages approaches 90% threshold
dmesg doesn't warn when the number of bad pages approaches the
threshold for page retirement. WARN when the number of bad pages
is at 90% or greater for easier checks and planning, instead of waiting
until the GPU is full of bad pages.

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mukul Joshi <Mukul.Joshi@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Dave Airlie
970eae1560 Linux 5.15-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmF298ceHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGIJYH/1rsEFQQ6caeQdy1
 z9eFIe48DNM4l7bFk+qEj2UAbzPdahVJ299Mg5fW0n2CDemOc9/n0b9TxQ37YObi
 mOzu0xwJVupIxkyFMPQSSc2q8aLm67NSpJy08DsmaNses5hSvu8x15RPHLQTybjt
 SwtKns+jpCq79P1GWbrB5e5UkLb0VNoxNp4L1U4pMrYGcEkJUXbaxNY2V/JcXdM7
 Vtn+qN0T/J6V6QVftv0t8Ecj3bjEnmL3kZHaTaNg3dGeKRpCGyHc5lcBQ0cNFG6t
 vjZ9VbuhBzGI3TN2tHH5hpA1UXo7HPBBCwQqxF1jeGLGHULikYwZ3TAPWqL3QZqC
 9cxr9SY=
 =p75d
 -----END PGP SIGNATURE-----

BackMerge tag 'v5.15-rc7' into drm-next

The msm next tree is based on rc3, so let's just backmerge rc7 before pulling it in.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-10-28 14:59:38 +10:00
Maxime Ripard
736638246e
Merge drm/drm-next into drm-misc-next
drm-misc-next hasn't been updated in a while and I need a post -rc2
state to merge some vc4 patches.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-10-25 15:27:56 +02:00
Greg Kroah-Hartman
16b0314aa7 dma-buf: move dma-buf symbols into the DMA_BUF module namespace
In order to better track where in the kernel the dma-buf code is used,
put the symbols in the namespace DMA_BUF and modify all users of the
symbols to properly import the namespace to not break the build at the
same time.

Now the output of modinfo shows the use of these symbols, making it
easier to watch for users over time:

$ modinfo drivers/misc/fastrpc.ko | grep import
import_ns:      DMA_BUF

Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20211010124628.17691-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-25 14:53:08 +02:00
Alex Deucher
df9feb1a69 drm/amdgpu/nbio7.4: use original HDP_FLUSH bits
The extended bits were not available for use on vega20 and
presumably arcturus as well.

Fixes: a0f9f85466 ("drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12")
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-22 10:11:41 -04:00
Christian König
25b8a14e88 drm/amdgpu: use new iterator in amdgpu_ttm_bo_eviction_valuable
Simplifying the code a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-13-christian.koenig@amd.com
2021-10-22 14:41:59 +02:00
Christian König
930ca2a7cb drm/amdgpu: use the new iterator in amdgpu_sync_resv
Simplifying the code a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-12-christian.koenig@amd.com
2021-10-22 14:41:07 +02:00
Alex Deucher
8cbc52c207 drm/amdgpu: Workaround harvesting info for some navy flounder boards
Some navy flounder boards do not properly mark harvested
VCN instances.  Fix that here.

v2: use IP versions

Fixes: 1b592d00b4 ("drm/amdgpu/vcn: remove manual instance setting")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1743
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:39:00 -04:00
Alex Deucher
47be978be0 drm/amdgpu/vcn3.0: remove intermediate variable
No need to use the id variable, just use the constant
plus instance offset directly.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:57 -04:00
Alex Deucher
7876c7ea14 drm/amdgpu/vcn2.0: remove intermediate variable
No need to use the tmp variable, just use the constant
directly.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:53 -04:00
Alex Deucher
c5dd5667f4 drm/amdgpu: Consolidate VCN firmware setup code
Roughly the same code was present in all VCN versions.
Consolidate it into a single function.

v2: use AMDGPU_UCODE_ID_VCN + i, check if num_inst >= 2

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
2021-10-21 23:38:46 -04:00
Alex Deucher
e8ac9e93b4 drm/amdgpu/vcn3.0: handle harvesting in firmware setup
Only enable firmware for the instance that is enabled.

v2: use AMDGPU_UCODE_ID_VCN + i

Fixes: 1b592d00b4 ("drm/amdgpu/vcn: remove manual instance setting")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1743
Reviewed-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:41 -04:00
Jingwen Chen
e77f0f5c6a drm/amd/amdgpu: add dummy_page_addr to sriov msg
Add dummy_page_addr to sriov msg for host driver to set
GCVM_L2_PROTECTION_DEFAULT_ADDR* registers correctly.

v2:
should update vf2pf msg instead

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:16 -04:00
Huang Rui
a61794bd2f drm/amdgpu: remove grbm cam index/data operations for gfx v10
PSP firmware will be responsible for applying the GRBM CAM remapping in
the production. And the GRBM_CAM_INDEX / GRBM_CAM_DATA registers will be
protected by PSP under security policy. So remove it according to the
new security policy.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:10 -04:00
Aaron Liu
53c2ff8bcb drm/amdgpu: support B0&B1 external revision id for yellow carp
B0 internal rev_id is 0x01, B1 internal rev_id is 0x02.
The external rev_id for B0 and B1 is 0x20.
The original expression is not suitable for B1.

v2: squash in fix for display code (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:27:31 -04:00
Kent Russell
dcd5ea9f94 drm/amdgpu: Clarify error when hitting bad page threshold
Change the error message when the bad_page_threshold is reached,
explicitly stating that the GPU will not be initialized.

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mukul Joshi <Mukul.Joshi@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher
0d055f09e1 drm/amdgpu: drop navi reg init functions
No longer used since IP enumeration is driven by the IP
discovery table now.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher
bf99b9b032 drm/amdgpu: drop nv_set_ip_blocks()
No longer used since IP enumeration is now driven by
amdgpu IP discovery code.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher
7092432e3c drm/amdgpu: drop soc15_set_ip_blocks()
No longer used since IP enumeration is now driven by
amdgpu IP discovery code.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher
c9c7d18045 drm/amdgpu/gfx10: fix typo in gfx_v10_0_update_gfx_clock_gating()
Check was incorrectly converted to IP version checking.

Fixes: 4b0ad84254 ("drm/amdgpu/gfx10: convert to IP version checking")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:56 -04:00
Qing Wang
40320159f0 drm/amdgpu: replace snprintf in show functions with sysfs_emit
show() must not use snprintf() when formatting the value to be
returned to user space.

Fix the following coccicheck warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:427:
WARNING: use scnprintf or sprintf.

Signed-off-by: Qing Wang <wangqing@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:56 -04:00
Aaron Liu
5efacdf072 drm/amdgpu: support B0&B1 external revision id for yellow carp
B0 internal rev_id is 0x01, B1 internal rev_id is 0x02.
The external rev_id for B0 and B1 is 0x20.
The original expression is not suitable for B1.

v2: squash in fix for display code (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:56 -04:00
Christian König
a0a8e75948 drm/amdgpu: use new iterator in amdgpu_vm_prt_fini
No need to actually allocate an array of fences here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-14-christian.koenig@amd.com
2021-10-20 14:06:35 +02:00
Guchun Chen
dac35c4239 drm/amdgpu/discovery: parse hw_id_name for SDMA instance 2 and 3
Otherwise, hw_id_name string is NULL for SDMA 2 and 3 when dumping
ip version from VBIOS.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:32:52 -04:00
Tao Zhou
42f88ab772 drm/amdgpu: output warning for unsupported ras error inject (v2)
Output a warning message if RAS TA returns UNSUPPORTED_ERROR_INJ status.

v2: implement it in psp_ras_ta_check_status function.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:32:52 -04:00
Tao Zhou
1b5254e8d9 drm/amdgpu: centralize checking for RAS TA status
Create new function to check status returned by RAS TA.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:32:52 -04:00
YuBiao Wang
a3848df60b drm/amd/amdgpu: Do irq_fini_hw after ip_fini_early
[Why]
drm_irq_uninstall is called in irq_fini_hw so that irq is disabled in sw
stage. SMU (and maybe other IP blocks) fini_hw will call irq_put for
cleanup and the whole cleanup process will be skipped because of
drm->irq_enable = false.

[How]
Move ip_fini_early before irq_fini_hw.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:16:50 -04:00
Tao Zhou
c72942c167 drm/amdgpu: load PSP RL in resume path
Some registers' access will fail without PSP RL after resume.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:14:33 -04:00
Lang Yu
5aeeac6fa3 drm/amdkfd: Fix an inappropriate error handling in allloc memory of gpu
We should unreference a gem object instead of an amdgpu bo here.

Fixes: fd9a9f8801 ("drm/amdgpu: Use GEM obj reference for KFD BOs")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:13:48 -04:00
Alex Deucher
48737ac4d7 drm/amdgpu/psp: add some missing cases to psp_check_pmfw_centralized_cstate_management
Missed a few asics.

v2: update comment

Fixes: 82d05736c4 ("drm/amdgpu/amdgpu_psp: convert to IP version checking")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 22:51:41 -04:00
Philip Yang
43fc10c187 drm/amdkfd: unregistered svm range not overlap with TTM range
When creating unregistered new svm range to recover retry fault, avoid
new svm range to overlap with ranges or userptr ranges managed by TTM,
otherwise svm migration will trigger TTM or userptr eviction, to evict
user queues unexpectedly.

Change helper amdgpu_ttm_tt_affect_userptr to return userptr which is
inside the range. Add helper svm_range_check_vm_userptr to scan all
userptr of the vm, and return overlap userptr bo start, last.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 22:20:13 -04:00
Yifan Zhang
afd18180c0 drm/amdkfd: fix boot failure when iommu is disabled in Picasso.
When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2
init will fail. But this failure should not block amdgpu driver init.

Reported-by: youling <youling257@gmail.com>
Tested-by: youling <youling257@gmail.com>
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:15:52 -04:00
Mukul Joshi
91a1a52d03 drm/amdgpu: Fix RAS page retirement with mode2 reset on Aldebaran
During mode2 reset, the GPU is temporarily removed from the
mgpu_info list. As a result, page retirement fails because it
cannot find the GPU in the GPU list.
To fix this, create our own list of GPUs that support MCE notifier
based page retirement and use that list to check if the UMC error
occurred on a GPU that supports MCE notifier based page retirement.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:48 -04:00
Mukul Joshi
a4967a1ebf drm/amdgpu: Enable RAS error injection after mode2 reset on Aldebaran
Add the missing call to re-enable RAS error injections on the Aldebaran
mode2 reset code path.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:35 -04:00
Lang Yu
fe04957e26 drm/amdgpu: enable display for cyan skillfish
Display support for cyan skillfish is ready now. Enable it!

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:34 -04:00
Alex Deucher
369b7d04ba drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit 12
It's used internally by firmware.  Using it in the driver
could conflict with firmware.

v2: squash in fix for navi1x (Alex)

Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:34 -04:00
Alex Deucher
a0f9f85466 drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12
It's used internally by firmware.  Using it in the driver
could conflict with firmware.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-11 11:57:00 -04:00
Dave Airlie
797d72ce8e drm-misc-next for v5.16:
UAPI Changes:
 - Allow empty drm leases for creating separate GEM namespaces.
 
 Cross-subsystem Changes:
 - Slightly rework dma_buf_poll.
 - Add dma_resv_for_each_fence_unlocked to iterate, and use it inside
   the lockless dma-resv functions.
 
 Core Changes:
 - Allow devm_drm_of_get_bridge to build without CONFIG_OF for compile testing.
 - Add more DP2 headers.
 - fix CONFIG_FB dependency in fb_helper.
 - Add DRM_FORMAT_R8 to drm_format_info, and helpers for RGB332 and RGB888.
 - Fix crash on a 0 or invalid EDID.
 
 Driver Changes:
 - Apply and revert DRM_MODESET_LOCK_ALL_BEGIN.
 - Add mode_valid to ti-sn65dsi86 bridge.
 - Support multiple syncobjs in v3d.
 - Add R8, RGB332 and RGB888 pixel formats to GUD.
 - Use devm_add_action_or_reset in dw-hdmi-cec.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmFdfuwACgkQ/lWMcqZw
 E8OTgg/+Nmsqhj1tsbSCWF1yx81CXHVSOhExPaMl+GPs6+y+sZ+U2rN99dnbULvA
 U56eOmjc8FvgmK89BwhSYNt++QYIRRpzjBGlCYm4bwpgqFOmYsK+en35PYMwHdxM
 Ke8newhzqa6/detvjX52igddZzrBv1Cs8aXuV5rw7Dg0ivlSlQUV0MO8JYwCliWI
 arRT8bg7wzUzhyRZqwqOqKXjvRirqBlFjJmvfL0WgHevZbzYuXbn4eWCUgCVthMH
 pU9QgK6FMW912pBxVppDO2aTDmNvqwj1BsB3RFfRuqS/JJ4s/gf39JxsipnI+/qn
 kPxZVFzzonR8Nl6h9sPi1jZrcVDCBebFgyG8hSgIVb/09U7AVYomtP18VKeh8yCy
 Pp4iQINqOcyMPmXKF491LIL92dcXZAIRaRQFKc/ZSHcfIDA7ZB1+7zf1ixBjlxjP
 GqtjLbmPspI2DzBRlTFEdf58jvX70E5nFYdQyYcy3VprJHuqEgL5PKz2Xcnve6R0
 dEkGA2vMrGtb23YyjbFTNfkdvg9WYXze9HbQLt7kc8mI77TugkG0/rCcwv5pEEu3
 WSwqMeb+5H+7va4AI715MoXbxgnCba2zPTUm1s8kSqTK0Oighc/vWcnnJ4iVuEGE
 8Xt8AIIYUtccufR6ujucVUh7nju2ZOnFE7S92LybnGnByAIADfM=
 =qxpr
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-10-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.16:

UAPI Changes:
- Allow empty drm leases for creating separate GEM namespaces.

Cross-subsystem Changes:
- Slightly rework dma_buf_poll.
- Add dma_resv_for_each_fence_unlocked to iterate, and use it inside
  the lockless dma-resv functions.

Core Changes:
- Allow devm_drm_of_get_bridge to build without CONFIG_OF for compile testing.
- Add more DP2 headers.
- fix CONFIG_FB dependency in fb_helper.
- Add DRM_FORMAT_R8 to drm_format_info, and helpers for RGB332 and RGB888.
- Fix crash on a 0 or invalid EDID.

Driver Changes:
- Apply and revert DRM_MODESET_LOCK_ALL_BEGIN.
- Add mode_valid to ti-sn65dsi86 bridge.
- Support multiple syncobjs in v3d.
- Add R8, RGB332 and RGB888 pixel formats to GUD.
- Use devm_add_action_or_reset in dw-hdmi-cec.

Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Wed 06 Oct 2021 20:48:12 AEST
# gpg:                using RSA key B97BD6A80CAC4981091AE547FE558C72A67013C3
# gpg: Good signature from "Maarten Lankhorst <maarten.lankhorst@linux.intel.com>" [expired]
# gpg:                 aka "Maarten Lankhorst <maarten@debian.org>" [expired]
# gpg:                 aka "Maarten Lankhorst <maarten.lankhorst@canonical.com>" [expired]
# gpg: Note: This key has expired!
# Primary key fingerprint: B97B D6A8 0CAC 4981 091A  E547 FE55 8C72 A670 13C3
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2602f4e9-a8ac-83f8-6c2a-39fd9ca2e1ba@linux.intel.com
2021-10-11 12:39:15 +10:00
Guchun Chen
c58a863b1c drm/amdgpu: use adev_to_drm for consistency when accessing drm_device
adev_to_drm is used everywhere, so improve recent changes
when accessing drm_device pointer from amdgpu_device.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-08 13:22:13 -04:00
Alex Deucher
35bdf463de drm/amdgpu: add missing case for HDP for renoir
Missing 4.1.2.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-08 13:20:13 -04:00
Alex Deucher
73bf66712d drm/amdgpu/discovery: add missing case for SMU 11.0.5
Was missed when converting the driver over to IP based
initialization.

Tested-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-08 13:20:13 -04:00
Nirmoy Das
58144d2837 drm/amdgpu: unify BO evicting method in amdgpu_ttm
Unify BO evicting functionality for possible memory
types in amdgpu_ttm.c.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-07 11:55:46 -04:00
Mukul Joshi
12b2cab790 drm/amdgpu: Register MCE notifier for Aldebaran RAS
On Aldebaran, GPU driver will handle bad page retirement
for GPU memory even though UMC is host managed. As a result,
register a bad page retirement handler on the mce notifier
chain to retire bad pages on Aldebaran.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-06 15:53:38 -04:00
Nirmoy Das
5b9581df9f drm/amdgpu: return early if debugfs is not initialized
Check first if debugfs is initialized before creating
amdgpu debugfs files.

References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-06 15:53:14 -04:00
Marek Olšák
f2e7d85680 drm/amd/display: fix DCC settings for DCN3
ind_block_64b_no_128bcl means INDEP_64B && INDEP_128B &&
MAX_COMPRESSED_BLOCK_SIZE == 64B. Only used by gfx10.3.

ind_block_64b means INDEP_64B && !INDEP_128B &&
MAX_COMPRESSED_BLOCK_SIZE == 64B. Only used by gfx9 and gfx10.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-06 15:50:43 -04:00
Guchun Chen
248b061689 drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.

To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.

Fixes: c9a6b82f45 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 13:02:31 -04:00
Yifan Zhang
714d9e4574 drm/amdgpu: init iommu after amdkfd device init
This patch is to fix clinfo failure in Raven/Picasso:

Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.2 AMD-APP (3364.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing Number of devices: 0

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 13:02:20 -04:00
Guchun Chen
e17e27f9bd drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.

To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.

Fixes: c9a6b82f45 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:23:56 -04:00
Christian König
127aedf979 drm/amdgpu: print warning and taint kernel if lockup timeout is disabled
Make sure that we notice this in error reports.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:23:42 -04:00
Christian König
c8365dbda0 drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"
This reverts commit 728e7e0cd6.

Further discussion reveals that this feature is severely broken
and needs to be reverted ASAP.

GPU reset can never be delayed by userspace even for debugging or
otherwise we can run into in kernel deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:36 -04:00
Yifan Zhang
286826d7d9 drm/amdgpu: init iommu after amdkfd device init
This patch is to fix clinfo failure in Raven/Picasso:

Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.2 AMD-APP (3364.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing Number of devices: 0

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:29 -04:00
Lijo Lazar
1d617c029f drm/amdgpu: During s0ix don't wait to signal GFXOFF
In the rare event when GFX IP suspend coincides with a s0ix entry, don't
schedule a delayed work, instead signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about GFXOFF status before amd-pmc module passes OS HINT
to PMFW telling that everything is ready for a safe s0ix entry.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-05 10:55:07 -04:00
Lang Yu
b072ef1215 drm/amdkfd: fix a potential ttm->sg memory leak
Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr,
but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it!

Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 10:53:45 -04:00
Alex Deucher
630e959f25 drm/amdgpu/gmc9: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Guo Zhengkui
8001ba85d0 drm/amdgpu: remove some repeated includings
Remove two repeated includings in line 46 and 47.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Lijo Lazar
d04287d062 drm/amdgpu: During s0ix don't wait to signal GFXOFF
In the rare event when GFX IP suspend coincides with a s0ix entry, don't
schedule a delayed work, instead signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about GFXOFF status before amd-pmc module passes OS HINT
to PMFW telling that everything is ready for a safe s0ix entry.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
4b3a624c4c drm/amdgpu: consolidate case statements
IP_VERSION(11, 0, 13) does the exact same thing as
IP_VERSION(11, 0, 12) so squash them together.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
James Zhu
c60511493b drm/amdgpu/jpeg: add jpeg2.6 start/end
Add jpeg2.6 start/end with updated PCTL0_MMHUB_DEEPSLEEP_IB address.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.lilu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
James Zhu
d4b0ee65de drm/amdgpu/jpeg2: move jpeg2 shared macro to header file
Move jpeg2 shared macro to header file

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.lilu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Lang Yu
546dc20fed drm/amdkfd: fix a potential ttm->sg memory leak
Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr,
but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it!

Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
a79d3709c4 drm/amdgpu: add an option to override IP discovery table from a file
If you set amdgpu.discovery=2 you can force the the driver to
fetch the IP discovery table from a file rather than from the
table shipped on the device.  This is useful for debugging and
for device bring up and emulation when the tables may be in flux.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
5b983db8c3 drm/amdkfd: clean up parameters in kgd2kfd_probe
We can get the pdev and asic type from the adev.  No need
to pass them explicitly.

v2: squash in build fix for !CONFIG_HSA_AMD from Anson

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
6d46d419af drm/amdgpu: add support for SRIOV in IP discovery path
Handle SRIOV requirements when adding IP blocks.

v2: add comment about UVD/VCE support on vega20 SR-IOV

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
b05b9c591f drm/amdgpu: clean up set IP function
Split into several smaller per IP functions to make it
easier to handle ordering issues for things like
SR-IOV in a follow up patch.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
1d789535a0 drm/amdgpu: convert IP version array to include instances
Allow us to query instances versions more cleanly.

Instancing support is not consistent unfortunately. SDMA is a
good example.  Sienna cichlid has 4 total SDMA instances, each
enumerated separately (HWIDs 42, 43, 68, 69).  Arcturus has 8
total SDMA instances, but they are enumerated as multiple
instances of the same HWIDs (4x HWID 42, 4x HWID 43).  UMC
is another example.  On most chips there are multiple
instances with the same HWID.  This allows us to support both
forms.

v2: rebase
v3: clarify instancing support

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
d0761fd24e drm/amdgpu: set CHIP_IP_DISCOVERY as the asic type by default
For new chips with no explicit entry in the PCI ID list.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
3ae695d691 drm/amdgpu: add new asic_type for IP discovery
Add a new asic type for asics where we don't have an
explicit entry in the PCI ID list.  We don't need
an asic type for these asics, other than something higher
than the existing ones, so just use this for all new
asics.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
aa9f8cc349 drm/amdgpu/ucode: add default behavior
Default to PSP ucode loading unless the user specifies
direct.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
f174161517 drm/amdgpu: get VCN harvest information from IP discovery table
Use the table rather than asic specific harvest registers.

v2: remove harvesting register checking

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
1b592d00b4 drm/amdgpu/vcn: remove manual instance setting
Handled by IP discovery now.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
fe323f039d drm/amdgpu/sdma: remove manual instance setting
Handled by IP discovery now.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
5c3720be7d drm/amdgpu: get VCN and SDMA instances from IP discovery table
Rather than hardcoding it.  We already have the number of VCN
instances from a previous patch, so just update the VCN
instances for chips with static tables.

v2: squash in checks for SDMA3,4 (Guchun)
v3: clarify VCN changes

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
5eceb20192 drm/amdgpu: add VCN1 hardware IP
So we can store the VCN IP revision for each instance of VCN.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
75a07bcd1d drm/amdgpu/soc15: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
0b64a5a852 drm/amdgpu/vcn2.5: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
96b8dd4423 drm/amdgpu/amdgpu_vcn: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: squash in fix for navy flounder and sienna cichlid

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
1fcc208cd7 drm/amdgpu/psp_v13.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
e47868ea15 drm/amdgpu/psp_v11.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
82d05736c4 drm/amdgpu/amdgpu_psp: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
9d0cb2c318 drm/amdgpu/gfx9.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
24be2d7004 drm/amdgpu/hdp4.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
43bf00f21e drm/amdgpu/sdma4.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
f7f12b2582 drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support
We are not going to support any new chips with the old
non-DC code so make it the default.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
9878844094 drm/amdgpu: drive all vega asics from the IP discovery table
Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
91e9db33be drm/amdgpu/soc15: get rev_id in soc15_common_early_init
for consistency with other SoCs.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
d4c6e870bd drm/amdgpu: add initial IP discovery support for vega based parts
Hardcode the IP versions for asics without IP discovery tables
and then enumerate the asics based on the IP versions.

TODO: fix SR-IOV support

v2: Squash in HDP fix for Renoir

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
994470b252 drm/amdgpu/soc15: export common IP functions
So they can be driven by IP discovery table.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
5f93148955 drm/amdgpu: add DCI HWIP
So we can track grab the appropriate DCE info out of the
IP discovery table.  This is a separare IP from DCN.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
75aa18415a drm/amdgpu: drive all navi asics from the IP discovery table
Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

v2: rebase

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
3e67f4f2e2 drm/amdgpu/nv: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
7c69d6153e drm/amdgpu/navi10_ih: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
258fa17d1a drm/amdgpu/athub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
13ebe284a2 drm/amdgpu/athub2.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
4edbbfde89 drm/amdgpu/vcn3.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
bc7c3d1d8a drm/amdgpu/mmhub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
ce2d99a84f drm/amdgpu/mmhub2.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
fac1772374 drm/amdgpu/gfxhub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
524cf3ab85 drm/amdgpu: drive nav10 from the IP discovery table
Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

Only tested on Navi10 so far.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
63352b7f98 drm/amdgpu: Use IP discovery to drive setting IP blocks by default
Drive the asic setup from the IP discovery table rather than
hardcoded settings based on asic type.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
5db9d0657e drm/amdgpu/gmc10.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: squash in gmc fixes
v3: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
eb4fd29afd drm/amdgpu: bind to any 0x1002 PCI diplay class device
Bind to all 0x1002 GPU devices.

For now we explicitly return -ENODEV for generic bindings.
Remove this check once IP discovery based checking is in place.

v2: rebase (Alex)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
bdbeb0dde4 drm/amdgpu: filter out radeon PCI device IDs
Once we claim all 0x1002 PCI display class devices, we will
need to filter out devices owned by radeon.

v2: rename radeon id array to make it more clear that
the devices are not supported by amdgpu.
    add r128, mach64 pci ids as well

Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
4b0ad84254 drm/amdgpu/gfx10: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: rebase,  squash in navi10 fixes (Alex)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
8f4bb1e784 drm/amdgpu/sdma5.2: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
02200e910c drm/amdgpu/sdma5.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
795d08391b drm/amdgpu: add initial IP enumeration via IP discovery table
Add initial support for all navi based parts.

v2: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
a1f62df75b drm/amdgpu/nv: export common IP functions
So they can be driven by IP dicovery table.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
1534db5549 drm/amdgpu: add XGMI HWIP
So we can track grab the appropriate XGMI info out of the
IP discovery table.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
54d2b1f402 drm/amdgpu: fill in IP versions from IP discovery table
Prerequisite for using IP versions in the driver rather
than asic type.

v2: Use IP_VERSION() macro instead of new function

Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
5f52e9a780 drm/amdgpu: store HW IP versions in the driver structure
So we can check the IP versions directly rather than using
asic type.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
81d1bf01e4 drm/amdgpu: add debugfs access to the IP discovery table
Useful for debugging and new asic validation.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
f76f795a8f drm/amdgpu: move headless sku check into harvest function
Consolidate harvesting information.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
John Clements
eb601e61d3 drm/amdgpu: resolve RAS query bug
clear error count when persistant harvesting is not enabled

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:57 -04:00
Tao Zhou
c749094923 amd/amdkfd: add ras page retirement handling for sq/sdma (v3)
In ras poison mode, page retirement will be handled by the irq handler of the
module which consumes corrupted data.

v2: rename ras_process_cb to ras_poison_consumption_handler.
    move the handler's implementation from ASIC specific file to common
file.

v3: call gpu reset for xGMI connected mode.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:57 -04:00
Prike Liang
e5d59cfa33 drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
In the s2idle stress test sdma resume fail occasionally,in the
failed case GPU is in the gfxoff state.This issue may introduce
by firmware miss handle doorbell S/R and now temporary fix the issue
by forcing exit gfxoff for sdma resume.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:57 -04:00
Zhan Liu
3f68c01be9 drm/amd/display: add cyan_skillfish display support
[Why]
add display related cyan_skillfish files in.

makefile controlled by CONFIG_DRM_AMD_DC_DCN201 flag.

v2: squash in clang fixes from Harry, Nathan
v3: squash in missing CONFIG_DRM_AMD_DC check (Alex)

Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Jun Lei <jun.lei@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:57 -04:00
Sean Paul
6f67e6fd4d Revert "drm/amd: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()"
This reverts commit 299f040e85.

This patchset breaks on intel platforms and was previously NACK'd by
Ville.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Fernando Ramos <greenfoo@u92.eu>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211002154542.15800-2-sean@poorly.run
2021-10-04 09:34:55 -04:00
Tom Lendacky
e9d1d2bb75 treewide: Replace the use of mem_encrypt_active() with cc_platform_has()
Replace uses of mem_encrypt_active() with calls to cc_platform_has() with
the CC_ATTR_MEM_ENCRYPT attribute.

Remove the implementation of mem_encrypt_active() across all arches.

For s390, since the default implementation of the cc_platform_has()
matches the s390 implementation of mem_encrypt_active(), cc_platform_has()
does not need to be implemented in s390 (the config option
ARCH_HAS_CC_PLATFORM is not set).

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210928191009.32551-9-bp@alien8.de
2021-10-04 11:47:24 +02:00
Fernando Ramos
299f040e85 drm/amd: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
As requested in Documentation/gpu/todo.rst, replace driver calls to
drm_modeset_lock_all() with DRM_MODESET_LOCK_ALL_BEGIN() and
DRM_MODESET_LOCK_ALL_END()

Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924064324.229457-16-greenfoo@u92.eu
2021-10-01 13:00:57 -04:00
Andrey Grodzovsky
5c67ff3a4c drm/amdgpu: Add a UAPI flag for hot plug/unplug
To support libdrm tests.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Andrey Grodzovsky
894c6890a2 drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case
Handle all DMA IOMMU group related dependencies before the
group is removed and we try to access it after free.

v2:
Move the actul handling function to TTM

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Ernst Sjöstrand
5039f52988 drm/amd/amdgpu: Validate ip discovery blob
We use the number_instance index that we get from the fw discovery blob
to index into an array for example.

Update error messages (Alex)

Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Arnd Bergmann
335aea75b0 drm/amdgpu: fix warning for overflow check
The overflow check in amdgpu_bo_list_create() causes a warning with
clang-14 on 64-bit architectures, since the limit can never be
exceeded.

drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
        if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list))
            ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The check remains useful for 32-bit architectures, so just avoid the
warning by using size_t as the type for the count.

Fixes: 920990cb08 ("drm/amdgpu: allocate the bo_list array after the list")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Simon Ser
2f350ddadc drm/amdgpu: check tiling flags when creating FB on GFX8-
On GFX9+, format modifiers are always enabled and ensure the
frame-buffers can be scanned out at ADDFB2 time.

On GFX8-, format modifiers are not supported and no other check
is performed. This means ADDFB2 IOCTLs will succeed even if the
tiling isn't supported for scan-out, and will result in garbage
displayed on screen [1].

Fix this by adding a check for tiling flags for GFX8 and older.
The check is taken from radeonsi in Mesa (see how is_displayable
is populated in gfx6_compute_surface).

Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel)

[1]: https://github.com/swaywm/wlroots/issues/3185

Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Matthew Auld
43d46f0b78 drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/
It covers more than just ttm_bo_type_sg usage, like with say dma-buf,
since one other user is userptr in amdgpu, and in the future we might
have some more. Hence EXTERNAL is likely a more suitable name.

v2(Christian):
  - Rename these to TTM_TT_FLAGS_*
  - Fix up all the holes in the flag values

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-1-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-09-29 16:17:56 +02:00
Matthew Auld
21856e1e34 drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu
Now that setting page->index shouldn't be needed anymore, we are just
left with setting page->mapping, and here it looks like amdgpu is the
only user, where pointing the page->mapping at the dev_mapping is used
to verify that the pages do indeed belong to the device, if userspace
later tries to touch them.

v2(Christian):
  - Drop the functions altogether and just inline modifying
    the page->mapping

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927114114.152310-3-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-09-29 13:55:09 +02:00
Prike Liang
26db706a6d drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
In the s2idle stress test sdma resume fail occasionally,in the
failed case GPU is in the gfxoff state.This issue may introduce
by firmware miss handle doorbell S/R and now temporary fix the issue
by forcing exit gfxoff for sdma resume.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-28 14:40:27 -04:00
Simon Ser
98122e63a7 drm/amdgpu: check tiling flags when creating FB on GFX8-
On GFX9+, format modifiers are always enabled and ensure the
frame-buffers can be scanned out at ADDFB2 time.

On GFX8-, format modifiers are not supported and no other check
is performed. This means ADDFB2 IOCTLs will succeed even if the
tiling isn't supported for scan-out, and will result in garbage
displayed on screen [1].

Fix this by adding a check for tiling flags for GFX8 and older.
The check is taken from radeonsi in Mesa (see how is_displayable
is populated in gfx6_compute_surface).

Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel)

[1]: https://github.com/swaywm/wlroots/issues/3185

Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-28 14:40:19 -04:00
Hawking Zhang
9f52c25f59 drm/amdgpu: correct initial cp_hqd_quantum for gfx9
didn't read the value of mmCP_HQD_QUANTUM from correct
register offset

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-28 14:39:29 -04:00
Leslie Shi
66805763a9 drm/amdgpu: fix gart.bo pin_count leak
gmc_v{9,10}_0_gart_disable() isn't called matched with
correspoding gart_enbale function in SRIOV case. This will
lead to gart.bo pin_count leak on driver unload.

Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 14:38:16 -04:00
Hawking Zhang
e794747622 drm/amdgpu: correct initial cp_hqd_quantum for gfx9
didn't read the value of mmCP_HQD_QUANTUM from correct
register offset

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:08 -04:00
Tao Zhou
f524dd54a7 drm/amdgpu: skip umc ras irq handling in poison mode (v2)
In ras poison mode, umc uncorrectable error will be ignored until
the corrupted data consumed by another ras module (such as gfx, sdma).

v2: update the debug message and replace dev_warn with dev_info.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:07 -04:00
Tao Zhou
e43488493c drm/amdgpu: set poison supported flag for RAS (v2)
Add RAS poison supported flag and tell PSP RAS TA about the info.

v2: rename poison mode to poison supported, we can also disable poison
mode even we support it.
    print value of poison supported if ras feature enablement fails.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:07 -04:00
Tao Zhou
aaca8c3861 drm/amdgpu: add poison mode query for UMC
Add ras poison mode query interface for UMC.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:06 -04:00
Tao Zhou
ca5c636dc6 drm/amdgpu: add poison mode query for DF (v2)
Add ras poison mode query interface for DF.

v2: replace RREG32_PCIE with RREG32_SOC15.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:06 -04:00
Candice Li
77ec28eac2 drm/amdgpu: Update PSP TA Invoke to use common TA context as input
Updated invoke to use new common TA structure similarily to load/unload.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:06 -04:00
Leslie Shi
71cf9e72b3 drm/amdgpu: fix gart.bo pin_count leak
gmc_v{9,10}_0_gart_disable() isn't called matched with
correspoding gart_enbale function in SRIOV case. This will
lead to gart.bo pin_count leak on driver unload.

Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:05 -04:00
Dave Airlie
1e3944578b Merge tag 'amd-drm-next-5.16-2021-09-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.16-2021-09-27:

amdgpu:
- RAS improvements
- BACO fixes
- Yellow Carp updates
- Misc code cleanups
- Initial DP 2.0 support
- VCN priority handling
- Cyan Skillfish updates
- Rework IB handling for multimedia engine tests
- Backlight fixes
- DCN 3.1 power saving improvements
- Runtime PM fixes
- Modifier support for DCC image stores for gfx 10.3
- Hotplug fixes
- Clean up stack related warnings in display code
- DP alt mode fixes
- Display rework for better handling FP code
- Debugfs fixes

amdkfd:
- SVM fixes
- DMA map fixes

radeon:
- AGP fix

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927212653.4575-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-09-28 17:08:26 +10:00
Alex Deucher
2485e2753e drm/amdgpu: make soc15_common_ip_funcs static
It's not used outside of soc15.c

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:35:27 -04:00
Candice Li
9080a18fc5 drm/amdgpu: Remove all code paths under the EAGAIN path in RAS late init
All code paths under the EAGAIN path in RAS late init are unused.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:35:13 -04:00
John Clements
73490d2658 drm/amdgpu: Consolidate RAS cmd warning messages
Explicity post warning if cmd is issued against unsupported IP

Update to latest RAS TA interface

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:35:04 -04:00
John Clements
640ae42efb drm/amdgpu: Updated RAS infrastructure
Update RAS infrastructure to support RAS query for MCA subblocks

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:34:43 -04:00
Guchun Chen
6effad8abe drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage
adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent
access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu
will still use adev->rmmio for access after amdgpu_device_unmap_mmio.
The patch is to move such SRIOV calling earlier to fini_early stage.

Fixes: 07775fc138 ("drm/amdgpu: Unmap all MMIO mappings")
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:34:28 -04:00
Andrey Grodzovsky
ebe86a57c8 drm/amdgpu: Fix resume failures when device is gone
Problem:
When device goes into suspend and unplugged during it
then all HW programming during resume fails leading
to a bad SW during pci remove handling which follows.
Because device is first resumed and only later removed
we cannot rely on drm_dev_enter/exit here.

Fix:
Use a flag we use for PCIe error recovery to avoid
accessing registres. This allows to successfully complete
pm resume sequence and finish pci remove.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
Andrey Grodzovsky
c03509cbc0 drm/amdgpu: Fix MMIO access page fault
Add more guards to MMIO access post device
unbind/unplug

Bug: https://bugs.archlinux.org/task/72092?project=1&order=dateopened&sort=desc&pagenum=1
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
Andrey Grodzovsky
d82e2c249c drm/amdgpu: Fix crash on device remove/driver unload
Crash:
BUG: unable to handle page fault for address: 00000000000010e1
RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu]
Call Trace:
pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu]
amdgpu_dpm_set_powergating_by_smu+0x92/0xf0 [amdgpu]
amdgpu_dpm_enable_vce+0x2e/0xc0 [amdgpu]
vce_v4_0_hw_fini+0x95/0xa0 [amdgpu]
amdgpu_device_fini_hw+0x232/0x30d [amdgpu]
amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu]
amdgpu_pci_remove+0x27/0x40 [amdgpu]
pci_device_remove+0x3e/0xb0
device_release_driver_internal+0x103/0x1d0
device_release_driver+0x12/0x20
pci_stop_bus_device+0x79/0xa0
pci_stop_and_remove_bus_device_locked+0x1b/0x30
remove_store+0x7b/0x90
dev_attr_store+0x17/0x30
sysfs_kf_write+0x4b/0x60
kernfs_fop_write_iter+0x151/0x1e0

Why:
VCE/UVD had dependency on SMC block for their suspend but
SMC block is the first to do HW fini due to some constraints

How:
Since the original patch was dealing with suspend issues
move the SMC block dependency back into suspend hooks as
was done in V1 of the original patches.
Keep flushing idle work both in suspend and HW fini seuqnces
since it's essential in both cases.

Fixes: 859e465927 ("drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend")
Fixes: bf756fb833 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
xinhui pan
0a2267809f drm/amdgpu: Fix uvd ib test timeout when use pre-allocated BO
Now we use same BO for create/destroy msg. So destroy will wait for the
fence returned from create to be signaled. The default timeout value in
destroy is 10ms which is too short.

Lets wait both fences with the specific timeout.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
xinhui pan
b2fe31cf64 drm/amdgpu: Put drm_dev_enter/exit outside hot codepath
We hit soft hang while doing memory pressure test on one numa system.
After a qucik look, this is because kfd invalid/valid userptr memory
frequently with process_info lock hold.
Looks like update page table mapping use too much cpu time.

perf top says below,
75.81%  [kernel]       [k] __srcu_read_unlock
 6.19%  [amdgpu]       [k] amdgpu_gmc_set_pte_pde
 3.56%  [kernel]       [k] __srcu_read_lock
 2.20%  [amdgpu]       [k] amdgpu_vm_cpu_update
 2.20%  [kernel]       [k] __sg_page_iter_dma_next
 2.15%  [drm]          [k] drm_dev_enter
 1.70%  [drm]          [k] drm_prime_sg_to_dma_addr_array
 1.18%  [kernel]       [k] __sg_alloc_table_from_pages
 1.09%  [drm]          [k] drm_dev_exit

So move drm_dev_enter/exit outside gmc code, instead let caller do it.
They are gart_unbind, gart_map, vm_clear_bo, vm_update_pdes and
gmc_init_pdb0. vm_bo_update_mapping already calls it.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
John Clements
226f4f5a6b drm/amdgpu: Resolve nBIF RAS error harvesting bug
Set correct RAS nBIF error query register offsets on aldebaran

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Candice Li
17c6805a00 drm/amdgpu: Update PSP TA unload function
Update PSP TA unload function to use PSP TA context as input argument.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Candice Li
3f83f17b73 drm/amdgpu: Conform ASD header/loading to generic TA systems
Update asd_context structure and add asd_initialize function to
conform ASD header/loading to generic TA systems.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Paul Menzel
31ea43442d drm/amdgpu: Demote TMZ unsupported log message from warning to info
As the user cannot do anything about the unsupported Trusted Memory Zone
(TMZ) feature, do not warn about it, but make it informational, so
demote the log level from warning to info.

Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Michel Dänzer
6cd1f9b40a drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as
well, and defined in a header file if used by multiple compilation
units. The latter would be more involved in this case, so just drop
the inline declaration for now.

Fixes compile failure building for ppc64le on RHEL 8:

In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32,
                 from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call
 to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available
   90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here
 1985 |         max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count();
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: c84d46707e "drm/amdgpu: validate bad page threshold in ras(v3)"
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Dave Airlie
0dfc70818a drm-misc-next for $kernel-version:
UAPI Changes:
 
 Cross-subsystem Changes:
   - dma-buf: Avoid a warning with some allocations, Remove
     DMA_FENCE_TRACE macros
 
 Core Changes:
   - bridge: New helper to git rid of panels in drivers
   - fence: Improve dma_fence_add_callback documentation, Improve
     dma_fence_ops->wait documentation
   - ioctl: Unexport drm_ioctl_permit
   - lease: Documentation improvements
   - fourcc: Add new macro to determine the modifier vendor
   - quirks: Add the Steam Deck, Chuwi HiBook, Chuwi Hi10 Pro, Samsung
     Galaxy Book 10.6, KD Kurio Smart C15200 2-in-1, Lenovo Ideapad D330
   - resv: Improve the documentation
   - shmem-helpers: Allocate WC pages on x86, Switch to vmf_insert_pfn
   - sched: Fix for a timer being canceled too soon, Avoid null pointer
     derefence if the fence is null in drm_sched_fence_free, Convert
     drivers to rely on its dependency tracking
   - ttm: Switch to kerneldoc, new helper to clear all DMA mappings, pool
     shrinker optitimization, Remove ttm_tt_destroy_common, Fix for
     unbinding on multiple drivers
 
 Driver Changes:
   - bochs: New PCI IDs
   - msm: Fence ordering impromevemnts
   - stm: Add layer alpha support, zpos
   - v3d: Fix for a Vulkan CTS failure
   - vc4: Conversion to the new bridge helpers
   - vgem: Use shmem helpers
   - virtio: Support mapping exported vram
   - zte: Remove obsolete driver
 
   - bridge: Probe improvements for it66121, enable DSI EOTP for anx7625,
     errors propagation improvements for anx7625
 
   - panels: 60fps mode for otm8009a, New driver for Samsung S6D27A1
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYULyqgAKCRDj7w1vZxhR
 xVR1AP96dB3rfB0uIEvujMROBqupaKbYvP/7qilfMGIwLotDqQD/RKNB+EAaoHtT
 hRA7zmz7kwYA/l8PihmF1zoFddX21gA=
 =nFnK
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-09-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for $kernel-version:

UAPI Changes:

Cross-subsystem Changes:
  - dma-buf: Avoid a warning with some allocations, Remove
    DMA_FENCE_TRACE macros

Core Changes:
  - bridge: New helper to git rid of panels in drivers
  - fence: Improve dma_fence_add_callback documentation, Improve
    dma_fence_ops->wait documentation
  - ioctl: Unexport drm_ioctl_permit
  - lease: Documentation improvements
  - fourcc: Add new macro to determine the modifier vendor
  - quirks: Add the Steam Deck, Chuwi HiBook, Chuwi Hi10 Pro, Samsung
    Galaxy Book 10.6, KD Kurio Smart C15200 2-in-1, Lenovo Ideapad D330
  - resv: Improve the documentation
  - shmem-helpers: Allocate WC pages on x86, Switch to vmf_insert_pfn
  - sched: Fix for a timer being canceled too soon, Avoid null pointer
    derefence if the fence is null in drm_sched_fence_free, Convert
    drivers to rely on its dependency tracking
  - ttm: Switch to kerneldoc, new helper to clear all DMA mappings, pool
    shrinker optitimization, Remove ttm_tt_destroy_common, Fix for
    unbinding on multiple drivers

Driver Changes:
  - bochs: New PCI IDs
  - msm: Fence ordering impromevemnts
  - stm: Add layer alpha support, zpos
  - v3d: Fix for a Vulkan CTS failure
  - vc4: Conversion to the new bridge helpers
  - vgem: Use shmem helpers
  - virtio: Support mapping exported vram
  - zte: Remove obsolete driver

  - bridge: Probe improvements for it66121, enable DSI EOTP for anx7625,
    errors propagation improvements for anx7625

  - panels: 60fps mode for otm8009a, New driver for Samsung S6D27A1

Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Thu 16 Sep 2021 17:30:50 AEST
# gpg:                using EDDSA key 5C1337A45ECA9AEB89060E9EE3EF0D6F671851C5
# gpg: Can't check signature: No public key
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916073132.ptbbmjetm7v3ufq3@gilmour
2021-09-22 15:30:40 +10:00
Paul Menzel
b287e49468 drm/amdgpu: Demote TMZ unsupported log message from warning to info
As the user cannot do anything about the unsupported Trusted Memory Zone
(TMZ) feature, do not warn about it, but make it informational, so
demote the log level from warning to info.

Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:24 -04:00
Michel Dänzer
114518ff3b drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as
well, and defined in a header file if used by multiple compilation
units. The latter would be more involved in this case, so just drop
the inline declaration for now.

Fixes compile failure building for ppc64le on RHEL 8:

In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32,
                 from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call
 to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available
   90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here
 1985 |         max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count();
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: c84d46707e "drm/amdgpu: validate bad page threshold in ras(v3)"
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:24 -04:00
James Zhu
f02abeb077 drm/amdgpu: move iommu_resume before ip init/resume
Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-16 09:56:24 -04:00
James Zhu
8066008482 drm/amdgpu: add amdgpu_amdkfd_resume_iommu
Add amdgpu_amdkfd_resume_iommu for amdgpu.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-16 09:56:24 -04:00
James Zhu
fefc01f042 drm/amdkfd: separate kfd_iommu_resume from kfd_resume
Separate kfd_iommu_resume from kfd_resume for fine-tuning
of amdgpu device init/resume/reset/recovery sequence.

v2: squash in fix for !CONFIG_HSA_AMD

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-16 09:56:24 -04:00
Nirmoy Das
b04ce53eac drm/amdgpu: use IS_ERR for debugfs APIs
debugfs APIs returns encoded error so use
IS_ERR for checking return value.

v2: return PTR_ERR(ent)

References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-By: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-14 16:21:15 -04:00
Christian König
c92db8d64f drm/amdgpu: fix use after free during BO move
The memory backing old_mem is already freed at that point, move the
check a bit more up.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: bfa3357ef9 ("drm/ttm: allocate resource object instead of embedding it v2")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1699
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-14 16:17:39 -04:00
Ernst Sjöstrand
67a44e6598 drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10
Seems like newer cards can have even more instances now.
Found by UBSAN: array-index-out-of-bounds in
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29
index 8 is out of range for type 'uint32_t *[8]'

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697
Cc: stable@vger.kernel.org
Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 16:14:41 -04:00
xinhui pan
0fcfb30019 drm/amdgpu: Fix a race of IB test
Direct IB submission should be exclusive. So use write lock.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
xinhui pan
405a81ae3f drm/amdgpu: VCN avoid memory allocation during IB test
alloc extra msg from direct IB pool.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
xinhui pan
cb9038aa8a drm/amdgpu: VCE avoid memory allocation during IB test
alloc extra msg from direct IB pool.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
xinhui pan
68331d7cf3 drm/amdgpu: UVD avoid memory allocation during IB test
move BO allocation in sw_init.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
Candice Li
de3a1e3360 drm/amdgpu: Unify PSP TA context
Remove all TA binary structures and add the specific binary
structure in struct ta_context.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
James Zhu
9cec53c18a drm/amdgpu: move iommu_resume before ip init/resume
Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
James Zhu
ea20e246f3 drm/amdgpu: add amdgpu_amdkfd_resume_iommu
Add amdgpu_amdkfd_resume_iommu for amdgpu.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
James Zhu
f8846323d5 drm/amdkfd: separate kfd_iommu_resume from kfd_resume
Separate kfd_iommu_resume from kfd_resume for fine-tuning
of amdgpu device init/resume/reset/recovery sequence.

v2: squash in fix for !CONFIG_HSA_AMD

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:46 -04:00
shaoyunl
8e6d0b6996 drm/amdgpu: Get atomicOps info from Host for sriov setup
The AtomicOp Requester Enable bit is reserved in VFs and the PF value applies to all
associated VFs. so guest driver can not directly enable the atomicOps for VF, it
depends on PF to enable it. In current design, amdgpu driver  will get the enabled
atomicOps bits through private pf2vf data

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:57:11 -04:00
xinhui pan
a7496559e4 drm/amdgpu: Increase direct IB pool size
Direct IB pool is used for vce/vcn IB extra msg too. Increase its size
to AMDGPU_IB_POOL_SIZE.

v2: Squash in unused variable removal

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:56:49 -04:00
John Clements
3771449bc8 drm/amdgpu: Update RAS trigger error block support
Added trigger error support for MP0/MP1/MPIO blocks

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:37:48 -04:00
John Clements
334f81d164 drm/amdgpu: Update RAS status print
Remove uncessary RAS status prints

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:37:41 -04:00
Likun Gao
02f958a20c drm/amdgpu: refactor function to init no-psp fw
Refactor the code of amdgpu_ucode_init_single_fw to make it more
readable as too many ucode need to handle on this function currently.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:37:34 -04:00
Nirmoy Das
62d266b2bd drm/amdgpu: cleanup debugfs for amdgpu rings
Use debugfs_create_file_size API for creating ring debugfs, and as its a
NULL returning API, change the return type for amdgpu_debugfs_ring_init
API as well. Also cleanup surrounding code.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:35:59 -04:00
Nirmoy Das
59715cffce drm/amdgpu: use IS_ERR for debugfs APIs
debugfs APIs returns encoded error so use
IS_ERR for checking return value.

v2: return PTR_ERR(ent)

References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-By: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:35:44 -04:00
Maxime Ripard
2f76520561
Merge drm/drm-next into drm-misc-next
Kickstart new drm-misc-next cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-09-14 09:25:30 +02:00
Linus Torvalds
a668acb8f0 drm fixes for 5.15-rc1
ttm:
 - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed.
 - Fix ttm deadlock if target BO isn't idle
 - ttm build fix
 - ttm docs fix
 
 dma-buf:
 - config option fixes
 
 fbdev:
 - limit resolutions to avoid int overflow
 
 i915:
 - stddef change.
 
 amdgpu:
 - Misc cleanups, typo fixes
 - EEPROM fix
 - Add some new PCI IDs
 - Scatter/Gather display support for Yellow Carp
 - PCIe DPM fix for RKL platforms
 - RAS fix
 
 amdkfd:
 - SVM fix
 
 vc4:
 - static function fix
 
 mgag200:
 - fix uninit var
 
 panfrost:
 - lock_region fixes
 
  - Make some dma-buf config options depend on DMA_SHARED_BUFFER.
     - Handle multiplication overflow of fbdev xres/yres in the core.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmE6/HkACgkQDHTzWXnE
 hr4Edw/+PTYtJHSZZbcT/Avcdif1KpEWuBfhq+dd75Tm1SNYXBRe03CqH3d23YnZ
 1I9oZ4TG1St3KaFBrlW5BERyFD2RhAAWJ4bMUz+/bBN9Y2u/r1scVR7YKoqkI2jr
 li1pYoPVLNYrHqdhmhsl7sKOqDRi/0TNvUY/B8tWyEZhTNiMGD9A8Tyv7WJ+iinT
 /mLrR0tCYYrzkvMEVdHt0t8+Bp1nvR/ZSfCS/NavD1CZ4RffENzTnFIhBb1QvCDj
 W1bF4D6930iOS/HXmheVzKygJlz9fj+8PS1DnvIyRPJjXH74dcCn+DPDRVTxyYB1
 3ZSY0I2yFSK0oorN1jYVraDXGB1R0OtIwbdRWvyztqMxaj+gRrSNbSSEcRGAy4YL
 Ipyvd2FyHO1rGxN5CS6FDCkJ/9WxOx1caBF0D3HhZVGxqw/m8qISxS+za8U5lbrT
 90KqHnaWbKL4flfUExjpwPKSvPImgLHN4tqC8l0471i4Tku0unBf8H9RkODkreRU
 fW9GHYCjzxHMwYT0JSHGohsscCvhIhkRYTYlx3bf/1tr0SfYXPiZEJwrJfNTLkZh
 mfm5R+wTL5hGHdDheOldjiGQZsazzxzJv2NK5aAuojVRqJuy3pohiQ72mHP5Wr4M
 9zOKlXbgBDSxTJleN7MJKZhNyanFUaZut+1rhTFeQ4RCUcgqpxc=
 =R62Q
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Just an initial bunch of fixes for the merge window, amdgpu is most of
  them with a few ttm fixes and an fbdev avoid multiply overflow fix.

  core:
   - Make some dma-buf config options depend on DMA_SHARED_BUFFER
   - Handle multiplication overflow of fbdev xres/yres in the core

  ttm:
   - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed
   - Fix ttm deadlock if target BO isn't idle
   - ttm build fix
   - ttm docs fix

  dma-buf:
   - config option fixes

  fbdev:
   - limit resolutions to avoid int overflow

  i915:
   - stddef change.

  amdgpu:
   - Misc cleanups, typo fixes
   - EEPROM fix
   - Add some new PCI IDs
   - Scatter/Gather display support for Yellow Carp
   - PCIe DPM fix for RKL platforms
   - RAS fix

  amdkfd:
   - SVM fix

  vc4:
   - static function fix

  mgag200:
   - fix uninit var

  panfrost:
   - lock_region fixes"

* tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm: (36 commits)
  drm/ttm: Fix a deadlock if the target BO is not idle during swap
  fbmem: don't allow too huge resolutions
  dma-buf: DMABUF_SYSFS_STATS should depend on DMA_SHARED_BUFFER
  dma-buf: DMABUF_DEBUG should depend on DMA_SHARED_BUFFER
  drm/i915: use linux/stddef.h due to "isystem: trim/fixup stdarg.h and other headers"
  dma-buf: DMABUF_MOVE_NOTIFY should depend on DMA_SHARED_BUFFER
  drm/amdkfd: drop process ref count when xnack disable
  drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode
  drm/amdgpu: fix fdinfo race with process exit
  drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
  drm/amdgpu: stop scheduler when calling hw_fini (v2)
  drm/amdgpu: Clear RAS interrupt status on aldebaran
  drm/amd/display: Initialize lt_settings on instantiation
  drm/amd/display: cleanup idents after a revert
  drm/amd/display: Fix memory leak reported by coverity
  drm/ttm: Fix ttm_bo_move_memcpy() for subclassed struct ttm_resource
  drm/amdgpu/swsmu: fix spelling mistake "minimun" -> "minimum"
  drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform
  drm/amdgpu: show both cmd id and name when psp cmd failed
  drm/amd/display: setup system context for APUs
  ...
2021-09-10 11:22:23 -07:00
Colin Ian King
e8ba4922a2 drm/amdgpu: sdma: clean up identation
There is a statement that is indented incorrectly. Clean it up.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:50 -04:00
Colin Ian King
9ae807f0ec drm/amdgpu: clean up inconsistent indenting
There are a couple of statements that are indented one character
too deeply, clean these up.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:50 -04:00
Christian König
a7181b52ea drm/amdgpu: remove unused amdgpu_bo_validate
Just drop some dead code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:50 -04:00
Christian König
101ba90ff0 drm/amdgpu: fix use after free during BO move
The memory backing old_mem is already freed at that point, move the
check a bit more up.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: bfa3357ef9 ("drm/ttm: allocate resource object instead of embedding it v2")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1699
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:49 -04:00
Candice Li
ac1509d19e drm/amdgpu: Create common PSP TA load function
Creat common PSP TA load function and update PSP ta_mem_context
with size information.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:49 -04:00
Ernst Sjöstrand
cd54323e76 drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10
Seems like newer cards can have even more instances now.
Found by UBSAN: array-index-out-of-bounds in
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29
index 8 is out of range for type 'uint32_t *[8]'

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697
Cc: stable@vger.kernel.org
Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-02 12:39:41 -04:00
Christian König
d72277b6c3 dma-buf: nuke DMA_FENCE_TRACE macros v2
Only the DRM GPU scheduler, radeon and amdgpu where using them and they depend
on a non existing config option to actually emit some code.

v2: keep the signal path as is for now

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210818105443.1578-1-christian.koenig@amd.com
2021-09-02 12:40:52 +02:00
Satyajit Sahu
7d7630fc6b drm/amdgpu:schedule vce/vcn encode based on priority
Schedule the encode job in VCE/VCN encode ring
based on the priority set by UMD.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Satyajit Sahu
0ad29a4eb1 drm/amdgpu/vcn: set the priority for each encode ring
VCN has multiple rings. Set the proper priority level for each
encode ring while initializing.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Satyajit Sahu
080e613c74 drm/amdgpu/vce: set the priority for each ring
VCE has multiple rings. Set the proper priority level for each
ring while initializing.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Candice Li
a0a2f7bb22 drm/amd/amdgpu: add mpio to ras block
Add MPIO to RAS block

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Candice Li
25c94b33dd drm/amd/amdgpu: consolidate PSP TA unload function
Create common PSP TA unload function and replace all common TA unloading
sequences.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Tom St Denis
37df9560cd drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)
This new debugfs interface uses an IOCTL interface in order to pass
along state information like SRBM and GRBM bank switching.  This
new interface also allows a full 32-bit MMIO address range which
the previous didn't.  With this new design we have room to grow
the flexibility of the file as need be.

(v2): Move read/write to .read/.write, fix style, add comment
      for IOCTL data structure

(v3): C style comments

(v4): use u32 in struct and remove offset variable

(v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
      instead of 0x3FF, use mutex for op/ioctl.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Nirmoy Das
34eaf30f9a drm/amdgpu: detach ring priority from gfx priority
Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Nirmoy Das
84d588c3de drm/amdgpu: rework context priority handling
To get a hardware queue priority for a context, we are currently
mapping AMDGPU_CTX_PRIORITY_* to DRM_SCHED_PRIORITY_* and then
to hardware queue priority, which is not the right way to do that
as DRM_SCHED_PRIORITY_* is software scheduler's priority and it is
independent from a hardware queue priority.

Use userspace provided context priority, AMDGPU_CTX_PRIORITY_* to
map a context to proper hardware queue priority.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Linus Torvalds
477f70cd2a drm for v5.15-rc1
core:
 - extract i915 eDP backlight into core
 - DP aux bus support
 - drm_device.irq_enabled removed
 - port drivers to native irq interfaces
 - export gem shadow plane handling for vgem
 - print proper driver name in framebuffer registration
 - driver fixes for implicit fencing rules
 - ARM fixed rate compression modifier added
 - updated fb damage handling
 - rmfb ioctl logging/docs
 - drop drm_gem_object_put_locked
 - define DRM_FORMAT_MAX_PLANES
 - add gem fb vmap/vunmap helpers
 - add lockdep_assert(once) helpers
 - mark drm irq midlayer as legacy
 - use offset adjusted bo mapping conversion
 
 vgaarb:
 - cleanups
 
 fbdev:
 - extend efifb handling to all arches
 - div by 0 fixes for multiple drivers
 
 udmabuf:
 - add hugepage mapping support
 
 dma-buf:
 - non-dynamic exporter fixups
 - document implicit fencing rules
 
 amdgpu:
 - Initial Cyan Skillfish support
 - switch virtual DCE over to vkms based atomic
 - VCN/JPEG power down fixes
 - NAVI PCIE link handling fixes
 - AMD HDMI freesync fixes
 - Yellow Carp + Beige Goby fixes
 - Clockgating/S0ix/SMU/EEPROM fixes
 - embed hw fence in job
 - rework dma-resv handling
 - ensure eviction to system ram
 
 amdkfd:
 - uapi: SVM address range query added
 - sysfs leak fix
 - GPUVM TLB optimizations
 - vmfault/migration counters
 
 i915:
 - Enable JSL and EHL by default
 - preliminary XeHP/DG2 support
 - remove all CNL support (never shipped)
 - move to TTM for discrete memory support
 - allow mixed object mmap handling
 - GEM uAPI spring cleaning
   - add I915_MMAP_OBJECT_FIXED
   - reinstate ADL-P mmap ioctls
   - drop a bunch of unused by userspace features
   - disable and remove GPU relocations
 - revert some i915 misfeatures
 - major refactoring of GuC for Gen11+
 - execbuffer object locking separate step
 - reject caching/set-domain on discrete
 - Enable pipe DMC loading on XE-LPD and ADL-P
 - add PSF GV point support
 - Refactor and fix DDI buffer translations
 - Clean up FBC CFB allocation code
 - Finish INTEL_GEN() and friends macro conversions
 
 nouveau:
 - add eDP backlight support
 - implicit fence fix
 
 msm:
 - a680/7c3 support
 - drm/scheduler conversion
 
 panfrost:
 - rework GPU reset
 
 virtio:
 - fix fencing for planes
 
 ast:
 - add detect support
 
 bochs:
 - move to tiny GPU driver
 
 vc4:
 - use hotplug irqs
 - HDMI codec support
 
 vmwgfx:
 - use internal vmware device headers
 
 ingenic:
 - demidlayering irq
 
 rcar-du:
 - shutdown fixes
 - convert to bridge connector helpers
 
 zynqmp-dsub:
 - misc fixes
 
 mgag200:
 - convert PLL handling to atomic
 
 mediatek:
 - MT8133 AAL support
 - gem mmap object support
 - MT8167 support
 
 etnaviv:
 - NXP Layerscape LS1028A SoC support
 - GEM mmap cleanups
 
 tegra:
 - new user API
 
 exynos:
 - missing unlock fix
 - build warning fix
 - use refcount_t
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmEtvn8ACgkQDHTzWXnE
 hr7aqw//WfcIyGdPLjAz59cW8jm+FgihD5colHtOUYRHRO4GeX/bNNufquR8+N3y
 HESsyZdpihFHms/wURMq41ibmHg0EuHA01HZzjZuGBesG4F9I8sP/HnDOxDuYuAx
 N7Lg4PlUNlfFHmw7Y84owQ6s/XWmNp5iZ8e/mTK5hcraJFQKS4QO74n9RbG/F1vC
 Hc3P6AnpqGac2AEGXt0NjIRxVVCTUIBGx+XOhj+1AMyAGzt9VcO1DS9PVCS0zsEy
 zKMj9tZAPNg0wYsXAi4kA1lK7uVY8KoXSVDYLpsI5Or2/e7mfq2b4EWrezbtp6UA
 H+w86axuwJq7NaYHYH6HqyrLTOmvcHgIl2LoZN91KaNt61xfJT3XZkyQoYViGIrJ
 oZy6X/+s+WPoW98bHZrr6vbcxtWKfEeQyUFEAaDMmraKNJwROjtwgFC9DP8MDctq
 PUSM+XkwbGRRxQfv9dNKufeWfV5blVfzEJO8EfTU1YET3WTDaUHe/FoIcLZt2DZG
 JAJgZkIlU8egthPdakUjQz/KoyLMyovcN5zcjgzgjA9PyNEq74uElN9l446kSSxu
 jEVErOdd+aG3Zzk7/ZZL/RmpNQpPfpQ2RaPUkgeUsW01myNzUNuU3KUDaSlVa+Oi
 1n7eKoaQ2to/+LjhYApVriri4hIZckNNn5FnnhkgwGi8mpHQIVQ=
 =vZkA
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Highlights:

   - i915 has seen a lot of refactoring and uAPI cleanups due to a
     change in the upstream direction going forward

     This has all been audited with known userspace, but there may be
     some pitfalls that were missed.

   - i915 now uses common TTM to enable discrete memory on DG1/2 GPUs

   - i915 enables Jasper and Elkhart Lake by default and has preliminary
     XeHP/DG2 support

   - amdgpu adds support for Cyan Skillfish

   - lots of implicit fencing rules documented and fixed up in drivers

   - msm now uses the core scheduler

   - the irq midlayer has been removed for non-legacy drivers

   - the sysfb code now works on more than x86.

  Otherwise the usual smattering of stuff everywhere, panels, bridges,
  refactorings.

  Detailed summary:

  core:
   - extract i915 eDP backlight into core
   - DP aux bus support
   - drm_device.irq_enabled removed
   - port drivers to native irq interfaces
   - export gem shadow plane handling for vgem
   - print proper driver name in framebuffer registration
   - driver fixes for implicit fencing rules
   - ARM fixed rate compression modifier added
   - updated fb damage handling
   - rmfb ioctl logging/docs
   - drop drm_gem_object_put_locked
   - define DRM_FORMAT_MAX_PLANES
   - add gem fb vmap/vunmap helpers
   - add lockdep_assert(once) helpers
   - mark drm irq midlayer as legacy
   - use offset adjusted bo mapping conversion

  vgaarb:
   - cleanups

  fbdev:
   - extend efifb handling to all arches
   - div by 0 fixes for multiple drivers

  udmabuf:
   - add hugepage mapping support

  dma-buf:
   - non-dynamic exporter fixups
   - document implicit fencing rules

  amdgpu:
   - Initial Cyan Skillfish support
   - switch virtual DCE over to vkms based atomic
   - VCN/JPEG power down fixes
   - NAVI PCIE link handling fixes
   - AMD HDMI freesync fixes
   - Yellow Carp + Beige Goby fixes
   - Clockgating/S0ix/SMU/EEPROM fixes
   - embed hw fence in job
   - rework dma-resv handling
   - ensure eviction to system ram

  amdkfd:
   - uapi: SVM address range query added
   - sysfs leak fix
   - GPUVM TLB optimizations
   - vmfault/migration counters

  i915:
   - Enable JSL and EHL by default
   - preliminary XeHP/DG2 support
   - remove all CNL support (never shipped)
   - move to TTM for discrete memory support
   - allow mixed object mmap handling
   - GEM uAPI spring cleaning
       - add I915_MMAP_OBJECT_FIXED
       - reinstate ADL-P mmap ioctls
       - drop a bunch of unused by userspace features
       - disable and remove GPU relocations
   - revert some i915 misfeatures
   - major refactoring of GuC for Gen11+
   - execbuffer object locking separate step
   - reject caching/set-domain on discrete
   - Enable pipe DMC loading on XE-LPD and ADL-P
   - add PSF GV point support
   - Refactor and fix DDI buffer translations
   - Clean up FBC CFB allocation code
   - Finish INTEL_GEN() and friends macro conversions

  nouveau:
   - add eDP backlight support
   - implicit fence fix

  msm:
   - a680/7c3 support
   - drm/scheduler conversion

  panfrost:
   - rework GPU reset

  virtio:
   - fix fencing for planes

  ast:
   - add detect support

  bochs:
   - move to tiny GPU driver

  vc4:
   - use hotplug irqs
   - HDMI codec support

  vmwgfx:
   - use internal vmware device headers

  ingenic:
   - demidlayering irq

  rcar-du:
   - shutdown fixes
   - convert to bridge connector helpers

  zynqmp-dsub:
   - misc fixes

  mgag200:
   - convert PLL handling to atomic

  mediatek:
   - MT8133 AAL support
   - gem mmap object support
   - MT8167 support

  etnaviv:
   - NXP Layerscape LS1028A SoC support
   - GEM mmap cleanups

  tegra:
   - new user API

  exynos:
   - missing unlock fix
   - build warning fix
   - use refcount_t"

* tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm: (1318 commits)
  drm/amd/display: Move AllowDRAMSelfRefreshOrDRAMClockChangeInVblank to bounding box
  drm/amd/display: Remove duplicate dml init
  drm/amd/display: Update bounding box states (v2)
  drm/amd/display: Update number of DCN3 clock states
  drm/amdgpu: disable GFX CGCG in aldebaran
  drm/amdgpu: Clear RAS interrupt status on aldebaran
  drm/amdgpu: Add support for RAS XGMI err query
  drm/amdkfd: Account for SH/SE count when setting up cu masks.
  drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domain
  drm/amdgpu: drop redundant cancel_delayed_work_sync call
  drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend
  drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend
  drm/amdkfd: map SVM range with correct access permission
  drm/amdkfd: check access permisson to restore retry fault
  drm/amdgpu: Update RAS XGMI Error Query
  drm/amdgpu: Add driver infrastructure for MCA RAS
  drm/amd/display: Add Logging for HDMI color depth information
  drm/amd/amdgpu: consolidate PSP TA init shared buf functions
  drm/amd/amdgpu: add name field back to ras_common_if
  drm/amdgpu: Fix build with missing pm_suspend_target_state module export
  ...
2021-09-01 11:26:46 -07:00
Philip Yang
d7eff46c21 drm/amdgpu: fix fdinfo race with process exit
Get process vm root BO ref in case process is exiting and root BO is
freed, to avoid NULL pointer dereference backtrace:

BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
Call Trace:
amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
seq_show+0x12c/0x180
seq_read+0x153/0x410
vfs_read+0x91/0x140[ 3427.206183]  ksys_read+0x4f/0xb0
do_syscall_64+0x5b/0x1a0
entry_SYSCALL_64_after_hwframe+0x65/0xca

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:20:40 -04:00
xinhui pan
703677d934 drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
Fall through to handle the error instead of return.

Fixes: f8aab60422 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")
Cc: stable@vger.kernel.org
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:20:08 -04:00
Guchun Chen
f7d6779df6 drm/amdgpu: stop scheduler when calling hw_fini (v2)
This gurantees no more work on the ring can be submitted
to hardware in suspend/resume case, otherwise a potential
race will occur and the ring will get no chance to stay
empty before suspend.

v2: Call drm_sched_resubmit_job before drm_sched_start to
restart jobs from the pending list.

Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-31 14:19:47 -04:00
John Clements
156872b07e drm/amdgpu: Clear RAS interrupt status on aldebaran
Resolve incorrect register address

Reviewed-by: Candice Li <candice.li@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:19:37 -04:00
Linus Torvalds
7d6e3fa87e Updates to the interrupt core and driver subsystems:
Core changes:
 
    - The usual set of small fixes and improvements all over the place, but nothing
      outstanding
 
 MSI changes:
 
    - Further consolidation of the PCI/MSI interrupt chip code
 
    - Make MSI sysfs code independent of PCI/MSI and expose the MSI interrupts
      of platform devices in the same way as PCI exposes them.
 
 Driver changes:
 
    - Support for ARM GICv3 EPPI partitions
 
    - Treewide conversion to generic_handle_domain_irq() for all chained
      interrupt controllers
 
    - Conversion to bitmap_zalloc() throughout the irq chip drivers
 
    - The usual set of small fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmEsnpsTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoS+/EACQdpRkzl3IDIYqThxVZ8KQzp2rKKVn
 qisAQiWg/6koNJx/yYy62KNAUyKjCIObNtRnWi7OAOx6OvNtQTD2WOLAwkh3Pgw1
 8ePYYl55k+yCs8VoITsZM9jYeO+Tk878pU2A6R943zR+g6G7bskGJrxEyZ9TbzIe
 qKfusNKnRY9/jMQaRALUAAtA9VIVR867GqORX5X8hKz8yE2rqlpb4y+1CFba5BTV
 Vlxw7cIXvXBn7BKAom5diRqEGDNJEbX+56jJ7yDZshgLo7m11D7QLw72kmb6TNVC
 g7PchvFi4afpc1ifEAAp0tk4RiSIAQ91nS3n0+jLcLbodOjIkl14eY02ZCJGAP29
 uslyzUbmy1wgejG6CA63JtZ4MYdrf/OSMGuoN78qnOKYcIsWFzOvlJmBWWNW34qW
 LCaUF9QdJ/slXu6B4vIx30GfN9q4myml8bFUobE5q9mBRrEk4R0B7iyBvPu1xKYr
 ZEan67prI5VEu+afJGpp4r294m4HNVkMLfl3nYmE5+y4MoLeMNKDY3IPTvI9iP4G
 kaFgoPvQo23WnuclNYpJ+CaA4aRASlB2nTY+oAXIYfehbey9EW5vq4/EK864ek6w
 oyUTepxxNhE81tG2jpQbf2tR4COsEHy986clxqPP4AvsZXcbypCw8O2FcflpQbHO
 5DLEAfTmp7cziQ==
 =qyll
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates to the interrupt core and driver subsystems:

  Core changes:

   - The usual set of small fixes and improvements all over the place,
     but nothing stands out

  MSI changes:

   - Further consolidation of the PCI/MSI interrupt chip code

   - Make MSI sysfs code independent of PCI/MSI and expose the MSI
     interrupts of platform devices in the same way as PCI exposes them.

  Driver changes:

   - Support for ARM GICv3 EPPI partitions

   - Treewide conversion to generic_handle_domain_irq() for all chained
     interrupt controllers

   - Conversion to bitmap_zalloc() throughout the irq chip drivers

   - The usual set of small fixes and improvements"

* tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  platform-msi: Add ABI to show msi_irqs of platform devices
  genirq/msi: Move MSI sysfs handling from PCI to MSI core
  genirq/cpuhotplug: Demote debug printk to KERN_DEBUG
  irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy
  irqdomain: Export irq_domain_disconnect_hierarchy()
  irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
  irqchip/apple-aic: Fix irq_disable from within irq handlers
  pinctrl/rockchip: drop the gpio related codes
  gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type
  gpio/rockchip: support next version gpio controller
  gpio/rockchip: use struct rockchip_gpio_regs for gpio controller
  gpio/rockchip: add driver for rockchip gpio
  dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank
  pinctrl/rockchip: add pinctrl device to gpio bank struct
  pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
  pinctrl/rockchip: always enable clock for gpio controller
  genirq: Fix kernel doc indentation
  EDAC/altera: Convert to generic_handle_domain_irq()
  powerpc: Bulk conversion to generic_handle_domain_irq()
  nios2: Bulk conversion to generic_handle_domain_irq()
  ...
2021-08-30 14:38:37 -07:00
Lang Yu
50c6dedeb1 drm/amdgpu: show both cmd id and name when psp cmd failed
To cover the corner case that people want to know the ID
of an UNKNOWN CMD.

Suggested-by: John Clements <john.clements@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 15:09:21 -04:00
Nicholas Kazlauskas
c5d3c9a093 drm/amdgpu: Enable S/G for Yellow Carp
Missing code for Yellow Carp to enable scatter gather - follows how
DCN21 support was added.

Tested that 8k framebuffer allocation and display can now succeed after
applying the patch.

v2: Add hookup in DM

Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-30 15:08:38 -04:00
Evan Quan
602e338ffe drm/amdgpu: reenable BACO support for 699F:C7 polaris12 SKU
This reverts the commit below:
"drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarily".
As the S3 hang issue has been fixed by another commit:
"drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend".

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
YuBiao Wang
64261a0d06 drm/amd/amdgpu: Add ready_to_reset resp for vega10
Send response to host after received the flr notification from host.
Port NV change to vega10.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Alex Deucher
8f0c93f454 drm/amdgpu: add some additional RDNA2 PCI IDs
New PCI IDs.

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Yifan Zhang
6333a495f5 drm/amdgpu: correct comments in memory type managers
The parameters were renamed.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Luben Tuikov
cc947bf91b drm/amdgpu: Process any VBIOS RAS EEPROM address
We can now process any RAS EEPROM address from
VBIOS. Generalize so as to compute the top three
bits of the 19-bit EEPROM address, from any byte
returned as the "i2c address" from VBIOS.

Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Luben Tuikov
a6a355a22f drm/amdgpu: Fixes to returning VBIOS RAS EEPROM address
1) Generalize the function--if the user didn't set
   i2c_address, still return true/false to
   indicate whether VBIOS contains the RAS EEPROM
   address.  This function shouldn't evaluate
   whether the user set the i2c_address pointer or
   not.

2) Don't touch the caller's i2c_address, unless
   you have to--this function shouldn't have side
   effects.

3) Correctly set the function comment as a
   kernel-doc comment.

Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Daniel Vetter
0e10e9a1db drm/sched: drop entity parameter from drm_sched_push_job
Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Steven Price <steven.price@arm.com> (v1)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Melissa Wen <mwen@igalia.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-6-daniel.vetter@ffwll.ch
2021-08-30 10:54:45 +02:00
Daniel Vetter
dbe48d030b drm/sched: Split drm_sched_job_init
This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
  usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
  to be moved into drm_sched_job_arm, which made me realize that the
  job->id definitely needs to be moved too.

  Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't
be a problem where it is now.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Steven Price <steven.price@arm.com> (v2)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v5)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210817084917.3555822-1-daniel.vetter@ffwll.ch
2021-08-30 10:50:44 +02:00
Dave Airlie
8f0284f190 Merge tag 'amd-drm-next-5.15-2021-08-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.15-2021-08-27:

amdgpu:
- PLL fix for SI
- Misc code cleanups
- RAS fixes
- PSP cleanups
- Polaris UVD/VCE suspend fixes
- aldebaran fixes
- DCN3.x mclk fixes

amdkfd:
- CWSR fixes for arcturus and aldebaran
- SVM fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210827192336.4649-1-alexander.deucher@amd.com
2021-08-30 09:06:03 +10:00
Thomas Gleixner
47fb0cfdb7 irqchip updates for Linux 5.15
API updates:
 
 - Treewide conversion to generic_handle_domain_irq() for anything
   that looks like a chained interrupt controller
 
 - Update the irqdomain documentation
 
 - Use of bitmap_zalloc() throughout the tree
 
 New functionalities:
 
 - Support for GICv3 EPPI partitions
 
 Fixes:
 
 - Qualcomm PDC hierarchy fixes
 
 - Yet another priority decoding fix for the GICv3 pseudo-NMIs
 
 - Fix the apple-aic driver irq_eoi() callback to always unmask
   the interrupt
 
 - Properly handle edge interrupts on loongson-pch-pic
 
 - Let the mtk-sysirq driver advertise IRQCHIP_SKIP_SET_WAKE
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmEqIhgPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDjPkP/Rtp6WNZ1QUfJWmHovnh/Wc6ob1DXcBwi9nX
 hy4miIJ1SWuez9G49RlQAiXZoB28B6KKCKKmouiqu7ke7WUhifS0K1ej188wjxRX
 dqRG+m9yBAqKSr0lyWLB5VVCc8XBz4oZTc28n585gHiXfAPv7u0EzW+zNrnloLU5
 NrAj6ppGFUzVT0VxRqcurbymE6OwRWjc3D+z/PhtHZ4SFOhft95CXgsdvMklqyLj
 wwiuZ0Dhj5EruSP/Z7DzbbXnMNmte3HC2/cUNPYkho4/rk+2gVnYv5kVdfPHKQCY
 Wjti/kvuPC3hdTvdw8g7VQfP63R3clZhcQ8s+myoeX5LWzyAHpoxAtdsbX7oVsgs
 aKyrFhddEFVuiFizYyweS89pL0kCkTob8/zlGeuhRiVRTZ3+kG7Zf2UTTnN1ZdLw
 2lMolghiGk4LYJfr83+CDZyYP/VGHDCthfrmd//l39P2wJhkuCDbbeKaElLGWvUt
 abnPf0buCRqMAJe7vh0GHCx7290nEh2IqyHR4AYRVhRaN7bfAXdZH9Xp6ZGT25Fz
 uORgUbGAyhd5Ics/7twE4qeOkfJ6fxwgXsOlx90EfgVYyDJ1sBHNx8Buo2z8Bl/2
 rwCsW49kU7yX/wp11sJctR72RuLKC23dxS6z7aSWkRc6k3u+8xl2eeLIN59FNrKZ
 ToTdbXEQ
 =bdm2
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

- API updates:

  - Treewide conversion to generic_handle_domain_irq() for anything
    that looks like a chained interrupt controller

  - Update the irqdomain documentation

  - Use of bitmap_zalloc() throughout the tree

- New functionalities:

  - Support for GICv3 EPPI partitions

- Fixes:

  - Qualcomm PDC hierarchy fixes

  - Yet another priority decoding fix for the GICv3 pseudo-NMIs

  - Fix the apple-aic driver irq_eoi() callback to always unmask
    the interrupt

  - Properly handle edge interrupts on loongson-pch-pic

  - Let the mtk-sysirq driver advertise IRQCHIP_SKIP_SET_WAKE

Link: https://lore.kernel.org/r/20210828121013.2647964-1-maz@kernel.org
2021-08-29 21:19:50 +02:00