Commit Graph

765 Commits

Author SHA1 Message Date
Yong Zhao
e694530418 drm/amdkfd: Avoid ambiguity by indicating it's cp queue
The queues represented in queue_bitmap are only CP queues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:20:05 -05:00
Yong Zhao
81b820b304 drm/amdkfd: Rename queue_count to active_queue_count
The new name makes the code easier to understand.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:19:38 -05:00
Divya Shikre
0c663695a6 drm/amd: Extend ROCt to surface UUID for devices that have them
Devices from Arcturus onwards will have their UUID exposed to Thunk.
Add the necessary functions to the kernel to propagate the UUID.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:18:17 -05:00
Rajneesh Bhardwaj
9593f4d6a6 drm/amdkfd: refactor runtime pm for baco
So far the kfd driver has used the same routines for runtime and
system-wide suspend and resume (s2idle or mem). During system-wide suspend
the kfd acquires an atomic lock that prevents user processes from
creating queues and interacting with the kfd driver and amd gpu. This
mechanism causes a problem when the amdgpu device is runtime suspended
with BACO enabled: any application that relies on the kfd driver fails to
load because the driver reports a locked kfd device while the gpu is
runtime suspended.

However, in an ideal case, when the gpu is runtime suspended the kfd driver
should be able to:

 - auto resume amdgpu driver whenever a client requests compute service
 - prevent runtime suspend for amdgpu while kfd is in use

This change refactors the amdgpu and amdkfd drivers to support BACO and
runtime power management.

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:00:54 -05:00
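
The two goals listed in the commit above (resume on demand, and no runtime suspend while kfd is in use) amount to usage counting. What follows is a minimal Python model of that idea, not the driver code. The class and method names are hypothetical and chosen only for illustration.

```python
class RuntimePmDevice:
    """Toy model of a usage-counted, runtime-suspendable GPU (hypothetical names)."""

    def __init__(self):
        self.usage_count = 0      # number of active compute clients
        self.suspended = True     # start runtime suspended, e.g. in BACO

    def get(self):
        """A client requests compute service: resume if needed, then pin the device."""
        if self.suspended:
            self.suspended = False  # auto resume on demand
        self.usage_count += 1

    def put(self):
        """A client is done: drop its pin on the device."""
        if self.usage_count == 0:
            raise RuntimeError("unbalanced put()")
        self.usage_count -= 1

    def try_runtime_suspend(self):
        """Suspend only when no client holds the device; return whether it suspended."""
        if self.usage_count > 0:
            return False
        self.suspended = True
        return True
```

In this model a runtime suspend attempt is refused, rather than failing the client, as long as any get() has not been balanced by a put().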
Rajneesh Bhardwaj
3c1224c02e drm/amdkfd: show warning when kfd is locked
During system suspend the kfd driver acquires a lock that prohibits
further kfd actions until the gpu is resumed. This adds some info which
can be useful while debugging.

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:00:47 -05:00
Amber Lin
6d220a7e79 drm/amdkfd: Add queue information to sysfs
Provide compute queue information in sysfs under /sys/class/kfd/kfd/proc.
The format is /sys/class/kfd/kfd/proc/<pid>/queues/<queue id>/XX, where
XX is one of three files, size, type, and gpuid, which represent the queue
size, the queue type, and the GPU this queue uses. The <queue id> folder
and the files underneath are generated when a queue is created. They are
removed when the queue is destroyed.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-06 15:04:38 -05:00
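
The sysfs layout described in the commit above can be walked with a few lines of user-space code. This is a sketch only: the function name read_queue_info is hypothetical, the root parameter exists so the code can be pointed at a test directory, and values are kept as raw text because the commit message does not specify their format.

```python
from pathlib import Path


def read_queue_info(pid, root="/sys/class/kfd/kfd/proc"):
    """Return {queue_id: {"size": ..., "type": ..., "gpuid": ...}} for one process."""
    queues_dir = Path(root) / str(pid) / "queues"
    info = {}
    if not queues_dir.is_dir():
        return info  # process has no queues, or the KFD sysfs tree is absent
    for queue_dir in sorted(queues_dir.iterdir()):
        if not queue_dir.is_dir():
            continue
        entry = {}
        for name in ("size", "type", "gpuid"):
            attr = queue_dir / name
            # Keep the raw text; a missing file is reported as None.
            entry[name] = attr.read_text().strip() if attr.is_file() else None
        info[queue_dir.name] = entry
    return info
```

Because the queue folders come and go with queue creation and destruction, a caller should expect entries to vanish between listing and reading.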
Yong Zhao
f38abc15d1 drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode
The sdma_queue_count increment should be done before
execute_queues_cpsch(), which calls pm_calc_rlib_size() where
sdma_queue_count is used to calculate whether over_subscription is
triggered.

With the previous code, when an SDMA queue is created,
compute_queue_count in pm_calc_rlib_size() is one more than the
actual compute queue number, because queue_count has been
incremented while sdma_queue_count has not. This patch fixes that.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 10:32:41 -05:00
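
The ordering problem in the commit above can be reproduced with a small model. This is a sketch under assumptions: it takes the compute queue count to be queue_count minus sdma_queue_count, as the commit message implies, and the class, the max_compute_queues limit, and the increment_first switch are hypothetical.

```python
class QueueCounters:
    """Toy model of the counters named in the commit message."""

    def __init__(self, max_compute_queues):
        self.queue_count = 0
        self.sdma_queue_count = 0
        self.max_compute_queues = max_compute_queues  # hypothetical limit

    def is_over_subscribed(self):
        # Assumption: compute queues = all queues minus SDMA queues.
        compute_queue_count = self.queue_count - self.sdma_queue_count
        return compute_queue_count > self.max_compute_queues

    def create_compute_queue(self):
        self.queue_count += 1
        return self.is_over_subscribed()

    def create_sdma_queue(self, increment_first):
        self.queue_count += 1
        if increment_first:
            self.sdma_queue_count += 1      # fixed order
        over = self.is_over_subscribed()    # stands in for execute_queues_cpsch()
        if not increment_first:
            self.sdma_queue_count += 1      # old order: too late for the check
        return over
```

With two compute queues and a limit of two, creating one SDMA queue reports over-subscription in the old order and not in the fixed order, although both end with the same counters.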
Yong Zhao
5205503929 drm/amdkfd: Add a message when SW scheduler is used
The SW scheduler was previously called non HW scheduler, or non HWS. This
message is useful when triaging issues from dmesg.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:38:07 -05:00
Huang Rui
8eee00f615 drm/amdkfd: use map_queues for hiq on gfx v10 as well
To align with gfx v9, we use the map_queues packet to load hiq MQD.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:57 -05:00
Aaron Liu
35cd89d5a6 drm/amdkfd: use kiq to load the mqd of hiq queue for gfx v9 (v6)
CP checks that the HIQ queue is configured and mapped with the KIQ ring;
otherwise it is unable to read back the secure buffer while gfxoff is
enabled, even with trusted IP blocks.

v1 -> v2:
- Fix to remove surplus set_resources packets.
- Fill the whole configuration in MQD.
- Change the author as Aaron because he addressed the key point of this issue.
- Add kiq ring lock.

v2 -> v3:
- Free the lock while in error return case.
- Remove the programming that is only needed when the queue is unmapped.

v3 -> v4:
- Remove doorbell programming because it's used for restarting queue.
- Remove CP scheduler programming because map_queue packet will handle this.

v4 -> v5:
- Remove cp_hqd_active because mec ucode will enable it when map_queues is used.
- Revise goto out_unlock.
- Correct the doorbell offset for HIQ that the kfd driver assigned in the
  packet.

v5 -> v6:
- Merge the Arcturus fix into this patch because otherwise it will oops on
  the Arcturus platform.

Reported-by: Lisa Saturday <Lisa.Saturday@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:50 -05:00
Alex Sierra
ffa022696f drm/amdgpu: GPU TLB flush API moved to amdgpu_amdkfd
[Why]
The TLB flush method using the kfd2kgd interface has been deprecated.
The implementation is now in the amdgpu_amdkfd API.

[How]
TLB flush functions are now implemented in amdgpu_amdkfd.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:33 -05:00
Felix Kuehling
0f899fd466 drm/amdkfd: Improve kfd_process lookup in kfd_ioctl
Use filep->private_data to store a pointer to the kfd_process data
structure. Take an extra reference for that, which gets released in
the kfd_release callback. Check that the process calling kfd_ioctl
is the same that opened the file descriptor. Return -EBADF if it's
not, so that this error can be distinguished in user mode.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-09 16:08:19 -05:00
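
The lookup scheme in the commit above (pointer stored at open, identity check in the ioctl, reference dropped at release) can be modeled as follows. This is a Python sketch, not the kernel code: Process and File are hypothetical stand-ins for the kfd_process structure and the file pointer, and only the -EBADF result comes from the commit message.

```python
import errno


class Process:
    """Hypothetical stand-in for the kfd_process data structure."""

    def __init__(self, pid):
        self.pid = pid
        self.refcount = 1


class File:
    """Hypothetical stand-in for the file pointer with its private_data slot."""

    def __init__(self):
        self.private_data = None


def kfd_open(filep, process):
    process.refcount += 1            # extra reference owned by the file
    filep.private_data = process


def kfd_ioctl(filep, caller):
    process = filep.private_data
    if process is not caller:
        return -errno.EBADF          # caller did not open this descriptor
    return 0


def kfd_release(filep):
    process = filep.private_data
    if process is not None:
        process.refcount -= 1        # drop the reference taken in kfd_open
        filep.private_data = None
```

A distinct error code lets user mode tell "wrong process for this descriptor" apart from other ioctl failures, for example after a fork.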
Felix Kuehling
c2a77fde10 drm/amdkfd: Avoid hanging hardware in stop_cpsch
Don't use the HWS if it's known to be hanging. In a reset also
don't try to destroy the HIQ because that may hang on SRIOV if the
KIQ is unresponsive.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:55:04 -05:00
Felix Kuehling
09c34e8d7a drm/amdkfd: Improve HWS hang detection and handling
Move HWS hang detection into unmap_queues_cpsch to catch hangs in all
cases. If this happens during a reset, don't schedule another reset
because the reset already in progress is expected to take care of it.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:54:56 -05:00
Felix Kuehling
63e088acfc drm/amdkfd: Remove unused variable
dqm->pipeline_mem wasn't used anywhere.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:54:50 -05:00
Felix Kuehling
2bdac179e2 drm/amdkfd: Fix permissions of hang_hws
Reading from /sys/kernel/debug/kfd/hang_hws would cause a kernel
oops because we didn't implement a read callback. Set the permission
to write-only to prevent that.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:54:30 -05:00
Huang Rui
f4feb9fa45 drm/amdkfd: expose num_cp_queues data field to topology node (v2)
The Thunk driver would like to know the num_cp_queues data, but this value is
ASIC specific, so it's better to get it from the kfd driver.

v2: don't update name size.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-19 10:10:18 -05:00
Huang Rui
bb71c74db3 drm/amdkfd: expose num_sdma_queues_per_engine data field to topology node (v2)
The Thunk driver would like to know the num_sdma_queues_per_engine data, but
this value is ASIC specific, so it's better to get it from the kfd
driver.

v2: don't update the name size.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-19 10:10:05 -05:00
Philip Yang
b3eca59d99 drm/amdkfd: queue kfd interrupt work to different CPU
Because queue_work schedules the work on the same CPU that the interrupt
handler is running on, if there are many interrupts pending, it takes
longer for the work queue to start, or even worse, the system will hang.

v2: queue work to same NUMA node for better cache locality
v3: handle cpumask_next wraparound case

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Eric Huang <JinhuiEric.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:05 -05:00
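
The v2 and v3 notes in the commit above (stay on the same NUMA node, handle wraparound) suggest a "next CPU in the node" selection. The sketch below illustrates that idea in Python. The function name and the exact policy are assumptions, since the commit message does not spell out the selection rule.

```python
def pick_work_cpu(current_cpu, node_cpus):
    """Pick the next CPU after current_cpu among node_cpus, wrapping around.

    node_cpus is the set of CPUs on the interrupt handler's NUMA node
    (hypothetical input). Falls back to current_cpu if it is the only one.
    """
    candidates = sorted(c for c in node_cpus if c != current_cpu)
    if not candidates:
        return current_cpu
    for cpu in candidates:
        if cpu > current_cpu:
            return cpu
    return candidates[0]  # wraparound: no higher-numbered CPU on this node
```

Keeping the choice inside the node trades a little balancing freedom for cache locality, which is the motivation the v2 note gives.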
Daniel Vetter
be452c4e8d Merge tag 'drm-next-5.6-2019-12-11' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.6-2019-12-11:

amdgpu:
- Add MST atomic routines
- Add support for DMCUB (new helper microengine for displays)
- Add OEM i2c support in DC
- Use vstartup for vblank events on DCN
- Simplify Kconfig for DC
- Renoir fixes for DC
- Clean up function pointers in DC
- Initial support for HDCP 2.x
- Misc code cleanups
- GFX10 fixes
- Rework JPEG engine handling for VCN
- Add clock and power gating support for JPEG
- BACO support for Arcturus
- Cleanup PSP ring handling
- Add framework for using BACO with runtime pm to save power
- Move core pci state handling out of the driver for pm ops
- Allow guest power control in 1 VF case with SR-IOV
- SR-IOV fixes
- RAS fixes
- Support for power metrics on renoir
- Golden settings updates for gfx10
- Enable gfxoff on supported navi10 skus
- Update MAINTAINERS

amdkfd:
- Clean up generational gfx code
- Fixes for gfx10
- DIQ fixes
- Share more code with amdgpu

radeon:
- PPC DMA fix
- Register checker fixes for r1xx/r2xx
- Misc cleanups

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211223020.7510-1-alexander.deucher@amd.com
2019-12-17 18:47:46 +01:00
Dave Airlie
d16f0f6140 Merge tag 'drm-fixes-5.5-2019-12-12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
drm-fixes-5.5-2019-12-12:

amdgpu:
- DC fixes for renoir
- Gfx8 fence flush align with mesa
- Power profile fix for arcturus
- Freesync fix
- DC I2c over aux fix
- DC aux defer fix
- GPU reset fix
- GPUVM invalidation semaphore fixes for PCO and SR-IOV
- Golden settings updates for gfx10

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191212223211.8034-1-alexander.deucher@amd.com
2019-12-13 14:50:01 +10:00
Alex Deucher
ad808910be drm/amdgpu: fix license on Kconfig and Makefiles
amdgpu is MIT licensed.

Fixes: ec8f24b7fa ("treewide: Add SPDX license identifier - Makefile/Kconfig")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:08 -05:00
Alex Deucher
bd95c14452 drm/amdgpu: fix license on Kconfig and Makefiles
amdgpu is MIT licensed.

Fixes: ec8f24b7fa ("treewide: Add SPDX license identifier - Makefile/Kconfig")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 14:29:38 -05:00
Linus Torvalds
7ada90eb9c drm msm + fixes for 5.5-rc1
msm-next:
 - OCMEM support for a3xx and a4xx GPUs.
 - a510 support + display support
 
 core:
 - mst payload deletion fix
 
 i915:
 - uapi alignment fix
 - fix for power usage regression due to security fixes
 - change default preemption timeout to 640ms from 100ms
 - EHL voltage level display fixes
 - TGL DGL PHY fix
 - gvt - MI_ATOMIC cmd parser fix, CFL non-priv warning
 - CI spotted deadlock fix
 - EHL port D programming fix
 
 amdgpu:
 - VRAM lost fixes on BACO for CI/VI
 - navi14 DC fixes
 - misc SR-IOV, gfx10 fixes
 - XGMI fixes for arcturus
 - SRIOV fixes
 
 amdkfd:
 - KFD on ppc64le enabled
 - page table optimisations
 
 radeon:
 - fix for r1xx/2xx register checker.
 
 tegra:
 - displayport regression fixes
 - DMA API regression fixes
 
 mgag200:
 - fix devices that can't scanout except at 0 addr
 
 omap:
 - fix dma_addr refcounting

Merge tag 'drm-next-2019-12-06' of git://anongit.freedesktop.org/drm/drm

Pull more drm updates from Dave Airlie:
 "Rob pointed out I missed his pull request for msm-next, it's been in
  next for a while outside of my tree so shouldn't cause any unexpected
  issues, it has some OCMEM support in drivers/soc that is acked by
  other maintainers as it's outside my tree.

  Otherwise it's a usual fixes pull, i915, amdgpu, the main ones, with
  some tegra, omap, mgag200 and one core fix.

  Summary:

  msm-next:
   - OCMEM support for a3xx and a4xx GPUs.
   - a510 support + display support

  core:
   - mst payload deletion fix

  i915:
   - uapi alignment fix
   - fix for power usage regression due to security fixes
   - change default preemption timeout to 640ms from 100ms
   - EHL voltage level display fixes
   - TGL DGL PHY fix
   - gvt - MI_ATOMIC cmd parser fix, CFL non-priv warning
   - CI spotted deadlock fix
   - EHL port D programming fix

  amdgpu:
   - VRAM lost fixes on BACO for CI/VI
   - navi14 DC fixes
   - misc SR-IOV, gfx10 fixes
   - XGMI fixes for arcturus
   - SRIOV fixes

  amdkfd:
   - KFD on ppc64le enabled
   - page table optimisations

  radeon:
   - fix for r1xx/2xx register checker.

  tegra:
   - displayport regression fixes
   - DMA API regression fixes

  mgag200:
   - fix devices that can't scanout except at 0 addr

  omap:
   - fix dma_addr refcounting"

* tag 'drm-next-2019-12-06' of git://anongit.freedesktop.org/drm/drm: (100 commits)
  drm/dp_mst: Correct the bug in drm_dp_update_payload_part1()
  drm/omap: fix dma_addr refcounting
  drm/tegra: Run hub cleanup on ->remove()
  drm/tegra: sor: Make the +5V HDMI supply optional
  drm/tegra: Silence expected errors on IOMMU attach
  drm/tegra: vic: Export module device table
  drm/tegra: sor: Implement system suspend/resume
  drm/tegra: Use proper IOVA address for cursor image
  drm/tegra: gem: Remove premature import restrictions
  drm/tegra: gem: Properly pin imported buffers
  drm/tegra: hub: Remove bogus connection mutex check
  ia64: agp: Replace empty define with do while
  agp: Add bridge parameter documentation
  agp: remove unused variable num_segments
  agp: move AGPGART_MINOR to include/linux/miscdevice.h
  agp: remove unused variable size in agp_generic_create_gatt_table
  drm/dp_mst: Fix build on systems with STACKTRACE_SUPPORT=n
  drm/radeon: fix r1xx/r2xx register checker for POT textures
  drm/amdgpu: fix GFX10 missing CSIB set(v3)
  drm/amdgpu: should stop GFX ring in hw_fini
  ...
2019-12-06 10:28:09 -08:00
Yong Zhao
a5a4d68c93 drm/amdkfd: Eliminate unnecessary kernel queue function pointers
Up to this point, those functions are the same for all ASICs, so there is
no need to call them through function pointers. Removing the function
pointers greatly improves code readability. If those function pointers
are ever needed, we can add them back.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:24:36 -05:00
Linus Torvalds
0da522107e compat_ioctl: remove most of fs/compat_ioctl.c
As part of the cleanup of some remaining y2038 issues, I came to
 fs/compat_ioctl.c, which still has a couple of commands that need support
 for time64_t.
 
 In completely unrelated work, I spent time on cleaning up parts of this
 file in the past, moving things out into drivers instead.
 
 After Al Viro reviewed an earlier version of this series and did a lot
 more of that cleanup, I decided to try to completely eliminate the rest
 of it and move it all into drivers.
 
 This series incorporates some of Al's work and many patches of my own,
 but in the end stops short of actually removing the last part, which is
 the scsi ioctl handlers. I have patches for those as well, but they need
 more testing or possibly a rewrite.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground

Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann:
 "As part of the cleanup of some remaining y2038 issues, I came to
  fs/compat_ioctl.c, which still has a couple of commands that need
  support for time64_t.

  In completely unrelated work, I spent time on cleaning up parts of
  this file in the past, moving things out into drivers instead.

  After Al Viro reviewed an earlier version of this series and did a lot
  more of that cleanup, I decided to try to completely eliminate the
  rest of it and move it all into drivers.

  This series incorporates some of Al's work and many patches of my own,
  but in the end stops short of actually removing the last part, which
  is the scsi ioctl handlers. I have patches for those as well, but they
  need more testing or possibly a rewrite"

* tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits)
  scsi: sd: enable compat ioctls for sed-opal
  pktcdvd: add compat_ioctl handler
  compat_ioctl: move SG_GET_REQUEST_TABLE handling
  compat_ioctl: ppp: move simple commands into ppp_generic.c
  compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t
  compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic
  compat_ioctl: unify copy-in of ppp filters
  tty: handle compat PPP ioctls
  compat_ioctl: move SIOCOUTQ out of compat_ioctl.c
  compat_ioctl: handle SIOCOUTQNSD
  af_unix: add compat_ioctl support
  compat_ioctl: reimplement SG_IO handling
  compat_ioctl: move WDIOC handling into wdt drivers
  fs: compat_ioctl: move FITRIM emulation into file systems
  gfs2: add compat_ioctl support
  compat_ioctl: remove unused convert_in_user macro
  compat_ioctl: remove last RAID handling code
  compat_ioctl: remove /dev/raw ioctl translation
  compat_ioctl: remove PCI ioctl translation
  compat_ioctl: remove joystick ioctl translation
  ...
2019-12-01 13:46:15 -08:00
Timothy Pearson
c38402fe6c amdgpu: Enable KFD on POWER systems
KFD has been verified to function on POWER systems (Talos II / Vega 64).
It should be available as a kernel configuration option on these systems.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 12:24:53 -05:00
Timothy Pearson
70ebe8a482 amdgpu: Enable KFD on POWER systems
KFD has been verified to function on POWER systems (Talos II / Vega 64).
It should be available as a kernel configuration option on these systems.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 09:48:57 -05:00
Yong Zhao
c8c50a7e5d drm/amdkfd: Remove duplicate functions update_mqd_hiq()
The functions are the same as update_mqd().

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Yong Zhao
7633c5e0bd drm/amdkfd: DIQ should not use HIQ way to allocate memory
In the mqd_diq_sdma buffer, there should be only one HIQ mqd. All DIQs
should be allocated somewhere else using the regular way.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Yong Zhao
d7c0b0477b drm/amdkfd: Delete KFD_MQD_TYPE_COMPUTE
It is the same as KFD_MQD_TYPE_CP, so delete it. As a result, we will
have one less mqd manager per device.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
zhengbin
d191bd6781 drm/amdkfd: remove set but not used variable 'top_dev'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdkfd/kfd_iommu.c: In function kfd_iommu_device_init:
drivers/gpu/drm/amd/amdkfd/kfd_iommu.c:65:30: warning: variable top_dev set but not used [-Wunused-but-set-variable]

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 1ae99eab34 ("drm/amdkfd: Initialize HSA_CAP_ATS_PRESENT capability in topology codes")
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Yong Zhao
594d0c90a4 drm/amdkfd: Rename kfd_kernel_queue_*.c to kfd_packet_manager_*.c
After the recent cleanup, the functionalities provided by the previous
kfd_kernel_queue_*.c are actually all packet manager related. So rename
them to reflect that.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 09:47:23 -05:00
Yong Zhao
ccdef35d07 drm/amdkfd: Eliminate ops_asic_specific in kernel queue
The ops_asic_specific function pointers are actually quite generic after
using a simple if condition. Eliminate it by code refactoring.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 09:47:17 -05:00
Yong Zhao
84ce6c4867 drm/amdkfd: Merge CIK kernel queue functions into VI
The only difference between the CIK and VI kernel queue functions is that
CIK avoids allocating eop_mem. We can achieve that with an if
condition.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 09:47:02 -05:00
yu kuai
a1bd079fca drm/amdgpu: remove set but not used variable 'count'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdkfd/kfd_device.c: In function ‘kgd2kfd_post_reset’:
drivers/gpu/drm/amd/amdkfd/kfd_device.c:745:11: warning: variable ‘count’ set but not used [-Wunused-but-set-variable]

'count' is never used, so it can be removed. Thus 'atomic_dec_return'
can be replaced with 'atomic_dec'.

Fixes: e42051d213 ("drm/amdkfd: Implement GPU reset handlers in KFD")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:46 -05:00
Yong Zhao
8c27a0c451 drm/amdkfd: Stop using GFP_NOIO explicitly for two places
Adapt the change from: 1cd106ecfc ("drm/amdkfd: Stop using GFP_NOIO explicitly")

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
2a7f8883f4 drm/amdkfd: Use QUEUE_IS_ACTIVE macro in mqd v10
This is done for other GFX in commit bb2d2128a5. Port it to GFX10.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
bc05b0ec15 drm/amdkfd: Fix a bug when calculating save_area_used_size
Workgroup context data is written starting from m->cp_hqd_cntl_stack_size,
so we should deduct it when calculating the used size.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
681a9167dd drm/amdkfd: Update get_wave_state() for GFX10
Given that the control stack is now in the userspace context save restore
area on GFX10, the same as GFX8, there is no need to copy it back to userspace.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
4d428e912b drm/amdkfd: Implement queue priority controls for gfx10
Ported from gfx9.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
2d030d3e97 drm/amdkfd: Rename create_cp_queue() to init_user_queue()
create_cp_queue() could also work with SDMA queues, so we should rename
it. It only initializes the data values rather than creating queues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
e47a8b5223 drm/amdkfd: Avoid using doorbell_off as offset in process doorbell pages
doorbell_off in the queue properties is mainly used for the doorbell dw
offset in the pci bar. We should not set it to the doorbell byte offset in
the process doorbell pages. This makes the code much easier to read.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
339903fa98 drm/amdkfd: Use better name to indicate the offset is in dwords
The doorbell offset could mean the byte offset or the dword offset,
and the zero point also differs: sometimes it is the start of the PCI
doorbell bar, sometimes the start of the process doorbell pages. Use a
better name to avoid confusion.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
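
The two ambiguities named in the commit above (bytes versus dwords, and BAR-relative versus process-page-relative) come down to a unit conversion and a change of origin. The sketch below makes both explicit. The helper names and inputs are hypothetical; only the 4-byte dword size is a given.

```python
DWORD_SIZE = 4  # bytes per dword


def dword_to_byte_offset(offset_dw):
    return offset_dw * DWORD_SIZE


def byte_to_dword_offset(offset_bytes):
    if offset_bytes % DWORD_SIZE:
        raise ValueError("byte offset is not dword aligned")
    return offset_bytes // DWORD_SIZE


def bar_offset_dw(process_pages_start_bytes, offset_in_process_dw):
    """Dword offset from the start of the PCI doorbell BAR.

    process_pages_start_bytes is the byte offset of the process doorbell
    pages within the BAR (hypothetical input).
    """
    return byte_to_dword_offset(process_pages_start_bytes) + offset_in_process_dw
```

Carrying the unit in the name, as the _dw and _bytes suffixes do here, is the same remedy the commit applies.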
Yong Zhao
2945375571 drm/amdkfd: Simplify the mmap offset related bit operations
The new code uses straightforward bit shifts and thus has better readability.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
5d4634b5d4 drm/amdkfd: Use kernel queue v9 functions for v10
The kernel queue functions for v9 and v10 are the same except for
pm_map_process_v*, which differ slightly, so they should be reused.
This eliminates the need to reapply several patches that were
applied on v9 but not on v10, such as bigger GWS and support for more
than 2 SDMA engines, which were introduced on Arcturus.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
d2c6c1077a drm/amdkfd: Only keep release_mem function for Hawaii
release_mem is only used for Hawaii, but because GFX7 and GFX8 share the
same function pointer structure, we only delete release_mem for GFX9
and GFX10.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:45 -05:00
Yong Zhao
b805323c31 drm/amdkfd: Adjust function sequences to avoid unnecessary declarations
This is cleaner.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Dave Airlie
8a86b00a43 Merge tag 'drm-next-5.5-2019-11-01' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-01:

amdgpu:
- Add EEPROM support for Arcturus
- Enable VCN encode support for Arcturus
- Misc PSP fixes
- Misc DC fixes
- swSMU cleanup

amdkfd:
- Misc cleanups
- Fix typo in cu bitmap parsing

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101190607.3763-1-alexander.deucher@amd.com
2019-11-04 10:22:53 +10:00
Alex Sierra
ef66915653 drm/amdkfd: bug fix for out of bounds mem on gpu cache filling info
The bitmap in the cu_info structure is defined as a 4x4 array. On
Arcturus, this matrix is initialized as 4x2, based on the 8 shaders.
In the gpu cache filling initialization, the bitmap matrix was accessed
as 8x1 instead of 4x2, causing an out of bounds memory access error.
Due to this, the number of GPU cache entries was inconsistent.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
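
The indexing mismatch in the commit above can be shown on a 4x4 array of which only a 4x2 region is filled. This is an illustrative Python sketch: the mapping used in read_as_4x2 is an assumption, the mask value is a placeholder, and Python raises IndexError where C would silently read past the array.

```python
ROWS, COLS = 4, 4  # declared size of the bitmap array


def make_bitmap(used_rows=4, used_cols=2):
    bitmap = [[0] * COLS for _ in range(ROWS)]
    for r in range(used_rows):
        for c in range(used_cols):
            bitmap[r][c] = 0xFF  # placeholder CU mask, not real hardware data
    return bitmap


def read_as_8x1(bitmap, i):
    # Buggy view: treats the linear index 0..7 as a row index.
    return bitmap[i][0]


def read_as_4x2(bitmap, i):
    # Assumed mapping of a linear index 0..7 onto the 4x2 layout.
    return bitmap[i % ROWS][i // ROWS]
```

With the 8x1 view, indices 4 through 7 fall outside the four rows; with the 4x2 view, all eight indices land on initialized entries.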
Philip Yang
2c99a547bc drm/amdkfd: don't use dqm lock during device reset/suspend/resume
If device reset/suspend/resume fails for some reason, the dqm lock is
held forever and this causes a deadlock. Below is a kernel backtrace from
when an application opens kfd after suspend/resume failed.

Instead of holding dqm lock in pre_reset and releasing dqm lock in
post_reset, add dqm->sched_running flag which is modified in
dqm->ops.start and dqm->ops.stop. The flag doesn't need lock protection
because write/read are all inside dqm lock.

For the HWS case, map_queues_cpsch and unmap_queues_cpsch check the
sched_running flag before sending the updated runlist.

v2: For the no-HWS case, when the device is stopped, don't call
load/destroy_mqd for eviction, restore and create queue, and avoid
the debugfs dump of hqds.
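
The flag-based approach can be sketched roughly as follows. This is a simplified userspace model, not the driver code: the struct and function names are hypothetical, and the dqm lock that the message says surrounds every read and write of the flag is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of the device queue manager state. */
struct dqm_sketch {
	bool sched_running; /* set by start, cleared by stop */
	int runlists_sent;  /* stands in for submitting a runlist */
};

/* Counterpart of the start op: scheduling is allowed again. */
static void dqm_start_sketch(struct dqm_sketch *dqm)
{
	dqm->sched_running = true;
}

/* Counterpart of the stop op: used around reset/suspend. */
static void dqm_stop_sketch(struct dqm_sketch *dqm)
{
	dqm->sched_running = false;
}

/*
 * Skip the hardware submission while the scheduler is stopped,
 * instead of blocking on a lock held across reset/suspend.
 * In the driver this check would run with the dqm lock held.
 */
static int map_queues_sketch(struct dqm_sketch *dqm)
{
	if (!dqm->sched_running)
		return 0;
	dqm->runlists_sent++;
	return 0;
}
```

Because no lock is held between stop and start, a failed resume leaves the flag cleared but does not block later callers.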

Backtrace of dqm lock deadlock:

[Thu Oct 17 16:43:37 2019] INFO: task rocminfo:3024 blocked for more
than 120 seconds.
[Thu Oct 17 16:43:37 2019]       Not tainted
5.0.0-rc1-kfd-compute-rocm-dkms-no-npi-1131 #1
[Thu Oct 17 16:43:37 2019] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Oct 17 16:43:37 2019] rocminfo        D    0  3024   2947
0x80000000
[Thu Oct 17 16:43:37 2019] Call Trace:
[Thu Oct 17 16:43:37 2019]  ? __schedule+0x3d9/0x8a0
[Thu Oct 17 16:43:37 2019]  schedule+0x32/0x70
[Thu Oct 17 16:43:37 2019]  schedule_preempt_disabled+0xa/0x10
[Thu Oct 17 16:43:37 2019]  __mutex_lock.isra.9+0x1e3/0x4e0
[Thu Oct 17 16:43:37 2019]  ? __call_srcu+0x264/0x3b0
[Thu Oct 17 16:43:37 2019]  ? process_termination_cpsch+0x24/0x2f0
[amdgpu]
[Thu Oct 17 16:43:37 2019]  process_termination_cpsch+0x24/0x2f0
[amdgpu]
[Thu Oct 17 16:43:37 2019]
kfd_process_dequeue_from_all_devices+0x42/0x60 [amdgpu]
[Thu Oct 17 16:43:37 2019]  kfd_process_notifier_release+0x1be/0x220
[amdgpu]
[Thu Oct 17 16:43:37 2019]  __mmu_notifier_release+0x3e/0xc0
[Thu Oct 17 16:43:37 2019]  exit_mmap+0x160/0x1a0
[Thu Oct 17 16:43:37 2019]  ? __handle_mm_fault+0xba3/0x1200
[Thu Oct 17 16:43:37 2019]  ? exit_robust_list+0x5a/0x110
[Thu Oct 17 16:43:37 2019]  mmput+0x4a/0x120
[Thu Oct 17 16:43:37 2019]  do_exit+0x284/0xb20
[Thu Oct 17 16:43:37 2019]  ? handle_mm_fault+0xfa/0x200
[Thu Oct 17 16:43:37 2019]  do_group_exit+0x3a/0xa0
[Thu Oct 17 16:43:37 2019]  __x64_sys_exit_group+0x14/0x20
[Thu Oct 17 16:43:37 2019]  do_syscall_64+0x4f/0x100
[Thu Oct 17 16:43:37 2019]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Dave Airlie
3275a71e76 Merge tag 'drm-next-5.5-2019-10-09' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-10-09:

amdgpu:
- Additional RAS enablement for vega20
- RAS page retirement and bad page storage in EEPROM
- No GPU reset with unrecoverable RAS errors
- Reserve vram for page tables rather than trying to evict
- Fix issues with GPU reset and xgmi hives
- DC i2c over aux fixes
- Direct submission for clears, PTE/PDE updates
- Improvements to help support recoverable GPU page faults
- Silence harmless SAD block messages
- Clean up code for creating a bo at a fixed location
- Initial DC HDCP support
- Lots of documentation fixes
- GPU reset for renoir
- Add IH clockgating support for soc15 asics
- Powerplay improvements
- DC MST cleanups
- Add support for MSI-X
- Misc cleanups and bug fixes

amdkfd:
- Query KFD device info by asic type rather than pci ids
- Add navi14 support
- Add renoir support
- Add navi12 support
- gfx10 trap handler improvements
- pasid cleanups
- Check against device cgroup

ttm:
- Return -EBUSY with pipelining with no_gpu_wait

radeon:
- Silence harmless SAD block messages

device_cgroup:
- Export devcgroup_check_permission

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010041713.3412-1-alexander.deucher@amd.com
2019-10-26 05:56:57 +10:00
Arnd Bergmann
1832f2d8ff compat_ioctl: move more drivers to compat_ptr_ioctl
The .ioctl and .compat_ioctl file operations have the same prototype so
they can both point to the same function, which works great almost all
the time when all the commands are compatible.

One exception is the s390 architecture, where a compat pointer is only
31 bit wide, and converting it into a 64-bit pointer requires calling
compat_ptr(). Most drivers here will never run in s390, but since we now
have a generic helper for it, it's easy enough to use it consistently.

I double-checked all these drivers to ensure that all ioctl arguments
are used as pointers or are ignored, but are not interpreted as integer
values.
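
As a rough userspace illustration of why the conversion matters, the sketch below models a 31-bit compat pointer by clearing the top bit before widening it, then forwards the call to a native handler. All names here are hypothetical, and the handler type is an assumption made for the example; the kernel helper operates on struct file and user pointers instead.

```c
#include <stdint.h>

/* Illustrative 31-bit conversion: clear the top bit, then widen. */
static uint64_t compat_ptr_sketch(uint32_t uptr)
{
	return (uint64_t)(uptr & 0x7fffffffu);
}

/* Hypothetical native handler type taking a 64-bit argument. */
typedef long (*ioctl_fn_sketch)(unsigned int cmd, uint64_t arg);

/* Sample handler that simply returns the argument it was given. */
static long echo_arg_handler(unsigned int cmd, uint64_t arg)
{
	(void)cmd;
	return (long)arg;
}

/* Forward a compat call after converting the pointer argument. */
static long compat_ptr_ioctl_sketch(ioctl_fn_sketch native,
				    unsigned int cmd, uint32_t arg)
{
	return native(cmd, compat_ptr_sketch(arg));
}
```

This only works when every command treats its argument as a pointer or ignores it, which is the property the message says was double-checked.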

Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-10-23 17:23:44 +02:00
Stephen Rothwell
1cd4d9eead drm/amdkfd: update for drmP.h removal
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-09 12:04:48 -05:00
Harish Kasiviswanathan
6b855f7b83 drm/amdkfd: Check against device cgroup
Participate in device cgroup. All kfd devices are exposed via /dev/kfd,
so use the /dev/dri/renderN node for the permission check.

Before exposing the device to a task check if it has permission to
access it. If the task (based on its cgroup) can access /dev/dri/renderN
then expose the device via kfd node.

If the task cannot access /dev/dri/renderN then process device data
(pdd) is not created. This will ensure that task cannot use the device.

In sysfs topology, all device nodes are visible irrespective of the task
cgroup. The sysfs node directories are created at driver load time and
cannot be changed dynamically. However, access to information inside
nodes is controlled based on the task's cgroup permissions.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:11:38 -05:00
Alex Deucher
a3e520a25c drm/amdkfd: fix the build when CIK support is disabled
Add proper ifdefs around CIK code in kfd setup.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:11:32 -05:00
Dan Carpenter
aa5e899de1 drm/amdkfd: Fix a && vs || typo
In the current code if "device_info" is ever NULL then the kernel will
Oops so probably || was intended instead of &&.
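
The difference between the two operators can be shown with a small sketch. The struct and function names are hypothetical and the check is reduced to two required pointers; it is not the driver's actual condition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two lookups that must both succeed. */
struct device_info_sketch { int asic_family; };
struct calls_sketch { int version; };

/*
 * With &&, the inputs are rejected only when BOTH pointers are NULL,
 * so a single NULL pointer slips through and is dereferenced later.
 * With ||, either NULL pointer is enough to reject the inputs.
 */
static bool probe_inputs_valid(const struct device_info_sketch *info,
			       const struct calls_sketch *calls)
{
	if (!info || !calls)
		return false;
	return true;
}
```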

Fixes: e392c887df ("drm/amdkfd: Use array to probe kfd2kgd_calls")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:11:02 -05:00
Colin Ian King
63617d8b12 drm/amdkfd: add missing void argument to function kgd2kfd_init
Function kgd2kfd_init is missing a void argument, add it
to clean up the non-ANSI function declaration.

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:10:26 -05:00
Oak Zeng
c4bb16e0f8 drm/amdkfd: Print more sdma engine hqds in debug fs
Previously only PCIe-optimized SDMA engine hqds were
exposed in debug fs. Print all SDMA engine hqds.

Reported-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:10:01 -05:00
Oak Zeng
40a9592a26 drm/amdkfd: Fix MQD size calculation
On device initialization, a chunk of GTT memory is pre-allocated for
HIQ and all SDMA queues mqd. The size of this allocation was wrong.
The correct sdma engine number should be PCIe-optimized SDMA engine
number plus xgmi SDMA engine number.
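
The corrected sizing can be sketched as below. The function name, the parameters and the per-engine queue count are assumptions for illustration; only the rule that both engine types are summed comes from the message.

```c
#include <stddef.h>

/*
 * Bytes needed for one HIQ mqd plus one mqd per SDMA queue.
 * The engine count is the sum of both engine types; counting only
 * the PCIe-optimized engines under-allocates when xgmi engines exist.
 */
static size_t hiq_sdma_mqd_bytes(unsigned int num_sdma_engines,
				 unsigned int num_xgmi_sdma_engines,
				 unsigned int queues_per_engine,
				 size_t sdma_mqd_size,
				 size_t hiq_mqd_size)
{
	size_t total_engines = (size_t)num_sdma_engines + num_xgmi_sdma_engines;

	return sdma_mqd_size * total_engines * queues_per_engine + hiq_mqd_size;
}
```

With 2 PCIe-optimized engines, 6 xgmi engines, 8 queues per engine, a 512-byte SDMA mqd and a 1024-byte HIQ mqd (all made-up numbers), the total is 512 * 8 * 8 + 1024 = 33792 bytes, versus 9216 bytes if the xgmi engines are left out.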

Reported-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:09:53 -05:00
Yong Zhao
452f9bdd9a drm/amdkfd: Improve KFD IOCTL printing
The code uses hex defines, so the printing should too.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
e392c887df drm/amdkfd: Use array to probe kfd2kgd_calls
This follows the same idea as the kfd device info probe and moves all
the probe control together for easier maintenance.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Jay Cornwall
c18cc2bb9e drm/amdkfd: Fix race in gfx10 context restore handler
Missing synchronization with VGPR restore leads to intermittent
VGPR trashing in the user shader.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
56fc40aba4 drm/amdkfd: Eliminate get_atc_vmid_pasid_mapping_valid
get_atc_vmid_pasid_mapping_valid() is very similar to
get_atc_vmid_pasid_mapping_pasid(), so they can be merged into a new
function get_atc_vmid_pasid_mapping_info() to reduce register access
times. More importantly, getting the PASID and the valid bit atomically
with a single read fixes some potential race conditions where the
mapping changes between the two reads.
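
A minimal sketch of the single-read idea follows. The bit layout (PASID in the low 16 bits, valid flag in the top bit) and all names are assumptions chosen for the example, not the hardware register definition.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PASID_MASK 0x0000ffffu /* assumed field position */
#define SKETCH_VALID_MASK 0x80000000u /* assumed field position */

/*
 * Decode both fields from ONE register value, so the PASID and the
 * valid bit always describe the same snapshot of the mapping.
 * Two separate register reads could straddle a mapping change.
 */
static bool get_mapping_info_sketch(uint32_t reg_value, uint16_t *pasid)
{
	*pasid = (uint16_t)(reg_value & SKETCH_PASID_MASK);
	return (reg_value & SKETCH_VALID_MASK) != 0;
}
```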

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
3fe023d42e drm/amdkfd: Query vmid pasid mapping through stored info for non HWS
Because we record the mapping under non HWS mode in the software,
we can query pasid through vmid using the stored mapping instead of
reading from ATC registers.

This also prepares for the defeatured ATC block in future ASICs.
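
Querying through a stored mapping could look roughly like this. The table size, the names and the convention that 0 means "no PASID bound" are assumptions for the sketch, not taken from the driver.

```c
#include <stdint.h>

#define SKETCH_NUM_VMIDS 16 /* assumed table size */

/* Hypothetical software copy of the vmid-to-pasid mapping. */
struct vmid_map_sketch {
	uint16_t vmid_pasid[SKETCH_NUM_VMIDS];
};

/* Record the mapping when a vmid is bound to a process. */
static void record_mapping_sketch(struct vmid_map_sketch *map,
				  unsigned int vmid, uint16_t pasid)
{
	if (vmid < SKETCH_NUM_VMIDS)
		map->vmid_pasid[vmid] = pasid;
}

/* Look up the pasid from memory; no register access is involved. */
static uint16_t query_pasid_sketch(const struct vmid_map_sketch *map,
				   unsigned int vmid)
{
	if (vmid >= SKETCH_NUM_VMIDS)
		return 0; /* assumed "no mapping" value */
	return map->vmid_pasid[vmid];
}
```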

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
d9d4623c87 drm/amdkfd: Record vmid pasid mapping in the driver for non HWS mode
This makes possible the vmid pasid mapping query through software.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
6027b1bf60 drm/amdkfd: Use hex print format for pasid
Since KFD pasid starts from 0x8000 (32768 in decimal), it is better
perceived as a hex number. Meanwhile, change the pasid type from
unsigned int to uint16_t to be consistent throughout the code.
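
For illustration, a small formatting helper shows how the first KFD pasid reads in hex. The helper name and message text are hypothetical; the driver uses its own logging macros.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a pasid as hex; returns the number of characters written. */
static int format_pasid_sketch(char *buf, size_t len, uint16_t pasid)
{
	return snprintf(buf, len, "pasid 0x%x", (unsigned int)pasid);
}
```

Passing 32768 produces "pasid 0x8000", which makes the start of the KFD range easy to spot.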

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
22471a5832 drm/amdkfd: Move the control stack on GFX10 to userspace buffer
GFX10 no longer requires the control stack to be right after the mqd
buffer, so move it back to the userspace-allocated CSWR buffer.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Harish Kasiviswanathan
3a0c342392 drm/amd: Pass drm_device to kfd
kfd needs drm_device to call into drm_cgroup functions

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Harish Kasiviswanathan
171bc67eb5 drm/amdkfd: Store kfd_dev in iolink and cache properties
This is required to check against cgroup permissions.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
shaoyunl
0e94b5640b drm/amdkfd: use navi12 specific family id for navi12 code path
Keep the same use of CHIP_IDs for navi12 in kfd

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
shaoyunl
b77fb9d88e drm/amdkfd: Add NAVI12 support from kfd side
Add device info for both navi12 PF and VF

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
c637b36aea drm/amdkfd: Fix NULL pointer dereference for set_scratch_backing_va()
Currently this function pointer is missing for GFX10. Considering it is
a void function since GFX9, fix it by checking the function pointer
before dereferencing it.
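
The guard amounts to treating the callback as optional. In this sketch the ops struct, the callback signature and the recording helper are hypothetical and only model the pattern.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table in which the callback may be left unset. */
struct kfd2kgd_sketch {
	void (*set_scratch_backing_va)(uint64_t va, uint32_t vmid);
};

/* Sample callback that records the last address it was given. */
static uint64_t last_va_seen;

static void record_scratch_va(uint64_t va, uint32_t vmid)
{
	(void)vmid;
	last_va_seen = va;
}

/* Call the hook only when the ASIC backend provides one. */
static bool set_scratch_if_supported(const struct kfd2kgd_sketch *ops,
				     uint64_t va, uint32_t vmid)
{
	if (!ops->set_scratch_backing_va)
		return false;
	ops->set_scratch_backing_va(va, vmid);
	return true;
}
```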

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
424b5442a2 drm/amdkfd: Remove unnecessary pm_init() for non HWS mode
The packet manager is not needed for non HWS mode except on Hawaii, so
only initialize it for Hawaii under non HWS mode. This simplifies
debugging under non HWS mode for all new asics, because it removes one
variable from the equation.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
89b0679bd8 drm/amdkfd: Remove excessive print when reserving doorbells
The dozens of printed messages are compressed into 2 lines.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Allen Pais
81de29d842 drm/amdkfd: fix a potential NULL pointer dereference (v2)
alloc_workqueue is not checked for errors and as a result,
a potential NULL dereference could occur.

v2 (Felix Kuehling):
* Fix compile error (kfifo_free instead of fifo_free)
* Return proper error code
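
The pattern behind this fix (check the second allocation, undo the first on failure, return an error code) can be modelled in userspace as below. The allocator parameters exist only so the failure path can be exercised, and every name here is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pair of resources set up during interrupt init. */
struct interrupt_sketch {
	void *fifo;
	void *wq;
};

typedef void *(*alloc_fn_sketch)(size_t size);

/* Allocator that always fails, to exercise the error path. */
static void *always_fail_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Returns 0 on success or -ENOMEM, leaving nothing allocated on failure. */
static int interrupt_init_sketch(struct interrupt_sketch *s,
				 alloc_fn_sketch alloc_fifo,
				 alloc_fn_sketch alloc_wq)
{
	s->wq = NULL;
	s->fifo = alloc_fifo(64);
	if (!s->fifo)
		return -ENOMEM;

	s->wq = alloc_wq(64);
	if (!s->wq) {
		/* Undo the earlier allocation before reporting the error. */
		free(s->fifo);
		s->fifo = NULL;
		return -ENOMEM;
	}
	return 0;
}
```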

Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Yong Zhao
8daf3eccf8 drm/amdkfd: Delete unused KFD_IS_* macro
These were deleted before, but somehow showed up again. Delete them again.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Linus Torvalds
289991ce1c drm fixes for 5.4-rc1

Merge tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Fixes built up over the past 1.5 weeks or so, it's two weeks of
  amdgpu, some core cleanups and some panfrost fixes. I also finally
  figured out why my desktop was slow to do a bunch of stuff (someone
  gave it an IPv6 address which can't reach anything!).

  core:
   - Some cleanups and fixes in the self-refresh helpers
   - Some cleanups and fixes in the atomic helpers

  amdgpu:
   - Fix a 64 bit divide
   - Prevent a memory leak in a failure case in dc
   - Load proper gfx firmware on navi14 variants
   - Add more navi12 and navi14 PCI ids
   - Misc fixes for renoir
   - Fix bandwidth issues with multiple displays on vega20
   - Support for Dali
   - Fix a possible oops with KFD on hawaii
   - Fix for backlight level after resume on some APUs
   - Other misc fixes

  panfrost:
   - Multiple panfrost fixes for regulator support and page fault
     handling"

* tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm: (34 commits)
  drm/amd/display: prevent memory leak
  drm/amdgpu/gfx10: add support for wks firmware loading
  drm/amdgpu/display: include slab.h in dcn21_resource.c
  drm/amdgpu/display: fix 64 bit divide
  drm/panfrost: Prevent race when handling page fault
  drm/panfrost: Remove NULL checks for regulator
  drm/panfrost: Fix regulator_get_optional() misuse
  drm: Measure Self Refresh Entry/Exit times to avoid thrashing
  drm: Fix kerneldoc and remove unused struct member in self_refresh helper
  drm/atomic: Rename crtc_state->pageflip_flags to async_flip
  drm/atomic: Reject FLIP_ASYNC unconditionally
  drm/atomic: Take the atomic toys away from X
  drm/amdgpu: flag navi12 and 14 as experimental for 5.4
  drm/kms: Duct-tape for mode object lifetime checks
  drm/amdgpu: add navi12 pci id
  drm/amdgpu: add navi14 PCI ID for work station SKU
  drm/amdkfd: Swap trap temporary registers in gfx10 trap handler
  drm/amd/powerplay: implement sysfs for getting dpm clock
  drm/amd/display: Restore backlight brightness after system resume
  drm/amd/display: Implement voltage limitation for dali
  ...
2019-09-27 11:13:35 -07:00
Linus Torvalds
84da111de0 hmm related patches for 5.4

Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull hmm updates from Jason Gunthorpe:
 "This is more cleanup and consolidation of the hmm APIs and the very
  strongly related mmu_notifier interfaces. Many places across the tree
  using these interfaces are touched in the process. Beyond that a
  cleanup to the page walker API and a few memremap related changes
  round out the series:

   - General improvement of hmm_range_fault() and related APIs, more
     documentation, bug fixes from testing, API simplification &
     consolidation, and unused API removal

   - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE,
     and make them internal kconfig selects

   - Hoist a lot of code related to mmu notifier attachment out of
     drivers by using a refcount get/put attachment idiom and remove the
     convoluted mmu_notifier_unregister_no_release() and related APIs.

   - General API improvement for the migrate_vma API and revision of its
     only user in nouveau

   - Annotate mmu_notifiers with lockdep and sleeping region debugging

  Two series unrelated to HMM or mmu_notifiers came along due to
  dependencies:

   - Allow pagemap's memremap_pages family of APIs to work without
     providing a struct device

   - Make walk_page_range() and related use a constant structure for
     function pointers"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (75 commits)
  libnvdimm: Enable unit test infrastructure compile checks
  mm, notifier: Catch sleeping/blocking for !blockable
  kernel.h: Add non_block_start/end()
  drm/radeon: guard against calling an unpaired radeon_mn_unregister()
  csky: add missing brackets in a macro for tlb.h
  pagewalk: use lockdep_assert_held for locking validation
  pagewalk: separate function pointers from iterator data
  mm: split out a new pagewalk.h header from mm.h
  mm/mmu_notifiers: annotate with might_sleep()
  mm/mmu_notifiers: prime lockdep
  mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
  mm/mmu_notifiers: remove the __mmu_notifier_invalidate_range_start/end exports
  mm/hmm: hmm_range_fault() infinite loop
  mm/hmm: hmm_range_fault() NULL pointer bug
  mm/hmm: fix hmm_range_fault()'s handling of swapped out pages
  mm/mmu_notifiers: remove unregister_no_release
  RDMA/odp: remove ib_ucontext from ib_umem
  RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
  RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
  RDMA/mlx5: Use ib_umem_start instead of umem.address
  ...
2019-09-21 10:07:42 -07:00
Linus Torvalds
574cc45397 drm main pull for 5.4-rc1

Merge tag 'drm-next-2019-09-18' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This is the main pull request for 5.4-rc1 merge window. I don't think
  there is anything outstanding so next week should just be fixes, but
  we'll see if I missed anything. I landed some fixes earlier in the
  week but got delayed writing summary and sending it out, due to a mix
  of sick kid and jetlag!

  There are some fixes pending, but I'd rather get the main merge out of
  the way instead of delaying it longer.

  It's also pretty large in commit count and new amd header file size.
  The largest thing is four new amdgpu products (navi12/14, arcturus and
  renoir APU support).

  Otherwise it's pretty much lots of work across the board, i915 has
  started landing tigerlake support, lots of icelake fixes and lots of
  locking reworking for future gpu support, lots of header file rework
  (drmP.h is nearly gone), some old legacy hacks (DRM_WAIT_ON) have been
  put into the places they are needed.

  uapi:
   - content protection type property for HDCP

  core:
   - rework include dependencies
   - lots of drmP.h removals
   - link rate calculation robustness fix
   - make fb helper map only when required
   - add connector->DDC adapter link
   - DRM_WAIT_ON removed
   - drop DRM_AUTH usage from drivers

  dma-buf:
   - reservation object fence helper

  dma-fence:
   - shrink dma_fence struct
   - merge signal functions
   - store timestamps in dma_fence
   - selftests

  ttm:
   - embed drm_get_object struct into ttm_buffer_object
   - release_notify callback

  bridges:
   - sii902x - audio graph card support
   - tc358767 - aux data handling rework
   - ti-snd64dsi86 - debugfs support, DSI mode flags support

  panels:
   - Support for GiantPlus GPM940B0, Sharp LQ070Y3DG3B, Ortustech
     COM37H3M, Novatek NT39016, Sharp LS020B1DD01D, Raydium RM67191, Boe
     Himax8279d, Sharp LD-D5116Z01B
   - TI nspire, NEC NL8048HL11, LG Philips LB035Q02, Sharp LS037V7DW01,
     Sony ACX565AKM, Toppoly TD028TTEC1 Toppoly TD043MTEA1

  i915:
   - Initial tigerlake platform support
   - Locking simplification work, general all over refactoring.
   - Selftests
   - HDCP debug info improvements
   - DSI properties
   - Icelake display PLL fixes, colorspace fixes, bandwidth fixes, DSI
     suspend/resume
   - GuC fixes
   - Perf fixes
   - ElkhartLake enablement
   - DP MST fixes
   - GVT - command parser enhancements

  amdgpu:
   - add wipe memory on release flag for buffer creation
   - Navi12/14 support (may be marked experimental)
   - Arcturus support
   - Renoir APU support
   - mclk DPM for Navi
   - DC display fixes
   - Raven scatter/gather support
   - RAS support for GFX
   - Navi12 + Arcturus power features
   - GPU reset for Picasso
   - smu11 i2c controller support

  amdkfd:
   - navi12/14 support
   - Arcturus support

  radeon:
   - kexec fix

  nouveau:
   - improved display color management
   - detect lack of GPU power cables

  vmwgfx:
   - eviction priority support
   - remove unused security feature

  msm:
   - msm8998 display support
   - better async commit support for cursor updates

  etnaviv:
   - per-process address space support
   - performance counter fixes
   - softpin support

  mcde:
   - DCS transfers fix

  exynos:
   - drmP.h cleanup

  lima:
   - reduce logging

  kirin:
   - misc cleanups

  komeda:
   - dual-link support
   - DT memory regions

  hisilicon:
   - misc fixes

  imx:
   - IPUv3 image converter fixes
   - 32-bit RGB V4L2 pixel format support

  ingenic:
   - more support for panel related cases

  mgag200:
   - cursor support fix

  panfrost:
   - export GPU features register to userspace
   - gpu heap allocations
   - per-fd address space support

  pl111:
   - CLD pads wiring support removed from DT

  rockchip:
   - rework to use DRM PSR helpers
   - fix bug in VOP_WIN_GET macro
   - DSI DT binding rework

  sun4i:
   - improve support for color encoding and range
   - DDC enabled GPIO

  tinydrm:
   - rework SPI support
   - improve MIPI-DBI support
   - moved to drm/tiny

  vkms:
   - rework CRC tracking

  dw-hdmi:
   - get_eld and i2s improvements

  gm12u320:
   - misc fixes

  meson:
   - global code cleanup
   - vpu feature detect

  omap:
   - alpha/pixel blend mode properties

  rcar-du:
   - misc fixes"

* tag 'drm-next-2019-09-18' of git://anongit.freedesktop.org/drm/drm: (2112 commits)
  drm/nouveau/bar/gm20b: Avoid BAR1 teardown during init
  drm/nouveau: Fix ordering between TTM and GEM release
  drm/nouveau/prime: Extend DMA reservation object lock
  drm/nouveau: Fix fallout from reservation object rework
  drm/nouveau/kms/nv50-: Don't create MSTMs for eDP connectors
  drm/i915: Use NOEVICT for first pass on attemping to pin a GGTT mmap
  drm/i915: to make vgpu ppgtt notificaiton as atomic operation
  drm/i915: Flush the existing fence before GGTT read/write
  drm/i915: Hold irq-off for the entire fake lock period
  drm/i915/gvt: update RING_START reg of vGPU when the context is submitted to i915
  drm/i915/gvt: update vgpu workload head pointer correctly
  drm/mcde: Fix DSI transfers
  drm/msm: Use the correct dma_sync calls harder
  drm/msm: remove unlikely() from WARN_ON() conditions
  drm/msm/dsi: Fix return value check for clk_get_parent
  drm/msm: add atomic traces
  drm/msm/dpu: async commit support
  drm/msm: async commit support
  drm/msm: split power control from prepare/complete_commit
  drm/msm: add kms->flush_commit()
  ...
2019-09-19 16:24:24 -07:00
Jay Cornwall
8fde7784ec drm/amdkfd: Swap trap temporary registers in gfx10 trap handler
ttmp[4:5] hold information useful to the debugger. Use ttmp[14:15]
instead, aligning implementation with gfx9 trap handler.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: shaoyun liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:43:02 -05:00
Jay Cornwall
4b617e2b9e drm/amdkfd: Swap trap temporary registers in gfx10 trap handler
ttmp[4:5] hold information useful to the debugger. Use ttmp[14:15]
instead, aligning implementation with gfx9 trap handler.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: shaoyun liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:28:31 -05:00
Huang Rui
acb9acbefe drm/amdkfd: fix the missed asic name while inited renoir_device_info
This patch fixes the null pointer issue below. I missed initializing the
renoir asic name while rebasing the patches.

[  106.004250] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  106.004254] #PF: supervisor read access in kernel mode
[  106.004256] #PF: error_code(0x0000) - not-present page
[  106.004257] PGD 0 P4D 0
[  106.004261] Oops: 0000 [#1] SMP NOPTI
[  106.004264] CPU: 3 PID: 1422 Comm: modprobe Not tainted 5.2.0-rc1-custom #1
[  106.004266] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS
WCD9814N_Weekly_19_08_1 08/14/2019
[  106.004272] RIP: 0010:strncpy+0x12/0x30
[  106.004274] Code: c1 c0 11 48 c1 c6 15 48 31 d0 48 c1 c2 20 31 c2 89 d0 31 f0
41 5c 5d c3 55 48 85 d2 48 89 f8 48 89 e5 74 1e 48 01 fa 48 89 f9 <44> 0f b6 06
41 80 f8 01 44 88 01 48 83 de ff 48 83 c1 01 48 39 d1
[  106.004278] RSP: 0018:ffffc092c1fd37a8 EFLAGS: 00010286
[  106.004281] RAX: ffff9e943466a28c RBX: 00000000000036ed RCX: ffff9e943466a28c
[  106.004283] RDX: ffff9e943466a2ac RSI: 0000000000000000 RDI: ffff9e943466a28c
[  106.004285] RBP: ffffc092c1fd37a8 R08: ffff9e943d100000 R09: 0000000000000228
[  106.004287] R10: ffff9e94418dc5a8 R11: ffff9e944746c0d0 R12: 0000000000000000
[  106.004289] R13: ffff9e943fa1ec00 R14: ffff9e943466a200 R15: ffff9e943466a200
[  106.004291] FS:  00007f7a022c5540(0000) GS:ffff9e9447ac0000(0000)
knlGS:0000000000000000
[  106.004294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  106.004296] CR2: 0000000000000000 CR3: 00000001ff0b0000 CR4: 0000000000340ee0
[  106.004298] Call Trace:
[  106.004382]  kfd_topology_add_device+0x150/0x610 [amdgpu]
[  106.004445]  kgd2kfd_device_init+0x2e0/0x4f0 [amdgpu]
[  106.004509]  amdgpu_amdkfd_device_init+0x14c/0x1b0 [amdgpu]
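
The failure mode in the trace (strncpy reading from address 0) can be reproduced in miniature. The following is only a sketch under stated assumptions: the struct, the two table entries and copy_asic_name() are hypothetical simplifications, not the driver's definitions. It shows that a field left out of a designated initializer is NULL, which is what a missing asic name amounts to. The actual fix in this commit is to fill in the name; the NULL check here exists only to make the sketch safe to run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a device-info table entry. */
struct device_info_sketch {
	const char *asic_name;
	int max_pasid_bits;
};

/* asic_name is omitted, so the designated initializer leaves it NULL. */
static const struct device_info_sketch without_name = {
	.max_pasid_bits = 16,
};

/* The corrected entry names the asic explicitly. */
static const struct device_info_sketch with_name = {
	.asic_name = "renoir",
	.max_pasid_bits = 16,
};

/* Copies the asic name into dst; returns 0 on success, -1 if it is missing. */
static int copy_asic_name(char *dst, size_t len,
			  const struct device_info_sketch *info)
{
	assert(dst != NULL && len > 0);
	if (info->asic_name == NULL)
		return -1;	/* strncpy(dst, NULL, len) would fault here */
	strncpy(dst, info->asic_name, len - 1);
	dst[len - 1] = '\0';
	return 0;
}
```

Calling copy_asic_name() with &without_name returns -1 instead of faulting, while &with_name copies "renoir" into the buffer.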

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:54 -05:00
Huang Rui
f5d843d4ea drm/amdkfd: add renoir kfd topology
This patch adds the renoir kfd topology, which is the same as Raven's.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:35 -05:00
Huang Rui
444d4f5fd3 drm/amdkfd: add package manager for renoir
Renoir uses GFX v9, so add the v9 package manager.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:29 -05:00
Huang Rui
59a6fc1aef drm/amdkfd: init kernel queue for renoir
Renoir is GFX v9, so init v9 kernel queue.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:21 -05:00
Huang Rui
4d85488cd9 drm/amdkfd: init kfd apertures v9 for renoir
Renoir is GMC v9, so init v9 kfd apertures.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:14 -05:00
Huang Rui
514e5e7e60 drm/amdkfd: add renoir type for the workaround of iommu v2 (v2)
Renoir is the same as Raven; the iommu event will be enabled in the future.

v2: fix the checking (Thong)

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:08 -05:00
Huang Rui
5a959a8988 drm/amdkfd: enable kfd device queue manager v9 for renoir
Renoir is GFX9, so enable the v9 device queue manager.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:02 -05:00
Huang Rui
2b9c221119 drm/amdkfd: add renoir kfd device info (v2)
This patch initializes the renoir kfd device info. For now we treat renoir as a "dgpu"
(bypassing iommu v2); needs_iommu_device will be enabled once the renoir iommu is ready.

v2: rebase and align the drm-next

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:55 -05:00
Huang Rui
a8d42f174d drm/amdkfd: add renoir cache info for CRAT (v2)
Renoir's cache info should be the same as raven's and carrizo's.

v2: fix missed "break"

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:49 -05:00
Yong Zhao
8099ae40d8 drm/amdkfd: Support Navi14 in KFD
Initial support of Navi14 in KFD. The device IDs will be added later.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:40 -05:00
Yong Zhao
95a5bd1b33 drm/amdkfd: Fix a building error when KFD_SUPPORT_IOMMU_V2 is turned off
The issue was accidentally introduced recently.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:53 -05:00
Yong Zhao
050091ab6e drm/amdkfd: Query kfd device info by CHIP id instead of pci device id
This optimizes out the pci device id usage in KFD and makes the code
more maintainable.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:45 -05:00
Frank.Min
b313bbebd7 amd/amdkfd: add Arcturus vf DID support
Add the virtual function PCI device id.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Frank.Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-23 11:30:52 -05:00
YueHaibing
7fd5a6fb9a drm/amdkfd: Make deallocate_hiq_sdma_mqd static
Fix sparse warning:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:1846:6:
 warning: symbol 'deallocate_hiq_sdma_mqd' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:25:10 -05:00
YueHaibing
a52c26f1d7 drm/amdkfd: remove set but not used variable 'pdd'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c: In function restore_process_worker:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:949:29: warning:
 variable pdd set but not used [-Wunused-but-set-variable]

It is not used since
commit 5b87245faf ("drm/amdkfd: Simplify kfd2kgd interface")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:19:00 -05:00
Yong Zhao
c181159a5b drm/amdkfd: Fill the name field in node topology with asic name v2
The name field in node topology has not been used. We re-purpose it to
hold the asic name, which can be queried by user space applications
through sysfs.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:16:28 -05:00
Jason Gunthorpe
daa138a58c Merge branch 'odp_fixes' into hmm.git
From rdma.git

Jason Gunthorpe says:

====================
This is a collection of general cleanups for ODP to clarify some of the
flows around umem creation and use of the interval tree.
====================

The branch is based on v5.3-rc5 due to dependencies, and is being taken
into hmm.git due to dependencies in the next patches.

* odp_fixes:
  RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
  RDMA/mlx5: Use ib_umem_start instead of umem.address
  RDMA/core: Make invalidate_range a device operation
  RDMA/odp: Use kvcalloc for the dma_list and page_list
  RDMA/odp: Check for overflow when computing the umem_odp end
  RDMA/odp: Provide ib_umem_odp_release() to undo the allocs
  RDMA/odp: Split creating a umem_odp from ib_umem_get
  RDMA/odp: Make the three ways to create a umem_odp clear
  RMDA/odp: Consolidate umem_odp initialization
  RDMA/odp: Make it clearer when a umem is an implicit ODP umem
  RDMA/odp: Iterate over the whole rbtree directly
  RDMA/odp: Use the common interval tree library instead of generic
  RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-21 20:58:18 -03:00
Jason Gunthorpe
471f390205 drm/amdkfd: use mmu_notifier_put
The sequence of mmu_notifier_unregister_no_release(),
mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the
free_notifier callback.

As this is the last user of those APIs, converting it means we can drop
them.

Link: https://lore.kernel.org/r/20190806231548.25242-11-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 09:35:02 -03:00
Jason Gunthorpe
0029cab314 drm/amdkfd: fix a use after free race with mmu_notifer unregister
When using mmu_notifier_unregister_no_release() the caller must ensure
there is an SRCU synchronize before the mn memory is freed, otherwise
use-after-free races are possible, for instance:

     CPU0                                      CPU1
                                      invalidate_range_start
                                         hlist_for_each_entry_rcu(..)
 mmu_notifier_unregister_no_release(&p->mn)
 kfree(mn)
                                      if (mn->ops->invalidate_range_end)

The error unwind in amdkfd misses the SRCU synchronization.

amdkfd keeps the kfd_process around until the mm is released, so split the
flow to fully initialize the kfd_process and register it for find_process,
and with the notifier. Past this point the kfd_process does not need to be
cleaned up as it is fully ready.

The final failable step does a vm_mmap() and does not seem to impact the
kfd_process global state. Since it also cannot be undone (and already has
problems with undo if it internally fails), it has to be last.

This way we don't have to try to unwind the mmu_notifier_register() and
avoid the problem with the SRCU.

Along the way this also fixes various other error unwind bugs in the flow.

Fixes: 45102048f7 ("amdkfd: Add process queue manager module")
Link: https://lore.kernel.org/r/20190806231548.25242-10-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 09:35:02 -03:00
Yong Zhao
f40c6912d2 drm/amdkfd: Fill amdgpu_task_info for KFD VMs
The amdgpu_task_info will be used when printing VM page fault for KFD
processes.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanatha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:58:21 -05:00
Alex Deucher
3f61fd41f3 Linux 5.3-rc3

Merge tag 'v5.3-rc3' into drm-next-5.4

Linux 5.3-rc3

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-09 13:07:28 -05:00
Alex Deucher
4b3e30ed3e Revert "drm/amdkfd: New IOCTL to allocate queue GWS"
This reverts commit 1a058c3376.

This interface is still in too much flux.  Revert until
it's sorted out.

Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-07 10:21:38 -05:00
Jay Cornwall
5145d57ec5 drm/amdkfd: Extend CU mask to 8 SEs (v3)
Following bitmap layout logic introduced by:
"drm/amdgpu: support get_cu_info for Arcturus".

v2: squash in fixup for gfx_v9_0.c (Alex)
v3: squash in debug print output fix

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:19:11 -05:00
Jay Cornwall
1faa3b8054 drm/amdkfd: Save/restore vcc on gfx10
VCC moved out of user SGPR allocation in gfx10. It's now stored
in SGPRs 106-107.

Also fixes incorrect SGPR read offsets.

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Jay Cornwall
f9e346aba1 drm/amdkfd: Save/restore flat_scratch_lo/hi on gfx10
These moved from SGPRs in gfx9 to HWREG in gfx10.

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Jay Cornwall
7ce55e0b6f drm/amdkfd: Fix gfx10 wave64 VGPR context restore
Copy/paste error, first 4 VGPRs are separated by 64 dwords (256 bytes).

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Jay Cornwall
306fc9c568 drm/amdkfd: Remove dead code from gfx8/gfx9 trap handlers
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:18 -05:00
Jay Cornwall
a36e896740 drm/amdkfd: Replace gfx10 trap handler with correct branch
Previously submitted code was taken from an incorrect branch and
was non-functional.

Cc: Oak Zeng <oak.zeng@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-By: Oak Zeng <oak.zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:11 -05:00
Jay Cornwall
7c2eaf5cdb drm/amdkfd: Fix lost single step exceptions in gfx9 trap handler
If the trap is entered due to MODE.DEBUG_EN=1 and SAVECTX is raised
concurrently the handler cannot identify the source of the exception.
This causes the debugger to lose single step exception notification
when a context save request arrives at the same time.

When MODE.DEBUG_EN=1 and STATUS.HALT=0 (exception not already handled)
jump to the second-level trap handler upon entering the trap. The
second-level trap will set STATUS.HALT=1 and return to the shader.
If SAVECTX was raised then control flow will return to the trap, which
will then handle the context save request.

Cc: Tony Tye <tony.tye@amd.com>
Cc: Laurent Morichetti <laurent.morichetti@amd.com>
Cc: Qingchuan Shi <qingchuan.shi@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Laurent Morichetti <laurent.morichetti@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:02 -05:00
Jay Cornwall
8c7a5d9e6f drm/amdkfd: Use SQC when TCP would fail in gfx9 context save.
When a wavefront raises TRAPSTS.XNACK_ERROR with STATUS.ALLOW_REPLAY=0
subsequent memory instructions have undefined behavior. In practice
SQC stores continue to work but TCP stores do not.

Context save is permitted to fail after XNACK error because the
wavefront will be halted and subsequently terminated. However the
debugger has an interest in retrieving the wavefront VGPR/LDS state.

Detect the out-of-spec case and use SQC stores during context save
in place of TCP stores.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:47 -05:00
Gustavo A. R. Silva
12fce1ab4a drm/amdkfd/kfd_mqd_manager_v10: Avoid fall-through warning
In preparation to enabling -Wimplicit-fallthrough, this patch silences
the following warning:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c: In function ‘mqd_manager_init_v10’:
./include/linux/dynamic_debug.h:122:52: warning: this statement may fall through [-Wimplicit-fallthrough=]
 #define __dynamic_func_call(id, fmt, func, ...) do { \
                                                    ^
./include/linux/dynamic_debug.h:143:2: note: in expansion of macro ‘__dynamic_func_call’
  __dynamic_func_call(__UNIQUE_ID(ddebug), fmt, func, ##__VA_ARGS__)
  ^~~~~~~~~~~~~~~~~~~
./include/linux/dynamic_debug.h:153:2: note: in expansion of macro ‘_dynamic_func_call’
  _dynamic_func_call(fmt, __dynamic_pr_debug,  \
  ^~~~~~~~~~~~~~~~~~
./include/linux/printk.h:336:2: note: in expansion of macro ‘dynamic_pr_debug’
  dynamic_pr_debug(fmt, ##__VA_ARGS__)
  ^~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c:432:3: note: in expansion of macro ‘pr_debug’
   pr_debug("%s@%i\n", __func__, __LINE__);
   ^~~~~~~~
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c:433:2: note: here
  case KFD_MQD_TYPE_COMPUTE:
  ^~~~

by removing the call to pr_debug() in KFD_MQD_TYPE_CP:

"The mqd init for CP and COMPUTE will have the same  routine." [1]

This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.

[1] https://lore.kernel.org/lkml/c735a1cc-a545-50fb-44e7-c0ad93ee8ee7@amd.com/

Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
2019-07-25 20:13:01 -05:00
Gustavo A. R. Silva
737298d188 drm/amdkfd: Fix missing break in switch statement
Add missing break statement in order to prevent the code from falling
through to case CHIP_NAVI10.
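
To illustrate the class of bug, here is a minimal sketch, not the driver's code. CHIP_EXAMPLE_A, the returned values and both function names are hypothetical; only CHIP_NAVI10 comes from the message above. Without the break, the first case runs on into the CHIP_NAVI10 case and its result is overwritten.

```c
#include <assert.h>

/* CHIP_EXAMPLE_A is a hypothetical placeholder for the affected case. */
enum chip_sketch { CHIP_EXAMPLE_A, CHIP_NAVI10 };

/* Buggy form: the first case has no break and runs into CHIP_NAVI10. */
static int pick_without_break(enum chip_sketch c)
{
	int v = 0;

	assert(c == CHIP_EXAMPLE_A || c == CHIP_NAVI10);
	switch (c) {
	case CHIP_EXAMPLE_A:
		v = 9;
		/* missing break here */
	case CHIP_NAVI10:
		v = 10;
		break;
	}
	return v;
}

/* Fixed form: each case ends with its own break. */
static int pick_with_break(enum chip_sketch c)
{
	int v = 0;

	assert(c == CHIP_EXAMPLE_A || c == CHIP_NAVI10);
	switch (c) {
	case CHIP_EXAMPLE_A:
		v = 9;
		break;
	case CHIP_NAVI10:
		v = 10;
		break;
	}
	return v;
}
```

With the break missing, pick_without_break(CHIP_EXAMPLE_A) yields the CHIP_NAVI10 value; building with -Wimplicit-fallthrough flags the first function.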

This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.

Fixes: 14328aa58c ("drm/amdkfd: Add navi10 support to amdkfd. (v3)")
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
2019-07-25 20:12:38 -05:00
Linus Torvalds
31cc088a4f drm fixes for -rc1:
nouveau:
 - bugfixes + TU116 enabling (minor iteration)
 
 amdgpu:
 - large pile of fixes for new hw support this release (navi, vega20)
 - audio hotplug fix
 - bunch of corner cases and small fixes all over for amdgpu/kfd
 
 komeda:
  - back out some new properties (from this merge window) that need
   more pondering.
 
 bochs: fb pitch setup
 
 ... plus a new panel quirk

Merge tag 'drm-next-2019-07-19' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Daniel Vetter:
 "Dave is back in shape, but now family got it so I'm doing the pull.
  Two things worthy of note:

   - nouveau feature pull was way too late, Dave&me decided to not take
     that, so Ben spun up a pull with just the fixes.

   - after some chatting with the arm display maintainers we decided to
     change a bit how that's maintained, for more oversight/review and
     cross vendor collab.

  More details below:

  nouveau:
   - bugfixes
   - TU116 enabling (minor iteration)

  amdgpu:
   - large pile of fixes for new hw support this release (navi, vega20)
   - audio hotplug fix
   - bunch of corner cases and small fixes all over for amdgpu/kfd

  komeda:
   - back out some new properties (from this merge window) that need
     more pondering.

  bochs:
   - fb pitch setup

  core:
   - a new panel quirk
   - misc fixes"

* tag 'drm-next-2019-07-19' of git://anongit.freedesktop.org/drm/drm: (73 commits)
  drm/nouveau/secboot/gp102-: remove WAR for SEC2 RTOS start bug
  drm/nouveau/flcn/gp102-: improve implementation of bind_context() on SEC2/GSP
  drm/nouveau: fix memory leak in nouveau_conn_reset()
  drm/nouveau/dmem: missing mutex_lock in error path
  drm/nouveau/hwmon: return EINVAL if the GPU is powered down for sensors reads
  drm/nouveau: fix bogus GPL-2 license header
  drm/nouveau: fix bogus GPL-2 license header
  drm/nouveau/i2c: Enable i2c pads & busses during preinit
  drm/nouveau/disp/tu102-: wire up scdc parameter setter
  drm/nouveau/core: recognise TU116 chipset
  drm/nouveau/kms: disallow dual-link harder if hdmi connection detected
  drm/nouveau/disp/nv50-: fix center/aspect-corrected scaling
  drm/nouveau/disp/nv50-: force scaler for any non-default LVDS/eDP modes
  drm/nouveau/mcp89/mmu: Use mcp77_mmu_new instead of g84_mmu_new on MCP89.
  drm/amd/display: init res_pool dccg_ref, dchub_ref with xtalin_freq
  drm/amdgpu/pm: remove check for pp funcs in freq sysfs handlers
  drm/amd/display: Force uclk to max for every state
  drm/amdkfd: Remove GWS from process during uninit
  drm/amd/amdgpu: Fix offset for vmid selection in debugfs interface
  drm/amd/powerplay: update vega20 driver if to fit latest SMU firmware
  ...
2019-07-19 12:29:43 -07:00
Oak Zeng
47a7fe5316 drm/amdkfd: Increase vcrat size for GPU
GPU cache info (part of virtual CRAT) size depends on CU number.
For arcturus, CU number has been increased. So the required memory
for vcrat also increases.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Jay Cornwall
37f86a9b36 drm/amdkfd: Merge gfx9/arcturus trap handlers, add ACC VGPR save
ACC VGPRs are a secondary VGPR set of same size as the primary VGPRs.
Save them as a block immediately following VGPRs.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Oak Zeng
e30d90fca3 drm/amdkfd: Add device id for real asics
Add pci device ids.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Oak Zeng
3baa24f0fc drm/amdkfd: Add arcturus CWSR trap handler
CWSR (compute wave save/restore) is used for
preempting compute queues.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Oak Zeng
b6689cf7b9 drm/amdkfd: Set number of xgmi optimized SDMA engines for arcturus
Some sdma engines are optimized for xgmi on arcturus.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:05 -05:00
Yong Zhao
0ad8c5e296 drm/amdkfd: Support MMHUB1 in kfd interrupt path
Handle interrupts for second mmhub.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:04 -05:00
Oak Zeng
35cdc81bfa drm/amdkfd: Fix sdma_bitmap overflow issue
In the original formula, the left shift overflows when the number of
sdma queues is 64. Use an equivalent expression that won't
overflow.
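
A minimal sketch of the idea, assuming the original formula built the mask as (1ULL << n) - 1 and the replacement shifts right from all-ones. The helper name low_bits_mask() is hypothetical and this is not the driver's code. In C, shifting a 64-bit value by 64 is undefined, so the left-shift form breaks exactly when n is 64.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns a mask with the low n bits set, 1 <= n <= 64.
 * (1ULL << n) - 1 is undefined for n == 64; shifting right from all-ones
 * keeps the shift count in the range 0..63.
 */
static uint64_t low_bits_mask(unsigned int n)
{
	assert(n >= 1 && n <= 64);
	return ~0ULL >> (64 - n);
}
```

For n = 64 the shift count is 0 and every bit stays set; for smaller n the result matches the left-shift formula.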

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:04 -05:00
Oak Zeng
3a68a638a9 drm/amdkfd: Change arcturus sdma engines number
Arcturus has 8 sdma engines

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:04 -05:00
Yong Zhao
49adcf8a6f amd/amdkfd: Add ASIC ARCTURUS to kfd
Add initial support for ARCTURUS to kfd.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:03 -05:00
Oak Zeng
2fb1e49fda drm/amdkfd: Support bigger gds size
Extend map_process and set_resources pm4 packet to support
bigger gds size for arcturus.

v2: Only make the change for v9

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:03 -05:00
Oak Zeng
3a65d14d25 drm/amdkfd: Extend PM4 packets to support 8 SDMA
Extend map_queue and unmap_queue PM4 packets to support 8
SDMA engines. The new format is backward compatible.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:03 -05:00
Joseph Greathouse
6a5d487754 drm/amdkfd: Remove GWS from process during uninit
If we shut down a process without having destroyed its GWS-using
queues, it is possible that GWS BO will still be in the process
BO list during the gpuvm destruction. This list should be empty
at that time, so we should remove the GWS allocation at the
process uninit point if it is still around.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-17 13:34:31 -05:00
Felix Kuehling
75ee64875e drm/amdkfd: Consistently apply noretry setting
Apply the same setting to SH_MEM_CONFIG and VM_CONTEXT1_CNTL. This
makes the noretry param no longer KFD-specific. On GFX10 I'm not
changing SH_MEM_CONFIG in this commit because GFX10 has different
retry behaviour in the SQ and I don't have a way to test it at the
moment.

Suggested-by: Christian König <Christian.Koenig@amd.com>
CC: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-16 13:02:55 -05:00
Linus Torvalds
be8454afc5 drm main pull request for v5.3-rc1 (sans mm changes)

Merge tag 'drm-next-2019-07-16' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "The biggest thing in this is the AMD Navi GPU support, this again
  contains a bunch of header files that are large. These are the new AMD
  RX5700 GPUs that just recently became available.

  New drivers:
   - ST-Ericsson MCDE driver
   - Ingenic JZ47xx SoC

  UAPI change:
   - HDR source metadata property

  Core:
   - HDR infoframes and EDID parsing
   - drm hdmi infoframe unpacking
   - remove prime sg_table caching into dma-buf
   - New gem vram helpers to reduce driver code
   - Lots of drmP.h removal
   - reservation fencing fix
   - documentation updates
   - drm_fb_helper_connector removed
   - mode name command handler rewrite

  fbcon:
   - Remove the fbcon notifiers

  ttm:
   - forward progress fixes

  dma-buf:
   - make mmap call optional
   - debugfs refcount fixes
   - dma-fence free with pending signals fix
   - each dma-buf gets an inode

  Panels:
   - Lots of additional panel bindings

  amdgpu:
   - initial navi10 support
   - avoid hw reset
   - HDR metadata support
   - new thermal sensors for vega asics
   - RAS fixes
   - use HMM rather than MMU notifier
   - xgmi topology via kfd
   - SR-IOV fixes
   - driver reload fixes
   - DC use a core bpc attribute
   - Aux fixes for DC
   - Bandwidth calc updates for DC
   - Clock handling refactor
   - kfd VEGAM support

  vmwgfx:
   - Coherent memory support changes

  i915:
   - HDR Support
   - HDMI i2c link
   - Icelake multi-segmented gamma support
   - GuC firmware update
   - Mule Creek Canyon PCH support for EHL
   - EHL platform updates
   - move i915.alpha_support to i915.force_probe
   - runtime PM refactoring
   - VBT parsing refactoring
   - DSI fixes
   - struct mutex dependency reduction
   - GEM code reorg

  mali-dp:
   - Komeda driver features

  msm:
   - dsi vs EPROBE_DEFER fixes
   - msm8998 snapdragon 835 support
   - a540 gpu support
   - mdp5 and dpu interconnect support

  exynos:
   - drmP.h removal

  tegra:
   - misc fixes

  tda998x:
   - audio support improvements
   - pixel repeated mode support
   - quantisation range handling corrections
   - HDMI vendor info fix

  armada:
   - interlace support fix
   - overlay/video plane register handling refactor
   - add gamma support

  rockchip:
   - RK3328 support

  panfrost:
   - expose perf counters via hidden ioctls

  vkms:
   - enumerate CRC sources list

  ast:
   - rework BO handling

  mgag200:
   - rework BO handling

  dw-hdmi:
   - suspend/resume support

  rcar-du:
   - R8A774A1 SoC support
   - LVDS dual-link mode support
   - Additional formats
   - Misc fixes

  omapdrm:
   - DSI command mode display support

  stm:
   - fb modifier support
   - runtime PM support

  sun4i:
   - use vmap ops

  vc4:
   - binner bo binding rework

  v3d:
   - compute shader support
   - resync/sync fixes
   - job management refactoring

  lima:
   - NULL pointer in irq handler fix
   - scheduler default timeout

  virtio:
   - fence seqno support
   - trace events

  bochs:
   - misc fixes

  tc358767:
   - IRQ/HDP handling

  sii902x:
   - HDMI audio support

  atmel-hlcdc:
   - misc fixes

  meson:
   - zpos support"

* tag 'drm-next-2019-07-16' of git://anongit.freedesktop.org/drm/drm: (1815 commits)
  Revert "Merge branch 'vmwgfx-next' of git://people.freedesktop.org/~thomash/linux into drm-next"
  Revert "mm: adjust apply_to_pfn_range interface for dropped token."
  mm: adjust apply_to_pfn_range interface for dropped token.
  drm/amdgpu/navi10: add uclk activity sensor
  drm/amdgpu: properly guard the generic discovery code
  drm/amdgpu: add missing documentation on new module parameters
  drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback
  drm/amd/display: avoid 64-bit division
  drm/amdgpu/psp11: simplify the ucode register logic
  drm/amdgpu: properly guard DC support in navi code
  drm/amd/powerplay: vega20: fix uninitialized variable use
  drm/amd/display: dcn20: include linux/delay.h
  amdgpu: make pmu support optional
  drm/amd/powerplay: Zero initialize current_rpm in vega20_get_fan_speed_percent
  drm/amd/powerplay: Zero initialize freq in smu_v11_0_get_current_clk_freq
  drm/amd/powerplay: Use memset to initialize metrics structs
  drm/amdgpu/mes10.1: Fix header guard
  drm/amd/powerplay: add temperature sensor support for navi10
  drm/amdgpu: fix scheduler timeout calc
  drm/amdgpu: Prepare for hmm_range_register API change (v2)
  ...
2019-07-15 19:04:27 -07:00
Eric Huang
70df8273ca drm/amdkfd: fix cp hang in eviction
The cp hang occurs in the OCL conformance test only on a supermicro
platform that has 40 cores, where the test generates 40 threads.
The root cause is a race condition on unprotected flags.

The fix is to move the is_evicted and is_active (init_mqd()) flags
into the protected area.

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-11 14:37:24 -05:00
Felix Kuehling
a5b1615529 drm/amdkfd: Disable idle optimization for chained runlist
This works around difficult-to-reproduce soft hangs on oversubscribed
runlists.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-03 14:32:10 -05:00
Felix Kuehling
7a049244a0 drm/amdkfd: Add chained_runlist_idle_disable flag to pm4_mes_runlist
New flag to disable an idle runlist optimization that is causing soft
hangs with some difficult-to-reproduce customer workloads. This will
serve as a workaround until the problem can be reproduced and the
root-cause determined.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-03 14:32:04 -05:00
Felix Kuehling
819ec5acf7 drm/amdkfd: Print a warning when the runlist becomes oversubscribed
Oversubscription of queues or processes results in poor performance
mostly because HWS blindly schedules busy and idle queues, resulting
in poor occupancy if many queues are idle.

Let users know with a warning message when transitioning from a
non-oversubscribed to an oversubscribed runlist.
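
The transition check described above can be sketched roughly as follows.
This is a minimal user-space illustration, not the driver's code: the
struct, the function name and the queue/process limits are all
hypothetical, and the message text is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical runlist state; field names are illustrative only. */
struct runlist_state {
    bool was_oversubscribed; /* result of the previous check */
};

/*
 * Returns true (and prints a warning) only when the runlist goes from
 * non-oversubscribed to oversubscribed, so the message is not repeated
 * on every rebuild of an already oversubscribed runlist.
 */
static bool update_oversubscription(struct runlist_state *rl,
                                    unsigned int queues,
                                    unsigned int max_queues,
                                    unsigned int processes,
                                    unsigned int max_processes)
{
    bool over;
    bool warn;

    assert(rl != NULL);
    over = queues > max_queues || processes > max_processes;
    warn = over && !rl->was_oversubscribed;

    if (warn)
        fprintf(stderr, "runlist is oversubscribed, expect reduced performance\n");

    rl->was_oversubscribed = over;
    return warn;
}
```

Remembering the previous state is what keeps the warning to one message
per transition instead of one per runlist update.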

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-03 14:31:26 -05:00
Jack Xiao
ba9e93c5fa drm/amdkfd: remove an unused variable
Just for cleanup.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-02 16:14:22 -05:00
Jack Xiao
aabf3a951c drm/amdkfd: remove duplicated PCIE atomics request
Since amdgpu has always requested PCIE atomics, kfd doesn't
need a duplicated PCIE atomics enablement. Referring to the amdgpu
request result is enough.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-01 14:54:47 -05:00
shaoyunl
a864e29d94 drm/amdkfd: remove unnecessary warning message on gpu reset
In an XGMI configuration, more than one ASIC can be reset at the same
time. kfd is able to handle this, so there is no need to trigger the warning.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-27 11:22:41 -05:00
Oak Zeng
d9848e149d drm/amdkfd: Set gws_mask to 64 bit 1s
Previously kfd didn't use gws so this mask was set to 0.
Set it to 64 bit 1s because now kfd can use all 64 gws
resources.
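
As an illustration of what "64 bit 1s" means, the sketch below builds such
a mask in plain C. The helper name and the GWS_COUNT macro are
hypothetical and do not come from the driver. The only point is that a
full 64-bit mask cannot be produced by shifting 1 left by 64.

```c
#include <assert.h>
#include <stdint.h>

#define GWS_COUNT 64 /* number of gws resources named in the message */

/*
 * Build a mask with the lowest `count` bits set. Shifting a 64-bit value
 * by 64 is undefined behaviour in C, so the full mask is special-cased.
 */
static uint64_t gws_mask_for(unsigned int count)
{
    assert(count <= 64);
    if (count == 64)
        return ~0ULL;
    return (1ULL << count) - 1;
}
```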

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-22 09:34:14 -05:00
Jason A. Donenfeld
9285ec4c8b timekeeping: Use proper clock specifier names in functions
This makes boot uniformly boottime and tai uniformly clocktai, to
address the remaining oversights.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20190621203249.3909-2-Jason@zx2c4.com
2019-06-22 12:11:27 +02:00
Philip Cox
14328aa58c drm/amdkfd: Add navi10 support to amdkfd. (v3)
KFD (kernel fusion driver) is the kernel driver
for the compute backend for usermode compute
stack.

v2: squash in updates (Alex)
v3: squash in rebase fixes (Alex)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-21 18:59:24 -05:00
Kent Russell
de9f26bbd3 drm/amdkfd: Add procfs-style information for KFD processes
Add a folder structure to /sys/class/kfd/kfd/ called proc which contains
subfolders, each representing an active KFD process' PID, containing 1
file: pasid.
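
A user-space tool could locate the new file along these lines. The helper
name is hypothetical. Only the directory layout is taken from the
description above, and the format of the file's contents is not assumed
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write "/sys/class/kfd/kfd/proc/<pid>/pasid" into buf.
 * Returns the path length, or -1 if buf is too small.
 */
static int kfd_pasid_path(char *buf, size_t size, long pid)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "/sys/class/kfd/kfd/proc/%ld/pasid", pid);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

The resulting path can then be opened with fopen() like any other sysfs
attribute.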

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-20 11:34:00 -05:00
Oak Zeng
38bb4226ff drm/amdkfd: Fix sdma queue allocate race condition
SDMA queue allocation requires the dqm lock as it modifies
the global dqm members. Enclose it in the dqm_lock.
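
The pattern can be sketched in user space as below. Everything here is an
assumption made for illustration: the struct layout, the bitmap field and
the spinlock built on C11 atomic_flag stand in for the driver's real dqm
structure and lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device queue manager. */
struct dqm {
    atomic_flag lock;     /* stand-in for the dqm lock */
    uint64_t sdma_bitmap; /* shared state: a set bit is a free SDMA queue */
};

static void dqm_lock(struct dqm *dqm)
{
    while (atomic_flag_test_and_set_explicit(&dqm->lock, memory_order_acquire))
        ; /* spin until the lock is released */
}

static void dqm_unlock(struct dqm *dqm)
{
    atomic_flag_clear_explicit(&dqm->lock, memory_order_release);
}

/* Returns the allocated SDMA queue id, or -1 if none is free. */
static int allocate_sdma_queue(struct dqm *dqm)
{
    int id = -1;

    assert(dqm != NULL);
    dqm_lock(dqm); /* the find-and-clear below must not interleave */
    for (int bit = 0; bit < 64; bit++) {
        if (dqm->sdma_bitmap & (1ULL << bit)) {
            dqm->sdma_bitmap &= ~(1ULL << bit);
            id = bit;
            break;
        }
    }
    dqm_unlock(dqm);
    return id;
}
```

Without the lock, two callers could both observe the same free bit and
hand out the same queue id.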

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-17 11:01:41 -05:00
Oak Zeng
6a6ef5ee25 drm/amdkfd: Fix a circular lock dependency
The idea to break the circular lock dependency is to temporarily drop
dqm lock before calling allocate_mqd. See callstack #1 below.

[   59.510149] [drm] Initialized amdgpu 3.30.0 20150101 for 0000:04:00.0 on minor 0

[  513.604034] ======================================================
[  513.604205] WARNING: possible circular locking dependency detected
[  513.604375] 4.18.0-kfd-root #2 Tainted: G        W
[  513.604530] ------------------------------------------------------
[  513.604699] kswapd0/611 is trying to acquire lock:
[  513.604840] 00000000d254022e (&dqm->lock_hidden){+.+.}, at: evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.605150]
               but task is already holding lock:
[  513.605307] 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250
[  513.605540]
               which lock already depends on the new lock.

[  513.605747]
               the existing dependency chain (in reverse order) is:
[  513.605944]
               -> #4 (&anon_vma->rwsem){++++}:
[  513.606106]        __vma_adjust+0x147/0x7f0
[  513.606231]        __split_vma+0x179/0x190
[  513.606353]        mprotect_fixup+0x217/0x260
[  513.606553]        do_mprotect_pkey+0x211/0x380
[  513.606752]        __x64_sys_mprotect+0x1b/0x20
[  513.606954]        do_syscall_64+0x50/0x1a0
[  513.607149]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.607380]
               -> #3 (&mapping->i_mmap_rwsem){++++}:
[  513.607678]        rmap_walk_file+0x1f0/0x280
[  513.607887]        page_referenced+0xdd/0x180
[  513.608081]        shrink_page_list+0x853/0xcb0
[  513.608279]        shrink_inactive_list+0x33b/0x700
[  513.608483]        shrink_node_memcg+0x37a/0x7f0
[  513.608682]        shrink_node+0xd8/0x490
[  513.608869]        balance_pgdat+0x18b/0x3b0
[  513.609062]        kswapd+0x203/0x5c0
[  513.609241]        kthread+0x100/0x140
[  513.609420]        ret_from_fork+0x24/0x30
[  513.609607]
               -> #2 (fs_reclaim){+.+.}:
[  513.609883]        kmem_cache_alloc_trace+0x34/0x2e0
[  513.610093]        reservation_object_reserve_shared+0x139/0x300
[  513.610326]        ttm_bo_init_reserved+0x291/0x480 [ttm]
[  513.610567]        amdgpu_bo_do_create+0x1d2/0x650 [amdgpu]
[  513.610811]        amdgpu_bo_create+0x40/0x1f0 [amdgpu]
[  513.611041]        amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu]
[  513.611290]        amdgpu_bo_create_kernel+0x12/0x70 [amdgpu]
[  513.611584]        amdgpu_ttm_init+0x2cb/0x560 [amdgpu]
[  513.611823]        gmc_v9_0_sw_init+0x400/0x750 [amdgpu]
[  513.612491]        amdgpu_device_init+0x14eb/0x1990 [amdgpu]
[  513.612730]        amdgpu_driver_load_kms+0x78/0x290 [amdgpu]
[  513.612958]        drm_dev_register+0x111/0x1a0
[  513.613171]        amdgpu_pci_probe+0x11c/0x1e0 [amdgpu]
[  513.613389]        local_pci_probe+0x3f/0x90
[  513.613581]        pci_device_probe+0x102/0x1c0
[  513.613779]        driver_probe_device+0x2a7/0x480
[  513.613984]        __driver_attach+0x10a/0x110
[  513.614179]        bus_for_each_dev+0x67/0xc0
[  513.614372]        bus_add_driver+0x1eb/0x260
[  513.614565]        driver_register+0x5b/0xe0
[  513.614756]        do_one_initcall+0xac/0x357
[  513.614952]        do_init_module+0x5b/0x213
[  513.615145]        load_module+0x2542/0x2d30
[  513.615337]        __do_sys_finit_module+0xd2/0x100
[  513.615541]        do_syscall_64+0x50/0x1a0
[  513.615731]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.615963]
               -> #1 (reservation_ww_class_mutex){+.+.}:
[  513.616293]        amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu]
[  513.616554]        init_mqd+0x223/0x260 [amdgpu]
[  513.616779]        create_queue_nocpsch+0x4d9/0x600 [amdgpu]
[  513.617031]        pqm_create_queue+0x37c/0x520 [amdgpu]
[  513.617270]        kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu]
[  513.617522]        kfd_ioctl+0x202/0x350 [amdgpu]
[  513.617724]        do_vfs_ioctl+0x9f/0x6c0
[  513.617914]        ksys_ioctl+0x66/0x70
[  513.618095]        __x64_sys_ioctl+0x16/0x20
[  513.618286]        do_syscall_64+0x50/0x1a0
[  513.618476]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.618695]
               -> #0 (&dqm->lock_hidden){+.+.}:
[  513.618984]        __mutex_lock+0x98/0x970
[  513.619197]        evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.619459]        kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
[  513.619710]        kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
[  513.620103]        amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
[  513.620363]        amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
[  513.620614]        __mmu_notifier_invalidate_range_start+0x70/0xb0
[  513.620851]        try_to_unmap_one+0x7fc/0x8f0
[  513.621049]        rmap_walk_anon+0x121/0x290
[  513.621242]        try_to_unmap+0x93/0xf0
[  513.621428]        shrink_page_list+0x606/0xcb0
[  513.621625]        shrink_inactive_list+0x33b/0x700
[  513.621835]        shrink_node_memcg+0x37a/0x7f0
[  513.622034]        shrink_node+0xd8/0x490
[  513.622219]        balance_pgdat+0x18b/0x3b0
[  513.622410]        kswapd+0x203/0x5c0
[  513.622589]        kthread+0x100/0x140
[  513.622769]        ret_from_fork+0x24/0x30
[  513.622957]
               other info that might help us debug this:

[  513.623354] Chain exists of:
                 &dqm->lock_hidden --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem

[  513.623900]  Possible unsafe locking scenario:

[  513.624189]        CPU0                    CPU1
[  513.624397]        ----                    ----
[  513.624594]   lock(&anon_vma->rwsem);
[  513.624771]                                lock(&mapping->i_mmap_rwsem);
[  513.625020]                                lock(&anon_vma->rwsem);
[  513.625253]   lock(&dqm->lock_hidden);
[  513.625433]
                *** DEADLOCK ***

[  513.625783] 3 locks held by kswapd0/611:
[  513.625967]  #0: 00000000f14edf84 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
[  513.626309]  #1: 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250
[  513.626671]  #2: 0000000067b5cd12 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x5/0xb0
[  513.627037]
               stack backtrace:
[  513.627292] CPU: 0 PID: 611 Comm: kswapd0 Tainted: G        W         4.18.0-kfd-root #2
[  513.627632] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  513.627990] Call Trace:
[  513.628143]  dump_stack+0x7c/0xbb
[  513.628315]  print_circular_bug.isra.37+0x21b/0x228
[  513.628581]  __lock_acquire+0xf7d/0x1470
[  513.628782]  ? unwind_next_frame+0x6c/0x4f0
[  513.628974]  ? lock_acquire+0xec/0x1e0
[  513.629154]  lock_acquire+0xec/0x1e0
[  513.629357]  ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.629587]  __mutex_lock+0x98/0x970
[  513.629790]  ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.630047]  ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.630309]  ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.630562]  evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.630816]  kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
[  513.631057]  kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
[  513.631288]  amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
[  513.631536]  amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
[  513.632076]  __mmu_notifier_invalidate_range_start+0x70/0xb0
[  513.632299]  try_to_unmap_one+0x7fc/0x8f0
[  513.632487]  ? page_lock_anon_vma_read+0x68/0x250
[  513.632690]  rmap_walk_anon+0x121/0x290
[  513.632875]  try_to_unmap+0x93/0xf0
[  513.633050]  ? page_remove_rmap+0x330/0x330
[  513.633239]  ? rcu_read_unlock+0x60/0x60
[  513.633422]  ? page_get_anon_vma+0x160/0x160
[  513.633613]  shrink_page_list+0x606/0xcb0
[  513.633800]  shrink_inactive_list+0x33b/0x700
[  513.633997]  shrink_node_memcg+0x37a/0x7f0
[  513.634186]  ? shrink_node+0xd8/0x490
[  513.634363]  shrink_node+0xd8/0x490
[  513.634537]  balance_pgdat+0x18b/0x3b0
[  513.634718]  kswapd+0x203/0x5c0
[  513.634887]  ? wait_woken+0xb0/0xb0
[  513.635062]  kthread+0x100/0x140
[  513.635231]  ? balance_pgdat+0x3b0/0x3b0
[  513.635414]  ? kthread_delayed_work_timer_fn+0x80/0x80
[  513.635626]  ret_from_fork+0x24/0x30
[  513.636042] Evicting PASID 32768 queues
[  513.936236] Restoring PASID 32768 queues
[  524.708912] Evicting PASID 32768 queues
[  524.999875] Restoring PASID 32768 queues
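
The idea stated at the top of this message, dropping the dqm lock around
the allocation, can be sketched as follows. This is a simplified
user-space model with hypothetical names and a spinlock standing in for
the dqm lock. malloc() stands in for allocate_mqd, which in the trace
above is the step that takes reservation_ww_class_mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct dqm {
    atomic_flag lock; /* stand-in for the dqm lock */
    unsigned int queue_count;
};

struct queue {
    void *mqd;
};

static void dqm_lock(struct dqm *dqm)
{
    while (atomic_flag_test_and_set_explicit(&dqm->lock, memory_order_acquire))
        ; /* spin until the lock is released */
}

static void dqm_unlock(struct dqm *dqm)
{
    atomic_flag_clear_explicit(&dqm->lock, memory_order_release);
}

/* Returns 0 on success, -1 if the allocation fails. */
static int create_queue(struct dqm *dqm, struct queue *q)
{
    assert(dqm != NULL && q != NULL);

    dqm_lock(dqm);
    /* ... bookkeeping that needs the lock would go here ... */

    /* Drop the lock so the allocation never nests inside it. */
    dqm_unlock(dqm);
    q->mqd = malloc(64); /* placeholder size */
    dqm_lock(dqm);

    if (q->mqd == NULL) {
        dqm_unlock(dqm);
        return -1;
    }

    /* State may have changed while unlocked; real code must recheck it. */
    dqm->queue_count++;
    dqm_unlock(dqm);
    return 0;
}
```

The cost of this approach is that anything read before the unlock has to
be treated as stale once the lock is taken again.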

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-17 11:01:41 -05:00
Oak Zeng
d091bc0a70 Revert "drm/amdkfd: Fix a circular lock dependency"
This reverts commit 06b89b38f3.
This fix is not proper. allocate_mqd can't be moved before
allocate_sdma_queue as it depends on q->properties->sdma_id,
which is set later.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-17 11:01:41 -05:00
Oak Zeng
70d488fb3f Revert "drm/amdkfd: Fix sdma queue allocate race condition"
This reverts commit f77dac6cd6.
This fix is not proper. allocate_mqd can't be moved before
allocate_sdma_queue as it depends on q->properties->sdma_id,
which is set later.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-17 11:01:41 -05:00
Greg Kroah-Hartman
641d30035c amdkfd: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-13 13:59:49 -05:00
Oak Zeng
465ab9e02a drm/amdkfd: Add device to topology after it is completely inited
We can't have devices that are not completely initialized in the kfd topology.
Otherwise there is a race condition when a user accesses a device that is not
completely initialized. This also addresses an issue where
kfd_topology_add_device accesses a NULL dqm pointer.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-11 12:57:33 -05:00
Oak Zeng
1ae99eab34 drm/amdkfd: Initialize HSA_CAP_ATS_PRESENT capability in topology codes
Move the HSA_CAP_ATS_PRESENT initialization logic from the kfd iommu code
to the kfd topology code. This removes kfd_iommu_device_init's dependency
on kfd_topology_add_device. Also remove duplicate code setting the
same capability.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-11 12:57:25 -05:00
Oak Zeng
f77dac6cd6 drm/amdkfd: Fix sdma queue allocate race condition
SDMA queue allocation requires the dqm lock as it modifies
the global dqm members. Move up the dqm_lock so sdma
queue allocation is enclosed in the critical section. Move
mqd allocation out of the critical section to avoid a circular
lock dependency.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-11 12:57:18 -05:00
Oak Zeng
06b89b38f3 drm/amdkfd: Fix a circular lock dependency
The idea to break the circular lock dependency is to move allocate_mqd
out of dqm lock protection. See callstack #1 below.

[   59.510149] [drm] Initialized amdgpu 3.30.0 20150101 for 0000:04:00.0 on minor 0

[  513.604034] ======================================================
[  513.604205] WARNING: possible circular locking dependency detected
[  513.604375] 4.18.0-kfd-root #2 Tainted: G        W
[  513.604530] ------------------------------------------------------
[  513.604699] kswapd0/611 is trying to acquire lock:
[  513.604840] 00000000d254022e (&dqm->lock_hidden){+.+.}, at: evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.605150]
               but task is already holding lock:
[  513.605307] 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250
[  513.605540]
               which lock already depends on the new lock.

[  513.605747]
               the existing dependency chain (in reverse order) is:
[  513.605944]
               -> #4 (&anon_vma->rwsem){++++}:
[  513.606106]        __vma_adjust+0x147/0x7f0
[  513.606231]        __split_vma+0x179/0x190
[  513.606353]        mprotect_fixup+0x217/0x260
[  513.606553]        do_mprotect_pkey+0x211/0x380
[  513.606752]        __x64_sys_mprotect+0x1b/0x20
[  513.606954]        do_syscall_64+0x50/0x1a0
[  513.607149]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.607380]
               -> #3 (&mapping->i_mmap_rwsem){++++}:
[  513.607678]        rmap_walk_file+0x1f0/0x280
[  513.607887]        page_referenced+0xdd/0x180
[  513.608081]        shrink_page_list+0x853/0xcb0
[  513.608279]        shrink_inactive_list+0x33b/0x700
[  513.608483]        shrink_node_memcg+0x37a/0x7f0
[  513.608682]        shrink_node+0xd8/0x490
[  513.608869]        balance_pgdat+0x18b/0x3b0
[  513.609062]        kswapd+0x203/0x5c0
[  513.609241]        kthread+0x100/0x140
[  513.609420]        ret_from_fork+0x24/0x30
[  513.609607]
               -> #2 (fs_reclaim){+.+.}:
[  513.609883]        kmem_cache_alloc_trace+0x34/0x2e0
[  513.610093]        reservation_object_reserve_shared+0x139/0x300
[  513.610326]        ttm_bo_init_reserved+0x291/0x480 [ttm]
[  513.610567]        amdgpu_bo_do_create+0x1d2/0x650 [amdgpu]
[  513.610811]        amdgpu_bo_create+0x40/0x1f0 [amdgpu]
[  513.611041]        amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu]
[  513.611290]        amdgpu_bo_create_kernel+0x12/0x70 [amdgpu]
[  513.611584]        amdgpu_ttm_init+0x2cb/0x560 [amdgpu]
[  513.611823]        gmc_v9_0_sw_init+0x400/0x750 [amdgpu]
[  513.612491]        amdgpu_device_init+0x14eb/0x1990 [amdgpu]
[  513.612730]        amdgpu_driver_load_kms+0x78/0x290 [amdgpu]
[  513.612958]        drm_dev_register+0x111/0x1a0
[  513.613171]        amdgpu_pci_probe+0x11c/0x1e0 [amdgpu]
[  513.613389]        local_pci_probe+0x3f/0x90
[  513.613581]        pci_device_probe+0x102/0x1c0
[  513.613779]        driver_probe_device+0x2a7/0x480
[  513.613984]        __driver_attach+0x10a/0x110
[  513.614179]        bus_for_each_dev+0x67/0xc0
[  513.614372]        bus_add_driver+0x1eb/0x260
[  513.614565]        driver_register+0x5b/0xe0
[  513.614756]        do_one_initcall+0xac/0x357
[  513.614952]        do_init_module+0x5b/0x213
[  513.615145]        load_module+0x2542/0x2d30
[  513.615337]        __do_sys_finit_module+0xd2/0x100
[  513.615541]        do_syscall_64+0x50/0x1a0
[  513.615731]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.615963]
               -> #1 (reservation_ww_class_mutex){+.+.}:
[  513.616293]        amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu]
[  513.616554]        init_mqd+0x223/0x260 [amdgpu]
[  513.616779]        create_queue_nocpsch+0x4d9/0x600 [amdgpu]
[  513.617031]        pqm_create_queue+0x37c/0x520 [amdgpu]
[  513.617270]        kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu]
[  513.617522]        kfd_ioctl+0x202/0x350 [amdgpu]
[  513.617724]        do_vfs_ioctl+0x9f/0x6c0
[  513.617914]        ksys_ioctl+0x66/0x70
[  513.618095]        __x64_sys_ioctl+0x16/0x20
[  513.618286]        do_syscall_64+0x50/0x1a0
[  513.618476]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.618695]
               -> #0 (&dqm->lock_hidden){+.+.}:
[  513.618984]        __mutex_lock+0x98/0x970
[  513.619197]        evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.619459]        kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
[  513.619710]        kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
[  513.620103]        amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
[  513.620363]        amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
[  513.620614]        __mmu_notifier_invalidate_range_start+0x70/0xb0
[  513.620851]        try_to_unmap_one+0x7fc/0x8f0
[  513.621049]        rmap_walk_anon+0x121/0x290
[  513.621242]        try_to_unmap+0x93/0xf0
[  513.621428]        shrink_page_list+0x606/0xcb0
[  513.621625]        shrink_inactive_list+0x33b/0x700
[  513.621835]        shrink_node_memcg+0x37a/0x7f0
[  513.622034]        shrink_node+0xd8/0x490
[  513.622219]        balance_pgdat+0x18b/0x3b0
[  513.622410]        kswapd+0x203/0x5c0
[  513.622589]        kthread+0x100/0x140
[  513.622769]        ret_from_fork+0x24/0x30
[  513.622957]
               other info that might help us debug this:

[  513.623354] Chain exists of:
                 &dqm->lock_hidden --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem

[  513.623900]  Possible unsafe locking scenario:

[  513.624189]        CPU0                    CPU1
[  513.624397]        ----                    ----
[  513.624594]   lock(&anon_vma->rwsem);
[  513.624771]                                lock(&mapping->i_mmap_rwsem);
[  513.625020]                                lock(&anon_vma->rwsem);
[  513.625253]   lock(&dqm->lock_hidden);
[  513.625433]
                *** DEADLOCK ***

[  513.625783] 3 locks held by kswapd0/611:
[  513.625967]  #0: 00000000f14edf84 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
[  513.626309]  #1: 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250
[  513.626671]  #2: 0000000067b5cd12 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x5/0xb0
[  513.627037]
               stack backtrace:
[  513.627292] CPU: 0 PID: 611 Comm: kswapd0 Tainted: G        W         4.18.0-kfd-root #2
[  513.627632] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  513.627990] Call Trace:
[  513.628143]  dump_stack+0x7c/0xbb
[  513.628315]  print_circular_bug.isra.37+0x21b/0x228
[  513.628581]  __lock_acquire+0xf7d/0x1470
[  513.628782]  ? unwind_next_frame+0x6c/0x4f0
[  513.628974]  ? lock_acquire+0xec/0x1e0
[  513.629154]  lock_acquire+0xec/0x1e0
[  513.629357]  ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.629587]  __mutex_lock+0x98/0x970
[  513.629790]  ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.630047]  ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.630309]  ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.630562]  evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.630816]  kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
[  513.631057]  kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
[  513.631288]  amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
[  513.631536]  amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
[  513.632076]  __mmu_notifier_invalidate_range_start+0x70/0xb0
[  513.632299]  try_to_unmap_one+0x7fc/0x8f0
[  513.632487]  ? page_lock_anon_vma_read+0x68/0x250
[  513.632690]  rmap_walk_anon+0x121/0x290
[  513.632875]  try_to_unmap+0x93/0xf0
[  513.633050]  ? page_remove_rmap+0x330/0x330
[  513.633239]  ? rcu_read_unlock+0x60/0x60
[  513.633422]  ? page_get_anon_vma+0x160/0x160
[  513.633613]  shrink_page_list+0x606/0xcb0
[  513.633800]  shrink_inactive_list+0x33b/0x700
[  513.633997]  shrink_node_memcg+0x37a/0x7f0
[  513.634186]  ? shrink_node+0xd8/0x490
[  513.634363]  shrink_node+0xd8/0x490
[  513.634537]  balance_pgdat+0x18b/0x3b0
[  513.634718]  kswapd+0x203/0x5c0
[  513.634887]  ? wait_woken+0xb0/0xb0
[  513.635062]  kthread+0x100/0x140
[  513.635231]  ? balance_pgdat+0x3b0/0x3b0
[  513.635414]  ? kthread_delayed_work_timer_fn+0x80/0x80
[  513.635626]  ret_from_fork+0x24/0x30
[  513.636042] Evicting PASID 32768 queues
[  513.936236] Restoring PASID 32768 queues
[  524.708912] Evicting PASID 32768 queues
[  524.999875] Restoring PASID 32768 queues

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-11 12:57:07 -05:00
Oak Zeng
8636e53c47 drm/amdkfd: Separate mqd allocation and initialization
Introduce a new mqd allocation interface and split the original
init_mqd function into two functions: allocate_mqd and init_mqd.
Also renamed uninit_mqd to free_mqd. This is preparation work to
fix a circular lock dependency.
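
The split can be pictured with the sketch below. The function names
allocate_mqd, init_mqd and free_mqd are taken from this message, but the
signatures, the struct and its fields are hypothetical simplifications
and not the driver's interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, much simplified queue descriptor. */
struct mqd {
    unsigned int queue_id;
    int initialized;
};

/* Step 1: obtain zeroed memory only. This is the part that may block. */
static struct mqd *allocate_mqd(void)
{
    return calloc(1, sizeof(struct mqd));
}

/* Step 2: fill in an existing descriptor; performs no allocation. */
static void init_mqd(struct mqd *m, unsigned int queue_id)
{
    assert(m != NULL);
    m->queue_id = queue_id;
    m->initialized = 1;
}

/* Counterpart of allocate_mqd, replacing the old uninit_mqd name. */
static void free_mqd(struct mqd *m)
{
    free(m);
}
```

With the two steps separated, a caller can decide independently which of
them runs while a lock is held.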

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-11 12:56:59 -05:00