Commit Graph

765 Commits

Author SHA1 Message Date
Amber Lin
0da8b10e36 drm/amdgpu: get_fw_version isn't ASIC specific
Method of getting firmware version is the same across ASICs, so remove
them from ASIC-specific files and create one in amdgpu_amdkfd.c. This new
created get_fw_version simply reads fw_version from adev->gfx than parsing
the ucode header.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-19 11:32:40 -05:00
Dave Airlie
f06ddb5309 Linux 5.1-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlyzsYgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGMw0H/ir42KJiABBKSETD
 0d38qXVclAI/123zl8EkSfDrBKOsuIpXUDxzKeoDMhMkiurMpK6bbEOTPJAQMZJe
 nEYpq/bZQi+vO8Q/pMMpaC3ExlIRosd0JAR7TyDUh5ZAeeMuDNzmvMk/DPxXPbNt
 0P1FWePDa7908ajCOW1T8ZrB9Ak8boo7TKkF3LBb00ks1mEkyp/l74MKOHdu+HYn
 XIwncX/Jotl4BrKdNC2f/NXYLYk6MrJDGug8TxuHgIqiMWhhrcSqbxU1ri7iqFXB
 cBYdFo6ZJ8CWHux8/5LY5CMjSqEtzKha2Ohuhy3MMu1RsICyFLQtHnxHJ1ytLSBt
 DOPcDQ0=
 =CEUD
 -----END PGP SIGNATURE-----

BackMerge v5.1-rc5 into drm-next

Need rc5 for udl fix to add udl cleanups on top.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-15 15:51:49 +10:00
Alex Deucher
e7ad88553a drm/amdkfd: Add picasso pci id
Picasso is a new raven variant.

Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-04 10:20:34 -05:00
Alex Deucher
20d059278e Revert "drm/amdkfd: avoid HMM change cause circular lock"
This reverts commit 8dd69e69f4.

This depends on an HMM fix which is not upstream yet.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-28 10:15:49 -05:00
Eric Huang
9b54d20176 drm/amdkfd: add RAS ECC event support (v3)
RAS ECC event will combine with GPU reset event, due to
ECC interrupts are caused by uncorrectable error that triggers
GPU reset.

v2: Fix misleading-indentation warning
v3: fix build with CONFIG_HSA_AMD disabled

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:51 -05:00
Eric Huang
0dee45a25a drm/amdkfd: add RAS capabilities in topology for Vega20 (v2)
It is to collaborate with HSA_CAPABILITY in libhsakmt.

v2: squash in NULL pointer check

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:51 -05:00
Philip Yang
8dd69e69f4 drm/amdkfd: avoid HMM change cause circular lock
There is circular lock between gfx and kfd path with HMM change:
lock(dqm) -> bo::reserve -> amdgpu_mn_lock

To avoid this, move init/unint_mqd() out of lock(dqm), to remove nested
locking between mmap_sem and bo::reserve. The locking order
is: bo::reserve -> amdgpu_mn_lock(p->mn)

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:03:37 -05:00
Kevin Wang
cac734c2db drm/amdkfd: use init_mqd function to allocate object for hid_mqd (CI)
if use the legacy method to allocate object, when mqd_hiq need to run
uninit code, it will be cause WARNING call trace.

eg: (s3 suspend test)
[   34.918944] Call Trace:
[   34.918948]  [<ffffffff92961dc1>] dump_stack+0x19/0x1b
[   34.918950]  [<ffffffff92297648>] __warn+0xd8/0x100
[   34.918951]  [<ffffffff9229778d>] warn_slowpath_null+0x1d/0x20
[   34.918991]  [<ffffffffc03ce1fe>] uninit_mqd_hiq_sdma+0x4e/0x50 [amdgpu]
[   34.919028]  [<ffffffffc03d0ef7>] uninitialize+0x37/0xe0 [amdgpu]
[   34.919064]  [<ffffffffc03d15a6>] kernel_queue_uninit+0x16/0x30 [amdgpu]
[   34.919086]  [<ffffffffc03d26c2>] pm_uninit+0x12/0x20 [amdgpu]
[   34.919107]  [<ffffffffc03d4915>] stop_nocpsch+0x15/0x20 [amdgpu]
[   34.919129]  [<ffffffffc03c1dce>] kgd2kfd_suspend.part.4+0x2e/0x50 [amdgpu]
[   34.919150]  [<ffffffffc03c2667>] kgd2kfd_suspend+0x17/0x20 [amdgpu]
[   34.919171]  [<ffffffffc03c103a>] amdgpu_amdkfd_suspend+0x1a/0x20 [amdgpu]
[   34.919187]  [<ffffffffc02ec428>] amdgpu_device_suspend+0x88/0x3a0 [amdgpu]
[   34.919189]  [<ffffffff922e22cf>] ? enqueue_entity+0x2ef/0xbe0
[   34.919205]  [<ffffffffc02e8220>] amdgpu_pmops_suspend+0x20/0x30 [amdgpu]
[   34.919207]  [<ffffffff925c56ff>] pci_pm_suspend+0x6f/0x150
[   34.919208]  [<ffffffff925c5690>] ? pci_pm_freeze+0xf0/0xf0
[   34.919210]  [<ffffffff926b45c6>] dpm_run_callback+0x46/0x90
[   34.919212]  [<ffffffff926b49db>] __device_suspend+0xfb/0x2a0
[   34.919213]  [<ffffffff926b4b9f>] async_suspend+0x1f/0xa0
[   34.919214]  [<ffffffff922c918f>] async_run_entry_fn+0x3f/0x130
[   34.919216]  [<ffffffff922b9d4f>] process_one_work+0x17f/0x440
[   34.919217]  [<ffffffff922bade6>] worker_thread+0x126/0x3c0
[   34.919218]  [<ffffffff922bacc0>] ? manage_workers.isra.25+0x2a0/0x2a0
[   34.919220]  [<ffffffff922c1c31>] kthread+0xd1/0xe0
[   34.919221]  [<ffffffff922c1b60>] ? insert_kthread_work+0x40/0x40
[   34.919222]  [<ffffffff92974c1d>] ret_from_fork_nospec_begin+0x7/0x21
[   34.919224]  [<ffffffff922c1b60>] ? insert_kthread_work+0x40/0x40
[   34.919224] ---[ end trace 38cd9f65c963adad ]---

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-27 22:19:07 -05:00
Dave Airlie
fbac3c48fa Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
Fixes for 5.1:
amdgpu:
- Fix missing fw declaration after dropping old CI DPM code
- Fix debugfs access to registers beyond the MMIO bar size
- Fix context priority handling
- Add missing license on some new files
- Various cleanups and bug fixes

radeon:
- Fix missing break in CS parser for evergreen
- Various cleanups and bug fixes

sched:
- Fix entities with 0 run queues

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221214134.3308-1-alexander.deucher@amd.com
2019-02-22 15:56:42 +10:00
Yong Zhao
234441dd49 drm/amdkfd: Optimize out sdma doorbell array in kgd2kfd_shared_resources
We can directly calculate sdma doorbell indexes in the process doorbell
pages through the doorbell_index structure in amdgpu_device, so no need
to cache them in kgd2kfd_shared_resources any more. This alleviates the
adaptation needs when new SDMA configurations are introduced.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-18 18:00:50 -05:00
Yong Zhao
1f86805adc drm/amdkfd: Fix bugs regarding CP queue doorbell mask on SOC15
Reserved doorbells for SDMA IH and VCN were not properly masked out
when allocating doorbells for CP user queues. This patch fixed that.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-18 18:00:41 -05:00
Yong Zhao
7452394310 drm/amdkfd: Move a constant definition around
The similar definitions should be consecutive.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-18 17:59:56 -05:00
Dave Airlie
c06de56121 Linux 5.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxqHJYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGWl8H/jPI4EipzD2GbnjZ
 GaFpMBBjcXBaVmoA+Y69so+7BHx1Ql+5GQtqbK0RHJRb9qEPLw3FBhHNjM/N8Sgf
 nSrK+GnBZp9s+k/NR/Yf2RacUR3jhz+Q9JEoQd3u9bFUeQyvE8Rf3vgtoBBwFOfz
 +t7N1memYVF3asLGWB4e4sP1YVMGfseTQpSPojvM30YWM86Bv+QtSx1AGgHczQIM
 kMKealR8ZPelN6JAXgLhQ5opDojBrE4YKB98pwsMDI6abz0Tz2JLFEUTTxsv5XNN
 o/Iz+XDoylskEyxN2unNWfHx7Swkvoklog8J/hDg5XlTvipL/WkT66PHBgcGMNvj
 BW9GgU8=
 =ZizU
 -----END PGP SIGNATURE-----

Merge v5.0-rc7 into drm-next

Backmerging for nouveau and imx that needed some fixes for next pulls.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-02-18 13:27:15 +10:00
Nathan Chancellor
6d3d8065bb drm/amdkfd: Fix if preprocessor statement above kfd_fill_iolink_info_for_cpu
Clang warns:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:866:5: warning:
'CONFIG_X86_64' is not defined, evaluates to 0 [-Wundef]
    ^
1 warning generated.

Fixes: d1c234e2cd ("drm/amdkfd: Allow building KFD on ARM64 (v2)")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-05 18:10:28 -05:00
Felix Kuehling
bbdf514fe5 drm/amdkfd: Don't assign dGPUs to APU topology devices
dGPUs need their own topology devices. Don't assign them to APU topology
devices with CPU cores.

Bug: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/66
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Elias Konstantinidis <ekondis@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:59:50 -05:00
Felix Kuehling
d1c234e2cd drm/amdkfd: Allow building KFD on ARM64 (v2)
ifdef x86_64 specific code.
Allow enabling CONFIG_HSA_AMD on ARM64.

v2: Fixed a compiler warning due to an unused variable

CC: Mark Nutter <Mark.Nutter@arm.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Mark Nutter <Mark.Nutter@arm.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:59:37 -05:00
Felix Kuehling
b8fe05247d drm/amdkfd: Don't assign dGPUs to APU topology devices
dGPUs need their own topology devices. Don't assign them to APU topology
devices with CPU cores.

Bug: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/66
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Elias Konstantinidis <ekondis@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:37:33 -05:00
Felix Kuehling
df1dd4f4a7 drm/amdkfd: Allow building KFD on ARM64 (v2)
ifdef x86_64 specific code.
Allow enabling CONFIG_HSA_AMD on ARM64.

v2: Fixed a compiler warning due to an unused variable

CC: Mark Nutter <Mark.Nutter@arm.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Mark Nutter <Mark.Nutter@arm.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:37:15 -05:00
Amber Lin
308176d6f6 drm/amdgpu: Remove kgd2kfd function pointers
kgd2kfd function pointers and global kgd2kfd pointer are no longer in use.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:29 -05:00
Amber Lin
2d3d25b616 drm/amdgpu: Relocate kgd2kfd function declaration
Since amdkfd is merged into amdgpu module and amdgpu can access amdkfd
directly, move declaration of kgd2kfd functions from kfd_priv.h to
amdgpu_amdkfd.h

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:28 -05:00
Linus Torvalds
0fe4e2d5cd drm i915 gvt, amdgpu, core fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcL65eAAoJEAx081l5xIa+y7EP+wQnTk3GV7rKiIi5LEtux5xW
 X2tTaPKHnwrMYjRaP2VNUntJPH6Wxcby3OHGNvGMe1IqNGL/5qRLQ/g1rSSPuM4z
 rYwWR/ooDU/KwYvsT/o+DSO62AoVzIqx8gn8+ShirRN3MdobCcwDebd5oqKjduOn
 hRy9WQwgPOnDG1D3fRWOGSzOE1K9yDFCUaR0AmhUehn9NvsztQGamMBBwMNg+y52
 a5vu+nSLxQrv3ZyZ5TQUgAzi2pWFtC6QxIVuLpl5TqFA3vdRVyN1T78klDnQ7WU7
 6GY1yq9D923c1Tfa0RZoXnE++bX91KKJ5y9YFuNFv8X/th6UoEzRrOPDINfLoZv3
 JsPPSPAiZTgoXc/RGfoMbnidajNB7Gx+No+Pd8P6MeY5H1E+ivMXt5MrOgcMXUqk
 FajthiuSlaB+u5OjNjuS6gBbAMIKw7Idg4hEFSabj91qhJIet/fPhzNmp0HPJ1wF
 XlNnxI7XOytCAORrjLy2q4/lkaoG2AlVpZzeMLgXSxGGlSCtIpDUIqgQbtV1ppCi
 RboQ8yMflRejeK6oXoC92mI8yDB6rwoQy2tK0Hvnag5/q1r7AVYJq+3890NFEU4X
 F5TuCgvhswdkTEJUED1G6pnX7aQzW0dh6KrCltF34sFzD1etYb150En7laa+2kmX
 G5HfZbkLwscPt91moA6B
 =hFld
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-01-05' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Happy New Year, just decloaking from leave to get some stuff from the
  last week in before rc1:

  core:
   - two regression fixes for damage blob and atomic

  i915 gvt:
   - Some missed GVT fixes from the original pull

  amdgpu:
   - new PCI IDs
   - SR-IOV fixes
   - DC fixes
   - Vega20 fixes"

* tag 'drm-next-2019-01-05' of git://anongit.freedesktop.org/drm/drm: (53 commits)
  drm: Put damage blob when destroy plane state
  drm: fix null pointer dereference on null state pointer
  drm/amdgpu: Add new VegaM pci id
  drm/ttm: Use drm_debug_printer for all ttm_bo_mem_space_debug output
  drm/amdgpu: add Vega20 PSP ASD firmware loading
  drm/amd/display: Fix MST dp_blank REG_WAIT timeout
  drm/amd/display: validate extended dongle caps
  drm/amd/display: Use div_u64 for flip timestamp ns to ms
  drm/amdgpu/uvd:Change uvd ring name convention
  drm/amd/powerplay: add Vega20 LCLK DPM level setting support
  drm/amdgpu: print process info when job timeout
  drm/amdgpu/nbio7.4: add hw bug workaround for vega20
  drm/amdgpu/nbio6.1: add hw bug workaround for vega10/12
  drm/amd/display: Optimize passive update planes.
  drm/amd/display: verify lane status before exiting verify link cap
  drm/amd/display: Fix bug with not updating VSP infoframe
  drm/amd/display: Add retry to read ddc_clock pin
  drm/amd/display: Don't skip link training for empty dongle
  drm/amd/display: Wait edp HPD to high in detect_sink
  drm/amd/display: fix surface update sequence
  ...
2019-01-05 18:25:19 -08:00
Linus Torvalds
96d4f267e4 Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.

It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access.  But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.

A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model.  And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.

This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

There were a couple of notable cases:

 - csky still had the old "verify_area()" name as an alias.

 - the iter_iov code had magical hardcoded knowledge of the actual
   values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
   really used it)

 - microblaze used the type argument for a debug printout

but other than those oddities this should be a total no-op patch.

I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something.  Any missed conversion should be trivially fixable, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
Arun KS
9705bea5f8 mm: convert zone->managed_pages to atomic variable
totalram_pages, zone->managed_pages and totalhigh_pages updates are
protected by managed_page_count_lock, but readers never care about it.
Convert these variables to atomic to avoid readers potentially seeing a
store tear.

This patch converts zone->managed_pages.  Subsequent patches will convert
totalram_panges, totalhigh_pages and eventually managed_page_count_lock
will be removed.

Main motivation was that managed_page_count_lock handling was complicating
things.  It was discussed in length here,
https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes
better to remove the lock and convert variables to atomic, with preventing
poteintial store-to-read tearing as a bonus.

Link: http://lkml.kernel.org/r/1542090790-21750-3-git-send-email-arunks@codeaurora.org
Signed-off-by: Arun KS <arunks@codeaurora.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:47 -08:00
Linus Torvalds
4971f090aa drm pull request for 4.21-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcExwOAAoJEAx081l5xIa+euIP/1NZZvSB+bsCtOwDG8I6uWsS
 OU5JUZ8q2dqyyFagRxzlkeSt3uWJqKp5NyNwuc9z/5u6AGF+3/97D0J1lG6Os/st
 4abF6NadivYJ4cXhJ1ddIHOFMVDcAsyMWNDb93NwPwncCsQ0jt5FFOsrCyj6BGY+
 ihHFlHrIyDrbBGDHz+u1E/EO5WkNnaLDoC+/k2fTRWCNI3bQL3O+orsYTI6S2uvU
 lQJnRfYAllgLD2p1k/rrBHcHXBv50roR0e8uhGmbdhGdp5bEW30UGBLHXxQjjSVy
 fQCwFwTO8X6zoxU53Zbbk+MVrp+jkTHcGKViHRuLkaHzE5mX26UXDwlXdN32ZUbK
 yHOJp+uDaWXX7MIz0LsB9Iqj2+eIUoFaIJMoZTMGVTNvqnTxKnoHnjAtbTH2u258
 teFgmy4BIgPgo2kwEnBEZjCapou0Eivyut2wq8bTAB2Fe8LwURJpr3cioTtMLlUO
 L5/PoD27eFvBCAeFrQIwF3b2XiQEnBpXocmilEwP1xDMPgoyeePAfIF2iEpDvi0U
 jce3rLd2yVvo92xYUgoHkVTD8si/pKKnZ1D0U3+RI6pxK6s0HJEHjcNEMdvdm+2S
 4qgvBQV3wlWFkXEK8PR5BHPoLntg18tKon/BTLBjgGkN9E1o9fWs1/s6KQGY4xdo
 l3Vvfx2LTdkgEoBssSwB
 =wh4W
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Core:
   - shared fencing staging removal
   - drop transactional atomic helpers and move helpers to new location
   - DP/MST atomic cleanup
   - Leasing cleanups and drop EXPORT_SYMBOL
   - Convert drivers to atomic helpers and generic fbdev.
   - removed deprecated obj_ref/unref in favour of get/put
   - Improve dumb callback documentation
   - MODESET_LOCK_BEGIN/END helpers

  panels:
   - CDTech panels, Banana Pi Panel, DLC1010GIG,
   - Olimex LCD-O-LinuXino, Samsung S6D16D0, Truly NT35597 WQXGA,
   - Himax HX8357D, simulated RTSM AEMv8.
   - GPD Win2 panel
   - AUO G101EVN010

  vgem:
   - render node support

  ttm:
   - move global init out of drivers
   - fix LRU handling for ghost objects
   - Support for simultaneous submissions to multiple engines

  scheduler:
   - timeout/fault handling changes to help GPU recovery
   - helpers for hw with preemption support

  i915:
   - Scaler/Watermark fixes
   - DP MST + powerwell fixes
   - PSR fixes
   - Break long get/put shmemfs pages
   - Icelake fixes
   - Icelake DSI video mode enablement
   - Engine workaround improvements

  amdgpu:
   - freesync support
   - GPU reset enabled on CI, VI, SOC15 dGPUs
   - ABM support in DC
   - KFD support for vega12/polaris12
   - SDMA paging queue on vega
   - More amdkfd code sharing
   - DCC scanout on GFX9
   - DC kerneldoc
   - Updated SMU firmware for GFX8 chips
   - XGMI PSP + hive reset support
   - GPU reset
   - DC trace support
   - Powerplay updates for newer Polaris
   - Cursor plane update fast path
   - kfd dma-buf support

  virtio-gpu:
   - add EDID support

  vmwgfx:
   - pageflip with damage support

  nouveau:
   - Initial Turing TU104/TU106 modesetting support

  msm:
   - a2xx gpu support for apq8060 and imx5
   - a2xx gpummu support
   - mdp4 display support for apq8060
   - DPU fixes and cleanups
   - enhanced profiling support
   - debug object naming interface
   - get_iova/page pinning decoupling

  tegra:
   - Tegra194 host1x, VIC and display support enabled
   - Audio over HDMI for Tegra186 and Tegra194

  exynos:
   - DMA/IOMMU refactoring
   - plane alpha + blend mode support
   - Color format fixes for mixer driver

  rcar-du:
   - R8A7744 and R8A77470 support
   - R8A77965 LVDS support

  imx:
   - fbdev emulation fix
   - multi-tiled scalling fixes
   - SPDX identifiers

  rockchip
   - dw_hdmi support
   - dw-mipi-dsi + dual dsi support
   - mailbox read size fix

  qxl:
   - fix cursor pinning

  vc4:
   - YUV support (scaling + cursor)

  v3d:
   - enable TFU (Texture Formatting Unit)

  mali-dp:
   - add support for linear tiled formats

  sun4i:
   - Display Engine 3 support
   - H6 DE3 mixer 0 support
   - H6 display engine support
   - dw-hdmi support
   - H6 HDMI phy support
   - implicit fence waiting
   - BGRX8888 support

  meson:
   - Overlay plane support
   - implicit fence waiting
   - HDMI 1.4 4k modes

  bridge:
   - i2c fixes for sii902x"

* tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm: (1403 commits)
  drm/amd/display: Add fast path for cursor plane updates
  drm/amdgpu: Enable GPU recovery by default for CI
  drm/amd/display: Fix duplicating scaling/underscan connector state
  drm/amd/display: Fix unintialized max_bpc state values
  Revert "drm/amd/display: Set RMX_ASPECT as default"
  drm/amdgpu: Fix stub function name
  drm/msm/dpu: Fix clock issue after bind failure
  drm/msm/dpu: Clean up dpu_media_info.h static inline functions
  drm/msm/dpu: Further cleanups for static inline functions
  drm/msm/dpu: Cleanup the debugfs functions
  drm/msm/dpu: Remove dpu_irq and unused functions
  drm/msm: Make irq_postinstall optional
  drm/msm/dpu: Cleanup callers of dpu_hw_blk_init
  drm/msm/dpu: Remove unused functions
  drm/msm/dpu: Remove dpu_crtc_is_enabled()
  drm/msm/dpu: Remove dpu_crtc_get_mixer_height
  drm/msm/dpu: Remove dpu_dbg
  drm/msm: dpu: Remove crtc_lock
  drm/msm: dpu: Remove vblank_requested flag from dpu_crtc
  drm/msm: dpu: Separate crtc assignment from vblank enable
  ...
2018-12-25 11:48:26 -08:00
Felix Kuehling
e98bdb8061 drm/amdkfd: Fix handling of return code of dma_buf_get
On errors, dma_buf_get returns a negative error code, rather than NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-18 17:39:11 -05:00
Alex Deucher
9bd206f89f drm/amdkfd: add new vega20 pci id
New vega20 id.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-10 15:27:45 -05:00
Alex Deucher
756e16bf79 drm/amdkfd: add new vega10 pci ids
New vega10 ids.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2018-12-10 15:27:32 -05:00
Felix Kuehling
b408a54884 drm/amdkfd: Add support for doorbell BOs
This allows user mode to map doorbell pages into GPUVM address space.
That way GPUs can submit to user mode queues (self-dispatch).

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-07 18:14:00 -05:00
Felix Kuehling
1dde0ea95b drm/amdkfd: Add DMABuf import functionality
This is used for interoperability between ROCm compute and graphics
APIs. It allows importing graphics driver BOs into the ROCm SVM
address space for zero-copy GPU access.

The API is split into two steps (query and import) to allow user mode
to manage the virtual address space allocation for the imported buffer.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-07 18:13:54 -05:00
Felix Kuehling
3704d56e1a drm/amdkfd: Add NULL-pointer check
top_dev->gpu is NULL for CPUs. Avoid dereferencing it if NULL.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-07 18:13:48 -05:00
Brajeswar Ghosh
b8b3ede2de drm/amd/amdkfd: Remove duplicate header
Remove gca/gfx_8_0_enum.h which is included more than once

Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-26 15:54:39 -05:00
Yong Zhao
a53a11a835 drm/amdkfd: Workaround PASID missing in gfx9 interrupt payload under non HWS
This is a known gfx9 HW issue, and this change can perfectly workaround
the issue.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:14 -05:00
Yong Zhao
00557f4131 drm/amdkfd: Adjust the debug message in KFD ISR
This makes debug message get printed even when there is early return.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:14 -05:00
Gang Ba
846a44d7e9 drm/amdkfd: Added Vega12 and Polaris12 for KFD.
Add Vega12 and Polaris12 device info and device IDs to KFD.

Signed-off-by: Gang Ba <gaba@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:13 -05:00
Yong Zhao
4e6c6fc19d drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager
This will make reading code much easier. This fixes a few spots missed in a
previous commit with the same title.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:13 -05:00
Christian König
2383a767c0 drm/amdkfd: fix interrupt spin lock
Vega10 has multiple interrupt rings, so this can be called from multiple
calles at the same time resulting in:

[   71.779334] ================================
[   71.779406] WARNING: inconsistent lock state
[   71.779478] 4.19.0-rc1+ #44 Tainted: G        W
[   71.779565] --------------------------------
[   71.779637] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[   71.779740] kworker/6:1/120 [HC0[0]:SC0[0]:HE1:SE1] takes:
[   71.779832] 00000000ad761971 (&(&kfd->interrupt_lock)->rlock){?...},
at: kgd2kfd_interrupt+0x75/0x100 [amdgpu]
[   71.780058] {IN-HARDIRQ-W} state was registered at:
[   71.780115]   _raw_spin_lock+0x2c/0x40
[   71.780180]   kgd2kfd_interrupt+0x75/0x100 [amdgpu]
[   71.780248]   amdgpu_irq_callback+0x6c/0x150 [amdgpu]
[   71.780315]   amdgpu_ih_process+0x88/0x100 [amdgpu]
[   71.780380]   amdgpu_irq_handler+0x20/0x40 [amdgpu]
[   71.780409]   __handle_irq_event_percpu+0x49/0x2a0
[   71.780436]   handle_irq_event_percpu+0x30/0x70
[   71.780461]   handle_irq_event+0x37/0x60
[   71.780484]   handle_edge_irq+0x83/0x1b0
[   71.780506]   handle_irq+0x1f/0x30
[   71.780526]   do_IRQ+0x53/0x110
[   71.780544]   ret_from_intr+0x0/0x22
[   71.780566]   cpuidle_enter_state+0xaa/0x330
[   71.780591]   do_idle+0x203/0x280
[   71.780610]   cpu_startup_entry+0x6f/0x80
[   71.780634]   start_secondary+0x1b0/0x200
[   71.780657]   secondary_startup_64+0xa4/0xb0

Fix this by always using irq save spin locks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 15:49:38 -05:00
Yong Zhao
435e2f9709 drm/amdkfd: page_table_base already have the flags needed
The flags are added when calling amdgpu_gmc_pd_addr().

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:14 -05:00
Yong Zhao
deb99d7c4f drm/amdkfd: Delete a duplicate statement in set_pasid_vmid_mapping()
The same statement is later done in kgd_set_pasid_vmid_mapping(), so no
need to do it in set_pasid_vmid_mapping().

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:13 -05:00
Amber Lin
7cd52c917a drm/amdkfd: Add proper prefix to functions
Add amdgpu_amdkfd_ prefix to amdgpu functions served for amdkfd usage.

v2: fix indentation

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:08 -05:00
Amber Lin
5b87245faf drm/amdkfd: Simplify kfd2kgd interface
After amdkfd module is merged into amdgpu, KFD can call amdgpu directly
and no longer needs to use the function pointer. Replace those function
pointers with functions if they are not ASIC dependent.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:07 -05:00
zhong jiang
6dfeb11a4b drm/amdkfd: Use kmemdup instead of duplicating its function
kmemdup has implemented the function that kmalloc() + memcpy().
We prefer to kmemdup rather than code opened implementation.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:20:36 -05:00
YueHaibing
ae5c59a83b drm/amdkfd: Remove set but not used variable 'preempt_all_queues'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c: In function 'destroy_queue_cpsch':
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:1366:7: warning:
 variable 'preempt_all_queues' set but not used [-Wunused-but-set-variable]

It never used since introduct in
commit 992839ad64 ("drm/amdkfd: Add static user-mode queues support")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-10 14:49:42 -05:00
Felix Kuehling
1b19aa5aa8 drm/amdkfd: Fix incorrect use of process->mm
This mm_struct pointer should never be dereferenced. If running in
a user thread, just use current->mm. If running in a kernel worker
use get_task_mm to get a safe reference to the mm_struct.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-09 17:06:19 -05:00
Shaoyun Liu
006a0b3d86 drm/amdkfd: Remove the requirement for atomic Ops on vg20
Firmware have the workaround to replace the atomic Ops with read-modify-write on CP side.
User should not expect atomic Ops on system memory works normally if system didn't not
support it.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-By: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27 09:39:57 -05:00
Shaoyun Liu
22a3a2941b drm/amdkfd: Vega20 bring up on amdkfd side
Add Vega20 device IDs, device info and enable it in KFD.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-09-26 21:09:18 -05:00
Shaoyun Liu
e715c6d0ea drm/amd: Interface change to support 64 bit page_table_base
amdgpu_gpuvm_get_process_page_dir should return the page table address
in the format expected by the pm4_map_process packet for all ASIC
generations.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:17 -05:00
Shaoyun Liu
d50941892e drm/amdkfd: Make the number of SDMA queues variable
Vega20 supports 8 SDMA queues per engine

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:16 -05:00
Jay Cornwall
5df099e8bc drm/amdkfd: Add wavefront context save state retrieval ioctl
Wavefront context save data is of interest to userspace clients for
debugging static wavefront state. The MQD contains two parameters
required to parse the control stack and the control stack itself
is kept in the MQD from gfx9 onwards.

Add an ioctl to fetch the context save area and control stack offsets
and to copy the control stack to a userspace address if it is kept in
the MQD.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:15 -05:00
Felix Kuehling
5ade6c9c35 drm/amdkfd: Report SDMA firmware version in the topology
Also save the version in struct kfd_dev so we only need to query
it once.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:14 -05:00
Emily Deng
6d12aa8741 drm/amdkfd: KFD doesn't support TONGA SRIOV
KFD module doesn't support TONGA SRIOV, if init KFD module in TONGA SRIOV
environment, it will let compute ring IB test fail.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:14 -05:00
Eric Huang
d35f00d8ec drm/amdkfd: reflect atomic support in IO link properties
Add the flags of properties according to Asic type and pcie
capabilities.

Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:13 -05:00
Dave Airlie
bf78296ab1 This is the 4.19-rc5 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlunyjMACgkQONu9yGCS
 aT52HhAA0JU7E88QPZ1gSxc1ifTaIlHXhLQSvQKAXOhIvHDwj4tEKDqPhpCN/dWX
 /o/xaUf36gU0VzUD/1IyEiMFmJEeFKnfvN5SZYZLk8uSrd4swqaY8mSueZxNEDz4
 YNK9ugI/tPztuuz7I6KrO1iVquY1WlnECxc9FH76wvHsit8Sr3PvzhR+CVrOi+8p
 k3cpWlhHiOzT/3K3Wv2Et+oh+U+myKtQTlJDSe3fMx5chksJpBmsV/IDEtsLNZfz
 3v25fHz5a3DOYqKkGJaDrbLyPNC85249B+CiXqbXvfOAHDVkMwYqcxYUG+YZ5cpm
 U0OShLXm67dz8vT9cxqOSguCliPRlM9W5+EKzmVT7l8+ycds3BuEEHg1xWPrJWgG
 7XO10HkhZl+VvnJCj54KaszMUOdpvdEQSUs82gAFxjPbQIx5gosN9O0H+DnirMhS
 6VtzS20ZoIzjd4YVkRoLNcobHB4bZVTNXZ1Zi3C/neP9pxUjhOk0y+Vr/crC5Xph
 3TykIMgiVa+CdvQ/f4LOSiCgTFhF0tLGtfDQTG7f+9+W5pMc4NKSLi8EOMlJtYEy
 wsCYZ7/T9ElgrEzFvlxSvDwiPUhcldNao/EGdRYvMxXtgj0Ctw8LhR/2YKkqo6LK
 oMoKKWkj0o7uKSHKq+dakS0FprKnBnvE2Y+XA4SO/saPGFlDAVc=
 =OFJh
 -----END PGP SIGNATURE-----

BackMerge v4.19-rc5 into drm-next

Sean Paul requested an -rc5 backmerge from some sun4i fixes.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-09-27 11:06:46 +10:00
Yong Zhao
44d8cc6f1a drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs
Because CRAT_CU_FLAGS_IOMMU_PRESENT was not set in some BIOS crat, we
need to workaround this.

For future compatibility, we also overwrite the bit in capability according
to the value of needs_iommu_device.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:23 -05:00
Yong Zhao
15426dbb65 drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9
CWSR fails on Raven if the control stack is MTYPE_UC, which is used
for regular GART mappings. As a workaround we map it using MTYPE_NC.

The MEC firmware expects the control stack at one page offset from the
start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU
added a memory allocation flag just for this purpose.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:17 -05:00
shaoyunl
67f7cf9f76 drm/amdkfd: Only add bi-directional iolink on GPU with XGMI or largebar (v2)
v2: compile fix

Signed-off-by: shaoyunl <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:49:33 -05:00
Shaoyun Liu
ae9a25aea7 drm/amdkfd: Generate xGMI direct iolink
Generate xGMI iolink for upper level usage

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:49:00 -05:00
Shaoyun Liu
aa64ca38ed drm/amdkfd: Add new iolink type defines
Update the iolink type defines according to the new thunk spec

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:48:51 -05:00
Shaoyun Liu
0c1690e38b drm/amdkfd: kfd expose the hive_id of the device through its node properties
Thunk will generate the XGMI topology information when necessary with the hive_id
for each specified device

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:48:43 -05:00
Amber Lin
2690262ec9 drm/amdgpu: Relocate some definitions v2
Move some KFD-related (but used in amdgpu_drv.c) definitions from
kfd_priv.h to kgd_kfd_interface.h so we don't need to include kfd_priv.h
in amdgpu_drv.c. This fixes a build failure when AMDGPU is enabled but
MMU_NOTIFIER is not.
This patch also disables KFD-related module options when HSA_AMD is not
enabled.

v2: rebase (Alex)

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-29 12:41:50 -05:00
Oak Zeng
bf47afbabf drm/amdkfd: Release an acquired process vm
For compute vm acquired from amdgpu, vm.pasid is managed
by kfd. Decouple pasid from such vm on process destroy
to avoid duplicate pasid release.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-29 12:35:00 -05:00
Oak Zeng
1685b01a85 drm/amdgpu: Set pasid for compute vm (v2)
To make a amdgpu vm to a compute vm, the old pasid will be freed and
replaced with a pasid managed by kfd. Kfd can't reuse original pasid
allocated by amdgpu because kfd uses different pasid policy with amdgpu.
For example, all graphic devices share one same pasid in a process.

v2: rebase (Alex)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-29 12:34:49 -05:00
Amber Lin
521fb7d021 drm/amdgpu: Move KFD parameters to amdgpu (v3)
After merging KFD into amdgpu, move module parameters defined in KFD to
amdgpu_drv.c, where other module parameters are declared.

v2: add kernel-doc comments
v3: rebase and fix parameter variable name (Alex)

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-28 11:51:11 -05:00
Amber Lin
04d5e27658 drm/amdgpu: Merge amdkfd into amdgpu
Since KFD is only supported by single GPU driver, it makes sense to merge
amdgpu and amdkfd into one module. This patch is the initial step: merge
Kconfig and Makefile.

v2: also remove kfd from drm Kconfig

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-28 11:22:42 -05:00
Felix Kuehling
b5aa3f4aef drm/amdkfd: Call kfd2kgd.set_compute_idle
User mode queue submissions don't go through KFD. Therefore we don't
know exactly when compute is idle or not idle. We use the existence
of user mode queues on a device as an approximation.

register_process is called when the first queue of a process is
created. Conversely unregister_process is called when the last queue
is destroyed. The first process that is registered takes compute
out of idle. The last process that is unregisters sets compute back
to idle.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-16 19:10:37 -04:00
Felix Kuehling
39e7f33186 drm/amdkfd: Add CU-masking ioctl to KFD
CU-masking allows a KFD client to control the set of CUs used by a
user mode queue for executing compute dispatches. This can be used
for optimizing the partitioning of the GPU and minimize conflicts
between concurrent tasks.

Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-14 19:05:59 -04:00
Yong Zhao
4d663df658 drm/amdkfd: Enable Raven for KFD
Add DID and kfd_device_info for Raven.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:48 -04:00
Yong Zhao
359cecdd49 drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event()
memory_exception_data is already initialized for not-present faults.
It only needs to be overridden for permission faults.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:47 -04:00
Yong Zhao
8725aecac3 drm/amdkfd: Workaround to accommodate Raven too many PPR issue
On Raven multiple PPRs can be queued up by the hardware. When the
first of those requests is handled by the IOMMU driver, the memory
access succeeds. After that the application may be done with the
memory and unmap it. At that point the page table entries are
invalidated, but there are still outstanding duplicate PPRs for those
addresses. When the IOMMU driver processes those duplicate requests,
it finds invalid page table entries and triggers an invalid PPR fault.

As a workaround, don't signal invalid PPR faults on Raven to avoid
segfaulting applications that haven't done anything wrong. As a side
effect, real GPU memory access faults may go unnoticed by the
application.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:46 -04:00
Yong Zhao
eab69801cf drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues
On Raven Invalid PPRs (peripheral page requests) can be reported
because multiple PPRs can be still queued when memory is freed.
Apply a rate limit to avoid flooding the log in this case.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:45 -04:00
Yong Zhao
98bb92222e drm/amdkfd: Make SDMA engine number an ASIC-dependent variable
On Raven there is only one SDMA engine instead of previously assumed two,
so we need to adapt our code to this new scenario.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:44 -04:00
Yong Zhao
f3ed5df84c drm/amdkfd: Consolidate duplicate memory banks info in topology
If there are several memory banks that has the same properties in CRAT,
we aggregate them into one memory bank. This cleans up memory banks on
APUs (e.g. Raven) where the CRAT reports each memory channel as a
separate bank. This only confuses user mode, which only deals with
virtual memory.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:43 -04:00
Yong Zhao
e7016d8e6f drm/amdkfd: Clean up reference of radeon
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:08 -04:00
Yong Zhao
8d5f355290 drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager
This will make reading code much easier.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:07 -04:00
Yong Zhao
2b281977f5 drm/amdkfd: Use module parameters noretry as the internal variable name
This makes all module parameters use the same form. Meanwhile clean up
the surrounding code.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:06 -04:00
Yong Zhao
0e9a860c72 drm/amdkfd: Introduce KFD module parameter halt_if_hws_hang
This avoids triggering a GPU reset or otherwise changing the HW
state. Instead KFD will hang, which allows HW debugging tools to
analyze the problem.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:05 -04:00
Shaoyun Liu
a29ec470b1 drm/amdkfd: Add debugfs interface to trigger HWS hang
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:04 -04:00
Shaoyun Liu
951df6d9cf drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculation
The bitmap index calculation should reverse the logic used on allocation
so it will clear the same bit used on allocation

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:01 -04:00
Shaoyun Liu
73ea648d92 drm/amdkfd: Implement hang detection in KFD and call amdgpu
The reset will be performed in a new hw_exception work thread to
handle HWS hang without blocking the thread that detected the hang.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:58 -04:00
Shaoyun Liu
e42051d213 drm/amdkfd: Implement GPU reset handlers in KFD
Lock KFD and evict existing queues on reset. Notify user mode by
signaling hw_exception events.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:56 -04:00
Shaoyun Liu
e3b7a96774 drm/amdkfd: Add gpu reset interface and place holder
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:54 -04:00
Lan Xiao
58e6988612 drm/amdkfd: fix zero reading of VMID and PASID for Hawaii
Upon VM Fault, the VMID and PASID written by HW are zeros in
Hawaii. Instead of reading from ih_ring_entry, read directly
from the registers. This workaround fix the soft hang issues
caused by mishandled VM Fault in Hawaii.

Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:51 -04:00
shaoyunl
2640c3facb drm/amdkfd: Handle VM faults in KFD
1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address per
   per-vmid. amdkfd needs to get the information from amdgpu through the
   new get_vm_fault_info interface. On GFX9 and later, all the required
   information is in the IH ring
2. amdkfd unmaps all queues from the faulting process and create new
   run-list without the guilty process
3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY

Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:50 -04:00
Moses Reuben
101fee63cb drm/amdkfd: send SIGSEGV to process upon KFD_EVENT_TYPE_MEMORY
Signed-off-by: Moses Reuben <moses.reuben@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:48 -04:00
Wei Lu
e47cb828eb drm/amdkfd: Fix error codes in kfd_get_process
Return ERR_PTR(-EINVAL) if kfd_get_process fails to find the process.
This fixes kernel oopses when a child process calls KFD ioctls with
a file descriptor inherited from the parent process.

Signed-off-by: Wei Lu <wei.lu2@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:47 -04:00
Jay Cornwall
a60d811b2b drm/amdkfd: Fix race between scheduler and context restore
The scheduler may raise SQ_WAVE_STATUS.SPI_PRIO via SQ_CMD before
context restore has completed. Restoring SPI_PRIO=0 after this point
may cause context save to fail as the lower priority wavefronts
are not selected for execution among spin-waiting wavefronts.

Leave SPI_PRIO at its SPI-initialized or scheduler-raised value.

v2: Also fix race with exception handler

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:46 -04:00
Felix Kuehling
1cd106ecfc drm/amdkfd: Stop using GFP_NOIO explicitly
This is no longer needed with the memalloc_nofs_save/restore in
dqm_lock/unlock.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:45 -04:00
Felix Kuehling
efeaed4d98 drm/amdkfd: Reliably prevent reclaim-FS while holding DQM lock
This is needed to prevent deadlocks when MMU notifiers run in
reclaim-FS context and take the DQM lock for userptr evictions.
Previously this was done by making all memory allocations under
DQM locks GFP_NOIO. This is error prone. Using
memalloc_nofs_save/restore will reliably affect all memory
allocations anywhere in the kernel while the DQM lock is held.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:44 -04:00
Arnd Bergmann
0337976f40 drm/admkfd use modern ktime accessors
getrawmonotonic64() and get_monotonic_boottime64() are deprecated
because of the nonstandard naming.

The replacement functions ktime_get_raw_ns() and ktime_get_boot_ns()
also simplify the callers.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 14:41:00 +02:00
Dave Airlie
c76f0b2cc2 Merge tag 'drm-amdkfd-next-2018-05-14' of git://people.freedesktop.org/~gabbayo/linux into drm-next
This is amdkfd pull for 4.18. The major new features are:

- Add support for GFXv9 dGPUs (VEGA)
- Add support for userptr memory mapping

In addition, there are a couple of small fixes and improvements, such as:
- Fix lock handling
- Fix rollback packet in kernel kfd_queue
- Optimize kfd signal handling
- Fix CP hang in APU

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180514070126.GA1827@odedg-x270
2018-05-15 16:06:08 +10:00
Randy Dunlap
7bbc0b950f drm/amdkfd: fix build, select MMU_NOTIFIER
When CONFIG_MMU_NOTIFIER is not enabled, struct mmu_notifier has an
incomplete type definition, which causes build errors.

../drivers/gpu/drm/amd/amdkfd/kfd_priv.h:607:22: error: field 'mmu_notifier' has incomplete type
../include/linux/kernel.h:979:32: error: dereferencing pointer to incomplete type
../include/linux/kernel.h:980:18: error: dereferencing pointer to incomplete type
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:434:2: error: implicit declaration of function 'mmu_notifier_unregister_no_release' [-Werror=implicit-function-declaration]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:435:2: error: implicit declaration of function 'mmu_notifier_call_srcu' [-Werror=implicit-function-declaration]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:438:21: error: variable 'kfd_process_mmu_notifier_ops' has initializer but incomplete type
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: error: unknown field 'release' specified in initializer
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: excess elements in struct initializer [enabled by default]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: (near initialization for 'kfd_process_mmu_notifier_ops') [enabled by default]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:534:2: error: implicit declaration of function 'mmu_notifier_register' [-Werror=implicit-function-declaration]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 12:50:04 +03:00
Andres Rodriguez
1cf6cc74bb drm/amdkfd: fix clock counter retrieval for node without GPU
Currently if a user requests clock counters for a node without a GPU
resource we will always return EINVAL.

Instead if no GPU resource is attached, fill the gpu_clock_counter
argument with zeroes so that we may proceed and return valid CPU
counters.

Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 12:34:44 +03:00
Wei Yongjun
ded5e5622c drm/amdkfd: Fix the error return code in kfd_ioctl_unmap_memory_from_gpu()
Passing NULL pointer to PTR_ERR will result in return value of 0
indicating success which is clearly not what it is intended here.
This patch returns -EINVAL instead.

v2: change ret code to -ENODEV

Fixes: 5ec7e02854 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 12:14:55 +03:00
kbuild test robot
a4efd3a4e6 drm/amdkfd: kfd_dev_is_large_bar() can be static
Fixes: 5ec7e02854 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 12:05:27 +03:00
Laura Abbott
af47b39027 drm/amdkfd: Remove vla
There's an ongoing effort to remove VLAs[1] from the kernel to eventually
turn on -Wvla. Switch to a constant value that covers all hardware.

[1] https://lkml.org/lkml/2018/3/7/621

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-13 14:24:12 -07:00
Felix Kuehling
c129db1206 drm/amdkfd: Add sanity checks in IRQ handlers
Only accept interrupts from KFD VMIDs. Just checking for a PASID may
not be enough because amdgpu started using PASIDs to map VM faults
to processes.

Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug).

Suggested-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Suggested-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:12 -04:00
Shaoyun Liu
2533f0741e drm/amdkfd: Remove queue node when destroy queue failed
HWS may hang in the middle of destroy queue, remove the queue from the
process queue list so it won't be freed again in the future

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:11 -04:00
Ben Goz
bfdcbfd255 drm/amdkfd: Locking PM mutex while allocating IB buffer
Signed-off-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:10 -04:00
Felix Kuehling
ccb76b149e drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK
The initialization is not necessary. amd-kfd-staging and ROCm
releases have worked without it for two years.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:09 -04:00
Felix Kuehling
eeb27b7eb3 drm/amdkfd: Fix signal handling performance again
It turns out that idr_for_each_entry is really slow compared to just
iterating over the slots. Based on measurements the difference is
estimated to be about a factor 64. That means using idr_for_each_entry
is only worth it with very few allocated events.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:08 -04:00
Yong Zhao
f8ea72d097 drm/amdkfd: Fix CP soft hang on APUs
The problem happens on Raven and Carrizo. The context save handler
should not clear the high bits of PC_HI before extracting the bits
of IB_STS.

The bug is not relevant to VEGA10 until we enable demand paging.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:07 -04:00
Yong Zhao
0db54b24ad drm/amdkfd: Separate trap handler assembly code and its hex values
Since the assembly code is inside "#if 0", it is ineffective. Despite that,
during debugging, we need to change the assembly code, extract it into
a separate file and compile the new file into hex values using sp3.
That process also requires us to remove "#if 0" and modify lines starting
with "#", so that sp3 can successfully compile the new file.

With this change, all the above chore is no longer needed, and
cwsr_trap_handler_gfx*.asm can be directly used by sp3 to generate its
hex values.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:06 -04:00
Felix Kuehling
a2e94158b8 drm/amdkfd: Remove redundant include of amd-iommu.h
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:05 -04:00
Philip Yang
fa7e65147e drm/amdkfd: use %px to print user space address instead of %p
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:04 -04:00
Jay Cornwall
2774c63ef3 drm/amdkfd: Use volatile MTYPE in default/alternate apertures
MTYPE_NC_NV (0) marks scalar/vector L1 cache lines as non-volatile.
Cache lines loaded through these apertures are intended to be
invalidated before (and sometimes during) a dispatch. The non-volatile
qualifier prevents these cache lines from being distinguished from
those loaded through the private aperture.

Use MTYPE_NC (1) instead on both Gfx7 and Gfx8. This allows the
compiler to use the BUFFER_WBINVL1_VOL instruction and is a precursor
to automatic per-dispatch scalar/vector L1 volatile invalidation.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:03 -04:00
Jay Cornwall
87e6d4e077 drm/amdkfd: Reduce priority of context-saving waves before spin-wait
Synchronization between context-saving wavefronts is achieved by
sending a SAVEWAVE message to the SPI and then spin-waiting for a
response. These spin-waiting wavefronts may inhibit the progress
of other wavefronts in the context save handler, leading to the
synchronization condition never being achieved.

Before spin-waiting reduce the priority of each wavefront to
guarantee foward progress in the others.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:02 -04:00
Oak Zeng
24f48a4203 drm/amdkfd: Dump HQD of HIQ
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:01 -04:00
Dan Carpenter
8feaccf71d drm/amdkfd: Integer overflows in ioctl
args->n_devices is a u32 that comes from the user.  The multiplication
could overflow on 32 bit systems possibly leading to privilege
escalation.

Fixes: 5ec7e02854 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Dan Carpenter dan.carpenter@oracle.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 16:35:49 +03:00
Felix Kuehling
389056e5fe drm/amdkfd: Add Vega10 topology and device info
* Report 64-bit doorbells as HSA_CAP_DOORBELL_TYPE_2_0 in topology
* Report cache information in topology (duplicates GFXv8 info for now)
* Add device info for Vega10 support in KFD

Raven is not enabled at this time as it needs additional changes in
DQM to work with a single SDMA engine.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:18 -04:00
welu
6106dce955 drm/amdkfd: Try to enable atomics for all GPUs
Report failure to enable atomics only on GPUs that require them.
This allows GPUs that don't require atomics to function, but can
benefit if they are available. This is the case for Vega10, which
doesn't use atomics for basic functioning of the MEC, AQL and HWS
microcode. So it can work without atomics. But shader programs can
still use atomic instructions on systems that support PCIe atomics.

Signed-off-by: welu <Wei.Lu2@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:17 -04:00
Felix Kuehling
3e76c2399b drm/amdkfd: Add GFXv9 CWSR trap handler
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:16 -04:00
Felix Kuehling
70a31d16cc drm/amdkfd: Support flat memory apertures for GFXv9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:15 -04:00
Felix Kuehling
6aac0a48b0 drm/amdkfd: Remove limit on number of GPUs (follow-up)
This condition was missed in a previous commit with the same title.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:14 -04:00
Felix Kuehling
9d7d024816 drm/amdkfd: Add 64-bit doorbell and wptr support to kernel queue
v2: Removed redundant 0x before %p.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-08 22:03:51 -04:00
Felix Kuehling
bebfd2f412 drm/amdkfd: Fix kernel queue rollback_packet
kq->queue->properties.write_ptr is a GPU address which can'd be
derefenced in the kernel. Use kq->wptr_kernel instead, which is the
kernel CPU address of the same buffer.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:12 -04:00
Felix Kuehling
2a26fbfe80 drm/amdkfd: Fix goto usage
Missed a spot in previous cleanup commit:
Remove gotos that do not feature any common cleanup, and use gotos
instead of repeating cleanup commands.

According to kernel.org: "The goto statement comes in handy when a
function exits from multiple locations and some common work such as
cleanup has to be done. If there is no cleanup needed then just return
directly."

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:11 -04:00
Felix Kuehling
ca750681bc drm/amdkfd: Add SOC15 interrupt processing support
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:10 -04:00
Felix Kuehling
bed4f11025 drm/amdkfd: Add GFXv9 device queue manager
Signed-off-by: John Bridgman <john.bridgman@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:09 -04:00
Felix Kuehling
b91d43dd01 drm/amdkfd: Add GFXv9 MQD manager
Signed-off-by: John Bridgman <john.bridgman@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:08 -04:00
Felix Kuehling
454150b1f9 drm/amdkfd: Add GFXv9 PM4 packet writer functions
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:07 -04:00
Felix Kuehling
f6e27ff19d drm/amdkfd: Move packet writer functions into ASIC-specific file
This is in preparation for GFXv9 (Vega10) which uses incompatible PM4
packet formats from previous ASIC generations.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:06 -04:00
Felix Kuehling
ef568db792 drm/amdkfd: Implement doorbell allocation for SOC15
Allocate doorbells according to the doorbell routing information on
SOC15 ASICs (Vega10 and later). On older ASICs we continue to use the
queue_id as the doorbell ID to maintain compatibility with the Thunk.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:05 -04:00
Harish Kasiviswanathan
df03ef9342 drm/amdkfd: Clean up KFD_MMAP_ offset handling
Use bit-rotate for better clarity and remove _MASK from the #defines as
these represent mmap types.

Centralize all the parsing of the mmap offset in kfd_mmap and add device
parameter to doorbell and reserved_mem map functions.

Encode gpu_id into upper bits of vm_pgoff. This frees up the lower bits
for encoding the the doorbell ID on Vega10.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:04 -04:00
Felix Kuehling
ada2b29c4a drm/amdkfd: Make doorbell size ASIC-dependent
This prepares for GFXv9 (Vega10), which has 64-bit doorbells.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:03 -04:00
Linus Torvalds
320b164abb main drm pull request for v4.17
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJavDxYAAoJEAx081l5xIa+pCoP/iwjuxkSTdJpZUx5g0daGkCK
 O18moGqGPChb7qJovfHqCKZ1f9PGulQt7SxwFzzJXNbv0PbfMA/Og0EhMLBImb+Q
 VfYgq2vJLpmkikgcI5fBrzs9DRMQKKobGIzw24VS7IkPYA7d8KgAyywBwG0+LUFR
 G3sobClgapsfaUcleb3ZOeDwymGkGCuuYRpYE4giHtuMDIxCWLePKJKOaOIq8o6P
 A1557EvSbKuLQGI9X50jzJOoBE3TKRQYkzuM1GthdOF8RHaMNcFy44lDNO030HwZ
 hzwAIg5Izhu16PqZGyEdIQ6SJTv3isRJWEciPnOsijvjl1li3ehMdQfhGISa/jZO
 ivEGd32kaactiT0jJ5OyexergEViCPVKCIORksSIk46L84luDva9L22A3yu0mf3F
 ixB63bAiLH7Py77kH3DmeJdqhMxlVZXCbdBVFDvzZvY4O3Mx0Dv9mmN/nw1FVCFH
 scSYnXea9/o4IY5yGASU6FAUJEEGu20HAN12oHJw7/taqV/gbbEos3F7AGmjJE0f
 qe6Rt/8fwi7Lhm2va6EoOo6yltH/gL4/AgnsN76VzppNGbaIv7W8Qa4Y/ES1lAE1
 SATAEUJfU8kiLrVOolIElPbgfdJwv8TzoxiKB5wK/eoH20wf4BTmOuBMviaL2qXK
 Sz6wihq+IlMXW7Y7pIl/
 =DrA+
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "Cannonlake and Vega12 support are probably the two major things. This
  pull lacks nouveau, Ben had some unforseen leave and a few other
  blockers so we'll see how things look or maybe leave it for this merge
  window.

  core:
   - Device links to handle sound/gpu pm dependency
   - Color encoding/range properties
   - Plane clipping into plane check helper
   - Backlight helpers
   - DP TP4 + HBR3 helper support

  amdgpu:
   - Vega12 support
   - Enable DC by default on all supported GPUs
   - Powerplay restructuring and cleanup
   - DC bandwidth calc updates
   - DC backlight on pre-DCE11
   - TTM backing store dropping support
   - SR-IOV fixes
   - Adding "wattman" like functionality
   - DC crc support
   - Improved DC dual-link handling

  amdkfd:
   - GPUVM support for dGPU
   - KFD events for dGPU
   - Enable PCIe atomics for dGPUs
   - HSA process eviction support
   - Live-lock fixes for process eviction
   - VM page table allocation fix for large-bar systems

  panel:
   - Raydium RM68200
   - AUO G104SN02 V2
   - KEO TX31D200VM0BAA
   - ARM Versatile panels

  i915:
   - Cannonlake support enabled
   - AUX-F port support added
   - Icelake base enabling until internal milestone of forcewake support
   - Query uAPI interface (used for GPU topology information currently)
   - Compressed framebuffer support for sprites
   - kmem cache shrinking when GPU is idle
   - Avoid boosting GPU when waited item is being processed already
   - Avoid retraining LSPCON link unnecessarily
   - Decrease request signaling latency
   - Deprecation of I915_SET_COLORKEY_NONE
   - Kerneldoc and compiler warning cleanup for upcoming CI enforcements
   - Full range ycbcr toggling
   - HDCP support

  i915/gvt:
   - Big refactor for shadow ppgtt
   - KBL context save/restore via LRI cmd (Weinan)
   - Properly unmap dma for guest page (Changbin)

  vmwgfx:
   - Lots of various improvements

  etnaviv:
   - Use the drm gpu scheduler
   - prep work for GC7000L support

  vc4:
   - fix alpha blending
   - Expose perf counters to userspace

  pl111:
   - Bandwidth checking/limiting
   - Versatile panel support

  sun4i:
   - A83T HDMI support
   - A80 support
   - YUV plane support
   - H3/H5 HDMI support

  omapdrm:
   - HPD support for DVI connector
   - remove lots of static variables

  msm:
   - DSI updates from 10nm / SDM845
   - fix for race condition with a3xx/a4xx fence completion irq
   - some refactoring/prep work for eventual a6xx support (ie. when we
     have a userspace)
   - a5xx debugfs enhancements
   - some mdp5 fixes/cleanups to prepare for eventually merging
     writeback
   - support (ie. when we have a userspace)

  tegra:
   - mmap() fixes for fbdev devices
   - Overlay plane for hw cursor fix
   - dma-buf cache maintenance support

  mali-dp:
   - YUV->RGB conversion support

  rockchip:
   - rk3399/chromebook fixes and improvements

  rcar-du:
   - LVDS support move to drm bridge
   - DT bindings for R8A77995
   - Driver/DT support for R8A77970

  tilcdc:
   - DRM panel support"

* tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux: (1646 commits)
  drm/i915: Fix hibernation with ACPI S0 target state
  drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  drm/i915: Specify which engines to reset following semaphore/event lockups
  drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
  drm/amdkfd: Use ordered workqueue to restore processes
  drm/amdgpu: Fix acquiring VM on large-BAR systems
  drm/amd/pp: clean header file hwmgr.h
  drm/amd/pp: use mlck_table.count for array loop index limit
  drm: Fix uabi regression by allowing garbage mode->type from userspace
  drm/amdgpu: Add an ATPX quirk for hybrid laptop
  drm/amdgpu: fix spelling mistake: "asssert" -> "assert"
  drm/amd/pp: Add new asic support in pp_psm.c
  drm/amd/pp: Clean up powerplay code on Vega12
  drm/amd/pp: Add smu irq handlers for legacy asics
  drm/amd/pp: Fix set wrong temperature range on smu7
  drm/amdgpu: Don't change preferred domian when fallback GTT v5
  drm/vmwgfx: Bump version patchlevel and date
  drm/vmwgfx: use monotonic event timestamps
  drm/vmwgfx: Unpin the screen object backup buffer when not used
  drm/vmwgfx: Stricter count of legacy surface device resources
  ...
2018-04-02 07:59:23 -07:00
Felix Kuehling
6b95e7973a drm/amdkfd: Add quiesce_mm and resume_mm to kgd2kfd_calls
These interfaces allow KGD to stop and resume all GPU user mode queue
access to a process address space. This is needed for handling MMU
notifiers of userptrs mapped for GPU access in KFD VMs.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:32:32 -04:00
Felix Kuehling
d1853f42b6 drm/amdkfd: GFP_NOIO while holding locks taken in MMU notifier
When an MMU notifier runs in memory reclaim context, it can deadlock
trying to take locks that are already held in the thread causing the
memory reclaim. The solution is to avoid memory reclaim while holding
locks that are taken in MMU notifiers by using GFP_NOIO.

This commit fixes memory allocations done while holding the dqm->lock
which is needed in the MMU notifier (dqm->ops.evict_process_queues).

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:32:31 -04:00
Felix Kuehling
1679ae8f8f drm/amdkfd: Use ordered workqueue to restore processes
Restoring multiple processes concurrently can lead to live-locks
where each process prevents the other from validating all its BOs.

v2: fix duplicate check of same variable

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:30:36 -04:00
Felix Kuehling
72a01d231d drm/amdkfd: Deallocate SDMA queues correctly
Deallocate SDMA queues during abnormal process termination and when
queue creation fails after the SDMA allocation.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:30:34 -04:00
Felix Kuehling
c70a362687 drm/amdkfd: Fix scratch memory with HWS enabled
Program sh_hidden_private_base_vmid correctly in the map-process
PM4 packet.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:30:33 -04:00
Felix Kuehling
374200b154 drm/amdkfd: Add module option for testing large-BAR functionality
Simulate large-BAR system by exporting only visible memory. This
limits the amount of available VRAM to the size of the BAR, but
enables CPU access to VRAM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:53 -04:00
Felix Kuehling
0fc8011f89 drm/amdkfd: Kmap event page for dGPUs
The events page must be accessible in user mode by the GPU and CPU
as well as in kernel mode by the CPU. On dGPUs user mode virtual
addresses are managed by the Thunk's GPU memory allocation code.
Therefore we can't allocate the memory in kernel mode like we do
on APUs. But KFD still needs to map the memory for kernel access.
To facilitate this, the Thunk provides the buffer handle of the
events page to KFD when creating the first event.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:52 -04:00
Felix Kuehling
5ec7e02854 drm/amdkfd: Add ioctls for GPUVM memory management
v2:
* Fix error handling after kfd_bind_process_to_device in
  kfd_ioctl_map_memory_to_gpu
v3:
* Add ioctl to acquire VM from a DRM FD
v4:
* Return number of successful map/unmap operations in failure cases
* Facilitate partial retry after failed map/unmap
* Added comments with parameter descriptions to new APIs
* Defined AMDKFD_IOC_FREE_MEMORY_OF_GPU write-only

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:51 -04:00
Felix Kuehling
552764b680 drm/amdkfd: Add TC flush on VMID deallocation for Hawaii
On GFX7 the CP does not perform a TC flush when queues are unmapped.
To avoid TC eviction from accessing an invalid VMID, flush it
explicitly before releasing a VMID.

v2: Fix unnecessary list_for_each_entry_safe
v3: Moved allocation to kfd_process_device_init_vm

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:50 -04:00
Felix Kuehling
f35751b870 drm/amdkfd: Allocate CWSR trap handler memory for dGPUs
Add helpers for allocating GPUVM memory in kernel mode and use them
to allocate memory for the CWSR trap handler.

v2: Use dev instead of pdd->dev in kfd_process_free_gpuvm
v3:
* Cleaned up and simplified kfd_process_alloc_gpuvm
* Moved allocation for dGPU to kfd_process_device_init_vm

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:49 -04:00
Felix Kuehling
52b29d7334 drm/amdkfd: Add per-process IDR for buffer handles
Also used for cleaning up on process termination.

v2: Refactored cleanup on process termination

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:48 -04:00
Felix Kuehling
d01994c24c drm/amdkfd: Aperture setup for dGPUs
Set up the GPUVM aperture for SVM (shared virtual memory) that allows
sharing a part of virtual address space between GPUs and CPUs.

Report the size of the GPUVM aperture that is supported by KGD accurately.

The low part of the GPUVM aperture is reserved for kernel use. This is
for kernel-allocated buffers that are only accessed on the GPU:
- CWSR trap handler
- IB for submitting commands in user-mode context from kernel mode

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:47 -04:00
Felix Kuehling
c7bcbfa4f8 drm/amdkfd: Remove limit on number of GPUs
Currently the number of GPUs is limited by aperture placement options
available on GFX7 and GFX8 hardware. This limitation is not necessary.
Scratch and LDS represent per-work-item and per-work-group storage
respectively. Different work-items and work-groups use the same virtual
address to access their own data. Work running on different GPUs is by
definition in different work-groups (different dispatches, in fact).
That means the same virtual addresses can be used for these apertures
on different GPUs.

Add a new AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl that removes the
artificial limitation on the number of GPUs that can be supported. The
new ioctl allows user mode to query the number of GPUs to allocate
enough memory for all GPUs to be reported.

This deprecates AMDKFD_IOC_GET_PROCESS_APERTURES.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:46 -04:00
Oak Zeng
7c9b717196 drm/amdkfd: Populate DRM render device minor
Populate DRM render device minor in kfd topology

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:45 -04:00
Felix Kuehling
b84394e206 drm/amdkfd: Create KFD VMs on demand
Instead of creating all VMs on process creation, create them when
a process is bound to a device. This will later allow registering
an existing VM from a DRM render node FD at runtime, before the
process is bound to the device. This way the render node VM can be
used for KFD instead of creating our own redundant VM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:44 -04:00
Arnd Bergmann
48a4438718 drm/amdkfd: fix uninitialized variable use
When CONFIG_ACPI is disabled, we never initialize the acpi_table
structure in kfd_create_crat_image_virtual:

drivers/gpu/drm/amd/amdkfd/kfd_crat.c: In function 'kfd_create_crat_image_virtual':
drivers/gpu/drm/amd/amdkfd/kfd_crat.c:888:40: error: 'acpi_table' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The undefined behavior also happens for any other acpi_get_table()
failure, but then the compiler can't warn about it.

This adds an error check that prevents the structure from
being used in error, avoiding both the undefined behavior and
the warning about it.

Fixes: 520b8fb755 ("drm/amdkfd: Add topology support for CPUs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:49:40 +01:00
Felix Kuehling
26103436da drm/amdkfd: Implement KFD process eviction/restore
When the TTM memory manager in KGD evicts BOs, all user mode queues
potentially accessing these BOs must be evicted temporarily. Once
user mode queues are evicted, the eviction fence is signaled,
allowing the migration of the BO to proceed.

A delayed worker is scheduled to restore all the BOs belonging to
the evicted process and restart its queues.

During suspend/resume of the GPU we also evict all processes to allow
KGD to save BOs in system memory, since VRAM will be lost.

v2:
* Account for eviction when updating of q->is_active in MQD manager

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-02-06 20:32:45 -05:00
Felix Kuehling
403575c44e drm/amdkfd: Add GPUVM virtual address space to PDD
Create/destroy the GPUVM context during PDD creation/destruction.
Get VM page table base and program it during process registration
(HWS) or VMID allocation (non-HWS).

v2:
* Used dev instead of pdd->dev in kfd_flush_tlb

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-02-06 20:32:44 -05:00
Harish Kasiviswanathan
4252bf6866 drm/amdkfd: Remove unaligned memory access
Unaligned atomic operations can cause problems on some CPU
architectures. Use simpler bitmask operations instead. Atomic bit
manipulations are not necessary since dqm->lock is held during these
operations.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-02-06 20:32:42 -05:00
Gustavo A. R. Silva
2e3dca5365 drm/amdkfd: Fix potential NULL pointer dereferences
In case kfd_get_process_device_data returns null, there are some
null pointer dereferences in functions kfd_bind_processes_to_device
and kfd_unbind_processes_from_device.

Fix this by printing a WARN_ON for PDDs that aren't found and skip
them with continue statements.

Addresses-Coverity-ID: 1463794 ("Dereference null return value")
Addresses-Coverity-ID: 1463772 ("Dereference null return value")
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-10 17:15:09 -06:00
Oded Gabbay
a1235e10ee drm/amdkfd: add ull suffix to 64bit defines
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-01-10 12:55:17 +02:00
Yong Zhao
40a526dc1e drm/amdkfd: don't always call execute_queues_cpsch()
When destroying an inactive queue, we don't need to call
execute_queues_cpsch.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-02 13:10:50 -05:00
Yong Zhao
9e8272240b drm/amdkfd: Fix return value 0 when execute_queues_cpsch fails
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-02 13:10:49 -05:00
Harish Kasiviswanathan
b441093e40 drm/amdkfd: Ignore ACPI CRAT for non-APU systems
Some AMD motherboards without an APU have a broken CRAT table which
causes KFD initialization failures or incorrect information about
NUMA nodes, CPU cores or system memory. Ignore CRAT tables without
GPUs and rely on KFD's code to create a CRAT table for the CPU.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:04 -05:00
Felix Kuehling
ebcfd1e276 drm/amdkfd: Module option to disable CRAT table
Some systems have broken CRAT tables. Add a module option to ignore
a CRAT table.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:03 -05:00
Ben Goz
413e85d5d3 drm/amdkfd: Add AQL Queue Memory flag on topology
This is needed for enabling a user-mode workaround for an AQL queue
wrapping HW bug on Tonga.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:02 -05:00
Philip Cox
70f372bffc drm/amdkfd: Fixup incorrect info in the CZ CRAT table
* Wrong value for max_waves_per_simd
* Missing ATC capability bit

Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:01 -05:00
Amber Lin
f475734729 drm/amdkfd: Add perf counters to topology
For hardware blocks whose performance counters are accessed via MMIO
registers, KFD provides the support for those privileged blocks.

IOMMU is one of those privileged blocks. Most performance counter properties
required by Thunk are available at /sys/bus/event_source/devices/amd_iommu.

This patch adds properties to topology in KFD sysfs for information not
available in /sys/bus/event_source/devices/amd_iommu. They are shown at
/sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/iommu/ formatted as
/sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/<block>/<property>, i.e.
/sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/iommu/max_concurrent.

For dGPUs, who don't have IOMMU, nothing appears under
/sys/devices/virtual/kfd/kfd/topology/nodes/0/perf.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:00 -05:00
Harish Kasiviswanathan
3a87177eb1 drm/amdkfd: Add topology support for dGPUs
Generate and parse VCRAT tables for dGPUs in kfd_topology_add_device.

Some information that isn't available in the CRAT table is patched
into the topology after parsing.

HSA_CAP_DOORBELL_TYPE_1_0 is dependent on the ASIC feature
CP_HQD_PQ_CONTROL.SLOT_BASED_WPTR, which was not introduced in VI
until Carrizo. Report HSA_CAP_DOORBELL_TYPE_PRE_1_0 on Tonga ASICs.

v2: Added #include <linux/pci.h> to kfd_crat.c to make it compile

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:59 -05:00
Felix Kuehling
520b8fb755 drm/amdkfd: Add topology support for CPUs
Currently, the KFD topology information is generated by parsing the CRAT
(ACPI) table. However, at present CRAT table is available only for AMD
APUs. To support CPUs on systems without a CRAT table, the KFD driver will
create a Virtual CRAT (VCRAT) table and then the existing code will parse
that table to generate topology.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:58 -05:00
Harish Kasiviswanathan
bc0c75a367 drm/amdkfd: Fix sibling_map[] size
Change kfd_cache_properties.sibling_map[256] to
kfd_cache_properties.sibling_map[32]. Since, CRAT uses bitmap for
sibling_map, it is more efficient to use bitmap in the kfd structure
also.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:57 -05:00
Felix Kuehling
175b926335 drm/amdkfd: Simplify counting of memory banks
Only count memory banks in one place. Ignore redundant num_banks
entry in crat_subtype_computeunit.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:56 -05:00
Felix Kuehling
42aa8793d7 drm/amdkfd: Turn verbose topology messages into pr_debug
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:55 -05:00
Harish Kasiviswanathan
4f2937bfff drm/amdkfd: sync IOLINK defines to thunk spec
Current thunk spec v1.07 dated Feb 1, 2016

v2: fix indentation

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:54 -05:00
Harish Kasiviswanathan
6d82eb0ef2 drm/amdkfd: Support enumerating non-GPU devices
Modify kfd_topology_enum_kfd_devices(..) function to support non-GPU
nodes. The function returned NULL when it encountered non-GPU (say CPU)
nodes. This caused kfd_ioctl_create_event and kfd_init_apertures to fail
for Intel + Tonga.

kfd_topology_enum_kfd_devices will now parse all the nodes and return
valid kfd_dev for nodes with GPU.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:53 -05:00
Harish Kasiviswanathan
4f449311e9 drm/amdkfd: Decouple CRAT parsing from device list update
Currently, CRAT parsing is intertwined with topology_device_list and
hence repeated calls to kfd_parse_crat_table() will fail. Decouple
kfd_parse_crat_table() and topology_device_list.

kfd_parse_crat_table() will parse CRAT and add topology devices to a
temporary list temp_topology_device_list and then
kfd_topology_update_device_list will move contents from temporary list to
master list.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:52 -05:00
Harish Kasiviswanathan
8e05247d4c drm/amdkfd: Reorganize CRAT fetching from ACPI
Reorganize and rename kfd_topology_get_crat_acpi function. In this way
acpi_get_table(..) needs to be called only once. This will also aid in
dGPU topology implementation.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:51 -05:00
Felix Kuehling
174de876d6 drm/amdkfd: Group up CRAT related functions
Take CRAT related functions out of kfd_topology.c and place them in
kfd_crat.c. This is the initial step of supporting more CRAT features,
i.e. creating virtual CRAT table for KFD devices without CRAT.

v2: Minor cleanup that was missed previously because code moved around

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:49 -05:00
Yong Zhao
5108d76840 drm/amdkfd: Fix memory leaks in kfd topology
Kobject created using kobject_create_and_add() can be freed using
kobject_put() when there is no referenece any more. However,
kobject memory allocated with kzalloc() has to set up a release
callback in order to free it when the counter decreases to 0.
Otherwise it causes memory leak.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:48 -05:00
Harish Kasiviswanathan
d63f0ba27a drm/amdkfd: Topology: Fix location_id
Fix location_id format to match Thunk specification.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:47 -05:00
Flora Cui
f7ce2fade6 drm/amdkfd: Update number of compute unit from KGD
Overwrite the active simd_count from KGD at driver loading time. This is
based on assumption that register GC_USER_SHADER_ARRAY_CONFIG won’t get
changed.

V2: remove the incorrect simd_count reported at loading module.

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed by: Yair Shachar< yair.shachar@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:46 -05:00
Harish Kasiviswanathan
0504cccf34 drm/amdkfd: Stop using get_vmem_size KGD-KFD interface
get_vmem_size() is deprecated. Instead use get_local_mem_info().

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:43 -05:00
Felix Kuehling
64d1c3a43a drm/amdkfd: Centralize IOMMUv2 code and make it conditional
dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
ASIC information. Also allow building KFD without IOMMUv2 support.
This is still useful for dGPUs and prepares for enabling KFD on
architectures that don't support AMD IOMMUv2.

v2:
* Centralize IOMMUv2 code to avoid #ifdefs in too many places

v3:
* Imply AMD_IOMMU_V2 in Kconfig

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 19:22:12 -05:00
Felix Kuehling
a3084e6c52 drm/amdkfd: Add dGPU device IDs and device info
v2: remove needs_iommu field as it doesn't exists

CC: linux-pci@vger.kernel.org
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:47 -05:00
Felix Kuehling
1d63669885 drm/amdkfd: Add dGPU support to kernel_queue_init
Recognize dGPU ASIC families.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:46 -05:00
Felix Kuehling
ee04955af6 drm/amdkfd: Add dGPU support to the MQD manager
On dGPUs don't set ATC addressing bits and use MTYPE_UC for coherent
memory.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:45 -05:00
Felix Kuehling
97672cbe3d drm/amdkfd: Add dGPU support to the device queue manager
GFXv7 and v8 dGPUs use a different addressing mode for KFD compared
to APUs (GPUVM64 vs HSA64). And dGPUs don't support MTYPE_CC. They
use MTYPE_UC instead for memory that requires coherency.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:44 -05:00
Felix Kuehling
d146c5a719 drm/amdkfd: Make sched_policy a per-device setting
Some dGPUs don't support HWS. Allow them to use a per-device
sched_policy that may be different from the global default.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:43 -05:00
Felix Kuehling
3ee2d00cfb drm/amdkfd: Conditionally enable PCIe atomics
This will be needed for most dGPUs.

CC: linux-pci@vger.kernel.org
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:41 -05:00
Gustavo A. R. Silva
3f866f5f04 drm/amdkfd: Use ARRAY_SIZE macro in kfd_build_sysfs_node_entry
Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element.

This issue was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Felix Kuehling<Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-18 18:39:55 -06:00
Yong Zhao
c0ede1f8dc drm/amdkfd: Simplify locking during process creation
Also fixes error handling if kfd_process_init_cwsr fails.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:56 -05:00
Felix Kuehling
de1450a559 drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:55 -05:00
Felix Kuehling
2d9b36f983 drm/amdkfd: Reduce nesting in kfd_create_process_device_data
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:54 -05:00
Yong Zhao
82c16b4280 drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails
If no matching process is found, return NULL instead of a pointer
to the last process in the kfd_processes_table.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:53 -05:00
Felix Kuehling
abb208a8d4 drm/amdkfd: Use ref count to prevent kfd_process destruction
Use a reference counter instead of a lock to prevent process
destruction while functions running out of process context are using
the kfd_process structure. In many cases these functions don't need
the structure to be locked. In the few cases that really do need the
process lock, take it explicitly.

This helps simplify lock dependencies between the process lock and
other locks, particularly amdgpu and mm_struct locks. This will be
important when amdgpu calls back to amdkfd for memory evictions.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:52 -05:00
Felix Kuehling
5ce10687ae drm/amdkfd: Make kfd_process reference counted
This will be used to elliminate the use of the process lock for
preventing concurrent process destruction. This will simplify lock
dependencies between KFD and KGD.

This also simplifies the process destruction in a few ways:
* Don't allocate work struct dynamically
* Remove unnecessary hack that increments mm reference counter
* Remove unnecessary process locking during destruction

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:51 -05:00
Felix Kuehling
c7b1243eef drm/amdkfd: Get reference to lead_thread task struct
Increment the kfd_process.lead_thread's reference counter to make
it safe to dereference. This is needed for getting a safe reference
to the process' mm_struct.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:50 -05:00
Felix Kuehling
851a645efd drm/amdkfd: Add debugfs support to KFD
This commit adds several debugfs entries for kfd:

kfd/hqds: dumps all HQDs on all GPUs for KFD-controlled compute and
    SDMA RLC queues

kfd/mqds: dumps all MQDs of all KFD processes on all GPUs

kfd/rls: dumps HWS runlists on all GPUs

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:49 -05:00
Felix Kuehling
36582fa516 drm/amdkfd: Fix oversubscription accounting
Don't count SDMA queues towards compute HQD oversubscription when
deciding whether to create a chained runlist.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:46 -05:00
Felix Kuehling
a99c6d4fdc drm/amdkfd: map multiple processes to HW scheduler
Allow HWS to to execute multiple processes on the hardware
concurrently. The number of concurrent processes is limited by
the number of VMIDs allocated to the HWS.

A module parameter can be used for limiting this further or turn
it off altogether (mainly for debugging purposes).

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:45 -05:00
Kent Russell
8f8fb9b9d0 drm/amdkfd: Fix printing pointer cast
Just print a pointer instead of casting

v2: Remove the 0x prefix, since %p prints that automatically, and remove
it from one other spot as well

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-04 06:50:17 -05:00
Philip Yang
3c0b428090 drm/amdkfd: Add crash protection in debugger register path
After debugger is registered, the pqm_destroy_queue fails because is_debug
is true, the queue should not be removed from process_queue_list since
the count is not reduced.

Test application calls debugger unregister without register debugger, add
null pointer check protection to avoid crash for this case

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:44 -05:00
Yong Zhao
b46cb7d70e drm/amdkfd: Delete a useless parameter from create_queue function pointer
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-24 18:10:54 -05:00
Felix Kuehling
d7b9bd2248 drm/amdkfd: Add support for user-mode trap handlers
A second-level user mode trap handler can be installed. The CWSR trap
handler jumps to the secondary trap handler conditionally for any
conditions not handled by it. This can be used e.g. for debugging or
catching math exceptions.

When CWSR is disabled, the user mode trap handler is installed as
first level trap handler.

Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-14 16:41:20 -05:00
Felix Kuehling
373d708089 drm/amdkfd: Add CWSR support
This hardware feature allows the GPU to preempt shader execution in
the middle of a compute wave, save the state and restore it later
to resume execution.

Memory for saving the state is allocated per queue in user mode and
the address and size passed to the create_queue ioctl. The size
depends on the number of waves that can be in flight simultaneously
on a given ASIC.

Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-14 16:41:19 -05:00
Felix Kuehling
449fea6126 drm/amdkfd: Add trap handler for CWSR
The trap handler is like an interrupt handler running on the GPU
compute unit. It is needed for supporting CWSR (compute wave
save/restore).

This file defines an array with the pre-compiled GFXv8 shader ISA.
The assembly code is included for reference in #if 0 ... #endif.

Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-14 16:41:18 -05:00
Felix Kuehling
b20cd0df15 drm/amdkfd: Cleanup qpd.pqm initialization
The PQM doesn't change after process creation. So initialize it in
kfd_create_process_device_data.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-14 16:41:17 -05:00
Felix Kuehling
115c8c4104 drm/amdkfd: Use order_base_2 to get log2 of buffes sizes
Replace (ffs(size) - 1) with order_base_2(size) as a more straight
forward way to get log2 of buffer sizes.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-06 14:52:28 -05:00
Felix Kuehling
6d56693025 drm/amdkfd: Hardware DWORD size is 4 bytes
Don't use sizeof(uint32_t) or similar types for hardware or firmware
DWORD size. The hardware and firmware don't care about Linux types.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-06 14:52:27 -05:00
Philip Cox
5aaf2befd4 drm/amdkfd: Implement amdkfd SDMA functions for VI
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:22:02 -04:00
Felix Kuehling
97b9ad12ba drm/amdkfd: Use ASIC-specific SDMA MQD type
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:22:01 -04:00
Felix Kuehling
7ce66118aa drm/amd: Update kgd_kfd interface for resuming SDMA queues
Add wptr and mm parameters to hqd_sdma_load and pass these parameters
from device_queue_manager through the mqd_manager.

SDMA doesn't support polling while the engine believes it's idle. The
driver must update the wptr. The new parameters will be used for looking
up the updated value from the specified mm when SDMA queues are resumed
after being disabled.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:58 -04:00
Alex Deucher
e2874a3c8c drm/amdgpu: add license to Makefiles
Was missing license text.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 11:47:55 -05:00
Dave Airlie
662e704007 Merge tag 'drm-amdkfd-fixes-2017-11-26' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes
This is amdkfd pull request for -rc2. It contains three small fixes to the
CIK SDMA code, compilation error fix in kfd_ioctl.h and fix to accessing
a pointer after it was released.

* tag 'drm-amdkfd-fixes-2017-11-26' of git://people.freedesktop.org/~gabbayo/linux:
  uapi: fix linux/kfd_ioctl.h userspace compilation errors
  drm/amdkfd: fix amdkfd use-after-free GP fault
  drm/amdkfd: Fix SDMA oversubsription handling
  drm/amdkfd: Fix SDMA ring buffer size calculation
  drm/amdgpu: Fix SDMA load/unload sequence on HWS disabled mode
2017-12-01 09:14:46 +10:00
Randy Dunlap
c393e9b2d5 drm/amdkfd: fix amdkfd use-after-free GP fault
Fix GP fault caused by dev_info() reference to a struct device*
after the device has been freed (use after free).
kfd_chardev_exit() frees the device so 'kfd_device' should not
be used after calling kfd_chardev_exit().

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-26 11:31:32 +02:00
Felix Kuehling
8c946b8988 drm/amdkfd: Fix SDMA oversubsription handling
SDMA only supports a fixed number of queues. HWS cannot handle
oversubscription.

Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-26 11:31:32 +02:00
shaoyunl
d12fb13f23 drm/amdkfd: Fix SDMA ring buffer size calculation
ffs function return the position of the first bit set on 1 based.
(bit zero returns 1).

Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-26 11:31:32 +02:00
Linus Torvalds
e60e1ee606 main drm pull request for v4.15
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaCm8RAAoJEAx081l5xIa+zX0QAJSm31kCG3vdw2CNiRx25L3q
 3hcsEOgAjVJ9FQVGKFWjzb8TK35tSqtNx5kWIj0VGaIfBE5Bdg5SLLgKKUYas8rY
 4LaphqICq2uxu2BNa2tpiar/sHhAnuozwQ4czpVWXzlaISnb9yYzRl7gMuyUVGkx
 +Gih5VUhLmQC0HsRTLJ3vaZQoUsLAl2gAjKcWa1bx57j2S+iKOPfsLaq7VYo+y1I
 Njc+iSGqMhJzRLXVkxL2lQKaslp7R38Bbh5K4Kvyjkm4Aq7zErOF6irpOXKMcrGl
 mwnr89vf1G9thjikrBaXpKnuvdbWYveoN/ORMlTdCfxkFnChHLnm3bd7NJ49RXDN
 Hv/Iq9YYjmZ9GTatxnx7lWtmXnZXC5he1yn1JAuz/yt7/0b/Wx+Mu/wEpBXYNFTd
 1AZdD586i+AmPo3yDkqH9nBu8JC0W0AnS9VZma4LVvZOP2UfJmj5Im1CLHItbGDN
 FnUCkwyD/lJUUk+WgT+w/GOMJgmFHDiFFl4tFtYVVjrUirpCFVguSKG9xuv6tT8P
 8iRsoP7RrcmDN9ojN2SEHwcpsAv3HnKkDv+9+GIbWnrGsSbCPq8Qm+JDSvf4h22I
 K5lwNpJrcpSKI+q10L7w2xliTBwb98sJkWGA/rssomrdBOWteGZAyqFRYAVgQ+mJ
 x/nJurIqQYh2KQN9+uLG
 =xVV2
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.15.

  Core:
   - Atomic object lifetime fixes
   - Atomic iterator improvements
   - Sparse/smatch fixes
   - Legacy kms ioctls to be interruptible
   - EDID override improvements
   - fb/gem helper cleanups
   - Simple outreachy patches
   - Documentation improvements
   - Fix dma-buf rcu races
   - DRM mode object leasing for improving VR use cases.
   - vgaarb improvements for non-x86 platforms.

  New driver:
   - tve200: Faraday Technology TVE200 block.

     This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
     the StorLink SL3516 (later Cortina Systems CS3516) as well as the
     Grain Media GM8180.

  New bridges:
   - SiI9234 support

  New panels:
   - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
     LT089AC19000, Innolux AT043TN24

  i915:
   - Remove Coffeelake from alpha support
   - Cannonlake workarounds
   - Infoframe refactoring for DisplayPort
   - VBT updates
   - DisplayPort vswing/emph/buffer translation refactoring
   - CCS fixes
   - Restore GPU clock boost on missed vblanks
   - Scatter list updates for userptr allocations
   - Gen9+ transition watermarks
   - Display IPC (Isochronous Priority Control)
   - Private PAT management
   - GVT: improved error handling and pci config sanitizing
   - Execlist refactoring
   - Transparent Huge Page support
   - User defined priorities support
   - HuC/GuC firmware refactoring
   - DP MST fixes
   - eDP power sequencing fixes
   - Use RCU instead of stop_machine
   - PSR state tracking support
   - Eviction fixes
   - BDW DP aux channel timeout fixes
   - LSPCON fixes
   - Cannonlake PLL fixes

  amdgpu:
   - Per VM BO support
   - Powerplay cleanups
   - CI powerplay support
   - PASID mgr for kfd
   - SR-IOV fixes
   - initial GPU reset for vega10
   - Prime mmap support
   - TTM updates
   - Clock query interface for Raven
   - Fence to handle ioctl
   - UVD encode ring support on Polaris
   - Transparent huge page DMA support
   - Compute LRU pipe tweaks
   - BO flag to allow buffers to opt out of implicit sync
   - CTX priority setting API
   - VRAM lost infrastructure plumbing

  qxl:
   - fix flicker since atomic rework

  amdkfd:
   - Further improvements from internal AMD tree
   - Usermode events
   - Drop radeon support

  nouveau:
   - Pascal temperature sensor support
   - Improved BAR2 handling
   - MMU rework to support Pascal MMU

  exynos:
   - Improved HDMI/mixer support
   - HDMI audio interface support

  tegra:
   - Prep work for tegra186
   - Cleanup/fixes

  msm:
   - Preemption support for a5xx
   - Display fixes for 8x96 (snapdragon 820)
   - Async cursor plane fixes
   - FW loading rework
   - GPU debugging improvements

  vc4:
   - Prep for DSI panels
   - fix T-format tiling scanout
   - New madvise ioctl

  Rockchip:
   - LVDS support

  omapdrm:
   - omap4 HDMI CEC support

  etnaviv:
   - GPU performance counters groundwork

  sun4i:
   - refactor driver load + TCON backend
   - HDMI improvements
   - A31 support
   - Misc fixes

  udl:
   - Probe/EDID read fixes.

  tilcdc:
   - Misc fixes.

  pl111:
   - Support more variants

  adv7511:
   - Improve EDID handling.
   - HDMI CEC support

  sii8620:
   - Add remote control support"

* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
  drm/rockchip: analogix_dp: Use mutex rather than spinlock
  drm/mode_object: fix documentation for object lookups.
  drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
  drm/i915: Move init_clock_gating() back to where it was
  drm/i915: Prune the reservation shared fence array
  drm/i915: Idle the GPU before shinking everything
  drm/i915: Lock llist_del_first() vs llist_del_all()
  drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
  drm/i915: Disable lazy PPGTT page table optimization for vGPU
  drm/i915/execlists: Remove the priority "optimisation"
  drm/i915: Filter out spurious execlists context-switch interrupts
  drm/amdgpu: use irq-safe lock for kiq->ring_lock
  drm/amdgpu: bypass lru touch for KIQ ring submission
  drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
  drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
  drm/amd/powerplay: initialize a variable before using it
  drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
  drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
  drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
  drm/rockchip: add CONFIG_OF dependency for lvds
  ...
2017-11-15 20:42:10 -08:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Felix Kuehling
894a8293aa drm/amdkfd: Minor cleanups
These were missed previously when rebasing changes for upstreaming.

v2: Remove redundant sched_policy conditions

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:33 -04:00
Felix Kuehling
096d1a3efc drm/amdkfd: Update queue_count before mapping queues
map_queues_cpsch uses the queue_count to decide whether to upload
a new runlist. So update the counter before calling it.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:32 -04:00
Yong Zhao
bfd5e378a9 drm/amdkfd: Cleanup DQM ASIC-specific ops
Remove empty initialize function.

Rename register_process to update_qpd to avoid confusion with the
non-ASIC-specific register_process.

Shorten ops_asic_specific to asic_ops.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:31 -04:00
Ben Goz
5a29ad6b9e drm/amdkfd: Register/Deregister process on qpd resolution
Process registration needs to happen on each device. So use per-device
queue lists to determine when to register/deregister the process.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:30 -04:00
Yair Shachar
062c5672d5 drm/amdkfd: Fix debug unregister procedure on process termination
Take the dbgmgr lock and unregister before destroying the debug manager.
Do this before destroying the queues.

v2: Correct locking order in kfd_ioctl_dbg_register to ake sure the
process mutex and dbgmgr mutex are always taken in the same order.

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:29 -04:00
Yong Zhao
e2a8e99964 drm/amdkfd: Avoid calling amd_iommu_unbind_pasid() when suspending
When kfd suspending on APU, we do not need to call
amd_iommu_unbind_pasid(), because pasid will be unbound automatically
when power goes off.

On the other hand, calling amd_iommu_unbind_pasid() will trigger
kfd_process_iommu_unbind_callback() if the process is not terminating.
By design, kfd_process_iommu_unbind_callback() should only be called
for process terminating. So we would rather not to call
amd_iommu_unbind_pasid() when suspending.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:28 -04:00
Jay Cornwall
bba9662db7 drm/amdkfd: Disable CP/SDMA ring/doorbell in MQD
The MQD represents an inactive context and should not have ring or
doorbell enable bits set. Doing so interferes with HWS which streams
the MQD onto the HQD. If enable bits are set this activates the ring
or doorbell before the HQD is fully configured.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:27 -04:00
Yong Zhao
ab40cba303 drm/amdkfd: Clean up the data structure in kfd_process
A list of per-process queues is maintained in the
kfd_process_queue_manager, so the queues array in kfd_process is
redundant and in fact unused.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:26 -04:00
Christian König
f4fa88ab28 drm/radeon: deprecate and remove KFD interface
To quote Felix: "For testing KV with current user mode stack, please use
amdgpu. I don't expect this to work with radeon and I'm not planning to
spend any effort on making radeon work with a current user mode stack."

Only compile tested, but should be straight forward.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-30 14:16:21 +01:00
Andres Rodriguez
48e876a20e drm/amdkfd: use a high priority workqueue for IH work
In systems under heavy load the IH work may experience significant
scheduling delays.

Under load + system workqueue:
    Max Latency: 7.023695 ms
    Avg Latency: 0.263994 ms

Under load + high priority workqueue:
    Max Latency: 1.162568 ms
    Avg Latency: 0.163213 ms

Further work is required to measure the impact of per-cpu settings on IH
performance.

Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:34 -04:00
Andres Rodriguez
0f875e3f3e drm/amdkfd: wait only for IH work on IH exit
We don't need to wait for all work to complete in the IH exit function.
We only need to make sure the interrupt_work has finished executing to
guarantee that ih_kfifo is no longer in use.

Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:33 -04:00
Andres Rodriguez
27232055b1 drm/amdkfd: increase IH num entries to 8192
A larger buffer will let us accommodate applications with a large amount
of semi-simultaneous event signals.

Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:32 -04:00
Andres Rodriguez
04ad47bd14 drm/amdkfd: use standard kernel kfifo for IH
Replace our implementation of a lockless ring buffer with the standard
linux kernel kfifo.

We shouldn't maintain our own version of a standard data structure.

Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:31 -04:00
Felix Kuehling
b9a5d0a5db drm/amdkfd: Make event limit dependent on user mode mapping size
This allows increasing the KFD_SIGNAL_EVENT_LIMIT in kfd_ioctl.h
without breaking processes built with older kfd_ioctl.h versions.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:29 -04:00
Felix Kuehling
3f04f96148 drm/amdkfd: Use IH context ID for signal lookup
This speeds up signal lookup when the IH ring entry includes a
valid context ID or partial context ID. Only if the context ID is
found to be invalid, fall back to an exhaustive search of all
signaled events.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:28 -04:00
Felix Kuehling
482f07775c drm/amdkfd: Simplify event ID and signal slot management
Signal slots are identical to event IDs.

Replace the used_slot_bitmap and events hash table with an IDR to
allocate and lookup event IDs and signal slots more efficiently.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:27 -04:00
Felix Kuehling
50cb7dd94c drm/amdkfd: Simplify events page allocator
The first event page is always big enough to handle all events.
Handling of multiple events pages is not supported by user mode, and
not necessary.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:26 -04:00
Felix Kuehling
74e4071665 drm/amdkfd: Use wait_queue_t to implement event waiting
Use standard wait queues for waiting and waking up waiting threads
instead of inventing our own. We still have our own wait loop
because the HSA event semantics require the ability to have one
thread waiting on multiple wait queues (events) at the same time.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:25 -04:00
Felix Kuehling
ebf947fe93 drm/amdkfd: remove redundant kfd_event_waiter.input_index
This always identical with the index of the event_waiter in the array.
No need to store it in the waiter record.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:24 -04:00
Felix Kuehling
fe528c13ac drm/amdkfd: Fix event destruction with pending waiters
When an event with pending waiters is destroyed, those waiters may
end up sleeping forever unless they are notified and woken up.
Implement the notification by clearing the waiter->event pointer,
which becomes invalid anyway, when the event is freed, and waking
up the waiting tasks.

Waiters on an event that's destroyed return failure.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:23 -04:00
Felix Kuehling
fdf0c8332a drm/amdkfd: Clean up kfd_wait_on_events
Cleaned up the code while resolving some potential bugs and
inconsistencies in the process.

Clean-ups:
* Remove enum kfd_event_wait_result, which duplicates
  KFD_IOC_EVENT_RESULT definitions
* alloc_event_waiters can be called without holding p->event_mutex
* Return an error code from copy_signaled_event_data instead of bool
* Clean up error handling code paths to minimize duplication in
  kfd_wait_on_events

Fixes:
* Consistently return an error code from kfd_wait_on_events and set
  wait_result to KFD_IOC_WAIT_RESULT_FAIL in all failure cases.
* Always call free_waiters while holding p->event_mutex
* copy_signaled_event_data might sleep. Don't call it while the task state
  is TASK_INTERRUPTIBLE.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:22 -04:00
Sean Keely
d9aeec4cbb drm/amdkfd: Fix scheduler race in kfd_wait_on_events sleep loop
Signed-off-by: Sean Keely <sean.keely@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:21 -04:00
Sean Keely
1f9d09becb drm/amdkfd: Short cut for kfd_wait_on_events without waiting
If kfd_wait_on_events can return immediately, we don't need to populate
the wait list and don't need to enter the sleep-loop.

Signed-off-by: Sean Keely <sean.keely@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:20 -04:00
Felix Kuehling
9b56bb1154 drm/amdkfd: Don't dereference kfd_process.mm
The kfd_process doesn't own a reference to the mm_struct, so it can
disappear without warning even while the kfd_process still exists.

Therefore, avoid dereferencing the kfd_process.mm pointer and make
it opaque. Use get_task_mm to get a temporary reference to the mm
when it's needed.

v2: removed unnecessary WARN_ON

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:19 -04:00
Besar Wicaksono
66b783b446 drm/amdkfd: Add SDMA trap src id to the KFD isr wanted list
This enables SDMA signalling with event interrupt.

Signed-off-by: Besar Wicaksono <Besar.Wicaksono@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:18 -04:00
shaoyunl
e139cd2a2f drm/amdkfd: Improve multiple SDMA queues support per process
HWS does not support over-subscription and the scheduler can not internally
modify the engine. Driver needs to program the correct engine ID.

Fix the queue and engine selection to create queues on alternating SDMA
engines. This allows concurrent bi-directional DMA transfers in a process
that creates two SDMA queues.

Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:56 -04:00
Felix Kuehling
36c2d7eb5e drm/amdkfd: Limit queue number per process and device to 127
HWS uses bit 7 in the queue number of the map process packet for an
undocumented feature. Therefore the queue number per process and
device must be 127 or less.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:55 -04:00
Felix Kuehling
bc920fd4f4 drm/amdkfd: Clean up process queue management
Removed unused num_concurrent_processes.

Implemented counting of queues in QPD. This makes counting the queue
list repeatedly in several places unnecessary.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:54 -04:00
Yong Zhao
e6f791b1b0 drm/amdkfd: Compress unnecessary function parameters
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:53 -04:00
Felix Kuehling
9fd3f1bfae drm/amdkfd: Improve process termination handling
Separate device queue termination from process queue manager
termination. Unmap all queues at once instead of one at a time.
Unmap device queues before the PASID is unbound, in the
kfd_process_iommu_unbind_callback.

When resetting wavefronts in non-HWS mode, do it before the VMID is
released.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:52 -04:00
Yong Zhao
c4744e243c drm/amdkfd: Avoid submitting an unnecessary packet to HWS
v2:
Make queue mapping interfaces more consistent by passing unmap filter
parameters directly to execute_queues_cpsch, same as unmap_queues_cpsch.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:51 -04:00
Felix Kuehling
60a0095657 drm/amdkfd: Fix MQD updates
When a queue is mapped, the MQD is owned by the FW. The FW overwrites
the MQD on the next unmap operation. Therefore the queue must be
unmapped before updating the MQD.

For the non-HWS case, also fix disabling of queues and creation of
queues in disabled state.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:50 -04:00
Yong Zhao
4465f466c7 drm/amdkfd: Pass filter params to unmap_queues_cpsch
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-08 14:57:52 +03:00
Yong Zhao
ac30c78384 drm/amdkfd: move locking outside of unmap_queues_cpsch
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-08 14:57:18 +03:00
Yong Zhao
7da2bcf876 drm/amdkfd: Avoid name confusion involved in queue unmapping
When unmapping the queues from HW scheduler, there are two actions:
reset and preempt. So naming the variables with only preempt is
inapproriate.

For functions such as destroy_queues_cpsch, what they do actually is to
unmap the queues on HW scheduler rather than to destroy them. Change the
name to reflect that fact. On the other hand, there is already a function
called destroy_queue_cpsch() which exactly destroys a queue, and the name
is very close to destroy_queues_cpsch(), resulting in confusion.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:48 -04:00
Felix Kuehling
c986169fde drm/amdkfd: Print event limit messages only once per process
To avoid spamming the log.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:22 -04:00
Yong Zhao
cb1d996746 drm/amdkfd: Fix kernel-queue wrapping bugs
Avoid intermediate negative numbers when doing calculations with a mix
of signed and unsigned variables where implicit conversions can lead
to unexpected results.

When kernel queue buffer wraps around to 0, we need to check that rptr
won't be overwritten by the new packet.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:21 -04:00
Yong Zhao
58dcd5bfcf drm/amdkfd: Drop _nocpsch suffix from shared functions
Several functions in DQM are shared between cpsch and nocpsch code.
Remove the misleading _nocpsch suffix from their names.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:20 -04:00
Yong Zhao
e596b90338 drm/amdkfd: Reuse CHIP_* from amdgpu v2
There are already CHIP_* definitions under amd_shared.h file on amdgpu
side, so KFD should reuse them rather than defining new ones.

Using enum for asic type requires default cases on switch statements
to prevent compiler warnings. WARN on unsupported ASICs. It should never
get there because KFD should not be initialized on unsupported devices.

v2: Replace BUG() with WARN and error return

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:19 -04:00
Yong Zhao
44008d7a87 drm/amdkfd: Use VMID bitmap from KGD v2
The hard-coded values related to VMID were removed in KFD, as those
values can be calculated in the KFD initialization function.

v2: remove unnecessary local variable

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:18 -04:00
Felix Kuehling
b22666febf drm/amdkfd: Fix incorrect destroy_mqd parameter
When uninitializing a kernel queue.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:17 -04:00
Felix Kuehling
b90e3fbecc drm/amdkfd: Adjust dequeue latencies and timeouts
Adjust latencies and timeouts for dequeueing with HWS and consolidate
them in one place. Make them longer to allow long running waves to
complete without causing a timeout. The timeout is twice as long as the
latency plus some buffer to make sure we don't detect a timeout
prematurely.

Change timeouts for dequeueing HQDs without HWS. KFD_UNMAP_LATENCY is
more consistent with what the HWS does for user queues.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:16 -04:00
Yong Zhao
8c72c3d7df drm/amdkfd: Rectify the jiffies calculation error with milliseconds v2
The timeout in milliseconds should not be regarded as jiffies. This
commit fixed that.

v2:
- use msecs_to_jiffies
- change timeout_ms parameter to unsigned int to match msecs_to_jiffies

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:15 -04:00
Yong Zhao
733fa1f742 drm/amdkfd: Fix suspend/resume issue on Carrizo v2
When we do suspend/resume through "sudo pm-suspend" while there is
HSA activity running, upon resume we will encounter HWS hanging, which
is caused by memory read/write failures. The root cause is that when
suspend, we neglected to unbind pasid from kfd device.

Another major change is that the bind/unbinding is changed to be
performed on a per process basis, instead of whether there are queues
in dqm.

v2:
- free IOMMU device if kfd_bind_processes_to_device fails in kfd_resume
- add comments to kfd_bind/unbind_processes_to/from_device
- minor cleanups

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:14 -04:00
Yong Zhao
b8935a7c4b drm/amdkfd: Reorganize kfd resume code
The idea is to let kfd init and resume function share the same code path
as much as possible, rather than to have two copies of almost identical
code. That way improves the code readability and maintainability.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:13 -04:00
Dave Airlie
ebec44a245 Linux 4.14-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZ0WQ6AAoJEHm+PkMAQRiGuloH/3sF4qfBhPuJo8OTf0uCtQ18
 4Ux9zZbm81df/Jjz0exAp1Jqk+TvdIS3OXPWcKilvbUBP16hQcsxFTnI/5QF+YcN
 87aNr+OCMJzOBK4suN1yhzO46NYHeIizdB0PTZVL1Zsto69Tt31D8VJmgH6oBxAw
 Isb/nAkOr31dZ9PI5UEExTIanUt6EywVb0UswA+2rNl3h1UkeasQCpMpK2n6HBhU
 kVD7sxEd/CN0MmfhB0HrySSam/BeSpOtzoU9bemOwrU2uu9+5+2rqMe7Gsdj4nX6
 3Kk+7FQNktlrhxCZIFN/+CdusOUuDd8r/75d7DnsRK5YvSb0sZzJkfD3Nba68Ms=
 =7J2+
 -----END PGP SIGNATURE-----

BackMerge tag 'v4.14-rc3' into drm-next

Linux 4.14-rc3

Requested by Daniel for the tracing build fix in fixes.
2017-10-03 09:35:04 +10:00
Dave Airlie
754270c7c5 Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux into drm-next
First feature pull for 4.15.  Highlights:
- Per VM BO support
- Lots of powerplay cleanups
- Powerplay support for CI
- pasid mgr for kfd
- interrupt infrastructure for recoverable page faults
- SR-IOV fixes
- initial GPU reset for vega10
- prime mmap support
- ttm page table debugging improvements
- lots of bug fixes

* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (232 commits)
  drm/amdgpu: clarify license in amdgpu_trace_points.c
  drm/amdgpu: Add gem_prime_mmap support
  drm/amd/powerplay: delete dead code in smumgr
  drm/amd/powerplay: delete SMUM_FIELD_MASK
  drm/amd/powerplay: delete SMUM_WAIT_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_READ_FIELD
  drm/amd/powerplay: delete SMUM_SET_FIELD
  drm/amd/powerplay: delete SMUM_READ_VFPF_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_WRITE_VFPF_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_WRITE_FIELD
  drm/amd/powerplay: delete SMU_WRITE_INDIRECT_FIELD
  drm/amd/powerplay: move macros to hwmgr.h
  drm/amd/powerplay: move PHM_WAIT_VFPF_INDIRECT_FIELD to hwmgr.h
  drm/amd/powerplay: move SMUM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL to hwmgr.h
  drm/amd/powerplay: move SMUM_WAIT_INDIRECT_FIELD_UNEQUAL to hwmgr.h
  drm/amd/powerplay: add new helper functions in hwmgr.h
  drm/amd/powerplay: use SMU_IND_INDEX/DATA_11 pair
  drm/amd/powerplay: refine powerplay code.
  drm/amd/powerplay: delete dead code in hwmgr.h
  drm/amd/powerplay: refine interface in struct pp_smumgr_func
  ...
2017-09-28 08:37:02 +10:00
Felix Kuehling
d2791c4563 drm/amdkfd: Use PASID manager from KGD
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 13:07:04 -04:00
Felix Kuehling
a91e70e30c drm/amdkfd: Separate doorbell allocation from PASID
PASID management is moving into KGD. Limiting the PASID range to the
number of doorbell pages is no longer practical.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 13:07:03 -04:00
Colin Ian King
bfaa1ce809 drm/amdkfd: check for null dev to avoid a null pointer dereference
The call to kfd_device_by_id can potentially return null, so check that
dev is null and return with -EINVAL to avoid a null pointer dereference.

Detected by CoverityScan CID#1454629 ("Dereference null return value")

Fixes: 5d71dbc3a5 ("drm/amdkfd: Implement image tiling mode support v2")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-08 15:13:33 +01:00
Oded Gabbay
1fabbf7811 drm/amdkfd: pass queue's mqd when destroying mqd
In VI, the destroy mqd function needs to inquire fields present in the mqd
structure. That's why we need to pass it to that function instead of NULL.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-02 15:00:25 +03:00
Himanshu Jha
50dad5fb59 drm/amdkfd: remove memset before memcpy
calling memcpy immediately after memset with the same region of memory
makes memset redundant.

Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-30 00:33:35 +05:30
Yong Zhao
5d71dbc3a5 drm/amdkfd: Implement image tiling mode support v2
v2: Removed hole in ioctl number space

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:22 -04:00
Moses Reuben
6a1c951069 drm/amdkfd: Adding new IOCTL for scratch memory v2
v2:
* Renamed ALLOC_MEMORY_OF_SCRATCH to SET_SCRATCH_BACKING_VA
* Removed size parameter from the ioctl, it was unused
* Removed hole in ioctl number space
* No more call to write_config_static_mem
* Return correct error code from ioctl

Signed-off-by: Moses Reuben <moses.reuben@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:20 -04:00
Felix Kuehling
70539bd795 drm/amd: Update MEC HQD loading code for KFD
Various bug fixes and improvements that accumulated over the last two
years.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:17 -04:00
Felix Kuehling
507968dd9e drm/amdkfd: Update PM4 packet headers
To match current firmware. The map process packet has been extended
to support scratch. This is a non-backwards compatible change and
it's about two years old. So no point keeping the old version around
conditionally.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:15 -04:00
Jay Cornwall
af68d87cac drm/amdkfd: Clamp EOP queue size correctly on Gfx8
Gfx8 HW incorrectly clamps CP_HQD_EOP_CONTROL.EOP_SIZE, which can
lead to scheduling deadlock due to SE EOP done counter overflow.

Enforce a EOP queue size limit which prevents the CP from sending
more than 0xFF events at a time.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:14 -04:00
Yong Zhao
4ebc718274 drm/amdkfd: Add more error printing to help bringup v2
v2: Turned WARN into dev_warn and made the message more helpful

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:13 -04:00
Felix Kuehling
32fa821958 drm/amdkfd: Handle remaining BUG_ONs more gracefully v2
In most cases, BUG_ONs can be replaced with WARN_ON with an error
return. In some void functions just turn them into a WARN_ON and
possibly an early exit.

v2:
* Cleaned up error handling in pm_send_unmap_queue
* Removed redundant WARN_ON in kfd_process_destroy_delayed

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:12 -04:00
Felix Kuehling
8625ff9c0b drm/amdkfd: Allocate gtt_sa_bitmap in long units
gtt_sa_bitmap is accessed by bitmap functions, which operate on longs.
Therefore the array should be allocated in long units. Also round up
in case the number of bits is not a multiple of BITS_PER_LONG.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:11 -04:00
Felix Kuehling
735df2ba1d drm/amdkfd: Fix doorbell initialization and finalization
Handle errors in doorbell aperture initialization instead of BUG_ON.
iounmap doorbell aperture during finalization.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:10 -04:00
Felix Kuehling
4f52f2256e drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
Remove BUG_ONs that check for NULL pointer arguments that are
dereferenced in the same function. Dereferencing the NULL pointer
will generate a BUG anyway, so the explicit check is redundant and
unnecessary overhead.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:09 -04:00
Kent Russell
dbf56ab11a drm/amdkfd: Remove usage of alloc(sizeof(struct...
See https://kernel.org/doc/html/latest/process/coding-style.html
under "14) Allocating Memory" for rationale behind removing the
x=alloc(sizeof(struct) style and using x=alloc(sizeof(*x) instead

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:08 -04:00
Kent Russell
ab7c164867 drm/amdkfd: Fix goto usage v2
Remove gotos that do not feature any common cleanup, and use gotos
instead of repeating cleanup commands.

According to kernel.org: "The goto statement comes in handy when a
function exits from multiple locations and some common work such as
cleanup has to be done. If there is no cleanup needed then just return
directly."

v2: Applied review suggestions in create_queue_nocpsch

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:07 -04:00
Kent Russell
4eacc26b3b drm/amdkfd: Change x==NULL/false references to !x
Upstream prefers the !x notation to x==NULL or x==false. Along those lines
change the ==true or !=NULL references as well. Also make the references
to !x the same, excluding () for readability.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:06 -04:00
Kent Russell
79775b627d drm/amdkfd: Consolidate and clean up log commands
Consolidate log commands so that dev_info(NULL, "Error...") uses the more
accurate pr_err, remove the module name from the log (can be seen via
dynamic debugging with +m), and the function name (can be seen via
dynamic debugging with +f). We also don't need debug messages saying
what function we're in. Those can be added by devs when needed

Don't print vendor and device ID in error messages. They are typically
the same for all GPUs in a multi-GPU system. So this doesn't add any
value to the message.

Lastly, remove parentheses around %d, %i and 0x%llX.
According to kernel.org:
"Printing numbers in parentheses (%d) adds no value and should be
avoided."

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:05 -04:00
Kent Russell
8eabaf54cf drm/amdkfd: Clean up KFD style errors and warnings v2
Using checkpatch.pl -f <file> showed a number of style issues. This
patch addresses as many of them as possible. Some long lines have been
left for readability, but attempts to minimize them have been made.

v2: Broke long lines in gfx_v7 get_fw_version

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:04 -04:00
Felix Kuehling
86194cf8cf drm/amdkfd: Fix allocated_queues bitmap initialization
Use shared_resources.queue_bitmap to determine the queues available
for KFD in each pipe.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:02 -04:00
Felix Kuehling
c76cf86966 drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
kfd2kgd->address_watch_get_offset returns dword register offsets.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 23:00:01 -04:00
Felix Kuehling
3f4f46f624 drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15 22:59:59 -04:00
Dave Airlie
dd24df6570 Merge branch 'drm-next-4.14' of git://people.freedesktop.org/~agd5f/linux into drm-next
- Stop reprogramming the MC, the vbios already does this in asic_init
- Reduce internal gart to 256M (this does not affect the ttm GTT pool size)
- Initial support for huge pages
- Rework bo migration logic
- Lots of improvements for vega10
- Powerplay fixes
- Additional Raven enablement
- SR-IOV improvements
- Bug fixes
- Code cleanup

* 'drm-next-4.14' of git://people.freedesktop.org/~agd5f/linux: (138 commits)
  drm/amdgpu: fix header on gfx9 clear state
  drm/amdgpu: reduce the time of reading VBIOS
  drm/amdgpu/virtual_dce: Remove the rmmod error message
  drm/amdgpu/gmc9: disable legacy vga features in gmc init
  drm/amdgpu/gmc8: disable legacy vga features in gmc init
  drm/amdgpu/gmc7: disable legacy vga features in gmc init
  drm/amdgpu/gmc6: disable legacy vga features in gmc init (v2)
  drm/radeon: Set depth on low mem to 16 bpp instead of 8 bpp
  drm/amdgpu: fix the incorrect scratch reg number on gfx v6
  drm/amdgpu: fix the incorrect scratch reg number on gfx v7
  drm/amdgpu: fix the incorrect scratch reg number on gfx v8
  drm/amdgpu: fix the incorrect scratch reg number on gfx v9
  drm/amd/powerplay: add support for 3DP 4K@120Hz on vega10.
  drm/amdgpu: enable huge page handling in the VM v5
  drm/amdgpu: increase fragmentation size for Vega10 v2
  drm/amdgpu: ttm_bind only when user needs gpu_addr in bo pin
  drm/amdgpu: correct clock info for SRIOV
  drm/amdgpu/gmc8: SRIOV need to program fb location
  drm/amdgpu: disable firmware loading for psp v10
  drm/amdgpu:fix gfx fence allocate size
  ...
2017-08-02 12:43:12 +10:00
Dan Carpenter
1d11ee8986 drm/amdgpu: Off by one sanity checks
This is just future proofing code, not something that can be triggered
in real life.  We're testing to make sure we don't shift wrap when we
do "1ull << i" so "i" has to be in the 0-63 range.  If it's 64 then we
have gone too far.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-14 11:06:40 -04:00
Jay Cornwall
13c4a2c78e drm/amdkfd: Remove unused references to shared_resources.num_mec
Dead code.

Change-Id: Ic0bb1bcca87e96bc5e8fa9894727b0de152e8818
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-07-13 20:21:54 -05:00
Geert Uytterhoeven
7a10d63f02 drm/amdkfd: Spelling s/apreture/aperture/
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-06-01 12:28:38 +02:00
Dan Carpenter
b312b2b25b drm/amdkfd: NULL dereference involving create_process()
We accidentally return ERR_PTR(0) which is NULL.  The caller is not
expecting that and it leads to an Oops.

Fixes: dd59239a98 ("amdkfd: init aperture once per process")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-06-14 13:58:53 +03:00
Dave Airlie
04d4fb5fa6 Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-next
New radeon and amdgpu features for 4.13:
- Lots of Vega10 bug fixes
- Preliminary Raven support
- KIQ support for compute rings
- MEC queue management rework from Andres
- Audio support for DCE6
- SR-IOV improvements
- Improved module parameters for controlling radeon vs amdgpu support
  for SI and CIK
- Bug fixes
- General code cleanups

[airlied: dropped drmP.h header from one file was needed and build broke]

* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: (362 commits)
  drm/amdgpu: Fix compiler warnings
  drm/amdgpu: vm_update_ptes remove code duplication
  drm/amd/amdgpu: Port VCN over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
  drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
  drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
  drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
  drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
  drm/amd/amdgpu: Add offset variant to SOC15 macros
  drm/amd/powerplay: add avfs control for Vega10
  drm/amdgpu: add virtual display support for raven
  drm/amdgpu/gfx9: fix compute ring doorbell index
  drm/amd/amdgpu: Rename KIQ ring to avoid spaces
  drm/amd/amdgpu: gfx9 tidy ups (v2)
  drm/amdgpu: add contiguous flag in ucode bo create
  drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
  drm/amdgpu: export test ib debugfs interface
  ...
2017-06-16 09:56:53 +10:00
Andres Rodriguez
d0b63bb338 drm/amdkfd: allow split HQD on per-queue granularity v5
Update the KGD to KFD interface to allow sharing pipes with queue
granularity instead of pipe granularity.

This allows for more interesting pipe/queue splits.

v2: fix overflow check for res.queue_mask
v3: fix shift overflow when setting res.queue_mask
v4: fix comment in is_pipeline_enabled()
v5: clamp res.queue_mask to the first MEC only

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:48:54 -04:00
Andres Rodriguez
42794b27cc drm/amdgpu: take ownership of per-pipe configuration v3
Make amdgpu the owner of all per-pipe state of the HQDs.

This change will allow us to split the queues between kfd and amdgpu
with a queue granularity instead of pipe granularity.

This patch fixes kfd allocating an HDP_EOP region for its 3 pipes which
goes unused.

v2: support for gfx9
v3: fix gfx7 HPD intitialization

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:48:52 -04:00
Masahiro Yamada
248a1d6f1a drm/amd: fix include notation and remove -Iinclude/drm flag
Include <drm/*.h> instead of relative path from include/drm, then
remove the -Iinclude/drm compiler flag.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1493009447-31524-4-git-send-email-yamada.masahiro@socionext.com
2017-05-16 17:17:41 +02:00
Ingo Molnar
589ee62844 sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h>
Update code that relied on sched.h including various MM types for them.

This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:37 +01:00
Ingo Molnar
3f07c01441 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:29 +01:00
Ingo Molnar
6e84f31522 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h>
We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/mm.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

The APIs that are going to be moved first are:

   mm_alloc()
   __mmdrop()
   mmdrop()
   mmdrop_async_fn()
   mmdrop_async()
   mmget_not_zero()
   mmput()
   mmput_async()
   get_task_mm()
   mm_access()
   mm_release()

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:28 +01:00
Vegard Nossum
f1f1007644 mm: add new mmgrab() helper
Apart from adding the helper function itself, the rest of the kernel is
converted mechanically using:

  git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)->mm_count);/mmgrab\(\1\);/'
  git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)\.mm_count);/mmgrab\(\&\1\);/'

This is needed for a later patch that hooks into the helper, but might
be a worthwhile cleanup on its own.

(Michal Hocko provided most of the kerneldoc comment.)

Link: http://lkml.kernel.org/r/20161218123229.22952-1-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:48 -08:00
Pan Bian
8bf793883d drm/amdkfd: fix improper return value on error
In function kfd_wait_on_events(), when the call to copy_from_user()
fails, the value of return variable ret is 0. 0 indicates success, which
is inconsistent with the execution status. This patch fixes the bug by
assigning "-EFAULT" to ret when copy_from_user() returns an unexpected
value.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-01-16 17:34:47 +02:00
Colin Ian King
a7522cd938 amdkfd: fix spelling mistake in kfd_ioctl_dbg_unrgesiter
Trivial fix to spelling mistake, rename kfd_ioctl_dbg_unrgesiter
to kfd_ioctl_dbg_unregister

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-01-12 09:53:03 +02:00
Dave Airlie
ca09fb9f60 Linux 4.8-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJX6H4uAAoJEHm+PkMAQRiG5sMH/3yzrMiUCSokdS+cvY+jgKAG
 JS58JmRvBPz2mRaU3MRPBGRDeCz/Nc9LggL2ZcgM+E1ZYirlYyQfIED3lkqk5R07
 kIN1wmb+kQhXyU4IY3fEX7joqyKC6zOy4DUChPkBQU0/0+VUmdVmcJvsuPlnMZtf
 g95m0BdYTui+eDezASRqOEp3Lb5ONL4c3ao4yBP0LHF033ctj3VJQiyi5uERPZJ0
 5e6Mo7Wxn78t9WqJLQAiEH46kTwT2plNlxf3XXqTenfIdbWhqE873HPGeSMa3VQV
 VywXTpCpSPQsA8BYg66qIbebdKOhs9MOviHVfqDtwQlvwhjlBDya0gNHfI5fSy4=
 =Y/L5
 -----END PGP SIGNATURE-----

Merge tag 'v4.8-rc8' into drm-next

Linux 4.8-rc8

There was a lot of fallout in the imx/amdgpu/i915 drivers, so backmerge
it now to avoid troubles.

* tag 'v4.8-rc8': (1442 commits)
  Linux 4.8-rc8
  fault_in_multipages_readable() throws set-but-unused error
  mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
  radix tree: fix sibling entry handling in radix_tree_descend()
  radix tree test suite: Test radix_tree_replace_slot() for multiorder entries
  fix memory leaks in tracing_buffers_splice_read()
  tracing: Move mutex to protect against resetting of seq data
  MIPS: Fix delay slot emulation count in debugfs
  MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
  mm: delete unnecessary and unsafe init_tlb_ubc()
  huge tmpfs: fix Committed_AS leak
  shmem: fix tmpfs to handle the huge= option properly
  blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
  MIPS: Fix pre-r6 emulation FPU initialisation
  arm64: kgdb: handle read-only text / modules
  arm64: Call numa_store_cpu_info() earlier.
  locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
  nvme-rdma: only clear queue flags after successful connect
  i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
  perf/core: Limit matching exclusive events to one PMU
  ...
2016-09-28 12:08:49 +10:00
Edward O'Callaghan
e88a614c40 drm/amdkfd: Pass 'struct queue_propertices' by reference
Allow init_queue() to take 'struct queue_properties' by reference.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-19 20:58:35 +03:00
Edward O'Callaghan
6f4d92a127 drm/amdkfd: Unify multiple calls to pr_debug() into one
Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-19 20:58:35 +03:00
Edward O'Callaghan
ad16a469ff drm/amdkfd: Reuse function to find a process through pasid
The kfd_lookup_process_by_pasid() is just for that purpose,
so use it instead of repeating the code.

v2: return on the condition (p == NULL) instead of BUG_ON(!p).

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-19 20:58:35 +03:00
Edward O'Callaghan
78b13f7964 drm/amdkfd: Add some missing memset zero'ing in queue init func
Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-19 20:58:35 +03:00
Edward O'Callaghan
585f0e6c53 drm/amdkfd: Tidy up kfd_generate_gpu_id() uint64_t bitshift unpack
Dereference the one time and unpack the lower and upper 32bit
portions with the proper kernel helper macros.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-09-19 20:58:35 +03:00
Colin Ian King
6dae61627d drm/amdkfd: print doorbell offset as a hex value
The doorbell offset is formatted with a 0x prefix to suggest it is
a hexadecimal value, when in fact %d is being used and this is confusing.
Use %X instead to match the proceeding 0x prefix.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-08-09 10:02:02 +03:00
Oded Gabbay
7fd5e03ca6 drm/amdkfd: destroy mutex if process creation fails
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-07-03 08:05:45 +03:00
Bhaktipriya Shridhar
fd320bf692 drm/amdkfd: Remove create_workqueue()
alloc_workqueue replaces deprecated create_workqueue().

create_workqueue has been replaced with alloc_workqueue with max_active
as 0 since there is no need for throttling the number of active work items.

WQ_MEM_RECLAIM has not been set to because kfd_process_wq will not be
used in memory reclaim path.

kfd_process_wq is used for delay destruction. A work item embedded in
kfd_process gets queued to kfd_process_wq and when it executes it
destroys and frees the containing kfd_process and thus itself.

This requires a dedicated workqueue because a work item once queued, may
get freed at any point of time and any external entity cannot
flush the work item. So, in order to wait for such a work item,
it needs to be put on a dedicated workqueue.

kfd_module_exit() calls kfd_process_destroy_wq which ensures that all
pending work items are finished before the module is removed.

flush_workqueue is unnecessary since destroy_workqueue() itself calls
drain_workqueue() which flushes repeatedly till the workqueue
becomes empty.

Hence flush_workqueue has been removed.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-07-03 08:05:45 +03:00
Dave Airlie
542d972221 Linux 4.7-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXcHi9AAoJEHm+PkMAQRiGSJ0H/2o4t9VWYmhyPC1sdIHoCExJ
 P4tBrcZYBmKcsOmIfnJDa5g/+IdhouEUM0v0fHPogS2UUWT9eRuJWYD3sY+HpEQ+
 heKTli8X73gsFB25odeIbIt0jAoSiiMYWDrWqLNsuUV1tjEYVA8rH0SM94FiOC/5
 7WVWXLTuH+Rm7JHP18BnKxmMMbzrTFmwisLMqFKyfZRRSlS+/ix7iLUNO9AFa39B
 YHxNPihLrZ0oONyCOAQoHTIXXrw0cQbxV2utg3vnMcCZdme2xOn+iXMntTSKfZ39
 iC9/T0vsO3R6OrRo2aDZAnCPUAniXnMEIhrKG37WMyXpj6cucZ/2QiNXcXviGV4=
 =iLte
 -----END PGP SIGNATURE-----

Back-merge tag 'v4.7-rc5' into drm-next

Linux 4.7-rc5

The fsl-dcu pull needs -rc3 so go to -rc5 for now.
2016-07-02 15:56:01 +10:00
Daniel Vetter
a104299b94 drm/amdkfd: Clean up inline handling
- inline functions need to be static inline, otherwise gcc can opt to
  not inline and the linker gets unhappy.
- no forward decls for inline functions, just include the right headers.

Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Ben Goz <ben.goz@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1466500235-21282-2-git-send-email-daniel.vetter@ffwll.ch
2016-06-21 21:32:52 +02:00
Oded Gabbay
0fbbbf8b59 drm/amdkfd: print once about mem_banks truncation
This print can really spam the kernel log in case we are truncating
mem_banks, so just print this info once. It should also not be classified
as warning.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-06-03 08:50:40 +03:00
Oded Gabbay
bc4755a4bd drm/amdkfd: destroy dbgmgr in notifier release
amdkfd need to destroy the debug manager in case amdkfd's notifier
function is called before the unbind function, because in that case,
the unbind function will exit without destroying debug manager.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
CC: Stable <stable@vger.kernel.org>
2016-06-03 08:50:40 +03:00
Oded Gabbay
121b78e679 drm/amdkfd: unbind only existing processes
When unbinding a process from a device (initiated by amd_iommu_v2), the
driver needs to make sure that process still exists in the process table.
There is a possibility that amdkfd's own notifier handler -
kfd_process_notifier_release() - was called before the unbind function
and it already removed the process from the process table.

v2:
Because there can be only one process with the specified pasid, and
because *p can't be NULL inside the hash_for_each_rcu macro, it is more
reasonable to just put the whole code inside the if statement that
compares the pasid value. That way, when we exit hash_for_each_rcu, we
simply exit the function as well.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
CC: Stable <stable@vger.kernel.org>
2016-06-03 08:50:40 +03:00
Edward O'Callaghan
eb026024c2 amdkfd: Trim unnescessary intermediate err var in kfd_chardev.c
Found-By: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-05-01 00:06:29 +10:00
Edward O'Callaghan
371d5b653f amdkfd: Trim off unnescessary semicolon from kfd_packet_manager.c
Found-By: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-05-01 00:06:28 +10:00
Edward O'Callaghan
991ca8eee2 amdkfd: Use the canonical form in branch predicates
Found-By: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-05-01 00:06:27 +10:00
Andy Lutomirski
10f1685fd5 drivers/gpu/drm/amd/amdkfd: use in_compat_syscall to check open() caller type
amdkfd wants to know syscall type, not task type.  Check directly.

Unfortunately, amdkfd is making nasty assumptions that a process'
bitness is a well-defined constant thing.  This isn't the case on x86.
I don't know how much this matters, but this patch has no effect on
generated code on x86, so amdkfd is equally broken with and without this
patch.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Dan Carpenter
93fce95442 drm/amdkfd: uninitialized variable in dbgdev_wave_control_set_registers()
At the end of the function we expect "status" to be zero, but it's
either -EINVAL or uninitialized.

Fixes: 788bf83db3 ('drm/amdkfd: Add wave control operation to debugger')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-03-15 14:09:37 +02:00
Oded Gabbay
c68f4528a2 drm/amdkfd: Track when module's init is complete
Current dependencies between amdkfd and radeon/amdgpu force the loading
of amdkfd _before_ radeon and/or amdgpu are loaded. When all these kernel
drivers are built as modules, this ordering is enforced by the kernel
built-in mechanism of loading dependent modules.

However, there is no such mechanism in case where all these drivers are
compiled inside the kernel image (not as modules). The current way to
enforce loading of amdkfd before radeon/amdgpu, is to put amdkfd before
radeon/amdgpu in the drm Makefile, but that method is way too fragile.

In addition, there is no kernel mechanism to check whether a kernel
driver that is built inside the kernel image, has already been loaded.

To solve this, this patch adds to kfd_module.c a new static variable,
amdkfd_init_completed, that is set to 1 only when amdkfd's
module initialization function has been completed (successfully).

kgd2kfd_init(), which is the initialization function of the
kgd-->kfd interface, and which is the first function in amdkfd called by
radeon/amdgpu, will return successfully only if amdkfd_init_completed is
equal 1.

If amdkfd_init_completed is not equal to 1, kgd2kfd_init() will
return -EPROBE_DEFER to signal radeon/amdgpu they need to defer
their loading until amdkfd is loaded.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-02-27 22:52:40 +02:00
Amitoj Kaur Chawla
642f0f2a15 drm/amdkfd: Remove unnecessary cast in kfree
Remove an unnecassary cast in the argument to kfree.

Found using Coccinelle. The semantic patch used to find this is as follows:

//<smpl>
@@
type T;
expression *f;
@@

- kfree((T *)(f));
+ kfree(f);
//</smpl>

Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-01-28 14:40:11 +02:00
Borislav Petkov
39c01bf933 amdkfd: Copy from the proper user command pointer
8f1d57c172 ("amdkfd: don't open-code memdup_user()") mistakenly uses
an uninitialized local pointer, gcc complains:

  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c: In function ‘kfd_ioctl_dbg_address_watch’:
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c:562:12: warning: ‘args_buff’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    args_buff = memdup_user(args_buff,
                ^

Fix it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-15 15:14:17 -05:00
Al Viro
8f1d57c172 amdkfd: don't open-code memdup_user()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-06 08:25:25 -05:00
Christoph Hellwig
2497ee7215 amdkfd: use <linux/mman.h> instead of <uapi/asm-generic/mman-common.h>
The latter is a default version of <asm/mman.h> and not for driver use.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-08-30 12:36:58 +03:00
Oded Gabbay
a63c580a52 drm/amdkfd: fix bug when initializing sdma vm
A logical AND operation was used during mask and shift, instead of a
bitwise AND operation. This patch fixes this bug by changing the
operation to bitwise AND.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-07-30 09:26:15 +03:00
Ben Goz
7639a8c420 drm/amdkfd: Set correct doorbell packet type for Carrizo
Signed-off-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:49 +03:00
Oded Gabbay
3d30b28be8 drm/amdkfd: Use generic defines in new amd headers
This patch makes use of the new amd headers (that are part of the new
amdgpu driver), instead of private defines.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:49 +03:00
Ben Goz
d7b8f73ea0 drm/amdkfd: Implement create_map_queues() for Carrizo
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:49 +03:00
Ben Goz
e1940fa4bf drm/amdkfd: fix runlist length calculation
The MAP_QUEUES packet length for Carrizo is different than for Kaveri.
Therefore, we now need to calculate the runlist length with regard to the
underlying H/W.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:49 +03:00
Ben Goz
914bea6329 drm/amdkfd: Add support for VI in DQM
This patch adds support for the VI APU in the DQM module.

Most of the functionality of DQM is shared between CI and VI. Therefore,
only a handful of functions are required to be in the
H/W-specific part of DQM.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:48 +03:00
Ben Goz
d696d536f0 drm/amdkfd: add support for VI in MQD manager
This patch implements all the VI MQD manager functions.
This is done in a different file as the MQD format is different
between CI and VI

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:48 +03:00
Ben Goz
2d8f1f3303 drm/amdkfd: add CP HWS packet headers for VI
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:48 +03:00
Ben Goz
123576d144 drm/amdkfd: add supported CZ devices PCI IDs to amdkfd
This patch adds the PCI IDs of supported CZ devices to the
supported_devices structure in amdkfd. That structure is used during the
amdkfd probing stage, to check if the currently probed device is eligible
to be handled by amdkfd.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:48 +03:00
Oded Gabbay
bd72a72c3a drm/amdkfd: Add dependency of DRM_AMDGPU to Kconfig
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-20 09:16:48 +03:00
Maninder Singh
a0f67441b0 drm/amdkfd: validate pdd where it acquired first
Currently pdd is validate after dereferencing it, which is
not correct, Thus validate pdd before its first use.

Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-07-09 13:27:52 +03:00
Linus Torvalds
099bfbfc7f Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.2.

  I've one other new driver from freescale on my radar, it's been posted
  and reviewed, I'd just like to get someone to give it a last look, so
  maybe I'll send it or maybe I'll leave it.

  There is no major nouveau changes in here, Ben was working on
  something big, and we agreed it was a bit late, there wasn't anything
  else he considered urgent to merge.

  There might be another msm pull for some bits that are waiting on
  arm-soc, I'll see how we time it.

  This touches some "of" stuff, acks are in place except for the fixes
  to the build in various configs,t hat I just applied.

  Summary:

  New drivers:
      - virtio-gpu:
                KMS only pieces of driver for virtio-gpu in qemu.
                This is just the first part of this driver, enough to run
                unaccelerated userspace on. As qemu merges more we'll start
                adding the 3D features for the virgl 3d work.
      - amdgpu:
                a new driver from AMD to driver their newer GPUs. (VI+)
                It contains a new cleaner userspace API, and is a clean
                break from radeon moving forward, that AMD are going to
                concentrate on. It also contains a set of register headers
                auto generated from AMD internal database.

  core:
      - atomic modesetting API completed, enabled by default now.
      - Add support for mode_id blob to atomic ioctl to complete interface.
      - bunch of Displayport MST fixes
      - lots of misc fixes.

  panel:
      - new simple panels
      - fix some long-standing build issues with bridge drivers

  radeon:
      - VCE1 support
      - add a GPU reset counter for userspace
      - lots of fixes.

  amdkfd:
      - H/W debugger support module
      - static user-mode queues
      - support killing all the waves when a process terminates
      - use standard DECLARE_BITMAP

  i915:
      - Add Broxton support
      - S3, rotation support for Skylake
      - RPS booting tuning
      - CPT modeset sequence fixes
      - ns2501 dither support
      - enable cmd parser on haswell
      - cdclk handling fixes
      - gen8 dynamic pte allocation
      - lots of atomic conversion work

  exynos:
      - Add atomic modesetting support
      - Add iommu support
      - Consolidate drm driver initialization
      - and MIC, DECON and MIPI-DSI support for exynos5433

  omapdrm:
      - atomic modesetting support (fixes lots of things in rewrite)

  tegra:
      - DP aux transaction fixes
      - iommu support fix

  msm:
      - adreno a306 support
      - various dsi bits
      - various 64-bit fixes
      - NV12MT support

  rcar-du:
      - atomic and misc fixes

  sti:
      - fix HDMI timing complaince

  tilcdc:
      - use drm component API to access tda998x driver
      - fix module unloading

  qxl:
      - stability fixes"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (872 commits)
  drm/nouveau: Pause between setting gpu to D3hot and cutting the power
  drm/dp/mst: close deadlock in connector destruction.
  drm: Always enable atomic API
  drm/vgem: Set unique to "vgem"
  of: fix a build error to of_graph_get_endpoint_by_regs function
  drm/dp/mst: take lock around looking up the branch device on hpd irq
  drm/dp/mst: make sure mst_primary mstb is valid in work function
  of: add EXPORT_SYMBOL for of_graph_get_endpoint_by_regs
  ARM: dts: rename the clock of MIPI DSI 'pll_clk' to 'sclk_mipi'
  drm/atomic: Don't set crtc_state->enable manually
  drm/exynos: dsi: do not set TE GPIO direction by input
  drm/exynos: dsi: add support for MIC driver as a bridge
  drm/exynos: dsi: add support for Exynos5433
  drm/exynos: dsi: make use of array for clock access
  drm/exynos: dsi: make use of driver data for static values
  drm/exynos: dsi: add macros for register access
  drm/exynos: dsi: rename pll_clk to sclk_clk
  drm/exynos: mic: add MIC driver
  of: add helper for getting endpoint node of specific identifiers
  drm/exynos: add Exynos5433 decon driver
  ...
2015-06-26 13:18:51 -07:00
Dan Carpenter
7861c7a4ca drm/amdkfd: fix some range checks in address watch ioctl
buf_size_in_bytes must be large enough to hold ->num_watch_points and
watch_mode so I have added a sizeof(int) * 2 to the minimum size.

Also we have to subtract sizeof(*args) from the max args_idx limit so
that it matches the allocation.  Also I changed a > to >= for the last
compare.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-16 15:42:26 +03:00
Oded Gabbay
da73b9fb56 drm/amdkfd: remove not used defines from cik_regs.h
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-06 22:26:47 +03:00
Oded Gabbay
eaccd6e743 drm/amdkfd: Add missing properties to CZ device info
This patch adds two missing properties initializations to the device
info structure of CZ.

As we don't have CZ support yet, it isn't critical, but its important to
fix this now instead of forgetting about it later.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-06 22:26:47 +03:00
Ben Goz
a82918f18a drm/amdkfd: make reset wavefronts per process per device
This commit moves the reset wavefront flag to per process per device
data structure, so we can support multiple devices.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-06 22:26:39 +03:00
Oded Gabbay
6235e15ea6 drm/amdkfd: add debug print to kfd_events.c
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-06 21:09:51 +03:00
Valentin Rothberg
f4e04022ed drm/amdkfd: avoid CONFIG_ prefix for non-Kconfig symbols
The CONFIG_ prefix is reserved for Kconfig options in Make and CPP
syntax.  Various static analysis tools rely on this naming convention
and check if CONFIG_ prefixed symbols are defined Kconfig.  Hence add
yet another prefix AMD_ to CONFIG_REG_{BASE,END,SISE} to apply to this
convention and make static analysis tools happy.

Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-06 20:48:34 +03:00
Alexey Skidanov
826f5de84c drm/amdkfd: fix topology bug with capability attr.
This patch fixes a bug where the number of watch points
was shown before it was actually calculated

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 21:45:54 +03:00
Ben Goz
c3447e8150 drm/amdkfd: Enforce kill all waves on process termination
This commit makes sure that on process termination, after
we're destroying all the active queues, we're killing all the
existing wave front of the current process.

By doing this we're making sure that if any of the CUs were blocked
by infinite loop we're enforcing it to end the shader explicitly.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:34:47 +03:00
Yair Shachar
f8bd13338a drm/amdkfd: Implement address watch debugger IOCTL
v2:

- rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it
- change void* to uint64_t inside ioctl arguments
- use kmalloc instead of kzalloc because we use copy_from_user
  immediately after it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:34:35 +03:00
Yair Shachar
9448458998 drm/amdkfd: Implement wave control debugger IOCTL
v2:

- rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it
- change void* to uint64_t inside ioctl arguments
- use kmalloc instead of kzalloc because we use copy_from_user
  immediately after it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:26 +03:00
Yair Shachar
037ed9a2ac drm/amdkfd: Implement (un)register debugger IOCTLs
v2: rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:07 +03:00
Yair Shachar
e2e9afc4a3 drm/amdkfd: Add address watch operation to debugger
The address watch operation gives the ability to specify watch points
which will generate a shader breakpoint, based on a specified single
address or range of addresses.

There is support for read/write/any access modes.

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:06 +03:00
Yair Shachar
788bf83db3 drm/amdkfd: Add wave control operation to debugger
The wave control operation supports several command types executed upon
existing wave fronts that belong to the currently debugged process.

The available commands are:

HALT   - Freeze wave front(s) execution
RESUME - Resume freezed wave front(s) execution
KILL   - Kill existing wave front(s)

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:06 +03:00
Yair Shachar
fbeb661bfa drm/amdkfd: Add skeleton H/W debugger module support
This patch adds the skeleton H/W debugger module support. This code
enables registration and unregistration of a single HSA process at a
time.

The module saves the process's pasid and use it to verify that only the
registered process is allowed to execute debugger operations through the
kernel driver.

v2: rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:28 +03:00
Yair Shachar
992839ad64 drm/amdkfd: Add static user-mode queues support
This patch adds support for static user-mode queues in QCM.
Queues which are designated as static can NOT be preempted by
the CP microcode when it is executing its scheduling algorithm.

This is needed for supporting the debugger feature, because we
can't allow the CP to preempt queues which are currently being debugged.

The number of queues that can be designated as static is limited by the
number of HQDs (Hardware Queue Descriptors).

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:28 +03:00
Yair Shachar
aef11009c4 drm/amdkfd: add H/W debugger IOCTL set definitions
This patch adds four new IOCTLs to amdkfd. These IOCTLs expose a H/W
debugger functionality to the userspace.

The IOCTLs are:

- AMDKFD_IOC_DBG_REGISTER:

The purpose of this IOCTL is to notify amdkfd that a process wants to use
GPU debugging facilities on itself only.
It is expected that this IOCTL would be called before any other H/W
debugger requests are sent to amdkfd and for each GPU where the H/W
debugging needs to be enabled. The use of this IOCTL ensures that only
one instance of a debugger is active in the system.

- AMDKFD_IOC_DBG_UNREGISTER:

This IOCTL detaches the debugger/debugged process from the H/W
Debug which was established by the AMDKFD_IOC_DBG_REGISTER IOCTL.

- AMDKFD_IOC_DBG_ADDRESS_WATCH:

This IOCTL allows to set different watchpoints with various conditions as
indicated by the IOCTL's arguments. The available number of watchpoints
is retrieved from topology. This operation is confined to the current
debugged process, which was registered through AMDKFD_IOC_DBG_REGISTER.

- AMDKFD_IOC_DBG_WAVE_CONTROL:

This IOCTL allows to control a wavefront as indicated by the IOCTL's
arguments. For example, you can halt/resume or kill either a
single wavefront or a set of wavefronts. This operation is confined to
the current debugged process, which was registered through
AMDKFD_IOC_DBG_REGISTER.

Because the arguments for the address watch IOCTL and wave control IOCTL
are dynamic, meaning that they could vary in size, the userspace passes a
pointer to a structure (in userspace) that contains the value of the
arguments. The kernel driver is responsible to parse this structure and
validate its contents.

v2: change void* to uint64_t inside ioctl arguments

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:07 +03:00
Joe Perches
f761d8bd80 drm/amdkfd: Use DECLARE_BITMAP
Use the generic mechanism to declare a bitmap instead of unsigned long.

It seems that "struct kfd_process.allocated_queue_bitmap" is unused.
Maybe it could be deleted instead.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:31:12 +03:00
Dave Airlie
bdcddf95e8 Linux 4.1-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVWh3TAAoJEHm+PkMAQRiG/kwH/2c9irodp2+M9OUnX2bfsBb6
 LnChiDpvkF5BB8jhP6d/XmvPp4NJzAbTxByhjdfb2E2HkorCUHCOIn2tI1TE2pUs
 2qjkOVH+XCzoV0goGtQjzK1ht8f2IrtlDiEjyRekK5cJHzhggb22QPtWL4npyd0O
 reDmG2jsRaF9POr9uLSFEv4CEnkksmRLUU0vuQX0TZeCJ41O7TXrkN/wKrLZ5mj4
 IWpqXQaSlrffq/T5HnVbXBxk3/T8QmhrIoppiMpV1mUVj0uTqlFRNi5qwT2Nit1h
 FVljWI4+WgOk3bf7fUlp+ahopjkTgu+GuXkiRP/pdgWNQO0cxCWSAzSndAlIIAE=
 =uOoJ
 -----END PGP SIGNATURE-----

Backmerge v4.1-rc4 into into drm-next

We picked up a silent conflict in amdkfd with drm-fixes and drm-next,
backmerge v4.1-rc5 and fix the conflicts

Signed-off-by: Dave Airlie <airlied@redhat.com>

Conflicts:
	drivers/gpu/drm/drm_irq.c
2015-05-20 16:23:53 +10:00
Oded Gabbay
7591cd2cd5 drm/amdkfd: change driver version to 0.7.2
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:30 +03:00
Andrew Lewycky
8377396b5d drm/amdkfd: Implement events IOCTLs
Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:29 +03:00
Oded Gabbay
81663016db drm/amdkfd: Add module parameter of send_sigterm
This patch adds a new kernel module parameter to amdkfd,
called send_sigterm.

This parameter specifies whether amdkfd should send the
SIGTERM signal to an HSA process, when the following conditions
occur:

1. The GPU triggers an exception regarding a kernel that was
   issued by this process.

2. The HSA process isn't waiting on an event that handles
   this exception.

The default behavior is not to send a SIGTERM and suffice
with a dmesg error print.

Reviewed-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:28 +03:00
Alexey Skidanov
930c5ff439 drm/amdkfd: Add bad opcode exception handling
Signed-off-by: Alexey Skidanov <alexey.skidanov@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:28 +03:00
Alexey Skidanov
59d3e8be87 drm/amdkfd: Add memory exception handling
This patch adds Peripheral Page Request (PPR) failure processing
and reporting.

Bad address or pointer to a system memory block with inappropriate
read/write permission cause such PPR failure during a user queue
processing. PPR request handling is done by IOMMU driver notifying
AMDKFD module on PPR failure.

The process triggering a PPR failure will be notified by
appropriate event or SIGTERM signal will be sent to it.

v3:
- Change all bool fields in struct kfd_memory_exception_failure to
  uint32_t

Signed-off-by: Alexey Skidanov <alexey.skidanov@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:27 +03:00
Andrew Lewycky
f3a398183f drm/amdkfd: Add the events module
This patch adds the events module (kfd_events.c) and the interrupt
handle module for Kaveri (cik_event_interrupt.c).

The patch updates the interrupt_is_wanted(), so that it now calls the
interrupt isr function specific for the device that received the
interrupt. That function(implemented in cik_event_interrupt.c)
returns whether this interrupt is of interest to us or not.

The patch also updates the interrupt_wq(), so that it now calls the
device's specific wq function, which checks the interrupt source
and tries to signal relevant events.

v2:

Increase limit of signal events to 4096 per process
Remove bitfields from struct cik_ih_ring_entry
Rename radeon_kfd_event_mmap to kfd_event_mmap
Add debug prints to allocate_free_slot and allocate_signal_page
Make allocate_event_notification_slot return a correct value
Add warning prints to create_signal_event
Remove error print from IOCTL path
Reformatted debug prints in kfd_event_mmap
Map correct size (as received from mmap) in kfd_event_mmap

v3:

Reduce limit of signal events back to 256 per process
Fix allocation of kernel memory for signal events

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:26 +03:00
Andrew Lewycky
29a5d3eb9a drm/amdkfd: add events IOCTL set definitions
- AMDKFD_IOC_CREATE_EVENT:
	Creates a new event of a specified type

- AMDKFD_IOC_DESTROY_EVENT:
	Destroys an existing event

- AMDKFD_IOC_SET_EVENT:
	Signal an existing event

- AMDKFD_IOC_RESET_EVENT:
	Reset an existing event

- AMDKFD_IOC_WAIT_EVENTS:
	Wait on event(s) until they are signaled

v2:

- Move the limit of the signal events to kfd_ioctl.h so it
  can be used by userspace

v3:
- Change all bool fields in struct kfd_memory_exception_failure
to uint32_t

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:00 +03:00
Andrew Lewycky
2249d55827 drm/amdkfd: Add interrupt handling module
This patch adds the interrupt handling module, kfd_interrupt.c, and its
related members in different data structures to the amdkfd driver.

The amdkfd interrupt module maintains an internal interrupt ring
per amdkfd device. The internal interrupt ring contains interrupts
that needs further handling. The extra handling is deferred to
a later time through a workqueue.

There's no acknowledgment for the interrupts we use. The hardware
simply queues a new interrupt each time without waiting.

The fixed-size internal queue means that it's possible for us to lose
interrupts because we have no back-pressure to the hardware.

However, only interrupts that are "wanted" by amdkfd, are copied into
the amdkfd s/w interrupt ring, in order to minimize the chances
for overflow of the ring.

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 12:13:39 +03:00