Commit Graph

6403 Commits

Author SHA1 Message Date
Felix Kuehling
1995b3a35f drm/amdgpu: Fix error handling in amdgpu_ras_recovery_init
Don't set a struct pointer to NULL before freeing its members. It's
hard to see what's happening due to a local pointer-to-pointer data
aliasing con->eh_data.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Philip Cox <Philip.Cox@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:09:33 -05:00
Colin Ian King
317a8d9eb6 drm/amdgpu: remove redundant variable r and redundant return statement
There is a return statement that is not reachable and a variable that
is not used.  Remove them.

Addresses-Coverity: ("Structurally dead code")
Fixes: de7b45babd ("drm/amdgpu: cleanup creating BOs at fixed location (v2)")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:14 -05:00
Colin Ian King
17cf678a33 drm/amdgpu: fix uninitialized variable pasid_mapping_needed
The boolean variable pasid_mapping_needed is not initialized and
there are code paths that do not assign it any value before it is
is read later.  Fix this by initializing pasid_mapping_needed to
false.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 6817bf283b ("drm/amdgpu: grab the id mgr lock while accessing passid_mapping")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:14 -05:00
Leo Liu
d0312d0dca drm/amdgpu: add code comment in vcn_v2_5_hw_init
Add a comment to VCN 2.5 encode ring

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:08 -05:00
Leo Liu
fd287c8cd2 drm/amdgpu/vcn: use amdgpu_ring_test_helper
Instead of amdgpu_ring_test_ring, so the helper function determines
whether the ring is ready

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:52:53 -05:00
Alex Deucher
8a745c7ff2 drm/amdgpu: improve MSI-X handling (v3)
Check the number of supported vectors and fall back to MSI if
we return or error or 0 MSI-X vectors.

v2: only allocate one vector.  We can't currently use more than
one anyway.

v3: install the irq on vector 0.

Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Shaoyun liu  <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 13:51:19 -05:00
Arnd Bergmann
324fb7adf6 drm/amdgpu: hide another #warning
An earlier patch of mine disabled some #warning statements
that get in the way of build testing, but then another
instance was added around the same time.

Remove that as well.

Fixes: b5203d16ae ("drm/amd/amdgpu: hide #warning for missing DC config")
Fixes: e1c14c4339 ("drm/amdgpu: Enable DC on Renoir")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
Arnd Bergmann
128a01f472 drm/amdgpu: make pmu support optional, again
When CONFIG_PERF_EVENTS is disabled, we cannot compile the pmu
portion of the amdgpu driver:

drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:48:38: error: no member named 'hw' in 'struct perf_event'
        struct hw_perf_event *hwc = &event->hw;
                                     ~~~~~  ^
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:51:13: error: no member named 'attr' in 'struct perf_event'
        if (event->attr.type != event->pmu->type)
            ~~~~~  ^
...

The same bug was already fixed by commit d155bef063 ("amdgpu: make pmu
support optional") but broken again by what looks like an incorrectly
rebased patch.

Fixes: 64f55e6292 ("drm/amdgpu: Add RAS EEPROM table.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
yu kuai
2e0db9dec2 drm/amdgpu: remove set but not used variable 'pipe'
Fixes gcc '-Wunused-but-set-variable' warning:

rivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c: In function
‘amdgpu_gfx_graphics_queue_acquire’:
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:234:16: warning:
variable ‘pipe’ set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
Navid Emamdoost
1104057562 drm/amdgpu: fix multiple memory leaks in acp_hw_init
In acp_hw_init there are some allocations that needs to be released in
case of failure:

1- adev->acp.acp_genpd should be released if any allocation attemp for
adev->acp.acp_cell, adev->acp.acp_res or i2s_pdata fails.
2- all of those allocations should be released if
mfd_add_hotplug_devices or pm_genpd_add_device fail.
3- Release is needed in case of time out values expire.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Alex Deucher
2c9a0c66d5 drm/amdgpu: don't increment vram lost if we are in hibernation
We reset the GPU as part of our hibernation sequence so we need
to make sure we don't mark vram as lost in that case.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111879
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
shaoyunl
bd660f4f11 drm/amdgpu : enable msix for amdgpu driver
We might used out of the msi resources in some cloud project
which have a lot gpu devices(including PF and VF), msix can
provide enough resources from system level view

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
e7956997b1 drm/amdgpu: Export setup_vm_pt_regs() logic for mmhub 2.0
The KFD code will call this function later.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
923c087a1f drm/amdgpu: Add the HDP flush support for Navi
The HDP flush support code was missing in the nbio and nv files.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
e392c887df drm/amdkfd: Use array to probe kfd2kgd_calls
This is the same idea as the kfd device info probe and move all the
probe control together for easy maintenance.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
47c5ab6ca0 drm/amdkfd: Delete unnecessary function declarations
Ajust the function sequences so that those function delcarations are not
needed any more.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
1456482bf8 drm/amdgpu: Delete useless header file reference
Those header file includes are not needed.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Jack Zhang
21889cec0a drm/amd/amdgpu/sriov ip block setting of Arcturus
Add ip block setting for Arcturus SRIOV

1.PSP need to be initialized before IH.
2.SMU doesn't need to be initialized at kmd driver.
3.Arcturus doesn't support DCE hardware,it needs to skip
  register access to DCE.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Marek Olšák
cf21e76a60 drm/amdgpu: return tcc_disabled_mask to userspace
UMDs need this for correct programming of harvested chips.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Lyude Paul
f8d2d39eb4 drm/amdgpu: Iterate through DRM connectors correctly
Currently, every single piece of code in amdgpu that loops through
connectors does it incorrectly and doesn't use the proper list iteration
helpers, drm_connector_list_iter_begin() and
drm_connector_list_iter_end(). Yeesh.

So, do that.

Cc: Juston Li <juston.li@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Prike Liang
88d802500a drm/amdkfd: fix kgd2kfd_device_init() definition conflict error
The patch c670707 drm/amd: Pass drm_device to kfd introduced this issue and
fix the following compiler error.

  CC [M]  drivers/gpu/drm/amd/amdgpu//../powerplay/smumgr/fiji_smumgr.o
drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.c:746:6: error: conflicting types for ‘kgd2kfd_device_init’
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
      ^
In file included from drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.c:23:0:
drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.h:253:6: note: previous declaration of ‘kgd2kfd_device_init’ was here
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
      ^
scripts/Makefile.build:273: recipe for target 'drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.o' failed
make[1]: *** [drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.o] Error 1

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Kenneth Feng
227f7d58d7 drm/amd/amdgpu: add IH cg support on soc15 project
enable/disable IH clock gating on soc15 projects.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
b2100ce1db drm/amdkfd: Use setup_vm_pt_regs function from base driver in KFD
This was done on GFX9 previously, now do it for GFX10.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
286b789e1e drm/amdgpu: Export setup_vm_pt_regs() logic for gfxhub 2.0
The KFD code will call this function later.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
56fc40aba4 drm/amdkfd: Eliminate get_atc_vmid_pasid_mapping_valid
get_atc_vmid_pasid_mapping_valid() is very similar to
get_atc_vmid_pasid_mapping_pasid(), so they can be merged into a new
function get_atc_vmid_pasid_mapping_info() to reduce register access
times. More importantly, getting the PASID and the valid bit atomically
with a single read fixes some potential race conditions where the
mapping changes between the two reads.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
d19eb6aca7 drm/amdkfd: Delete unused defines
They are not used anywhere.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Harish Kasiviswanathan
3a0c342392 drm/amd: Pass drm_device to kfd
kfd needs drm_device to call into drm_cgroup functions

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
b55a8b8b41 drm/amdkfd: Use better name for sdma queue non HWS path
The old name is prone to confusion. The register offset is for a RLC queue
rather than a SDMA engine. The value is not a base address, but a
register offset.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
9941a6bfbd drm/amdkfd: Delete useless SDMA register setting on non HWS path
HW folks have confirm that we should not touch RESUME_CTX of
SDMA*_GFX_CONTEXT_CNTL when manipulating RLC queues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Christian König
56f074d815 drm/amdgpu: restrict hotplug error message
We should print the error only when we are hotplugged and crash
basically all userspace applications.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Christian König
4a24652867 drm/amdgpu: once more fix amdgpu_bo_create_kernel_at
When CPU access is needed we should tell that to
amdgpu_bo_create_reserved() or otherwise the access is denied later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
3d8361b11c drm/amdgpu: add comments in ras interrupt callback
add comments to clarify why checking GFX IP BLOCK for each ras interrupt callback

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
ba08349214 drm/amdgpu: implement common gmc_ras_late_init
common gmc_ecc_late_init can be shared among all generations of gmc

v2: rename gmc_ecc_late_init to gmc_ras_late_init

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
be5b39d87a drm/amdgpu: move xgmi ras fini to xgmi block
it's more suitable to put xgmi ras fini in xgmi block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
196041205c drm/amdgpu: move mmhub ras fini to mmhub block
it's more suitable to put mmhub ras fini in mmhub block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
181c93e5ec drm/amdgpu: move umc ras fini to umc block
it's more suitable to put umc ras fini in umc block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
f2575941e6 drm/amdgpu: add ras fini for xgmi
add ras fini for xgmi to cleanup xgmi ras framework

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
de9bbd5273 drm/amdgpu: add ras fini for nbio
add a common nbio ras fini implementation to cleanup nbio ras framework

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
0771b0bf07 drm/amdgpu: simplify the access to eeprom_control struct
simplify the code of accessing to eeprom_control struct

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
41190cd733 drm/amdgpu: remove ih_info parameter of gfx_ras_late_init
gfx_ras_late_init can get the info by itself

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
56c54b25c3 drm/amdgpu: remove ih_info parameter of umc_ras_late_init
umc_ras_late_init can get the info by itself

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
e536c81850 drm/amdgpu: add common sdma_ras_fini function
sdma_ras_fini can be shared among all generations of sdma

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
3b7b7647be drm/amdgpu: add common gfx_ras_fini function
gfx_ras_fini can be shared among all generations of gfx

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
2adf13440a drm/amdgpu: add common gmc_ras_fini function
gmc_ras_fini can be shared among all generations of gmc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
65bc47a659 drm/amdgpu: move mmhub_ras_if from gmc to mmhub block
mmhub_ras_if is relevant to mmhub

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
d65bf1f8a7 drm/amdgpu: replace mmhub_funcs with mmhub.funcs
remove mmhub_funcs in adev

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
d3a5a121b8 drm/amdgpu: add common mmhub member for adev
put mmhub_funcs and ras_if pointer into mmhub struct

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
03740baab3 drm/amdgpu: move umc_ras_if from gmc to umc block
umc_ras_if is relevant to umc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
fc04e6b484 drm/amdgpu: refine sdma4 ras_data_cb
simplify code logic and refine return value

v2: remove unused error source code

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
4c65dd1041 drm/amdgpu: move sdma ecc functions to generic sdma file
sdma ras ecc functions can be reused among all sdma generations

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
725253ab9b drm/amdgpu: move gfx ecc functions to generic gfx file
gfx ras ecc common functions could be reused among all gfx generations

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
34cc4fd9ff drm/amdgpu: move umc ras irq functions to umc block
move umc ras irq functions from gmc v9 to generic umc block, these
functions are relevant to umc and they can be shared among all
generations of umc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Tao Zhou
f5f06e21e9 drm/amdgpu: update parameter of ras_ih_cb
change struct ras_err_data *err_data to void *err_data, align with
umc code and the callback's declaration in each ras block could
pay no attention to the structure type

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Monk Liu
e7da754b00 drm/amdgpu: fix an UMC hw arbitrator bug(v3)
issue:
the UMC6 h/w bug is that when MCLK is doing the switch
in the middle of a page access being preempted by high
priority client (e.g. DISPLAY) then UMC and the mclk switch
would stuck there due to deadlock

how:
fixed by disabling auto PreChg for UMC to avoid high
priority client preempting other client's access on
the same page, thus the deadlock could be avoided

v2:
put the patch in callback of UMC6
v3:
rename the callback to "init_registers"

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Marek Olšák
6de088a08d drm/amdgpu: remove gfx9 NGG
Never used.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Alex Deucher
631cdbd27e drm/amdgpu/atomfirmware: simplify the interface to get vram info
fetch both the vram type and width in one function call.  This
avoids having to parse the same data table twice to get the two
pieces of data.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Alex Deucher
bd5520273c drm/amdgpu/atomfirmware: use proper index for querying vram type (v3)
The index is stored in scratch register 4 after asic init.  Use
that index.  No functional change since all asics in a family
use the same type of vram (G5, G6, HBM) and that is all we use
at the monent, but if we ever need to query other info, we will
now have the proper index.

v2: module array is variable sized, handle that.
v3: fix off by one in array handling

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Shirish S
52510a4035 drm/amdgpu/psp: silence response status warning
log the response status related error to the driver's
debug log since  psp response status is not 0 even though
there was no problem while the command was submitted.

This warning misleads, hence this change.

Signed-off-by: Shirish S <shirish.s@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Jesse Zhang
df99ac0fcc drm/amd/amdgpu:Fix compute ring unable to detect hang.
When compute fence did not signal, compute ring cannot detect hardware hang
because its timeout value is set to be infinite by default.

In SR-IOV and passthrough mode, if user does not declare custome timeout
value for compute ring, then use gfx ring timeout value as default. So
that when there is a ture hardware hang, compute ring can detect it.

Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
chen gong
90a08351f7 drm/amdgpu: Use mode2 mode to perform GPU RESET for Renoir
Renoir need to use mode2 mode to implement GPU RESET

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
40463bdc22 drm/amdkfd: Sync gfx10 kfd2kgd_calls function pointers
get_hive_id was not set. Also, adjust the function setting sequence.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
c637b36aea drm/amdkfd: Fix NULL pointer dereference for set_scratch_backing_va()
Currently this function pointer is missing for GFX10. Considering it is
a void function since GFX9, fix it by checking the function pointer
before dereferencing it.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
812330eb69 drm/amdkfd: Add an error print if SDMA RLC is not idle
The message will be useful when troubleshooting the issues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
05ba0095fb drm/amdgpu: correct condition check for psp rlc autoload
Otherwise non-autoload case will go into the wrong routine and fail.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Hawking Zhang
1f01cd9905 drm/amdgpu: add command id in psp response failure message
For better clarification of issue.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
90c88dab8e drm/amdgpu: enable psp front door loading by default on Arcturus
Front door firmware loading is done via the psp rather than the
driver.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
9a018e5a85 drm/amdgpu: disable vcn ip block for front door loading on Arcturus
Needs more work to enable via front door loading.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Tianci.Yin
4db37544ce drm/amdgpu/gfx10: add support for wks firmware loading
load different cp firmware according to the DID and RID

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
f77c7109c0 drm/amdgpu/ras: fix and update the documentation for RAS
Add new sections to amdgpu.rst, fix up formatting issues,
add additional documentation to each section.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
a667b75c1e drm/amdgpu: fix documentation for amdgpu_pm.c
Fix DOC link name, clean up formatting in pp_dpm_* section.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
fc9c7f8470 drm/amdgpu/ih: fix documentation in amdgpu_irq_dispatch
Fix parameters.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
1d614ded87 drm/amdgpu/vm: fix up documentation in amdgpu_vm.c
Missing parameters, wrong comment type, etc.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
4d8e54d2b9 drm/amdgpu/mn: fix documentation for amdgpu_mn_read_lock
Document the new parameter.

Fixes: 93065ac753 ("mm, oom: distinguish blockable mode for mmu notifiers")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
ebc52c1692 drm/amdgpu: fix documentation for amdgpu_gem_prime_export
Drop extra function parameter.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
yu kuai
d0580c09c6 drm/amdgpu: remove excess function parameter description
Fixes gcc warning:

drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:431: warning: Excess function
parameter 'sw' description in 'vcn_v2_5_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:550: warning: Excess function
parameter 'sw' description in 'vcn_v2_5_enable_clock_gating'

Fixes: cbead2bdfc ("drm/amdgpu: add VCN2.5 VCPU start and stop")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Guchun Chen
e53aec7e41 drm/amdgpu: enable full ras by default
Enable full ras by default, user does not need to enable it by
boot parameter.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Jiange Zhao
57d4f3b7fd drm/amdgpu/SRIOV: add navi12 pci id for SRIOV (v2)
Add Navi12 PCI id support.

v2: flag as experimental for now (Alex)

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
7677b0dbce drm/amdgpu/gfx10: update gfx golden settings for navi14
update registers: mmUTCL1_CTRL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
aa4604b6e4 drm/amdgpu/gfx10: update gfx golden settings
update registers: mmUTCL1_CTRL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
ade9a34e7d drm/amdgpu: flag navi12 and 14 as experimental for 5.4
We can remove this later as things get closer to launch.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
01b40c98ed drm/amdgpu/psp: invalidate the hdp read cache before reading the psp response
Otherwise we may get stale data.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
e8186eeccb drm/amdgpu/psp: flush HDP write fifo after submitting cmds to the psp
We need to make sure the fifo is flushed before we ask the psp to
process the commands.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Guchun Chen
5222d26146 drm/amdgpu: remove redundant variable definition
No need to define the same variables in each loop of the function.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Guchun Chen
8a3e801f19 drm/amdgpu: avoid null pointer dereference
null ptr should be checked first to avoid null ptr access

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Hawking Zhang
fec6a08aae drm/amdgpu: do not init mec2 jt for renoir
For ASICs like renoir/arct, driver doesn't need to load mec2 jt.
when mec1 jt is loaded, mec2 jt will be loaded automatically
since the write is actaully broadcasted to both.

We need to more time to test other gfx9 asic. but for now we should
be able to draw conclusion that mec2 jt is not needed for renoir and
arct.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Hawking Zhang
2011eaea21 drm/amdgpu: add psp ip block for arct
enable psp block for firmware loading and other security
feature setup.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
a142ba8800 drm/amdgpu/ras: use GPU PAGE_SIZE/SHIFT for reserving pages
We are reserving vram pages so they should be aligned to the
GPU page size.

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Xiaojie Yuan
ec51d3facd drm/amdgpu/discovery: get gpu info from ip discovery table
except soc_bounding_box which is not integrated in discovery table yet

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tao Zhou
afa44809a4 drm/amdgpu: use GPU PAGE SHIFT for umc retired page
umc retired page belongs to vram and it should be aligned to gpu page
size

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
57516cdd74 drm/amdgpu: add navi12 pci id
Add Navi12 PCI id support.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tao Zhou
ae115c81ec drm/amdgpu: replace DRM_ERROR with DRM_WARN in ras_reserve_bad_pages
There are two cases of reserve error should be ignored:
1) a ras bad page has been allocated (used by someone);
2) a ras bad page has been reserved (duplicate error injection for one page);

DRM_ERROR is unnecessary for the failure of bad page reserve

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Adam Zerella
879e723df3 docs: drm/amdgpu: Resolve build warnings
Some of the documentation formatting could be improved
which will resolve some Sphinx amdgpu build warnings e.g

WARNING: Unexpected indentation.
WARNING: Block quote ends without a blank line; unexpected unindent.
WARNING: Inline emphasis start-string without end-string.

Signed-off-by: Adam Zerella <adam.zerella@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Alex Deucher
63b2b5e91b drm/amdgpu/vm: fix documentation for amdgpu_vm_bo_param
Add new parameters.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Bhawanpreet Lakha
143f230533 drm/amdgpu: psp DTM init
DTM is the display topology manager. This is needed to communicate with
psp about the display configurations.

This patch adds
    -Loading the firmware
    -The functions and definitions for communication with the firmware

v2: Fix formatting

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Bhawanpreet Lakha
ed19a9a2bb drm/amdgpu: psp HDCP init
This patch adds
-Loading the firmware
-The functions and definitions for communication with the firmware

v2: Fix formatting

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Christian König
de7b45babd drm/amdgpu: cleanup creating BOs at fixed location (v2)
The placement is something TTM/BO internal and the RAS code should
avoid touching that directly.

Add a helper to create a BO at a fixed location and use that instead.

v2: squash in fixes (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 08:06:54 -05:00
Andrey Grodzovsky
db338e1663 drm/amdgpu:Fix EEPROM checksum calculation.
Fix checksum calculation after manually resetting the table.
Unify reset and empty EEPROM init flow.
Protect the table reset with lock.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:44 -05:00
Guchun Chen
012dd14d1d drm/amdgpu: fix ras ctrl debugfs node leak
Use debugfs_remove_recursive to remove the whole debugfs
directory instead of removing the node one by one.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:38 -05:00
Christian König
1313dacfad drm/amdgpu: trace if a PD/PT update is done directly
This is usfull for debugging.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:32 -05:00
Christian König
bc51c1e56f drm/amdgpu: drop double HDP flush in the VM code
Already done in the CPU based backend code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:27 -05:00