Hawking Zhang
841933d5b8
drm/amdgpu: don't override default ECO_BITs setting
...
Leave this bit at the hardware default setting.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-12-14 17:50:36 -05:00
Lee Jones
5f7d8ee71e
drm/amd/amdgpu/mmhub_v9_4: Fix naming disparity with 'mmhub_v9_4_set_fault_enable_default()'
...
Fixes the following W=1 kernel build warning(s):
drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c:446: warning: expecting prototype for mmhub_v1_0_set_fault_enable_default(). Prototype was for mmhub_v9_4_set_fault_enable_default() instead
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:16 -04:00
Hawking Zhang
53ee6609b4
drm/amdgpu: only harvest gcea/mmea error status in arcturus
...
SDP RdRspStatus/WrRspStatus errors or the first parity error on
RdRsp data can cause a system fatal error on Arcturus.
The GPU will be frozen in such a case.
The driver needs to harvest this error information before
resetting the GPU. Check the error type to avoid harvesting normal
gcea/mmea information.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:35:45 -04:00
Oak Zeng
0ca565ab97
drm/amdgpu: Calling address translation functions to simplify codes
...
Use amdgpu_gmc_vram_pa and amdgpu_gmc_vram_cpu_pa
to simplify the code. No logic change.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-15 16:03:01 -04:00
Hawking Zhang
8bc7b360ad
drm/amdgpu: split mmhub callbacks into ras and non-ras ones
...
mmhub RAS is only available in certain mmhub IP
generations.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:51:19 -04:00
Nirmoy Das
68fce5f07c
drm/amdgpu: use AMDGPU_NUM_VMID when possible
...
Replace hardcoded vmid number with AMDGPU_NUM_VMID macro.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-08 23:05:40 -05:00
Alex Deucher
9b498efae2
drm/amdgpu: store noretry parameter per driver instance
...
This will allow us to have different defaults per asic
in a future patch.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:55:16 -04:00
Stanley.Yang
3f975d0f71
drm/amdgpu: update athub interrupt harvesting handle
...
A GCEA/MMHUB EA error should not result in a DF freeze. This is
fixed in the next generation, but for some reason a GCEA/MMHUB
EA error still results in a DF freeze in the previous generation.
The driver should avoid reporting GCEA/MMHUB EA errors as hw fatal
errors in the kernel message by reading the GCEA/MMHUB error status registers.
Changes from V1:
make the query_ras_error_status function more general
make reading the mmhub error status register more friendly
Changes from V2:
move the ras error status query function into the do_recovery workqueue
Changes from V3:
remove useless code from V2, print the GCEA error status
instance number
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 17:37:38 -04:00
Oak Zeng
9fb1506eb6
drm/amdgpu: Use function pointer for some mmhub functions
...
Add more function pointers to amdgpu_mmhub_funcs. The ASIC-specific
implementations of most mmhub functions are now called through a general
function pointer, instead of calling a different function for
each ASIC. Simplify the code by deleting duplicate functions.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
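The dispatch pattern described in commit 9fb1506eb6 can be illustrated with a minimal, standalone sketch. Every name below (device_sketch, mmhub_funcs_sketch, the v1_0_/v9_4_ helpers) is hypothetical and only mirrors the idea of a per-ASIC callback table; it is not the kernel's actual amdgpu_mmhub_funcs definition.

```c
#include <assert.h>
#include <stddef.h>

struct device_sketch;

/* Hypothetical callback table, standing in for the idea of amdgpu_mmhub_funcs. */
struct mmhub_funcs_sketch {
	int (*gart_enable)(struct device_sketch *dev);
};

struct device_sketch {
	const struct mmhub_funcs_sketch *mmhub_funcs;
	int hub_version; /* written by the callbacks so the dispatch is observable */
};

/* One implementation per (hypothetical) hub generation. */
static int v1_0_gart_enable(struct device_sketch *dev)
{
	dev->hub_version = 10;
	return 0;
}

static int v9_4_gart_enable(struct device_sketch *dev)
{
	dev->hub_version = 94;
	return 0;
}

static const struct mmhub_funcs_sketch v1_0_funcs = {
	.gart_enable = v1_0_gart_enable,
};

static const struct mmhub_funcs_sketch v9_4_funcs = {
	.gart_enable = v9_4_gart_enable,
};

/* Generic caller: no per-ASIC branching, only an indirect call. */
static int generic_gart_enable(struct device_sketch *dev)
{
	assert(dev != NULL && dev->mmhub_funcs != NULL);
	if (dev->mmhub_funcs->gart_enable == NULL)
		return -1; /* callback not provided for this hub */
	return dev->mmhub_funcs->gart_enable(dev);
}
```

The table is selected once, when the device is set up; after that, common code calls generic_gart_enable() without knowing which generation it is talking to.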
Huang Rui
714ec7a2bd
drm/amdgpu: use register distance member instead of hardcode in mmhub v9.4
...
This patch uses the register distance members instead of hardcoded values
in mmhub v9.4.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: AnZhong Huang <anzhong.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-08 09:04:08 -04:00
Huang Rui
1f9d56c309
drm/amdgpu: add register distance members into vmhub structure
...
This patch abstracts the register distances between two consecutive
context domains and invalidation engines. These distances may differ
between IP headers.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: AnZhong Huang <anzhong.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-08 09:03:00 -04:00
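As a rough illustration of the register distance idea from commits 1f9d56c309 and 714ec7a2bd, the sketch below computes a per-context or per-engine register offset from a base offset and a stride stored in the hub description. The struct and its field names are hypothetical stand-ins, not the kernel's vmhub structure, and the numeric values used with it are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-hub description; field names are illustrative only. */
struct vmhub_sketch {
	uint32_t ctx0_cntl_base;    /* offset of the context 0 control register */
	uint32_t ctx_distance;      /* stride between two consecutive contexts */
	uint32_t inv_eng0_req_base; /* offset of the engine 0 request register */
	uint32_t eng_distance;      /* stride between two consecutive engines */
};

/* Offset of the control register for context number ctx. */
static uint32_t ctx_cntl_offset(const struct vmhub_sketch *hub, unsigned int ctx)
{
	assert(hub != NULL);
	return hub->ctx0_cntl_base + hub->ctx_distance * ctx;
}

/* Offset of the request register for invalidation engine number eng. */
static uint32_t inv_eng_req_offset(const struct vmhub_sketch *hub, unsigned int eng)
{
	assert(hub != NULL);
	return hub->inv_eng0_req_base + hub->eng_distance * eng;
}
```

Keeping the stride in the hub description means that code shared between IP versions does not need a hardcoded multiplier; each hub fills in its own distances.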
John Clements
2b961e6a95
drm/amdgpu: update RAS related dmesg print
...
Prefix RAS error related dmesg prints with PCI device info.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-07 14:02:36 -04:00
Hawking Zhang
fe5211f19a
drm/amdgpu: add reset_ras_error_count function for MMHUB
...
MMHUB RAS error counters are dirty after a cold reboot.
A read operation is needed to reset them to 0.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-05 00:32:40 -05:00
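The reset-by-reading behavior described in commit fe5211f19a can be sketched with a simulated register bank. The array, the register count, and the helper names are all hypothetical; real hardware access goes through the driver's register accessors, which are not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_EDC_REGS 4 /* hypothetical number of counter registers */

/* Simulated clear-on-read counter registers. */
static uint32_t sketch_regs[SKETCH_NUM_EDC_REGS];

/* Reading a counter returns its value and clears it, as the commit describes. */
static uint32_t sketch_read_reg(size_t idx)
{
	uint32_t val;

	assert(idx < SKETCH_NUM_EDC_REGS);
	val = sketch_regs[idx];
	sketch_regs[idx] = 0;
	return val;
}

/* Discard stale counts by reading every counter once. */
static void sketch_reset_error_count(void)
{
	size_t i;

	for (i = 0; i < SKETCH_NUM_EDC_REGS; i++)
		(void)sketch_read_reg(i);
}
```

Calling the reset helper once during initialization throws away whatever the counters held after power-up, so later queries only report errors that occurred afterwards.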
Nirmoy Das
a9d4fe2fd6
drm/amdgpu: remove unnecessary conversion to bool
...
Better clean that up before some automation starts to complain about it
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Dennis Li
39aa0ef163
drm/amdgpu: enable RAS feature for more mmhub sub-blocks of Acrturus
...
Compared with Vg20, the size of mmhub range is changed from 2 to 8.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:35:56 -05:00
Zhigang Luo
20bf2f6fef
drm/amd/amdgpu: L1 Policy(1/5) - removed VM settings for mmhub and gfxhub from VF
...
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Jane Jian <jane.jian@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:00:17 -05:00
Frank.Min
55d62fe10f
drm/amdgpu: remove FB location config for sriov
...
The FB location is already programmed by the HV driver
for Arcturus, so remove this part.
Signed-off-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:43 -05:00
Yong Zhao
6dcab16b41
drm/amdkfd: Contain MMHUB number in mmhub_v9_4_setup_vm_pt_regs()
...
Adjust the exposed function prototype so that the caller does not need
to know the MMHUB number.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:08:24 -05:00
Oak Zeng
3d3f9ba8c4
drm/amdgpu: Apply noretry setting for mmhub9.4
...
Configure the translation retry behavior according to the noretry
kernel parameter.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-25 11:19:55 -05:00
changzhu
dab5ef2722
drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
...
SW must acquire/release one of the vm_invalidate_eng*_sem around the
invalidation req/ack. This way, it can avoid losing the invalidate
acknowledge state across a power-gating off cycle.
To use vm_invalidate_eng*_sem, it must be initialized
first.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
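The acquire, invalidate, release ordering described in commit dab5ef2722 can be sketched as follows. This is a simulation under stated assumptions: the engine is modeled as a plain struct, the semaphore as a boolean, and the request/ack step as a counter increment. None of these names exist in the driver, and the bounded retry count is a hypothetical choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one invalidation engine. */
struct inv_engine_sketch {
	bool sem_held;          /* true while someone owns the semaphore */
	unsigned int req_count; /* number of invalidation requests issued */
};

/* Try to take the semaphore; fails if it is already held. */
static bool sem_try_acquire(struct inv_engine_sketch *eng)
{
	if (eng->sem_held)
		return false;
	eng->sem_held = true;
	return true;
}

static void sem_release(struct inv_engine_sketch *eng)
{
	eng->sem_held = false;
}

/* Issue one invalidation with the semaphore held around the req/ack step. */
static int invalidate_with_sem(struct inv_engine_sketch *eng, unsigned int max_tries)
{
	unsigned int i;

	assert(eng != NULL);
	for (i = 0; i < max_tries; i++) {
		if (sem_try_acquire(eng))
			break;
	}
	if (i == max_tries)
		return -1; /* could not acquire the semaphore in time */

	eng->req_count++; /* stand-in for writing the request and polling the ack */
	sem_release(eng);
	return 0;
}
```

The point of the ordering is that the request and its acknowledgement both happen while the semaphore is owned, and that the semaphore is always released afterwards.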
Dennis Li
f6c3623b7b
drm/amdgpu: implement querying ras error count for mmhub9.4
...
Get mmhub error counter by accessing EDC_CNT registers.
v2: Add mmhub_v9_4_ prefix for local static variable and function
Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
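Reading error counts out of EDC_CNT registers, as commit f6c3623b7b describes, typically comes down to masking and shifting bit fields. The sketch below shows that step only. The field layout, the split into correctable (ce) and uncorrectable (ue) totals, and all names are assumptions for illustration, not the actual register definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one counter field inside a register value. */
struct edc_field_sketch {
	uint32_t mask;      /* bits occupied by the field */
	unsigned int shift; /* position of the field's lowest bit */
	int uncorrectable;  /* nonzero: add to ue, zero: add to ce */
};

struct err_count_sketch {
	uint64_t ce; /* correctable error total */
	uint64_t ue; /* uncorrectable error total */
};

/* Extract every field from reg_value and add it to the matching total. */
static void edc_accumulate(uint32_t reg_value,
			   const struct edc_field_sketch *fields, size_t n,
			   struct err_count_sketch *out)
{
	size_t i;

	assert(fields != NULL && out != NULL);
	for (i = 0; i < n; i++) {
		uint32_t cnt = (reg_value & fields[i].mask) >> fields[i].shift;

		if (fields[i].uncorrectable)
			out->ue += cnt;
		else
			out->ce += cnt;
	}
}
```

Describing the fields in a table keeps the extraction loop generic, so adding another sub-block only means adding another table entry.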
Alex Deucher
e2f619aa14
drm/amdgpu/arcturus: properly set BANK_SELECT and FRAGMENT_SIZE
...
These were not aligned for optimal performance for GPUVM.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 14:20:08 -05:00
Felix Kuehling
7cae706193
drm/amdgpu: Disable retry faults in VMID0
...
There is no point retrying page faults in VMID0. Those faults are
always fatal.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-and-Tested-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:34 -05:00
Le Ma
a840159c82
drm/amdgpu: enable mmhub clock gating for Arcturus
...
Init MC_MGCG/LS flag. Also apply to athub CG.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Le Ma
cb15e8046d
drm/amdgpu: add mmhub clock gating for Arcturus
...
Add clock gating (CG) for the 2 mmhub instances.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Le Ma
6c54afc7e8
drm/amdgpu: assign fb_start/end in mmhub v9.4 interface
...
Align with mmhub v1.0.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Le Ma
7d0670f441
drm/amdgpu: set system aperture to cover whole FB region in mmhub v9.4
...
In an XGMI configuration, the FB region covers the vram regions of peer
devices. Adjust the system aperture to cover all of them.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:05 -05:00
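The adjustment in commit 7d0670f441 amounts to choosing an aperture that encloses every device's share of the FB region. A minimal sketch of that computation, assuming inclusive address ranges and a hypothetical range_sketch type (the real driver programs hardware registers, which is not shown), could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range, e.g. one device's vram window. */
struct range_sketch {
	uint64_t start;
	uint64_t end;
};

/* Smallest single range that covers all n input ranges (n must be > 0). */
static struct range_sketch covering_range(const struct range_sketch *r, size_t n)
{
	struct range_sketch out;
	size_t i;

	assert(r != NULL && n > 0);
	out = r[0];
	for (i = 1; i < n; i++) {
		if (r[i].start < out.start)
			out.start = r[i].start;
		if (r[i].end > out.end)
			out.end = r[i].end;
	}
	return out;
}
```

With the local window plus the peer windows as input, the result spans from the lowest start to the highest end, which is the range the aperture would need to cover.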
Yong Zhao
c9ffdf5acd
drm/amdgpu: Set VM_L2_CNTL.PDE_FAULT_CLASSIFICATION to 0 for MMHUB 9.4
...
Should be set to 0 for mmhub 9.4.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:04 -05:00
Yong Zhao
6d5311ab2c
drm/amdkfd: Expose function mmhub_v9_4_setup_vm_pt_regs() for kfd to use
...
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:04 -05:00
Le Ma
2cb2ea1e07
drm/amdgpu: add mmhub v9.4.1 block for Arcturus (v2)
...
Arcturus has an updated mmhub block. mmhub is the
memory controller hub used for sdma and multimedia.
v2: squash in AGP BAR programming (Alex)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:02 -05:00