Commit Graph

11123 Commits

Author SHA1 Message Date
Felix Kuehling
fc6ea4bee1 drm/amdgpu: Wipe all VRAM on free when RAS is enabled
On GPUs with RAS, poison can propagate between processes if VRAM is not
cleared when it is freed or allocated. The reason is, that not all write
accesses clear RAS poison. 32-byte writes by the SDMA engine do clear RAS
poison. Clearing memory in the background when it is freed should avoid
major performance impact. KFD has been doing this already for a long time.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-27 15:48:41 -05:00
Tianci.Yin
7270e8957e drm/amdgpu: Fix an error message in rmmod
[why]
In rmmod procedure, kfd sends cp a dequeue request, but the
request does not get response, then an error message "cp
queue pipe 4 queue 0 preemption failed" printed.

[how]
Performing kfd suspending after disabling gfxoff can fix it.

Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-27 15:48:30 -05:00
Victor Zhao
039cacd239 drm/amdgpu: add determine passthrough under arm64
add determine for passthrough mode under arm64 by reading
CurrentEL register

v2: squash in warning fix (Alex)

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-27 15:47:34 -05:00
Christian König
3f268ef06f drm/ttm: add back a reference to the bdev to the res manager
It is simply a lot cleaner to have this around instead of adding
the device throughout the call chain.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220124122514.1832-3-christian.koenig@amd.com
2022-01-26 15:29:24 +01:00
Christian König
de3688e469 drm/ttm: add ttm_resource_fini v2
Make sure we call the common cleanup function in all
implementations of the resource manager.

v2: fix missing case in i915, rudimentary kerneldoc, should be
    filled in more when we add more functionality

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220124122514.1832-2-christian.koenig@amd.com
2022-01-26 15:23:51 +01:00
Tim Huang
37d6b1506b drm/amdgpu: convert to UVD IP version checking
Use IP versions rather than asic_type to differentiate IP version specific features.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:36 -05:00
Tim Huang
d726d43c20 drm/amdgpu: convert to NBIO IP version checking
Use IP versions rather than asic_type to differentiate IP version specific features.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:36 -05:00
Alex Deucher
82c3a7a5ed drm/amdgpu: convert amdgpu_display_supported_domains() to IP versions
Check IP versions rather than asic types.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:36 -05:00
Alex Deucher
243c719e87 drm/amdgpu: handle BACO synchronization with secondary funcs
Extend secondary function handling for runtime pm beyond audio
to USB and UCSI.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:36 -05:00
Alex Deucher
d0d66b8c66 drm/amdgpu: move runtime pm init after drm and fbdev init
Seems more logical to enable runtime pm at the end of
the init sequence so we don't end up entering runtime
suspend before init is finished.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:36 -05:00
Alex Deucher
901e2be20d drm/amdgpu: move PX checking into amdgpu_device_ip_early_init
We need to set the APU flag from IP discovery before
we evaluate this code.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:36 -05:00
Alex Deucher
212021297e drm/amdgpu: set APU flag based on IP discovery table
Use the IP versions to set the APU flag when necessary.

Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:36 -05:00
yipechai
e9287ef8d4 Revert "drm/amdgpu: No longer insert ras blocks into ras_list if it already exists in ras_list"
This reverts commit df4f0041c6.

Xgmi ras initialization had been moved from .late_init to early_init,
the defect of repeated calling amdgpu_ras_register_ras_block had been
fixed, so revert this patch.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:34 -05:00
yipechai
1f33bd18d7 drm/amdgpu: Move xgmi ras initialization from .late_init to .early_init
Move xgmi ras initialization from .late_init to .early_init, which let
xgmi ras can be initialized only once.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Stanley.Yang
d6dac2bc12 drm/amdgpu: fix channel index mapping for SIENNA_CICHLID
Pmfw read ecc info registers in the following order,
     umc0: ch_inst 0, 1, 2 ... 7
     umc1: ch_inst 0, 1, 2 ... 7
The position of the register value stored in eccinfo
table is calculated according to the below formula,
     channel_index = umc_inst * channel_in_umc + ch_inst
Driver directly use the index of eccinfo table array
as channel index, it's not correct, driver needs convert
eccinfo table array index to channel index according to
channel_idx_tbl.

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
04022982fc drm/amdgpu: switch to common helper to read bios from rom
create a common helper function for soc15 and onwards
to read bios image from rom

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
381519dff8 drm/amdgpu: retire rlc callbacks sriov_rreg/wreg
Not needed anymore.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
1b2dc99e2d drm/amdgpu: switch to amdgpu_sriov_rreg/wreg
Instead of ip specific implementation for rlcg
indirect register access

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
5d447e2967 drm/amdgpu: add helper for rlcg indirect reg access
The helper will be used to access registers from sriov
guest in full access time

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
f8f96b17f0 drm/amdgpu: init rlcg_reg_access_ctrl for gfx10
Initialize all the register offsets that will be
used in rlcg indirect reg access path for gfx10
in sw_init phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
4819732f59 drm/amdgpu: init rlcg_reg_access_ctrl for gfx9
Initialize all the register offsets that will be
used in rlcg indirect reg access path for gfx9 in
sw_init phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
b12252b053 drm/amdgpu: add structures for rlcg indirect reg access
Add structures that are used to cache registers offsets
for rlcg indirect reg access ctrl and flag availability
of such interface

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
7bbe43f8a4 drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx10
Switch to common helper to query rlcg access flag
specified by sriov host driver for gfx10

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
97d1a3b967 drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx9
Switch to common helper to query rlcg access flag
specified by sriov host driver for gfx9

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Hawking Zhang
29dbcac82f drm/amdgpu: add helper to query rlcg reg access flag
Query rlc indirect register access approach specified
by sriov host driver per ip blocks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Xin Xiong
dfced44f12 drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj
This issue takes place in an error path in
amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into
default case, the function simply returns -EINVAL, forgetting to
decrement the reference count of a dma_fence obj, which is bumped
earlier by amdgpu_cs_get_fence(). This may result in reference count
leaks.

Fix it by decreasing the refcount of specific object before returning
the error code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Alex Deucher
25c6aefcee drm/amdgpu: filter out radeon secondary ids as well
Older radeon boards (r2xx-r5xx) had secondary PCI functions
which we solely there for supporting multi-head on OSs with
special requirements.  Add them to the unsupported list
as well so we don't attempt to bind to them.  The driver
would fail to bind to them anyway, but this does so
in a cleaner way that should not confuse the user.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:32 -05:00
Guchun Chen
f9130b81ae drm/amdgpu: drop WARN_ON in amdgpu_gart_bind/unbind
NULL pointer check has guarded it already.

calltrace:
amdgpu_ttm_gart_bind+0x49/0xa0 [amdgpu]
amdgpu_ttm_alloc_gart+0x13f/0x180 [amdgpu]
amdgpu_bo_create_reserved+0x139/0x2c0 [amdgpu]
? amdgpu_ttm_debugfs_init+0x120/0x120 [amdgpu]
amdgpu_bo_create_kernel+0x17/0x80 [amdgpu]
amdgpu_ttm_init+0x542/0x5e0 [amdgpu]

Fixes: 1b08dfb889 ("drm/amdgpu: remove gart.ready flag")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:32 -05:00
Lang Yu
6a6c2ab687 drm/amdgpu: enable amdgpu_dc module parameter
It doesn't work under IP discovery mode. Make it work!

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Alex Deucher <aleander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:32 -05:00
Mario Limonciello
828904660a drm/amd: Fix MSB of SMU version printing
Yellow carp has been outputting versions like `1093.24.0`, but this
is supposed to be 69.24.0. That is the MSB is being interpreted
incorrectly.

The MSB is not part of the major version, but has generally been
treated that way thus far.  It's actually the program, and used to
distinguish between two programs from a similar family but different
codebase.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:32 -05:00
shaoyunl
901abf367d drm/amdgpu: Disable FRU EEPROM access for SRIOV
VF acces the EEPROM is blocked by security policy, we might need other way
to get SKUs info for VF

v2: squash in compilation fix from Luben

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:32 -05:00
Alex Deucher
9e5a14bce2 drm/amdgpu: filter out radeon secondary ids as well
Older radeon boards (r2xx-r5xx) had secondary PCI functions
which we solely there for supporting multi-head on OSs with
special requirements.  Add them to the unsupported list
as well so we don't attempt to bind to them.  The driver
would fail to bind to them anyway, but this does so
in a cleaner way that should not confuse the user.

Cc: stable@vger.kernel.org
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 17:50:13 -05:00
Maxime Ripard
4adc33f36d drm/edid: Split deep color modes between RGB and YUV444
The current code assumes that the RGB444 and YUV444 formats are the
same, but the HDMI 2.0 specification states that:

   The three DC_XXbit bits above only indicate support for RGB 4:4:4 at
   that pixel size. Support for YCBCR 4:4:4 in Deep Color modes is
   indicated with the DC_Y444 bit. If DC_Y444 is set, then YCBCR 4:4:4
   is supported for all modes indicated by the DC_XXbit flags.

So if we have YUV444 support and any DC_XXbit flag set but the DC_Y444
flag isn't, we'll assume that we support that deep colour mode for
YUV444 which breaks the specification.

In order to fix this, let's split the edid_hdmi_dc_modes field in struct
drm_display_info into two fields, one for RGB444 and one for YUV444.

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: d0c94692e0 ("drm/edid: Parse and handle HDMI deep color modes.")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220120151625.594595-4-maxime@cerno.tech
2022-01-25 10:01:24 +01:00
Christian König
b3bddb7a38 drm/amdgpu: use ttm_resource_manager_debug
Instead of calling the debug operation directly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211124124430.20859-10-christian.koenig@amd.com
2022-01-24 11:24:06 +01:00
Xiaojian Du
a357dca964 drm/amdgpu: fix the page fault caused by uninitialized variables
This patch will fix the page fault caused by uninitialized variables.

Error Log:
......
[ 130.246323] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 131.963112] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
[ 131.963130] BUG: unable to handle page fault for address: 000000000002db80
[ 131.963181] #PF: supervisor write access in kernel mode
[ 131.963210] #PF: error_code(0x0002) - not-present page
[ 131.963233] PGD 0 P4D 0
[ 131.963253] Oops: 0002 [#1] SMP NOPTI
[ 131.963273] CPU: 3 PID: 1411 Comm: modprobe Not tainted 5.13.0+ #1
[ 131.963338] RIP: 0010:osq_lock+0x4d/0x120
[ 131.963381] Code: 10 00 00 00 00 48 c7 02 00 00 00 00 89 42 14 87 07 85 c0 0f 84 d0 00 00 00 83 e8 01 48 98 48 03 0c c5 00 d9 ea 9c 48 89 4a 08 <48> 89 11 44 8b 42 10 45 85 c0 0f 85 af 00 00 00 55 48 89 fe 65 4c
[ 131.963460] RSP: 0018:ffffa40481717768 EFLAGS: 00010202
[ 131.963483] RAX: fffffffffffffffe RBX: ffffa40481717920 RCX: 000000000002db80
[ 131.963520] RDX: ffff9256fecedb80 RSI: ffff9256cbed2e80 RDI: ffffa40481717ac4
[ 131.963547] RBP: ffffa40481717808 R08: ffffa40481717920 R09: 00000000ffffffff
[ 131.963582] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
[ 131.963609] R13: ffffa40481717ac4 R14: ffffa40481717ab8 R15: ffff9256c9480000
[ 131.963646] FS: 00007f23d9b9c540(0000) GS:ffff9256fecc0000(0000) knlGS:0000000000000000
[ 131.963687] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 131.963721] CR2: 000000000002db80 CR3: 0000000008444000 CR4: 00000000000506e0
[ 131.963758] Call Trace:
[ 131.963772] ? __ww_mutex_lock.isra.0+0x3a2/0x760
[ 131.963810] ? prb_read_valid+0x1c/0x20
[ 131.963830] ? console_unlock+0x2fe/0x4f0
[ 131.963849] __ww_mutex_lock_interruptible_slowpath+0x16/0x20
[ 131.963882] ww_mutex_lock_interruptible+0x83/0x90
[ 131.963908] amdgpu_bo_create_reserved+0xf0/0x1e0 [amdgpu]
[ 131.964237] amdgpu_bo_create_kernel+0x17/0x80 [amdgpu]
[ 131.964509] amdgpu_gmc_vram_checking+0x41/0xf0 [amdgpu]
[ 131.964807] gmc_v10_0_hw_init+0x105/0x120 [amdgpu]
[ 131.965108] amdgpu_device_init.cold+0x1aa4/0x1e3e [amdgpu]
......

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-21 16:25:49 -05:00
Stanley.Yang
37ff945f80 drm/amdgpu: fix convert bad page retiremt
Pmfw read ecc info registers and store values in
eccinfo_table in the following order

umc0 ch_inst 0, 1, 2 ... 7
umc1 ch_inst 0, 1, 2 ... 7
...
umc3 ch_inst 0, 1, 2 ... 7

Driver should convert eccinfo_table_idx to channel_index according
to channel_idx_tbl.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-21 16:25:36 -05:00
Linus Torvalds
c2c94b3b18 Merge tag 'drm-next-2022-01-21' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Thanks to Daniel for taking care of things while I was out, just a set
  of merge window fixes that came in this week, two i915 display fixes
  and a bunch of misc amdgpu, along with a radeon regression fix.

  amdgpu:
   - SR-IOV fix
   - VCN harvest fix
   - Suspend/resume fixes
   - Tahiti fix
   - Enable GPU recovery on yellow carp

  radeon:
   - Fix error handling regression in radeon_driver_open_kms

  i915:
   - Update EHL display voltage swing table
   - Fix programming the ADL-P display TC voltage swing"

* tag 'drm-next-2022-01-21' of git://anongit.freedesktop.org/drm/drm:
  drm/radeon: fix error handling in radeon_driver_open_kms
  drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV
  drm/amdgpu: apply vcn harvest quirk
  drm/i915/display/adlp: Implement new step in the TC voltage swing prog sequence
  drm/i915/display/ehl: Update voltage swing table
  drm/amd/display: Revert W/A for hard hangs on DCN20/DCN21
  drm/amdgpu: drop flags check for CHIP_IP_DISCOVERY
  drm/amdgpu: Fix rejecting Tahiti GPUs
  drm/amdgpu: don't do resets on APUs which don't support it
  drm/amdgpu: invert the logic in amdgpu_device_should_recover_gpu()
  drm/amdgpu: Enable recovery on yellow carp
2022-01-21 09:25:38 +02:00
Eric Huang
f61c40c075 drm/amdkfd: enable heavy-weight TLB flush on Arcturus
SDMA FW fixes the hang issue for adding heavy-weight TLB
flush on Arcturus, so we can enable it.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:33:27 -05:00
Jonathan Kim
590e86fe34 drm/amdgpu: fix broken debug sdma vram access function
Debug VRAM access through SDMA has several broken parts resulting in
silent MMIO fallback.

BO kernel creation takes the location of the cpu addr pointer, not
the pointer itself for address kmap.

drm_dev_enter return true on success so change access check.

The source BO is reserved but not pinned so find the address using the
cursor offset relative to its memory domain start.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:33:03 -05:00
Christian König
1b08dfb889 drm/amdgpu: remove gart.ready flag
That's just a leftover from old radeon days and was preventing CS and GART
bindings before the hardware was initialized. But nowdays that is
perfectly valid.

The only thing we need to warn about are GART binding before the table
is even allocated.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:32:47 -05:00
Stanley.Yang
5904e4135f drm/amdgpu: remove unused variable warning
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:32:38 -05:00
mziya
33cd016e60 drm/amdgpu: remove unused variable
Remove set but unused variable.
warning: variable 'umc_reg_offset' set but not used

Signed-off-by: mziya <Mohammadzafar.ziya@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:32:28 -05:00
yipechai
8eb53bb2aa drm/amdgpu: Remove repeated calls
Remove repeated calls.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:32:19 -05:00
Xiaojian Du
86700a4026 drm/amdgpu: modify a pair of functions for the pcie port wreg/rreg
This patch will modify a pair of functions for pcie port wreg/rreg.
AMD GPU have had an independent NBIO block from SOC15 arch.
If the dirver wants to read/write the address space of the pcie devices,
it has to go through the NBIO block.
This patch will move the pcie port wreg/rreg functions to
"amdgpu_device.c", so that to reuse the functions on the
future GPU ASICs.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:32:11 -05:00
Xiaojian Du
479e3b02b7 drm/amdgpu: add vram check function for GMC
This patch will add vram check function for GMC block.
It will write pattern data to the vram and then read back from the vram,
so that to verify the work status of vram.
This patch  will cover gmc v6/7/8/9/10.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:31:51 -05:00
Christian König
75ab2b3633 dma-buf: drop excl_fence parameter from dma_resv_get_fences
Returning the exclusive fence separately is no longer used.

Instead add a write parameter to indicate the use case.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211207123411.167006-4-christian.koenig@amd.com
2022-01-19 10:03:56 +01:00
Christian König
acde6234f6 drm/amdgpu: remove excl as shared workarounds
This was added because of the now dropped shared on excl dependency.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211123142111.3885-15-christian.koenig@amd.com
2022-01-19 08:25:50 +01:00
Jingwen Chen
9a458402fb drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV
[Why]
This fixes 892deb4826 ("drm/amdgpu: Separate vf2pf work item init from virt data exchange").
we should read pf2vf data based at mman.fw_vram_usage_va after gmc
sw_init. commit 892deb4826 breaks this logic.

[How]
calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to
set the right base in the right sequence.

v2:
call amdgpu_virt_init_data_exchange after gmc sw_init to make data
exchange workqueue run

v3:
clean up the code logic

v4:
add some comment and make the code more readable

Fixes: 892deb4826 ("drm/amdgpu: Separate vf2pf work item init from virt data exchange")
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18 18:00:58 -05:00
Guchun Chen
520d9cd267 drm/amdgpu: apply vcn harvest quirk
This is a following patch to apply the workaround only on
those boards with a bad harvest table in ip discovery.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18 18:00:58 -05:00
Minghao Chi
a5e7ffa119 amdgpu/amdgpu_psp: remove unneeded ret variable
Return value from amdgpu_bo_create_kernel() directly instead
of taking this in another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18 17:43:36 -05:00