Commit Graph

2133 Commits

Author SHA1 Message Date
Ivan Mironov
7e89e4aaa9 drm/amd/powerplay: Fix NULL dereference in lock_bus() on Vega20 w/o RAS
I updated my system with Radeon VII from kernel 5.6 to kernel 5.7, and
following started to happen on each boot:

	...
	BUG: kernel NULL pointer dereference, address: 0000000000000128
	...
	CPU: 9 PID: 1940 Comm: modprobe Tainted: G            E     5.7.2-200.im0.fc32.x86_64 #1
	Hardware name: System manufacturer System Product Name/PRIME X570-P, BIOS 1407 04/02/2020
	RIP: 0010:lock_bus+0x42/0x60 [amdgpu]
	...
	Call Trace:
	 i2c_smbus_xfer+0x3d/0xf0
	 i2c_default_probe+0xf3/0x130
	 i2c_detect.isra.0+0xfe/0x2b0
	 ? kfree+0xa3/0x200
	 ? kobject_uevent_env+0x11f/0x6a0
	 ? i2c_detect.isra.0+0x2b0/0x2b0
	 __process_new_driver+0x1b/0x20
	 bus_for_each_dev+0x64/0x90
	 ? 0xffffffffc0f34000
	 i2c_register_driver+0x73/0xc0
	 do_one_initcall+0x46/0x200
	 ? _cond_resched+0x16/0x40
	 ? kmem_cache_alloc_trace+0x167/0x220
	 ? do_init_module+0x23/0x260
	 do_init_module+0x5c/0x260
	 __do_sys_init_module+0x14f/0x170
	 do_syscall_64+0x5b/0xf0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
	...

Error appears when some i2c device driver tries to probe for devices
using adapter registered by `smu_v11_0_i2c_eeprom_control_init()`.
Code supporting this adapter requires `adev->psp.ras.ras` to be not
NULL, which is true only when `amdgpu_ras_init()` detects HW support by
calling `amdgpu_ras_check_supported()`.

Before 9015d60c9e, adapter was registered by

	-> amdgpu_device_ip_init()
	  -> amdgpu_ras_recovery_init()
	    -> amdgpu_ras_eeprom_init()
	      -> smu_v11_0_i2c_eeprom_control_init()

after verifying that `adev->psp.ras.ras` is not NULL in
`amdgpu_ras_recovery_init()`. Currently it is registered
unconditionally by

	-> amdgpu_device_ip_init()
	  -> pp_sw_init()
	    -> hwmgr_sw_init()
	      -> vega20_smu_init()
	        -> smu_v11_0_i2c_eeprom_control_init()

Fix simply adds HW support check (ras == NULL => no support) before
calling `smu_v11_0_i2c_eeprom_control_{init,fini}()`.

Please note that there is a chance that similar fix is also required for
CHIP_ARCTURUS. I do not know whether any actual Arcturus hardware without
RAS exist, and whether calling `smu_i2c_eeprom_init()` makes any sense
when there is no HW support.

Cc: stable@vger.kernel.org
Fixes: 9015d60c9e ("drm/amdgpu: Move EEPROM I2C adapter to amdgpu_device")
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
Tested-by: Bjorn Nostvold <bjorn.nostvold@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-06-25 13:42:15 -04:00
Sandeep Raghuraman
790243d3bf drm/amdgpu: Replace invalid device ID with a valid device ID
Initializes Powertune data for a specific Hawaii card by fixing what
looks like a typo in the code. The device ID 66B1 is not a supported
device ID for this driver, and is not mentioned elsewhere. 67B1 is a
valid device ID, and is a Hawaii Pro GPU.

I have tested on my R9 390 which has device ID 67B1, and it works
fine without problems.

Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-06-11 16:05:09 -04:00
Evan Quan
cc831b9781 drm/amd/powerplay: ack the SMUToHost interrupt on receive V2
There will be no further interrupt without proper ack
for current one.

V2: fix typo to really set ACK bit only

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-29 13:56:30 -04:00
Alex Deucher
54f78a7655 drm/amdgpu: add apu flags (v2)
Add some APU flags to simplify handling of different APU
variants.  It's easier to understand the special cases
if we use names flags rather than checking device ids and
silicon revisions.

v2: rebase on latest code

Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-22 13:41:53 -04:00
Gustavo A. R. Silva
94f2026bd8 drm/amdgpu/smu10: Replace one-element array and use struct_size() helper
The current codebase makes use of one-element arrays in the following
form:

struct something {
    int length;
    u8 data[1];
};

struct something *instance;

instance = kmalloc(sizeof(*instance) + size, GFP_KERNEL);
instance->length = size;
memcpy(instance->data, source, size);

but the preferred mechanism to declare variable-length types such as
these ones is a flexible array member[1][2], introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on. So, replace
the one-element array with a flexible-array member.

Also, make use of the new struct_size() helper to properly calculate the
size of struct smu10_voltage_dependency_table.

This issue was found with the help of Coccinelle and, audited and fixed
_manually_.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21 12:48:43 -04:00
Evan Quan
27a468eac5 drm/amd/powerplay: unify the prompts on thermal interrupts
The prompts will contain pci address(segment/bus/port/function),
severity(warn or error) and some keywords(GPU, amdgpu). Also this
address the issue that pci bus retrieved by PCI_BUS_NUM(adev->pdev->devfn)
is wrong.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21 12:48:42 -04:00
John Clements
30c296e1c1 drm/amdgpu: resolve ras recovery vs smi race condition
during ras recovery block smu access via smi

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21 12:46:51 -04:00
John Clements
b7f0656a25 drm/amdgpu: Updated XGMI power down control support check
Updated SMC FW version check to determine if XGMI power down control is supported

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14 17:43:59 -04:00
John Clements
ab9c21124d drm/amdgpu: Add cmd to control XGMI link sleep
Added host to SMU FW cmd to enable/disable XGMI link power down

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14 17:42:15 -04:00
Evan Quan
cd598d6cfd drm/amd/powerplay: report correct AC/DC event based on ctxid V2
'ctxid' is used to distinguish different events raised from SMC.
0x3 and 0x4 are for AC and DC power mode.

V2: update the way to retrieve the ctxid and change the log level
    to debug

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14 16:42:44 -04:00
Evan Quan
e528ccf932 drm/amd/powerplay: shutdown on HW CTF
To prevent further damage.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14 16:42:44 -04:00
Evan Quan
9495220577 drm/amd/powerplay: try to do a graceful shutdown on SW CTF
Normally this(SW CTF) should not happen. And by doing graceful
shutdown we can prevent further damage.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14 16:42:44 -04:00
Dave Airlie
49eea1c657 Merge tag 'amd-drm-next-5.8-2020-05-12' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.8-2020-05-12:

amdgpu:
- Misc cleanups
- RAS fixes
- Expose FP16 for modesetting
- DP 1.4 compliance test fixes
- Clockgating fixes
- MAINTAINERS update
- Soft recovery for gfx10
- Runtime PM cleanups
- PSP code cleanups

amdkfd:
- Track GPU memory utilization per process
- Report PCI domain in topology

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200512213703.4039-1-alexander.deucher@amd.com
2020-05-14 13:21:33 +10:00
Jane Jian
feb000fdff drm/amd/powerplay: skip judging if baco support for Arcturus sriov
since for sriov, baco happens on host side, no need and meaning
to judge is baco.
also, since kiq reads strap0 in here, if kiq is not ready
or gpu reset(kiq resume) happens after this read, would fail
to read and wrongly set baco as true(1).

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-11 18:03:01 -04:00
Evan Quan
85625e6429 drm/amdgpu: enable hibernate support on Navi1X
BACO is needed to support hibernate on Navi1X.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08 14:32:16 -04:00
Dave Airlie
370fb6b0aa Merge tag 'amd-drm-next-5.8-2020-04-30' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.8-2020-04-30:

amdgpu:
- SR-IOV fixes
- SDMA fix for Navi
- VCN 2.5 DPG fixes
- Display fixes
- Display stuttering fixes for pageflip and cursor
- Add support for handling encrypted GPU memory
- Add UAPI for encrypted GPU memory
- Rework IB pool handling

amdkfd:
- Expose asic revision in topology
- Add UAPI for GWS (Global Wave Sync) resource management

UAPI:
- Add amdgpu UAPI for encrypted GPU memory
  Used by: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401
- Add amdkfd UAPI for GWS (Global Wave Sync) resource management
  Thunk usage of KFD ioctl: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/roc-2.8.0/src/queues.c#L840
  ROCr usage of Thunk API: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/roc-3.1.0/src/core/runtime/amd_gpu_agent.cpp#L597
  HCC code using ROCr API: 98ee9f3494/lib/hsa/mcwamp_hsa.cpp (L2161)
  HIP code using HCC API: cf8589b8c8/src/hip_module.cpp (L567)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430212951.3902-1-alexander.deucher@amd.com
2020-05-08 13:31:08 +10:00
ChenTao
7f6778b114 drm/amdgpu/navi10: fix unsigned comparison with 0
Fixes warning because size is uint32_t and can never be negtative

drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c:1296:12: warning:
comparison of unsigned expression < 0 is always false [-Wtype-limits]
   if (size < 0)

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: ChenTao <chentao107@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-06 16:51:39 -04:00
Evan Quan
74577c3a48 drm/amd/powerplay: perform PG ungate prior to CG ungate
Since gfxoff should be disabled first before trying to access those
GC registers.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-06 16:51:15 -04:00
Jason Yan
50654d7bca drm/amdgpu/smu10: remove duplicate assignment of smu10_hwmgr_funcs members
The struct member 'asic_setup' was assigned twice, let's remove one:

static const struct pp_hwmgr_func smu10_hwmgr_funcs = {
	......
	.asic_setup = NULL,
	......
	.asic_setup = smu10_setup_asic_task,
	......
};

This fixes the following coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c:1357:52-53:
asic_setup: first occurrence line 1360, second occurrence line 1388

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-30 16:48:07 -04:00
Dave Airlie
937eea297e Merge tag 'amd-drm-next-5.8-2020-04-24' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.8-2020-04-24:

amdgpu:
- Documentation improvements
- Enable FRU chip access on boards that support it
- RAS updates
- SR-IOV updates
- Powerplay locking fixes for older SMU versions
- VCN DPG (dynamic powergating) cleanup
- VCN 2.5 DPG enablement
- Rework GPU scheduler handling
- Improve scheduler priority handling
- Add SPM (streaming performance monitor) golden settings for navi
- GFX10 clockgating fixes
- DC ABM (automatic backlight modulation) fixes
- DC cursor and plane fixes
- DC watermark fixes
- DC clock handling fixes
- DC color management fixes
- GPU reset fixes
- Clean up MMIO access macros
- EEPROM access fixes
- Misc code cleanups

amdkfd:
- Misc code cleanups

radeon:
- Clean up safe reg list generation
- Misc code cleanups

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424190827.4542-1-alexander.deucher@amd.com
2020-04-30 11:08:54 +10:00
Tiecheng Zhou
c7833d332e drm/amd/powerplay: avoid using pm_en before it is initialized revised
hwmgr->pm_en is initialized at hwmgr_hw_init.

during amdgpu_device_init, there is amdgpu_asic_reset that calls to
soc15_asic_reset (for V320 usecase, Vega10 asic), in which:
1) soc15_asic_reset_method calls to pp_get_asic_baco_capability (pm_en)
2) soc15_asic_baco_reset calls to pp_set_asic_baco_state (pm_en)

pm_en is used in the above two cases while it has not yet been initialized

So avoid using pm_en in the above two functions for V320 passthrough.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28 11:44:11 -04:00
Tiecheng Zhou
a1cd1289a6 Revert "drm/amd/powerplay: avoid using pm_en before it is initialized"
This reverts commit c520787623.

The commit being reverted changed the wrong place, it should have
changed in func get_asic_baco_capability.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28 11:43:04 -04:00
Monk Liu
38748ad88a drm/amdgpu: enable one vf mode for nv12
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24 11:42:11 -04:00
Monk Liu
b217e6f579 drm/amdgpu: clear the messed up checking logic
for ARCTURUS+ ASICS, we always support SW_SMU for bare-metal
and for SRIOV one_vf_mode

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24 11:42:11 -04:00
Monk Liu
c983361a72 drm/amdgpu: sriov is forbidden to call disable DPM
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24 11:42:11 -04:00
Jiansong Chen
f9b93c9ba6 drm/amd/powerplay: limit smu support to Arcturus for onevf
Under onevf mode the smu support to other chips is not well
verified yet.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:49 -04:00
Prike Liang
a35da666cc drm/amd/powerplay: update smu12_driver_if.h to align with pmfw
Update the smu12_driver_if.h header to follow the pmfw release.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:49 -04:00
Yuxian Dai
5f6a92e442 drm/amdgpu/powerplay:avoid to show invalid DPM table info
for different ASIC support different the number of DPM levels,
we should avoid to show the invalid level value.
v1 -> v2:
	follow the suggestion,clarifiy the description for this
change

Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Yuxian Dai <Yuxian.Dai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:49 -04:00
Jason Yan
1470e957e2 drm/amd/powerplay: remove defined but not used variables
Fix the following gcc warning:

drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/vega10_powertune.c:710:46:
warning: ‘PSMGCEDCThresholdConfig_vega10’ defined but not used
[-Wunused-const-variable=]
 static const struct vega10_didt_config_reg
PSMGCEDCThresholdConfig_vega10[] =
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/vega10_powertune.c:654:46:
warning: ‘PSMSEEDCThresholdConfig_Vega10’ defined but not used
[-Wunused-const-variable=]
 static const struct vega10_didt_config_reg
PSMSEEDCThresholdConfig_Vega10[] =
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:46 -04:00
Sandeep Raghuraman
7ce016e71a drm/amdgpu: Correctly initialize thermal controller for GPUs with Powerplay table v0 (e.g Hawaii)
Initialize thermal controller fields in the PowerPlay table for Hawaii
GPUs, so that fan speeds are reported.

Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:46 -04:00
Prike Liang
4e2fec3321 drm/amd/powerplay: fix resume failed as smu table initialize early exit
When the amdgpu in the suspend/resume loop need notify the dpm disabled,
otherwise the smu table will be uninitialize and result in resume failed.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Tested-by: Mengbing Wang <Mengbing.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:46 -04:00
John Clements
e57761c68b drm/amdgpu: cache smu fw version info
reduce cmd submission to smu by caching version info

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:46 -04:00
John Clements
7f70443fd8 drm/amdgpu: set mp1 state before reload
Set MP1 state to prepare for unload before reloading SMU FW

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:46 -04:00
Evan Quan
47c11cff7e drm/amd/powerplay: update Arcturus smu-driver if header
To fit the latest PMFW.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:45 -04:00
Evan Quan
774e335b87 drm/amd/powerplay: properly set the dpm_enabled state
On the ASIC powered down(in baco or system suspend),
the dpm_enabled will be set as false. Then all access
(e.g. df state setting issued on RAS error event) to
SMU will be blocked.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:45 -04:00
Evan Quan
94e0805ba9 drm/amd/powerplay: correct i2c eeprom init/fini sequence
As data transfer may starts immediately after i2c eeprom init
completed. Thus i2c eeprom should be initialized after SMU
ready. And i2c data transfer should be prohibited when SMU
down. That is the i2c eeprom fini sequence needs to be
updated also.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:45 -04:00
Evan Quan
56ddddaacc drm/amd/powerplay: bump the NAVI10 smu-driver if version
To fit the latest SMC firmware 42.53 and eliminate the
warning on driver loading.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:45 -04:00
Evan Quan
02c0bb4ee3 drm/amd/powerplay: revise the way to retrieve the board parameters
It can support different NV1x ASIC better. And this can guard
no member got missing.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:45 -04:00
Likun Gao
e8663832b0 drm/amdgpu/powerplay: get SMC FW size to a flexible way
Get SMC fw size before backdoor loading instead of giving an
certain value, as it may different for different ASIC.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-13 12:02:15 -04:00
Sergei Lopatin
7adf5619ae drm/amd/powerplay: force the trim of the mclk dpm_levels if OD is enabled
Should prevent flicker if PP_OVERDRIVE_MASK is set.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=102646
bug: https://bugs.freedesktop.org/show_bug.cgi?id=108941
bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1088
bug: https://gitlab.freedesktop.org/drm/amd/-/issues/628

Signed-off-by: Sergei Lopatin <magist3r@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-13 12:01:40 -04:00
Evan Quan
4f7d010fc4 drm/amd/powerplay: unload mp1 for Arcturus RAS baco reset
This sequence is recommended by PMFW team for the baco reset
with PMFW reloaded. And it seems able to address the random
failure seen on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 17:03:23 -04:00
Sergei Lopatin
8c7f0a44b4 drm/amd/powerplay: force the trim of the mclk dpm_levels if OD is enabled
Should prevent flicker if PP_OVERDRIVE_MASK is set.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=102646
bug: https://bugs.freedesktop.org/show_bug.cgi?id=108941
bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1088
bug: https://gitlab.freedesktop.org/drm/amd/-/issues/628

Signed-off-by: Sergei Lopatin <magist3r@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-04-09 17:01:38 -04:00
Evan Quan
5aaa8fff3a drm/amd/powerplay: unload mp1 for Arcturus RAS baco reset
This sequence is recommended by PMFW team for the baco reset
with PMFW reloaded. And it seems able to address the random
failure seen on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:18 -04:00
Evan Quan
1744fb2391 drm/amd/powerplay: error out on forcing clock setting not supported
For Arcturus, forcing clock to some specific level is not supported
with 54.18 and onwards SMU firmware. As according to firmware team,
they adopt new gfx dpm tuned parameters which can cover all the use
case in a much smooth way. Thus setting through driver interface
is not needed and maybe do a disservice.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:18 -04:00
Nirmoy Das
db3e0a284e drm/amd/powerplay: fix a typo
Util -> Until

Fixes: 567c8fc4a0 ("drm/amd/powerplay: implement the is_dpm_running()")
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:15 -04:00
Evan Quan
3abd1e95e0 drm/amd/powerplay: error out on forcing clock setting not supported
For Arcturus, forcing clock to some specific level is not supported
with 54.18 and onwards SMU firmware. As according to firmware team,
they adopt new gfx dpm tuned parameters which can cover all the use
case in a much smooth way. Thus setting through driver interface
is not needed and maybe do a disservice.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-08 17:51:54 -04:00
Tiecheng Zhou
c520787623 drm/amd/powerplay: avoid using pm_en before it is initialized
hwmgr->pm_en is initialized at hwmgr_hw_init.
during amdgpu_device_init, there is amdgpu_asic_reset that calls to
pp_get_asic_baco_capability, while hwmgr->pm_en has not yet been initialized.

so avoid using pm_en in pp_get_asic_baco_capability.

Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Signed-off-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-03 17:22:44 -04:00
Prike Liang
4ee2bb22dd drm/amd/powerplay: implement the is_dpm_running()
As the pmfw hasn't exported the interface of SMU feature
mask to APU SKU so just force on all the features to driver
inquired interface at early initial stage.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-04-03 17:22:02 -04:00
Yuxian Dai
022ac4c9c5 drm/amdgpu/powerplay: using the FCLK DPM table to set the MCLK
1.Using the FCLK DPM table to set the MCLK for DPM states consist of
three entities:
 FCLK
 UCLK
 MEMCLK
All these three clk change together, MEMCLK from FCLK, so use the fclk
frequency.
2.we should show the current working clock freqency from clock table metric

Signed-off-by: Yuxian Dai <Yuxian.Dai@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Kevin Wang <Kevin1.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-04-03 17:20:15 -04:00
Tiecheng Zhou
764a21cb08 drm/amd/powerplay: avoid using pm_en before it is initialized
hwmgr->pm_en is initialized at hwmgr_hw_init.
during amdgpu_device_init, there is amdgpu_asic_reset that calls to
pp_get_asic_baco_capability, while hwmgr->pm_en has not yet been initialized.

so avoid using pm_en in pp_get_asic_baco_capability.

Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Signed-off-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-03 17:03:54 -04:00