Commit Graph

9058 Commits

Author SHA1 Message Date
shaoyunl
e3c1b0712f drm/amdgpu: Reset the devices in the XGMI hive duirng probe
In passthrough configuration, hypervisior will trigger the SBR(Secondary bus reset) to the devices
without sync to each other. This could cause device hang since for XGMI configuration, all the devices
within the hive need to be reset at a limit time slot. This serial of patches try to solve this issue
by co-operate with new SMU which will only do minimum house keeping to response the SBR request but don't
do the real reset job and leave it to driver. Driver need to do the whole sw init and minimum HW init
to bring up the SMU and trigger the reset(possibly BACO) on all the ASICs at the same time

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Andrey Grodzovsky andrey.grodzovsky@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:38 -04:00
shaoyunl
655ce9cb13 drm/amdgpu: Add reset_list for device list used for reset
The gmc.xgmi.head list originally is designed for device list in the XGMI hive. Mix use it
for reset purpose will prevent the reset function to adjust XGMI device list which is required
in next change

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Andrey Grodzovsky andrey.grodzovsky@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:35 -04:00
shaoyunl
a330b52a9e drm/amdgpu: Init the cp MQD if it's not be initialized before
The MQD might not be initialized duirng first init period if the device need to be reset
druing probe. Driver need to proper init them in gpu recovery period

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:33 -04:00
shaoyunl
8e2712e71b drm/amdgpu: Add kfd init_complete flag to check from amdgpu side
amdgpu driver may be in reset state during init which will not initialize the kfd,
driver need to initialize the KFD after reset by check the flag

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:28 -04:00
Kevin Wang
03597b47d6 Revert "drm/amdgpu: add psp RAP L0 check support"
This reverts commit d86fd724e5.

Disable PSP RAP L0 self test until to RAP feature ready.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:45 -04:00
Mark Yacoub
f258907fdd drm/amdgpu: Verify bo size can fit framebuffer size on init.
To initialize the framebuffer, call drm_gem_fb_init_with_funcs which
verifies that the BO size can fit the FB size by calculating the minimum
expected size of each plane.

The bug was caught using igt-gpu-tools test: kms_addfb_basic.too-high
and kms_addfb_basic.bo-too-small

Tested on ChromeOS Zork by turning on the display and running a YT
video.

=== Changes from v1 ===
1. Added new line under declarations.
2. Use C style comment.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Mark Yacoub <markyacoub@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:36 -04:00
Lijo Lazar
f78313fae9 drm/amdgpu: Check if FB BAR is enabled for ROM read
Some configurations don't have FB BAR enabled. Avoid reading ROM image
from FB BAR region in such cases.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:05 -04:00
Shashank Sharma
e36ccf9a96 drm/amdgpu: Set GTT_USWC flag to enable freesync v2
This patch sets 'AMDGPU_GEM_CREATE_CPU_GTT_USWC' as input
parameter flag, during object creation of an imported DMA
buffer.

In absence of this flag:
1. Function amdgpu_display_supported_domains() doesn't add
   AMDGPU_GEM_DOMAIN_GTT as supported domain.
2. Due to which, Function amdgpu_display_user_framebuffer_create()
   refuses to create framebuffer for imported DMA buffers.
3. Due to which, AddFB() IOCTL fails.
4. Due to which, amdgpu_present_check_flip() check fails in DDX
5. Due to which DDX driver doesn't allow flips (goes to blitting)
6. Due to which setting Freesync/VRR property fails for PRIME buffers.

So, this patch finally enables Freesync with PRIME buffer offloading.

v2 (chk): instead of just checking the flag we copy it over if the
          exporter is an amdgpu device as well.

Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:42 -04:00
Shashank Sharma
0b46bc3a9d drm/amdgpu: clean-up unused variable
Variable 'bp' seems to be unused residue from previous
logic, and is not required anymore.

Cc: Koenig Christian <christian.koenig@amd.com>
Cc: Deucher Alexander <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:28 -04:00
Aurabindo Pillai
c0ea73a4ad Revert freesync video patches temporarily
This temporarily reverts freesync video patches since it causes regression with
eDP displays. This patch is a squashed revert of the following patches:

6f59f229f8 ("drm/amd/display: Skip modeset for front porch change")
d10cd527f5 ("drm/amd/display: Add freesync video modes based on preferred modes")
0eb1af2e82 ("drm/amd/display: Add module parameter for freesync video mode")

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Anson Jacob <anson.jacob@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:04 -04:00
Oak Zeng
47bfa5f60f drm/amdgpu: Increase PSP runtime TMR region size
Aldebaran uses more than 4M runtime TMR. The current
hard coded 4M TMR is not big enough for Aldebaran.
Increase it to 8M.

v2: Only do 8M size for ALDEBARAN (Hawking)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:51 -04:00
Eric Huang
c3c9e0faf4 drm/amdkfd: apply uncached flag for aldebaran
The flag is only applied on fine-grained memory.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:48 -04:00
Eric Huang
2e2f197f4c drm/amdgpu: set snoop bit in pde/pte entries for A+A
Page tables in vram mapping to cpu is changed from uncached to
cached in A+A, the snoop bit in VM_CONTEXTx_PAGE_TABLE_BASE_ADDR/
PDE0s/PDE1s/PDE2s/PTE.TFs has to be set so gpuvm walker snoop
page table data out of CPU cache.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:46 -04:00
Eric Huang
06bfc045d5 drm/amdgpu: set CPU mapping of vram as cached for A+A mode
New A+A HW supports cached vram mapped to cpu.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:44 -04:00
Dennis Li
761d86d37f drm/amdgpu: harvest edc status when connected to host via xGMI
When connected to a host via xGMI, system fatal errors may trigger
warm reset, driver has no change to query edc status before reset.
Therefore in this case, driver should harvest previous error loging
registers during boot, instead of only resetting them.

v2:
1. IP's ras_manager object is created when its ras feature is enabled,
so change to query edc status after amdgpu_ras_late_init called

2. change to enable watchdog timer after finishing gfx edc init

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reivewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:41 -04:00
Felix Kuehling
63dbb0db3a drm/amdgpu: Make noretry the default on Aldebaran
This is needed for best machine learning performance. XNACK can still
be enabled per-process if needed.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:38 -04:00
Harish Kasiviswanathan
4464820dc7 drm/amdgpu: update default timeout of Aldebaran SQ watchdog
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reivewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:35 -04:00
Kevin Wang
d86fd724e5 drm/amdgpu: add psp RAP L0 check support
add PSP RAP L0 check when RAP TA is loaded.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:25 -04:00
Kevin Wang
2fb3c5d0d1 drm/amdgpu: change psp_rap_invoke() function return value
RAP TA is an optional firmware. if it doesn’t exist,
the driver should bypass psp_rap_invoke() function.

1. bypass psp_rap_invoke() when RAP TA is not loaded.
2. add new parameter (status) to query RAP TA status.
   (the status value is different with psp_ta_invoke(),
3. fix the 'rap_status' MThread critical problem.
   (used without lock)

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:22 -04:00
Felix Kuehling
6dce50b1aa drm/amdgpu: Let KFD use more VMIDs on Aldebaran
When there is no graphics support, KFD can use more of the VMIDs. Graphics
VMIDs are only used for video decoding/encoding and post processing. With
two VCE engines, there is no reason to reserve more than 2 VMIDs for that.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:59 -04:00
Dennis Li
88f8575bca drm/amdgpu: enable watchdog feature for SQ of aldebaran
SQ's watchdog timer monitors forward progress, a mask of which waves
caused the watchdog timeout is recorded into ras status registers and
then trigger a system fatal error event.

v2:
1. change *query_timeout_status to *query_sq_timeout_status.
2. move query_sq_timeout_status into amdgpu_ras_do_recovery.
3. add module parameters to enable/disable fatal error event and modify
the watchdog timer.

v3:
1. remove unused parameters of *enable_watchdog_timer

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:52 -04:00
Dennis Li
4abc2567f0 drm/amdgpu: refine ras codes for GC utc of aldebaran
The bank number of both VML2 and ATCL2 are changed to 8, so refine
related codes to avoid defining long name arrays.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:50 -04:00
Dennis Li
22616eb5c9 drm/amdgpu: add ras support for gfx of aldebaran
add edc counter/status reset and query functions for gfx block of
aldebaran.

v2: change to clear edc counter explicitly
aldebaran hardware will not clear edc counter after driver reading them,
so driver should clear them explicitly.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:48 -04:00
Kevin Wang
5217811e74 drm/amdgpu: add gc powerbrake support (v2)
add GC power brake feature support for Aldebaran.

v2: squash in fixes (Alex)

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:44 -04:00
Hawking Zhang
b3ecf36bf6 drm/amdgpu: update TCP_CHAN_STEER_1 golden value for aldebaran
The golden setting was changed recently. update to
the latest one

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:42 -04:00
Hawking Zhang
9f55d7edb7 drm/amdgpu: add common gc golden settings for aldebaran
golden settings that should be applied

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:39 -04:00
Hawking Zhang
264aef8b3b drm/amdgpu: apply gc v9_4_2 golden settings for aldebaran
Those registers should be programmed as one-time initialization

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:34 -04:00
Jonathan Kim
16171a25d8 drm/amdgpu: restore aldebaran save ttmp and trap config on init (v2)
Initialization of TRAP_DATA0/1 is still required for the debugger to detect
new waves on Aldebaran.  Also, per-vmid global trap enablement may be
required outside of debugger scope so move to init phase.

v2: just add the gfx 9.4.2 changes (Alex)

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:31 -04:00
Jonathan Kim
5073506c7e drm/amdkfd: add aldebaran kfd2kgd callbacks to kfd device (v2)
Create dedicated Aldebaran kfd2kgd callbacks to prepare
for new per-vmid register instructions for debug trap
setting functions and sending host traps.

v2: rebase (Alex)

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:28 -04:00
Oak Zeng
6d909c5da0 drm/amdkfd: Add kernel parameter to stop queue eviction on vm fault
This is to keep wavefront context for debug purpose

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:22 -04:00
Hawking Zhang
2f669734f3 drm/amdgpu: allow use psp to load firmware (v2)
Match existing asics.

v2: rebase (Alex)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:19 -04:00
Alex Sierra
ec8631e011 drm/amdgpu: use pd addr based on gart level page table
With a recent gart page table re-construction, the gart page
table is now 2-level for some ASICs: PDB0->PTB.
In the case of 2-level gart page table, the page_table_base
of vmid0 should point to PDB0 instead of PTB.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:07 -04:00
Oak Zeng
be0478e7b0 drm/amdgpu: Fix the comment in amdgpu_gmc.h
More accurate words are used to address a
code review feedback

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:04 -04:00
Oak Zeng
79194dacb2 drm/amdgpu: Fix GART page table s-bit
For the new 2-level GART table, the last PDE0 points
to PTB. Since PTB is in vram and right now we are
runing under s=0 mode (vram is treated as FB carveout),
so the s bit of this PDE0 should be set to 0.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:02 -04:00
Alex Sierra
f4ec3e5039 drm/amdgpu: update mmhub client ids for Aldebaran
update mmhub client id table for Aldebaran.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:58 -04:00
Dennis Li
abe5ee57c5 drm/amdgpu: enable sram initialization for aldebaran
Aldebaran can share the same initializing shader code witn
arcturus.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:55 -04:00
Oak Zeng
2f055097da drm/amdgpu: workaround the TMR MC address issue (v2)
With the 2-level gart page table,  vram is squeezed into gart aperture
and FB aperture is disabled. Therefore all VRAM virtual addresses are
 in the GART aperture. However currently PSP requires TMR addresses
in FB aperture. So we need some design change at PSP FW level to support
this 2-level gart table driver change. Right now this PSP FW support
doesn't exist. To workaround this issue temporarily, FB aperture is
added back and the gart aperture address is converted back to FB aperture
for this PSP TMR address.

Will revert it after we get a fix from PSP FW.

v2: squash in tmr fix for other asics (Kevin)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:52 -04:00
Oak Zeng
0c19cab555 drm/amdgpu: HW setup of 2-level vmid0 page table
Set up HW for 2-level vmid0 page table: 1. Set up
PAGE_TABLE_START/END registers. Currently only plan
to do 2-level page table for ALDEBARAN, so only gfxhub1.0
and mmhub1.7 is changed. 2. Set page table base register.
For 2-level page table, the page table base should point
to PDB0. 3. Disable AGP and FB aperture as they are not
used.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:49 -04:00
Oak Zeng
522510a677 drm/amdgpu: Set up vmid0 PDB0
If use gart for FB translation, allocate and fill
PDB0.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:47 -04:00
Oak Zeng
a2902c09c5 drm/amdgpu: Add function to allocate and fill PDB0
Add functions to allocate PDB0, map it for CPU access,
and fill it.

Those functions are only used for 2-level vmid0 page
table construction

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:44 -04:00
Oak Zeng
7b454b3a34 drm/amdgpu: Use different gart table parameters for 2-level gart table
If use gart for FB translation, we will squeeze vram into
sysvm aperture. This requires 2 level gart table. Add
page table depth and page table block size parameters
to gmc. This is prepare work to 2-level gart table
construction

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:42 -04:00
Oak Zeng
f527f310bb drm/amdgpu: Placement of gart and vram in sysvm aperture
If use GART for FB translation, place both vram and gart to sysvm
aperture. AGP aperture is not set up in this case because it
is not used

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:36 -04:00
Oak Zeng
6e93ef8b68 drm/amdgpu: Modify comments of vram_start/end
Modify the comment to reflect the fact that, if
use GART for vram address translation for vmid0,
[vram_start, vram_end] will be placed inside SYSVM
aperture, together with GART.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:32 -04:00
Oak Zeng
f1dc12ca56 drm/amdgpu: Moved gart_size calculation to mc_init functions
In amdgpu_gmc_gart_location function, gart_size is adjusted
by a smu_prv_buffer_size. This logic shouldn't belong to
this function. Move the logic to the mc_init functions

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:30 -04:00
Oak Zeng
1f928f5159 drm/amdgpu: Use physical translation mode to access page table
On A+A platform, CPU write page directory and page table in cached
mode. So it is necessary for page table walker to snoop CPU cache.
This setting is necessary for page walker to snoop page directory
and page table data out of CPU cache.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:27 -04:00
Oak Zeng
35d5f224a5 drm/amdgpu: Don't reserve vram as WC for A+A
On A+A platform, vram can be mapped as WB. Not necessarily
to always map vram as WC on such platform.

Calling function arch_io_reserve_memtype_wc will mark the
whole vram region as WC. So don't call it for A+A platform.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:24 -04:00
Jonathan Kim
4ac5617c4b drm/amdgpu: mask the xgmi number of hops reported from psp to kfd
The psp supplies the link type in the upper 2 bits of the psp xgmi node
information num_hops field.  With a new link type, Aldebaran has these
bits set to a non-zero value (1 = xGMI3) so the KFD topology will report
the incorrect IO link weights without proper masking.
The actual number of hops is located in the 3 least significant bits of
this field so mask if off accordingly before passing it to the KFD.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Amber Lin <amber.lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:03 -04:00
Alex Sierra
9a9c59a8f4 drm/amdgpu: enable 48-bit IH timestamp counter
By default this timestamp is 32 bit counter. It gets
overflowed in around 10 minutes.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:59 -04:00
Philip Yang
b672cb1eee drm/amdgpu: enable retry fault wptr overflow
If xnack is on, VM retry fault interrupt send to IH ring1, and ring1
will be full quickly. IH cannot receive other interrupts, this causes
deadlock if migrating buffer using sdma and waiting for sdma done while
handling retry fault.

Remove VMC from IH storm client, enable ring1 write pointer overflow,
then IH will drop retry fault interrupts and be able to receive other
interrupts while driver is handling retry fault.

IH ring1 write pointer doesn't writeback to memory by IH, and ring1
write pointer recorded by self-irq is not updated, so always read
the latest ring1 write pointer from register.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:55 -04:00
Oak Zeng
df23d1bbd1 drm/amdgpu: Use free system memory size for kfd memory accounting
With the current kfd memory accounting scheme, kfd applications
can use up to 15/16 of total system memory. For system which
has small total system memory size it leaves small system memory
for OS. For example, if the system has totally 16GB of system
memory, this scheme leave OS and non-kfd applications only 1GB
of system memory. In many cases, this leads to OOM killer.

This patch changed the KFD system memory accounting scheme.
15/16 of free system memory when kfd driver load. This deduct
the system memory that OS already use.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:49 -04:00
Hawking Zhang
b335f289fe drm/amdgpu: apply new pmfw loading sequence to arcturus and onwards
Arcturus and onwards products should follow the same sequence
that have pmfw loading ahead of tmr setup

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:44 -04:00
Lijo Lazar
6d9059217a drm/amdgpu: Fix aldebaran MMHUB CG/LS logic
Aldebaran MMHUB CG/LS logic is controlled by VBIOS. Enable the state
change logic only if driver is used for control.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:41 -04:00
Lijo Lazar
8cf3dccb07 drm/amdgpu: Enable CP idle interrupts
v1: The interrupts need to be enabled to move to DS clocks.
v2: Don't enable GFX IDLE interrupts if there are no GFX rings.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:36 -04:00
Lijo Lazar
48a6379a23 drm/amdgpu: Add clock gating support for aldebaran
Aldebaran clock gating support for GFX,SDMA,IH blocks
VCN/JPEG blocks are excluded in this patch, to be enabled later

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:22 -04:00
Alex Deucher
e844cd9944 drm/amdgpu: add mmhub client ids for aldebaran
Add the mmhub client id table for aldebaran.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:19 -04:00
James Zhu
557da413d6 drm/amdgpu: enable dpg indirect sram mode on aldebaran
Enable dpg indirect sram mode on aldebaran.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:15 -04:00
James Zhu
bd937973eb drm/amdgpu: enable vcn dpg mode on aldebaran
Enable vcn dpg mode on aldebaran

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:12 -04:00
James Zhu
fdb1fdef2d drm/amdgpu: enable vcn and jpeg on aldebaran
Enable vcn and jpeg 2.6 on aldebaran.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:09 -04:00
Lijo Lazar
bd7228abb3 drm/amdgpu: Enable swsmu block on aldebaran
Enable smu13 block on aldebaran

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:05 -04:00
Hawking Zhang
842811369f drm/amdgpu: switch to cached noretry setting for aldebaran
global noretry setting now is cached to gmc.noretry

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:00 -04:00
Hawking Zhang
d02692ae0d drm/amdgpu: bypass hdp read cache invalidation for aldebaran (v2)
hdp read cache is removed in aldebaran. don't issue
an mmio write or write data packet to hardware.

v2: rebase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:45 -04:00
Amber Lin
b7daed1b62 drm/amdgpu: Aldebaran doesn't use semaphore
Simplify all Aldebaran DIDs into one ASIC type.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:41 -04:00
Alex Sierra
07744e9069 drm/amdgpu: UTLC1 RB SDMA timeout on Aldebaran
[Why]
This causes infinite retries on the UTCL1 RB, preventing
higher priority RB such as paging RB.

[How]
Set to one the SDMAx_UTLC1_TIMEOUT registers for all SDMAs.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:36 -04:00
Feifei Xu
8081f8faca drm/amdpgu: add ATOM_DGPU_VRAM_TYPE_HBM2E vram type
0x61 is assigned to HBM2E in atom_dgpu_vram_type.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:33 -04:00
Hawking Zhang
44b3253a4b drm/amdgpu: retire aldebaran gpu_info firmware
driver should use the gfx_info atomfirmware interface

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:28 -04:00
Hawking Zhang
7159a36e11 drm/amdgpu: query aldebaran gfx_config through atomfirmware i/f
For ASICs that don't support ip discovery feature, query
gfx configuration through atomfirmware interface, rather
than gpu_info firmware.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:22 -04:00
Lijo Lazar
8738a82b37 drm/amd/amdgpu: Add smu_pptable module parameter
Temporarily add smu_pptable module parameter for aldebaran.This is used
to force soft PPTable use overriding any VBIOS PPTable.

Signed-off-by: Lijo Lazar <Lijo.Lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:14 -04:00
Oak Zeng
be566196be drm/amdgpu: Don't do FB resize under A+A config
Disable PCIe BAR resizing on A+A config. It's not needed because we won't use the
PCIe BAR, but it breaks the PCI BAR configuration with the current SBIOS.

Error message of FB BAR resize failure under A+A:

[  154.913731] [drm:amdgpu_device_resize_fb_bar [amdgpu]] *ERROR* Problem resizing BAR0 (-22).

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.kuehling@amd.com>
Reviewed-by: Christian Koenig <Christian.Koenig@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:06 -04:00
Oak Zeng
9d0af8b4de drm/amdgpu: pre-map device buffer as cached for A+A config
For A+A configuration, device memory is supposed to be mapped as
cachable from CPU side. For kernel pre-map gpu device memory using
ioremap_cache

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Koenig <Christian.Koenig@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:03 -04:00
Hawking Zhang
d477c5aaec drm/amdgpu: disallow use semaphore on aldebaran
shall revisit the change later

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:53 -04:00
Hawking Zhang
10c71e6cc9 drm/amdgpu: switch to vega20 ih block for aldebaran
replace vega10 ih block with vega20 ih block for
aldebaran.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:51 -04:00
Hawking Zhang
eed4bbd388 drm/amdgpu: correct IH_CHICKEN programming for aldebaran
For aldebaran, psp firmware won't program IH_CHICKEN.
it now depends on driver to program it properly so
either bus address or gpu virtual address is just
working for ih ring.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:48 -04:00
Hawking Zhang
b45589b837 drm/amdgpu: add mmhub error status query callback for aldebaran
The callback will be invoked to query mmea error
status when needed.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:46 -04:00
Hawking Zhang
27ad2ca667 drm/amdgpu: add mmhub ras error reset callback for aldebaran
The callback will be invoked to reset mmhub ras error
counters when needed.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:44 -04:00
Hawking Zhang
cbb84e7aab drm/amdgpu: add mmhub ras error query callback for aldebaran
The callback will be invoked to harvest all kinds
of mmhub ras error

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:41 -04:00
Hawking Zhang
f5f0e4a0d5 drm/amdgpu: add sdma ras error reset callback for aldebaran
The callback will be invoked to reset sdma ras error
counters when needed.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:39 -04:00
Hawking Zhang
b2459840cf drm/amdgpu: add sdma ras error query callback for aldebaran
The callback will be invoked to harvest all kinds
of sdma ras error

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:36 -04:00
Hawking Zhang
2fdb91a25e drm/amdgpu: add sdma v4_4 ras function
sdma ras function is the main structure to support
sdma ras on aldebaran. the patch initializes late_init
late_fini callbacks.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:31 -04:00
Hawking Zhang
a6d9d6ab84 drm/amdgpu: apply sdma golden settings for aldebaran
perform one-time initialization for sdma registers

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:28 -04:00
Hawking Zhang
3de60d961c drm/amdgpu: use physical_node_id to calculate aper_base
Similar as xgmi connected gpu nodes, physical_node_id
* segment_size should be used to calculate the offset
of aper_base.

The asic type check is redundant. once physical_node_id
and segment_size are initialized, it should be count
on.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:20 -04:00
Hawking Zhang
063a1e8341 drm/amdgpu: skip gds ras workaround for aldebaran
there won't be any gds useage in either kernel or
pm4 anymore for aldebaran.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:14 -04:00
Hawking Zhang
18c3d45a9a drm/amdgpu: init gds for aldebaran
aldebaran removed gds internal memory for atomic usage.
it only supports gws opcode in kernel like barrier,
semaphore.etc. there won't be usage of gds in either
kernel or pm4 packet. max_wave_id should also be marked
as deprecated for aldebaran.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:10 -04:00
Feifei Xu
147d082d38 drm/amdgpu: correct vram_info for HBM2E
correct atom_vram_info_header_v2_6 and its vram_module.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:05 -04:00
Hawking Zhang
f31c4a11b4 drm/amdgpu: support get_vram_info atomfirmware i/f for aldebaran
Query vram_type, channel_num, channel_width
information through atomfirmware i/f

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:59 -04:00
Feifei Xu
5c03e5843e drm/amdgpu:add smu mode1/2 support for aldebaran
Use MSG_GfxDriverReset for mode reset and retire MSG_Mode1Reset.
Centralize soc15_asic_mode1_reset() and nv_asic_mode1_reset()functions.
Add mode2_reset_is_support() for smu->ppt_funcs.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:49 -04:00
Feifei Xu
4c2e5f513e drm/amdgpu: Add DID for aldebaran
Add 0x7408,0x740C,0x740F in pciidlist.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:45 -04:00
John Clements
0d2c1855d5 drm/amdgpu: added support for register list loading (v2)
call host to  psp cmd to load reg list

v2: update to latest interface (Alex)

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:41 -04:00
John Clements
b2aa382ae7 drm/amdgpu: added register list driver ctx (v2)
updated psp bin parsing and load register list

v2: update to latest interface (Alex)

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:38 -04:00
John Clements
d74decc412 drm/amdgpu: updated host to psp mailbox cmd (v2)
added host to psp cmd for register list

v2: update to new interface (Alex)

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:35 -04:00
Hawking Zhang
115ba9a9fd drm/amdgpu: declare smuio v13_0 callbacks as static
fix -Wmissing-protoypes warning

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:32 -04:00
Hawking Zhang
4f668d3d31 drm/amdgpu: initialize external rev_id for aldebaran
add exteranal rev_id for aldebaran

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:28 -04:00
Kevin Wang
e747ca0a4e drm/amdgpu: declare sdma firmware binary file for aldebaran
declare sdma firmware binary file for aldebaran

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:26 -04:00
Hawking Zhang
cf7821a84a drm/amdgpu: comments out vcn/jpeg ip blocks for aldebaran
vcn fw front door loading is not functional. comments
out vcn/jpeg ip blocks so people can load amdgpu driver
without specify ip_mask module parameter.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:17 -04:00
Hawking Zhang
fbaa30d87f drm/amdgpu: initialize ta firmware for aldebaran
only xgmi ta is supported at this stage

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:15 -04:00
Kevin Wang
5be50a8fd8 drm/amdgpu: switch to use reg distance member for mmhub v1_7
switch to use register distance member for mmhub v1_7
instead of hardcode

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:09 -04:00
Oak Zeng
4da999cdfc drm/amdgpu: Clean up mmhub functions for aldebaran
Add more function pointers to amdgpu_mmhub_funcs. ASIC specific
implementation of most mmhub functions are called from a general
function pointer, instead of calling different function for
different ASIC.

V2: Split patch into upstreamable and aldebaran

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:05 -04:00
James Zhu
f8db121e47 drm/amdgpu/jpeg: enable JPEG on aldebaran
enable JPEG on aldebaran

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:02 -04:00
James Zhu
9f386fd3aa drm/amdgpu/vcn: enable VCN on aldebaran
Enable VCN on aldebaran

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:58 -04:00
James Zhu
7ce293570c drm/amdgpu/nbio: add aldebaran support
Aldebaran has a new mmBIF_MMSCH1_DOORBELL_RANGE setting.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:52 -04:00
Hawking Zhang
eb28f02b1e drm/amdgpu: skip MEC2_JT initialization for aldebaran
MEC2_JT is not supported

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:47 -04:00
Eric Huang
72b4db0f58 drm/amdgpu: new cache coherence change for Aldebaran
To support new cache coherence HW on A+A platform mainly in KFD.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:37 -04:00
James Zhu
ff6885ac47 drm/amdgpu/jpeg2.6: Add jpeg2.6 support
Aldebaran is using jpeg2.6, and the main change is jpeg2.6 using
AMDGPU_MMHUB_0, and jpeg2.5 using AMDGPU_MMHUB_1.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:34 -04:00
Yong Zhao
7ffe72385a drm/amdgpu: Fix an omission when adding Aldebaran support
Aldebaran should be the same as Arcturus in the PTE SNOOPED bit handling.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:28 -04:00
Oak Zeng
56237c6aef drm/amdgpu: Fix IH client ID naming table
Client ID 26 is reserved. Add it to the table.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:22 -04:00
James Zhu
eb53aa3981 drm/amdgpu/vcn2.6: Add vcn2.6 support
Aldebaran is using vcn2.6, and the main change is vcn2.6 using
AMDGPU_MMHUB_0, and vcn2.5 using AMDGPU_MMHUB_1

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:17 -04:00
James Zhu
86d848b16d drm/amdgpu: add Aldebaran to the VCN family
including firmware support etc.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:10 -04:00
Rajneesh Bhardwaj
3cbb3a9749 drm/amdgpu: support get xgmi information for Aldebaran
Aldebaran uses registers defined in header gc_9_4_2 but much of the xgmi
related functionality can be obtained by reusing the exisitng definition
from gfxhub_v1_1_get_xgmi_info. While adding support for Aldebaran, also
refactored code to better handle the new scenario.

Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:55 -04:00
Rajneesh Bhardwaj
31691b8d1b drm/amdgpu: define address map for host xgmi link (v3)
This applies to AMD Accelerated Processing Platforms that support host
gpu interconnect throguh a special link (xgmi). Aldebaran systems will
support this special feature for utilizing the benefits of host-gpu
cache coherence. This change outlines the basic framework for mapping
the GPU VRAM (HBM) to system address space making it accesible to the
host but managed by the amdgpu driver since this region is marked as
reserved memory in host address space by the underlying system firmware.

v2: switch to smuio callback function to check the type
of host-gpu interface (Hawking)
v3: use hub callbacks rather than direct function calls (Alex)

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:48 -04:00
Rajneesh Bhardwaj
efce10005b drm/amdgpu: enable xgmi support for Aldebaran
Like its predecessors Aldebran also supports advanced high bandwidth
GPU-GPU communication interface known as xgmi. This enables the basic
xgmi support while refactoring the code slightly.

Detection of xgmi link between host cpu and gpu will be introduced in a
different patch.

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:45 -04:00
Hawking Zhang
7914a0cd17 drm/amdgpu: initialize smuio callbacks for aldebaran
initialize smuio v13_0 callbacks for aldebaran

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:40 -04:00
Hawking Zhang
2e8c66d6bb drm/amdgpu: implement smuio v13_0 callbacks
Aldebaran will use smuio v13_0 callbacks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:38 -04:00
Hawking Zhang
26f70889e1 drm/amdgpu: add new smuio callbacks for aldebaran
is_host_gpu_xgmi_supported is used to query gpu and
cpu/host link type. get_die_id is used to query die
ids.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:35 -04:00
Hawking Zhang
9fbd96a136 drm/amdgpu: enable psp v13 ip block for aldebaran
Add psp v13 ip block to soc ip init list for aldebaran

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:33 -04:00
Hawking Zhang
efec10c1eb drm/amdgpu: bypass gc_9_x_common golden settings
ALDEBARAN doesn't need these golden settings.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:30 -04:00
Hawking Zhang
1b15bac7bf drm/amdgpu: detect sriov capability for aldebaran
SRIOV pf/vf function identifier regsiter in aldebaran
is the same as the one in arcturus

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:27 -04:00
Hawking Zhang
428ad99e9c drm/amdgpu: load pmfw prior to other non-psp fw for aldebaran
PMFW should be loaded before any operation that
may toggling DF-Cstate. otherwsie, tOS has no
choice but to locally toggle DF Cstate (i.e.
disable DF-Cstate even it already enabled by VBIOS)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:24 -04:00
Hawking Zhang
f8a98f1645 drm/amdgpu: fix incorrect EP_STRAP reg offset for aldebaran
mmRCC_DEV0_EPF0_STRAP0 offset in aldebaran is changed
from arcturus

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:18 -04:00
Hawking Zhang
ee82108325 drm/amdgpu: init psp v13 ip function
Initialze psp ip function for aldebaran

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:15 -04:00
Hawking Zhang
48375542b0 drm/amdgpu: add psp v13 ring support
Add callback functions for psp_v13 ring

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:12 -04:00
Hawking Zhang
f117535590 drm/amdgpu: add tOS loading support for psp v13
Add callback function to support trusted os
loading for psp v13

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:10 -04:00
Hawking Zhang
ea6eaf5583 drm/amdgpu: add sys_drv loading support for psp v13
Add callback function to support sys_drv firmware
loading for psp v13

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:05 -04:00
Hawking Zhang
133d888da9 drm/amdgpu: add kdb loading support for psp v13
Add callback function to support key database firmware
loading for psp v13

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:51:58 -04:00
Hawking Zhang
742d3c61ac drm/amdgpu: init sos microcode for psp v13
Initialize sos microcode for aldebaran

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:51:54 -04:00
Yong Zhao
be14729a33 drm/amdgpu: Print the IH client ID name when vm fault happens
This gives more information and improves productivity.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:51:40 -04:00
Dave Airlie
51c3b916a4 drm-misc-next for 5.13:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - %p4cc printk format modifier
   - atomic: introduce drm_crtc_commit_wait, rework atomic plane state
     helpers to take the drm_commit_state structure
   - dma-buf: heaps rework to return a struct dma_buf
   - simple-kms: Add plate state helpers
   - ttm: debugfs support, removal of sysfs
 
 Driver Changes:
   - Convert drivers to shadow plane helpers
   - arc: Move to drm/tiny
   - ast: cursor plane reworks
   - gma500: Remove TTM and medfield support
   - mxsfb: imx8mm support
   - panfrost: MMU IRQ handling rework
   - qxl: rework to better handle resources deallocation, locking
   - sun4i: Add alpha properties for UI and VI layers
   - vc4: RPi4 CEC support
   - vmwgfx: doc cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYD9fUAAKCRDj7w1vZxhR
 xcRLAQDdWKgUNeHnkKCUNh3ewPGabxvc6KQtPqAxcFv0I3ZmWgEAlfTS0pRLdyzQ
 ITRBL0T0S7cIyqnDULZkwfqB6Q8D0ws=
 =hPCS
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-03-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.13:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - %p4cc printk format modifier
  - atomic: introduce drm_crtc_commit_wait, rework atomic plane state
    helpers to take the drm_commit_state structure
  - dma-buf: heaps rework to return a struct dma_buf
  - simple-kms: Add plate state helpers
  - ttm: debugfs support, removal of sysfs

Driver Changes:
  - Convert drivers to shadow plane helpers
  - arc: Move to drm/tiny
  - ast: cursor plane reworks
  - gma500: Remove TTM and medfield support
  - mxsfb: imx8mm support
  - panfrost: MMU IRQ handling rework
  - qxl: rework to better handle resources deallocation, locking
  - sun4i: Add alpha properties for UI and VI layers
  - vc4: RPi4 CEC support
  - vmwgfx: doc cleanup

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303100600.dgnkadonzuvfnu22@gilmour
2021-03-16 17:08:46 +10:00
Dave Airlie
fb198483ed Merge tag 'amd-drm-fixes-5.12-2021-03-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.12-2021-03-10:

amdgpu:
- Fix aux backlight control
- Add a backlight override parameter
- Various display fixes
- PCIe DPM fix for vega
- Polaris watermark fixes
- Additional S0ix fix

radeon:
- Fix GEM regression
- Fix AGP dependency handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210310221141.3974-1-alexander.deucher@amd.com
2021-03-12 11:20:02 +10:00
Alex Deucher
a5cb3c1a36 drm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=m
Need to check the module variant as well.

Acked-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-10 16:23:26 -05:00
Nirmoy Das
521f04f9e3 drm/amdgpu: fb BO should be ttm_bo_type_device
FB BO should not be ttm_bo_type_kernel type and
amdgpufb_create_pinned_object() pins the FB BO anyway.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 16:19:47 -05:00
Takashi Iwai
7a46f05e5e drm/amd/display: Add a backlight module option
There seem devices that don't work with the aux channel backlight
control.  For allowing such users to test with the other backlight
control method, provide a new module option, aux_backlight, to specify
enabling or disabling the aux backport support explicitly.  As
default, the aux support is detected by the hardware capability.

v2: make the backlight option generic in case we add future
backlight types (Alex)

BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-10 16:12:15 -05:00
Kevin Wang
5af81c6e6e drm/amdgpu: add aldebaran sdma firmware support (v2)
add sdma firmware load support for soc model

v2: drop some emulator leftovers (Alex)

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:02:17 -05:00
Yong Zhao
36e22d59dd drm/amdkfd: Add Aldebaran KFD support
Add initial KFD support.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:02:13 -05:00
Le Ma
c00a18ec0b drm/amdgpu: set ip blocks for aldebaran
Set ip blocks and asic family id

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:02:09 -05:00
Le Ma
759eb38ed1 drm/amdgpu: correct mmBIF_SDMA4_DOORBELL_RANGE address for aldebaran
On aldebaran, mmBIF_SDMA4_DOORBELL_RANGE isn't right next to
mmBIF_SDMA3_DOORBELL_RANGE.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:02:02 -05:00
Le Ma
b61a273e5d drm/amdgpu: add sdma block support for aldebaran
Add initial sdma support for aldebaran, and this asic has 5 sdma instances.

v2: remove adundant condition check

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <Evan.Quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:59 -05:00
Le Ma
cdf545f35f drm/amdgpu: add gfx v9 block support for aldebaran
Add gfx initial support

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:57 -05:00
Le Ma
d39da7dab1 drm/amdgpu: set fw load type for aldebaran
Set backdoor loading way in current phase

v2: change case location to not break other asics

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:54 -05:00
Le Ma
85e395506b drm/amdgpu: add gmc v9 block support for Aldebaran
Add gfx memory controller support

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:51 -05:00
Le Ma
f37945d50f drm/amdgpu: add mmhub support for aldebaran (v3)
v1: dupilcate mmhub_v1_7.c from mmhub_v1_0.c because
mmhub register address for aldebaran is different
from existing asics (Le)
v2: switch to latest mmhub_v9_4_2 register headers (Hawking)
v3: squash in init VM_L2_CNTL3 default value for mmhub v1_7

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:47 -05:00
Le Ma
7906af5e9d drm/amdgpu: add soc15 common ip block support for aldebaran
Initialize aldebaran common ip block

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:43 -05:00
Le Ma
42719073b4 drm/amdgpu: add gpu_info fw parse support for aldebaran
Parses asic configurations stored in gpu_info firmware and make them available
for driver to use.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:37 -05:00
Le Ma
42b72608ae drm/amdgpu: add register base init for aldebaran (v2)
v1: add aldebaran_reg_base_init function to initialize
register base for aldebaran (Le)
v2: update VCN HWIP and initialize base offset (James)

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:34 -05:00
Le Ma
d46b417a91 drm/amdgpu: add aldebaran asic type
Add aldebaran in amdgpu_asic_name array and amdgpu_asic_type enum

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:26 -05:00
Takashi Iwai
7c20984795 drm/amd/display: Add a backlight module option
There seem devices that don't work with the aux channel backlight
control.  For allowing such users to test with the other backlight
control method, provide a new module option, aux_backlight, to specify
enabling or disabling the aux backport support explicitly.  As
default, the aux support is detected by the hardware capability.

v2: make the backlight option generic in case we add future
backlight types (Alex)

BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:13:55 -05:00
Alex Deucher
58aa779019 drm/amdgpu: enable TMZ by default on Raven asics
This has been stable for a while.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:13:30 -05:00
Emily Deng
a00aacdf00 drm/amdgpu: Fix some unload driver issues
If have memory leak, maybe it will have issue in
ttm_bo_force_list_clean-> ttm_mem_evict_first.

Set adev->gart.ptr to null to avoid to call
amdgpu_gmc_set_pte_pde to cause ptr issue pointer when
calling amdgpu_gart_unbind in amdgpu_bo_fini which is after gart_fini.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:12:41 -05:00
Emily Deng
bb0cd09be4 drm/amdgpu: Fix some unload driver issues
When unloading driver after killing some applications, it will hit sdma
flush tlb job timeout which is called by ttm_bo_delay_delete. So
to avoid the job submit after fence driver fini, call ttm_bo_lock_delayed_workqueue
before fence driver fini. And also put drm_sched_fini before waiting fence.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:12:35 -05:00
Jingwen Chen
3c73683c23 drm/amd/amdgpu: add fini virt data exchange to ip_suspend
[Why]
when try to shutdown guest vm in sriov mode, virt data
exchange is not fini. After vram lost, trying to write
vram could hang cpu.

[How]
add fini virt data exchange in ip_suspend

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Jack Zhang <Jack.Zhang1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:12:29 -05:00
Feifei Xu
8f211fe8ac drm/amdgpu: add sdma 4_x interrupts printing
Add VM_HOLE/DOORBELL_INVALID_BE/POLL_TIMEOUT/SRBMWRITE
interrupt info printing.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:12:23 -05:00
Feifei Xu
e528556577 drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.
SDMA 4_x asics share the same MGCG/MGLS setting.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:12:15 -05:00
Leo (Hanghong) Ma
c79fe9b436 drm/amdgpu: add DMUB trace event IRQ source define
[Why & How]
We use DMCUB outbox0 interrupt to log DMCUB trace buffer events
as Linux kernel traces, so need to add some irq source related
defination in the header files;

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo (Hanghong) Ma <hanghong.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:10:49 -05:00
Horace Chen
4215a11923 drm/amdgpu: enable one vf mode on sienna cichlid vf
sienna cichlid needs one vf mode which allows vf to set and get
clock status from guest vm. So now expose the required interface
and allow some smu request on VF mode. Also since this asic blocked
direct MMIO access, use KIQ to send SMU request under sriov vf.

OD use same command as getting pp table which is not allowed for
 sienna cichlid, so remove OD feature under sriov vf.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Monk Liu<monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:10:35 -05:00
Dave Airlie
a1f1054124 Merge tag 'amd-drm-fixes-5.12-2021-03-03' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.12-2021-03-03:

amdgpu:
- S0ix fix
- Handle new NV12 SKU
- Misc power fixes
- Display uninitialized value fix
- PCIE debugfs register access fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210304043255.3792-1-alexander.deucher@amd.com
2021-03-05 11:13:22 +10:00
Kevin Wang
1aa46901ee drm/amdgpu: fix parameter error of RREG32_PCIE() in amdgpu_regs_pcie
the register offset isn't needed division by 4 to pass RREG32_PCIE()

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-03 23:05:16 -05:00
Alex Deucher
25951362db drm/amdgpu: enable BACO runpm by default on sienna cichlid and navy flounder
It works fine and was only disabled because primary GPUs
don't enter runpm if there is a console bound to the fbdev due
to the kmap.  This will at least allow runpm on secondary cards.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 23:05:16 -05:00
Asher.Song
0c61ac8134 drm/amdgpu:disable VCN for Navi12 SKU
Navi12 0x7360/C7 SKU has no video support, so remove it.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Asher.Song <Asher.Song@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-03 22:48:33 -05:00
Alex Deucher
31ada99bdd drm/amdgpu: Only check for S0ix if AMD_PMC is configured
The S0ix check only makes sense if the AMD PMC driver is
present.  We need to use the legacy S3 pathes when the
PMC driver is not present.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-03 22:46:55 -05:00
Chen Li
147ab7a187 drm/amdgpu: correct DRM_ERROR for kvmalloc_array
This may avoid debug confusion.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Chen Li <chenli@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 10:51:37 -05:00
Chen Li
b4d916ee0e drm/amdgpu: Use kvmalloc for CS chunks
The number of chunks/chunks_array may be passed in
by userspace and can be large.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Chen Li <chenli@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 10:51:37 -05:00
Jiapeng Chong
fec432f557 drm/amdgpu: Remove unnecessary conversion to bool
Fix the following coccicheck warnings:

./drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:2252:40-45: WARNING: conversion
to bool not needed here.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 10:51:37 -05:00
Kevin Wang
43fb6c195d drm/amdgpu: fix parameter error of RREG32_PCIE() in amdgpu_regs_pcie
the register offset isn't needed division by 4 to pass RREG32_PCIE()

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 10:51:37 -05:00
Kevin Wang
e7bdf00e00 drm/amdgpu: add SECURE DISPLAY TA firmware info in debugfs
add SECUREDISPLAY TA firmware info in amdgpu_fimrware_info()

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 10:51:35 -05:00
Kevin Wang
4d5ae731c4 drm/amdgpu: refine PSP TA firmware info print in debugfs
refine PSP TA firmware info print in amdgpu_firmware_info().

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 10:51:33 -05:00
Alex Deucher
03e0dbcd10 drm/amdgpu: enable BACO runpm by default on sienna cichlid and navy flounder
It works fine and was only disabled because primary GPUs
don't enter runpm if there is a console bound to the fbdev due
to the kmap.  This will at least allow runpm on secondary cards.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-02 14:17:04 -05:00
Alex Deucher
9598173d14 drm/amdgpu: Only check for S0ix if AMD_PMC is configured
The S0ix check only makes sense if the AMD PMC driver is
present.  We need to use the legacy S3 pathes when the
PMC driver is not present.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-02 14:17:04 -05:00
Jonathan Kim
640a28b50c drm/amdgpu: add missing df counter disable write
Request to stop DF performance counters is missing the actual write to the
controller register.

Reported-by: Chris Freehill <chris.freehill@amd.com>
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-02 14:07:52 -05:00
Kevin Wang
3e9e62c780 drm/amdgpu: correct TA RAP firmware information print error
miss RAP TA in loop. (when i == 4)

Fix:
drm/amdgpu: add RAP TA version print in amdgpu_firmware_info

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reported-by: Candice Li <candice.li@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-02 14:05:05 -05:00
Dennis Li
11003c68b1 drm/amdgpu: remove unnecessary reading for epprom header
If the number of badpage records exceed the threshold, driver has
updated both epprom header and control->tbl_hdr.header before gpu reset,
therefore GPU recovery thread no need to read epprom header directly.

v2: merge amdgpu_ras_check_err_threshold into amdgpu_ras_eeprom_check_err_threshold

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Kevin Wang
4890d4e94d drm/amdgpu: add RAP TA version print in amdgpu_firmware_info
add RAP TA version print in amdgpu_firmware_info.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Yang Li
7271a5c2ae drm/amdgpu: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
Fix the following coccicheck warning:
./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1589:0-23: WARNING:
fops_ib_preempt should be defined with DEFINE_DEBUGFS_ATTRIBUTE
./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1592:0-23: WARNING:
fops_sclk_set should be defined with DEFINE_DEBUGFS_ATTRIBUTE

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Alex Deucher
6f786950b1 drm/amdgpu/codec: drop the internal codec index
And just use the ioctl index.  They are the same.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Alex Deucher
b50368da61 drm/amdgpu: bump driver version for new video codec INFO ioctl query
So mesa can check when to query the kernel vs use hardcoded
codec bandwidth data.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Alex Deucher
f35e9bdb06 drm/amdgpu: add INFO ioctl support for querying video caps (v4)
We currently hardcode these in mesa, but querying them from
the kernel makes more sense since there may be board specific
limitations that the kernel driver is better suited to
determining.

Userpace patches that use this interface:
https://gitlab.freedesktop.org/leoliu/drm/-/commits/info_video_caps
https://gitlab.freedesktop.org/leoliu/mesa/-/commits/info_video_caps

v2: reorder the codecs to better align with mesa
v3: add max_pixels_per_frame to handle the portrait case, squash in
    memory leak fix
v4: drop extra break

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Alex Deucher
3b246e8b6a drm/amdgpu: add video decode/encode cap tables and asic callbacks (v3)
For each asic family.  Will be used to populate tables
for the new INFO ioctl query.

v2: add max_pixels_per_frame to handle the portrait case
v3: fix copy paste typos

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Alex Deucher
9269bf1868 drm/amdgpu: add asic callback for querying video codec info (v3)
This will be used by a new INFO ioctl query to fetch the decode
and encode capabilities from the kernel driver rather than
hardcoding them in mesa.  This gives us more fine grained control
of capabilities using information that is only availabl in the
kernel (e.g., platform limitations or bandwidth restrictions).

v2: reorder the codecs to better align with mesa
v3: add max_pixels_per_frame to handle the portrait case

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Aurabindo Pillai
0eb1af2e82 drm/amd/display: Add module parameter for freesync video mode
[Why]
This option shall be opt-in by default since it is a temporary solution
until long term solution is agreed upon which may require userspace interface
changes. This feature give the user a seamless experience when freesync aware
programs (media players for instance) switches to a compatible freesync mode
when playing videos. Enabling this feature also have the potential side effect
of causing higher power consumption due to running a mode with lower resolution
and base clock frequency with the highest base clock supported on the monitor as
per its advertised modes. There has been precedent of manufacturing modes in the
kernel. In AMDGPU, the existing usage are for common modes and scaling modes.
Other driver have a similar approach as well.

[How]
Adds a module parameter to enable freesync video mode modeset
optimization. Enabling this mode allows the driver to skip a full modeset when a
freesync compatible mode is requested by the userspace. This parameter will also
add some additional modes that are within the connected monitor's VRR range
corresponding to common video modes, which media players can use for a seamless
experience while making use of freesync.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:37 -05:00
Ramesh Errabolu
5392b2af97 drm/amdgpu: Remove amdgpu_device arg from free_sgt api (v2)
Currently callers have to provide handle of amdgpu_device,
which is not used by the implementation. It is unlikely this
parameter will become useful in future, thus removing it

v2: squash in unused variable fix

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:19:21 -05:00
Jingwen Chen
8f8c80f430 drm/amd/amdgpu: move inc gpu_reset_counter after drm_sched_stop
Move gpu_reset_counter after drm_sched_stop to avoid race
condition caused by job submitted between reset_count +1 and
drm_sched_stop.

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:19:21 -05:00
Changfeng
996aede280 drm/amdgpu: decline max_me for mec2_fw remove in renoir/arcturus
The value of max_me in amdgpu_gfx_rlc_setup_cp_table should reduce to 4
when mec2_fw is removed on asic renoir/arcturus. Or it will cause kernel
NULL pointer when modprobe driver.

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:01:27 -05:00
Asher.Song
650bc7ae00 drm/amdgpu:disable VCN for Navi12 SKU
Navi12 0x7360/C7 SKU has no video support, so remove it.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Asher.Song <Asher.Song@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:01:14 -05:00
Dennis Li
f89b881c81 drm/amdgpu: reserve backup pages for bad page retirment
To ensure user has a constant of VRAM accessible in run-time, driver
reserves limit backup pages when init, and return ones when bad pages
retired, to keep no change of unused memory size.

v2: refine codes to calculate badpags threshold

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:00:56 -05:00
Linus Torvalds
fdce29602f drm fixes for 5.12-rc1 + msm-next
core:
 - vblank fence timing improvements
 
 dma-buf:
 - improve error handling
 
 ttm:
 - memory leak fix
 
 msm:
 - a6xx speedbin support
 - a508, a509, a512 support
 - various a5xx fixes
 - various dpu fixes
 - qseed3lite support for sm8250
 - dsi fix for msm8994
 - mdp5 fix for framerate bug with cmd mode panels
 - a6xx GMU OOB race fixes that were showing up in CI
 - various addition and removal of semicolons
 - gem submit fix for legacy userspace relocs path
 
 amdgpu:
 - Clang warning fix
 - S0ix platform shutdown/poweroff fix
 - Misc display fixes
 
 i915:
 - color format fix
 - -Wuninitialised reenabled
 - GVT ww locking, cmd parser fixes
 
 atyfb:
 - fix build
 
 rockchip:
 - AFBC modifier fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJgN/iUAAoJEAx081l5xIa+HZQP/RiK+wO8OvSrLeflUWEr2gUC
 5NBbhGBB2dTNXeL7zFYXk4hC5HLYG/aDVfgXDctWVVcUslziEu3vsnHUuQQZrqxM
 8MYRIaOvb+Ic/ZzU8A6Jz6gK3zv8Rmu2/PXOLjkYFTfpwAKPDhipCr6momn3rGLz
 cwzcKwCCHVPIUb6Hp+4739dsI8TsxhBcuIpjj+8yRZHxFNUue+40n1nPh6GTWXwf
 rbnr/sR0SnfUwv8pwzjLeH4Fuj8w2S65Cdbjap868jO2b0qVrGyLpRBTd3611L1X
 /HhFOnJ7BHHvZ28Jsgl3cbtaKqlB1geBq1Yh04xZWrHxcMhVC+J2cOtW2wyMJZ99
 RayxPyxnCig0BuSHQzuWMT4o2wKbTgxBWL/Ys+QFVwZd0Y+BbKVrFPn2pRJ1vtb8
 LNAQVppcyfdAIXF9eqse+JhqTn2Z2BgjM4cm/8EA7yvX6afRm+GJKghrLjWKYqrI
 2iZKc8e4pgc532D3NseK2Xu4otIp7xPSW0wL6+g2ozLvHnwH+GJGI8TY85PoVJF2
 ZrWkDWxW68v3mpxeJnX3VoMS9RUhBJjRoqcBsetf+bLL/qHeIdqEpPaTa9PAOGY4
 CVR5vW6qvXfCjLCht8x/BTvLIToTFaHMIun6QYgCX1HbGu2k9ghpyC/IwiwwLsxV
 Yh9wHBR6Bzi/FRnZh23z
 =hrdL
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-02-26' of git://anongit.freedesktop.org/drm/drm

Pull more drm updates from Dave Airlie:
 "This is mostly fixes but I missed msm-next pull last week. It's been
  in drm-next.

  Otherwise it's a selection of i915, amdgpu and misc fixes, one TTM
  memory leak, nothing really major stands out otherwise.

  core:
   - vblank fence timing improvements

  dma-buf:
   - improve error handling

  ttm:
   - memory leak fix

  msm:
   - a6xx speedbin support
   - a508, a509, a512 support
   - various a5xx fixes
   - various dpu fixes
   - qseed3lite support for sm8250
   - dsi fix for msm8994
   - mdp5 fix for framerate bug with cmd mode panels
   - a6xx GMU OOB race fixes that were showing up in CI
   - various addition and removal of semicolons
   - gem submit fix for legacy userspace relocs path

  amdgpu:
   - clang warning fix
   - S0ix platform shutdown/poweroff fix
   - misc display fixes

  i915:
   - color format fix
   - -Wuninitialised reenabled
   - GVT ww locking, cmd parser fixes

  atyfb:
   - fix build

  rockchip:
   - AFBC modifier fix"

* tag 'drm-next-2021-02-26' of git://anongit.freedesktop.org/drm/drm: (60 commits)
  drm/panel: kd35t133: allow using non-continuous dsi clock
  drm/rockchip: Require the YTR modifier for AFBC
  drm/ttm: Fix a memory leak
  drm/drm_vblank: set the dma-fence timestamp during send_vblank_event
  dma-fence: allow signaling drivers to set fence timestamp
  dma-buf: heaps: Rework heap allocation hooks to return struct dma_buf instead of fd
  dma-buf: system_heap: Make sure to return an error if we abort
  drm/amd/display: Fix system hang after multiple hotplugs (v3)
  drm/amdgpu: fix shutdown and poweroff process failed with s0ix
  drm/i915: Nuke INTEL_OUTPUT_FORMAT_INVALID
  drm/i915: Enable -Wuninitialized
  drm/amd/display: Remove Assert from dcn10_get_dig_frontend
  drm/amd/display: Add vupdate_no_lock interrupts for DCN2.1
  Revert "drm/amd/display: reuse current context instead of recreating one"
  drm/amd/pm/swsmu: Avoid using structure_size uninitialized in smu_cmn_init_soft_gpu_metrics
  fbdev: atyfb: add stubs for aty_{ld,st}_lcd()
  drm/i915/gvt: Introduce per object locking in GVT scheduler.
  drm/i915/gvt: Purge dev_priv->gt
  drm/i915/gvt: Parse default state to update reg whitelist
  dt-bindings: dp-connector: Drop maxItems from -supply
  ...
2021-02-25 12:10:22 -08:00
Rikard Falkeborn
fff72bb569 drm/amdgpu/ttm: constify static vm_operations_struct
The only usage of amdgpu_ttm_vm_ops is to assign its address to the
vm_ops field in the vm_area_struct struct. Make it const to allow the
compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210209234817.55112-2-rikard.falkeborn@gmail.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-02-25 14:40:24 +01:00
Prike Liang
b092b19602 drm/amdgpu: fix shutdown and poweroff process failed with s0ix
In the shutdown and poweroff opt on the s0i3 system we still need
un-gate the gfx clock gating and power gating before destory amdgpu device.

Fixes: 628c36d7b2 ("drm/amdgpu: update amdgpu device suspend/resume sequence for s0i3 support")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1499
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-24 09:48:46 -05:00
Jiapeng Chong
6c65a582ee drm/amdgpu: Remove unnecessary conversion to bool
Fix the following coccicheck warnings:

./drivers/gpu/drm/amd/amdgpu/athub_v2_1.c:79:40-45: WARNING: conversion
to bool not needed here.

./drivers/gpu/drm/amd/amdgpu/athub_v2_1.c:81:40-45: WARNING: conversion
to bool not needed here.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-24 09:28:55 -05:00
Sonny Jiang
b2576c3bf4 drm/amdgpu/vcn3.0: add wptr/rptr reset/update for share memory
Because of dpg, the rptr/wptr need to be saved on fw shared memory,
and restore them back in RBC_RB_RPTR/WPTR in kernel at power up.

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-24 09:28:55 -05:00
John Clements
f8f70c1371 drm/amdgpu: disable mec2 fw bin loading
disable mec2 fw bin loading and reference on unsupported ASIC

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-24 09:28:55 -05:00
Tao Zhou
211fe484a6 drm/amdgpu: fix wrong executable setting for dimgrey_cavefish_reg_init.c
Remove executable configuration for the file.

Reported-by: Ming Wang <wangming01@loongson.cn>
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-24 09:28:55 -05:00
Prike Liang
b00978de90 drm/amdgpu: fix shutdown and poweroff process failed with s0ix
In the shutdown and poweroff opt on the s0i3 system we still need
un-gate the gfx clock gating and power gating before destory amdgpu device.

Fixes: 628c36d7b2 ("drm/amdgpu: update amdgpu device suspend/resume sequence for s0i3 support")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1499
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-24 09:28:55 -05:00
Jiapeng Chong
cd48758c82 drm/amdgpu/sdma5.2: Remove unnecessary conversion to bool
Fix the following coccicheck warnings:

./drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:1621:40-45: WARNING: conversion
to bool not needed here.

./drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:1619:40-45: WARNING: conversion
to bool not needed here.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-24 09:28:54 -05:00
Nirmoy Das
d4a9ffdf71 drm/amdgpu: remove unused variable from struct amdgpu_bo
Fixes: 62914a99de ("drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror")
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-22 18:03:07 -05:00
Linus Torvalds
d99676af54 drm pull for 5.12-rc1
docs:
 - lots of updated docs
 
 core:
 - require crtc to have unique primary plane
 - fourcc macro fix
 - PCI bar quirk for bar resizing
 - don't sent hotplug on error
 - move vm code to legacy
 - nuke hose only used on old oboslete alpha
 
 dma-buf:
 - kernel doc updates
 - improved lock tracking
 
 dp/hdmi:
 - DP-HDMI2.1 protocol converter support
 
 ttm:
 - bo size handling cleanup
 - release a pinned bo warning
 - cleanup lru handler
 - avoid using pages with drm_prime_sg_to_page_addr_arrays
 
 cma-helper:
 - prime/mmap fixes
 
 bridge:
 - add DP support
 
 gma500:
 - remove gma3600 support
 
 i915:
 - try eDP fast/narrow link again with fallback
 - Intel eDP backlight control
 - replace display register read/write macros
 - refactor intel_display.c
 - display power improvements
 - HPD code cleanup
 - Rocketlake display fixes
 - Power/backlight/RPM fixes
 - DG1 display fix
 - IVB/BYT clear residuals security fix again
 - make i915 mitigations options via parameter
 - HSW GT1 GPU hangs fixes
 - DG1 workaround hang fixes
 - TGL DMAR hang avoidance
 - Lots of GT fixes
 - follow on fixes for residuals clear
 - gen7 per-engine-reset support
 - HDCP2.2 + HDCP1.4 GEN12 DP MST support
 - TGL clear color support
 - backlight refactoring
 - VRR/Adaptive sync enabling on DP/EDP for TGL+
 - async flips for all ilk+
 
 amdgpu:
 - rework IH ring handling (Vega/Navi)
 - rework HDP handling (Vega/Navi)
 - swSMU updates for renoir/vangogh
 - Sienna Cichild overdrive support
 - FP16 on DCE8-11 support
 - GPU reset on navy flounder/vangogh
 - SMU profile fixes for APU
 - SR-IOV fixes
 - Vangogh SMU fixes
 - fan speed control fixes
 
 amdkfd:
 - config handling fix
 - buffer free fix
 - recursive lock warnings fix
 
 nouveau:
 - Turing MMU fault recovery fixes
 - mDP connectors reporting fix
 - audio locking fixes
 - rework engines/instances code to support new scheme
 
 tegra:
 - VIC newer firmware support
 - display/gr2d fixes for older tegra
 - pm reference leak fix
 
 mediatek:
 - SOC MT8183 support
 - decouple sub driver + share mtk mutex driver
 
 radeon:
 - PCI resource fix for some platforms
 
 ingenic:
 - pm support
 - 8-bit delta RGB panels
 
 vmwgfx:
 - managed driver helpers
 
 vc4:
 - BCM2711 DSI1 support
 - converted to atomic helpers
 - enable 10/12 bpc outputs
 - gem prime mmap helpers
 - CEC fix
 
 omap:
 - use degamma table
 - CTM support
 - rework DSI support
 
 imx:
 - stack usage fixes
 - drm managed support
 - imx-tve clock provider leak fix
 -
 
 rcar-du:
 - default mode fixes
 - conversion to managed API
 
 hisilicon:
 - use simple encoder
 
 vkms:
 - writeback connector support
 
 d3:
 - BT2020 support
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJgL1RCAAoJEAx081l5xIa+BxoP/325goULPaGBwUKgVkSl6mTT
 Ror0r8U3ifQHrqPk57C5b4GfvNuJ8vJZC13GYiiwooPn/+sifbl8haMRQWKyH4fz
 PThm9vroIQZ8VC+fqixgrOwFKEwkKqucZ3f7dEj8paBVVcO9DcBIaSeO4QW2EAR/
 n2r7nHtFxVHYEwiOnJvIeWIh1dAmudr/U6pHyB6PnuofVgqveXHT5+mmkY51pJqF
 sn2Y+Ye3tP5+FDlKkueg8JUteyFRTGz1g7JQThxSI//b/+p4MmmRX03qcWvIIkOX
 XiNlP73Ssh7PPMcUgwFmvKbMfm9sfpwf7yX3nqzaAQAHZGufznxX0k50BRkxWyYL
 eMVxRs5/Vl5JAn3vhspAUZhc4BgOcJm9L4zazb7YqDghwpohSnXk/riunUevqFCf
 Dgsc8N63nft8WEBk3aB6loRpDDpo5rm8gVpl5LKk1YXT92o9x4eP+/B1+kf2RepM
 52H3CKD1GLK3ayJlRNa/ljE2qXaQru+PmjCxORgDPEZ7SXdb8q5bfH0MjCB4vEBp
 YIybWYIDQzRBKglN5qMQ3XNIgv95oqrxXKaDFFtp8lMEjVG0v+y2antzFHftXS2g
 Cj0aeyBx4PC3pNbZe54npEhFwVIs7NFXX9brpQnnLJvQj/Qp+GEhf8uqiCUJNnYA
 AF7qRRL0bBGTeiJGt4nM
 =TeKl
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-02-19' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "A pretty normal tree, lots of refactoring across the board, ttm, i915,
  nouveau, and bunch of features in various drivers.

  docs:
   - lots of updated docs

  core:
   - require crtc to have unique primary plane
   - fourcc macro fix
   - PCI bar quirk for bar resizing
   - don't sent hotplug on error
   - move vm code to legacy
   - nuke hose only used on old oboslete alpha

  dma-buf:
   - kernel doc updates
   - improved lock tracking

  dp/hdmi:
   - DP-HDMI2.1 protocol converter support

  ttm:
   - bo size handling cleanup
   - release a pinned bo warning
   - cleanup lru handler
   - avoid using pages with drm_prime_sg_to_page_addr_arrays

  cma-helper:
   - prime/mmap fixes

  bridge:
   - add DP support

  gma500:
   - remove gma3600 support

  i915:
   - try eDP fast/narrow link again with fallback
   - Intel eDP backlight control
   - replace display register read/write macros
   - refactor intel_display.c
   - display power improvements
   - HPD code cleanup
   - Rocketlake display fixes
   - Power/backlight/RPM fixes
   - DG1 display fix
   - IVB/BYT clear residuals security fix again
   - make i915 mitigations options via parameter
   - HSW GT1 GPU hangs fixes
   - DG1 workaround hang fixes
   - TGL DMAR hang avoidance
   - Lots of GT fixes
   - follow on fixes for residuals clear
   - gen7 per-engine-reset support
   - HDCP2.2 + HDCP1.4 GEN12 DP MST support
   - TGL clear color support
   - backlight refactoring
   - VRR/Adaptive sync enabling on DP/EDP for TGL+
   - async flips for all ilk+

  amdgpu:
   - rework IH ring handling (Vega/Navi)
   - rework HDP handling (Vega/Navi)
   - swSMU updates for renoir/vangogh
   - Sienna Cichild overdrive support
   - FP16 on DCE8-11 support
   - GPU reset on navy flounder/vangogh
   - SMU profile fixes for APU
   - SR-IOV fixes
   - Vangogh SMU fixes
   - fan speed control fixes

  amdkfd:
   - config handling fix
   - buffer free fix
   - recursive lock warnings fix

  nouveau:
   - Turing MMU fault recovery fixes
   - mDP connectors reporting fix
   - audio locking fixes
   - rework engines/instances code to support new scheme

  tegra:
   - VIC newer firmware support
   - display/gr2d fixes for older tegra
   - pm reference leak fix

  mediatek:
   - SOC MT8183 support
   - decouple sub driver + share mtk mutex driver

  radeon:
   - PCI resource fix for some platforms

  ingenic:
   - pm support
   - 8-bit delta RGB panels

  vmwgfx:
   - managed driver helpers

  vc4:
   - BCM2711 DSI1 support
   - converted to atomic helpers
   - enable 10/12 bpc outputs
   - gem prime mmap helpers
   - CEC fix

  omap:
   - use degamma table
   - CTM support
   - rework DSI support

  imx:
   - stack usage fixes
   - drm managed support
   - imx-tve clock provider leak fix
-

  rcar-du:
   - default mode fixes
   - conversion to managed API

  hisilicon:
   - use simple encoder

  vkms:
   - writeback connector support

  d3:
   - BT2020 support"

* tag 'drm-next-2021-02-19' of git://anongit.freedesktop.org/drm/drm: (1459 commits)
  drm/amdgpu: Set reference clock to 100Mhz on Renoir (v2)
  drm/radeon: OLAND boards don't have VCE
  drm/amdkfd: Fix recursive lock warnings
  drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()
  drm/amd/display: Fix potential integer overflow
  drm/amdgpu/display: remove hdcp_srm sysfs on device removal
  drm/amdgpu: fix CGTS_TCC_DISABLE register offset on gfx10.3
  drm/i915/gt: Correct surface base address for renderclear
  drm/i915: Disallow plane x+w>stride on ilk+ with X-tiling
  drm/nouveau/top/ga100: initial support
  drm/nouveau/top: add ioctrl/nvjpg
  drm/nouveau/privring: rename from ibus
  drm/nouveau/nvkm: remove nvkm_subdev.index
  drm/nouveau/nvkm: determine subdev id/order from layout
  drm/nouveau/vic: switch to instanced constructor
  drm/nouveau/sw: switch to instanced constructor
  drm/nouveau/sec2: switch to instanced constructor
  drm/nouveau/sec: switch to instanced constructor
  drm/nouveau/pm: switch to instanced constructor
  drm/nouveau/nvenc: switch to instanced constructor
  ...
2021-02-21 14:44:44 -08:00
Nirmoy Das
ea1b8c9b83 drm/amdgpu: mark local function as static
Mark amdgpu_ras_debugfs_create_ctrl_node() as static.

Fixes: eb14235668777b ("drm/amdgpu: do not keep debugfs dentry")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:11 -05:00
Alex Deucher
6e80fb8ab0 drm/amdgpu: Set reference clock to 100Mhz on Renoir (v2)
Fixes the rlc reference clock used for GPU timestamps.
Value is 100Mhz.  Confirmed with hardware team.

v2: reword commit message.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1480
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-18 16:43:09 -05:00
Nirmoy Das
98d28ac2f5 drm/amdgpu: do not use drm middle layer for debugfs
Use debugfs API directly instead of drm middle layer.

This also includes following debugfs file output changes:
1 amdgpu_evict_vram/amdgpu_evict_gtt output will not contain any braces.
  e.g. (0) --> 0
2 amdgpu_gpu_recover output will print return value of
  amdgpu_device_gpu_recover() instead of not so important "gpu recover"
  message.

v2: * checkpatch.pl: use '0444' instead of S_IRUGO.
    * remove S_IFREG from mode.
    * remove mode variable.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Nirmoy Das
373720f79d drm/amd/pm: do not use drm middle layer for debugfs
Use debugfs API directly instead of drm middle layer.

v2: * checkpatch.pl: use '0444' instead of S_IRUGO.
    * remove S_IFREG from mode.
    * remove mode variable.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Nirmoy Das
afd3a359c4 drm/amd/display: do not use drm middle layer for debugfs
Use debugfs API directly instead of drm middle layer.

v2: * checkpatch.pl: use '0444' instead of S_IRUGO.
    * remove S_IFREG from mode.
    * remove mode variable.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Nirmoy Das
0299bef975 drm/amdgpu: remove CONFIG_DRM_AMDGPU_GART_DEBUGFS
Removed unused CONFIG_DRM_AMDGPU_GART_DEBUGFS code.
We can use umr instead of this gart debugfs.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Nirmoy Das
88293c03c8 drm/amdgpu: do not keep debugfs dentry
Cleanup unnecessary debugfs dentries and surrounding functions.

v3: remove return value check for debugfs_create_file()
v2: remove ttm_debugfs_entries array.
    do not init variables.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Marek Olšák
4112c00354 drm/amdgpu: fix CGTS_TCC_DISABLE register offset on gfx10.3
This fixes incorrect TCC harvesting info reported to userspace.
The impact was a very very tiny performance degradation (unnecessary
GL2 cache flushes).

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-18 16:42:55 -05:00
Sakari Ailus
92f1d09ca4 drm: Switch to %p4cc format modifier
Switch DRM drivers from drm_get_format_name() to %p4cc. This gets rid of a
large number of temporary variables at the same time.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210216155723.17109-4-sakari.ailus@linux.intel.com
2021-02-17 12:52:59 +01:00
Rafael J. Wysocki
6e60afb22c Merge branches 'acpi-misc', 'acpi-cppc', 'acpi-docs', 'acpi-config' and 'acpi-apei'
* acpi-misc:
  ACPI: Test for ACPI_SUCCESS rather than !ACPI_FAILURE
  ACPI: Use DEVICE_ATTR_<RW|RO|WO> macros

* acpi-cppc:
  ACPI: CPPC: initialise vaddr pointers to NULL
  ACPI: CPPC: add __iomem annotation to generic_comm_base pointer
  ACPI: CPPC: remove __iomem annotation for cpc_reg's address

* acpi-docs:
  Documentation: ACPI: add new rule for gpio-line-names

* acpi-config:
  ACPI: configfs: add missing check after configfs_register_default_group()

* acpi-apei:
  ACPI: APEI: ERST: remove unneeded semicolon
  ACPI: APEI: Add is_generic_error() to identify GHES sources
2021-02-15 17:04:40 +01:00
Tian Tao
802b8c8355 drm/amdgpu: fix unnecessary NULL check warnings
Remove NULL checks before vfree() to fix these warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:102:2-8: WARNING: NULL
check before some freeing functions is not needed.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:49:33 -05:00
Jiawei Gu
006cc1a213 drm/amdgpu: extend MAX_KIQ_REG_TRY to 1000
Extend retry times of KIQ to avoid starvation situation caused by
long time full access of GPU by other VFs.

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:48:49 -05:00
Tao Zhou
27859ee3df drm/amdgpu: enable gpu recovery for dimgrey_cavefish
As dimgrey_cavefish driver is stable enough, set gpu recovery as default
in HW hang for dimgrey_cavefish.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:25 -05:00
Alex Deucher
cef8b03bbc drm/amdgpu: reset runpm flag if device suspend fails
If device suspend fails when we attempt to runtime suspend,
reset the runpm flag.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:13 -05:00
Alex Deucher
ad887af9b6 drm/amdgpu: use runpm flag rather than fbcon for kfd runtime suspend (v2)
the flag used by kfd is not actually related to fbcon, it just happens
to align.  Use the runpm flag instead so that we can decouple it from
the fbcon flag.

v2: fix resume as well

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:07 -05:00
Alex Deucher
a8d3d80a8c drm/amdgpu: drop extra drm_kms_helper_poll_enable/disable calls
These are already called in amdgpu_device_suspend/resume which
are already called in the same functions.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:03 -05:00
Alex Deucher
f172865a36 drm/amdgpu/nv: add PCI reset support
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism.  This should in general
only be used for validation.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:01 -05:00
Alex Deucher
1176a1e0b9 drm/amdgpu/soc15: add PCI reset support
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism.  This should in general
only be used for validation.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:59 -05:00
Alex Deucher
ffbfd081b4 drm/amdgpu/si: add PCI reset support
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism.  This should in general
only be used for validation.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:57 -05:00
Alex Deucher
af484df800 drm/amdgpu: add generic pci reset as an option
This allows us to use generic PCI reset mechanisms (FLR, SBR) as
a reset mechanism to verify that the generic PCI reset mechanisms
are working properly.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:54 -05:00
Alex Deucher
d5ab066917 drm/amdgpu/vi: minor clean up of reset code
Drop duplicate reset method logging, whitespace changes.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:52 -05:00
Alex Deucher
44ab8bb0bb drm/amdgpu/cik: minor clean up of reset code
Drop duplicate reset method logging, whitespace changes.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:50 -05:00
Alex Deucher
25bd55276b drm/amdgpu/si: minor clean up of reset code
Drop duplicate reset method logging, whitespace changes.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:40 -05:00
Nirmoy Das
f8bf645018 drm/amdgpu: enable wave limit on non high prio cs pipes
To achieve the best QoS for high priority compute jobs it is
required to limit waves on other compute pipes as well.
This patch will set min value in non high priority
mmSPI_WCL_PIPE_PERCENT_CS[0-3] registers to minimize the
impact of normal/low priority compute jobs over high priority
compute jobs.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:08 -05:00
Wayne Lin
11f1a5538b drm/amdgpu: Add otg vertical IRQ Source
[Why & How]
In order to get appropriate timing for registers which
read/write is vertical line sensitive, add new IRQ source variable.
This interrupt is triggered by specific vertical line,

Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:55 -05:00
Kevin Wang
be8901c2ee drm/amdgpu: optimize list operation in amdgpu_xgmi
simplify the list operation.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:49 -05:00
Likun Gao
1001f2a1f3 drm/amdgpu: support rom clockgating related function for NV family
Add functions to support enable/disable rom clock gating and get rom
clock gating status.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:36 -05:00
Likun Gao
0bf7f2dcb9 drm/amdgpu: switch to use smuio callbacks for NV family
Switch to smuio callbacks: use smuio v11_0_6 callbacks for
Sienna_cichlid and forward ASIC, use smuio v11_0 callbacks for the
other NV family ASIC.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:27 -05:00
Likun Gao
1deb98534c drm/amdgpu: implement smuio v11_0_6 callbacks
Implement smuio v11_0_6 callbacks which will used by Sienna_Cichlid and
forward ASIC.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:21 -05:00
Likun Gao
e1edaeafeb drm/amdgpu: support ASPM for some specific ASIC
Support to program ASPM and LTR for Sienna Cichlid and forward ASIC.
Disable ASPM for Sienna Cichlid and forward ASIC by default.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:04 -05:00
Kenneth Feng
680602d6c2 drm/amd/pm: enable DCS
Enable DCS

V1: Enable Async DCS.
V2: Add the ppfeaturemask bit to enable from the modprobe parameter.
V3:
1. add the flag to skip APU support.
2. remove the hunk for workload selection since
it doesn't impact the function.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:27:57 -05:00
Alex Deucher
e83db77487 drm/amdgpu/gmc9: fix mmhub client mapping for arcturus
The hw interface changed on arcturus so the old numbering
scheme doesn't work.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:27:47 -05:00
Nirmoy Das
22e4f31529 drm/amdgpu: enable gfx wave limiting for high priority compute jobs
Enable gfx wave limiting for gfx jobs before pushing high priority
compute jobs so that high priority compute jobs gets more resources
to finish early.

v2: use ring priority instead of job priority.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:27:11 -05:00
Nirmoy Das
0a52a6cacc drm/amdgpu: add wave limit functionality for gfx8,9
Wave limiting can be use to load balance high priority
compute jobs along with gfx jobs. When enabled, this will reserve
~75% of waves for compute jobs.

We do not need this from gfx10 onwards because >=gfx10 has
asynchronous compute tunneling to replace wave limit requirement.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:27:04 -05:00
Nirmoy Das
8c0225d792 drm/amdgpu: enable only one high prio compute queue
For high priority compute to work properly we need to enable
wave limiting on gfx pipe. Wave limiting is done through writing
into mmSPI_WCL_PIPE_PERCENT_GFX register. Enable only one high
priority compute queue to avoid race condition between multiple
high priority compute queues writing that register simultaneously.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:26:56 -05:00
Nirmoy Das
ebdd2e9d1a drm/amdgpu: cleanup struct amdgpu_ring
This patch consist of below related changes:

1 Rename ring->priority to ring->hw_prio.
2 Assign correct hardware ring priority.
3 Remove ring->priority_mutex as ring priority remains unchanged
  after initialization.
4 Remove unused ring->num_jobs.

v3: remove ring->num_jobs.
v2: remove ring->priority_mutex.

Fixes: 33abcb1f5a ("drm/amdgpu: set compute queue priority at mqd_init")
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:26:41 -05:00
Christian König
f07069da6b drm/ttm: move memory accounting into vmwgfx v4
This is just another feature which is only used by VMWGFX, so move
it into the driver instead.

I've tried to add the accounting sysfs file to the kobject of the drm
minor, but I'm not 100% sure if this works as expected.

v2: fix typo in KFD and avoid 64bit divide
v3: fix init order in VMWGFX
v4: use pdev sysfs reference instead of drm

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zack Rusin <zackr@vmware.com> (v3)
Tested-by: Nirmoy Das <nirmoy.das@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-2-christian.koenig@amd.com
2021-02-09 17:27:33 +01:00
Christian König
f2f12eb9c3 drm/scheduler: provide scheduler score externally
Allow multiple schedulers to share the load balancing score.

This is useful when one engine has different hw rings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com
2021-02-05 10:47:11 +01:00
Christian König
cd9b0159be drm/amdgpu: enable freesync for A+A configs
Some newer APUs can scanout directly from GTT, that saves us from
allocating another bounce buffer in VRAM and enables freesync in such
configurations.

Without this patch creating a framebuffer from the imported BO will
fail and userspace will fall back to a copy.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 17:36:33 -05:00
chen gong
ea41bd232f drm/amdgpu/gfx10: update CGTS_TCC_DISABLE and CGTS_USER_TCC_DISABLE register offsets for VGH
For Vangogh:
The offset of the CGTS_TCC_DISABLE is 0x5006 by calculation.
The offset of the CGTS_USER_TCC_DISABLE is 0x5007 by calculation.

Signed-off-by: chen gong <curry.gong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 17:34:56 -05:00
Huang Rui
b99a8c8f23 drm/amdkfd: fix null pointer panic while free buffer in kfd
In drm_gem_object_free, it will call funcs of drm buffer obj. So
kfd_alloc should use amdgpu_gem_object_create instead of
amdgpu_bo_create to initialize the funcs as amdgpu_gem_object_funcs.

[  396.231390] amdgpu: Release VA 0x7f76b4ada000 - 0x7f76b4add000
[  396.231394] amdgpu:   remove VA 0x7f76b4ada000 - 0x7f76b4add000 in entry 0000000085c24a47
[  396.231408] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  396.231445] #PF: supervisor read access in kernel mode
[  396.231466] #PF: error_code(0x0000) - not-present page
[  396.231484] PGD 0 P4D 0
[  396.231495] Oops: 0000 [#1] SMP NOPTI
[  396.231509] CPU: 7 PID: 1352 Comm: clinfo Tainted: G           OE     5.11.0-rc2-custom #1
[  396.231537] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS WCD0401N_Weekly_20_04_0 04/01/2020
[  396.231563] RIP: 0010:drm_gem_object_free+0xc/0x22 [drm]
[  396.231606] Code: eb ec 48 89 c3 eb e7 0f 1f 44 00 00 55 48 89 e5 48 8b bf 00 06 00 00 e8 72 0d 01 00 5d c3 0f 1f 44 00 00 48 8b 87 40 01 00 00 <48> 8b 00 48 85 c0 74 0b 55 48 89 e5 e8 54 37 7c db 5d c3 0f 0b c3
[  396.231666] RSP: 0018:ffffb4704177fcf8 EFLAGS: 00010246
[  396.231686] RAX: 0000000000000000 RBX: ffff993a0d0cc400 RCX: 0000000000003113
[  396.231711] RDX: 0000000000000001 RSI: e9cda7a5d0791c6d RDI: ffff993a333a9058
[  396.231736] RBP: ffffb4704177fdd0 R08: ffff993a03855858 R09: 0000000000000000
[  396.231761] R10: ffff993a0d1f7158 R11: 0000000000000001 R12: 0000000000000000
[  396.231785] R13: ffff993a0d0cc428 R14: 0000000000003000 R15: ffffb4704177fde0
[  396.231811] FS:  00007f76b5730740(0000) GS:ffff993b275c0000(0000) knlGS:0000000000000000
[  396.231840] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  396.231860] CR2: 0000000000000000 CR3: 000000016d2e2000 CR4: 0000000000350ee0
[  396.231885] Call Trace:
[  396.231897]  ? amdgpu_amdkfd_gpuvm_free_memory_of_gpu+0x24c/0x25f [amdgpu]
[  396.232056]  ? __dynamic_dev_dbg+0xcd/0x100
[  396.232076]  kfd_ioctl_free_memory_of_gpu+0x91/0x102 [amdgpu]
[  396.232214]  kfd_ioctl+0x211/0x35b [amdgpu]
[  396.232341]  ? kfd_ioctl_get_queue_wave_state+0x52/0x52 [amdgpu]

Fixes: 246cb7e49a ("drm/amdgpu: Introduce GEM object functions")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Changfeng <changzhu@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-02 17:33:34 -05:00
Huang Rui
89fa15ecdc drm/amdgpu: fix the issue that retry constantly once the buffer is oversize
We cannot modify initial_domain every time while the retry starts. That
will cause the busy waiting that unable to switch to GTT while the vram
is not enough.

Fixes: f8aab60422 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-02 17:32:06 -05:00
Christian König
dd017d01c3 drm/amdgpu: enable freesync for A+A configs
Some newer APUs can scanout directly from GTT, that saves us from
allocating another bounce buffer in VRAM and enables freesync in such
configurations.

Without this patch creating a framebuffer from the imported BO will
fail and userspace will fall back to a copy.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 12:06:54 -05:00
chen gong
2cb96b2387 drm/amdgpu/gfx10: update CGTS_TCC_DISABLE and CGTS_USER_TCC_DISABLE register offsets for VGH
For Vangogh:
The offset of the CGTS_TCC_DISABLE is 0x5006 by calculation.
The offset of the CGTS_USER_TCC_DISABLE is 0x5007 by calculation.

Signed-off-by: chen gong <curry.gong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 12:06:29 -05:00
xinhui pan
e1a4b67aac drm/amdgpu: Fix a false positive when pin non-VRAM memory
Flag TTM_PL_FLAG_CONTIGUOUS is only valid for VRAM domain. So fix the
false positive by checking memory type too.

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 12:06:21 -05:00
Ramesh Errabolu
b131c363c8 drm/amdgpu: Limit the maximum size of contiguous VRAM that can be encapsulated by an instance of DRM memory node
[Why]
Enable 1:1 mapping between VRAM of a DRM node and a scatterlist node

[How]
Ensure construction of DRM node to not exceed specified limit

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 12:05:29 -05:00
Huang Rui
875440fd7d drm/amdkfd: fix null pointer panic while free buffer in kfd
In drm_gem_object_free, it will call funcs of drm buffer obj. So
kfd_alloc should use amdgpu_gem_object_create instead of
amdgpu_bo_create to initialize the funcs as amdgpu_gem_object_funcs.

[  396.231390] amdgpu: Release VA 0x7f76b4ada000 - 0x7f76b4add000
[  396.231394] amdgpu:   remove VA 0x7f76b4ada000 - 0x7f76b4add000 in entry 0000000085c24a47
[  396.231408] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  396.231445] #PF: supervisor read access in kernel mode
[  396.231466] #PF: error_code(0x0000) - not-present page
[  396.231484] PGD 0 P4D 0
[  396.231495] Oops: 0000 [#1] SMP NOPTI
[  396.231509] CPU: 7 PID: 1352 Comm: clinfo Tainted: G           OE     5.11.0-rc2-custom #1
[  396.231537] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS WCD0401N_Weekly_20_04_0 04/01/2020
[  396.231563] RIP: 0010:drm_gem_object_free+0xc/0x22 [drm]
[  396.231606] Code: eb ec 48 89 c3 eb e7 0f 1f 44 00 00 55 48 89 e5 48 8b bf 00 06 00 00 e8 72 0d 01 00 5d c3 0f 1f 44 00 00 48 8b 87 40 01 00 00 <48> 8b 00 48 85 c0 74 0b 55 48 89 e5 e8 54 37 7c db 5d c3 0f 0b c3
[  396.231666] RSP: 0018:ffffb4704177fcf8 EFLAGS: 00010246
[  396.231686] RAX: 0000000000000000 RBX: ffff993a0d0cc400 RCX: 0000000000003113
[  396.231711] RDX: 0000000000000001 RSI: e9cda7a5d0791c6d RDI: ffff993a333a9058
[  396.231736] RBP: ffffb4704177fdd0 R08: ffff993a03855858 R09: 0000000000000000
[  396.231761] R10: ffff993a0d1f7158 R11: 0000000000000001 R12: 0000000000000000
[  396.231785] R13: ffff993a0d0cc428 R14: 0000000000003000 R15: ffffb4704177fde0
[  396.231811] FS:  00007f76b5730740(0000) GS:ffff993b275c0000(0000) knlGS:0000000000000000
[  396.231840] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  396.231860] CR2: 0000000000000000 CR3: 000000016d2e2000 CR4: 0000000000350ee0
[  396.231885] Call Trace:
[  396.231897]  ? amdgpu_amdkfd_gpuvm_free_memory_of_gpu+0x24c/0x25f [amdgpu]
[  396.232056]  ? __dynamic_dev_dbg+0xcd/0x100
[  396.232076]  kfd_ioctl_free_memory_of_gpu+0x91/0x102 [amdgpu]
[  396.232214]  kfd_ioctl+0x211/0x35b [amdgpu]
[  396.232341]  ? kfd_ioctl_get_queue_wave_state+0x52/0x52 [amdgpu]

Fixes: 246cb7e49a ("drm/amdgpu: Introduce GEM object functions")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Changfeng <changzhu@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 10:47:47 -05:00
Huang Rui
c5f85696cb drm/amdgpu: fix the issue that retry constantly once the buffer is oversize
We cannot modify initial_domain every time while the retry starts. That
will cause the busy waiting that unable to switch to GTT while the vram
is not enough.

Fixes: f8aab60422 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 10:47:47 -05:00
Ori Messinger
d26bbbcc16 amdgpu: Add Missing Sienna Cichlid DID
The purpose of this patch is to add a missing device ID for Sienna Cichlid.
The missing ID "0x73A1" is now added to the "amdgpu_drv.c" file.

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-01 11:51:46 -05:00
Luben Tuikov
a6a1f036c7 drm/scheduler: Job timeout handler returns status (v3)
This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
    timeout callback method.

v3: Return scheduler/device information, rather
    than task information.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/415095/
2021-01-29 11:30:22 +01:00
Lang Yu
cd63989e0e drm/amd/amdkfd: adjust dummy functions' placement
Move all the dummy functions in amdgpu_amdkfd.c to
amdgpu_amdkfd.h as inline functions.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-28 14:58:27 -05:00
Alex Deucher
33cf440d59 drm/amdgpu: disable gpu reset on Vangogh for now
Until the issues in the SMU firmware are fixed.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
2021-01-28 14:58:10 -05:00
Bjorn Helgaas
10e927249c ACPI: Test for ACPI_SUCCESS rather than !ACPI_FAILURE
The double negative makes it hard to read "if (!ACPI_FAILURE(status))".
Replace it with "if (ACPI_SUCCESS(status))".

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-27 18:43:07 +01:00
Colin Ian King
5993e79398 drm/amdgpu: Fix masking binary not operator on two mask operations
Currently the ! operator is incorrectly being used to flip bits on
mask values. Fix this by using the bit-wise ~ operator instead.

Addresses-Coverity: ("Logical vs. bitwise operator")
Fixes: 3c9a7b7d6e ("drm/amdgpu: update mmhub mgcg&ls for mmhub_v2_3")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25 17:47:05 -05:00
Jingwen Chen
64dcf2f01d drm/amd/amdgpu: add error handling to amdgpu_virt_read_pf2vf_data
[Why]
when vram lost happened in guest, try to write vram can lead to
kernel stuck.

[How]
When the readback data is invalid, don't do write work, directly
reschedule a new work.

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu<monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25 17:46:48 -05:00
Horace Chen
91fb309d82 drm/amdgpu: race issue when jobs on 2 ring timeout
Fix a racing issue when jobs on 2 rings timeout simultaneously.

If 2 rings timed out at the same time, the
amdgpu_device_gpu_recover will be reentered. Then the
adev->gmc.xgmi.head will be grabbed by 2 local linked list,
which may cause wild pointer issue in iterating.

lock the device earily to prevent the node be added to 2
different lists.

also increase karma for the skipped job since the job is also
timed out and should be guilty.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25 17:45:16 -05:00
Felix Kuehling
eda1068dc9 drm/amdgpu: Make contiguous pinning optional
Enable pinning of VRAM without forcing it to be contiguous. When memory is
already pinned, make sure it's contiguous if requested.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25 17:45:10 -05:00
Aaron Liu
39263a2f88 drm/amdgpu: update mmhub mgcg&ls for mmhub_v2_3
Starting from vangogh, the ATCL2 and DAGB0 registers relative
to mgcg/ls has changed.

For MGCG:
Replace mmMM_ATC_L2_MISC_CG with mmMM_ATC_L2_CGTT_CLK_CTRL.

For MGLS:
Replace mmMM_ATC_L2_MISC_CG with mmMM_ATC_L2_CGTT_CLK_CTRL.
Add DAGB0_(WR/RD)_CGTT_CLK_CTRL registers.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-21 10:46:05 -05:00
Jinzhou Su
8f0d60fe8b drm/amdgpu: modify GCR_GENERAL_CNTL for Vangogh
GCR_GENERAL_CNTL is defined differently in gc_10_1_0_offset.h and
gc_10_3_0_offset.h. Update GCR_GENERAL_CNTL for Vangogh.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-21 10:46:05 -05:00