Commit Graph

8299 Commits

Author SHA1 Message Date
Jonathan Kim
576e0ec26b drm/amdgpu: fix xgmi perfmon a-b-a problem
Mapping hw counters per event config will cause ABA problems so map per
event instead.

v2: Discontinue starting perf counters if add fails.  Make it clear what's
happening with pmc_start.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-07 14:44:16 -04:00
Christian König
4ce032d64c drm/ttm: nuke ttm_bo_evict_mm and rename mgr function v3
Make it more clear what the resource manager function
does and nuke the wrapper function.

v2: nuke the wrapper
v3: fix typo in radeon, rebased

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2)
Link: https://patchwork.freedesktop.org/patch/393914/
2020-10-07 13:53:08 +02:00
Huang Rui
894052d641 drm/amdgpu: add van gogh pci id
Add Van Gogh PCI id support.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:56 -04:00
Huang Rui
ac0dc4c5a0 drm/amdgpu: enable gfx clock gating and power gating for vangogh
This patch is to enable the gfx cg and pg for vangogh.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Huang Rui
3eb4c56422 drm/amdgpu: add gfx power gating for gfx10
This patch adds power gating handler for gfx10.

v2: simplify function

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Alex Deucher
682b1f4c03 drm/amdgpu/mmhub2.3: print client id string for mmhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Alex Deucher
8bb3aa1a83 drm/amdgpu: IP discovery table is not ready yet for VG
Fallback to legacy path for now.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Huang Rui
8447675327 drm/amdgpu: disable gfxoff on vangogh for the moment (v2)
GFXOFF will be enabled once it's verified on real asic.

v2: move check into gfx10 module.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Huang Rui
ed3b735332 drm/amdgpu: enable psp support for vangogh
This patch is to enable psp support for vangogh

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Huang Rui
5120cb5409 drm/amdgpu: add TOC firmware support for apu (v3)
APU needs load toc firmware for gfx10 series on psp front door loading.

v2: rebase against latest code
v3: clarify error message

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Huang Rui
c821e0fbb2 drm/amdgpu: add smu ip block for vangogh
This patch is to add ip block for vangogh.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Huang Rui
a7e91bd718 drm/amdgpu: add nbio v7.2 for vangogh (v2)
VanGogh uses nbio v7.2, and a couple of offsets are changed since nbio
v2.3 for navi series, so add new nbio v7.2 block.

v2: squash in fix for sdma and vcn instances

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Huang Rui
5de54343d5 drm/amdgpu: add pcie port indirect read and write on nv
This patch is to add pcie port indirect read/write callback for nv
series. They will be used for new asic.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Thong Thai
b4e532d678 drm/amdgpu: enable vcn3.0 for van gogh
Same as other VCN 3.0 asics.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Huang Rui
4d8d75a45c drm/amdgpu: add mmhub v2.3 for vangogh (v4)
There are too many register offset mismatch between mmhub v2.0 and v2.3.

E.X:

mmMM_ATC_L2_MISC_CG:  0x064a(v2.0)  0x06cd(v2.3)
mmMMVM_L2_PROTECTION_FAULT_CNTL: 0x0688(v2.0) 0x0708(v2.3)
mmMMVM_CONTEXT0_PAGE_TABLE_BASE_ADDR_LO32: 0x072b(v2.0) 0x0940(v2.3)
mmMMVM_CONTEXT0_PAGE_TABLE_BASE_ADDR_HI32: 0x072c(v2.0) 0x0941(v2.3)
mmMMVM_INVALIDATE_ENG0_REQ: 0x06e3(v2.0) 0x0a01(v2.3)
mmMMVM_INVALIDATE_ENG0_ACK: 0x06f5(v2.0) 0x0a02(v2.3)
mmMMVM_CONTEXT0_CNTL: 0x06c0(v2.0) 0x0740(v2.3)
mmMMVM_L2_PROTECTION_FAULT_STATUS: 0x068c(v2.0) 0x070c(v2.3)
mmMMVM_L2_PROTECTION_FAULT_CNTL: 0x0688(v2.0) 0x0708(v2.3)
mmMM_ATC_L2_MISC_CG: 0x064a(v2.0) 0x06cd(v2.3)
mmDAGB0_CNTL_MISC2: 0x0071(v2.0) 0x0096(v2.3)
...

Continuing using the same file mmhub v2.0 is not good choice, it will
introduce a lot of checking with ASIC types. And also easy to introduce the
issues that offset not align, this kind of issues are really hard to find. Van
Gogh's mmhub vm invalidation is actually caused by the offset mismatch as well.

So it would like to create a new file rather than stick to re-use orignal mmhub
v2.0 here.

v2: add missed translate_further programming.
v3: sync with latest code
v4: add missing callbacks

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Huang Rui
88edbad6ed drm/amdgpu: set ip blocks for van gogh
Enable ip blocks for van gogh.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Huang Rui
54c98eacf3 drm/amdgpu: add sdma support for van gogh
This patch adds the sdma v5.2 support for van gogh.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Alex Deucher
1ec743ac9f drm/amdgpu/gfx10: add updated register offsets for VGH
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:27 -04:00
Huang Rui
ad088550d2 drm/amdgpu: add gfx golden settings for vangogh (v3)
This patch is to add gfx golden settings for vangogh post si.

v2: squash in updates
v3: fix SPI register offset

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:26 -04:00
Huang Rui
6c266fb56c drm/amdgpu: add gfx support for van gogh (v3)
Add van gogh checks to gfx10 code.

v2: squash in fixes
v3: fix mode

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:51 -04:00
Huang Rui
b0ebc8e944 drm/amdgpu: set fw load type for van gogh
This patch sets fw load type as direct for van gogh for the moment.
Will switch to psp when psp is ready.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:03 -04:00
Huang Rui
6405e627a0 drm/amdgpu: add gmc v10 supports for van gogh (v4)
Add gfx memory controller support for van gogh.

v2: don't use dynamic invalidate eng allocation for van gogh.
v3: squash in other fixes
v4: rebase

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:03 -04:00
Huang Rui
15c90a1fbc drm/amdgpu: get the correct vram type for van gogh
This patch is to get the correct vram type from atombios for van gogh.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:03 -04:00
Huang Rui
bf13cb1f46 drm/amdgpu: use gpu virtual address for interrupt packet write space for vangogh
The interrupts are not stable while uses guest physical address (GPA)
for interrupt packet write space even on direct loading case.

v2: make condition more readable

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:03 -04:00
Huang Rui
bd4f28117e drm/amdgpu: add van gogh support for ih block
This patch adds the support for van gogh ih block.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:03 -04:00
Huang Rui
fced3c3a46 drm/amdgpu: skip sdma1 in nv_allowed_read_registers list for van gogh (v2)
Van gogh only has one sdma.

v2: use num_instances rather than APU flag

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:03 -04:00
Huang Rui
026570e633 drm/amdgpu: add nv common ip block support for van gogh
This patch adds common ip support for van gogh.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:02 -04:00
Huang Rui
1f9dab43c2 drm/amdgpu: add vangogh_reg_base_init function for van gogh
This patch adds vangogh_reg_base_init function to init the register base for
van gogh.

v2: make vangogh_reg_base_init void, align equality sign

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:02 -04:00
Huang Rui
4e52a9f8d5 drm/amdgpu: add van gogh support for gpu_info and ip block setting
This patch adds van gogh support for gpu_info firmware and ip block setting.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:02 -04:00
Huang Rui
4f1e9a76bd drm/amdgpu: add van gogh asic_type enum (v2)
This patch adds van gogh to amd_asic_type enum and amdgpu_asic_name[].

v2: add missing comma

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:14:02 -04:00
Alex Sierra
79b1eca0e4 drm/amdgpu: align frag_end to covered address space
align frag_end to the next pd when there are no
page table entries on the current pde.
This fixes invalidation of larger address space areas
where some page tables are allocated and other aren't.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:13:28 -04:00
Dirk Gouders
2ae7870804 drm/amdgpu: fix NULL pointer dereference for Renoir
Commit c1cf79ca5c ("drm/amdgpu: use IP discovery table for renoir")
introduced a NULL pointer dereference when booting with
amdgpu.discovery=0, because it removed the call of vega10_reg_base_init()
for that case.

Fix this by calling that funcion if amdgpu_discovery == 0 in addition to
the case that amdgpu_discovery_reg_base_init() failed.

Fixes: c1cf79ca5c ("drm/amdgpu: use IP discovery table for renoir")
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:12:46 -04:00
Hawking Zhang
346dbbb8f7 drm/amdgpu: enable GDDR6 save-restore support for navy_flounder
add mp0 11_0_11 for navy_flounder to the mem training
supported list, otherwise the modeprobe would fail
on navy_flounder with latest vbios.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01 10:43:02 -04:00
Hawking Zhang
f7ee1874b0 drm/amdgpu: support indirect access reg outside of mmio bar (v2)
support both direct and indirect accessor in unified
helper functions.

v2: Retire indirect mmio access via mm_index/data

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01 10:42:55 -04:00
Hawking Zhang
705a2b5ba0 drm/amdgpu: switch to indirect reg access helper
Switch WREG32/RREG32_PCIE to use indirect reg access
helper for soc15 and onwards

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01 10:42:48 -04:00
Hawking Zhang
1bba36834c drm/amdgpu: add helper function for indirect reg access (v3)
Add helper function in order to remove RREG32/WREG32
in current pcie_rreg/wreg function for soc15 and
onwards adapters.
PCIE_INDEX/DATA pairs are used to access regsiters
outside of mmio bar in the helper functions.
The new helper functions help remove the recursion
of amdgpu_mm_rreg/wreg from pcie_rreg/wreg and
provide the oppotunity to centralize direct and
indirect access in a single function.

v2: Fixed typo and refine the comments

v3: Remove unnecessary volatile local variable

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01 10:42:13 -04:00
Ramesh Errabolu
43a4bc828c drm/amd/amdgpu: Define and implement a function that collects number of
waves that are in flight.

[Why]
Allow user to know how many compute units (CU) are in use at any given
moment.

[How]
Read registers of SQ that give number of waves that are in flight
of various queues. Use this information to determine number of CU's
in use.

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 15:26:27 -04:00
Jiansong Chen
39ad082459 drm/amdgpu: disable gfxoff temporarily for navy_flounder
gfxoff is temporarily disabled for navy_flounder, since
at present the feature caused some tdr when performing
display operations.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:53:21 -04:00
Guchun Chen
4a20300bc2 drm/amdgpu: drop duplicated ecc check for vega10 (v5)
The same ECC check has been executed in amdgpu_ras_init for vega10,
prior to gmc_v9_0_late_init.

v2: drop all atombios helper callings
v3: use bit operation
v4: correct inline comment, remove parity check statement
v5: squash in build fix

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:53:21 -04:00
Oak Zeng
8ffff9b449 drm/amdgpu: use function pointer for gfxhub functions
gfxhub functions are now called from function pointers,
instead of from asic-specific functions.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:50:13 -04:00
Ramesh Errabolu
825c91d090 drm/amd/amdgpu: Prepare implementation to support reporting of CU usage
[Why]
Allow user to know number of compute units (CU) that are in use at any
given moment.

[How]
Read registers of SQ that give number of waves that are in flight
of various queues. Use this information to determine number of CU's
in use.

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:50:06 -04:00
Ramesh Errabolu
b8810a142a drm/amd/amdgpu: Clean up header file of symbols that are defined to be static
[Why]
Header file exports functions get_gpu_clock_counter(), get_cu_info() and
select_se_sh() that are defined to be static

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:49:44 -04:00
Jiansong Chen
95433a1305 drm/amdgpu: disable gfxoff temporarily for navy_flounder
gfxoff is temporarily disabled for navy_flounder, since
at present the feature caused some tdr when performing
display operations.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 09:47:43 -04:00
Jean Delvare
a39d0d7bdf drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
A recent attempt to fix a ref count leak in
amdgpu_display_crtc_set_config() turned out to be doing too much and
"fixed" an intended decrease as if it were a leak. Undo that part to
restore the proper balance. This is the very nature of this function
to increase or decrease the power reference count depending on the
situation.

Consequences of this bug is that the power reference would
eventually get down to 0 while the display was still in use,
resulting in that display switching off unexpectedly.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: e008fa6fb4 ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config")
Cc: stable@vger.kernel.org
Cc: Navid Emamdoost <navid.emamdoost@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:09:22 -04:00
Jiansong Chen
0c7014154d drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc.
Remove gpu_info fw support for sienna_cichlid etc., since the
information can be retrieved from discovery binary.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:07:06 -04:00
Kent Russell
f94582e4bc drm/amdgpu: Use SKU instead of DID for FRU check v2
The VG20 DIDs 66a0, 66a1 and 66a4 are used for various SKUs that may or may
not have the FRU EEPROM on it. Parse the VBIOS to check for server SKU
variants (D131 or D134) until a more general solution can be determined.

v2: Remove string-based logic, correct the VBIOS string comment

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:14:09 -04:00
Shashank Sharma
72e71a82d6 drm/amdgpu: add new trace event for page table update
This patch adds a new trace event to track the PTE update
events. This specific event will provide information like:
- start and end of virtual memory mapping
- HW engine flags for the map
- physical address for mapping

This will be particularly useful for memory profiling tools
(like RMV) which are monitoring the page table update events.

V2: Added physical address lookup logic in trace point
V3: switch to use __dynamic_array
    added nptes int the TPprint arguments list
    added page size in the arg list
V4: Addressed Christian's review comments
    add start/end instead of seg
    use incr instead of page_sz to be accurate
V5: Addressed Christian's review comments:
    add pid and vm context information in the event
V6: Re-sequence the variables (put pid and ctx_id first)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:12:02 -04:00
Guchun Chen
125b1deb60 drm/amdgpu: fix incorrect comment
It should be one copy-paste typo.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:50 -04:00
Jean Delvare
3514521ccb drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
A recent attempt to fix a ref count leak in
amdgpu_display_crtc_set_config() turned out to be doing too much and
"fixed" an intended decrease as if it were a leak. Undo that part to
restore the proper balance. This is the very nature of this function
to increase or decrease the power reference count depending on the
situation.

Consequences of this bug is that the power reference would
eventually get down to 0 while the display was still in use,
resulting in that display switching off unexpectedly.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: e008fa6fb4 ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config")
Cc: stable@vger.kernel.org
Cc: Navid Emamdoost <navid.emamdoost@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:45 -04:00
Christian König
d3ef581afa drm/amdgpu: stop using TTMs fault callback
Implement the fault handler ourself using the provided TTM functions.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/392324/
2020-09-28 12:37:30 +02:00
Alex Deucher
a069a9eb73 drm/amdgpu: fix a warning in amdgpu_ras.c (v2)
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_fs_init’:
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1284:2: warning: ignoring return value of ‘sysfs_create_group’, declared with attribute warn_unused_result [-Wunused-result]
 1284 |  sysfs_create_group(&adev->dev->kobj, &group);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

v2: just print an error for sysfs group creation failure

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 17:03:22 -04:00
Guchun Chen
c3d4d45db2 drm/amdgpu: clean up ras sysfs creation (v2)
Merge ras sysfs creation together by calling sysfs_create_group
once, as sysfs_update_group may not work properly as expected.

v2: improve commit message

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 17:03:22 -04:00
Tiecheng Zhou
b602ca5f31 drm/amdgpu: stop data_exchange work thread before reset
In FLR routine, init_data_exchange is called at reset_sriov
while fini_data_exchange is not. This will duplicating work
thread.

So call fini_data_exchange before reset for SRIOV

Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 17:03:22 -04:00
Bokun Zhang
519b8b76f0 drm/amdgpu: Implement new guest side VF2PF message transaction (v2)
- Refactor the driver code to use amdgpu_virt_read_pf2vf_data
  and amdgpu_virt_write_vf2pf_data instead of writing all code in
  one function (which is the old amdgpu_virt_init_data_exchange)

- Adding a new transaction method for VF2PF message between host
  and guest driver. Guest side will periodically update VF2PF
  message in the framebuffer.

  In the new header, we include guest ucode information, guest
  framebuffer usage, and engine usage

- Clean up the old macros since they will cause compile error if
  the new transaction method is used

v2: squash in build fix

Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 17:03:22 -04:00
Bokun Zhang
1721bc1b2a drm/amdgpu: Update VF2PF interface
- Update guest side VF2PF interface header file

Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:55:44 -04:00
John Clements
265c280a48 drm/amdgpu: disable sienna chichlid UMC RAS
disable UMC RAS in lieu of stability issues on certain sku

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:55:26 -04:00
Alex Deucher
d5cc02d97a drm/amdgpu: add an auto setting to the noretry parameter
This allows us to set different defaults on a per asic basis.  This
way we can enable noretry on dGPUs where it can increase performance
in certain cases and disable it on chips where it can be problematic.

For now the default is 0 for all asics, but we may want to try and
enable it again for newer dGPUs.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:55:21 -04:00
Alex Deucher
9b498efae2 drm/amdgpu: store noretry parameter per driver instance
This will allow us to have different defaults per asic
in a future patch.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:55:16 -04:00
Emily.Deng
884dcf3c87 drm/amdgpu: Remove some useless code
Signed-off-by: Emily.Deng <Emily.Deng@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:54:16 -04:00
Jingwen Chen
162b786f0f drm/amd: Skip not used microcode loading in SRIOV
smc, sdma, sos, ta and asd fw is not used in SRIOV. Skip them to
accelerate sw_init for navi12.

v2: skip above fw in SRIOV for vega10 and sienna_cichlid
v3: directly skip psp fw loading in SRIOV
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by:  Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:54:00 -04:00
Jiansong Chen
84d244a364 drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc.
Remove gpu_info fw support for sienna_cichlid etc., since the
information can be retrieved from discovery binary.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:53:17 -04:00
Thomas Zimmermann
246cb7e49a drm/amdgpu: Introduce GEM object functions
GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in amdgpu. The only exception is gem_prime_mmap,
which is non-trivial to convert.

v3:
	* remove amdgpu_object.c from patch (Christian)
v2:
	* move object-function instance to amdgpu_gem.c (Christian)
	* set callbacks in amdgpu_gem_object_create() (Christian)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200923102159.24084-2-tzimmermann@suse.de
2020-09-25 09:19:42 +02:00
Dave Airlie
3a08446b31 drm/amdgpu/ttm: handle tt moves properly.
The core move code currently handles use_tt moves, for amdgpu
this was being handled also in the driver, but not using the same
paths.

If moving between TT/SYSTEM (all the use_tt paths on amdgpu) use
the core move function.

Eventually the core will be flipped over to calling the driver.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924051845.397177-4-airlied@gmail.com
2020-09-25 05:48:00 +10:00
Christian König
4671078eb8 drm/amdgpu: switch over to the new pin interface
Stop using TTM_PL_FLAG_NO_EVICT.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/391617/?series=81973&rev=1
2020-09-24 16:16:50 +02:00
Dave Airlie
6ea6be7708 drm-misc-next for 5.10:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - dev: More devm_drm convertions and removal of drm_dev_init
 
 Driver Changes:
   - i915: selftests improvements
   - panfrost: support for Amlogic SoC
   - vc4: one fix
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX2jGxQAKCRDj7w1vZxhR
 xR3DAQCiZOnaxVcY49iG4343Z1aHHaIEShbnB0bDdaWstn7kiQD/UXBXUoOSFoFQ
 FkTsW31JsdXNnWP5e6/eJd2Lb6waVAA=
 =VlsU
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-09-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.10:

UAPI Changes:

Cross-subsystem Changes:
  - virtio: Merged a PR for patches that will affect drm/virtio

Core Changes:
  - dev: More devm_drm convertions and removal of drm_dev_init
  - atomic: Split out drm_atomic_helper_calc_timestamping_constants of
    drm_atomic_helper_update_legacy_modeset_state
  - ttm: More rework

Driver Changes:
  - i915: selftests improvements
  - panfrost: support for Amlogic SoC
  - vc4: one fix
  - tree-wide: conversions to devm_drm_dev_alloc,
  - ast: simplifications of the atomic modesetting code
  - panfrost: multiple fixes
  - vc4: multiple fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921152956.2gxnsdgxmwhvjyut@gilmour.lan
2020-09-23 09:52:24 +10:00
Bernard Zhao
f349f772b0 drm/amd: fix typoes in comments
Change the comment typo: "programm" -> "program".

Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 17:37:38 -04:00
Stanley.Yang
78f0aef11f drm/amdgpu: fix hdp register access error
mmHDP_READ_CACHE_INVALIDATE register is in HDP not in NBIO

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 17:37:38 -04:00
Stanley.Yang
3f975d0f71 drm/amdgpu: update athub interrupt harvesting handle
GCEA/MMHUB EA error should not result to DF freeze, this is
fixed in next generation, but for some reasons the GCEA/MMHUB
EA error will result to DF freeze in previous generation,
diver should avoid to indicate GCEA/MMHUB EA error as hw fatal
error in kernel message by read GCEA/MMHUB err status registers.

Changed from V1:
    make query_ras_error_status function more general
    make read mmhub er status register more friendly

Changed from V2:
    move ras error status query function into do_recovery workqueue

Changed from V3:
    remove useless code from V2, print GCEA error status
    instance number

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 17:37:38 -04:00
Liu Shixin
c24a3c0505 drm/amdgpu/gmc9: simplify the return expression of gmc_v9_0_suspend
Simplify the return expression.

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 17:37:37 -04:00
Qinglang Miao
da51e50d45 drm/amdgpu: simplify the return expression
Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 17:37:37 -04:00
Qinglang Miao
d94c8250c6 drm/amdgpu/mes: simplify the return expression of mes_v10_1_ring_init
Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 17:37:37 -04:00
Emily.Deng
36499e4c77 drm/amdgpu: Fix dead lock issue for vblank
Always start vblank timer, but only calls vblank function
when vblank is enabled.

This is used to fix the dead lock issue.
When drm_crtc_vblank_off want to disable vblank,
it first get event_lock, and then call hrtimer_cancel,
but hrtimer_cancel want to wait timer handler function finished.
Timer handler also want to aquire event_lock in drm_handle_vblank.

Signed-off-by: Emily.Deng <Emily.Deng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 12:25:15 -04:00
Felix Kuehling
c7651b7358 drm/amdgpu: Fix handling of KFD initialization failures
Remember KFD module initializaton status in a global variable. Skip KFD
device probing when the module was not initialized. Other amdgpu_amdkfd
calls are then protected by the adev->kfd.dev check.

Also print a clear error message when KFD disables itself. Amdgpu
continues its initialization even when KFD failed.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22 12:24:11 -04:00
Luben Tuikov
df2ce4596c drm/amdgpu: Convert to using devm_drm_dev_alloc() (v2)
Convert to using devm_drm_dev_alloc(),
as drm_dev_init() is going away.

v2: Remove drm_dev_put() since
    a) devres doesn't do refcounting, see
    Documentation/driver-api/driver-model/devres.rst,
    Section 4, paragraph 1; and since
    b) devres acts as garbage collector when
    the DRM device's parent's devres "action" callback
    is called to free the container device (amdgpu_device),
    which embeds the DRM dev.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918132505.2316382-4-daniel.vetter@ffwll.ch
2020-09-21 10:44:46 +02:00
Alex Deucher
b4ebd0827f drm/amdgpu: remove experimental flag from navi12
Navi12 has worked fine for a while now.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 23:06:58 -04:00
Likun Gao
fc08ce66c0 drm/amdgpu: add device ID for sienna_cichlid (v2)
Add device ID for sienna_cichlid.

v2: squash in additional device ids.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 23:06:30 -04:00
Alex Deucher
8a410da6aa drm/amdgpu: use the AV1 defines for VCN 3.0
Switch from magic numbers to defines for AV1 clockgating.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 23:06:10 -04:00
Philip Yang
1d0e16ac1a drm/amdgpu: prevent double kfree ttm->sg
Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace:

[  420.932812] kernel BUG at
/build/linux-do9eLF/linux-4.15.0/mm/slub.c:295!
[  420.934182] invalid opcode: 0000 [#1] SMP NOPTI
[  420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE
[  420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS
1.5.4 07/09/2020
[  420.952887] RIP: 0010:__slab_free+0x180/0x2d0
[  420.954419] RSP: 0018:ffffbe426291fa60 EFLAGS: 00010246
[  420.955963] RAX: ffff9e29263e9c30 RBX: ffff9e29263e9c30 RCX:
000000018100004b
[  420.957512] RDX: ffff9e29263e9c30 RSI: fffff3d33e98fa40 RDI:
ffff9e297e407a80
[  420.959055] RBP: ffffbe426291fb00 R08: 0000000000000001 R09:
ffffffffc0d39ade
[  420.960587] R10: ffffbe426291fb20 R11: ffff9e49ffdd4000 R12:
ffff9e297e407a80
[  420.962105] R13: fffff3d33e98fa40 R14: ffff9e29263e9c30 R15:
ffff9e2954464fd8
[  420.963611] FS:  00007fa2ea097780(0000) GS:ffff9e297e840000(0000)
knlGS:0000000000000000
[  420.965144] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  420.966663] CR2: 00007f16bfffefb8 CR3: 0000001ff0c62000 CR4:
0000000000340ee0
[  420.968193] Call Trace:
[  420.969703]  ? __page_cache_release+0x3c/0x220
[  420.971294]  ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.972789]  kfree+0x168/0x180
[  420.974353]  ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu]
[  420.975850]  ? kfree+0x168/0x180
[  420.977403]  amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.978888]  ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm]
[  420.980357]  ttm_tt_destroy.part.11+0x4f/0x60 [amdttm]
[  420.981814]  ttm_tt_destroy+0x13/0x20 [amdttm]
[  420.983273]  ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm]
[  420.984725]  ttm_bo_release+0x1c9/0x360 [amdttm]
[  420.986167]  amdttm_bo_put+0x24/0x30 [amdttm]
[  420.987663]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[  420.989165]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10
[amdgpu]
[  420.990666]  kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 23:04:22 -04:00
Alex Deucher
d34c7b7b6b drm/amdgpu: remove experimental flag from navi12
Navi12 has worked fine for a while now.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 21:22:17 -04:00
Likun Gao
61278d14bb drm/amdgpu: add device ID for sienna_cichlid (v2)
Add device ID for sienna_cichlid.

v2: squash in additional device ids.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 18:01:59 -04:00
Alex Deucher
d9ed8cb5aa drm/amdgpu: use the AV1 defines for VCN 3.0
Switch from magic numbers to defines for AV1 clockgating.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 18:01:53 -04:00
Alex Deucher
4192f7b576 drm/amdgpu: unmap register bar on device init failure
We never unmapped the regiser BAR on failure.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 18:00:08 -04:00
Philip Yang
c8e74b17c1 drm/amdgpu: prevent double kfree ttm->sg
Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace:

[  420.932812] kernel BUG at
/build/linux-do9eLF/linux-4.15.0/mm/slub.c:295!
[  420.934182] invalid opcode: 0000 [#1] SMP NOPTI
[  420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE
[  420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS
1.5.4 07/09/2020
[  420.952887] RIP: 0010:__slab_free+0x180/0x2d0
[  420.954419] RSP: 0018:ffffbe426291fa60 EFLAGS: 00010246
[  420.955963] RAX: ffff9e29263e9c30 RBX: ffff9e29263e9c30 RCX:
000000018100004b
[  420.957512] RDX: ffff9e29263e9c30 RSI: fffff3d33e98fa40 RDI:
ffff9e297e407a80
[  420.959055] RBP: ffffbe426291fb00 R08: 0000000000000001 R09:
ffffffffc0d39ade
[  420.960587] R10: ffffbe426291fb20 R11: ffff9e49ffdd4000 R12:
ffff9e297e407a80
[  420.962105] R13: fffff3d33e98fa40 R14: ffff9e29263e9c30 R15:
ffff9e2954464fd8
[  420.963611] FS:  00007fa2ea097780(0000) GS:ffff9e297e840000(0000)
knlGS:0000000000000000
[  420.965144] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  420.966663] CR2: 00007f16bfffefb8 CR3: 0000001ff0c62000 CR4:
0000000000340ee0
[  420.968193] Call Trace:
[  420.969703]  ? __page_cache_release+0x3c/0x220
[  420.971294]  ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.972789]  kfree+0x168/0x180
[  420.974353]  ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu]
[  420.975850]  ? kfree+0x168/0x180
[  420.977403]  amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.978888]  ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm]
[  420.980357]  ttm_tt_destroy.part.11+0x4f/0x60 [amdttm]
[  420.981814]  ttm_tt_destroy+0x13/0x20 [amdttm]
[  420.983273]  ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm]
[  420.984725]  ttm_bo_release+0x1c9/0x360 [amdttm]
[  420.986167]  amdttm_bo_put+0x24/0x30 [amdttm]
[  420.987663]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[  420.989165]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10
[amdgpu]
[  420.990666]  kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 17:56:38 -04:00
Luben Tuikov
5aea5327ea drm/amdgpu: No sysfs, not an error condition
Not being able to create amdgpu sysfs attributes
is not a fatal error warranting not to continue
to try to bring up the display. Thus, if we get
an error trying to create amdgpu sysfs attrs,
report it and continue on to try to bring up
a display.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Slava Abramov <slava.abramov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 17:56:31 -04:00
Jiansong Chen
24b763d0fb drm/amdgpu: declare ta firmware for navy_flounder
The firmware provided via MODULE_FIRMWARE appears in the
module information. External tools(eg. dracut) may use the
list of fw files to include them as appropriate in an initramfs,
thus missing declaration will lead to request firmware failure
in boot time.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 17:56:17 -04:00
Shirish S
0eaa801242 amdgpu/gmc_v9: Warn if SDPIF_MMIO_CNTRL_0 is not set
With IOMMU enabled, if SDPIF_MMIO_CNTRL_0 is not set
appropriately the system hangs without any trace
during S3.

To ease debug and to ensure that the failure, if any,
was caused by a race conditions that disabled write access to
SDPIF_MMIO_CNTRL_0 register, warn the user about it.

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 17:49:04 -04:00
Dave Airlie
e46f468fef drm/ttm: drop special pipeline accel cleanup function.
The two accel cleanup paths were mostly the same once refactored.

Just pass a bool to say if the evictions are to be pipelined.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917064132.148521-2-airlied@gmail.com
2020-09-18 06:23:06 +10:00
Dave Airlie
cae515f4a5 drm/ttm/drivers: call the bind function directly.
Now the bind functions have all the protection explicitly the
drivers can just call them directly, and the api can be unexported

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-5-airlied@gmail.com
2020-09-18 06:16:03 +10:00
Dave Airlie
37bff6542c drm/ttm: move unbind into the tt destroy.
This moves unbind into the driver side on destroy paths.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-4-airlied@gmail.com
2020-09-18 06:15:24 +10:00
Dave Airlie
7626168fd1 drm/ttm: flip tt destroy ordering.
Call the driver first and have it call the common code cleanup.

This is useful later to fix unbind.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-3-airlied@gmail.com
2020-09-18 06:14:41 +10:00
Dave Airlie
0b988ca1c7 drm/ttm: protect against reentrant bind in the drivers
This moves the generic tracking into the drivers and protects
against reentrancy in the drivers. It fixes up radeon and agp
to be able to query the bound status as that is required.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-2-airlied@gmail.com
2020-09-18 06:14:00 +10:00
Fenghua Yu
c7b6bac9c7 drm, iommu: Change type of pasid to u32
PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".

No PASID type change in uapi although it defines PASID as __u64 in
some places.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
2020-09-17 19:21:16 +02:00
Jiansong Chen
e60c27f1ff drm/amdgpu: declare ta firmware for navy_flounder
The firmware provided via MODULE_FIRMWARE appears in the
module information. External tools(eg. dracut) may use the
list of fw files to include them as appropriate in an initramfs,
thus missing declaration will lead to request firmware failure
in boot time.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-17 00:12:36 -04:00
Dave Airlie
9e9a153bdf drm/ttm: move ttm binding/unbinding out of ttm_tt paths.
Move these up to the bo level, moving ttm_tt to just being
backing store. Next step is to move the bound flag out.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-6-airlied@gmail.com
2020-09-16 09:35:30 +10:00
Dave Airlie
2040ec970e drm/ttm: split populate out from binding.
Drivers have to call populate themselves now before binding.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-5-airlied@gmail.com
2020-09-16 09:34:54 +10:00
Dave Airlie
7eec915138 drm/ttm/tt: add wrappers to set tt state.
This adds 2 getters and 4 setters, however unbound and populated
are currently the same thing, this will change, it also drops
a BUG_ON that seems not that useful.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-2-airlied@gmail.com
2020-09-16 09:33:24 +10:00
Andrey Grodzovsky
5367eb6d8a drm/amdgpu: Include sienna_cichlid in USBC PD FW support.
Create sysfs interface also for sienna_cichlid.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 18:25:40 -04:00
Alex Deucher
f4075be882 drm/amdgpu/gmc9: remove mmhub client duplicated case
Copy paste typo.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Alex Deucher
ea68573d40 drm/amdgpu: Fail to load on RAVEN if SME is active
Due to hardware bugs, scatter/gather display on raven requires
a 1:1 IOMMU mapping, however, SME (System Memory Encryption)
requires an indirect IOMMU mapping because the encryption bit
is beyond the DMA mask of the chip.  As such, the two are
incompatible.

Acked-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
724dc53b92 drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v4_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1003:4-9: WARNING: Comparison to bool
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1083:5-11: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
8f00d1fc9d drm/amd/amdgpu: fix comparison pointer to bool warning in amdgpu_atpx_handler.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c:619:15-49: WARNING: Comparison to bool
drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c:629:15-49: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
3d0c75afdc drm/amd/amdgpu: fix comparison pointer to bool warning in uvd_v6_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c:1243:14-25: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
e66cdf250e drm/amd/amdgpu: fix comparison pointer to bool warning in si.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/si.c:1342:5-10: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
4bbbe77c15 drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v5_2.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:562:5-11: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
960a06ff91 drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v5_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c:619:5-11: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
89cf8b0637 drm/amd/amdgpu: fix comparison pointer to bool warning in gfx_v10_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:3563:5-31: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
7b3fa67d6e drm/amd/amdgpu: fix comparison pointer to bool warning in gfx_v9_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2805:5-11: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Jonathan Kim
7c679ef667 drm/amdgpu: stop resetting xgmi perfmons on disable
Disabling perf events does not specify reset in ABI so stop doing it in
hardware.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Oak Zeng
719a6513fb drm/amdgpu: More accurate description of a function param
Add more accurate description of the pe parameter of function
amdgpu_vm_sdma_udpate and amdgpu_vm_cpu_update

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Oak Zeng
91b5900507 drm/amdgpu: Add comment to function amdgpu_ttm_alloc_gart
Add comments to refect what function does

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Andrey Grodzovsky
ce87c98db4 drm/amdgpu: Include sienna_cichlid in USBC PD FW support.
Create sysfs interface also for sienna_cichlid.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Mukul Joshi
62f6b1162e drm/amdgpu: Enable SDMA utilization for Arcturus
SDMA utilization calculations are enabled/disabled by
writing to SDMAx_PUB_DUMMY_REG2 register. Currently,
enable this only for Arcturus.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Aurabindo Pillai
5d1c59c479 drm/amdgpu: Move existing pflip fields into separate struct
[Why&How]
To refactor DM IRQ management, all fields used by IRQ is best moved
to a separate struct so that main amdgpu_crtc struct need not be changed
Location of the new struct shall be in DM

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
John Clements
9c7e2ceb1d drm/amdgpu: Update RAS init handling
Output RAS init status

If RAS init fails, teardown RAS context

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Changfeng
f399d4de2d drm/amdgpu: add ta DTM/HDCP print in amdgpu_firmware_info for apu
It needs to add ta DTM/HDCP print to get HDCP/DTM version info when cat
amdgpu_firmware_info

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Andrey Grodzovsky
7cbbc745dc drm/amdgpu: Minor checkpatch fix
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:29 -04:00
Andrey Grodzovsky
6894305c97 drm/amdgpu: Disable DPC for XGMI for now.
XGMI support is more complicated than single device support as
questions of synchronization between the device recovering from
PCI error and other members of the hive are required.
Leaving this for next round.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:22 -04:00
Andrey Grodzovsky
7ac71382e9 drm/amdgpu: Trim amdgpu_pci_slot_reset by reusing code.
Reuse exsisting functions from GPU recovery to avoid code
duplications.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:16 -04:00
Andrey Grodzovsky
c1dd4aa624 drm/amdgpu: Fix consecutive DPC recovery failures.
Cache the PCI state on boot and before each case where we might
loose it.

v2: Add pci_restore_state while caching the PCI state to avoid
breaking PCI core logic for stuff like suspend/resume.

v3: Extract pci_restore_state from amdgpu_device_cache_pci_state
to avoid superflous restores during GPU resets and suspend/resumes.

v4: Style fixes.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:04 -04:00
Andrey Grodzovsky
362c7b91c1 drm/amdgpu: Fix SMU error failure
Wait for HW/PSP initiated ASIC reset to complete before
starting the recovery operations.

v2: Remove typo

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:55 -04:00
Andrey Grodzovsky
acd89fca67 drm/amdgpu: Block all job scheduling activity during DPC recovery
DPC recovery involves ASIC reset just as normal GPU recovery so block
SW GPU schedulers and wait on all concurrent GPU resets.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:48 -04:00
Andrey Grodzovsky
bf36b52e78 drm/amdgpu: Avoid accessing HW when suspending SW state
At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.

v2: Rename in_dpc to more meaningful name

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:39 -04:00
Andrey Grodzovsky
c9a6b82f45 drm/amdgpu: Implement DPC recovery
Add PCI Downstream Port Containment (DPC) with
basic recovery functionality

v2: remove pci_save_state to avoid breaking suspend/resume
v3: Fix style comments
v4: Improve description.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:32 -04:00
Liu ChengZhe
2a9787dcf5 drm/amdgpu: Do gpu recovery when no job is running
In function flr_work, we should do gpu recovery when no job
is running. Fix the logic by inverting it.

v2: modify the description

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:18 -04:00
Christian König
9c3006a4cc drm/ttm: remove available_caching
Instead of letting TTM make an educated guess based on
some mask all drivers should just specify what caching
they want for their CPU mappings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/390207/
2020-09-15 16:05:19 +02:00
Christian König
0fe438cec9 drm/ttm: remove default caching
As far as I can tell this was never used either and we just
always fallback to the order cached > wc > uncached anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/390142/
2020-09-15 16:03:44 +02:00
Maxime Ripard
00af6729b5
Merge drm/drm-next into drm-misc-next
Paul Cercueil needs some patches in -rc5 to apply new patches for ingenic
properly.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-09-14 18:11:40 +02:00
Christian König
48e07c23cb drm/ttm: nuke memory type flags
It's not supported to specify more than one of those flags.
So it never made sense to make this a flag in the first place.

Nuke the flags and specify directly which memory type to use.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389826/?series=81551&rev=1
2020-09-11 13:31:23 +02:00
Gerd Hoffmann
707d561f77 drm: allow limiting the scatter list size.
Add drm_device argument to drm_prime_pages_to_sg(), so we can
call dma_max_mapping_size() to figure the segment size limit
and call into __sg_alloc_table_from_pages() with the correct
limit.

This fixes virtio-gpu with sev.  Possibly it'll fix other bugs
too given that drm seems to totaly ignore segment size limits
so far ...

v2: place max_segment in drm driver not gem object.
v3: move max_segment next to the other gem fields.
v4: just use dma_max_mapping_size().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20200907112425.15610-2-kraxel@redhat.com
2020-09-09 07:58:56 +02:00
Dave Airlie
877d8c0743 Merge tag 'topic/nouveau-i915-dp-helpers-and-cleanup-2020-08-31-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
UAPI Changes:

None

Cross-subsystem Changes:

* Moves a bunch of miscellaneous DP code from the i915 driver into a set
  of shared DRM DP helpers

Core Changes:

* New DRM DP helpers (see above)

Driver Changes:

* Implements usage of the aforementioned DP helpers in the nouveau
  driver, along with some other various HPD related cleanup for nouveau

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/11e59ebdea7ee4f46803a21fe9b21443d2b9c401.camel@redhat.com
2020-09-09 12:27:13 +10:00
Dave Airlie
5d26eba988 drm/amdgpu/ttm: move to driver backend binding funcs
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907204630.1406528-9-airlied@gmail.com
2020-09-09 08:30:24 +10:00
Dave Airlie
ecfe6953fa drm/ttm: introduce ttm_bo_move_null
This pattern is cut-n-pasted across 4 drivers, switch it to
a WARN_ON instead, as BUG_ON is considered a bad idea usually.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907204630.1406528-2-airlied@gmail.com
2020-09-09 08:28:53 +10:00
Christian König
54d04ea8cd drm/ttm: merge offset and base in ttm_bus_placement
This is used by TTM to communicate the physical address
which should be used with ioremap(), ioremap_wc(). We don't
need to separate the base and offset in any way here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389457/
2020-09-08 10:43:30 +02:00
Dave Airlie
0c8d22fcae Merge tag 'amd-drm-next-5.10-2020-09-03' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.10-2020-09-03:

amdgpu:
- RAS fixes
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support in DC
- Enable plane rotation
- Rework pre-OS vram reservation handling during driver init
- Add standard interface to dump GPU metrics table from SMU
- Rework tiling and tmz state handling in atomic commits
- Pstate fixes
- Add voltage and power hwmon interfaces for renoir
- SW CTF fixes
- S/G display fix for Raven
- Print client strings for vmfaults for vega and newer
- Manual fan control fixes
- Display updates
- Reorg power management directory structure
- Misc bug fixes
- Misc code cleanups

amdkfd:
- Topology fixes
- Add SMI events for thermal throttling and GPU resets

radeon:
- switch from pci_* to dma_* for dma allocations
- PLL fix

Scheduler:
- Clean up priority levels

UAPI:
- amdgpu INFO IOCTL query update for TMZ state
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049
- amdkfd SMI event interface updates
  https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/therm_thrott

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200903222921.4152-1-alexander.deucher@amd.com
2020-09-08 16:40:13 +10:00
Dave Airlie
ce5c207c6b Linux 5.9-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9VerweHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGhc4H/iHD6qLdB36gZB6K
 oc2nJyrqyWitv4ti2Mnt5PA7o4wX4l6nnr1QvoaJ4BRs5Ja1czRvb2XDmdzqAoIA
 xITGoafqaAeDfxQ91bWrJsVN0pCRKiGwddXlU7TWmqw/riAkfOqi6GYKvav4biJH
 +n1mUPQb1M2IbRFsqkAS+ebKHq3CWaRvzKOEneS88nGlL5u31S9NAru8Ru/fkxRn
 6CwGcs1XRaBPYaZAhdfIb0NuatUlpkhPC9yhNS9up6SqrWmK3m65vmFVng6H0eCF
 fwn1jVztboY/XcNAi5sM9ExpQCql6WLQEEktVikqRDojC8fVtSx6W55tPt7qeaoO
 Z6t4/DA=
 =bcA4
 -----END PGP SIGNATURE-----

Merge tag 'v5.9-rc4' into drm-next

Backmerge 5.9-rc4 as there is a nasty qxl conflict
that needs to be resolved.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08 14:41:40 +10:00
Dave Airlie
0a667b5007 drm/ttm: remove bdev from ttm_tt
I want to split this structure up and use it differently,
step one remove bdev pointer from it and pass it explicitly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826014428.828392-4-airlied@gmail.com
2020-09-08 06:39:21 +10:00
Alex Deucher
11bc98bd71 drm/amdgpu/mmhub2.0: print client id string for mmhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:34 -04:00
Alex Deucher
02f23f5f7c drm/amdgpu/gmc9: print client id string for mmhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:29 -04:00
Alex Deucher
93fabd84c9 drm/amdgpu/gmc10: print client id string for gfxhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:27 -04:00
Alex Deucher
be99ecbfff drm/amdgpu/gmc9: print client id string for gfxhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:23 -04:00
Ye Bin
2d37949dc3 drm/amdgpu/gfx10: Delete some duplicated argument to '|'
1. gfx_v10_0_soft_reset GRBM_STATUS__SPI_BUSY_MASK
2. gfx_v10_0_update_gfx_clock_gating AMD_CG_SUPPORT_GFX_CGLS

Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:18 -04:00
Changfeng
6627d1c1a8 drm/amdgpu: add ta firmware load in psp_v12_0 for renoir
It needs to load renoir_ta firmware because hdcp is enabled by default
for renoir now. This can avoid error:DTM TA is not initialized

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:11 -04:00
Christian König
ee354ff1c7 drm/amdgpu: fix max_entries calculation v4
Calculate the correct value for max_entries or we might run after the
page_address array.

v2: Xinhui pointed out we don't need the shift
v3: use local copy of start and simplify some calculation
v4: fix the case that we map less VA range than BO size

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 1e691e2444 drm/amdgpu: stop allocating dummy GTT nodes
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:04 -04:00
Luben Tuikov
1625951a3a drm/amdgpu: Remove superfluous NULL check
The DRM device is a static member of
the amdgpu device structure and as such
always exists, so long as the PCI and
thus the amdgpu device exist.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:47:55 -04:00
Alex Sierra
abb6fccbb4 drm/amdgpu: enable ih1 ih2 for Arcturus only
Enable multi-ring ih1 and ih2 for Arcturus only.
For Navi10 family multi-ring has been disabled.
Apparently, having multi-ring enabled in Navi was causing
continus page fault interrupts.
Further investigation is needed to get to the root cause.
Related issue link:
https://gitlab.freedesktop.org/drm/amd/-/issues/1279

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:47:48 -04:00
xinhui pan
3d7248d7ce drm/amdgpu: Fix a redundant kfree
drm_dev_alloc() alloc *dev* and set managed.final_kfree to dev to free
itself.
Now from commit 5cdd68498918("drm/amdgpu: Embed drm_device into
amdgpu_device (v3)") we alloc *adev* and ddev is just a member of it.
So drm_dev_release try to free a wrong pointer then.

Also driver's release trys to free adev, but drm_dev_release will
access dev after call drvier's release.

To fix it, remove driver's release and set managed.final_kfree to adev.

[   36.269348] BUG: unable to handle page fault for address: ffffa0c279940028
[   36.276841] #PF: supervisor read access in kernel mode
[   36.282434] #PF: error_code(0x0000) - not-present page
[   36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 800ffff8066bf060
[   36.296868] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
[   36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G           O      5.9.0-rc2+ #46
[   36.310670] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019
[   36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm]
[   36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 be 00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 8b 7f 18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5
[   36.347217] RSP: 0018:ffffb9424141fce0 EFLAGS: 00010282
[   36.352931] RAX: 0000000000000006 RBX: ffffa0c279940010 RCX: 0000000000000006
[   36.360718] RDX: ffffffffc0419f5a RSI: 0000000000000200 RDI: ffffa0c279940010
[   36.368503] RBP: ffffb9424141fd10 R08: 0000000000000001 R09: 0000000000000001
[   36.376304] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0c279940010
[   36.384070] R13: ffffffffc0e2a000 R14: ffffa0c26924e220 R15: fffffffffffffff2
[   36.391845] FS:  00007fc4a277b740(0000) GS:ffffa0c288e00000(0000) knlGS:0000000000000000
[   36.400669] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.406937] CR2: ffffa0c279940028 CR3: 0000000792304006 CR4: 00000000003706e0
[   36.414732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   36.422550] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   36.430354] Call Trace:
[   36.433044]  drm_dev_put.part.0+0x40/0x60 [drm]
[   36.438017]  drm_dev_put+0x13/0x20 [drm]
[   36.442398]  amdgpu_pci_remove+0x56/0x60 [amdgpu]
[   36.447528]  pci_device_remove+0x3e/0xb0
[   36.451807]  device_release_driver_internal+0xff/0x1d0
[   36.457416]  device_release_driver+0x12/0x20
[   36.462094]  pci_stop_bus_device+0x70/0xa0
[   36.466588]  pci_stop_and_remove_bus_device_locked+0x1b/0x30
[   36.472786]  remove_store+0x7b/0x90
[   36.476614]  dev_attr_store+0x17/0x30
[   36.480646]  sysfs_kf_write+0x4b/0x60
[   36.484655]  kernfs_fop_write+0xe8/0x1d0
[   36.488952]  vfs_write+0xf5/0x230
[   36.492562]  ksys_write+0x70/0xf0
[   36.496206]  __x64_sys_write+0x1a/0x20
[   36.500292]  do_syscall_64+0x38/0x90
[   36.504219]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Acked-by: Alex Deucher <alexancer.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:47:33 -04:00
Dennis Li
81202807ae drm/amdgpu: block ring buffer access during GPU recovery
When GPU is in reset, its status isn't stable and ring buffer also need
be reset when resuming. Therefore driver should protect GPU recovery
thread from ring buffer accessed by other threads. Otherwise GPU will
randomly hang during recovery.

v2: correct indent

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:46:55 -04:00
Nirmoy Das
bc21585f3f drm/amdgpu: disable gpu-sched load balance for uvd
On hardware with multiple uvd instances, dependent uvd jobs
may get scheduled to different uvd instances. Because uvd_enc
jobs retain hw context, dependent jobs should always run on the
same uvd instance. This patch disables GPU scheduler's load balancer
for a context that binds jobs from the same context to a uvd
instance.

v2: Squash in uvd_enc fix

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:46:54 -04:00
Dave Airlie
05010c1e2f drm/amdgpu/ttm: remove unused parameter to move blit
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826014428.828392-2-airlied@gmail.com
2020-08-31 12:41:50 +10:00
Nirmoy Das
e230ac1118 drm/amdgpu: fix compiler warnings
Fixes below compiler warnings:
 CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_device.o
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
  381 | void static inline amdgpu_mm_wreg_mmio(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t acc_flags)
      | ^~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function ‘amdgpu_device_fini’:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3381:6: warning: variable ‘r’ set but not used [-Wunused-but-set-variable]
 3381 |  int r;
      |      ^

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-28 14:00:23 -04:00
Linus Torvalds
5ec06b5c0d drm fixes for 5.9-rc3
core:
 - Take modeset bkl for legacy drivers.
 
 dp_mst:
 - Allow null crtc in dp_mst.
 
 i915:
 - Fix command parser desc matching with masks
 
 amdgpu:
 - Misc display fixes
 - Backlight fixes
 - MPO fix for DCN1
 - Fixes for Sienna Cichlid
 - Fixes for Navy Flounder
 - Vega SW CTF fixes
 - SMU fix for Raven
 - Fix a possible overflow in INFO ioctl
 - Gfx10 clockgating fix
 
 msm:
 - opp/bw scaling patch followup
 - frequency restoring fux
 - vblank in atomic commit fix
 - dpu modesetting fixes
 - fencing fix
 
 etnaviv:
 - scheduler interaction fix
 - gpu init regression fix
 
 exynos:
 - Just drop __iommu annotation to fix sparse warning.
 
 omap:
 - locking state fix.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJfSGXpAAoJEAx081l5xIa+bs4QAJ6dQb0Llv9TyUBZL/FTTbc2
 iv8PDzA55aoBn7qI00mkUN9M9elfAV75dfLcn9h1fJg/EN3UZXkPCxFAcsscN+gm
 0TXTU9SmgqeDDK2RFxyvSGbuJbfuoOiXMbERdd+sYupUh7FowckDFVrOvbRcEe//
 PX3klUuQRSRxsgShrruJjaLdpqA+saZ12fhoE3eagsvlaFFAuVz8GY2zj533yaw9
 zjgDTkOWiO4Lw2/X10dmjEoxa5Nn26ECF4y+iFih7Uw2e+tu+ZVaL+/PcVPlJk5a
 NdlDAgU1gS8U9c0lSbrLNGGwBRXj1899FIRMS55uLwIBSmdZlOkh/wEOmnC+c1uK
 kbrUfYlU6dXLjzUOTd2d8GQx3F9OTFxOXoZjshMYlryf2RsVl4kImBlpdAC3TT7p
 B/Qk4xbU3uOuDzMgF76b1wT5XiFoHKcPrxNcf5L1tkqULdmYioB78hs4uQij5Bh0
 p6ynfR8rb19J4m+ctI84HqfG6ZbloBgGDZLzIe37lHsvG6xKG7/VjcclBV9BHBXs
 mgG0h+CeogY14a9JWeZ2dQFH8wT77QT5G0/ODQo1r9+OXM1i10DGCZyihrNJf2Nl
 //4uIDlgq695Lxd+h7FMuupWLpvAvnooDaUvnnNfVGsQVFsEhagl3R+TCGaFULHJ
 hi1vfCSZEBV+tCeiKSRU
 =ZmIT
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2020-08-28' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "As expected a bit of an rc3 uptick, amdgpu and msm are the main ones,
  one msm patch was from the merge window, but had dependencies and we
  dropped it until the other tree had landed. Otherwise it's a couple of
  fixes for core, and etnaviv, and single i915, exynos, omap fixes.

  I'm still tracking the Sandybridge gpu relocations issue, if we don't
  see much movement I might just queue up the reverts. I'll talk to
  Daniel next week once he's back from holidays.

  core:
   - Take modeset bkl for legacy drivers

  dp_mst:
   - Allow null crtc in dp_mst

  i915:
   - Fix command parser desc matching with masks

  amdgpu:
   - Misc display fixes
   - Backlight fixes
   - MPO fix for DCN1
   - Fixes for Sienna Cichlid
   - Fixes for Navy Flounder
   - Vega SW CTF fixes
   - SMU fix for Raven
   - Fix a possible overflow in INFO ioctl
   - Gfx10 clockgating fix

  msm:
   - opp/bw scaling patch followup
   - frequency restoring fux
   - vblank in atomic commit fix
   - dpu modesetting fixes
   - fencing fix

  etnaviv:
   - scheduler interaction fix
   - gpu init regression fix

  exynos:
   - Just drop __iommu annotation to fix sparse warning

  omap:
   - locking state fix"

* tag 'drm-fixes-2020-08-28' of git://anongit.freedesktop.org/drm/drm: (41 commits)
  drm/amd/display: Fix memleak in amdgpu_dm_mode_config_init
  drm/amdgpu: disable runtime pm for navy_flounder
  drm/amd/display: Retry AUX write when fail occurs
  drm/amdgpu: Fix buffer overflow in INFO ioctl
  drm/amd/powerplay: Fix hardmins not being sent to SMU for RV
  drm/amdgpu: use MODE1 reset for navy_flounder by default
  drm/amd/pm: correct the thermal alert temperature limit settings
  drm/amdgpu: add asd fw check before loading asd
  drm/amd/display: Keep current gain when ABM disable immediately
  drm/amd/display: Fix passive dongle mistaken as active dongle in EDID emulation
  drm/amd/display: Revert HDCP disable sequence change
  drm/amd/display: Send DISPLAY_OFF after power down on boot
  drm/amdgpu/gfx10: refine mgcg setting
  drm/amd/pm: correct Vega20 swctf limit setting
  drm/amd/pm: correct Vega12 swctf limit setting
  drm/amd/pm: correct Vega10 swctf limit setting
  drm/amd/pm: set VCN pg per instances
  drm/amd/pm: enable run_btc callback for sienna_cichlid
  drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in amdgpu_dm_update_backlight_caps
  drm/amd/display: Reject overlay plane configurations in multi-display scenarios
  ...
2020-08-28 09:46:48 -07:00
Dave Airlie
cbc2e82932 drm-misc-next for 5.10:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - ttm: various cleanups and reworks of the API
 
 Driver Changes:
   - ast: various cleanups
   - gma500: A few fixes, conversion to GPIOd API
   - hisilicon: Change of maintainer, various reworks
   - ingenic: Clock handling and formats support improvements
   - mcde: improvements to the DSI support
   - mgag200: Support G200 desktop cards
   - mxsfb: Support the i.MX7 and i.MX8M and the alpha plane
   - panfrost: support devfreq
   - ps8640: Retrieve the EDID from eDP control, misc improvements
   - tidss: Add a workaround for AM65xx YUV formats handling
   - virtio: a few cleanups, support for virtio-gpu exported resources
   - bridges: Support the chained bridges on more drivers,
     new bridges: Toshiba TC358762, Toshiba TC358775, Lontium LT9611
   - panels: Convert to dev_ based logging, read orientation from the DT,
     various fixes, new panels: Mantix MLAF057WE51-X, Chefree CH101OLHLWH-002,
     Powertip PH800480T013, KingDisplay KD116N21-30NV-A010
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX0fXGwAKCRDj7w1vZxhR
 xTmMAQDPmfSsBLLNnDxu4++zFrQ7OKmNSHCkVr4nAQ/yg3GVPQEAuRw6qPwPWuV3
 +jEPxaQSSmHOhx/jXfolV1tJaE/FHgA=
 =WYoO
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-08-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.10:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - ttm: various cleanups and reworks of the API

Driver Changes:
  - ast: various cleanups
  - gma500: A few fixes, conversion to GPIOd API
  - hisilicon: Change of maintainer, various reworks
  - ingenic: Clock handling and formats support improvements
  - mcde: improvements to the DSI support
  - mgag200: Support G200 desktop cards
  - mxsfb: Support the i.MX7 and i.MX8M and the alpha plane
  - panfrost: support devfreq
  - ps8640: Retrieve the EDID from eDP control, misc improvements
  - tidss: Add a workaround for AM65xx YUV formats handling
  - virtio: a few cleanups, support for virtio-gpu exported resources
  - bridges: Support the chained bridges on more drivers,
    new bridges: Toshiba TC358762, Toshiba TC358775, Lontium LT9611
  - panels: Convert to dev_ based logging, read orientation from the DT,
    various fixes, new panels: Mantix MLAF057WE51-X, Chefree CH101OLHLWH-002,
    Powertip PH800480T013, KingDisplay KD116N21-30NV-A010

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200827155517.do6emeacetpturli@gilmour.lan
2020-08-28 12:38:06 +10:00
Jiawei
4cd2a96d3a drm/amdgpu: simplify hw status clear/set logic
Optimize code to iterate less loops in
amdgpu_device_ip_reinit_early_sriov()

Signed-off-by: Jiawei <Jiawei.Gu@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-27 10:09:07 -04:00
Guchun Chen
0bbb5462d3 drm/amdgpu: correct SE number for arcturus gfx ras
When resetting EDC related register, all CUs needs to be visited,
otherwise, garbage data from EDC register of missed SEs would present.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Jiansong Chen
faeefe4e54 drm/amdgpu: disable runtime pm for navy_flounder
Disable runtime pm for navy_flounder temporarily.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Alex Deucher
cf851f3ff8 drm/amdgpu: Fix buffer overflow in INFO ioctl
The values for "se_num" and "sh_num" come from the user in the ioctl.
They can be in the 0-255 range but if they're more than
AMDGPU_GFX_MAX_SE (4) or AMDGPU_GFX_MAX_SH_PER_SE (2) then it results in
an out of bounds read.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Alex Deucher
c997e8e26c drm/amdgpu: report DC not supported if virtual display is enabled (v2)
Virtual display is non-atomic so report false to avoid checking
atomic state and other atomic things at runtime.

v2: squash into the sr-iov check

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Jiansong Chen
22dd44f47c drm/amdgpu: use MODE1 reset for navy_flounder by default
Switch default gpu reset method to MODE1 for navy_flounder.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Stanley.Yang
5436ab94cd drm/amdkfd: fix set kfd node ras properties value
The ctx->features are new RAS implementation which
is only available for Vega20 and onwards, it is not
available for vega10, vega10 should follow legacy
ECC implementation.

Changed from V1:
    wrap function to initialize kfd node properties

Changed from V2:
    remove wrap function and SDMA SRAM ECC check

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Tao Zhou
c56c90f413 drm/amdgpu: add asd fw check before loading asd
asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Alex Deucher
4d2997ab21 drm/amdgpu: add a wrapper for atom asic_init
This allows us to add asic specific workarounds for atom
asic init while keeping the adev specifics out of the
atombios parser code.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Alex Deucher
a71737313e drm/amdgpu: add pre_asic_init callback for navi
Nothing to do for this family.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Alex Deucher
b0a2db9b48 drm/amdgpu: add pre_asic_init callback for SOC15
We need to restore some registers prior to running asic
init to work around a firmware bug.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Alex Deucher
cff6c7f91a drm/amdgpu: add pre_asic_init callback for VI
Nothing to do for this family.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Alex Deucher
819515c7f3 drm/amdgpu: add pre_asic_init callback for CIK
Nothing to do for this family.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Alex Deucher
632d9f9492 drm/amdgpu: add pre_asic_init callback for SI
Nothing to do for this family.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Alex Deucher
9737a923c9 drm/amdgpu: add an asic callback for pre asic init
This callback can be used by asics that need to
do something special prior to calling atom asic init.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Alex Deucher
f8646661f7 drm/amdgpu: fix up DCHUBBUB_SDPIF_MMIO_CNTRL_0 handling
Properly define this register using a relative offset rather
than an absolute offset and use the proper SOC15 macros to
access it.  It's also DCN, not DCE, so remove it from the
DCE12 header.

No functional change.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Felix Kuehling
332f6e1e98 drm/amdkfd: call amdgpu_amdkfd_get_hive_id directly
No need to use a function pointer because the implementation is not
ASIC-specific.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Felix Kuehling
817154c1a2 drm/amdkfd: call amdgpu_amdkfd_get_unique_id directly
No need to use a function pointer because the implementation is not
ASIC-specific. This fixes missing support due to a missing function
pointer on Arcturus.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Youling Tang
a590a83d74 gpu: amd: Remove duplicate semicolons at the end of line
Remove duplicate semicolons at the end of line.

Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Jiansong Chen
d3bbba79eb drm/amdgpu/gfx10: refine mgcg setting
1. enable ENABLE_CGTS_LEGACY to fix specviewperf11 random hang.
2. remove obsolete RLC_CGTT_SCLK_OVERRIDE workaround.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:17 -04:00
Huang Rui
6127896f4a drm/amdkfd: implement the dGPU fallback path for apu (v6)
We still have a few iommu issues which need to address, so force raven
as "dgpu" path for the moment.

This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled
or ACPI CRAT table not correct.

v2: Use ignore_crat parameter to decide whether it will go with IOMMUv2.
v3: Align with existed thunk, don't change the way of raven, only renoir
    will use "dgpu" path by default.
v4: don't update global ignore_crat in the driver, and revise fallback
    function if CRAT is broken.
v5: refine acpi crat good but no iommu support case, and rename the
    title.
v6: fix the issue of dGPU initialized firstly, just modify the report
    value in the node_show().

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:17 -04:00
Luben Tuikov
8aba21b751 drm/amdgpu: Embed drm_device into amdgpu_device (v3)
a) Embed struct drm_device into struct amdgpu_device.
b) Modify the inline-f drm_to_adev() accordingly.
c) Modify the inline-f adev_to_drm() accordingly.
d) Eliminate the use of drm_device.dev_private,
   in amdgpu.
e) Switch from using drm_dev_alloc() to
   drm_dev_init().
f) Add a DRM driver release function, which frees
   the container amdgpu_device after all krefs on
   the contained drm_device have been released.

v2: Split out adding adev_to_drm() into its own
    patch (previous commit), making this patch
    more succinct and clear. More detailed commit
    description.
v3: squash in fix to call drmm_add_final_kfree()
    to avoid a warning.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:17 -04:00
Jiansong Chen
82dff839c9 drm/amdgpu: disable runtime pm for navy_flounder
Disable runtime pm for navy_flounder temporarily.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 15:45:52 -04:00
Alex Deucher
b5b97cab55 drm/amdgpu: Fix buffer overflow in INFO ioctl
The values for "se_num" and "sh_num" come from the user in the ioctl.
They can be in the 0-255 range but if they're more than
AMDGPU_GFX_MAX_SE (4) or AMDGPU_GFX_MAX_SH_PER_SE (2) then it results in
an out of bounds read.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-08-26 15:45:51 -04:00
Jiansong Chen
75947544c8 drm/amdgpu: use MODE1 reset for navy_flounder by default
Switch default gpu reset method to MODE1 for navy_flounder.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 15:45:51 -04:00
Tao Zhou
14e4f3bd81 drm/amdgpu: add asd fw check before loading asd
asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 15:27:52 -04:00
Jiansong Chen
de7a1b0b87 drm/amdgpu/gfx10: refine mgcg setting
1. enable ENABLE_CGTS_LEGACY to fix specviewperf11 random hang.
2. remove obsolete RLC_CGTT_SCLK_OVERRIDE workaround.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-08-26 15:17:16 -04:00
Luben Tuikov
4a580877bd drm/amdgpu: Get DRM dev from adev by inline-f
Add a static inline adev_to_drm() to obtain
the DRM device pointer from an amdgpu_device pointer.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Luben Tuikov
1348969ab6 drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.

v2: Use a typed visible static inline function
    instead of an invisible macro.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Prike.Liang
50166d1ce5 drm/amdgpu: enable HDP clock gatting
Enabe HDP SD/DS clock gatting in Renoir series.

Signed-off-by: Prike.Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:05 -04:00
Prike.Liang
d844812b28 drm/amdgpu: enable ATHUB clock gatting
Enable ATHUB clock gatting set in Renoir series.

Signed-off-by: Prike.Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:05 -04:00
Dennis Li
08ebb485f0 drm/amdgpu: annotate a false positive recursive locking
Re-apply commit 72e14ebf9f

[  584.110304] ============================================
[  584.110590] WARNING: possible recursive locking detected
[  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
[  584.111164] --------------------------------------------
[  584.111456] kworker/38:1/553 is trying to acquire lock:
[  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.112112]
               but task is already holding lock:
[  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.113068]
               other info that might help us debug this:
[  584.113689]  Possible unsafe locking scenario:

[  584.114350]        CPU0
[  584.114685]        ----
[  584.115014]   lock(&adev->reset_sem);
[  584.115349]   lock(&adev->reset_sem);
[  584.115678]
                *** DEADLOCK ***

[  584.116624]  May be due to missing lock nesting notation

[  584.117284] 4 locks held by kworker/38:1/553:
[  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630
[  584.117967]  #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
[  584.118358]  #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
[  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.119222]
               stack backtrace:
[  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
[  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[  584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  584.121638] Call Trace:
[  584.122050]  dump_stack+0x98/0xd5
[  584.122499]  __lock_acquire+0x1139/0x16e0
[  584.122931]  ? trace_hardirqs_on+0x3b/0xf0
[  584.123358]  ? cancel_delayed_work+0xa6/0xc0
[  584.123771]  lock_acquire+0xb8/0x1c0
[  584.124197]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.124599]  down_write+0x49/0x120
[  584.125032]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125472]  amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125910]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[  584.126367]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[  584.126789]  process_one_work+0x29e/0x630
[  584.127208]  worker_thread+0x3c/0x3f0
[  584.127621]  ? __kthread_parkme+0x61/0x90
[  584.128014]  kthread+0x12f/0x150
[  584.128402]  ? process_one_work+0x630/0x630
[  584.128790]  ? kthread_park+0x90/0x90
[  584.129174]  ret_from_fork+0x3a/0x50

Each adev has owned lock_class_key to avoid false positive
recursive locking.

v2:
1. register adev->lock_key into lockdep, otherwise lockdep will
report the below warning

[ 1216.705820] BUG: key ffff890183b647d0 has not been registered!
[ 1216.705924] ------------[ cut here ]------------
[ 1216.705972] DEBUG_LOCKS_WARN_ON(1)
[ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210

v3:
change to use down_write_nest_lock to annotate the false dead-lock
warning.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:05:52 -04:00
Dennis Li
d95e8e97e2 drm/amdgpu: refine create and release logic of hive info
Change to dynamically create and release hive info object,
which help driver support more hives in the future.

v2:
Change to save hive object pointer in adev, to avoid locking
xgmi_mutex every time when calling amdgpu_get_xgmi_hive.

v3:
1. Change type of hive object pointer in adev from void* to
amdgpu_hive_info*.
2. remove unnecessary variable initialization.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:24:14 -04:00
Dennis Li
aac891685d drm/amdgpu: refine message print for devices of hive
Using dev_xxx instead of DRM_xxx/pr_xxx to indicate which device
of a hive is the message for.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:24:06 -04:00
Dennis Li
cbfd17f7ba drm/amdgpu: fix the nullptr issue when reenter GPU recovery
in single gpu system, if driver reenter gpu recovery,
amdgpu_device_lock_adev will return false, but hive is
nullptr now.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:23:54 -04:00
Dennis Li
6049db43d6 drm/amdgpu: change reset lock from mutex to rw_semaphore
clients don't need reset-lock for synchronization when no
GPU recovery.

v2:
change to return the return value of down_read_killable.

v3:
if GPU recovery begin, VF ignore FLR notification.

Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:23:48 -04:00
Lukas Bulwahn
5049a05269 drm/amd/display: remove unintended executable mode
Besides the intended change, commit 4cc1178e16 ("drm/amdgpu: replace DRM
prefix with PCI device info for gfx/mmhub") also set the source files
mmhub_v1_0.c and gfx_v9_4.c to be executable, i.e., changed fromold mode
644 to new mode 755.

Commit 241b2ec931 ("drm/amd/display: Add dcn30 Headers (v2)") added the
four header files {dpcs,dcn}_3_0_0_{offset,sh_mask}.h as executable, i.e.,
mode 755.

Set to the usual modes for source and headers files and clean up those
mistakes. No functional change.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:23:02 -04:00
Dennis Li
53b3f8f40e drm/amdgpu: refine codes to avoid reentering GPU recovery
if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive->in_reset
and adev->in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:22:56 -04:00
Dave Airlie
ebb21aa188 drm/ttm: drop bus.size from bus placement.
This is always calculated the same, and only used in a couple of places.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200811074658.58309-2-airlied@gmail.com
2020-08-24 17:06:08 +10:00
Dave Airlie
098754fe3c drm/ttm: init mem->bus in common code.
The drivers all do the same thing here.

Reviewed-by: Christian König <christian.koenig@amd.com> for both.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200811074658.58309-1-airlied@gmail.com
2020-08-24 17:00:48 +10:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Dave Airlie
ba9086a6df Merge tag 'amd-drm-fixes-5.9-2020-08-20' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.9-2020-08-20:

amdgpu:
- Fixes for Navy Flounder
- Misc display fixes
- RAS fix

amdkfd:
- SDMA fix for renoir

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200820041938.3928-1-alexander.deucher@amd.com
2020-08-21 10:17:52 +10:00
Jiansong Chen
da2446b66b Revert "drm/amdgpu: disable gfxoff for navy_flounder"
This reverts commit 9c9b17a7d1.
Newly released sdma fw (51.52) provides a fix for the issue.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-19 23:55:04 -04:00
Dave Airlie
485d41b092 Merge tag 'amd-drm-fixes-5.9-2020-08-12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.9-2020-08-12:

amdgpu:
- Fix allocation size
- SR-IOV fixes
- Vega20 SMU feature state caching fix
- Fix custom pptable handling
- Arcturus golden settings update
- Several display fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200813033610.4008-1-alexander.deucher@amd.com
2020-08-19 13:56:28 +10:00
Leo Liu
cdab4211f6 drm/amdgpu/jpeg: remove redundant check when it returns
Fix warning from kernel test robot
v2: remove the local variable as well

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 18:22:16 -04:00
jqdeng
8e1d88f948 drm/amdgpu: Limit the error info print rate
Use function printk_ratelimit to limit the print rate.

Signed-off-by: jqdeng <Emily.Deng@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 18:22:10 -04:00
jqdeng
9a1cddd637 drm/amdgpu: Fix repeatly flr issue
Only for no job running test case need to do recover in
flr notification.
For having job in mirror list, then let guest driver to
hit job timeout, and then do recover.

Signed-off-by: jqdeng <Emily.Deng@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 18:22:02 -04:00
Jiansong Chen
332d790365 Revert "drm/amdgpu: disable gfxoff for navy_flounder"
This reverts commit ba4e049e63.
Newly released sdma fw (51.52) provides a fix for the issue.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 18:20:34 -04:00
Luben Tuikov
9af5e21dac drm/scheduler: Remove priority macro INVALID (v2)
Remove DRM_SCHED_PRIORITY_INVALID. We no longer
carry around an invalid priority and cut it off
at the source.

Backwards compatibility behaviour of AMDGPU CTX
IOCTL passing in garbage for context priority
from user space and then mapping that to
DRM_SCHED_PRIORITY_NORMAL is preserved.

v2: Revert "res"  --> "r" and
           "prio" --> "priority".

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 18:20:26 -04:00
Luben Tuikov
e2d732fdb7 drm/scheduler: Scheduler priority fixes (v2)
Remove DRM_SCHED_PRIORITY_LOW, as it was used
in only one place.

Rename and separate by a line
DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT
as it represents a (total) count of said
priorities and it is used as such in loops
throughout the code. (0-based indexing is the
the count number.)

Remove redundant word HIGH in priority names,
and rename *KERNEL* to *HIGH*, as it really
means that, high.

v2: Add back KERNEL and remove SW and HW,
    in lieu of a single HIGH between NORMAL and KERNEL.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 18:20:17 -04:00
Huang Rui
34174b89bf drm/amdkfd: fix the wrong sdma instance query for renoir
Renoir only has one sdma instance, it will get failed once query the
sdma1 registers. So use switch-case instead of static register array.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 17:54:12 -04:00
Bhawanpreet Lakha
0a668aee0a drm/amdgpu: parse ta firmware for navy_flounder
Use the same case as sienna_cichlid

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 17:53:50 -04:00
Guchun Chen
1a68d96f81 drm/amdgpu: fix NULL pointer access issue when unloading driver
When unloading driver by "modprobe -r amdgpu", one NULL pointer
dereference bug occurs in ras debugfs releasing. The cause is the
duplicated debugfs_remove, as drm debugfs_root dir has been cleaned
up already by drm_minor_unregister.

BUG: kernel NULL pointer dereference, address: 00000000000000a0
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 11 PID: 1526 Comm: modprobe Tainted: G           OE     5.6.0-guchchen #1
Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018
RIP: 0010:down_write+0x15/0x40
Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92
d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3
RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0
RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0
R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40
R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000
FS:  00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 simple_recursive_removal+0x63/0x370
 ? debugfs_remove+0x60/0x60
 debugfs_remove+0x40/0x60
 amdgpu_ras_fini+0x82/0x230 [amdgpu]
 ? __kernfs_remove.part.17+0x101/0x1f0
 ? kernfs_name_hash+0x12/0x80
 amdgpu_device_fini+0x1c0/0x580 [amdgpu]
 amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu]
 amdgpu_pci_remove+0x36/0x60 [amdgpu]
 pci_device_remove+0x3b/0xb0
 device_release_driver_internal+0xe5/0x1c0
 driver_detach+0x46/0x90
 bus_remove_driver+0x58/0xd0
 pci_unregister_driver+0x29/0x90
 amdgpu_exit+0x11/0x25 [amdgpu]
 __x64_sys_delete_module+0x13d/0x210
 do_syscall_64+0x5f/0x250
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 17:00:46 -04:00
Jiansong Chen
9c9b17a7d1 drm/amdgpu: disable gfxoff for navy_flounder
gfxoff is temporarily disabled for navy_flounder,
since at present the feature has broken some basic
amdgpu test.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 16:58:59 -04:00
Maxime Ripard
d85ddd1318 Linux 5.9-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl85kWkeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGGPwIAJpEmEBkMoQ+KARK
 PaaVDQW9fwAlC1nThMpGv/m8Ym7KbfLkTgEJQiQyNv3pDDhyLP8jvcZcscIkfs4s
 56IMjFndRHWNeCVu9YPXWmAEp/WycZNC7YVPu0j1bI9VgvaHvbHOqUWzxB716RbY
 K4TFprJEA3sotNm0vdda2NgSlSup/0NVKiP2LwQPjkwH+Kf6/Ol1j2uxbWywEo75
 BdW5LreDtUoJ7W5BeX8GJ0IVgWdyxBV61eVbaINNY3EOPc7+uMGOgR9oHeGWRceH
 V4ELYww5yjizUDtKFvVTc/k0tj+Rq73mtOADdaF0YWItqxtDBvAcdKIpC0KYzVaa
 2fB+rts=
 =9Pnj
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXzvIXwAKCRDj7w1vZxhR
 xYXLAQC80uF6JkpBeNyuewyY7CDadDG1qDchDmYquwGVDnO+HwEAmvL84csLcxBy
 ah3UMOKUyWz5Sahlg48ZIaaUhRaulwE=
 =Lu/B
 -----END PGP SIGNATURE-----

Merge v5.9-rc1 into drm-misc-next

Sam needs 5.9-rc1 to have dev_err_probe in to merge some patches.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-08-18 14:14:25 +02:00
Kevin Wang
4444457450 drm/amdgpu: add condition check for trace_amdgpu_cs()
v1:
add trace event enabled check to avoid nop loop when submit multi ibs
in amdgpu_cs_ioctl() function.

v2:
add a new wrapper function to trace all amdgpu cs ibs.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-17 14:06:37 -04:00
Kevin Wang
736b172978 drm/amdgpu: fix amdgpu_bo_release_notify() comment error
fix amdgpu_bo_release_notify() comment error.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-17 14:06:28 -04:00
Huang Rui
d95c42a150 drm/amdkfd: fix the wrong sdma instance query for renoir
Renoir only has one sdma instance, it will get failed once query the
sdma1 registers. So use switch-case instead of static register array.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-17 14:06:13 -04:00
Alex Deucher
11043b7a99 drm/amdgpu: note what type of reset we are using
When we reset the GPU, note what type of reset will be
used.  This makes debugging different reset scenarios
more clear as the driver may use different reset
methods depending on conditions on the system.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 17:03:20 -04:00
Alex Deucher
bddbacc9e0 drm/amdgpu: print where we get the vbios image from
ACPI, ROM, PCI BAR, etc.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 17:03:20 -04:00
Bhawanpreet Lakha
31e726ca3d drm/amdgpu: parse ta firmware for navy_flounder
Use the same case as sienna_cichlid

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:41 -04:00
James Zhu
ac1128c996 drm/amdgpu/vcn3.0: only SIENNA_CICHLID need specify instance for dec/enc
Only SIENNA_CICHLID(VCN3) has 2 unsymmetrical instances, there're less
codecs on instance 1, we use 0 for decode and 1 for encode.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:41 -04:00
Evan Quan
e098bc9612 drm/amd/pm: optimize the power related source code layout
The target is to provide a clear entry point(for power routines).
Also this can help to maintain a clear view about the frameworks
used on different ASICs. Hopefully all these can make power part
more friendly to play with.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:41 -04:00
Evan Quan
e9372d2371 drm/amd/powerplay: put those exposed power interfaces in amdgpu_dpm.c
As other power interfaces.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:41 -04:00
Evan Quan
20d3c28ce4 drm/amd/powerplay: optimize i2c bus access implementation
The caller needs not care about the internal details how the powerplay
API implemented.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:41 -04:00
Evan Quan
70bdb6ed22 drm/amd/powerplay: drop unnecessary pp_funcs checker
It's redundant. Also, the callers should not care about
the implementation details.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:41 -04:00
Evan Quan
b89e9eb681 drm/amd/powerplay: optimize amdgpu_dpm_set_clockgating_by_smu() implementation
Cover the implementation details from outside(of power part).

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:41 -04:00
Guchun Chen
ae2bf61ff3 drm/amdgpu: guard ras debugfs creation/removal based on CONFIG_DEBUG_FS
It can avoid potential build warn/error when
CONFIG_DEBUG_FS is not set.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Guchun Chen
2e2f5dd514 drm/amdgpu: fix NULL pointer access issue when unloading driver
When unloading driver by "modprobe -r amdgpu", one NULL pointer
dereference bug occurs in ras debugfs releasing. The cause is the
duplicated debugfs_remove, as drm debugfs_root dir has been cleaned
up already by drm_minor_unregister.

BUG: kernel NULL pointer dereference, address: 00000000000000a0
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 11 PID: 1526 Comm: modprobe Tainted: G           OE     5.6.0-guchchen #1
Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018
RIP: 0010:down_write+0x15/0x40
Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92
d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3
RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0
RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0
R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40
R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000
FS:  00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 simple_recursive_removal+0x63/0x370
 ? debugfs_remove+0x60/0x60
 debugfs_remove+0x40/0x60
 amdgpu_ras_fini+0x82/0x230 [amdgpu]
 ? __kernfs_remove.part.17+0x101/0x1f0
 ? kernfs_name_hash+0x12/0x80
 amdgpu_device_fini+0x1c0/0x580 [amdgpu]
 amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu]
 amdgpu_pci_remove+0x36/0x60 [amdgpu]
 pci_device_remove+0x3b/0xb0
 device_release_driver_internal+0xe5/0x1c0
 driver_detach+0x46/0x90
 bus_remove_driver+0x58/0xd0
 pci_unregister_driver+0x29/0x90
 amdgpu_exit+0x11/0x25 [amdgpu]
 __x64_sys_delete_module+0x13d/0x210
 do_syscall_64+0x5f/0x250
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Christian König
f1403342eb drm/amdgpu: revert "fix system hang issue during GPU reset"
The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1aa2.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Evan Quan
9f979a49e2 drm/amd/powerplay: enable swSMU mgpu fan boost support
Enable mgpu fan boost feature on swSMU routines.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Evan Quan
f10bb940d8 drm/amd/powerplay: optimize the interface for mgpu fan boost enablement
Cover the implementation details from outside(of power). Also preparing
for expanding this to swSMU.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Jiansong Chen
ba4e049e63 drm/amdgpu: disable gfxoff for navy_flounder
gfxoff is temporarily disabled for navy_flounder,
since at present the feature has broken some basic
amdgpu test.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Oak Zeng
9fb1506eb6 drm/amdgpu: Use function pointer for some mmhub functions
Add more function pointers to amdgpu_mmhub_funcs. ASIC specific
implementation of most mmhub functions are called from a general
function pointer, instead of calling different function for
different ASIC. Simplify the code by deleting duplicate functions

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Nirmoy Das
2f53072434 drm/amdgpu: pass NULL pointer instead of 0
Fixes: c030f2e416 ("drm/amdgpu: add amdgpu_ras.c to support ras (v2)")
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Dennis Li
72e14ebf9f drm/amdgpu: annotate a false positive recursive locking
[  584.110304] ============================================
[  584.110590] WARNING: possible recursive locking detected
[  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
[  584.111164] --------------------------------------------
[  584.111456] kworker/38:1/553 is trying to acquire lock:
[  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.112112]
               but task is already holding lock:
[  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.113068]
               other info that might help us debug this:
[  584.113689]  Possible unsafe locking scenario:

[  584.114350]        CPU0
[  584.114685]        ----
[  584.115014]   lock(&adev->reset_sem);
[  584.115349]   lock(&adev->reset_sem);
[  584.115678]
                *** DEADLOCK ***

[  584.116624]  May be due to missing lock nesting notation

[  584.117284] 4 locks held by kworker/38:1/553:
[  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630
[  584.117967]  #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
[  584.118358]  #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
[  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.119222]
               stack backtrace:
[  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
[  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[  584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  584.121638] Call Trace:
[  584.122050]  dump_stack+0x98/0xd5
[  584.122499]  __lock_acquire+0x1139/0x16e0
[  584.122931]  ? trace_hardirqs_on+0x3b/0xf0
[  584.123358]  ? cancel_delayed_work+0xa6/0xc0
[  584.123771]  lock_acquire+0xb8/0x1c0
[  584.124197]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.124599]  down_write+0x49/0x120
[  584.125032]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125472]  amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125910]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[  584.126367]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[  584.126789]  process_one_work+0x29e/0x630
[  584.127208]  worker_thread+0x3c/0x3f0
[  584.127621]  ? __kthread_parkme+0x61/0x90
[  584.128014]  kthread+0x12f/0x150
[  584.128402]  ? process_one_work+0x630/0x630
[  584.128790]  ? kthread_park+0x90/0x90
[  584.129174]  ret_from_fork+0x3a/0x50

Each adev has owned lock_class_key to avoid false positive
recursive locking.

v2:
1. register adev->lock_key into lockdep, otherwise lockdep will
report the below warning

[ 1216.705820] BUG: key ffff890183b647d0 has not been registered!
[ 1216.705924] ------------[ cut here ]------------
[ 1216.705972] DEBUG_LOCKS_WARN_ON(1)
[ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210

v3:
change to use down_write_nest_lock to annotate the false dead-lock
warning.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Wenhui Sheng
a4322e1881 drm/amdgpu: add debugfs interface for RAP test
After amdgpu driver loading successfully, we can use
RAP debugfs interface <debugfs_dir>/dri/xxx/rap_test
to trigger RAP test.

Currently only L0 validate test is supported.

v2: refine amdgpu_rap.h

Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Guchun Chen <Guchun.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Wenhui Sheng
8602692b6f drm/amdgpu: enable RAP TA load
Enable the RAP TA loading path and add RAP test
trigger interface.

v2: fix potential mem leak issue

Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Guchun Chen <Guchun.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:39 -04:00
Wenhui Sheng
a189d0ae0c drm/amdgpu: add RAP TA header file
The RAP TA contains tests used to verify if
RAP(Register Access Policy), or otherwise known
as Security Policy is applied correctly
by PSP BL&TOS.

The RAP test is a measure to ensure that we reduce
the avenue of complexity and mistakes when dealing
with RAP in post-si execution, where debugging failures
related to RAP is quite difficult and expensive.

v2: add introduction for RAP TA

Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Guchun Chen <Guchun.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:39 -04:00
Tianci.Yin
425a78f43b drm/amdgpu: reconfigure spm golden settings on Navi1x after GFXOFF exit(v3)
On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfigure the golden settings after GFXOFF
exit.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:39 -04:00
Tianci.Yin
d58fe3cf11 drm/amdgpu: add interface amdgpu_gfx_init_spm_golden for Navi1x
On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfiguration is needed. Make the
configuration code as an interface for future use.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:29 -04:00
Guchun Chen
66459e1db2 drm/amdgpu: add debugfs node to toggle ras error cnt harvest
Before ras recovery is issued, user could operate this debugfs
node to enable/disable the harvest of all RAS IPs' ras error
count registers, which will help keep hardware's registers'
status instead of cleaning up them.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:12:47 -04:00
Guchun Chen
f75e94d868 drm/amdgpu: bypass querying ras error count registers
Once ras recovery is issued by ras sync flood interrupt or
ras controller interrupt, add this guard to bypass or execute
ras error count register harvest of all IPs.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:12:22 -04:00
Linus Torvalds
ea6ec77437 drm fixes for 5.9-rc1
core:
 - Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi
 - Remove null check for kfree in drm_dev_release.
 - Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition.
 - re-added docs for drm_gem_flink_ioctl()
 - add orientation quirk for ASUS T103HAF
 
 ttm:
 - ttm: fix page-offset calculation within TTM
 - revert patch causing vmwgfx regressions
 
 fbcon:
 - Fix a fbcon OOB read in fbdev, found by syzbot.
 
 vga:
 - Mark vga_tryget static as it's not used elsewhere.
 
 amdgpu:
 - Re-add spelling typo fix
 - Sienna Cichlid fixes
 - Navy Flounder fixes
 - DC fixes
 - SMU i2c fix
 - Power fixes
 
 vmwgfx:
 - regression fixes for modesetting crashes
 - misc fixes
 
 xlnx:
 - Small fixes to xlnx.
 
 omap:
 - Fix mode initialization in omap_connector_mode_valid().
 - force runtime PM suspend on system suspend
 
 tidss:
 - fix modeset init for DPI panels
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJfM3QrAAoJEAx081l5xIa+4H8P/0YLHHK52yuYqyYEtvfehlDC
 i3ur47QdQS/6fkspVSbVndsToQS+D9ZNAZFUUSumVr1kR40yGC6oGZGmo+UWYH+T
 AiZC1LHmz8hsGMni40uCb+ble1mtcsbqjAkb5hG/9RxPfBiunQ6OJJGU7X2FoB3e
 TZNWwYKyF/Y5f/OlKdRbi9qfszWwWb/+ZW94q9ER1MjlcwrW3EqgRrkRGT3hH4UO
 AvzYhgGlzz7FXwQic3mJw3Sirm6NnbuUv9aRPRBCGm9i/BDSRHFpIcXIuLuMsiOk
 L/WtdYodMDbhB0xRKwDQzf9TIUxSwulrXO0Jfo8CVKknyHha62fz2bMGpf+4ron2
 3WrZmV+5w0Q0qcfbh1EgHkVXmDry+mtFws1dvAtvyNp7GSI675tNI2qH3cgLfvQX
 FgMrfbmOp+f+ohXL7fUw+J/7cUjrVYZxtkCAEctB/6B44INjFdRuwQfP98x5CcmX
 CpZHLJKsMI+Y6r6Jno5foIQJIZJrDiUJLhRE2sN7cWv54+2gmHXbqsLAfMMpB6Mc
 2gMZNIwB+qSVjULR+MHvzC3iZHHTkD3hijkYshb8+s5pICGo+eq2v4S/r31C4Wn1
 DpToox1t/RPY+WPS5dbi95zxg1QDwr/BsExTCKTUB2k+ovoWE6byPvt2Vslh+F/K
 i+LrkziEqKb/BhyGA7z9
 =dr7v
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "This has a few vmwgfx regression fixes we hit from the merge window
  (one in TTM), it also has a bunch of amdgpu fixes along with a
  scattering everywhere else.

  core:
   - Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi
   - Remove null check for kfree in drm_dev_release.
   - Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition.
   - re-added docs for drm_gem_flink_ioctl()
   - add orientation quirk for ASUS T103HAF

  ttm:
   - ttm: fix page-offset calculation within TTM
   - revert patch causing vmwgfx regressions

  fbcon:
   - Fix a fbcon OOB read in fbdev, found by syzbot.

  vga:
   - Mark vga_tryget static as it's not used elsewhere.

  amdgpu:
   - Re-add spelling typo fix
   - Sienna Cichlid fixes
   - Navy Flounder fixes
   - DC fixes
   - SMU i2c fix
   - Power fixes

  vmwgfx:
   - regression fixes for modesetting crashes
   - misc fixes

  xlnx:
   - Small fixes to xlnx.

  omap:
   - Fix mode initialization in omap_connector_mode_valid().
   - force runtime PM suspend on system suspend

  tidss:
   - fix modeset init for DPI panels"

* tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm: (70 commits)
  drm/ttm: revert "drm/ttm: make TT creation purely optional v3"
  drm/vmwgfx: fix spelling mistake "Cant" -> "Can't"
  drm/vmwgfx: fix spelling mistake "Cound" -> "Could"
  drm/vmwgfx/ldu: Use drm_mode_config_reset
  drm/vmwgfx/sou: Use drm_mode_config_reset
  drm/vmwgfx/stdu: Use drm_mode_config_reset
  drm/vmwgfx: Fix two list_for_each loop exit tests
  drm/vmwgfx: Use correct vmw_legacy_display_unit pointer
  drm/vmwgfx: Use struct_size() helper
  drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
  drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3)
  drm/amd/powerplay: update swSMU VCN/JPEG PG logics
  drm/amdgpu: use mode1 reset by default for sienna_cichlid
  drm/amdgpu/smu: rework i2c adpater registration
  drm/amd/display: Display goes blank after inst
  drm/amd/display: Change null plane state swizzle mode to 4kb_s
  drm/amd/display: Use helper function to check for HDMI signal
  drm/amd/display: AMD OUI (DPCD 0x00300) skipped on some sink
  drm/amd/display: Fix logger context
  drm/amd/display: populate new dml variable
  ...
2020-08-12 11:53:01 -07:00
Thomas Zimmermann
534b1f9071 Merge drm/drm-next into drm-misc-next
Backmerging drm-next into drm-misc-next for nouveau and panel updates.
Resolves a conflict between ttm and nouveau, where struct ttm_mem_res got
renamed to struct ttm_resource.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2020-08-12 20:42:08 +02:00
Christian König
b2458726b3 drm/ttm: give resource functions their own [ch] files
This is a separate object we work within TTM.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/384338/?series=80346&rev=1
2020-08-12 15:51:03 +02:00
Christian König
e92ae67d6e drm/ttm: rename ttm_resource_manager_func callbacks
The names get/put are associated with reference counting
in the Linux kernel, use alloc/free instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/384340/?series=80346&rev=1
2020-08-12 15:50:51 +02:00
Arunpravin
0cf0ee983b drm/amdgpu: Enable P2P dmabuf over XGMI
Access the exported P2P dmabuf over XGMI, if available.
Otherwise, fall back to the existing PCIe method.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin <apaneers@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-11 11:47:35 -04:00
Oleg Vasilev
65bf2cf95d drm/amdgpu: utilize subconnector property for DP through atombios
Since DP-specific information is stored in driver's structures, every
driver needs to implement subconnector property by itself.

v2: rebase

v3: renamed a function call

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Jeevan B <jeevan.b@intel.com>
Signed-off-by: Oleg Vasilev <oleg.vasilev@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1587732655-17544-4-git-send-email-jeevan.b@intel.com
2020-08-11 14:19:41 +02:00
Dave Airlie
16e6eea29d Merge tag 'amd-drm-fixes-5.9-2020-08-07' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-fixes-5.9-2020-08-07:

amdgpu:
- Re-add spelling typo fix
- Sienna Cichlid fixes
- Navy Flounder fixes
- DC fixes
- SMU i2c fix
- Power fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200807222843.3909-1-alexander.deucher@amd.com
2020-08-11 13:08:45 +10:00
Linus Torvalds
97d052ea3f A set of locking fixes and updates:
- Untangle the header spaghetti which causes build failures in various
     situations caused by the lockdep additions to seqcount to validate that
     the write side critical sections are non-preemptible.
 
   - The seqcount associated lock debug addons which were blocked by the
     above fallout.
 
     seqcount writers contrary to seqlock writers must be externally
     serialized, which usually happens via locking - except for strict per
     CPU seqcounts. As the lock is not part of the seqcount, lockdep cannot
     validate that the lock is held.
 
     This new debug mechanism adds the concept of associated locks.
     sequence count has now lock type variants and corresponding
     initializers which take a pointer to the associated lock used for
     writer serialization. If lockdep is enabled the pointer is stored and
     write_seqcount_begin() has a lockdep assertion to validate that the
     lock is held.
 
     Aside of the type and the initializer no other code changes are
     required at the seqcount usage sites. The rest of the seqcount API is
     unchanged and determines the type at compile time with the help of
     _Generic which is possible now that the minimal GCC version has been
     moved up.
 
     Adding this lockdep coverage unearthed a handful of seqcount bugs which
     have been addressed already independent of this.
 
     While generaly useful this comes with a Trojan Horse twist: On RT
     kernels the write side critical section can become preemtible if the
     writers are serialized by an associated lock, which leads to the well
     known reader preempts writer livelock. RT prevents this by storing the
     associated lock pointer independent of lockdep in the seqcount and
     changing the reader side to block on the lock when a reader detects
     that a writer is in the write side critical section.
 
  - Conversion of seqcount usage sites to associated types and initializers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8xmPYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTuQEACyzQCjU8PgehPp9oMqWzaX2fcVyuZO
 QU2yw6gmz2oTz3ZHUNwdW8UnzGh2OWosK3kDruoD9FtSS51lER1/ISfSPCGfyqxC
 KTjOcB1Kvxwq/3LcCx7Zi3ZxWApat74qs3EhYhKtEiQ2Y9xv9rLq8VV1UWAwyxq0
 eHpjlIJ6b6rbt+ARslaB7drnccOsdK+W/roNj4kfyt+gezjBfojGRdMGQNMFcpnv
 shuTC+vYurAVIiVA/0IuizgHfwZiXOtVpjVoEWaxg6bBH6HNuYMYzdSa/YrlDkZs
 n/aBI/Xkvx+Eacu8b1Zwmbzs5EnikUK/2dMqbzXKUZK61eV4hX5c2xrnr1yGWKTs
 F/juh69Squ7X6VZyKVgJ9RIccVueqwR2EprXWgH3+RMice5kjnXH4zURp0GHALxa
 DFPfB6fawcH3Ps87kcRFvjgm6FBo0hJ1AxmsW1dY4ACFB9azFa2euW+AARDzHOy2
 VRsUdhL9CGwtPjXcZ/9Rhej6fZLGBXKr8uq5QiMuvttp4b6+j9FEfBgD4S6h8csl
 AT2c2I9LcbWqyUM9P4S7zY/YgOZw88vHRuDH7tEBdIeoiHfrbSBU7EQ9jlAKq/59
 f+Htu2Io281c005g7DEeuCYvpzSYnJnAitj5Lmp/kzk2Wn3utY1uIAVszqwf95Ul
 81ppn2KlvzUK8g==
 =7Gj+
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:
 "A set of locking fixes and updates:

   - Untangle the header spaghetti which causes build failures in
     various situations caused by the lockdep additions to seqcount to
     validate that the write side critical sections are non-preemptible.

   - The seqcount associated lock debug addons which were blocked by the
     above fallout.

     seqcount writers contrary to seqlock writers must be externally
     serialized, which usually happens via locking - except for strict
     per CPU seqcounts. As the lock is not part of the seqcount, lockdep
     cannot validate that the lock is held.

     This new debug mechanism adds the concept of associated locks.
     sequence count has now lock type variants and corresponding
     initializers which take a pointer to the associated lock used for
     writer serialization. If lockdep is enabled the pointer is stored
     and write_seqcount_begin() has a lockdep assertion to validate that
     the lock is held.

     Aside of the type and the initializer no other code changes are
     required at the seqcount usage sites. The rest of the seqcount API
     is unchanged and determines the type at compile time with the help
     of _Generic which is possible now that the minimal GCC version has
     been moved up.

     Adding this lockdep coverage unearthed a handful of seqcount bugs
     which have been addressed already independent of this.

     While generally useful this comes with a Trojan Horse twist: On RT
     kernels the write side critical section can become preemtible if
     the writers are serialized by an associated lock, which leads to
     the well known reader preempts writer livelock. RT prevents this by
     storing the associated lock pointer independent of lockdep in the
     seqcount and changing the reader side to block on the lock when a
     reader detects that a writer is in the write side critical section.

   - Conversion of seqcount usage sites to associated types and
     initializers"

* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  locking/seqlock, headers: Untangle the spaghetti monster
  locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
  x86/headers: Remove APIC headers from <asm/smp.h>
  seqcount: More consistent seqprop names
  seqcount: Compress SEQCNT_LOCKNAME_ZERO()
  seqlock: Fold seqcount_LOCKNAME_init() definition
  seqlock: Fold seqcount_LOCKNAME_t definition
  seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
  hrtimer: Use sequence counter with associated raw spinlock
  kvm/eventfd: Use sequence counter with associated spinlock
  userfaultfd: Use sequence counter with associated spinlock
  NFSv4: Use sequence counter with associated spinlock
  iocost: Use sequence counter with associated spinlock
  raid5: Use sequence counter with associated spinlock
  vfs: Use sequence counter with associated spinlock
  timekeeping: Use sequence counter with associated raw spinlock
  xfrm: policy: Use sequence counters with associated lock
  netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  netfilter: conntrack: Use sequence counter with associated spinlock
  sched: tasks: Use sequence counter with associated spinlock
  ...
2020-08-10 19:07:44 -07:00
Dave Airlie
c44264f9f7 Linux 5.8
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8nLmkeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGBEkH/RJAnEan4gcdkBDf
 2xS0yk4XLMjKZwbz61VSeKMUBkGCsh1cWsaAtJJAIVYj/6o/7mld01sZCnOASLnJ
 ET0nXL2NiT/f+prYkTE5qeYH225/Yfh5jgmrqZtx/uXFCwgE5Nzi3f72IXQDmCR+
 kmpNhNox3YqQTKXhv7DXobDKcO0n8nZavnhxmA9SBZn2h9RHvmvJghD0UOfLjMpA
 1SbknaE67n5JN/JjI6TkYWk4nuJmqfvmBL5IYVDEZYO4UlM5Bqzhw0XN7Ax70K3M
 KRK/eiqRmNwun5MxWnbzQU7t7iTgVmzjHLTpWGcM3V4blgGXC3uhjc+p/R8KTQUE
 bIydSzs=
 =fDeo
 -----END PGP SIGNATURE-----

Merge tag 'v5.8' into drm-next

I need to backmerge 5.8 as I've got a bunch of fixes sitting
on an rc7 base that I want to land.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-08-11 11:58:31 +10:00
shiwu.zhang
97a9b60fa3 drm/amdgpu: update gc golden register for arcturus
Update golden setting to improve performance on HPC
and ML apps

Signed-off-by: shiwu.zhang <shiwu.zhang@amd.com>
Tested-by: gang.long <gang.long@amd.com>
Reviewed-by: guchun.chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-10 18:08:21 -04:00
Liu ChengZhe
d5bbb4761c drm/amdgpu: Skip some registers config for SRIOV
Some registers are not accessible to virtual function setup, so
skip their initialization when in VF-SRIOV mode.

v2: move SRIOV VF check into specify functions;
modify commit description and comment.

Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-10 18:06:35 -04:00
Christophe JAILLET
78484d7c74 drm: amdgpu: Use the correct size when allocating memory
When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead
of 'sizeof(*sg)'.

The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but
it won't lead to corruption.

Fixes: f44ffd677f ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-08-10 18:06:10 -04:00
Monk Liu
bcca629806 drm/amdgpu: fix reload KMD hang on GFX10 KIQ
GFX10 KIQ will hang if we try below steps:
modprobe amdgpu
rmmod amdgpu
modprobe amdgpu sched_hw_submission=4

Due to KIQ is always living there even after KMD unloaded
thus when doing the realod KIQ will crash upon its register
being programed by different values with the previous loading
(the config like HQD addr, ring size, is easily changed if we alter
the sched_hw_submission)

the fix is we must inactive KIQ first before touching any
of its registgers

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-10 17:26:52 -04:00
shiwu.zhang
5a58abf5ed drm/amdgpu: update gc golden register for arcturus
Update golden setting to improve performance on HPC
and ML apps

Signed-off-by: shiwu.zhang <shiwu.zhang@amd.com>
Tested-by: gang.long <gang.long@amd.com>
Reviewed-by: guchun.chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-10 17:26:52 -04:00
Liu ChengZhe
1d44732619 drm/amdgpu: Skip some registers config for SRIOV
Some registers are not accessible to virtual function setup, so
skip their initialization when in VF-SRIOV mode.

v2: move SRIOV VF check into specify functions;
modify commit description and comment.

Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-10 17:26:52 -04:00