linux

Author	SHA1	Message	Date
Alex Deucher	a98cec220a	drm/amdgpu: fix SDMA suspend/resume on SR-IOV Update all SDMA versions that support SR-IOV to properly tear down the ttm buffer functions on suspend. Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-10 17:32:56 -04:00
Alex Deucher	571c053658	drm/amdgpu: switch sdma buffer function tear down to a helper Switch all of the SDMA implementations to use the helper to tear down the ttm buffer manager. Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-10 17:32:56 -04:00
Bokun Zhang	e5da651985	drm/amdgpu: Fix SDMA engine resume issue under SRIOV - Under SRIOV, SDMA engine is shared between VFs. Therefore, we will not stop SDMA during hw_fini. This is not an issue with normal dirver loading and unloading. - However, when we put the SDMA engine to suspend state and resume it, the issue starts to show up. Something could attempt to use that SDMA engine to clear or move memory before the engine is initialized since the DRM entity is still there. - Therefore, we will call sdma_v5_2_enable(false) during hw_fini, and if we are under SRIOV, we will call sdma_v5_2_enable(true) afterwards to allow other VFs to use SDMA. This way, the DRM entity of SDMA engine is emptied and it will follow the flow of resume code path. Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-10 17:32:55 -04:00
Hamza Mahfooz	17d819e282	Revert "drm/amdgpu: use dirty framebuffer helper" This reverts commit `66f99628eb`. Unfortunately, that commit causes performance regressions on non-PSR setups. So, just revert it until FB_DAMAGE_CLIPS support can be added. Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2189 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216554 Fixes: `66f99628eb` ("drm/amdgpu: use dirty framebuffer helper") Fixes: `abbc7a3daf` ("drm/amdgpu: don't register a dirty callback for non-atomic") Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-06 12:08:27 -04:00
Philip Yang	2302d50714	drm/amdgpu: Correct amdgpu_amdkfd_total_mem_size calculation amdkfd_total_mem_size is the size of total GPUs vram plus system memory to estimate page tables memory usage and leave enough VRAM room for page tables allocation. Calculate amdkfd_total_mem_size in amdgpu_amdkfd_device_probe is incorrect because adev->gmc.real_vram_size is still 0 called from amdgpu_device_ip_early_init. Move the calculation to amdgpu_amdkfd_device_init to get the correct VRAM size. Do reverse calculation in amdgpu_amdkfd_device_fini_sw to support hot-unplugging GPUs. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-06 12:08:18 -04:00
Philip Yang	9a3c6067bd	drm/amdgpu: Set vmbo destroy after pt bo is created Under VRAM usage pression, map to GPU may fail to create pt bo and vmbo->shadow_list is not initialized, then ttm_bo_release calling amdgpu_bo_vm_destroy to access vmbo->shadow_list generates below dmesg and NULL pointer access backtrace: Set vmbo destroy callback to amdgpu_bo_vm_destroy only after creating pt bo successfully, otherwise use default callback amdgpu_bo_destroy. amdgpu: amdgpu_vm_bo_update failed amdgpu: update_gpuvm_pte() failed amdgpu: Failed to map bo to gpuvm amdgpu 0000:43:00.0: amdgpu: Failed to map peer:0000:43:00.0 mem_domain:2 BUG: kernel NULL pointer dereference, address: RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu] Call Trace: <TASK> ttm_bo_release+0x207/0x320 [amdttm] amdttm_bo_init_reserved+0x1d6/0x210 [amdttm] amdgpu_bo_create+0x1ba/0x520 [amdgpu] amdgpu_bo_create_vm+0x3a/0x80 [amdgpu] amdgpu_vm_pt_create+0xde/0x270 [amdgpu] amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu] amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu] amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu] update_gpuvm_pte+0x160/0x420 [amdgpu] amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu] kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu] kfd_ioctl+0x24a/0x5b0 [amdgpu] Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-06 12:08:09 -04:00
Arunpravin Paneer Selvam	312b4dc11d	drm/amdgpu: Fix VRAM BO swap issue DRM buddy manager allocates the contiguous memory requests in a single block or multiple blocks. So for the ttm move operation (incase of low vram memory) we should consider all the blocks to compute the total memory size which compared with the struct ttm_resource num_pages in order to verify that the blocks are contiguous for the eviction process. v2: Added a Fixes tag v3: Rewrite the code to save a bit of calculations and variables (Christian) Fixes: `c9cad937c0` ("drm/amdgpu: add drm buddy support to amdgpu") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-06 12:07:37 -04:00
Ruili Ji	21a550de5f	drm/amdgpu: Enable F32_WPTR_POLL_ENABLE in mqd This patch is to fix the SDMA user queue doorbell missing issue on SDMA 6.0. F32_WPTR_POLL_ENABLE has to be set if doorbell mode is used. Otherwise ringing SDMA user queue doorbell can't wake up system from gfxoff. Signed-off-by: Ruili Ji <ruiliji2@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x	2022-10-06 12:05:44 -04:00
Yang Yingliang	525530ad9a	drm/amdgpu/sdma: add missing release_firmware() in amdgpu_sdma_init_microcode() In some error path in amdgpu_sdma_init_microcode(), release_firmware() is not called, the memory allocated in request_firmware() will be leaked, calling amdgpu_sdma_destroy_inst_ctx() which calls release_firmware() to avoid memory leak. Fixes: `15aa13056d` ("drm/amdgpu: add function to init SDMA microcode") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-06 12:05:01 -04:00
Sonny Jiang	e626d9b9c6	drm/amdgpu: Enable VCN PG on GC11_0_1 Enable VCN PG on GC11_0_1 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x	2022-10-06 12:02:49 -04:00
Dave Airlie	65898687cf	Merge tag 'amd-drm-next-6.1-2022-09-30' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.1-2022-09-30: amdgpu: - RLC FW code cleanup - RLC fixes for GC 11.x - SMU 13.x fixes - CP FW code cleanup - SDMA FW code cleanup - GC 11.x fixes - DCN 3.2.x fixes - DCN 3.1.4 fixes - Misc fixes - RAS fixes - SR-IOV fixes - VCN 4.x fixes amdkfd: - GC 11.x fixes - Xnack fixes - UBSAN warning fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220930162012.5823-1-alexander.deucher@amd.com	2022-10-04 09:42:24 +10:00
Sonny Jiang	730548ba02	drm/amdgpu: Enable sram on vcn_4_0_2 Enable sram on vcn_4_0_2 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-30 11:21:02 -04:00
Sonny Jiang	0b37f47494	drm/amdgpu: Enable VCN DPG for GC11_0_1 Enable VCN DPG on GC11_0_1 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-30 11:20:43 -04:00
Le Ma	a79852a393	drm/amdgpu: correct the memcpy size for ip discovery firmware Use fw->size instead of discovery_tmr_size for fallback path. Signed-off-by: Le Ma <le.ma@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:44:02 -04:00
Vignesh Chander	f61a825aa8	drm/amdgpu: Skip put_reset_domain if it doesn't exist For xgmi sriov, the reset is handled by host driver and hive->reset_domain is not initialized so need to check if it exists before doing a put. Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:43:52 -04:00
Graham Sider	e67135571e	drm/amdgpu: remove switch from amdgpu_gmc_noretry_set Simplify the logic in amdgpu_gmc_noretry_set by getting rid of the switch. Also set noretry=1 as default for GFX 10.3.0 and greater since retry faults are not supported. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:43:41 -04:00
Leo Li	3ff4ccc3e9	drm/amdgpu: Fix mc_umc_status used uninitialized warning On ChromeOS clang build, the following warning is seen: /mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c:463:6: error: variable 'mc_umc_status' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (mca_addr == UMC_INVALID_ADDR) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ /mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c:485:21: note: uninitialized use occurs here if ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1 && ^~~~~~~~~~~~~ /mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:1208:5: note: expanded from macro 'REG_GET_FIELD' (((value) & REG_FIELD_MASK(reg, field)) >> REG_FIELD_SHIFT(reg, field)) ^~~~~ /mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c:463:2: note: remove the 'if' if its condition is always true if (mca_addr == UMC_INVALID_ADDR) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c:460:24: note: initialize the variable 'mc_umc_status' to silence this warning uint64_t mc_umc_status, mc_umc_addrt0; ^ = 0 1 error generated. make[5]: *** [/mnt/host/source/src/third_party/kernel/v5.15/scripts/Makefile.build:289: drivers/gpu/drm/amd/amdgpu/umc_v6_7.o] Error 1 Fix by initializing mc_umc_status = 0. Fixes: `1014bd1cb3` ("drm/amdgpu: support to convert dedicated umc mca address") Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:42:50 -04:00
Tao Zhou	5e1fdf76cf	drm/amdgpu: add page retirement handling for CPU RAS Do RAS page retirement in poison consumption handler unconditionally. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:42:36 -04:00
Tao Zhou	cd4c99f103	drm/amdgpu: use RAS error address convert api in mca notifier Use the convert interface to simplify code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:42:31 -04:00
Tao Zhou	1014bd1cb3	drm/amdgpu: support to convert dedicated umc mca address Update umc error address query interface, the mca address can be read from register or input from parameter. TODO: define a common address conversion function to simplify the code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:42:21 -04:00
Tao Zhou	c19a5f325a	drm/amdgpu: export umc error address convert interface Make it global so we can convert specific mca address. v2: rename query_error_address_per_channel to convert_ras_error_address Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:42:15 -04:00
Likun Gao	baf28cc10a	drm/amdgpu: fix sdma v4 init microcode error Fix init SDMA microcode error for sdma v4, which caused by mistake when rearch sdma init microcode function (coding 4.2.2 to 4.2.0). Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:42:08 -04:00
Bokun Zhang	d7274ec723	drm/amdgpu: Add amdgpu suspend-resume code path under SRIOV - Under SRIOV, we need to send REQ_GPU_FINI to the hypervisor during the suspend time. Furthermore, we cannot request a mode 1 reset under SRIOV as VF. Therefore, we will skip it as it is called in suspend_noirq() function. - In the resume code path, we need to send REQ_GPU_INIT to the hypervisor and also resume PSP IP block under SRIOV. Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:46 -04:00
Likun Gao	2d89e2ddfd	drm/amdgpu: fix compiler warning for amdgpu_gfx_cp_init_microcode Change the type of parameter on amdgpu_gfx_cp_init_microcode to fix compiler warning. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:46 -04:00
Hawking Zhang	940d4dd402	drm/amdgpu: add rlc_sr_cntl_list to firmware array To allow upload the list via psp Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:46 -04:00
Jiadong.Zhu	e7b8e90add	drm/amdgpu: Remove fence_process in count_emitted The function amdgpu_fence_count_emitted used in work_hander should not call amdgpu_fence_process which must be used in irq handler. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:46 -04:00
Jiadong.Zhu	415be17fb2	drm/amdgpu: Correct the position in patch_cond_exec The current position calulated in gfx_v9_0_ring_emit_patch_cond_exec underflows when the wptr is divisible by ring->buf_mask + 1. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:46 -04:00
Graham Sider	3e9cf23428	drm/amdgpu: pass queue size and is_aql_queue to MES Update mes_v11_api_def.h add_queue API with is_aql_queue parameter. Also re-use gds_size for the queue size (unused for KFD). MES requires the queue size in order to compute the actual wptr offset within the queue RB since it increases monotonically for AQL queues. v2: Make is_aql_queue assign clearer Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:44 -04:00
David Belanger	585a82618b	drm/amdgpu: Enable SA software trap. Enables support for software trap for MES >= 4. Adapted from implementation from Jay Cornwall. v2: Add IP version check in conditions. v3: Remove debugger code changes. Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: David Belanger <david.belanger@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Ruijing Dong	167be85228	drm/amdgpu/vcn: update vcn4 fw shared data structure update VF_RB_SETUP_FLAG, add SMU_DPM_INTERFACE_FLAG, and corresponding change in VCN4. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	b077656b8c	drm/amdgpu/sdma6: use common function to init sdma fw Use common function to init sdma v6 firmware ucode. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	52642d13d6	drm/amdgpu: support sdma struct v2 fw init Support SDMA firmware init on common function for sdma v2 struct. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	108db8decf	drm/amdgpu/sdma5: use common function to init sdma fw Use common function to init sdma v5 firmware ucode. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	a2d3b4b81f	drm/amdgpu/sdma4: use common function to init sdma fw Use common function to init sdma v4 firmware ucode. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	15aa13056d	drm/amdgpu: add function to init SDMA microcode Add an common function to init SDMA related microcode. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	e268df1d20	drm/amdgpu/gfx11: use common function to init cp fw Use common function to init gfx v11 CP firmware ucode. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	5993e4c68a	drm/amdgpu/gfx10: use common function to init CP fw Use common function to init gfx v10 CP firmware ucode. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	93cad722d3	drm/amdgpu/gfx9: use common function to init cp fw Use common function to init gfx v9 CP firmware ucode. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Likun Gao	ec71b25017	drm/amdgpu: add function to init CP microcode Add an common function to init CP related microcode. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:43 -04:00
Evan Quan	7436538899	drm/amdgpu: avoid gfx register accessing during gfxoff Make sure gfxoff is disabled before gfx register accessing. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:42 -04:00
Lijo Lazar	bb66ecbf12	drm/amdgpu: Use simplified API for p2p dist calc Use the simpified API that calculates distance between two devices. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:42 -04:00
Lijo Lazar	d0fa84f174	drm/amdgpu: Disable verbose for p2p dist calc Disable verbose while getting p2p distance. With verbose, it shows warning if ACS redirect is set between the devices. Adds noise to dmesg logs when a few GPU devices are on the same platform. Example log: amdgpu 0000:34:00.0: ACS redirect is set between the client and provider (0000:31:00.0) amdgpu 0000:34:00.0: to disable ACS redirect for this path, add the kernel parameter: pci=disable_acs_redir=0000:30:00.0;0000:2e:00.0;0000:33:00.0;0000:2e:10.0 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:42 -04:00
YiPeng Chai	642c040113	drm/amdgpu: Fixed ras warning when uninstalling amdgpu For the asic using smu v13_0_2, there is the following warning when uninstalling amdgpu: amdgpu: ras disable gfx failed poison:1 ret:-22. [Why]: For the asic using smu v13_0_2, the psp .suspend and mode1reset is called before amdgpu_ras_pre_fini during amdgpu uninstall, it has disabled all ras features and reset the psp. Since the psp is reset, calling amdgpu_ras_disable_all_features in amdgpu_ras_pre_fini to disable ras features will fail. [How]: If all ras features are disabled, amdgpu_ras_disable_all_features will not be called to disable all ras features again. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:42 -04:00
Hawking Zhang	7c32d4e37f	drm/amdgpu/gfx11: switch to amdgpu_gfx_rlc_init_microcode switch to common helper to initialize rlc firmware for gfx11 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:42 -04:00
Hawking Zhang	39a35d52d4	drm/amdgpu/gfx10: switch to amdgpu_gfx_rlc_init_microcode switch to common helper to initialize rlc firmware for gfx10 v2: squash in size validation fix (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-29 09:41:42 -04:00
Dave Airlie	e8573000f4	Merge tag 'amd-drm-next-6.1-2022-09-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.1-2022-09-23: amdgpu: - SDMA fix - Add new firmware types to debugfs/IOCTL version queries - Misc spelling and grammar fixes - Misc code cleanups - DCN 3.2.x fixes - DCN 3.1.x fixes - CS cleanup - Gang submit support - Clang fixes - Non-DC audio fix - GPUVM locking fixes - Vega10 PWN fan speed fix amdkgd: - MQD manager cleanup - Misc spelling and grammar fixes UAPI: - Add new firmware types to the FW version query IOCTL Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220923215729.6061-1-alexander.deucher@amd.com	2022-09-28 14:56:09 +10:00
Dave Airlie	907cc346ff	Merge tag 'drm-misc-next-2022-09-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 6.1: UAPI Changes: Cross-subsystem Changes: - dma-buf: Improve signaling when debugging Core Changes: - Backlight handling improvements - format-helper: Add drm_fb_build_fourcc_list() - fourcc: Kunit tests improvements - modes: Add DRM_MODE_INIT() macro - plane: Remove drm_plane_init(), Allocate planes with drm_universal_plane_alloc() - plane-helper: Add drm_plane_helper_atomic_check() - probe-helper: Add drm_connector_helper_get_modes_fixed() and drm_crtc_helper_mode_valid_fixed() - tests: Conversion to parametrized tests, test name consistency Driver Changes: - amdgpu: Fix for a VRAM eviction issue - ast: Resolution handling improvements - mediatek: small code improvements for DP - omap: Refcounting fix, small improvements - rockchip: RK3568 support, Gamma support for RK3399 - sun4i: Build failure fix when !OF - udl: Multiple fixes here and there - vc4: HDMI hotplug handling improvements - vkms: Warning fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220923073943.d43tne5hni3iknlv@houat	2022-09-28 13:50:46 +10:00
Hawking Zhang	f6f8bb5989	drm/amdgpu/gfx9: switch to amdgpu_gfx_rlc_init_microcode switch to common helper to initialize rlc firmware for gfx9 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-27 17:02:39 -04:00
Hawking Zhang	5b41521268	drm/amdgpu: add helper to init rlc firmware To initialzie rlc firmware according to rlc firmware header version v2: squash in backwards compat fix Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-27 17:02:38 -04:00
Arunpravin Paneer Selvam	39dd0cc2e5	drm/amdgpu: Fix VRAM eviction issue A user reported that when he starts a game (MTGA) with wine, he observed an error msg failed to pin framebuffer with error -12. Found an issue with the VRAM mem type eviction decision condition logic. This patch will fix the if condition code error. Gitlab bug link: https://gitlab.freedesktop.org/drm/amd/-/issues/2159 Fixes: `ded910f368` ("drm/amdgpu: Implement intersect/compatible functions") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220922151447.265696-1-Arunpravin.PaneerSelvam@amd.com Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2022-09-22 19:53:06 +02:00

1 2 3 4 5 ...

11361 Commits