drm/amdkfd: fix partition query when setting up recommended sdma engines

When users dynamically set the partition mode through sysfs writes,
the KFD can hit a double-lock situation: it tries to take the partition
lock again while updating the recommended SDMA engines.
Have the KFD check its saved per-socket device node count
(gpu->kfd->num_nodes) instead of querying the partition mode.
Also ensure there are enough SDMA xGMI engines to report the recommended
engines in the first place.

Fixes: e06b71b231 ("drm/amdkfd: allow users to target recommended SDMA engines")
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim 2024-08-07 15:33:41 -04:00 committed by Alex Deucher
parent 9b7e697839
commit 70f83e7706

@@ -1286,9 +1286,8 @@ static void kfd_set_recommended_sdma_engines(struct kfd_topology_device *to_dev,
 	struct amdgpu_device *adev = gpu->adev;
 	int num_xgmi_nodes = adev->gmc.xgmi.num_physical_nodes;
 	bool support_rec_eng = !amdgpu_sriov_vf(adev) && to_dev->gpu &&
-		adev->aid_mask && num_xgmi_nodes &&
-		(amdgpu_xcp_query_partition_mode(adev->xcp_mgr, AMDGPU_XCP_FL_NONE) ==
-		      AMDGPU_SPX_PARTITION_MODE) &&
+		adev->aid_mask && num_xgmi_nodes && gpu->kfd->num_nodes == 1 &&
+		kfd_get_num_xgmi_sdma_engines(gpu) >= 14 &&
 		(!(adev->flags & AMD_IS_APU) && num_xgmi_nodes == 8);
 	if (support_rec_eng) {