KASAN reported a slab-out-of-bounds read of size 1 in
kdf_create_vcrat_image_cpu().
This occurs when, for example, when on an x86_64 with a single NUMA node
because kfd_fill_iolink_info_for_cpu() is a no-op, but afterwards the
sub_type_hdr->length, which is out-of-bounds, is read and multiplied by
entries. Fortunately, entries is 0 in this case so the overall
crat_table->length is still correct.
Check if there were any entries before de-referencing sub_type_hdr which
may be pointing to out-of-bounds memory.
Fixes: b7b6c38529 ("drm/amdkfd: Calculate CPU VCRAT size dynamically (v2)")
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The acpi_get_table() should be coupled with acpi_put_table() if
the mapped table is not used at runtime to release the table
mapping which can prevent the memory leak.
In kfd_create_crat_image_acpi(), crat_table is copied to pcrat_image,
and in kfd_create_vcrat_image_cpu(), the acpi_table is only used to
get the OEM information, so those two table mappings need to be released
after using it.
Fixes: 174de876d6 ("drm/amdkfd: Group up CRAT related functions")
Fixes: 520b8fb755 ("drm/amdkfd: Add topology support for CPUs")
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the ignore_crat is set to non-zero value, it's no point getting
the CRAT table, so just move the ignore_crat check before we get the
CRAT table.
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of guessing at a sufficient size for the CPU VCRAT, base the
size on the number of online NUMA nodes.
v2: fix warning
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We still have a few iommu issues which need to address, so force raven
as "dgpu" path for the moment.
This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled
or ACPI CRAT table not correct.
v2: Use ignore_crat parameter to decide whether it will go with IOMMUv2.
v3: Align with existed thunk, don't change the way of raven, only renoir
will use "dgpu" path by default.
v4: don't update global ignore_crat in the driver, and revise fallback
function if CRAT is broken.
v5: refine acpi crat good but no iommu support case, and rename the
title.
v6: fix the issue of dGPU initialized firstly, just modify the report
value in the node_show().
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The bitmap in cu_info structure is defined as a 4x4 size array. In
Acturus, this matrix is initialized as a 4x2. Based on the 8 shaders.
In the gpu cache filling initialization, the access to the bitmap matrix
was done as an 8x1 instead of 4x2. Causing an out of bounds memory
access error.
Due to this, the number of GPU cache entries was inconsistent.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add missing break statement in order to prevent the code from falling
through to case CHIP_NAVI10.
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
Fixes: 14328aa58c ("drm/amdkfd: Add navi10 support to amdkfd. (v3)")
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
GPU cache info (part of virtual CRAT) size depends on CU number.
For arcturus, CU number has been increased. So the required memory
for vcrat also increases.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After amdkfd module is merged into amdgpu, KFD can call amdgpu directly
and no longer needs to use the function pointer. Replace those function
pointers with functions if they are not ASIC dependent.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If there are several memory banks that has the same properties in CRAT,
we aggregate them into one memory bank. This cleans up memory banks on
APUs (e.g. Raven) where the CRAT reports each memory channel as a
separate bank. This only confuses user mode, which only deals with
virtual memory.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* Report 64-bit doorbells as HSA_CAP_DOORBELL_TYPE_2_0 in topology
* Report cache information in topology (duplicates GFXv8 info for now)
* Add device info for Vega10 support in KFD
Raven is not enabled at this time as it needs additional changes in
DQM to work with a single SDMA engine.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Simulate large-BAR system by exporting only visible memory. This
limits the amount of available VRAM to the size of the BAR, but
enables CPU access to VRAM.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
When CONFIG_ACPI is disabled, we never initialize the acpi_table
structure in kfd_create_crat_image_virtual:
drivers/gpu/drm/amd/amdkfd/kfd_crat.c: In function 'kfd_create_crat_image_virtual':
drivers/gpu/drm/amd/amdkfd/kfd_crat.c:888:40: error: 'acpi_table' may be used uninitialized in this function [-Werror=maybe-uninitialized]
The undefined behavior also happens for any other acpi_get_table()
failure, but then the compiler can't warn about it.
This adds an error check that prevents the structure from
being used in error, avoiding both the undefined behavior and
the warning about it.
Fixes: 520b8fb755 ("drm/amdkfd: Add topology support for CPUs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
ASIC information. Also allow building KFD without IOMMUv2 support.
This is still useful for dGPUs and prepares for enabling KFD on
architectures that don't support AMD IOMMUv2.
v2:
* Centralize IOMMUv2 code to avoid #ifdefs in too many places
v3:
* Imply AMD_IOMMU_V2 in Kconfig
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Generate and parse VCRAT tables for dGPUs in kfd_topology_add_device.
Some information that isn't available in the CRAT table is patched
into the topology after parsing.
HSA_CAP_DOORBELL_TYPE_1_0 is dependent on the ASIC feature
CP_HQD_PQ_CONTROL.SLOT_BASED_WPTR, which was not introduced in VI
until Carrizo. Report HSA_CAP_DOORBELL_TYPE_PRE_1_0 on Tonga ASICs.
v2: Added #include <linux/pci.h> to kfd_crat.c to make it compile
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Currently, the KFD topology information is generated by parsing the CRAT
(ACPI) table. However, at present CRAT table is available only for AMD
APUs. To support CPUs on systems without a CRAT table, the KFD driver will
create a Virtual CRAT (VCRAT) table and then the existing code will parse
that table to generate topology.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Only count memory banks in one place. Ignore redundant num_banks
entry in crat_subtype_computeunit.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Currently, CRAT parsing is intertwined with topology_device_list and
hence repeated calls to kfd_parse_crat_table() will fail. Decouple
kfd_parse_crat_table() and topology_device_list.
kfd_parse_crat_table() will parse CRAT and add topology devices to a
temporary list temp_topology_device_list and then
kfd_topology_update_device_list will move contents from temporary list to
master list.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Take CRAT related functions out of kfd_topology.c and place them in
kfd_crat.c. This is the initial step of supporting more CRAT features,
i.e. creating virtual CRAT table for KFD devices without CRAT.
v2: Minor cleanup that was missed previously because code moved around
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>