Some video card has more than one vcn instance, passing 0 to
vcn_v3_0_pause_dpg_mode is incorrect.
Error msg:
Register(1) [mmUVD_POWER_STATUS] failed to reach value
0x00000001 != 0x00000002
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
For VCN FW to detect ASIC type, in order to use different mailbox registers.
V2: simplify codes and fix format issue.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Variable igp_lane_info always is 0. 0 & any value = 0 and false.
In this way, all сonditional statements will false.
The code was leftover from when the code was ported from radeon
where igp_lane_info was derived from the vbios on supported
platforms.
[update commit message - Alex]
Signed-off-by: Grigory Vasilyev <h0tc0d3@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For VG20 + XGMI bridge, all mappings PTEs cache in TC, this may have
stall invalid PTEs in TC because one cache line has 8 pages. Need always
flush_tlb after updating mapping.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Testing the valid bit is not enough to figure out if we
need to invalidate the TLB or not.
During eviction it is quite likely that we move a BO from VRAM to GTT and
update the page tables immediately to the new GTT address.
Rework the whole function to get all the necessary parameters directly as
value.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Flush TLB needs wait for GPU update fence done. MMU notify callback to
unmap range from GPUs uses unlocked GPU page table update, so add tlb_cb
to unlocked update fence to increase vm->tlb_seq.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To fix two issues with unlocked update fence:
1. vm->last_unlocked store the latest fence without taking refcount.
2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This way we finally fix the problem that new resource are
not immediately evict-able after allocation.
That has caused numerous problems including OOM on GDS handling
and not being able to use TTM as general resource manager.
v2: stop assuming in ttm_resource_fini that res->bo is still valid.
v3: cleanup kerneldoc, add more lockdep annotation
v4: consistently use res->num_pages
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-1-christian.koenig@amd.com
RAS error query support addition for JPEG 2.6
V2: removed unused options and corrected comment format.
Moved register definition to header file.
V3: poison query status check added.
Removed the error query support
V4: Return statement refactored.
Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add vcn and jpeg ras support options
V2: vcn and jpeg ras flag enabled for aldebaran asic only
V3: vcn and jpeg ras flag disabled for error counter query
Generic poison query interface added
VCN and JPEG ras enabled based on IP version check
V4: vcn and jpeg ras flag moved under ecc flag for dGPU
Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some video card has more than one vcn instance, passing 0 to
vcn_v3_0_pause_dpg_mode is incorrect.
Error msg:
Register(1) [mmUVD_POWER_STATUS] failed to reach value
0x00000001 != 0x00000002
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ATOMIC and DRIVER log categories do not typically contain per-frame log
messages. This patch re-classifies some messages in amd to chattier
categories to keep ATOMIC/DRIVER quiet.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For VCN FW to detect ASIC type, in order to use different mailbox registers.
V2: simplify codes and fix format issue.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of tracking the VM updates through the dependencies just use a
sequence counter for page table updates which indicates the need to
flush the TLB.
This reduces the need to flush the TLB drastically.
v2: squash in NULL check fix (Christian)
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Separate the VM page table backend operations from the state machine since
the amdgpu_vm.c file is becoming to complex.
The allocating, freeing and updating page tables and page directories can
easily be moved into a separate file.
While at it cleanup everything checkpatch.pl reported and rename the
functions a bit to make more clear that they belong together.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the page tables to the idle list after updating the PDEs.
We have gone back and forth with that a couple of times because of problems
with the inter PD dependencies, but it should work now that we have the
state handling cleanly separated.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add help functions to query and reset RAS UTCL2 poison status.
v2: implement it on amdgpu side and kfd only calls it.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix
this by passing correct number of VMIDs to HWS
v2: squash in warning fix (Alex)
Signed-off-by: Tushar Patel <tushar.patel@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It is a hardware issue that VCN can't handle a GTT
backing stored TMZ buffer on CHIP_RAVEN series ASIC.
Move such a TMZ buffer to VRAM domain before command
submission as a workaround.
v2:
- Use patch_cs_in_place callback.
v3:
- Bail out early if unsecure IBs.
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org