Merge the functionality mostly into amdgpu_vm_bo_update_mapping.
This way we can even handle small contiguous system pages without
to much extra CPU overhead.
v2: fix typo, keep the cursor as it is for now
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com> (v1)
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Create new debugfs entry to print memory info using VM buffer
objects.
V2: Added Common function for printing BO info.
Dump more VM lists for evicted, moved, relocated, invalidated.
Removed dumping VM mapped BOs.
V3: Fixed coding style comments, renamed print API and variables.
V4: Fixed coding style comments.
Signed-off-by: Mihir Bhogilal Patel <Mihir.Patel@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
align frag_end to the next pd when there are no
page table entries on the current pde.
This fixes invalidation of larger address space areas
where some page tables are allocated and other aren't.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds a new trace event to track the PTE update
events. This specific event will provide information like:
- start and end of virtual memory mapping
- HW engine flags for the map
- physical address for mapping
This will be particularly useful for memory profiling tools
(like RMV) which are monitoring the page table update events.
V2: Added physical address lookup logic in trace point
V3: switch to use __dynamic_array
added nptes int the TPprint arguments list
added page size in the arg list
V4: Addressed Christian's review comments
add start/end instead of seg
use incr instead of page_sz to be accurate
V5: Addressed Christian's review comments:
add pid and vm context information in the event
V6: Re-sequence the variables (put pid and ctx_id first)
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Calculate the correct value for max_entries or we might run after the
page_address array.
v2: Xinhui pointed out we don't need the shift
v3: use local copy of start and simplify some calculation
v4: fix the case that we map less VA range than BO size
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 1e691e2444 drm/amdgpu: stop allocating dummy GTT nodes
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.
v2: Use a typed visible static inline function
instead of an invisible macro.
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Backmerging drm-next into drm-misc-next for nouveau and panel updates.
Resolves a conflict between ttm and nouveau, where struct ttm_mem_res got
renamed to struct ttm_resource.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Access the exported P2P dmabuf over XGMI, if available.
Otherwise, fall back to the existing PCIe method.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin <apaneers@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Have strict check on bo mapping since on some systems, such as A+A or
hybrid, the cpu might support 5 level paging or can address memory above
48 bits but gpu might be limited by hardware to just use 48 bits. In
general, this applies to all asics where this limitation can be checked
against their max_pfn range. This restricts the range to map bo within
pratical limits of cpu and gpu for shared virtual memory access.
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For HMM support we need the ability to invalidate PTEs from
a MM callback where we can't lock the root PD.
Add a new flag to better support this instead of assuming
that all invalidation updates are unlocked.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If a buffer object is secure, i.e. created with
AMDGPU_GEM_CREATE_ENCRYPTED, then the TMZ bit of
the PTEs that belong the buffer object should be
set.
v1: design and draft the skeletion of TMZ bits setting on PTEs (Alex)
v2: return failure once create secure BO on non-TMZ platform (Ray)
v3: amdgpu_bo_encrypted() only checks the BO (Luben)
v4: move TMZ flag setting into amdgpu_vm_bo_update (Christian)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Vega20 arbitrates pstate at hive level and not device level. Last peer to
remote buffer unmap could drop P-State while another process is still
remote buffer mapped.
With this fix, P-States still needs to be disabled for now as SMU bug
was discovered on synchronous P2P transfers. This should be fixed in the
next FW update.
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Originally, only the PTE valid is taken in consider.
The PRT case is missied when bo update which raise problem.
We need add condition for PRT case.
v2: add PRT condition for amdgpu_vm_bo_update_mapping, too
v3: fix one typo error
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SPM access the video memory according to SPM_VMID. It should be updated
with the job's vmid right before the job is scheduled. SPM_VMID is a
global resource
Signed-off-by: Jacob He <jacob.he@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add update fences to the root PD while mapping BOs.
Otherwise PDs freed during the mapping won't wait for
updates to finish and can cause corruptions.
v2: rebased on drm-misc-next
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 90b69cdc5f drm/amdgpu: stop adding VM updates fences to the resv obj
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Tested-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If provided we only sync to the BOs reservation
object and no longer to the root PD.
v2: update comment, cleanup amdgpu_bo_sync_wait_resv
v3: use correct reservation object while clearing
v4: fix typo in amdgpu_bo_sync_wait_resv
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Avoid reclaim filesystem while eviction lock is held called from
MMU notifier.
[How]
Setting PF_MEMALLOC_NOFS flags while eviction mutex is locked.
Using memalloc_nofs_save / memalloc_nofs_restore API.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm_sched_entity_init() takes drm gpu scheduler list instead of
drm_sched_rq list. This makes conversion of drm_sched_rq list
to drm gpu scheduler list unnecessary
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Entity currently keeps a copy of run_queue list and modify it in
drm_sched_entity_set_priority(). Entities shouldn't modify run_queue
list. Use drm_gpu_scheduler list instead of drm_sched_rq list
in drm_sched_entity struct. In this way we can select a runqueue based
on entity/ctx's priority for a drm scheduler.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Don't add the VM update fences to the resv object and remove
the handling to stop implicitely syncing to them.
Ongoing updates prevent page tables from being evicted and we manually
block for all updates to complete before releasing PDs and PTS.
This way we can do updates even without the resv obj locked.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only for the debugger use case.
[why]
Avoid endless translation retries, after an invalid address access has
been issued to the GPU. Instead, the trap handler is forced to enter by
generating a no-retry-fault.
A s_trap instruction is inserted in the debugger case to let the wave to
enter trap handler to save context.
[how]
Intentionally using an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault, after a retry-fault happens. This is
only valid under compute context.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm-misc-next for 5.5:
UAPI Changes:
-dma-buf: Introduce and revert dma-buf heap (Andrew/John/Sean)
Cross-subsystem Changes:
- None
Core Changes:
-dma-buf: add dynamic mapping to allow exporters to choose dma_resv lock
state on mmap/munmap (Christian)
-vram: add prepare/cleanup fb helpers to vram helpers (Thomas)
-ttm: always keep bo's on the lru + ttm cleanups (Christian)
-sched: allow a free_job routine to sleep (Steven)
-fb_helper: remove unused drm_fb_helper_defio_init() (Thomas)
Driver Changes:
-bochs/hibmc/vboxvideo: Use new vram helpers for prepare/cleanup fb (Thomas)
-amdgpu: Implement dma-buf import/export without drm helpers (Christian)
-panfrost: Simplify devfreq integration in driver (Steven)
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Steven Price <steven.price@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031193015.GA243509@art_vandelay
The boolean variable pasid_mapping_needed is not initialized and
there are code paths that do not assign it any value before it is
is read later. Fix this by initializing pasid_mapping_needed to
false.
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 6817bf283b ("drm/amdgpu: grab the id mgr lock while accessing passid_mapping")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>