linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-10 14:11:52 +00:00

Author	SHA1	Message	Date
Matthew Brost	501d943893	drm/xe: Update xe_sa to use xe_managed_bo_create_pin_map Preferred way to create kernel BOs is xe_managed_bo_create_pin_map, use it. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820172958.1095143-7-matthew.brost@intel.com	2024-08-23 09:54:32 -07:00
Matthew Brost	6eb2aad402	drm/xe: Move hw_engine_fini to devm managed Kernel BOs are destroyed with GGTT mappings, this is hardware interaction so use devm. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820172958.1095143-5-matthew.brost@intel.com	2024-08-23 09:54:13 -07:00
Matthew Brost	a323782567	drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini Not a big deal if CT is down as driver is unloading, no need to warn. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820172958.1095143-4-matthew.brost@intel.com	2024-08-23 09:54:12 -07:00
Matthew Brost	b5de6a5ced	drm/xe: Set firmware state to loadable before registering guc_fini_hw The guc_fini_hw registered calls __xe_uc_fw_status which is only expected to be called after initializing fw state. Move this before registering guc_fini_hw. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820172958.1095143-3-matthew.brost@intel.com	2024-08-23 09:54:11 -07:00
Matthew Brost	5b993d00d7	drm/xe: Move ggtt_fini to devm managed ggtt->scratch is destroyed via devm, ggtt_fini sets ggtt->scratch to NULL, ggtt->scratch in GGTT clears, so ensure ggtt->scratch is set NULL before the BO is destroyed. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820172958.1095143-2-matthew.brost@intel.com	2024-08-23 09:54:10 -07:00
Matthew Brost	25ebe10e3f	Revert "drm/xe: Invalidate media_gt TLBs in PT code" This reverts commit `40520283e0`. We can't install dma-fence-chain in timeline sync objs. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240823162207.2168887-1-matthew.brost@intel.com	2024-08-23 09:51:47 -07:00
Rodrigo Vivi	919bb54e98	drm/xe: Fix missing runtime outer protection for ggtt_remove_node Defer the ggtt node removal to a thread if runtime_pm is not active. The ggtt node removal can be called from multiple places, including places where we cannot protect with outer callers and places we are within other locks. So, try to grab the runtime reference if the device is already active, otherwise defer the removal to a separate thread from where we are sure we can wake the device up. v2: - use xe wq instead of system wq (Matt and CI) - Avoid GFP_KERNEL to be future proof since this removal can be called from outside our drivers and we don't want to block if atomic is needed. (Brost) v3: amend forgot chunk declaring xe_device. v4: Use a xe_ggtt_region to encapsulate the node and removal info, without the need for any memory allocation at runtime. v5: Actually fill the delayed_removal.invalidate (Brost) v6: - Ensure that ggtt_region is not freed before work finishes (Auld) - Own wq to ensure that the queued works are flushed before ggtt_fini (Brost) v7: also free ggtt_region on early !bound return (Auld) v8: Address the null deref (CI) v9: Based on the new xe_ggtt_node for the proper care of the lifetime of the object. v10: Redo the lost v5 change. (Brost) v11: Simplify the invalidate_on_remove (Lucas) Cc: Matthew Auld <matthew.auld@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-12-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:45 -04:00
Rodrigo Vivi	34e804220f	drm/xe: Make xe_ggtt_node struct independent In some rare cases, the drm_mm node cannot be removed synchronously due to runtime PM conditions. In this situation, the node removal will be delegated to a workqueue that will be able to wake up the device before removing the node. However, in this situation, the lifetime of the xe_ggtt_node cannot be restricted to the lifetime of the parent object. So, this patch introduces the infrastructure so the xe_ggtt_node struct can be allocated in advance and freed when needed. By having the ggtt backpointer, it also ensures that the init function is always called before any attempt to insert or reserve the node in the GGTT. v2: s/xe_ggtt_node_force_fini/xe_ggtt_node_fini and use it internally (Brost) v3: - Use GFP_NOFS for node allocation (CI) - Avoid ggtt argument, now that we have it inside the node (Lucas) - Fix some missed fini cases (CI) v4: - Fix SRIOV critical case where config->ggtt_region was lost (Michal) - Avoid ggtt argument also on removal (missed case on v3) (Michal) - Remove useless checks (Michal) - Return 0 instead of negative errno on a u32 addr. (Michal) - s/xe_ggtt_assign/xe_ggtt_node_assign for coherence, while we are touching it (Michal) v5: - Fix VFs' ggtt_balloon Cc: Matthew Auld <matthew.auld@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-11-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:45 -04:00
Rodrigo Vivi	15ca09499b	drm/xe: Refactor xe_ggtt balloon functions to make the node clear These operations are related to node. Convert them to the new appropriate name space xe_ggtt_node. v2: Also move arguments around for consistency (Lucas). v3: s/node_balloon/node_insert_balloon and s/node_deballoon/node_remove_balloon (Michal). Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-10-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:45 -04:00
Rodrigo Vivi	136367290e	drm/xe: Introduce xe_ggtt_print_holes Introduce a new xe_ggtt_print_holes helper that attends the SRIOV demand and finishes the goal of limiting drm_mm access to xe_ggtt. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-9-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:45 -04:00
Rodrigo Vivi	1144e0dff5	drm/xe: Introduce xe_ggtt_largest_hole Introduce a new xe_ggtt_largest_hole helper that attends the SRIOV demand and continue with the goal of limiting drm_mm access to xe_ggtt. v2: Fix a typo (Michal) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-8-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:45 -04:00
Rodrigo Vivi	8b5ccc9743	drm/xe: Limit drm_mm_node_allocated access to xe_ggtt_node Continue with the encapsulation of drm_mm_node inside xe_ggtt. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-7-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Rodrigo Vivi	0567f18e07	drm/xe: Rename xe_ggtt_node related functions Bring some consistency and prepare for more xe_ggtt_node related functions to be introduced. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-6-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Rodrigo Vivi	6062ea9398	drm/xe: Encapsulate drm_mm_node inside xe_ggtt_node The xe_ggtt component uses drm_mm to manage the GGTT. The drm_mm_node is just a node inside drm_mm, but in Xe we use that only in the GGTT context. So, this patch encapsulates the drm_mm_node into a xe_ggtt's new struct. This is the first step towards limiting all the drm_mm access through xe_ggtt. The ultimate goal is to have a better control of the node insertion and removal, so the removal can be delegated to a delayed workqueue. v2: Fix includes and typos (Michal and Brost) Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-5-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Rodrigo Vivi	6dbd43dced	drm/{i915, xe}: Avoid direct inspection of dpt_vma from outside dpt DPT code is so dependent on i915 vma implementation and it is not ported yet to Xe. This patch limits inspection to DPT's VMA struct to intel_dpt component only, so the Xe GGTT code can evolve. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-4-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Rodrigo Vivi	df99acc7ba	drm/xe: Remove unnecessary drm_mm.h includes These includes are no longer necessary, and where appropriate are replaced by the linux/types.h one. Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-3-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Rodrigo Vivi	244fe16663	drm/xe: Introduce GGTT documentation Document xe_ggtt and ensure it is part of the built kernel docs. v2: - Accepted all Michal's suggestions - Rebased on top of new set_pte per platform/wa function pointer v3: - Typos and other acronym fixes (Michal) Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> #v1 Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Rodrigo Vivi	69f0925c67	drm/xe: Removed unused xe_ggtt_printk Apparently this was only useful when enabling ggtt support for the very first time and never used again. It is also not useful now that we have the ggtt_dump available through debugfs. Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-22 14:00:44 -04:00
Matthew Auld	321d6b4b9c	drm/xe: fixup xe_alloc_pf_queue kzalloc expects number of bytes, therefore we should convert the number of dw into bytes, otherwise we are likely just accessing beyond the array causing all kinds of carnage. Also fixup the error handling while we are here. v2: - Prefer kcalloc (dim) Fixes: `3338e4f90c` ("drm/xe: Use topology to determine page fault queue size") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821171917.417386-2-matthew.auld@intel.com	2024-08-21 19:38:24 -07:00
Matthew Brost	40520283e0	drm/xe: Invalidate media_gt TLBs in PT code Testing on LNL has shown media GT's TLBs need to be invalidated via the GuC, update PT code appropriately. v2: - Do dma_fence_get before first call of invalidation_fence_init (Himal) - No need to check for valid chain fence (Himal) Fixes: `3330361543` ("drm/xe/lnl: Add LNL platform definition") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820161632.987369-1-matthew.brost@intel.com	2024-08-21 19:34:07 -07:00
Matthew Brost	77cc3f6c58	drm/xe: Invalidate media_gt TLBs Testing on LNL has shown media TLBs need to be invalidated via the GuC, update xe_vm_invalidate_vma appropriately. v2: Fix 2 tile case v3: Include missing local change Fixes: `3330361543` ("drm/xe/lnl: Add LNL platform definition") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820160129.986889-1-matthew.brost@intel.com	2024-08-21 08:53:50 -07:00
Matthew Brost	32a42c93b7	drm/xe: Free job before xe_exec_queue_put Free job depends on job->vm being valid, the last xe_exec_queue_put can destroy the VM. Prevent UAF by freeing job before xe_exec_queue_put. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820202309.1260755-1-matthew.brost@intel.com	2024-08-21 08:38:37 -07:00
Matthew Brost	60db6f540a	drm/xe: Drop HW fence pointer to HW fence ctx The HW fence ctx objects are not ref counted rather tied to the life of an LRC object. HW fences reference the HW fence ctx, HW fences can outlive LRCs thus resulting in UAF. Drop the HW fence pointer to HW fence ctx rather just store what is needed directly in HW fence. v2: - Fix typo in commit (Ashutosh) - Use snprintf (Ashutosh) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240815193522.16008-1-matthew.brost@intel.com	2024-08-20 13:06:00 -07:00
Stuart Summers	df2dbc925f	drm/xe/guc: Bump the G2H queue size to account for page faults With the increase in the size of the recoverable page fault queue, we want to ensure the initial messages from GuC in the G2H buffer have space while we transfer those out to the actual pf_queue. Bump the G2H queue size to account for this increase in the pf_queue size. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4c2b6974801bcffd8a010d838c8733fa4092573d.1723862633.git.stuart.summers@intel.com	2024-08-20 09:45:54 -07:00
Stuart Summers	3338e4f90c	drm/xe: Use topology to determine page fault queue size Currently the page fault queue size is hard coded. However the hardware supports faulting for each EU and each CS. For some applications running on hardware with a large number of EUs and CSs, this can result in an overflow of the page fault queue. Add a small calculation to determine the page fault queue size based on the number of EUs and CSs in the platform as determined by fuses. Signed-off-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/24d582a3b48c97793b8b6a402f34b4b469471636.1723862633.git.stuart.summers@intel.com	2024-08-20 09:45:51 -07:00
Stuart Summers	7586fc52b1	drm/xe: Fix missing workqueue destroy in xe_gt_pagefault On driver reload we never free up the memory for the pagefault and access counter workqueues. Add those destroy calls here. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c9a951505271dc3a7aee76de7656679f69c11518.1723862633.git.stuart.summers@intel.com	2024-08-20 09:40:30 -07:00
Nirmoy Das	2368306180	drm/xe/lnl: Offload system clear page activity to GPU On LNL because of flat CCS, driver creates migrates job to clear CCS meta data. Extend that to also clear system pages using GPU. Inform TTM to allocate pages without __GFP_ZERO to avoid double page clearing by clearing out TTM_TT_FLAG_ZERO_ALLOC flag and set TTM_TT_FLAG_CLEARED_ON_FREE while freeing to skip ttm pool's clear on free as XE now takes care of clearing pages. If a bo is in system placement such as BO created with DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING and there is a cpu map then for such BO gpu clear will be avoided as there is no dma mapping for such BO at that moment to create migration jobs. Tested this patch api_overhead_benchmark_l0 from https://github.com/intel/compute-benchmarks Without the patch: api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation: UsmMemoryAllocation(api=l0 type=Host size=4KB) 84.206 us UsmMemoryAllocation(api=l0 type=Host size=1GB) 105775.56 us Perf tool top 5 entries: 71.44% api_overhead_be [kernel.kallsyms] [k] clear_page_erms 6.34% api_overhead_be [kernel.kallsyms] [k] __pageblock_pfn_to_page 2.24% api_overhead_be [kernel.kallsyms] [k] cpa_flush 2.15% api_overhead_be [kernel.kallsyms] [k] pages_are_mergeable 1.94% api_overhead_be [kernel.kallsyms] [k] find_next_iomem_res With the patch: api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation: UsmMemoryAllocation(api=l0 type=Host size=4KB) 79.439 us UsmMemoryAllocation(api=l0 type=Host size=1GB) 98677.75 us Perf tool top 5 entries: 11.16% api_overhead_be [kernel.kallsyms] [k] __pageblock_pfn_to_page 7.85% api_overhead_be [kernel.kallsyms] [k] cpa_flush 7.59% api_overhead_be [kernel.kallsyms] [k] find_next_iomem_res 7.24% api_overhead_be [kernel.kallsyms] [k] pages_are_mergeable 5.53% api_overhead_be [kernel.kallsyms] [k] lookup_address_in_pgd_attr Without this patch clear_page_erms() dominates execution time which is also not pipelined with migration jobs. With this patch page clearing will get pipelined with migration job and will free CPU for more work. v2: Handle regression on dgfx(Himal) Update commit message as no ttm API changes needed. v3: Fix Kunit test. v4: handle data leak on cpu mmap(Thomas) v5: s/gpu_page_clear/gpu_page_clear_sys and move setting it to xe_ttm_sys_mgr_init() and other nits (Matt Auld) v6: Disable it when init_on_alloc and/or init_on_free is active(Matt) Use compute-benchmarks as reporter used it to report this allocation latency issue also a proper test application than mime. In v5, the test showed significant reduction in alloc latency but that is not the case any more, I think this was mostly because previous test was done on IFWI which had low mem BW from CPU. Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240816135154.19678-2-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>	2024-08-19 17:49:00 +02:00
Nirmoy Das	decbfaf06d	drm/ttm: Add a flag to allow drivers to skip clear-on-free Add TTM_TT_FLAG_CLEARED_ON_FREE, which DRM drivers can set before releasing backing stores if they want to skip clear-on-free. Cc: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240816135154.19678-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>	2024-08-19 17:49:00 +02:00
Thorsten Blum	6841a26e2c	drm/xe/oa: Use vma_pages() helper function in xe_oa_mmap() Use the vma_pages() helper function and remove the following Coccinelle/coccicheck warning reported by vma_pages.cocci: WARNING: Consider using vma_pages helper on vma Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240819095751.539645-2-thorsten.blum@toblux.com	2024-08-19 07:41:56 -07:00
Maarten Lankhorst	cb8f81c175	drm/xe/display: Make display suspend/resume work on discrete We should unpin before evicting all memory, and repin after GT resume. This way, we preserve the contents of the framebuffers, and won't hang on resume due to migration engine not being restored yet. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Uma Shankar <uma.shankar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240806105044.596842-3-maarten.lankhorst@linux.intel.com Signed-off-by: Maarten Lankhorst,,, <maarten.lankhorst@linux.intel.com>	2024-08-19 15:17:04 +02:00
Maarten Lankhorst	492be2a070	drm/xe/display: Match i915 driver suspend/resume sequences better Suspend fbdev sooner, and disable user access before suspending to prevent some races. I've noticed this when comparing xe suspend to i915's. Matches the following commits from i915: `24b412b1bf` ("drm/i915: Disable intel HPD poll after DRM poll init/enable") `1ef28d86be` ("drm/i915: Suspend the framebuffer console earlier during system suspend") `bd738d859e` ("drm/i915: Prevent modesets during driver init/shutdown") Thanks to Imre for pointing me to those commits. Driver shutdown is currently missing, but I have some idea how to implement it next. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240806105044.596842-2-maarten.lankhorst@linux.intel.com Signed-off-by: Maarten Lankhorst,,, <maarten.lankhorst@linux.intel.com>	2024-08-19 15:16:52 +02:00
Matthew Auld	7116c35aac	drm/xe: prevent UAF around preempt fence The fence lock is part of the queue, therefore in the current design anything locking the fence should then also hold a ref to the queue to prevent the queue from being freed. However, currently it looks like we signal the fence and then drop the queue ref, but if something is waiting on the fence, the waiter is kicked to wake up at some later point, where upon waking up it first grabs the lock before checking the fence state. But if we have already dropped the queue ref, then the lock might already be freed as part of the queue, leading to uaf. To prevent this, move the fence lock into the fence itself so we don't run into lifetime issues. Alternative might be to have device level lock, or only release the queue in the fence release callback, however that might require pushing to another worker to avoid locking issues. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2454 References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2342 References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2020 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240814110129.825847-2-matthew.auld@intel.com	2024-08-19 12:38:09 +01:00
Nirmoy Das	6b77dab5da	drm/xe: Remove redundant param from xe_bo_create_user BO from xe_bo_create_user() will always be of type, ttm_bo_type_device. So remove that redundant parameter. Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240816102248.25628-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>	2024-08-19 09:38:16 +02:00
Francois Dugast	4099cfda9d	drm/xe/device: Remove unused xe_device::usm::num_vm_in_* Those counters were used to keep track of the numbers VMs in fault mode and in non-fault mode, to determine if the whole device was in fault mode or not. This is no longer needed so remove those variables and their usages. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-12-francois.dugast@intel.com	2024-08-17 18:31:59 -07:00
Francois Dugast	226d92e49a	drm/xe/vm: Remove restriction that all VMs must be faulting if one is With this restriction, all VMs on the device must be faulting VMs if there is already one faulting VM, in which case the device is considered in fault mode. This prevents for example an application from running 3D jobs for the compositor while submitting a SVM compute job on the same device. Now that mutual exclusion of faulting LR jobs and dma fence jobs is ensured on the hw engine group, remove this restriction to allow running faulting and non-faulting VMs on the same device. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-11-francois.dugast@intel.com	2024-08-17 18:31:58 -07:00
Francois Dugast	d16ef1a18e	drm/xe/exec: Switch hw engine group execution mode upon job submission If the job about to be submitted is a dma-fence job, update the current execution mode of the hw engine group. This triggers an immediate suspend of the exec queues running faulting long-running jobs. If the job about to be submitted is a long-running job, kick a new worker used to resume the exec queues running faulting long-running jobs once the dma-fence jobs have completed. v2: Kick the resume worker from exec IOCTL, switch to unordered workqueue, destroy it after use (Matt Brost) v3: Do not resume if no exec queue was suspended (Matt Brost) v4: Squash commits (Matt Brost) v5: Do not kick the worker when xe_vm_in_preempt_fence_mode (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-10-francois.dugast@intel.com	2024-08-17 18:31:57 -07:00
Francois Dugast	770bd1d341	drm/xe/hw_engine_group: Ensure safe transition between execution modes Provide a way to safely transition execution modes of the hw engine group ahead of the actual execution. When necessary, either wait for running jobs to complete or preempt them, thus ensuring mutual exclusion between execution modes. Unlike a mutex, the rw_semaphore used in this context allows multiple submissions in the same mode. v2: Use lockdep_assert_held_write, add annotations (Matt Brost) v3: Fix kernel doc, remove redundant code (Matt Brost) v4: Now that xe_hw_engine_group_suspend_faulting_lr_jobs can fail, propagate the error to the caller (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-9-francois.dugast@intel.com	2024-08-17 18:31:56 -07:00
Francois Dugast	2750ff97ee	drm/xe/hw_engine_group: Add helper to wait for dma fence jobs This is a required feature for faulting long running jobs not to be submitted while dma fence jobs are running on the hw engine group. v2: Switch to lockdep_assert_held_write in worker, get a proper reference for the last fence (Matt Brost) v3: Directly call dma_fence_put with the fence ref (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-8-francois.dugast@intel.com	2024-08-17 18:31:55 -07:00
Francois Dugast	0d92cd8935	drm/xe/exec_queue: Prepare last fence for hw engine group resume context Ensure we can safely take a ref of the exec queue's last fence from the context of resuming jobs from the hw engine group. The locking requirements differ from the general case, hence the introduction of this new function. v2: Add kernel doc, rework the code to prevent code duplication v3: Fix kernel doc, remove now unnecessary lockdep variants (Matt Brost) v4: Remove new put function (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-7-francois.dugast@intel.com	2024-08-17 18:31:54 -07:00
Francois Dugast	7f0d7bee20	drm/xe/exec_queue: Remove duplicated code This code section is the same as the body of xe_exec_queue_last_fence_put_unlocked() so call the function instead and remove duplicated code to make maintenance easier. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-6-francois.dugast@intel.com	2024-08-17 18:31:53 -07:00
Francois Dugast	53fdfa19e6	drm/xe/hw_engine_group: Add helper to suspend faulting LR jobs This is a required feature for dma fence jobs to preempt faulting long running jobs in order to ensure mutual exclusion on a given hw engine group. v2: Pipeline calls to suspend(q) and suspend_wait(q) to improve efficiency, switch to lockdep_assert_held_write (Matt Brost) v3: Return error on suspend_wait failure to propagate on the call stack up to IOCTL (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-5-francois.dugast@intel.com	2024-08-17 18:31:52 -07:00
Francois Dugast	7970cb3696	drm/xe/hw_engine_group: Register hw engine group's exec queues Add helpers to safely add and delete the exec queues attached to a hw engine group, and make use them at the time of creation and destruction of the exec queues. Keeping track of them is required to control the execution mode of the hw engine group. v2: Improve error handling and robustness, suspend exec queues created in fault mode if group in dma-fence mode, init queue link (Matt Brost) v3: Delete queue from hw engine group when it is destroyed by the user, also clean up at the time of closing the file in case the user did not destroy the queue v4: Use correct list when checking if empty, do not add the queue if VM is in xe_vm_in_preempt_fence_mode (Matt Brost) v5: Remove unrelated newline, add checks and asserts for group, unwind on suspend failure (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-4-francois.dugast@intel.com	2024-08-17 18:31:51 -07:00
Francois Dugast	3dc6da76ae	drm/xe/guc_submit: Make suspend_wait interruptible Rely on wait_event_interruptible_timeout() to put the process to sleep with TASK_INTERRUPTIBLE. It allows using this function in interruptible context. v2: Propagate error on wait_event_interruptible_timeout (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-3-francois.dugast@intel.com	2024-08-17 18:31:50 -07:00
Francois Dugast	f784750c67	drm/xe/hw_engine_group: Introduce xe_hw_engine_group A xe_hw_engine_group is a group of hw engines. Two hw engines belong to the same xe_hw_engine_group if one hw engine cannot make progress while the other is stuck on a page fault. Typically, hw engines of the same group share some resources such as EUs, but this really depends on the hardware configuration of the platforms. The simple engines partitioning proposed here might be too conservative but is intended to work for existing platforms. It can be optimized later if more sets of independent engines are identified. The hw engine groups are intended to be used in the context of faulting long-running jobs submissions. v2: Move to own files, improve error handling (Matt Brost) v3: Fix build issue reported by CI, improve commit message (Matt Roper) v4: Fix kernel doc v5: Add switch case for XE_ENGINE_CLASS_OTHER Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-2-francois.dugast@intel.com	2024-08-17 18:31:47 -07:00
Matthew Brost	852856e3b6	drm/xe: Use reserved copy engine for user binds on faulting devices User binds map to engines which can fault, faults depend on user binds completion, thus we can deadlock. Avoid this by using reserved copy engine for user binds on faulting devices. While we are here, normalize bind queue creation with a helper. v2: - Pass in extensions to bind queue creation (CI) v3: - s/resevered/reserved (Lucas) - Fix NULL hwe check (Jonathan) Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240816034033.53837-1-matthew.brost@intel.com	2024-08-17 14:12:19 -07:00
Matt Roper	fc7c7498db	drm/xe/mcr: Try to derive dss_per_grp from hwconfig attributes When steering MCR register ranges of type "DSS," the group_id and instance_id values are calculated by dividing the DSS pool according to the size of a gslice or cslice, depending on the platform. These values haven't changed much on past platforms, so we've been able to hardcode the proper divisor so far. However the layout may not be so fixed on future platforms so the proper, future-proof way to determine this is by using some of the attributes from the GuC's hwconfig table. The hwconfig has two attributes reflecting the architectural maximum slice and subslice counts (i.e., before any fusing is considered) that can be used for the purposes of calculating MCR steering targets. If the hwconfig is lacking the necessary values (which should only be possible on older platforms before these attributes were added), we can still fall back to the old hardcoded values. Going forward the hwconfig is expected to always provide the information we need on newer platforms, and any failure to do so will be considered a bug in the firmware that will prevent us from switching to the buggy firmware release. It's worth noting that over time GuC's hwconfig has provided a couple different keys with similar-sounding descriptions. For our purposes here, we only trust the newer key "70" which has supplanted the similarly-named key "2" that existed on older platforms. Bspec: 73210 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240815172602.2729146-4-matthew.d.roper@intel.com	2024-08-16 11:07:13 -07:00
Matt Roper	8a0f58ec47	drm/xe: Add debugfs to dump GuC's hwconfig Although the query uapi is the official way to get at the GuC's hwconfig table contents, it's still useful to have a quick debugfs interface to dump the table in a human-readable format while debugging the driver. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240815172602.2729146-3-matthew.d.roper@intel.com	2024-08-16 11:07:11 -07:00
Lucas De Marchi	ed7171ff9f	Merge drm/drm-next into drm-xe-next Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for the display side. This resolves the current conflict for the enable_display module parameter and allows further pending refactors. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-08-16 10:33:54 -07:00
Matthew Brost	db3461a774	drm/xe: Use for_each_remote_tile rather than manual check Replace for_each_tile plus a check against primary tile with for_each_remote_tile in tiles_fini. The latter macro does this for us. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240816040208.62695-1-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-08-16 09:31:18 -07:00
Daniele Ceraolo Spurio	5a891a0e69	drm/xe/uc: Use devm to register cleanup that includes exec_queues Exec_queue cleanup requires HW access, so we need to use devm instead of drmm for it. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240815230541.3828206-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-08-16 09:15:04 -07:00


1295257 Commits