If the job about to be submitted is a dma-fence job, update the current
execution mode of the hw engine group. This triggers an immediate suspend
of the exec queues running faulting long-running jobs.
If the job about to be submitted is a long-running job, kick a new worker
used to resume the exec queues running faulting long-running jobs once
the dma-fence jobs have completed.
v2: Kick the resume worker from exec IOCTL, switch to unordered workqueue,
destroy it after use (Matt Brost)
v3: Do not resume if no exec queue was suspended (Matt Brost)
v4: Squash commits (Matt Brost)
v5: Do not kick the worker when xe_vm_in_preempt_fence_mode (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-10-francois.dugast@intel.com
Provide a way to safely transition execution modes of the hw engine group
ahead of the actual execution. When necessary, either wait for running
jobs to complete or preempt them, thus ensuring mutual exclusion between
execution modes.
Unlike a mutex, the rw_semaphore used in this context allows multiple
submissions in the same mode.
v2: Use lockdep_assert_held_write, add annotations (Matt Brost)
v3: Fix kernel doc, remove redundant code (Matt Brost)
v4: Now that xe_hw_engine_group_suspend_faulting_lr_jobs can fail,
propagate the error to the caller (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-9-francois.dugast@intel.com
This is a required feature for faulting long running jobs not to be
submitted while dma fence jobs are running on the hw engine group.
v2: Switch to lockdep_assert_held_write in worker, get a proper reference
for the last fence (Matt Brost)
v3: Directly call dma_fence_put with the fence ref (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-8-francois.dugast@intel.com
Ensure we can safely take a ref of the exec queue's last fence from the
context of resuming jobs from the hw engine group. The locking requirements
differ from the general case, hence the introduction of this new function.
v2: Add kernel doc, rework the code to prevent code duplication
v3: Fix kernel doc, remove now unnecessary lockdep variants (Matt Brost)
v4: Remove new put function (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-7-francois.dugast@intel.com
This code section is the same as the body of
xe_exec_queue_last_fence_put_unlocked() so call the function instead and
remove duplicated code to make maintenance easier.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-6-francois.dugast@intel.com
This is a required feature for dma fence jobs to preempt faulting long
running jobs in order to ensure mutual exclusion on a given hw engine
group.
v2: Pipeline calls to suspend(q) and suspend_wait(q) to improve
efficiency, switch to lockdep_assert_held_write (Matt Brost)
v3: Return error on suspend_wait failure to propagate on the call stack
up to IOCTL (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-5-francois.dugast@intel.com
Add helpers to safely add and delete the exec queues attached to a hw
engine group, and make use them at the time of creation and destruction of
the exec queues. Keeping track of them is required to control the
execution mode of the hw engine group.
v2: Improve error handling and robustness, suspend exec queues created in
fault mode if group in dma-fence mode, init queue link (Matt Brost)
v3: Delete queue from hw engine group when it is destroyed by the user,
also clean up at the time of closing the file in case the user did
not destroy the queue
v4: Use correct list when checking if empty, do not add the queue if VM
is in xe_vm_in_preempt_fence_mode (Matt Brost)
v5: Remove unrelated newline, add checks and asserts for group, unwind on
suspend failure (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-4-francois.dugast@intel.com
Rely on wait_event_interruptible_timeout() to put the process to sleep
with TASK_INTERRUPTIBLE. It allows using this function in interruptible
context.
v2: Propagate error on wait_event_interruptible_timeout (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-3-francois.dugast@intel.com
A xe_hw_engine_group is a group of hw engines. Two hw engines belong to
the same xe_hw_engine_group if one hw engine cannot make progress while
the other is stuck on a page fault.
Typically, hw engines of the same group share some resources such as EUs,
but this really depends on the hardware configuration of the platforms.
The simple engines partitioning proposed here might be too conservative
but is intended to work for existing platforms. It can be optimized later
if more sets of independent engines are identified.
The hw engine groups are intended to be used in the context of faulting
long-running jobs submissions.
v2: Move to own files, improve error handling (Matt Brost)
v3: Fix build issue reported by CI, improve commit message (Matt Roper)
v4: Fix kernel doc
v5: Add switch case for XE_ENGINE_CLASS_OTHER
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-2-francois.dugast@intel.com
User binds map to engines with can fault, faults depend on user binds
completion, thus we can deadlock. Avoid this by using reserved copy
engine for user binds on faulting devices.
While we are here, normalize bind queue creation with a helper.
v2:
- Pass in extensions to bind queue creation (CI)
v3:
- s/resevered/reserved (Lucas)
- Fix NULL hwe check (Jonathan)
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816034033.53837-1-matthew.brost@intel.com
When steering MCR register ranges of type "DSS," the group_id and
instance_id values are calculated by dividing the DSS pool according to
the size of a gslice or cslice, depending on the platform. These values
haven't changed much on past platforms, so we've been able to hardcode
the proper divisor so far. However the layout may not be so fixed on
future platforms so the proper, future-proof way to determine this is by
using some of the attributes from the GuC's hwconfig table. The
hwconfig has two attributes reflecting the architectural maximum slice
and subslice counts (i.e., before any fusing is considered) that can be
used for the purposes of calculating MCR steering targets.
If the hwconfig is lacking the necessary values (which should only be
possible on older platforms before these attributes were added), we can
still fall back to the old hardcoded values. Going forward the hwconfig
is expected to always provide the information we need on newer
platforms, and any failure to do so will be considered a bug in the
firmware that will prevent us from switching to the buggy firmware
release.
It's worth noting that over time GuC's hwconfig has provided a couple
different keys with similar-sounding descriptions. For our purposes
here, we only trust the newer key "70" which has supplanted the
similarly-named key "2" that existed on older platforms.
Bspec: 73210
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815172602.2729146-4-matthew.d.roper@intel.com
Although the query uapi is the official way to get at the GuC's hwconfig
table contents, it's still useful to have a quick debugfs interface to
dump the table in a human-readable format while debugging the driver.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815172602.2729146-3-matthew.d.roper@intel.com
Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for
the display side. This resolves the current conflict for the
enable_display module parameter and allows further pending refactors.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Replace for_each_tile plus a check against primary tile with
for_each_remote_tile in tiles_fini. The latter macro does this for us.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816040208.62695-1-matthew.brost@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Exec_queue cleanup requires HW access, so we need to use devm instead of
drmm for it.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815230541.3828206-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Drmm actions are not the right ones to clean up BOs and we should use
devm instead. However, we can also instead just allocate the objects
using the managed_bo function, which will internally register the
correct cleanup call and therefore allows us to simplify the code.
While at it, switch to drmm_kzalloc for the GSC proxy allocation to
further simplify the cleanup.
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815230541.3828206-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Only set tile->mmio.regs to NULL if not the root tile in tile_fini. The
root tile mmio regs is setup ealier in MMIO init thus it should be set
to NULL in mmio_fini.
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809232830.3302251-1-matthew.brost@intel.com
An upcoming PXP patch will kill queues at runtime when a PXP
invalidation event occurs, so we need exec_queue_kill to be safe to call
multiple times.
v2: Add documentation (Matt B)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814205654.1716586-1-daniele.ceraolospurio@intel.com
This was fixed in commit b7dce525c4 ("drm/xe/queue: fix engine_class
bounds check"), but then re-introduced in commit 6f20fc0993 ("drm/xe:
Move and export xe_hw_engine lookup.") which should only be simple code
movement of the existing function.
Fixes: 6f20fc0993 ("drm/xe: Move and export xe_hw_engine lookup.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812141331.729843-2-matthew.auld@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Register STATELESS_COMPRESSION_CTRL should be considered
mcr register which should write to all slices as per
documentation.
Bspec: 71185
Fixes: ecabb5e6ce ("drm/xe/xe2: Add performance turning changes")
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-4-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Register GAMREQSTRM_CTRL should be considered mcr register
which should write to all slices as per documentation.
Bspec: 71185
Fixes: 01570b4469 ("drm/xe/bmg: implement Wa_16023588340")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-3-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
xe_gt_enable_host_l2_vram() is reading the XE2_GAMREQSTRM_CTRL register
that is currently missing the MCR annotation. However, just adding the
annotation doesn't work as this function is called before MCR handling
is initialized in xe_gt_mcr_init().
xe_gt_enable_host_l2_vram() is used to implement WA 16023588340 that
needs to be done as early as possible during initialization in order to
be effective since the MMIO writes impact it. In the failure scenario,
driver would simply not be able to bind successfully.
Moving xe_gt_enable_host_l2_vram() later, after MCR initialization is
done, only incurs a few additional HW accesses, particularly when
loading GuC for hwconfig. Binding/unbinding the driver 100 times in BMG
still works so it should be ok to start handling the WA a little bit
later. This is sufficient to allow adding the MCR annotation to
XE2_GAMREQSTRM_CTRL.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-2-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The different approach used by xe regarding the initialization of
display HW has been proved a great addition for early driver bring up:
core xe can be tested without having all the bits sorted out on the
display side.
On the other hand, the approach exposed by i915-display is to *actively*
disable the display by programming it if needed, i.e. if it was left
enabled by firmware. It also has its use to make sure the HW is actually
disabled and not wasting power.
However having both the way it is in xe doesn't expose a good interface
wrt module params. From modinfo:
disable_display:Disable display (default: false) (bool)
enable_display:Enable display (bool)
Rename enable_display to probe_display to try to convey the message that
the HW is being touched and improve the module param description. To
avoid confusion, the enable_display is renamed everywhere, not only in
the module param. New description for the parameters:
disable_display:Disable display (default: false) (bool)
probe_display:Probe display HW, otherwise it's left untouched (default: true) (bool)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240813141931.3141395-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Remove the xe parameter from the pde_encode_pat_index and
pte_encode_pat_index functions, as it is no longer used.
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240813104419.2958046-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Early in the development of Xe we identified an issue with SVG state
handling on DG2 and MTL (and later on Xe2 as well). In
commit 72ac304769 ("drm/xe: Emit SVG state on RCS during driver load
on DG2 and MTL") and commit fb24b858a2 ("drm/xe/xe2: Update SVG state
handling") we implemented our own workaround to prevent SVG state from
leaking from context A to context B in cases where context B never
issues a specific state setting.
The hardware teams have now created official workaround Wa_14019789679
to cover this issue. The workaround description only requires emitting
3DSTATE_MESH_CONTROL, since they believe that's the only SVG instruction
that would potentially remain unset by a context B, but still cause
notable issues if unwanted values were inherited from context A.
However since we already have a more extensive implementation that emits
the entire SVG state and prevents _any_ SVG state from unintentionally
leaking, we'll stick with our existing implementation just to be safe.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812181042.2013508-2-matthew.d.roper@intel.com
Parameterize clearing ccs and bo data in xe_migrate_clear() which higher
layers can utilize. This patch will be used later on when doing bo data
clear for igfx as well.
v2: Replace multiple params with flags in xe_migrate_clear (Matt B)
v3: s/CLEAR_BO_DATA_FLAG_*/XE_MIGRATE_CLEAR_FLAG_* and move to
xe_migrate.h. other nits(Matt B)
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809220347.25330-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Wa_14021821874 applies to xe2_hpg
V2(Himal):
- Use space after define
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812134117.813670-1-tejas.upadhyay@intel.com
Add stats for tlb invalidation count which can be viewed with per GT
stat debugfs file.
Example output:
cat /sys/kernel/debug/dri/0/gt0/stats
tlb_inval_count: 22
v2: fix #include order(Tejas)
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sai Gowtham Ch <sai.gowtham.ch@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240810191522.18616-2-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Add skeleton APIs for recording and printing various stats over
debugfs. This currently only added counter types stats which is backed
by atomic_t and wrapped with CONFIG_DRM_XE_STATS so this can be disabled
on production system.
v4: Rebase and other minor fixes (Matt)
v3: s/CONFIG_DRM_XE_STATS/CONFIG_DEBUG_FS(Lucas)
v2: add missing docs
Add boundary checks for stats id and other improvements (Michal)
Fix build when CONFIG_DRM_XE_STATS is disabled(Matt)
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sai Gowtham Ch <sai.gowtham.ch@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240810191522.18616-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Going forward, struct intel_display shall replace struct
drm_i915_private as the main display device data pointer type. Convert
intel_opregion.[ch] to struct intel_display.
v2:
- Fix declarations for !CONFIG_ACPI (Imre, kernel test robot)
- Pass encoder/connector directly to intel_display() (Imre)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/aef94503909bbbf95f0244dc382a4d4cd050b903.1723213547.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Log the address of the register that caused the timeout
interrupt by reading RMTIMEOUTREG_CAPTURE
--v2:
- Update RMTIMEOUTREG_CAPTURE naming (Suraj)
--v3:
- XeLpdp naming convention.
- Use if condition instead of else if
Signed-off-by: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240807142106.1270213-1-mitulkumar.ajitkumar.golani@intel.com
Use a dummy xe_debugfs_register() if debugfs is not enabled and move all
debugfs-related files under `ifeq ($(CONFIG_DEBUG_FS),y)` in the
Makefile. This is similar to what was done for display in
commit 439987f6f4 ("drm/xe: don't build debugfs files when
CONFIG_DEBUG_FS=n").
This removes the following warning while loading xe with
CONFIG_DEUBG_FS=n:
xe 0000:03:00.0: [drm] Create GT directory failed
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240808171121.2484237-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The critical section which requires the VM dma-resv is the call
xe_lrc_create in __xe_exec_queue_init. Move this lock to
__xe_exec_queue_init holding it just around xe_lrc_create. Not only is
good practice, this also fixes a locking double of the VM dma-resv in
the error paths of __xe_exec_queue_init as xe_lrc_put tries to acquire
this too resulting in a deadlock.
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240724152831.1848325-1-matthew.brost@intel.com
When validating VF config on the media GT, we may wrongly report
that VF is already partially configured on it, as we consider GGTT
and LMEM provisioning done on the primary GT (since both GGTT and
LMEM are tile-level resources, not a GT-level).
This will cause skipping a VF auto-provisioning on the media-GT and
in result will block a VF from successfully initialize that GT.
Fix that by considering GGTT and LMEM configurations only when
checking if a VF provisioning is complete, and omit GGTT and LMEM
when reporting empty/partial provisioning.
Fixes: 234670cea9 ("drm/xe/pf: Skip fair VFs provisioning if already provisioned")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806180516.618-1-michal.wajdeczko@intel.com
Mgag200's BMC connector tracks the status of an underlying physical
connector and updates the BMC status accordingly. This functionality
works around GNOME's settings app, which cannot handle multiple
outputs on the same CRTC.
The workaround is now obsolete as the VGA-BMC connector handles BMC
support internally. Hence, remove the driver's code and the BMC output
entirely.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240805130622.63458-6-tzimmermann@suse.de