We really only need memcpy restore for objects that affect the
operability of the migrate context. That is, primarily the page-table
objects of the migrate VM.
Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early
restores using memcpy and a way to assign LMEM page-table object flags
to be used by the vms.
Restore objects without this flag with the gpu blitter and only objects
carrying the flag using TTM memcpy.
Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and
defer for a later audit which vms actually need it. Most importantly, user-
allocated vms with pinned page-table objects can be restored using the
blitter.
Performance-wise memcpy restore is probably as fast as gpu restore if not
faster, but using gpu restore will help tackling future restrictions in
mappable LMEM size.
v4:
- Don't mark the aliasing ppgtt page table flags for early resume, but
rather the ggtt page table flags as intended. (Matthew Auld)
- The check for user buffer objects during early resume is pointless, since
they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld)
v5:
- Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored
before we fire up the migrate context.
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-8-thomas.hellstrom@linux.intel.com
Previously, we were storing the ring size in the ring pointer before it
was actually allocated. We would then guard setting the ring size on
checking for CONTEXT_ALLOC_BIT. This is error-prone at best and really
only saves us a few bytes on something that already burns at least 4K.
Instead, this patch adds a new ring_size field and makes everything use
that.
v2 (Daniel Vetter):
- Replace 512 * SZ_4K with SZ_2M
v2 (Jason Ekstrand):
- Rebase on top of page migration code
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-3-jason@jlekstrand.net
This was done by the following semantic patch:
@@ expression i915; @@
- INTEL_GEN(i915)
+ GRAPHICS_VER(i915)
@@ expression i915; expression E; @@
- INTEL_GEN(i915) >= E
+ GRAPHICS_VER(i915) >= E
@@ expression dev_priv; expression E; @@
- !IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) != E
@@ expression dev_priv; expression E; @@
- IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) == E
@@
expression dev_priv;
expression from, until;
@@
- IS_GEN_RANGE(dev_priv, from, until)
+ IS_GRAPHICS_VER(dev_priv, from, until)
@def@
expression E;
identifier id =~ "^gen$";
@@
- id = GRAPHICS_VER(E)
+ ver = GRAPHICS_VER(E)
@@
identifier def.id;
@@
- id
+ ver
It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".
Cc: intel-gvt-dev@lists.freedesktop.org
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603165428.3625495-5-lucas.demarchi@intel.com
Logical Context is actually a big batch buffer consisting of multiple
LRI commands + saved registers. It comprises Ring Context (the first
0x50 dwords) and Engine Context. The registers defined in Engine Context
are command accessible, and safe to execute in VM Context.
This patch
1. stops copy Ring Context and only copys Engine Context from VM Context
2. audits VM Engine Contexts to disallow undesired LRIs
(if accessing registers out of Engine Context that hardware generates).
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Wang Zhi <zhi.a.wang@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201223034500.16983-1-yan.y.zhao@intel.com
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
One issue exposed after below commit with which the system will freeze
at suspend after vGPU is created (no need to activate the vGPU).
commit e6ba764802 ("drm/i915: Remove i915->kernel_context")
Old implementation pin the intel_context at setup_submission and
unpin it at clean_submission. So after some vGPU is created, the
intel_context is always pinned there although no workload using it.
It will then block i915 enter suspend state.
There is no need to pin it all the time. Pin/unpin it around workload
lifecycle is more reasonable. After GVT enabled suspend/resume, the
pinned intel_context will also get unpined when userspace put VM process
into suspend state since all workloads are retired, then it's safe to
unpin all intel_context for workloads created. So move the pin/unpin to
create_workload and destroy_workload, while still keep the
create/destroy in old place.
V2:
Rebase.
Fixes: e6ba764802 ("drm/i915: Remove i915->kernel_context")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201016054059.238371-1-colin.xu@intel.com
The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and that we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.
In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.
v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating plain struct pages full of
prior content.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Catch up with upstream, in particular to get c1e8d7c6a7 ("mmap locking
API: convert mmap_sem comments").
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
When CONFIG_DRM_I915_DEBUG_GEM is not set, clang warns:
drivers/gpu/drm/i915/gvt/scheduler.c:884:1: warning: function
'check_shadow_context_ppgtt' is not needed and will not be emitted
[-Wunneeded-internal-declaration]
check_shadow_context_ppgtt(struct execlist_ring_context *c, struct
intel_vgpu_mm *m)
^
1 warning generated.
This warning is similar to -Wunused-function but rather than warning
that the function is completely unused, it warns that it is used in some
expression within the file but that expression will be evaluated to a
constant or be optimized away in the final assembly, essentially making
it appeared used but really isn't. Usually, this happens when a function
or variable is only used in sizeof, where it will appear to be used but
will be evaluated at compile time and not be required to be emitted.
In this case, the function is only used in GEM_BUG_ON, which is defined
as BUILD_BUG_ON_INVALID, which intentionally follows this pattern. To
fix this warning, add __maybe_unused to make it clear that this is
intentional depending on the configuration.
Fixes: bec3df930f ("drm/i915/gvt: Support PPGTT table load command")
Link: https://github.com/ClangBuiltLinux/linux/issues/1027
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200516023545.3332334-1-natechancellor@gmail.com
(cherry picked from commit 993fa32eb3)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When CONFIG_DRM_I915_DEBUG_GEM is not set, clang warns:
drivers/gpu/drm/i915/gvt/scheduler.c:884:1: warning: function
'check_shadow_context_ppgtt' is not needed and will not be emitted
[-Wunneeded-internal-declaration]
check_shadow_context_ppgtt(struct execlist_ring_context *c, struct
intel_vgpu_mm *m)
^
1 warning generated.
This warning is similar to -Wunused-function but rather than warning
that the function is completely unused, it warns that it is used in some
expression within the file but that expression will be evaluated to a
constant or be optimized away in the final assembly, essentially making
it appeared used but really isn't. Usually, this happens when a function
or variable is only used in sizeof, where it will appear to be used but
will be evaluated at compile time and not be required to be emitted.
In this case, the function is only used in GEM_BUG_ON, which is defined
as BUILD_BUG_ON_INVALID, which intentionally follows this pattern. To
fix this warning, add __maybe_unused to make it clear that this is
intentional depending on the configuration.
Fixes: bec3df930f ("drm/i915/gvt: Support PPGTT table load command")
Link: https://github.com/ClangBuiltLinux/linux/issues/1027
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200516023545.3332334-1-natechancellor@gmail.com
As i915 won't allocate extra PDP for current default PML4 table,
so for 3-level ppgtt guest, we would hit kernel pointer access
failure on extra PDP pointers. So this trys to bypass that now.
It won't impact real shadow PPGTT setup, so guest context still
works.
This is verified on 4.15 guest kernel with i915.enable_ppgtt=1
to force on old aliasing ppgtt behavior.
Fixes: 4f15665ccb ("drm/i915: Add ppgtt to GVT GEM context")
Reviewed-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200506095918.124913-1-zhenyuw@linux.intel.com
The PPGTT in context image can be overridden by LRI cmd with another
PPGTT's pdps. In such case, the load mm is used instead of the one in
the context image. So we need to load its shadow mm in GVT and replace
ppgtt pointers in command.
This feature is used by guest IGD driver to share gfx VM between
different contexts. Verified by IGT "gem_ctx_clone" test.
v4:
- consolidate shadow mm handlers (Yan)
- fix cmd shadow mm pin error path
v3: (Zhenyu Wang)
- Cleanup PDP register offset check
- Add debug check for guest context ppgtt update
- Skip 3-level ppgtt guest handling code. The reason is that all
guests now use 4-level ppgtt table and the only left case for
3-level table is ancient aliasing ppgtt case. But those guest
kernel has no use of PPGTT LRI command. So 3-level ppgtt guest
for this feature becomes simply un-testable.
v2: (Zhenyu Wang)
- Change to list for handling possible multiple ppgtt table loads
in one submission. Make sure shadow mm is to replace for each one.
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200508031409.2562-1-zhenyuw@linux.intel.com
The bspec is confusing on the nature of the upper 32bits of the LRC
descriptor. Once upon a time, it said that it uses the upper 32b to
decide if it should perform a lite-restore, and so we must ensure that
each unique context submitted to HW is given a unique CCID [for the
duration of it being on the HW]. Currently, this is achieved by using
a small circular tag, and assigning every context submitted to HW a
new id. However, this tag is being cleared on repinning an inflight
context such that we end up re-using the 0 tag for multiple contexts.
To avoid accidentally clearing the CCID in the upper 32bits of the LRC
descriptor, split the descriptor into two dwords so we can update the
GGTT address separately from the CCID.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk
(cherry picked from commit 2632f174a2)
(cherry picked from commit a4b70fcc587860f4b972f68217d8ebebe295ec15)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The bspec is confusing on the nature of the upper 32bits of the LRC
descriptor. Once upon a time, it said that it uses the upper 32b to
decide if it should perform a lite-restore, and so we must ensure that
each unique context submitted to HW is given a unique CCID [for the
duration of it being on the HW]. Currently, this is achieved by using
a small circular tag, and assigning every context submitted to HW a
new id. However, this tag is being cleared on repinning an inflight
context such that we end up re-using the 0 tag for multiple contexts.
To avoid accidentally clearing the CCID in the upper 32bits of the LRC
descriptor, split the descriptor into two dwords so we can update the
GGTT address separately from the CCID.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk
The history of i915_vma_close() is confusing, as is its use. As the
lifetime of the i915_vma is currently bounded by the object it is
attached to, we needed a means of identify when a vma was no longer in
use by userspace (via the user's fd). This is further complicated by
that only ppgtt vma should be closed at the user's behest, as the ggtt
were always shared.
Now that we attach the vma to a lut on the user's context, the open
count does indicate how many unique and open context/vm are referencing
this vma from the user. As such, we can and should just use the
open_count to track when the vma is still in use by userspace.
It's a poor man's replacement for reference counting.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk
Software is not expected to populate engine context except when using
restore inhibit bit or golden state to initialize it for the first time.
Therefore, if a newly submitted guest context is the same as the last
shadowed one, no need to populate its engine context from guest again.
Currently using lrca + ring_context_gpa to identify whether two guest
contexts are the same.
The reason of why context id is not included as an identifier is that
i915 recently changed the code and context id is only unique for a
context when OA is enabled. And when OA is on, context id is generated
based on lrca. Therefore, in that case, if two contexts are of the same
lrca, they have identical context ids as well.
(This patch also works with old guest kernel like 4.20.)
for guest context, if its ggtt entry is modified after last context
shadowing, it is also deemed as not the same context as last shadowed one.
v7:
-removed local variable "valid". use the one in s->last_ctx diretly
v6:
-change type of lrca of last ctx to be u32. as currently it's all
protected by vgpu lock (Kevin Tian)
-reset valid of last ctx to false once it needs to be repopulated before
population completes successfully (Kevin Tian)
v5:
-merge all 3 patches into one patch (Zhenyu Wang)
v4:
- split the series into 3 patches.
- don't turn on optimization until last patch in this series (Kevin Tian)
- define lrca to be atomic in this patch rather than update its type in
the second patch (Kevin Tian)
v3: updated commit message to describe engine context and context id
clearly (Kevin Tian)
v2: rebased to 5.6.0-rc4+Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200417091334.32628-1-yan.y.zhao@intel.com
Drm specific drm_WARN* calls include device information in the
backtrace, so we know what device the warnings originate from.
Covert all the calls of WARN* with device specific drm_WARN*
variants in functions where drm_device struct pointer is readily
available.
The conversion was done automatically with below coccinelle semantic
patch. checkpatch errors/warnings are fixed manually.
@@
identifier func, T;
@@
func(struct intel_vgpu *T,...) {
+struct drm_i915_private *i915 = T->gvt->dev_priv;
<+...
(
-WARN(
+drm_WARN(&i915->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&i915->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&i915->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&i915->drm,
...)
)
...+>
}
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200220165507.16823-9-pankaj.laxminarayan.bharadiya@intel.com
Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).
Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.
GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk