Since the introduction of 'soft-rc6', we aim to park the device quickly
and that results in frequent idling of the whole device. Currently upon
idling we free the batch buffer pool, and so this renders the cache
ineffective for many workloads. If we want to have an effective cache of
recently allocated buffers available for reuse, we need to decouple that
cache from the engine powermanagement and make it timer based. As there
is no reason then to keep it within the engine (where it once made
retirement order easier to track), we can move it up the hierarchy to the
owner of the memory allocations.
v2: Hook up to debugfs/drop_caches to clear the cache on demand.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
We need to keep the default context state around to instantiate new
contexts (aka golden rendercontext), and we also keep it pinned while
the engine is active so that we can quickly reset a hanging context.
However, the default contexts are large enough to merit keeping in
swappable memory as opposed to kernel memory, so we store them inside
shmemfs. Currently, we use the normal GEM objects to create the default
context image, but we can throw away all but the shmemfs file.
This greatly simplifies the tricky power management code which wants to
run underneath the normal GT locking, and we definitely do not want to
use any high level objects that may appear to recurse back into the GT.
Though perhaps the primary advantage of the complex GEM object is that
we aggressively cache the mapping, but here we are recreating the
vm_area everytime time we unpark. At the worst, we add a lightweight
cache, but first find a microbenchmark that is impacted.
Having started to create some utility functions to make working with
shmemfs objects easier, we can start putting them to wider use, where
GEM objects are overkill, such as storing persistent error state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
CONFIG_AS_MOVNTDQA was introduced by commit 0b1de5d58e ("drm/i915:
Use SSE4.1 movntdqa to accelerate reads from WC memory").
We raise the minimal supported binutils version from time to time.
The last bump was commit 1fb12b35e5 ("kbuild: Raise the minimum
required binutils version to 2.21").
I confirmed the code in $(call as-instr,...) can be assembled by the
binutils 2.21 assembler and also by LLVM integrated assembler.
Remove CONFIG_AS_MOVNTDQA, which is always defined.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
A little bit of history :
Back when i915-perf was introduced (4.13), there was no way to
dynamically add new OA configurations to i915. Only the generated
configs baked in at build time were allowed.
It quickly became obvious that we would need to allow applications
to upload their own configurations, for instance to be able to test
new ones, and so by the next stable version (4.14) we added uAPIs
to allow uploading new configurations.
When adding that capability, we took the opportunity to remove most
HW configurations except the TestOa one which is a configuration
IGT would rely on to verify that the HW is outputting correct
values. At the time it made sense to have that confiuration in at
the same time a given HW platform added to the i915-perf driver.
Now that IGT has become the reference point for HW configurations (see
commit 53f8f541ca ("lib: Add i915_perf library"), previously this was
located in the GPUTop repository), the need for having those
configurations in i915-perf is gone.
On the Mesa side, we haven't relied on this test configuration for a
while. The MDAPI library always required 4.14 feature level and always
loaded its configuration into i915.
I'm sure nobody will miss this generated stuff in i915 :)
v2: Fix selftests by creating an empty config
v3: Fix unlocking on allocation error (Dan Carpenter)
v4: Fixup checkpatch warnings
v5: Fix incorrect unlock in error path (Umesh)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-1-lionel.g.landwerlin@intel.com
On gen7 and gen7.5 devices, there could be leftover data residuals in
EU/L3 from the retiring context. This patch introduces workaround to clear
that residual contexts, by submitting a batch buffer with dedicated HW
context to the GPU with ring allocation for each context switching.
This security mitigation changes does not triggers any performance
regression. Performance is on par with current drm-tips.
v2: Add igt generated header file for CB kernel assembled with Mesa tool
and addressed use of Kernel macro for ptr_align comment.
v3: Resolve Sparse warnings with newly generated, and imported CB
kernel.
v4: Include new igt generated CB kernel for gen7 and gen7.5. Also
add code formatting and compiler warnings changes (Chris Wilson)
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Balestrieri Francesco <francesco.balestrieri@intel.com>
Cc: Bloomfield Jon <jon.bloomfield@intel.com>
Cc: Dutt Sudeep <sudeep.dutt@intel.com>
Acked-by: Chris Wilson <chris@chris-wilso.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200306000957.2836150-2-chris@chris-wilson.co.uk
Preliminary stub to add engines underneath /sys/class/drm/cardN/, so
that we can expose properties on each engine to the sysadmin.
To start with we have basic analogues of the i915_query ioctl so that we
can pretty print engine discovery from the shell, and flesh out the
directory structure. Later we will add writeable sysadmin properties such
as per-engine timeout controls.
An example tree of the engine properties on Braswell:
/sys/class/drm/card0
└── engine
├── bcs0
│ ├── capabilities
│ ├── class
│ ├── instance
│ ├── known_capabilities
│ └── name
├── rcs0
│ ├── capabilities
│ ├── class
│ ├── instance
│ ├── known_capabilities
│ └── name
├── vcs0
│ ├── capabilities
│ ├── class
│ ├── instance
│ ├── known_capabilities
│ └── name
└── vecs0
├── capabilities
├── class
├── instance
├── known_capabilities
└── name
v2: Include stringified capabilities
v3: Include all known capabilities for futureproofing.
v4: Combine the two caps loops into one
v5: Hide underneath Kconfig.unstable for wider discussion
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com>
Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-1-chris@chris-wilson.co.uk
$(CC) with $(CFLAGS_GCOV) assumes the output filename with .gcno suffix
appended is writable. This is not the case when the output filename is
/dev/null:
HDRTEST drivers/gpu/drm/i915/display/intel_frontbuffer.h
/dev/null:1:0: error: cannot open /dev/null.gcno
HDRTEST drivers/gpu/drm/i915/display/intel_ddi.h
/dev/null:1:0: error: cannot open /dev/null.gcno
make[5]: *** [../drivers/gpu/drm/i915/Makefile:307:
drivers/gpu/drm/i915/display/intel_ddi.hdrtest] Error 1
make[5]: *** Waiting for unfinished jobs....
make[5]: *** [../drivers/gpu/drm/i915/Makefile:307:
drivers/gpu/drm/i915/display/intel_frontbuffer.hdrtest] Error 1
Filter out $(CFLAGS_GVOC) from the header test $(c_flags) as they don't
make sense here anyway.
References: http://lore.kernel.org/r/d8112767-4089-4c58-d7d3-2ce03139858a@infradead.org
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: c6d4a099a2 ("drm/i915: reimplement header test feature")
Cc: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221105414.14358-1-jani.nikula@intel.com
Our current global state handling is pretty ad-hoc. Let's try to
make it better by imitating the standard drm core private object
approach.
The reason why we don't want to directly use the private objects
is locking; Each private object has its own lock so if we
introduce any global private objects we get serialized by that
single lock across all pipes. The global state apporoach instead
uses a read/write lock type of approach where each individual
crtc lock counts as a read lock, and grabbing all the crtc locks
allows one write access.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200120174728.21095-15-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Add a debugfs subdirectory i915_params with all the i915 module
parameters. This is a first step, with lots of boilerplate, and not much
benefit yet.
This will result in a new device specific debugfs directory at
/sys/kernel/debug/dri/<N>/i915_params duplicating the module specific
sysfs directory at /sys/module/i915/parameters/. Going forward, all
users of the parameters should use the debugfs, with the module
parameters being phased out.
Add debugfs permissions to I915_PARAMS_FOR_EACH(). This duplicates the
mode with module parameter sysfs, but the goal is to make the module
parameters read-only initial values for device specific parameters.
0 mode will bypass debugfs creation. Use it for verbose_state_checks
which will need special attention in follow-up work.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/600101c8433e7caf9303663fc85a9972fa1f05e7.1575560168.git.jani.nikula@intel.com
The design of the OA unit has been split into several units. We now
have a global unit (OAG) and a render specific unit (OAR). This leads
to some changes on how we program things. Some details :
OAR:
- has its own set of counter registers, they are per-context
saved/restored
- counters are not written to the circular OA buffer
- a snapshot of the counters can be acquired with
MI_RECORD_PERF_COUNT, or a single counter can be read with
MI_STORE_REGISTER_MEM.
OAG:
- has global counters that increment across context switches
- counters are written into the circular OA buffer (if requested)
v2: Fix checkpatch warnings on code style (Lucas)
v3: (Umesh)
- Update register from which tail, status and head are read
- Update logic to sample context reports
- Update whitelist mux and b counter regs
v4: Fix a bug when updating context image for new contexts (Umesh)
v5: Squash patch enabling save/restore of counters into context image
We want this so we can preempt performance queries and keep the
system responsive even when long running queries are ongoing. We
avoid doing it for all contexts.
- use LRI to modify context control (Chris)
- use MASKED_FIELD to program just the masked bits (Chris)
- disable save/restore of counters on cleanup (Chris)
v6: Do not use implicit parameters (Chris)
BSpec: 28727, 30021
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Chris Wilson <chris.p.wilson@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025193746.47155-2-umesh.nerlige.ramappa@intel.com
To flush idle barriers, and even inflight requests, we want to send a
preemptive 'pulse' along an engine. We use a no-op request along the
pinned kernel_context at high priority so that it should run or else
kick off the stuck requests. We can use this to ensure idle barriers are
immediately flushed, as part of a context cancellation mechanism, or as
part of a heartbeat mechanism to detect and reset a stuck GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021174339.5389-1-chris@chris-wilson.co.uk
UAPI Changes:
- Never allow userptr into the mappable GGTT (Chris)
No existing users. Avoid anyone from even trying to
spare a deadlock scenario.
Cross-subsystem Changes:
Core Changes:
Driver Changes:
- Eliminate struct_mutex use as BKL! (Chris)
Only used for execbuf serialisation.
- Initialize DDI TC and TBT ports (D-I) on Tigerlake (Lucas)
- Fix DKL link training for 2.7GHz and 1.62GHz (Jose)
- Add Tigerlake DKL PHY programming sequences (Clinton)
- Add Tigerlake Thunderbolt PLL divider values (Imre)
- drm/i915: Use helpers for drm_mm_node booleans (Chris)
- Restrict L3 remapping sysfs interface to dwords (Chris)
- Fix audio power up sequence for gen10+ display (Kai)
- Skip redundant execlist resubmission (Chris)
- Only unwedge if we can reset GPU first (Chris)
- Initialise breadcrumb lists on the virtual engine (Chris)
- Don't rely on kernel context existing during early errors (Matt A)
- Update Icelake+ MG_DP_MODE programming table (Clinton)
- Update DMC firmware for Icelake (Anusha)
- Downgrade DP MST error after unplugging TypeC cable (Srinivasan)
- Limit MST modes based on plane size too (Ville)
- Polish intel_tv_mode_valid() (Ville)
- Fix g4x sprite scaling stride check with GTT remapping (Ville)
- Don't advertize non-exisiting crtcs (Ville)
- Clean up encoder->crtc_mask setup (Ville)
- Use tc_port instead of port parameter to MG registers (Jose)
- Remove static variable for aux last status (Jani)
- Implement a better i945gm vblank irq vs. C-states workaround (Ville)
- Make the object creation interface consistent (CQ)
- Rename intel_vga_msr_write() to intel_vga_reset_io_mem() (Jani, Ville)
- Eliminate previous drm_dbg/drm_err usage (Jani)
- Move gmbus setup down to intel_modeset_init() (Jani)
- Abstract all vgaarb access to intel_vga.[ch] (Jani)
- Split out i915_switcheroo.[ch] from i915_drv.c (Jani)
- Use intel_gt in has_reset* (Chris)
- Eliminate return value for i915_gem_init_early (Matt A)
- Selftest improvements (Chris)
- Update HuC firmware header version number format (Daniele)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191007134801.GA24313@jlahtine-desk.ger.corp.intel.com
This patch adds a function, which will internally get the gem buffer
for DSB engine. The GEM buffer is from global GTT, and is mapped into
CPU domain, contains the data + opcode to be feed to DSB engine.
v1: Initial version.
v2:
- removed some unwanted code. (Chris)
- Used i915_gem_object_create_internal instead of _shmem. (Chris)
- cmd_buf_tail removed and can be derived through vma object. (Chris)
v3: vma realeased if i915_gem_object_pin_map() failed. (Shashank)
v4: for simplification and based on current usage added single dsb
object in intel_crtc. (Shashank)
v5: seting NULL to cmd_buf moved outside of mutex in dsb-put(). (Shashank)
v6:
- refcount machanism added.
- Used atomic_add_return and atomic_dec_and_test instead of
atomic_inc and atomic_dec. (Jani)
Cc: Imre Deak <imre.deak@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
[Jani: added #include <linux/types.h> while pushing]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190920115930.27829-3-animesh.manna@intel.com
Pull Kbuild updates from Masahiro Yamada:
- add modpost warn exported symbols marked as 'static' because 'static'
and EXPORT_SYMBOL is an odd combination
- break the build early if gold linker is used
- optimize the Bison rule to produce .c and .h files by a single
pattern rule
- handle PREEMPT_RT in the module vermagic and UTS_VERSION
- warn CONFIG options leaked to the user-space except existing ones
- make single targets work properly
- rebuild modules when module linker scripts are updated
- split the module final link stage into scripts/Makefile.modfinal
- fix the missed error code in merge_config.sh
- improve the error message displayed on the attempt of the O= build in
unclean source tree
- remove 'clean-dirs' syntax
- disable -Wimplicit-fallthrough warning for Clang
- add CONFIG_CC_OPTIMIZE_FOR_SIZE_O3 for ARC
- remove ARCH_{CPP,A,C}FLAGS variables
- add $(BASH) to run bash scripts
- change *CFLAGS_<basetarget>.o to take the relative path to $(obj)
instead of the basename
- stop suppressing Clang's -Wunused-function warnings when W=1
- fix linux/export.h to avoid genksyms calculating CRC of trimmed
exported symbols
- misc cleanups
* tag 'kbuild-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (63 commits)
genksyms: convert to SPDX License Identifier for lex.l and parse.y
modpost: use __section in the output to *.mod.c
modpost: use MODULE_INFO() for __module_depends
export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols
export.h: remove defined(__KERNEL__), which is no longer needed
kbuild: allow Clang to find unused static inline functions for W=1 build
kbuild: rename KBUILD_ENABLE_EXTRA_GCC_CHECKS to KBUILD_EXTRA_WARN
kbuild: refactor scripts/Makefile.extrawarn
merge_config.sh: ignore unwanted grep errors
kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj)
modpost: add NOFAIL to strndup
modpost: add guid_t type definition
kbuild: add $(BASH) to run scripts with bash-extension
kbuild: remove ARCH_{CPP,A,C}FLAGS
kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC
kbuild: Do not enable -Wimplicit-fallthrough for clang for now
kbuild: clean up subdir-ymn calculation in Makefile.clean
kbuild: remove unneeded '+' marker from cmd_clean
kbuild: remove clean-dirs syntax
kbuild: check clean srctree even earlier
...
Kbuild provides per-file compiler flag addition/removal:
CFLAGS_<basetarget>.o
CFLAGS_REMOVE_<basetarget>.o
AFLAGS_<basetarget>.o
AFLAGS_REMOVE_<basetarget>.o
CPPFLAGS_<basetarget>.lds
HOSTCFLAGS_<basetarget>.o
HOSTCXXFLAGS_<basetarget>.o
The <basetarget> is the filename of the target with its directory and
suffix stripped.
This syntax comes into a trouble when two files with the same basename
appear in one Makefile, for example:
obj-y += foo.o
obj-y += dir/foo.o
CFLAGS_foo.o := <some-flags>
Here, the <some-flags> applies to both foo.o and dir/foo.o
The real world problem is:
scripts/kconfig/util.c
scripts/kconfig/lxdialog/util.c
Both files are compiled into scripts/kconfig/mconf, but only the
latter should be given with the ncurses flags.
It is more sensible to use the relative path to the Makefile, like this:
obj-y += foo.o
CFLAGS_foo.o := <some-flags>
obj-y += dir/foo.o
CFLAGS_dir/foo.o := <other-flags>
At first, I attempted to replace $(basetarget) with $*. The $* variable
is replaced with the stem ('%') part in a pattern rule. This works with
most of cases, but does not for explicit rules.
For example, arch/ia64/lib/Makefile reuses rule_as_o_S in its own
explicit rules, so $* will be empty, resulting in ignoring the per-file
AFLAGS.
I introduced a new variable, target-stem, which can be used also from
explicit rules.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Marc Zyngier <maz@kernel.org>