Inspired by Tvrtko's critique of the reaping of the stale contexts
before allocating a new one, also limit the freed object reaping to the
oldest stale object before allocating a fresh object. Unlike contexts,
objects may have radically different sizes of backing storage, but
similar to contexts, while we want to prevent starvation due to
excessive freed lists, we also do not want to delay fresh allocations
for too long. Only freeing the oldest on the freed object list before
each allocation is a reasonable compromise.
v2: Only a single consumer of llist_del_first() is allowed (although
multiple llist_add are still allowed in parallel). Unlike
i915_gem_context, i915_gem_flush_free_objects() is itself not serialized
and so we need to add our own spinlock. Otherwise KASAN eventually spots
a use-after-free for the race on *first->next.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-8-chris@chris-wilson.co.uk
Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61 ("drm/i915/shrinker: Flush active on objects before
counting")).
v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
With preemption, we will want to "unsubmit" a request, taking it back
from the hw and returning it to the priority sorted execution list. In
order to know where to insert it into that list, we need to remember
its adjust priority (which may change even as it was being executed).
This also affects reset for execlists as we are now unsubmitting the
requests following the reset (rather than directly writing the ELSP for
the inflight contexts). This turns reset into an accidental preemption
point, as after the reset we may choose a different pair of contexts to
submit to hw.
GuC is not updated as this series doesn't add preemption to the GuC
submission, and so it can keep benefiting from the early pruning of the
DFS inside execlists_schedule() for a little longer. We also need to
find a way of reducing the cost of that DFS...
v2: Include priority in error-state
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-6-chris@chris-wilson.co.uk
Add another perma-pinned context for using for preemption at any time.
We cannot just reuse the existing kernel context, as first and foremost
we need to ensure that we can preempt the kernel context itself, so
require a distinct context id. Similar to the kernel context, we may
want to interrupt execution and switch to the preempt context at any
time, and so it needs to be permanently pinned and available.
To compensate for yet another permanent allocation, we shrink the
existing context and the new context by reducing their ringbuffer to the
minimum.
v2: Assert that we never allocate a request from the preemption context.
v3: Limit perma-pin to engines that may preempt.
v4: Onion cleanup for early driver death
v5: Onion ordering in main driver cleanup as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-4-chris@chris-wilson.co.uk
Getting started with v4.15 features:
- Cannonlake workarounds (Rodrigo, Oscar)
- Infoframe refactoring and fixes to enable infoframes for DP (Ville)
- VBT definition updates (Jani)
- Sparse warning fixes (Ville, Chris)
- Crtc state usage fixes and cleanups (Ville)
- DP vswing, pre-emph and buffer translation refactoring and fixes (Rodrigo)
- Prevent IPS from interfering with CRC capture (Ville, Marta)
- Enable Mesa to advertise ARB_timer_query (Nanley)
- Refactor GT number into intel_device_info (Lionel)
- Avoid eDP DP AUX CH timeouts harder (Manasi)
- CDCLK check improvements (Ville)
- Restore GPU clock boost on missed pageflip vblanks (Chris)
- Fence register reservation API for vGPU (Changbin)
- First batch of CCS fixes (Ville)
- Finally, numerous GEM fixes, cleanups and improvements (Chris)
* tag 'drm-intel-next-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel: (100 commits)
drm/i915: Update DRIVER_DATE to 20170907
drm/i915/cnl: WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod)
drm/i915: Lift has-pinned-pages assert to caller of ____i915_gem_object_get_pages
drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk
drm/i915/cnl: Allow the reg_read ioctl to read the RCS TIMESTAMP register
drm/i915: Move device_info.has_snoop into the static tables
drm/i915: Disable MI_STORE_DATA_IMM for i915g/i915gm
drm/i915: Re-enable GTT following a device reset
drm/i915/cnp: Wa 1181: Fix Backlight issue
drm/i915: Annotate user relocs with __user
drm/i915: Constify load detect mode
drm/i915/perf: Remove __user from u64 in drm_i915_perf_oa_config
drm/i915: Silence sparse by using gfp_t
drm/i915: io unmap functions want __iomem
drm/i915: Add __rcu to radix tree slot pointer
drm/i915: Wake up the device for the fbdev setup
drm/i915: Add interface to reserve fence registers for vGPU
drm/i915: Use correct path to trace include
drm/i915: Fix the missing PPAT cache attributes on CNL
drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder
...
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- DP SDP defines (Ville)
- polish for scdc helpers (Thierry Reding)
- fix lifetimes for connector/plane state across crtc changes (Maarten
Lankhorst).
- sparse fixes (Ville+Thierry)
- make legacy kms ioctls all interruptible (Maarten)
- push edid override into the edid helpers (out of probe helpers)
(Jani)
- DP ESI defines for link status (DK)
Driver Changes:
- drm-panel is now in drm-misc!
- minor panel-simple cleanups/refactoring by various folks
- drm_bridge_add cleanup (Inki Dae)
- constify a few i2c_device_id structs (Arvind Yadav)
- More patches from Noralf's fb/gem helper cleanup
- bridge/synopsis: reset fix (Philippe Cornu)
- fix tracepoint include handling in drivers (Thierry)
- rockchip: lvds support (Sandy Huang)
- move sun4i into drm-misc fold (Maxime Ripard)
- sun4i: refactor driver load + support TCON backend/layer muxing
(Chen-Yu Tsai)
- pl111: support more pl11x variants (Linus Walleij)
- bridge/adv7511: robustify probing/edid handling (Lars-Petersen
Clausen)
New hw support:
- S6E63J0X03 panel (Hoegeun Kwon)
- OTM8009A panel (Philippe CORNU)
- Seiko 43WVF1G panel (Marco Franchi)
- tve200 driver (Linus Walleij)
Plus assorted of tiny patches all over, including our first outreachy
patches from applicants for the winter round!
* tag 'drm-misc-next-2017-09-20' of git://anongit.freedesktop.org/git/drm-misc: (101 commits)
drm: add backwards compatibility support for drm_kms_helper.edid_firmware
drm: handle override and firmware EDID at drm_do_get_edid() level
drm/dp: DPCD register defines for link status within ESI field
drm/rockchip: Replace dev_* with DRM_DEV_*
drm/tinydrm: Drop driver registered message
drm/gem-fb-helper: Use debug message on gem lookup failure
drm/imx: Use drm_gem_fb_create() and drm_gem_fb_prepare_fb()
drm/bridge: adv7511: Constify HDMI CODEC platform data
drm/bridge: adv7511: Enable connector polling when no interrupt is specified
drm/bridge: adv7511: Remove private copy of the EDID
drm/bridge: adv7511: Properly update EDID when no EDID was found
drm/crtc: Convert setcrtc ioctl locking to interruptible.
drm/atomic: Convert pageflip ioctl locking to interruptible.
drm/legacy: Convert setplane ioctl locking to interruptible.
drm/legacy: Convert cursor ioctl locking to interruptible.
drm/atomic: Convert atomic ioctl locking to interruptible.
drm/atomic: Prepare drm_modeset_lock infrastructure for interruptible waiting, v2.
drm/tve200: Clean up panel bridging
drm/doc: Update todo.rst
drm/dp/mst: Sideband message transaction to power up/down nodes
...
i830 seems to occasionally forget the PIPESTAT enable bits when
we read the register. These aren't the only registers on i830 that
have problems with RMW, as reading the double buffered plane
registers returns the latched value rather than the last written
value. So something similar is perhaps going on with PIPESTAT.
This corruption results on vblank interrupts occasionally turning off
on their own, which leads to vblank timeouts and generally a stuck
display subsystem.
So let's not RMW the pipestat enable bits, and instead use the cached
copy we have around.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170914151731.5034-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The private PAT management is to support PPAT entry manipulation. Two
APIs are introduced for dynamically managing PPAT entries: intel_ppat_get
and intel_ppat_put.
intel_ppat_get will search for an existing PPAT entry which perfectly
matches the required PPAT value. If not, it will try to allocate a new
entry if there is any available PPAT indexs, or return a partially
matched PPAT entry if there is no available PPAT indexes.
intel_ppat_put will put back the PPAT entry which comes from
intel_ppat_get. If it's dynamically allocated, the reference count will
be decreased. If the reference count turns into zero, the PPAT index is
freed again.
Besides, another two callbacks are introduced to support the private PAT
management framework. One is ppat->update_hw(), which writes the PPAT
configurations in ppat->entries into HW. Another one is ppat->match, which
will return a score to show how two PPAT values match with each other.
v17:
- Refine the comparision of score of BDW. (Joonas)
v16:
- Fix a bug in PPAT match function of BDW. (Joonas)
v15:
- Refine some code flow. (Joonas)
v12:
- Fix a problem "not returning the entry of best score". (Zhenyu)
v7:
- Keep all the register writes unchanged in this patch. (Joonas)
v6:
- Address all comments from Chris:
http://www.spinics.net/lists/intel-gfx/msg136850.html
- Address all comments from Joonas:
http://www.spinics.net/lists/intel-gfx/msg136845.html
v5:
- Add check and warnnings for those platforms which don't have PPAT.
v3:
- Introduce dirty bitmap for PPAT registers. (Chris)
- Change the name of the pointer "dev_priv" to "i915". (Chris)
- intel_ppat_{get, put} returns/takes a const intel_ppat_entry *. (Chris)
v2:
- API re-design. (Chris)
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v7
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[Joonas: Use BIT() in the enum in bdw_private_pat_match]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1505392783-4084-1-git-send-email-zhi.a.wang@intel.com
The engine also provides a mirror of the CSB write pointer in the HWSP,
but not of our read pointer. To take advantage of this we need to
remember where we read up to on the last interrupt and continue off from
there. This poses a problem following a reset, as we don't know where
the hw will start writing from, and due to the use of power contexts we
cannot perform that query during the reset itself. So we continue the
current modus operandi of delaying the first read of the context-status
read/write pointers until after the first interrupt. With this we should
now have eliminated all uncached mmio reads in handling the
context-status interrupt, though we still have the uncached mmio writes
for submitting new work, and many uncached mmio reads in the global
interrupt handler itself. Still a step in the right direction towards
reducing our resubmit latency, although it appears lost in the noise!
v2: Cannonlake moved the CSB write index
v3: Include the sw/hwsp state in debugfs/i915_engine_info
v4: Also revert to using CSB mmio for GVT-g
v5: Prevent the compiler reloading tail (Mika)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170913085605.18299-6-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
The sgt iterators cause an
drivers/gpu/drm/i915/i915_gpu_error.c:846 i915_error_object_create() warn: statement has no effect 7
everywhere they are used. If we change the code slightly, we can achieve
the same increment without altering the output or raising a warning.
text data bss dec hex filename
1267906 20587 3168 1291661 13b58d before
1267906 20587 3168 1291661 13b58d after
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170913105754.4423-1-chris@chris-wilson.co.uk
Pull i916 drm fixes from Rodrigo Vivi:
"Since Dave is on paternity leave we are sending drm/i915 fixes for
v4.14-rc1 directly to you as he had asked us to do.
The most critical ones are the GPU reset fix for gen2-4 and GVT fix
for a regression that is blocking gvt init to work on your tree.
The rest is general fixes for patches coming from drm-next"
Acked-by: Dave Airlie <airlied@redhat.com>
* tag 'drm-intel-next-fixes-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Re-enable GTT following a device reset
drm/i915: Annotate user relocs with __user
drm/i915: Silence sparse by using gfp_t
drm/i915: Add __rcu to radix tree slot pointer
drm/i915: Fix the missing PPAT cache attributes on CNL
drm/i915/gvt: Remove one duplicated MMIO
drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder
drm/i915: Make i2c lock ops static
drm/i915: Make i9xx_load_ycbcr_conversion_matrix() static
drm/i915/edp: Increase T12 panel delay to 900 ms to fix DP AUX CH timeouts
drm/i915: Ignore duplicate VMA stored within the per-object handle LUT
drm/i915: Skip fence alignemnt check for the CCS plane
drm/i915: Treat fb->offsets[] as a raw byte offset instead of a linear offset
drm/i915: Always wake the device to flush the GTT
drm/i915: Recreate vmapping even when the object is pinned
drm/i915: Quietly cancel FBC activation if CRTC is turned off before worker
With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.
Similar to 871dfbd67d ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.
v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.
v3:
* Actually include the swiotlb override fix.
v4:
* Regroup parameters a bit. (Chris Wilson)
v5:
* Rebase for swiotlb_max_segment.
* Add DMA map failure handling as in abb0deacb5
("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").
v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)
v7: Rebase.
v8: Commit spelling fix.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170803091417.23677-1-tvrtko.ursulin@linux.intel.com