More stuff for 4.13:
- skl+ wm fixes from Mahesh Kumar
- some refactor and tests for i915_sw_fence (Chris)
- tune execlist/scheduler code (Chris)
- g4x,g33 gpu reset improvements (Chris, Mika)
- guc code cleanup (Michal Wajdeczko, Michał Winiarski)
- dp aux backlight improvements (Puthikorn Voravootivat)
- buffer based guc/host communication (Michal Wajdeczko)
* tag 'drm-intel-next-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel: (253 commits)
drm/i915: Update DRIVER_DATE to 20170529
drm/i915: Keep the forcewake timer alive for 1ms past the most recent use
drm/i915/guc: capture GuC logs if FW fails to load
drm/i915/guc: Introduce buffer based cmd transport
drm/i915/guc: Disable send function on fini
drm: Add definition for eDP backlight frequency
drm/i915: Drop AUX backlight enable check for backlight control
drm/i915: Consolidate #ifdef CONFIG_INTEL_IOMMU
drm/i915: Only GGTT vma may be pinned and prevent shrinking
drm/i915: Serialize GTT/Aperture accesses on BXT
drm/i915: Convert i915_gem_object_ops->flags values to use BIT()
drm/i915/selftests: Silence compiler warning in igt_ctx_exec
drm/i915/guc: Skip port assign on first iteration of GuC dequeue
drm/i915: Remove misleading comment in request_alloc
drm/i915/g33: Improve reset reliability
Revert "drm/i915: Restore lost "Initialized i915" welcome message"
drm/i915/huc: Update GLK HuC version
drm/i915: Check for allocation failure
drm/i915/guc: Remove action status and statistics from debugfs
drm/i915/g4x: Improve gpu reset reliability
...
BXT has a H/W issue with IOMMU which can lead to system hangs when
Aperture accesses are queued within the GAM behind GTT Accesses.
This patch avoids the condition by wrapping all GTT updates in stop_machine
and using a flushing read prior to restarting the machine.
The stop_machine guarantees no new Aperture accesses can begin while
the PTE writes are being emmitted. The flushing read ensures that
any following Aperture accesses cannot begin until the PTE writes
have been cleared out of the GAM's fifo.
Only FOLLOWING Aperture accesses need to be separated from in flight
PTE updates. PTE Writes may follow tightly behind already in flight
Aperture accesses, so no flushing read is required at the start of
a PTE update sequence.
This issue was reproduced by running
igt/gem_readwrite and
igt/gem_render_copy
simultaneously from different processes, each in a tight loop,
with INTEL_IOMMU enabled.
This patch was originally published as:
drm/i915: Serialize GTT Updates on BXT
v2: Move bxt/iommu detection into static function
Remove #ifdef CONFIG_INTEL_IOMMU protection
Make function names more reflective of purpose
Move flushing read into static function
v3: Tidy up for checkpatch.pl
Testcase: igt/gem_concurrent_blit
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1495641251-30022-1-git-send-email-jon.bloomfield@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Now that drm_[cm]alloc* helpers are simple one line wrappers around
kvmalloc_array and drm_free_large is just kvfree alias we can drop
them and replace by their native forms.
This shouldn't introduce any functional change.
Changes since v1
- fix typo in drivers/gpu//drm/etnaviv/etnaviv_gem.c - noticed by 0day
build robot
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michal Hocko <mhocko@suse.com>drm: drop drm_[cm]alloc* helpers
[danvet: Fixup vgem which grew another user very recently.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517122312.GK18247@dhcp22.suse.cz
If a vma is already bound to a ppgtt, we incorrectly call
allocate_va_range again when doing a PIN_UPDATE, which will result in
over accounting within our paging structures, such that when we do
unbind something we don't actually destroy the structures and end up
inadvertently recycling them. In reality this probably isn't too bad,
but once we start touching PDEs and PDPEs for 64K/2M/1G pages this
apparent recycling will manifest into lots of really, really subtle
bugs.
v2: Fix the testing of vma->flags for aliasing_ppgtt_bind_vma
Fixes: ff685975d9 ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512091423.26085-1-chris@chris-wilson.co.uk
If we setup the vm size early, we can use the newly introduced
i915_vm_is_48bit() in majority of callsites wanting to know the vm size.
As we operate either with 3lvl or 4lvl page table structure,
wrap the vm size query inside a function which tells us if
4lvl setup is needed for particular vm, as the following
code uses the function names where level is noted.
v2: use_4lvl (Chris)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1488295691-9404-3-git-send-email-mika.kuoppala@intel.com
Testing with concurrent GGTT accesses no longer show the coherency
problems from yonder, commit 5bab6f60cb ("drm/i915: Serialise updates
to GGTT with access through GGTT on Braswell"). My presumption is that
the root cause was more likely fixed by commit 3b5724d702 ("drm/i915:
Wait for writes through the GTT to land before reading back"), along
with the use of WC updates to the global gTT in commit 8448661d65
("drm/i915: Convert clflushed pagetables over to WC maps". Given
that the original symptoms can no longer be reproduced, time to remove
the workaround.
Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170220124718.14796-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The hardware does not cope very well with us changing the PD within an
active context (the context must be idle for it to re-read the PD). As
we only check whether the page is idle before changing the entry (and on
through the PD tree), we cannot reliably replace PD entries on
gen6/gen7. To fully avoid changing the tree at runtime, preallocate it
on init.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-9-chris@chris-wilson.co.uk
We flush the entire page every time we update a few bytes, making the
update of a page table many, many times slower than is required. If we
create a WC map of the page for our updates, we can avoid the clflush
but incur additional cost for creating the pagetable. We amoritize that
cost by reusing page vmappings, and only changing the page protection in
batches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
This removes the usage of intel_ring_emit in favour of
directly writing to the ring buffer.
intel_ring_emit was preventing the compiler for optimising
fetch and increment of the current ring buffer pointer and
therefore generating very verbose code for every write.
It had no useful purpose since all ringbuffer operations
are started and ended with intel_ring_begin and
intel_ring_advance respectively, with no bail out in the
middle possible, so it is fine to increment the tail in
intel_ring_begin and let the code manage the pointer
itself.
Useless instruction removal amounts to approximately
two and half kilobytes of saved text on my build.
Not sure if this has any measurable performance
implications but executing a ton of useless instructions
on fast paths cannot be good.
v2:
* Change return from intel_ring_begin to error pointer by
popular demand.
* Move tail increment to intel_ring_advance to enable some
error checking.
v3:
* Move tail advance back into intel_ring_begin.
* Rebase and tidy.
v4:
* Complete rebase after a few months since v3.
v5:
* Remove unecessary cast and fix !debug compile. (Chris Wilson)
v6:
* Make intel_ring_offset take request as well.
* Fix recording of request postfix plus a sprinkle of asserts.
(Chris Wilson)
v7:
* Use intel_ring_offset to get the postfix. (Chris Wilson)
* Convert GVT code as well.
v8:
* Rename *out++ to *cs++.
v9:
* Fix GVT out to cs conversion in GVT.
v10:
* Rebase for new intel_ring_begin in selftests.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170214113242.29241-1-tvrtko.ursulin@linux.intel.com
It is possible whilst allocating the page-directory tree for a ppgtt
bind that the shrinker may run and reap unused parts of the tree. If the
shrinker happens to remove a chunk of the tree that the
allocate_va_range has already processed, we may then try to insert into
the dangling tree. This test uses the fault-injection framework to force
the shrinker to be invoked before we allocate new pages, i.e. new chunks
of the PD tree.
References: https://bugs.freedesktop.org/show_bug.cgi?id=99295
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>