In quite a few places, we have a list iteration over the vma on an
object that only want to inspect GGTT vma. By construction, these are
placed at the start of the list, so we have copied that knowledge into
many callsites. Pull that knowledge back to i915_vma.h and provide a
for_each_ggtt_vma() to tidy up the code.
v2: Add a backreference from vma_create() to remind ourselves why we put
ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail).
v3: Fixup s/vma/V/
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171207211407.31549-1-chris@chris-wilson.co.uk
[airlied: fix conflict in intel_dsi.c]
drm-intel-next-2017-12-01:
- Init clock gate fix (Ville)
- Execlists event handling corrections (Chris, Michel)
- Improvements on GPU Cache invalidation and context switch (Chris)
- More perf OA changes (Lionel)
- More selftests improvements and fixes (Chris, Matthew)
- Clean-up on modules parameters (Chris)
- Clean-up around old ringbuffer submission and hw semaphore on old platforms (Chris)
- More Cannonlake stabilization effort (David, James)
- Display planes clean-up and improvements (Ville)
- New PMU interface for perf queries... (Tvrtko)
- ... and other subsequent PMU changes and fixes (Tvrtko, Chris)
- Remove success dmesg noise from rotation (Chris)
- New DMC for Kabylake (Anusha)
- Fixes around atomic commits (Daniel)
- GuC updates and fixes (Sagar, Michal, Chris)
- Couple gmbus/i2c fixes (Ville)
- Use exponential backoff for all our wait_for() (Chris)
- Fixes for i915/fbdev (Chris)
- Backlight fixes (Arnd)
- Updates on shrinker (Chris)
- Make Hotplug enable more robuts (Chris)
- Disable huge pages (TPH) on lack of a needed workaround (Joonas)
- New GuC images for SKL, KBL, BXT (Sagar)
- Add HW Workaround for Geminilake performance (Valtteri)
- Fixes for PPS timings (Imre)
- More IPS fixes (Maarten)
- Many fixes for Display Port on gen2-gen4 (Ville)
- Retry GPU reset making the recover from hang more robust (Chris)
* tag 'drm-intel-next-2017-12-01' of git://anongit.freedesktop.org/drm/drm-intel: (101 commits)
drm/i915: Update DRIVER_DATE to 20171201
drm/i915/cnl: Mask previous DDI - PLL mapping
drm/i915: Remove unsafe i915.enable_rc6
drm/i915: Sleep and retry a GPU reset if at first we don't succeed
drm/i915: Interlaced DP output doesn't work on VLV/CHV
drm/i915: Pass crtc state to intel_pipe_{enable,disable}()
drm/i915: Wait for pipe to start on i830 as well
drm/i915: Fix vblank timestamp/frame counter jumps on gen2
drm/i915: Fix deadlock in i830_disable_pipe()
drm/i915: Fix has_audio readout for DDI A
drm/i915: Don't add the "force audio" property to DP connectors that don't support audio
drm/i915: Disable DP audio for g4x
drm/i915/selftests: Wake the device before executing requests on the GPU
drm/i915: Set fake_vma.size as well as fake_vma.node.size for capture
drm/i915: Tidy up signed/unsigned comparison
drm/i915: Enable IPS with only sprite plane visible too, v4.
drm/i915: Make ips_enabled a property depending on whether IPS is enabled, v3.
drm/i915: Avoid PPS HW/SW state mismatch due to rounding
drm/i915: Skip switch-to-kernel-context on suspend when wedged
drm/i915/glk: Apply WaProgramL3SqcReg1DefaultForPerf for GLK too
...
More change sets for 4.16:
- Many improvements for selftests and other igt tests (Chris)
- Forcewake with PUNIT->PMIC bus fixes and robustness (Hans)
- Define an engine class for uABI (Tvrtko)
- Context switch fixes and improvements (Chris)
- GT powersavings and power gating simplification and fixes (Chris)
- Other general driver clean-ups (Chris, Lucas, Ville)
- Removing old, useless and/or bad workarounds (Chris, Oscar, Radhakrishna)
- IPS, pipe config, etc in preparation for another Fast Boot attempt (Maarten)
- OA perf fixes and support to Coffee Lake and Cannonlake (Lionel)
- Fixes around GPU fault registers (Michel)
- GEM Proxy (Tina)
- Refactor of Geminilake and Cannonlake plane color handling (James)
- Generalize transcoder loop (Mika Kahola)
- New HW Workaround for Cannonlake and Geminilake (Rodrigo)
- Resume GuC before using GEM (Chris)
- Stolen Memory handling improvements (Ville)
- Initialize entry in PPAT for older compilers (Chris)
- Other fixes and robustness improvements on execbuf (Chris)
- Improve logs of GEM_BUG_ON (Mika Kuoppala)
- Rework with massive rename of GuC functions and files (Sagar)
- Don't sanitize frame start delay if pipe is off (Ville)
- Cannonlake clock fixes (Rodrigo)
- Cannonlake HDMI 2.0 support (Rodrigo)
- Add a GuC doorbells selftest (Michel)
- Add might_sleep() check to our wait_for() (Chris)
Many GVT changes for 4.16:
- CSB HWSP update support (Weinan)
- GVT debug helpers, dyndbg and debugfs (Chuanxiao, Shuo)
- full virtualized opregion (Xiaolin)
- VM health check for sane fallback (Fred)
- workload submission code refactor for future enabling (Zhi)
- Updated repo URL in MAINTAINERS (Zhenyu)
- other many misc fixes
* tag 'drm-intel-next-2017-11-17-1' of git://anongit.freedesktop.org/drm/drm-intel: (260 commits)
drm/i915: Update DRIVER_DATE to 20171117
drm/i915: Add a policy note for removing workarounds
drm/i915/selftests: Report ENOMEM clearly for an allocation failure
Revert "drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk"
drm/i915: Calculate g4x intermediate watermarks correctly
drm/i915: Calculate vlv/chv intermediate watermarks correctly, v3.
drm/i915: Pass crtc_state to ips toggle functions, v2
drm/i915: Pass idle crtc_state to intel_dp_sink_crc
drm/i915: Enable FIFO underrun reporting after initial fastset, v4.
drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
drm/i915: Add might_sleep() check to wait_for()
drm/i915/selftests: Add a GuC doorbells selftest
drm/i915/cnl: Extend HDMI 2.0 support to CNL.
drm/i915/cnl: Simplify dco_fraction calculation.
drm/i915/cnl: Don't blindly replace qdiv.
drm/i915/cnl: Fix wrpll math for higher freqs.
drm/i915/cnl: Fix, simplify and unify wrpll variable sizes.
drm/i915/cnl: Remove useless conversion.
drm/i915/cnl: Remove spurious central_freq.
drm/i915/selftests: exercise_ggtt may have nothing to do
...
Pull drm updates from Dave Airlie:
"This is the main drm pull request for v4.15.
Core:
- Atomic object lifetime fixes
- Atomic iterator improvements
- Sparse/smatch fixes
- Legacy kms ioctls to be interruptible
- EDID override improvements
- fb/gem helper cleanups
- Simple outreachy patches
- Documentation improvements
- Fix dma-buf rcu races
- DRM mode object leasing for improving VR use cases.
- vgaarb improvements for non-x86 platforms.
New driver:
- tve200: Faraday Technology TVE200 block.
This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
the StorLink SL3516 (later Cortina Systems CS3516) as well as the
Grain Media GM8180.
New bridges:
- SiI9234 support
New panels:
- S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
LT089AC19000, Innolux AT043TN24
i915:
- Remove Coffeelake from alpha support
- Cannonlake workarounds
- Infoframe refactoring for DisplayPort
- VBT updates
- DisplayPort vswing/emph/buffer translation refactoring
- CCS fixes
- Restore GPU clock boost on missed vblanks
- Scatter list updates for userptr allocations
- Gen9+ transition watermarks
- Display IPC (Isochronous Priority Control)
- Private PAT management
- GVT: improved error handling and pci config sanitizing
- Execlist refactoring
- Transparent Huge Page support
- User defined priorities support
- HuC/GuC firmware refactoring
- DP MST fixes
- eDP power sequencing fixes
- Use RCU instead of stop_machine
- PSR state tracking support
- Eviction fixes
- BDW DP aux channel timeout fixes
- LSPCON fixes
- Cannonlake PLL fixes
amdgpu:
- Per VM BO support
- Powerplay cleanups
- CI powerplay support
- PASID mgr for kfd
- SR-IOV fixes
- initial GPU reset for vega10
- Prime mmap support
- TTM updates
- Clock query interface for Raven
- Fence to handle ioctl
- UVD encode ring support on Polaris
- Transparent huge page DMA support
- Compute LRU pipe tweaks
- BO flag to allow buffers to opt out of implicit sync
- CTX priority setting API
- VRAM lost infrastructure plumbing
qxl:
- fix flicker since atomic rework
amdkfd:
- Further improvements from internal AMD tree
- Usermode events
- Drop radeon support
nouveau:
- Pascal temperature sensor support
- Improved BAR2 handling
- MMU rework to support Pascal MMU
exynos:
- Improved HDMI/mixer support
- HDMI audio interface support
tegra:
- Prep work for tegra186
- Cleanup/fixes
msm:
- Preemption support for a5xx
- Display fixes for 8x96 (snapdragon 820)
- Async cursor plane fixes
- FW loading rework
- GPU debugging improvements
vc4:
- Prep for DSI panels
- fix T-format tiling scanout
- New madvise ioctl
Rockchip:
- LVDS support
omapdrm:
- omap4 HDMI CEC support
etnaviv:
- GPU performance counters groundwork
sun4i:
- refactor driver load + TCON backend
- HDMI improvements
- A31 support
- Misc fixes
udl:
- Probe/EDID read fixes.
tilcdc:
- Misc fixes.
pl111:
- Support more variants
adv7511:
- Improve EDID handling.
- HDMI CEC support
sii8620:
- Add remote control support"
* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
drm/rockchip: analogix_dp: Use mutex rather than spinlock
drm/mode_object: fix documentation for object lookups.
drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
drm/i915: Move init_clock_gating() back to where it was
drm/i915: Prune the reservation shared fence array
drm/i915: Idle the GPU before shinking everything
drm/i915: Lock llist_del_first() vs llist_del_all()
drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
drm/i915: Disable lazy PPGTT page table optimization for vGPU
drm/i915/execlists: Remove the priority "optimisation"
drm/i915: Filter out spurious execlists context-switch interrupts
drm/amdgpu: use irq-safe lock for kiq->ring_lock
drm/amdgpu: bypass lru touch for KIQ ring submission
drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
drm/amd/powerplay: initialize a variable before using it
drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
drm/rockchip: add CONFIG_OF dependency for lvds
...
Until Haswell/Baytrail, the hardware used to have a per engine fault
register (e.g. 0x4094 - render fault register, 0x4194 - media fault
register and so on). But since Broadwell, all these registers were
combined into a singe one and the engine id stored in bits 14:12.
Not only we should not been reading (and writing to) registers that do
not exist, in platforms with VCS2 (SKL), the address that would belong
this engine (0x4494, VCS2_HW = 4) is already assigned to other register.
v2: use less controversial function names (Chris).
v3: make non-exported functions static, remove now obsolete check for
engine presence before posting_read (Chris).
References: IHD-OS-BDW-Vol 2c-11.15, page 75.
References: IHD-OS-SKL-Vol 2c-05.16, page 350.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113173628.11689-1-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61 ("drm/i915/shrinker: Flush active on objects before
counting")).
v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
Getting started with v4.15 features:
- Cannonlake workarounds (Rodrigo, Oscar)
- Infoframe refactoring and fixes to enable infoframes for DP (Ville)
- VBT definition updates (Jani)
- Sparse warning fixes (Ville, Chris)
- Crtc state usage fixes and cleanups (Ville)
- DP vswing, pre-emph and buffer translation refactoring and fixes (Rodrigo)
- Prevent IPS from interfering with CRC capture (Ville, Marta)
- Enable Mesa to advertise ARB_timer_query (Nanley)
- Refactor GT number into intel_device_info (Lionel)
- Avoid eDP DP AUX CH timeouts harder (Manasi)
- CDCLK check improvements (Ville)
- Restore GPU clock boost on missed pageflip vblanks (Chris)
- Fence register reservation API for vGPU (Changbin)
- First batch of CCS fixes (Ville)
- Finally, numerous GEM fixes, cleanups and improvements (Chris)
* tag 'drm-intel-next-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel: (100 commits)
drm/i915: Update DRIVER_DATE to 20170907
drm/i915/cnl: WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod)
drm/i915: Lift has-pinned-pages assert to caller of ____i915_gem_object_get_pages
drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk
drm/i915/cnl: Allow the reg_read ioctl to read the RCS TIMESTAMP register
drm/i915: Move device_info.has_snoop into the static tables
drm/i915: Disable MI_STORE_DATA_IMM for i915g/i915gm
drm/i915: Re-enable GTT following a device reset
drm/i915/cnp: Wa 1181: Fix Backlight issue
drm/i915: Annotate user relocs with __user
drm/i915: Constify load detect mode
drm/i915/perf: Remove __user from u64 in drm_i915_perf_oa_config
drm/i915: Silence sparse by using gfp_t
drm/i915: io unmap functions want __iomem
drm/i915: Add __rcu to radix tree slot pointer
drm/i915: Wake up the device for the fbdev setup
drm/i915: Add interface to reserve fence registers for vGPU
drm/i915: Use correct path to trace include
drm/i915: Fix the missing PPAT cache attributes on CNL
drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder
...
The private PAT management is to support PPAT entry manipulation. Two
APIs are introduced for dynamically managing PPAT entries: intel_ppat_get
and intel_ppat_put.
intel_ppat_get will search for an existing PPAT entry which perfectly
matches the required PPAT value. If not, it will try to allocate a new
entry if there is any available PPAT indexs, or return a partially
matched PPAT entry if there is no available PPAT indexes.
intel_ppat_put will put back the PPAT entry which comes from
intel_ppat_get. If it's dynamically allocated, the reference count will
be decreased. If the reference count turns into zero, the PPAT index is
freed again.
Besides, another two callbacks are introduced to support the private PAT
management framework. One is ppat->update_hw(), which writes the PPAT
configurations in ppat->entries into HW. Another one is ppat->match, which
will return a score to show how two PPAT values match with each other.
v17:
- Refine the comparision of score of BDW. (Joonas)
v16:
- Fix a bug in PPAT match function of BDW. (Joonas)
v15:
- Refine some code flow. (Joonas)
v12:
- Fix a problem "not returning the entry of best score". (Zhenyu)
v7:
- Keep all the register writes unchanged in this patch. (Joonas)
v6:
- Address all comments from Chris:
http://www.spinics.net/lists/intel-gfx/msg136850.html
- Address all comments from Joonas:
http://www.spinics.net/lists/intel-gfx/msg136845.html
v5:
- Add check and warnnings for those platforms which don't have PPAT.
v3:
- Introduce dirty bitmap for PPAT registers. (Chris)
- Change the name of the pointer "dev_priv" to "i915". (Chris)
- intel_ppat_{get, put} returns/takes a const intel_ppat_entry *. (Chris)
v2:
- API re-design. (Chris)
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v7
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[Joonas: Use BIT() in the enum in bdw_private_pat_match]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1505392783-4084-1-git-send-email-zhi.a.wang@intel.com
GFP_TEMPORARY was introduced by commit e12ba74d8f ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation. As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.
The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing caller provide a way to reclaim the allocated memory. So
this is rather misleading and hard to evaluate for any benefits.
I have checked some random users and none of them has added the flag
with a specific justification. I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring. This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.
I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL. Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.
I can see reasons we might want some gfp flag to reflect shorterm
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.
This was been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic. It
seems to be a heuristic without any measured advantage for most (if not
all) its current users. The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers. So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.
[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we know that we will completely fill a pagetable (i.e. we are
inserting a complete set of 512 pages), we can skip prefilling that PT
with scratch entries. If we have to abort the insertion prior to writing
the real entries, we will teardown the pagetable and remove it from the
page directory (so that we will restart the allocation next time).
We could do similar tricks for the PD and PDP, but the likelihood of a
single insertion covering the entire 512 entries diminishes, as do the
cycle savings. The saving are even greater (relatively) when we are
preallocating page tables for huge pages, as then we never need to fill
the page table.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170908181622.17791-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Pull i916 drm fixes from Rodrigo Vivi:
"Since Dave is on paternity leave we are sending drm/i915 fixes for
v4.14-rc1 directly to you as he had asked us to do.
The most critical ones are the GPU reset fix for gen2-4 and GVT fix
for a regression that is blocking gvt init to work on your tree.
The rest is general fixes for patches coming from drm-next"
Acked-by: Dave Airlie <airlied@redhat.com>
* tag 'drm-intel-next-fixes-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Re-enable GTT following a device reset
drm/i915: Annotate user relocs with __user
drm/i915: Silence sparse by using gfp_t
drm/i915: Add __rcu to radix tree slot pointer
drm/i915: Fix the missing PPAT cache attributes on CNL
drm/i915/gvt: Remove one duplicated MMIO
drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder
drm/i915: Make i2c lock ops static
drm/i915: Make i9xx_load_ycbcr_conversion_matrix() static
drm/i915/edp: Increase T12 panel delay to 900 ms to fix DP AUX CH timeouts
drm/i915: Ignore duplicate VMA stored within the per-object handle LUT
drm/i915: Skip fence alignemnt check for the CCS plane
drm/i915: Treat fb->offsets[] as a raw byte offset instead of a linear offset
drm/i915: Always wake the device to flush the GTT
drm/i915: Recreate vmapping even when the object is pinned
drm/i915: Quietly cancel FBC activation if CRTC is turned off before worker
We use WC pages for coherent writes into the ppGTT on !llc
architectures. However, to create a WC page requires a stop_machine(),
i.e. is very slow. To compensate we currently keep a per-vm cache of
recently freed pages, but we still see the slow startup of new contexts.
We can amoritize that cost slightly by allocating WC pages in small
batches (PAGEVEC_SIZE == 14) and since creating a WC page implies a
stop_machine() there is no penalty for keeping that stash global.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822173828.5932-1-chris@chris-wilson.co.uk
Let's inherit workarounds from previous platforms that
according to wa_database and BSpec are still valid for
Cannonlake.
v2: Add missed workarounds.
v3: Rebase
v4: Remove bad chunk that was added to rc6 disable. (Ander)
Also remove A0 W/a that are not needed anymore.
v5: Rebase on top of CFL.
v6: Remove empty gen9_init_perctx_bb and gen9_init_indirectctx_bb
since they don't carry any gen10 related W/a. (by Oscar).
Also Remove A0 exclusive workaround.
v7: Remove more A0 exclusive workarounds. As pointed out by Oscar
many workarounds were changed to be A0 only so let's remove
them.
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-1-rodrigo.vivi@intel.com
Enable the guest i915 full ppgtt functionality when host can provide this
capability. vgt_caps is introduced to guest i915 driver to get the vgpu
capabilities from the device model. VGT_CPAS_FULL_PPGTT is one of the
capabilities type to let guest i915 dirver know that the guest i915 full
ppgtt is supported by device model.
Notice that the minor version of pvinfo isn't bumped because of this
vgt_caps introduction, due to older guest would be broken by simply
increasing the pvinfo version. Although the pvinfo minor version doesn't
increase, the compatibility won't be blocked. The compatibility is ensured
by checking the value of caps field in pvinfo. Zero means no full ppgtt
support and BIT(2) means this feature is provided.
Changes since v1:
- Use u32 instead of uint32_t (Joonas)
- Move VGT_CAPS_FULL_PPGTT introduction to this patch and use #define
instead of enum (Joonas)
- Rewrite the vgpu full ppgtt capability checking logic. (Joonas)
- Some coding style refine. (Joonas)
Changes since v2:
- Divide the whole patch set into two separate patch series, with one
patch in i915 side to check guest i915 full ppgtt capability and enable
it when this capability is supported by the device model, and the other
one in gvt side which fixs the blocking issue and enables the device
model to provide the capability to guest. And this patch focuses on guest
i915 side. (Joonas)
- Change the title from "introduce vgt_caps to pvinfo" to
"Enable guest i915 full ppgtt functionality". (Tina)
Change since v3:
- Add some comments about pvinfo caps and version. (Joonas)
Change since v4:
- Tested by Tina Zhang.
Change since v5:
- Add limitation about supporting 32bit full ppgtt.
Change since v6:
- Change the fallback to 48bit full ppgtt if i915.ppgtt_enable=2. (Zhenyu)
Change in v9:
- Remove the fixme comment due to no plan for 32bit full ppgtt
support. (Zhenyu)
- Reorder the patch-set to fix compiling issue with git-bisect. (Zhenyu)
- Add print log when forcing guest 48bit full ppgtt. (Zhenyu)
v10:
- Update against Joonas's has_full_ppgtt and has_full_48bit_ppgtt disconnect
change. (Zhenyu)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> # in v2
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
When choosing a slot for an execbuffer, we ideally want to use the same
address as last time (so that we don't have to rebind it) and the same
address as expected by the user (so that we don't have to fixup any
relocations pointing to it). If we first try to bind the incoming
execbuffer->offset from the user, or the currently bound offset that
should hopefully achieve the goal of avoiding the rebind cost and the
relocation penalty. However, if the object is not currently bound there
we don't want to arbitrarily unbind an object in our chosen position and
so choose to rebind/relocate the incoming object instead. After we
report the new position back to the user, on the next pass the
relocations should have settled down.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtien@linux.intel.com>