Commit Graph

298 Commits

Author SHA1 Message Date
Christian König
1afd30efed drm/amdgpu: revert "add new bo flag that indicates BOs don't need fallback (v2)"
This reverts commit 6f51d28bfe8e1a676de5cd877639245bed3cc818.

Makes fallback handling to complicated. This is just a feature for the
GEM interface and shouldn't leak into the core BO create function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:19 -05:00
Christian König
5422a28fe8 drm/amdgpu: fix and cleanup cpu visible VRAM handling
The detection if a BO was placed in CPU visible VRAM was incorrect.

Fix it and merge it with the correct detection in amdgpu_ttm.c

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:05 -05:00
Christian König
f1018f50d4 drm/amdgpu: use ctx bytes_moved
Instead of the global (inaccurate) counter.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:05 -05:00
Chunming Zhou
552825b28d drm/amdgpu: add new bo flag that indicates BOs don't need fallback (v2)
user cases:
1. KFD wraps amdgpu_bo_create, they have no fallback case which is different
with amdgpu_gem_object_create.
since upstream branch has no amdgpu_amdkfd_gpuvm.c, which need KFD
guys add this flag to __alloc_memory_of_gpu:
+       flags |= AMDGPU_GEM_CREATE_NO_FALLBACK;
2. UMD can specify this flag for their allocation as well if they like.

v2: squash in merge conflict fix (Chunming)

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: felix.kuehling@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-11 13:08:01 -05:00
Felix Kuehling
e52482dec8 drm/amdgpu: Add MMU notifier type for KFD userptr
This commit adds the notion of MMU notifier types GFX and HSA. GFX
continues to work like MMU notifiers did before. HSA adds support for
KFD userptr BOs. The implementation of KFD userptr eviction is a stub
for now.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:32:28 -04:00
Roger He
d330fca115 drm/ttm: use bit flag to replace allow_reserved_eviction in ttm_operation_ctx
for saving memory and more bit flag can be used in future

Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:34 -05:00
Bas Nieuwenhuizen
a20ee0b1f8 drm/amdgpu: Fix always_valid bos multiple LRU insertions.
If these bos are evicted and are in the validated list
things blow up, so do not put them in there. Notably,
that tries to add the bo to the LRU twice, which results
in a BUG_ON in ttm_bo.c.

While for the bo_list an alternative would be to not allow
always valid bos in there, that does not work for the user
fence.

v2: Fixed whitespace issue pointed out by checkpatch.pl

Signed-off-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2018-02-19 14:19:13 -05:00
Christian König
770d13b19f drm/amdgpu: move struct amdgpu_mc into amdgpu_gmc.h
And rename it to amdgpu_gmc as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Samuel Li <Samuel.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:17:43 -05:00
Christian König
0abc6878fc drm/amdgpu: update VM PDs after the PTs
Necessary for the next patch.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-10 15:44:53 -05:00
Roger He
9251859a9a drm/amdgpu: set allow_reserved_eviction and resv when bo allocation and cs
enable eviction of other per VM BOs during allocation and allows
reaping of deleted BOs during CS.

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-15 17:10:54 -05:00
Lucas Stach
1b1f42d8fd drm: move amd_gpu_scheduler into common location
This moves and renames the AMDGPU scheduler to a common location in DRM
in order to facilitate re-use by other drivers. This is mostly a straight
forward rename with no code changes.

One notable exception is the function to_drm_sched_fence(), which is no
longer a inline header function to avoid the need to export the
drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures.

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-07 11:51:56 -05:00
Andrey Grodzovsky
cebb52b7bc drm/amdgpu: Get rid of dep_sync as a seperate object.
Instead mark fence as explicit in it's amdgpu_sync_entry.

v2:
Fix use after free bug and add new parameter description.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:31 -05:00
Christian König
bb7939b203 drm/amdgpu: fix VA hole handling on Vega10 v3
Similar to the CPU address space the VA on Vega10 has a hole in it.

v2: use dev_dbg instead of dev_err
v3: add some more comments to explain how the hw works

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:05 -05:00
Christian König
6af046d26f drm/amdgpu: use the new TTM bytes moved counter v2
Instead of the global statistics use the per context bytes moved counter.

v2: rebased

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:04 -05:00
Christian König
19be557010 drm/ttm: add operation ctx to ttm_bo_validate v2
Give moving a BO into place an operation context to work with.

v2: rebased

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:01 -05:00
Chunming Zhou
6f16b4fb60 drm/amdgpu: use dep_sync for CS dependency/syncobj
Otherwise, they could be optimized by scheduled fence.

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:47:21 -05:00
Christian König
c5835bbb11 drm/amdgpu: rename amdgpu_ttm_bind to amdgpu_ttm_alloc_gart
We actually don't bind here, but rather allocate GART space if necessary.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:43 -05:00
Christian König
4ff23be3d5 drm/amdgpu: remove extra parameter from amdgpu_ttm_bind() v2
We always use the BO mem now.

v2: minor rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:33:16 -05:00
Andrey Grodzovsky
a4176cb484 drm/amdgpu: Remove job->s_entity to avoid keeping reference to stale pointer.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:33:11 -05:00
Monk Liu
7716ea564f drm/amdgpu:skip job for guilty ctx in parser_init
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:33:10 -05:00
Andrey Grodzovsky
d1f6dc1a9a drm/amdgpu: Avoid accessing job->entity after the job is scheduled.
Bug: amdgpu_job_free_cb was accessing s_job->s_entity when the allocated
amdgpu_ctx (and the entity inside it) were already deallocated from
amdgpu_cs_parser_fini.

Fix: Save job's priority on it's creation instead of accessing it from
s_entity later on.

Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:33:08 -05:00
Christian König
6edc6910ba drm/amdgpu: don't try to move pinned BOs
Never try to move pinned BOs during CS.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-28 17:44:15 -05:00
Linus Torvalds
c353bfc6eb fixes/cleanups for rc1, non-desktop flags for VR
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaFkpiAAoJEAx081l5xIa+LOcQAJqXyh7vx++oPe5kJFC2rCoX
 MqX1aJ4nH8y04QJqLmKx1SC6eyYsTM92rcg3RfHOThktzonD5l2wSO9TvCkmLtr9
 2n9P/aYMcbPTZntrbJc4mQyzd82U0D4h40i5Cmhr9n4gcLPOsOpau/7eclyuEUds
 PHZSTCRq0Ygk1K5VWQPyKsY1k1TqFes2YE46FJzkD8SQwKDfbWxVZG0BPnvqb5Om
 PMVobnEukruzpsSqnetaEYsW89e0TJ2TW9MSCfVohzWvyCVGzmwSzqaooqOkgFe2
 5ZrzA4aW6qRez4nXN2Zw+p9qhS4DZ8MVEJO8qczrR6BGx5yRlHriGhs+5FQskGBT
 Idqj6YZX3x/qab/AXQy0fzn2lrZdwxTolG6BgnNOwdGhyFEfz7P7p9kcv4QLbyn5
 8MynMUcLmOkpouHD0mpIwn5kS7EU4hbEPGOeBwxy54FbiLFWb81FjlGts2N+/ckI
 69UlmyyFZrpxvTmL9vRzvGCeO0zdfvKtBa1GoYWbzNTs8r50F2EtdJkS64SYOVOf
 o4ApcG5bznx42NfBwa3TBc+NETTYJPS0blFImPVu1qvdQn5AciX137vYbqzwuqac
 2gM2m6Rdfpncw/3VRIePwXYwpNS/3fsa3V6UgzTFlDhrQCtP2XxKPhfru7pFN+te
 Vav1I46Q8pa7ko8dS3A3
 =P4O6
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15-part2' of git://people.freedesktop.org/~airlied/linux

Pull more drm updates from Dave Airlie:
 "Fixes/cleanups for rc1, non-desktop flags for VR

   - remove the MSM dt-bindings file Rob managed to push in the previous
     pull.

   - add a property/edid quirk to denote HMD devices, I had these
     hanging around for a few weeks and Keith had done some work on
     them, they are fairly self contained and small, and only affect
     people using HTC Vive VR headsets so far.

   - amdgpu, tegra, tilcdc, fsl fixes

   - some imx-drm cleanups I missed, these seemed pretty small, and no
     reason to hold off.

  I have one TTM regression fix (fixes bochs-vga in qemu) sitting
  locally awaiting review I'll probably send that in a separate pull
  request tomorrow"

* tag 'drm-for-v4.15-part2' of git://people.freedesktop.org/~airlied/linux: (33 commits)
  dt-bindings: remove file that was added accidentally
  drm/edid: quirk HTC vive headset as non-desktop. [v2]
  drm/fb: add support for not enabling fbcon on non-desktop displays [v2]
  drm: add connector info/property for non-desktop displays [v2]
  drm/amdgpu: fix rmmod KCQ disable failed error
  drm/amdgpu: fix kernel hang when starting VNC server
  drm/amdgpu: don't skip attributes when powerplay is enabled
  drm/amd/pp: fix typecast error in powerplay.
  drm/tilcdc: Remove obsolete "ti,tilcdc,slave" dts binding support
  drm/tegra: sor: Reimplement pad clock
  Revert "drm/radeon: dont switch vt on suspend"
  drm/amd/amdgpu: fix over-bound accessing in amdgpu_cs_wait_any_fence
  drm/amd/powerplay: fix unfreeze level smc message for smu7
  drm/amdgpu:fix memleak
  drm/amdgpu:fix memleak in takedown
  drm/amd/pp: fix dpm randomly failed on Vega10
  drm/amdgpu: set f_mapping on exported DMA-bufs
  drm/amdgpu: Properly allocate VM invalidate eng v2
  drm/fsl-dcu: enable IRQ before drm_atomic_helper_resume()
  drm/fsl-dcu: avoid disabling pixel clock twice on suspend
  ...
2017-11-23 21:04:56 -10:00
Roger He
eb174c77e2 drm/amd/amdgpu: fix over-bound accessing in amdgpu_cs_wait_any_fence
Fixes an oops in amdgpu_cs_wait_any_fence.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-17 15:00:52 -05:00
Linus Torvalds
e60e1ee606 main drm pull request for v4.15
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaCm8RAAoJEAx081l5xIa+zX0QAJSm31kCG3vdw2CNiRx25L3q
 3hcsEOgAjVJ9FQVGKFWjzb8TK35tSqtNx5kWIj0VGaIfBE5Bdg5SLLgKKUYas8rY
 4LaphqICq2uxu2BNa2tpiar/sHhAnuozwQ4czpVWXzlaISnb9yYzRl7gMuyUVGkx
 +Gih5VUhLmQC0HsRTLJ3vaZQoUsLAl2gAjKcWa1bx57j2S+iKOPfsLaq7VYo+y1I
 Njc+iSGqMhJzRLXVkxL2lQKaslp7R38Bbh5K4Kvyjkm4Aq7zErOF6irpOXKMcrGl
 mwnr89vf1G9thjikrBaXpKnuvdbWYveoN/ORMlTdCfxkFnChHLnm3bd7NJ49RXDN
 Hv/Iq9YYjmZ9GTatxnx7lWtmXnZXC5he1yn1JAuz/yt7/0b/Wx+Mu/wEpBXYNFTd
 1AZdD586i+AmPo3yDkqH9nBu8JC0W0AnS9VZma4LVvZOP2UfJmj5Im1CLHItbGDN
 FnUCkwyD/lJUUk+WgT+w/GOMJgmFHDiFFl4tFtYVVjrUirpCFVguSKG9xuv6tT8P
 8iRsoP7RrcmDN9ojN2SEHwcpsAv3HnKkDv+9+GIbWnrGsSbCPq8Qm+JDSvf4h22I
 K5lwNpJrcpSKI+q10L7w2xliTBwb98sJkWGA/rssomrdBOWteGZAyqFRYAVgQ+mJ
 x/nJurIqQYh2KQN9+uLG
 =xVV2
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.15.

  Core:
   - Atomic object lifetime fixes
   - Atomic iterator improvements
   - Sparse/smatch fixes
   - Legacy kms ioctls to be interruptible
   - EDID override improvements
   - fb/gem helper cleanups
   - Simple outreachy patches
   - Documentation improvements
   - Fix dma-buf rcu races
   - DRM mode object leasing for improving VR use cases.
   - vgaarb improvements for non-x86 platforms.

  New driver:
   - tve200: Faraday Technology TVE200 block.

     This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
     the StorLink SL3516 (later Cortina Systems CS3516) as well as the
     Grain Media GM8180.

  New bridges:
   - SiI9234 support

  New panels:
   - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
     LT089AC19000, Innolux AT043TN24

  i915:
   - Remove Coffeelake from alpha support
   - Cannonlake workarounds
   - Infoframe refactoring for DisplayPort
   - VBT updates
   - DisplayPort vswing/emph/buffer translation refactoring
   - CCS fixes
   - Restore GPU clock boost on missed vblanks
   - Scatter list updates for userptr allocations
   - Gen9+ transition watermarks
   - Display IPC (Isochronous Priority Control)
   - Private PAT management
   - GVT: improved error handling and pci config sanitizing
   - Execlist refactoring
   - Transparent Huge Page support
   - User defined priorities support
   - HuC/GuC firmware refactoring
   - DP MST fixes
   - eDP power sequencing fixes
   - Use RCU instead of stop_machine
   - PSR state tracking support
   - Eviction fixes
   - BDW DP aux channel timeout fixes
   - LSPCON fixes
   - Cannonlake PLL fixes

  amdgpu:
   - Per VM BO support
   - Powerplay cleanups
   - CI powerplay support
   - PASID mgr for kfd
   - SR-IOV fixes
   - initial GPU reset for vega10
   - Prime mmap support
   - TTM updates
   - Clock query interface for Raven
   - Fence to handle ioctl
   - UVD encode ring support on Polaris
   - Transparent huge page DMA support
   - Compute LRU pipe tweaks
   - BO flag to allow buffers to opt out of implicit sync
   - CTX priority setting API
   - VRAM lost infrastructure plumbing

  qxl:
   - fix flicker since atomic rework

  amdkfd:
   - Further improvements from internal AMD tree
   - Usermode events
   - Drop radeon support

  nouveau:
   - Pascal temperature sensor support
   - Improved BAR2 handling
   - MMU rework to support Pascal MMU

  exynos:
   - Improved HDMI/mixer support
   - HDMI audio interface support

  tegra:
   - Prep work for tegra186
   - Cleanup/fixes

  msm:
   - Preemption support for a5xx
   - Display fixes for 8x96 (snapdragon 820)
   - Async cursor plane fixes
   - FW loading rework
   - GPU debugging improvements

  vc4:
   - Prep for DSI panels
   - fix T-format tiling scanout
   - New madvise ioctl

  Rockchip:
   - LVDS support

  omapdrm:
   - omap4 HDMI CEC support

  etnaviv:
   - GPU performance counters groundwork

  sun4i:
   - refactor driver load + TCON backend
   - HDMI improvements
   - A31 support
   - Misc fixes

  udl:
   - Probe/EDID read fixes.

  tilcdc:
   - Misc fixes.

  pl111:
   - Support more variants

  adv7511:
   - Improve EDID handling.
   - HDMI CEC support

  sii8620:
   - Add remote control support"

* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
  drm/rockchip: analogix_dp: Use mutex rather than spinlock
  drm/mode_object: fix documentation for object lookups.
  drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
  drm/i915: Move init_clock_gating() back to where it was
  drm/i915: Prune the reservation shared fence array
  drm/i915: Idle the GPU before shinking everything
  drm/i915: Lock llist_del_first() vs llist_del_all()
  drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
  drm/i915: Disable lazy PPGTT page table optimization for vGPU
  drm/i915/execlists: Remove the priority "optimisation"
  drm/i915: Filter out spurious execlists context-switch interrupts
  drm/amdgpu: use irq-safe lock for kiq->ring_lock
  drm/amdgpu: bypass lru touch for KIQ ring submission
  drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
  drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
  drm/amd/powerplay: initialize a variable before using it
  drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
  drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
  drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
  drm/rockchip: add CONFIG_OF dependency for lvds
  ...
2017-11-15 20:42:10 -08:00
Mel Gorman
c6f92f9fbe mm: remove cold parameter for release_pages
All callers of release_pages claim the pages being released are cache
hot.  As no one cares about the hotness of pages being released to the
allocator, just ditch the parameter.

No performance impact is expected as the overhead is marginal.  The
parameter is removed simply because it is a bit stupid to have a useless
parameter copied everywhere.

Link: http://lkml.kernel.org/r/20171018075952.10627-7-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:06 -08:00
Emily Deng
cdadab89f8 drm/amdgpu: Fix null pointer issue in amdgpu_cs_wait_any_fence
The array[first] may be null when the fence has already been signaled.

BUG: SWDEV-136239

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-09 18:11:47 -05:00
Christian König
4b6b691ee3 drm/amdgpu: linear validate first then bind to GART
For VM emulation for old UVD/VCE we need to validate the BO with linear
VRAM flag set first and then eventually bind it to GART.

Validating with linear VRAM flag set can move the BO to GART making
UVD/VCE read/write from an unbound GART BO.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:21 -04:00
Monk Liu
c70b78a71e drm/amdgpu:fix duplicated setting job's vram_lost
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:17 -04:00
Christian König
c5795c555b drm/amdgpu: minor CS optimization
We only need to loop over all IBs for old UVD/VCE command stream patching.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:09 -04:00
Andrey Grodzovsky
26eedf6dae drm/amdgpu: Fix extra call to amdgpu_ctx_put.
In amdgpu_cs_parser_init() in case of error handling
amdgpu_ctx_put() is called without setting p->ctx to NULL after that,
later amdgpu_cs_parser_fini() also calls amdgpu_ctx_put() again and
mess up the reference count.

Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:06 -04:00
Christian König
7a0a48ddf6 drm/amdgpu: set -ECANCELED when dropping jobs
And return from the wait functions the fence error code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:05 -04:00
Christian König
e55f2b646d drm/amdgpu: move the VRAM lost counter per context
Instead of per device track the VRAM lost per context and return ECANCELED
instead of ENODEV.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:04 -04:00
Christian König
14e47f93c5 drm/amdgpu: keep copy of VRAM lost counter in job
Instead of reading the current counter from fpriv.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:03 -04:00
Christian König
396bcb41e0 drm/amdgpu: partial revert VRAM lost handling v2
Keep blocking the CS, but revert everything else. Mapping BOs and info IOCTL
are harmless and can still happen even when VRAM content ist lost.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:03 -04:00
Andrey Grodzovsky
0ae94444c0 drm/amdgpu: Move old fence waiting before reservation lock is aquired v2
Helps avoiding deadlock during GPU reset.
Added mutex to amdgpu_ctx to preserve order of fences on a ring.

v2:
Put waiting logic in a function in a seperate function in amdgpu_ctx.c

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:26:59 -04:00
Andrey Grodzovsky
ad864d2438 drm/amdgpu: Refactor amdgpu_cs_ib_vm_chunk and amdgpu_cs_ib_fill.
This enables old fence waiting before reservation lock is aquired
which in turn is part of a bigger solution to deadlock happening
when gpu reset with VRAM recovery accures during intensive rendering.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:26:58 -04:00
Andres Rodriguez
b2ff0e8ac4 drm/amdgpu: add framework for HW specific priority settings v9
Add an initial framework for changing the HW priorities of rings. The
framework allows requesting priority changes for the lifetime of an
amdgpu_job. After the job completes the priority will decay to the next
lowest priority for which a request is still valid.

A new ring function set_priority() can now be populated to take care of
the HW specific programming sequence for priority changes.

v2: set priority before emitting IB, and take a ref on amdgpu_job
v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_*
v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb
v5: use atomic for tracking job priorities instead of last_job
v6: rename amdgpu_ring_priority_[get/put]() and align parameters
v7: replace spinlocks with mutexes for KIQ compatibility
v8: raise ring priority during cs_ioctl, instead of job_run
v9: priority_get() before push_job()

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09 16:30:21 -04:00
Andres Rodriguez
177ae09b5d drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
Introduce a flag to signal that access to a BO will be synchronized
through an external mechanism.

Currently all buffers shared between contexts are subject to implicit
synchronization. However, this is only required for protocols that
currently don't support an explicit synchronization mechanism (DRI2/3).

This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC, so that
users can specify when it is safe to disable implicit sync.

v2: only disable explicit sync in amdgpu_cs_ioctl

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09 16:30:19 -04:00
Marek Olšák
7ca24cf2d2 drm/amdgpu: add FENCE_TO_HANDLE ioctl that returns syncobj or sync_file
for being able to convert an amdgpu fence into one of the handles.
Mesa will use this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-06 16:47:56 -04:00
Monk Liu
eb01abc7c4 drm/amdgpu:make ctx_add_fence interruptible(v2)
otherwise a gpu hang will make application couldn't be killed
under timedout=0 mode

v2:
Fix memoryleak job/job->s_fence issue
unlock mn
remove the ERROR msg after waiting being interrupted

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:13 -04:00
Christian König
4e55eb3879 drm/amdgpu: fix amdgpu_vm_handle_moved as well v2
There is no guarantee that the last BO_VA actually needed an update.

Additional to that all command submissions must wait for moved BOs to
be cleared, not just the first one.

v2: Don't overwrite any newer fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-13 12:10:06 -04:00
Dave Airlie
47e0cd6b1d Merge branch 'drm-next-4.14' of git://people.freedesktop.org/~agd5f/linux into drm-next
A few fixes for 4.14.  Nothing too major.
2017-09-13 14:34:11 +10:00
Christian König
3d138c14c4 drm/amdgpu: revert "fix deadlock of reservation between cs and gpu reset v2"
This reverts commit 10e709cb29.

The patch doesn't work at all:
1. The CS can still be blocked because of amdgpu_ctx_add_fence().
2. The order of submission isn't correct any more.
3. We could end up using freed up memory because we now drop the
   ctx reference to early.

This needs to be fixed cleanly by doing the context handling after the BO
handling, but this is a larger task just avoid the obvious crashes for now.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu monk.liu@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 22:16:31 -04:00
Christian König
d5884513a3 drm/amdgpu: fix VM sync with always valid BOs v2
All users of a VM must always wait for updates with always
valid BOs to be completed.

v2: remove debugging leftovers, rename struct member

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:30:39 -04:00
Christian König
aebc5e6f50 drm/amdgpu: rework amdgpu_cs_find_mapping
Use the VM instead of the BO list to find the BO for a virtual address.

This fixes UVD/VCE in physical mode with VM local BOs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:27:48 -04:00
Christian König
9cca0b8e5d drm/amdgpu: move amdgpu_cs_sysvm_access_required into find_mapping
When we need to find the mapping we need sysvm access anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:27:22 -04:00
Christian König
3fe89771cb drm/amdgpu: stop reserving the BO in the MMU callback v3
Instead take the callback lock during the final parts of CS.

This should solve the last remaining locking order problems with BO reservations.

v2: rebase, make dummy functions static inline
v3: add one more missing inline and comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:26:37 -04:00
Christian König
1b0c0f9dc5 drm/amdgpu: move userptr BOs to CPU domain during CS v2
Instead of moving them in the MMU notifier move them during CS.

v2: still mark pages as accessed/dirty

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:24:18 -04:00
Christian König
ca666a3c29 drm/amdgpu: stop using BO status for user pages
Instead use a counter to figure out if we need to set new pages or not.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:24:09 -04:00
Christian König
b72cf4fca2 drm/amdgpu: move taking mmap_sem into get_user_pages v2
This didn't helped as intended, just simplify the code.

v2: unlock mmap_sem in the error path as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:24:00 -04:00
Christian König
aa4ec7ce7e drm/amdgpu: revert "fix deadlock of reservation between cs and gpu reset v2"
This reverts commit 10e709cb29.

The patch doesn't work at all:
1. The CS can still be blocked because of amdgpu_ctx_add_fence().
2. The order of submission isn't correct any more.
3. We could end up using freed up memory because we now drop the
   ctx reference to early.

This needs to be fixed cleanly by doing the context handling after the BO
handling, but this is a larger task just avoid the obvious crashes for now.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu monk.liu@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:23:30 -04:00
Christian König
a216ab0995 drm/amdgpu: fix userptr put_page handling
Move calling put_page into the unpopulate callback. Otherwise we mess up the pages
reference count when it is unbound multiple times.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:23:10 -04:00
Monk Liu
a2138eaf97 drm/amdgpu: fix wait_any_fence
first is incorrect if hit NULL/signaled fence

Signed-off-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-12 14:22:43 -04:00
Christian König
73fb16e7eb drm/amdgpu: add support for per VM BOs v2
Per VM BOs are handled like VM PDs and PTs. They are always valid and don't
need to be specified in the BO lists.

v2: validate PDs/PTs first

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-31 13:46:26 -04:00
Christian König
3f3333f8a0 drm/amdgpu: track evicted page tables v2
Instead of validating all page tables when one was evicted,
track which one needs a validation.

v2: simplify amdgpu_vm_ready as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:28:04 -04:00
Christophe JAILLET
06f10a537e drm/amdgpu: check memory allocation failure
Check memory allocation failure and return -ENOMEM in such a case.

'num_post_dep_syncobjs' still has to be set to 0 before the test in order
to have it initialized if 'amdgpu_cs_parser_fini()' is called to free
resources.

The calling graph would be, in such a case!
   failure in amdgpu_cs_process_syncobj_out_dep()
      ---> error code returned by amdgpu_cs_dependencies()
         --> amdgpu_cs_parser_fini() is called

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:48 -04:00
Jason Ekstrand
afaf592378 drm/syncobj: Rename fence_get to find_fence
The function has far more in common with drm_syncobj_find than with
any in the get/put functions.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:17:37 +10:00
Christophe JAILLET
a1d6b1901a drm/amdgpu: check memory allocation failure
Check memory allocation failure and return -ENOMEM in such a case.

'num_post_dep_syncobjs' still has to be set to 0 before the test in order
to have it initialized if 'amdgpu_cs_parser_fini()' is called to free
resources.

The calling graph would be, in such a case!
   failure in amdgpu_cs_process_syncobj_out_dep()
      ---> error code returned by amdgpu_cs_dependencies()
         --> amdgpu_cs_parser_fini() is called

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-24 14:27:43 -04:00
Christian König
27c7b9aeec drm/amdgpu: rename VM invalidated to moved
That better describes what happens here with the BO.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-17 15:46:08 -04:00
Christian König
ec681545af drm/amdgpu: separate bo_va structure
Split that into vm_bo_base and bo_va to allow other uses as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-17 15:46:07 -04:00
Christian König
0f4b3c6862 drm/amdgpu: cleanup static CSA handling
Move the CSA bo_va from the VM to the fpriv structure.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-17 15:46:05 -04:00
Christian König
3c848bb38a drm/amdgpu: move vram usage tracking into the vram manager v2
Looks like a better place for this.

v2: use atomic64_t members instead

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-17 15:46:03 -04:00
Christian König
b636922553 drm/amdgpu: only move VM BOs in the LRU during validation v2
This should save us a bunch of command submission overhead.

v2: move the LRU move to the right place to avoid the move for the root BO
    and handle the shadow BOs as well. This turned out to be a bug fix because
    the move needs to happen before the kmap.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-17 15:45:57 -04:00
Kent Russell
6d7d9c5aa2 drm/amdgpu: Fix preferred typo
Change "prefered" to "preferred"

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-15 14:46:16 -04:00
Cihangir Akturk
f62facc2eb drm/amdgpu: switch to drm_*{get,put} helpers
drm_*_reference() and drm_*_unreference() functions are just
compatibility alias for drm_*_get() and drm_*_put() and should not be
used by new code. So convert all users of compatibility functions to use
the new APIs.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Cihangir Akturk <cakturk@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-15 14:46:12 -04:00
Christian König
7ecc245a8c drm/amdgpu: consistent use u64_to_user_ptr
Instead of open coding the conversion from u64 to pointers.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-15 14:45:53 -04:00
John Brooks
00f06b246a drm/amdgpu: Throttle visible VRAM moves separately
The BO move throttling code is designed to allow VRAM to fill quickly if it
is relatively empty. However, this does not take into account situations
where the visible VRAM is smaller than total VRAM, and total VRAM may not
be close to full but the visible VRAM segment is under pressure. In such
situations, visible VRAM would experience unrestricted swapping and
performance would drop.

Add a separate counter specifically for moves involving visible VRAM, and
check it before moving BOs there.

v2: Only perform calculations for separate counter if visible VRAM is
    smaller than total VRAM. (Michel Dänzer)
v3: [Michel Dänzer]
* Use BO's location rather than the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
  flag to determine whether to account a move for visible VRAM in most
  cases.
* Use a single

	if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {

  block in amdgpu_cs_get_threshold_for_moves.

Fixes: 95844d20ae (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2))
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-14 11:06:33 -04:00
Chris Wilson
00fc2c26bc drm: Remove unused drm_file parameter to drm_syncobj_replace_fence()
the drm_file parameter is unused, so remove it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 15:53:00 +10:00
Alex Xie
9211c784c6 drm/amdgpu: Make amdgpu_cs_parser_init static (v2)
The function is called only once inside the .c file.
v2: update the commit message (Michel)

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-29 12:43:51 -04:00
Alex Xie
9f69c0fd4d drm/amdgpu/cs: fix a typo in a comment
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-29 12:43:50 -04:00
Dave Airlie
660e855813 amdgpu: use drm sync objects for shared semaphores (v6)
This creates a new command submission chunk for amdgpu
to add in and out sync objects around the submission.

Sync objects are managed via the drm syncobj ioctls.

The command submission interface is enhanced with two new
chunks, one for syncobj pre submission dependencies,
and one for post submission sync obj signalling,
and just takes a list of handles for each.

This is based on work originally done by David Zhou at AMD,
with input from Christian Konig on what things should look like.

In theory VkFences could be backed with sync objects and
just get passed into the cs as syncobj handles as well.

NOTE: this interface addition needs a version bump to expose
it to userspace.

TODO: update to dep_sync when rebasing onto amdgpu master.
(with this - r-b from Christian)

v1.1: keep file reference on import.
v2: move to using syncobjs
v2.1: change some APIs to just use p pointer.
v3: make more robust against CS failures, we now add the
wait sems but only remove them once the CS job has been
submitted.
v4: rewrite names of API and base on new syncobj code.
v5: move post deps earlier, rename some apis
v6: lookup post deps earlier, and just replace fences
in post deps stage (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16 16:58:32 -04:00
Dave Airlie
6f0308ebc1 amdgpu/cs: split out fence dependency checking (v2)
This just splits out the fence depenency checking into it's
own function to make it easier to add semaphore dependencies.

v2: rebase onto other changes.

v1-Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16 16:58:31 -04:00
Dave Airlie
04d4fb5fa6 Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-next
New radeon and amdgpu features for 4.13:
- Lots of Vega10 bug fixes
- Preliminary Raven support
- KIQ support for compute rings
- MEC queue management rework from Andres
- Audio support for DCE6
- SR-IOV improvements
- Improved module parameters for controlling radeon vs amdgpu support
  for SI and CIK
- Bug fixes
- General code cleanups

[airlied: dropped drmP.h header from one file was needed and build broke]

* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: (362 commits)
  drm/amdgpu: Fix compiler warnings
  drm/amdgpu: vm_update_ptes remove code duplication
  drm/amd/amdgpu: Port VCN over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
  drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
  drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
  drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
  drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
  drm/amd/amdgpu: Add offset variant to SOC15 macros
  drm/amd/powerplay: add avfs control for Vega10
  drm/amdgpu: add virtual display support for raven
  drm/amdgpu/gfx9: fix compute ring doorbell index
  drm/amd/amdgpu: Rename KIQ ring to avoid spaces
  drm/amd/amdgpu: gfx9 tidy ups (v2)
  drm/amdgpu: add contiguous flag in ucode bo create
  drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
  drm/amdgpu: export test ib debugfs interface
  ...
2017-06-16 09:56:53 +10:00
Alex Xie
eb0f0373e5 drm/amdgpu: fix a typo in comment
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:30:20 -04:00
Andres Rodriguez
effd924d2f drm/amdgpu: untie user ring ids from kernel ring ids v6
Add amdgpu_queue_mgr, a mechanism that allows disjointing usermode's
ring ids from the kernel's ring ids.

The queue manager maintains a per-file descriptor map of user ring ids
to amdgpu_ring pointers. Once a map is created it is permanent (this is
required to maintain FIFO execution guarantees for a context's ring).

Different queue map policies can be configured for each HW IP.
Currently all HW IPs use the identity mapper, i.e. kernel ring id is
equal to the user ring id.

The purpose of this mechanism is to distribute the load across multiple
queues more effectively for HW IPs that support multiple rings.
Userspace clients are unable to check whether a specific resource is in
use by a different client. Therefore, it is up to the kernel driver to
make the optimal choice.

v2: remove amdgpu_queue_mapper_funcs
v3: made amdgpu_queue_mgr per context instead of per-fd
v4: add context_put on error paths
v5: rebase and include new IPs UVD_ENC & VCN_*
v6: drop unused amdgpu_ring_is_valid_index (Alex)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:49:01 -04:00
Chunming Zhou
f1892138ab drm/amdgpu: return -ENODEV to user space when vram is lost v2
below ioctl will return -ENODEV:
amdgpu_cs_ioctl
amdgpu_cs_wait_ioctl
amdgpu_cs_wait_fences_ioctl
amdgpu_gem_va_ioctl
amdgpu_info_ioctl

v2: only for map and replace cases in amdgpu_gem_va_ioctl

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24 18:11:52 -04:00
Leo Liu
f93aa00c0b drm/amdgpu: get cs support for AMDGPU_HW_IP_VCN_ENC
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24 17:41:45 -04:00
Leo Liu
fc739f82c5 drm/amdgpu: get cs support of AMDGPU_HW_IP_VCN_DEC
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24 17:41:36 -04:00
Michal Hocko
2098105ec6 drm: drop drm_[cm]alloc* helpers
Now that drm_[cm]alloc* helpers are simple one line wrappers around
kvmalloc_array and drm_free_large is just kvfree alias we can drop
them and replace by their native forms.

This shouldn't introduce any functional change.

Changes since v1
- fix typo in drivers/gpu//drm/etnaviv/etnaviv_gem.c - noticed by 0day
  build robot

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michal Hocko <mhocko@suse.com>drm: drop drm_[cm]alloc* helpers
[danvet: Fixup vgem which grew another user very recently.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517122312.GK18247@dhcp22.suse.cz
2017-05-18 17:22:39 +02:00
Chunming Zhou
10e709cb29 drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
the case could happen when gpu reset:
1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation.
2. gpu_reset routine will also need pd reservation to restore page table from their shadow.
3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation.

v2: handle amdgpu_cs_submit error path.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-04-28 17:33:14 -04:00
Chunming Zhou
32df87dff0 drm/amdgpu: fix fence memory leak in wait_all_fence V2
V2: remove **array method, directly fence_put after fence wait.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <chrstian.koenig@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-04-07 15:15:45 -04:00
Alex Xie
f4e7c7c1b4 drm/amdgpu: use uintptr_t instead of unsigned long to store pointer
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-04-06 13:28:08 -04:00
Christian König
a9f87f6452 drm/amdgpu: use a 64bit interval tree for VM management v2
This only makes a difference for 32-bit systems. The idea is to have a
fixed virtual address space size with 4-level page tables and to
minimize differences between 32 and 64-bit systems.

v2: Update commit message.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-04-04 13:40:32 -04:00
Harry Wentland
e51a3226d4 drm/amdgpu: Couple small warning fixes
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:55:48 -04:00
Monk Liu
e9d672b291 drm/amdgpu:changes in gfx DMAframe scheme (v2)
1) Adapt to vulkan:
Now use double SWITCH BUFFER to replace the 128 nops w/a,
because when vulkan introduced, umd can insert 7 ~ 16 IBs
per submit which makes 256 DW size cannot hold the whole
DMAframe (if we still insert those 128 nops), CP team suggests
use double SWITCH_BUFFERs, instead of tricky 128 NOPs w/a.

2) To fix the CE VM fault issue when MCBP introduced:
Need one more COND_EXEC wrapping IB part (original one us
for VM switch part).

this change can fix vm fault issue caused by below scenario
without this change:

>CE passed original COND_EXEC (no MCBP issued this moment),
 proceed as normal.

>DE catch up to this COND_EXEC, but this time MCBP issued,
 thus DE treats all following packages as NOP. The following
 VM switch packages now looks just as NOP to DE, so DE
 dosen't do VM flush at all.

>Now CE proceeds to the first IBc, and triggers VM fault,
 because DE didn't do VM flush for this DMAframe.

3) change estimated alloc size for gfx9.
with new DMAframe scheme, we need modify emit_frame_size
for gfx9

4) No need to insert 128 nops after gfx8 vm flush anymore
because there was double SWITCH_BUFFER append to vm flush,
and for gfx7 we already use double SWITCH_BUFFER following
after vm_flush so no change needed for it.

5) Change emit_frame_size for gfx8

v2: squash in BUG removal from Monk

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:55:42 -04:00
Monk Liu
65333e4429 drm/amdgpu:fix the check in cs_ib_fill for SRIOV
1,the check is only appliable for SRIOV GFX engine.
2,use chunk_ib instead of ib.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Ken Wang <Qingqing.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:55:34 -04:00
Monk Liu
9a1b3af10d drm/amdgpu:protect cs submit
to prevent submit two or more IBs with PREEMPT flags.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:55:33 -04:00
Monk Liu
2a9ceb8daa drm/amdgpu:fix cs_ib_fill
should use chunk_ib instead of ib, otherwise the logic
is incorrect.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Ken Wang <Qingqing.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:55:32 -04:00
Christian König
194d216113 drm/amdgpu: handle multi level PD updates V2
Update all levels of the page directory.

V2:
a. sub level pdes always are written to incorrect place.
b. sub levels need to update regardless of parent updates.

Signed-off-by: Christian König <christian.koenig@amd.com> (V1)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (V1)
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> (V2)
Acked-by: Alex Deucher <alexander.deucher@amd.com> (V2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:55:27 -04:00
Christian König
67003a15b7 drm/amdgpu: generalize page table level
No functional change, but the base for multi level page tables.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:55:24 -04:00
Christian König
a24960f321 drm/amdgpu: rename page_directory_fence to last_dir_update
Decribes better what this is used for.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:55:22 -04:00
Nicolai Hähnle
f34678187a drm/amdgpu: add optional fence out-parameter to amdgpu_vm_clear_freed
We will add the fence to freed buffer objects in a later commit, to ensure
that the underlying memory can only be re-used after all references in
page tables have been cleared.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:54:15 -04:00
Leo Liu
166c8178fb drm/amdgpu: get cs support of AMDGPU_HW_IP_UVD_ENC
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:53:48 -04:00
Junwei Zhang
b85891bd6d drm/amdgpu: IOCTL interface for PRT support v4
Till GFX8 we can only enable PRT support globally, but with the next hardware
generation we can do this on a per page basis.

Keep the interface consistent by adding PRT mappings and enable
support globally on current hardware when the first mapping is made.

v2: disable PRT support delayed and on all error paths
v3: PRT and other permissions are mutal exclusive,
    PRT mappings don't need a BO.
v4: update PRT mappings durign CS as well, make va_flags 64bit

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:52:56 -04:00
Dave Airlie
607523d19c drm/amdgpu: fix parser init error path to avoid crash in parser fini
If we don't reset the chunk info in the error path, the subsequent
fini path will double free.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-10 14:27:14 -05:00
Dave Airlie
94000cc329 Linux 4.10-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJYoM2fAAoJEHm+PkMAQRiGr9MH/izEAMri7rJ0QMc3ejt+WmD0
 8pkZw3+MVn71z6cIEgpzk4QkEWJd5rfhkETCeCp7qQ9V6cDW1FDE9+0OmPjiphDt
 nnzKs7t7skEBwH5Mq5xygmIfkv+Z0QGHZ20gfQWY3F56Uxo+ARF88OBHBLKhqx3v
 98C7YbMFLKBslKClA78NUEIdx0UfBaRqerlERx0Lfl9aoOrbBS6WI3iuREiylpih
 9o7HTrwaGKkU4Kd6NdgJP2EyWPsd1LGalxBBjeDSpm5uokX6ALTdNXDZqcQscHjE
 RmTqJTGRdhSThXOpNnvUJvk9L442yuNRrVme/IqLpxMdHPyjaXR3FGSIDb2SfjY=
 =VMy8
 -----END PGP SIGNATURE-----

Merge tag 'v4.10-rc8' into drm-next

Linux 4.10-rc8

Backmerge Linus rc8 to fix some conflicts, but also
to avoid pulling it in via a fixes pull from someone.
2017-02-23 12:10:12 +10:00
Samuel Pitoiset
fad061270a drm/amdgpu: report the number of bytes moved at buffer creation
Like ttm_bo_validate(), ttm_bo_init() might need to move BO and
the number of bytes moved by TTM should be reported. This can help
the throttle buffer migration mechanism to make a better decision.

v2: fix computation

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-02-09 11:29:44 -05:00
Alex Deucher
034041f334 drm/amdgpu: use the num_rings variable for checking vce rings
Difference families may have different numbers of rings. Use
the variable rather than a hardcoded number.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-01-27 11:13:29 -05:00
Monk Liu
2493664f05 drm/amdgpu:invoke CSA functions (v2)
Make sure the CSA is mapped.

v2: agd: rebase.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-01-27 11:13:20 -05:00