linux

Author	SHA1	Message	Date
Alex Deucher	b249e18df1	drm/amdgpu: set sched_hw_submission higher for KIQ (v3) KIQ doesn't really use the GPU scheduler. The base drivers generally use the KIQ ring directly rather than submitting IBs. However, amdgpu_sched_hw_submission (which defaults to 2) limits the number of outstanding fences to 2. KFD uses the KIQ for TLB flushes and the 2 fence limit hurts performance when there are several KFD processes running. v2: move some expressions to one line change KIQ sched_hw_submission to at least 16 v3: bump to 256 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-08-29 15:27:44 -04:00
Trigger Huang	41cc07cff2	drm/amdgpu: don't finish the ring if not initialized If a ring is not initialized, it also should not be finished. For example, in Vega10's SR-IOV environment, UVD's decode ring is not initialized, but will be finnished in amdgpu_uvd_sw_fini, because UVD driver put all the uvd decode ring's finish operation into amdgpu_uvd_sw_fini function, while not uvd_vXXX_0_sw_fini. This will lead to amdgpu module unloading failure. Signed-off-by: Trigger Huang <trigger.huang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-08-15 14:46:17 -04:00
Alex Deucher	97407b63ea	drm/amdgpu: use 256 bit buffers for all wb allocations (v2) May waste a bit of memory, but simplifies the interface significantly. v2: convert internal accounting to use 256bit slots Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-08-15 14:46:08 -04:00
Alex Deucher	eacf3e149e	drm/amdgpu: make wb 256bit function names consistent Use a lower case b to be consistent with the other wb functions. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-08-15 14:45:59 -04:00
Monk Liu	0915fdbc69	drm/amdgpu:fix gfx fence allocate size 1, for sriov, we need 8dw for the gfx fence due to CP behaviour 2, cleanup wrong logic in wptr/rptr wb alloc and free Change-Id: Ifbfed17a4621dae57244942ffac7de1743de0294 Signed-off-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-07-25 16:29:26 -04:00
Alex Xie	e59c020598	drm/amdgpu: Move compute vm bug logic to amdgpu_vm.c In review, Christian would like to keep the logic inside amdgpu_vm.c with a cost of slightly slower. The loop is still optimized out with this patch. v2: remove the if statement. Now it is not slower. Signed-off-by: Alex Xie <AlexBin.Xie@amd.com> Reviewed-by: Christian König <christian.koeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-06-01 16:00:20 -04:00
Andres Rodriguez	6065343a11	drm/amdgpu: guarantee bijective mapping of ring ids for LRU v3 Depending on usage patterns, the current LRU policy may create a non-injective mapping between userspace ring ids and kernel rings. This behaviour is undesired as apps that attempt to fill all HW blocks would be unable to reach some of them. This change forces the LRU policy to create bijective mappings only. v2: compress ring_blacklist v3: simplify amdgpu_ring_is_blacklisted() logic Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-31 16:49:03 -04:00
Andres Rodriguez	795f2813e6	drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4 Use an LRU policy to map usermode rings to HW compute queues. Most compute clients use one queue, and usually the first queue available. This results in poor pipe/queue work distribution when multiple compute apps are running. In most cases pipe 0 queue 0 is the only queue that gets used. In order to better distribute work across multiple HW queues, we adopt a policy to map the usermode ring ids to the LRU HW queue. This fixes a large majority of multi-app compute workloads sharing the same HW queue, even though 7 other queues are available. v2: use ring->funcs->type instead of ring->hw_ip v3: remove amdgpu_queue_mapper_funcs v4: change ring_lru_list_lock to spinlock, grab only once in lru_get() Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-31 16:49:02 -04:00
Alex Xie	dd684d313e	drm/amdgpu: Optimize a function called by every IB sheduling Move several if statements and a loop statment from run time to initialization time. Signed-off-by: Alex Xie <AlexBin.Xie@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-31 14:16:38 -04:00
Tom St Denis	ec63982e90	drm/amd/amdgpu: Correct ring wptr address in debugfs (v2) On gfx9 hardware the value is not wrapped and is a 64-bit value. So we reduce it modulo the ring size. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> (v2) use buf_mask instead of computing on the fly Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-03-29 23:55:53 -04:00
Monk Liu	e09706f46e	drm/amdgpu:fix ring init sequence ring->buf_mask need be set prior to ring_clear_ring invoke and fix ring_clear_ring as well which should use buf_mask instead of ptr_mask Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-03-29 23:55:38 -04:00
Ken Wang	7014285ade	drm/amdgpu: add 64bit wb functions Newer asics need 64 bit writeback slots. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-03-29 23:53:35 -04:00
Ken Wang	536fbf946c	drm/amdgpu: change wptr to 64 bits (v2) Newer asics need 64 bit wptrs. If the wptr is now smaller than the rptr that doesn't indicate a wrap-around anymore. v2: integrate Christian's comments. Signed-off-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-03-29 23:53:35 -04:00
Monk Liu	f6bd79424c	drm/amdgpu:use clear_ring to clr RB In resume routine, we need clr RB prior to the ring test of engine, otherwise some engine hang duplicated during GPU reset. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-03-29 23:53:14 -04:00
Monk Liu	714fbf8039	drm/amdgpu:set cond_exec polling value to 1 in ring_init no need to set it per ib_schedule(), hw won't override this polling address. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-01-27 11:13:37 -05:00
Linus Torvalds	9a19a6db37	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: - more ->d_init() stuff (work.dcache) - pathname resolution cleanups (work.namei) - a few missing iov_iter primitives - copy_from_iter_full() and friends. Either copy the full requested amount, advance the iterator and return true, or fail, return false and do _not_ advance the iterator. Quite a few open-coded callers converted (and became more readable and harder to fuck up that way) (work.iov_iter) - several assorted patches, the big one being logfs removal * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: logfs: remove from tree vfs: fix put_compat_statfs64() does not handle errors namei: fold should_follow_link() with the step into not-followed link namei: pass both WALK_GET and WALK_MORE to should_follow_link() namei: invert WALK_PUT logics namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link() namei: saner calling conventions for mountpoint_last() namei.c: get rid of user_path_parent() switch getfrag callbacks to ..._full() primitives make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success [iov_iter] new primitives - copy_from_iter_full() and friends don't open-code file_inode() ceph: switch to use of ->d_init() ceph: unify dentry_operations instances lustre: switch to use of ->d_init()	2016-12-16 10:24:44 -08:00
Al Viro	450630975d	don't open-code file_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-04 18:29:28 -05:00
Christian König	7988714237	drm/amdgpu: move align_mask and nop into ring funcs as well (v2) They are constant as well. v2: update uvd and vce phys ring structures as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-10-25 14:38:38 -04:00
Christian König	21cd942e5c	drm/amdgpu: move the ring type into the funcs structure (v2) It's constant, so it doesn't make to much sense to keep it with the variable data. v2: update vce and uvd phys mode ring structures as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-10-25 14:38:37 -04:00
Dan Carpenter	eeb2fa0c97	drm/amdgpu: potential NULL dereference in debugfs code debugfs_create_file() returns NULL on error, it only returns error pointers if debugfs isn't enabled in the config and we checked for that earlier so it can't happen. Fixes: `4f4824b556` ('drm/amd/amdgpu: Convert ring debugfs entries to binary') Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-10-13 18:25:50 -04:00
Grazvydas Ignotas	d8907643cc	drm/amdgpu: clear ring pointer in amdgpu_device on teardown This is in symmetry to setup done in amdgpu_ring_init. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-09-27 13:00:51 -04:00
Junwei Zhang	8640faed5a	drm/amdgpu: free the BO in kernel by helper amdgpu_bo_free_kernel() Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-09-14 15:10:27 -04:00
Christian König	37ac235bf8	drm/amdgpu: use amdgpu_bo_create_kernel in amdgpu_ring.c Saves us quite a bunch of code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-08-08 11:32:21 -04:00
Christian König	f06505b8d2	drm/amdgpu: add begin/end_use ring callbacks For manual UVD/VCE power and clock gating. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-29 14:37:02 -04:00
Christian König	7c23ace2db	drm/amdgpu: remove fence_lock Was never used as far as I can see. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-29 14:37:01 -04:00
Alex Deucher	33b7ed0122	drm/amdgpu: remove more of the ring backup code Not used anymore. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 15:06:19 -04:00
Chunming Zhou	40019dc4a3	drm/amdgpu: clean up ring_backup code, no need more Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 15:06:18 -04:00
Monk Liu	a909c6bd9f	drm/amdgpu: fix ring debugfs bug debugfs file added but not released after driver unloaded Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:51:12 -04:00
Tom St Denis	c71dbd93eb	drm/amd/amdgpu: ring debugfs is read in increments of 4 bytes If a user tries to read a non-multiple of 4 bytes it would have read until the end of the ring potentially crashing the user task. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:51:11 -04:00
Tom St Denis	4f4824b556	drm/amd/amdgpu: Convert ring debugfs entries to binary They now emit ring data in binary which will be read/written by the userspace tool umr shortly. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:51:10 -04:00
Monk Liu	cc7d8c7979	drm/amdgpu: clear RB at ring init This help fix reloading driver hang issue of SDMA ring. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:50:37 -04:00
Monk Liu	67a6a504af	drm/amdgpu: fix missing free wb for cond_exec Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-06-09 10:49:05 -04:00
Christian König	eb43096900	drm/amdgpu: fix the coding style in amdgpu_ring.c No functional change. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-04 20:21:12 -04:00
Christian König	771c8ec177	drm/amdgpu: use the ring name for debugfs (v2) Instead of hard coding just another name in the ring code. v2: squash in Tom's rebase fix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-04 20:21:03 -04:00
Christian König	a3f1cf355e	drm/amdgpu: use max_dw in ring_init Instead of specifying the total ring size calculate that from the maximum number of dw a submission can have and the number of concurrent submissions. This fixes UVD with 8 concurrent submissions or more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-04 20:20:50 -04:00
Nils Wallménius	06ab6832ac	drm/amdgpu: Mark all instances of struct drm_info_list as const All these are compile time constand and the drm_debugfs_create/remove_files functions take a const pointer argument. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-04 20:20:10 -04:00
Monk Liu	128cff1af6	drm/amdgpu: support cond exec This adds the groundwork for conditional execution on SDMA which is necessary for preemption. Signed-off-by: Monk Liu <monk.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-02 15:09:17 -04:00
Christian König	e6151a08bb	drm/amdgpu: add number of hardware submissions to amdgpu_fence_driver_init_ring Make this a parameter instead of using the global variable directly. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2016-03-16 17:59:12 -04:00
Christian König	f104fbcb8f	drm/amdgpu: remove amdgpu_ring_from_fence Not used any more. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-14 13:43:36 -04:00
Christian König	9e5d53094c	drm/amdgpu: make pad_ib a ring function v3 The padding depends on the firmware version and we need that for BO moves as well, not only for VM updates. v2: new approach of making pad_ib a ring function v3: fix typo in macro name Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:17:20 -05:00
Christian König	c7e6be2303	drm/amdgpu: remove rptr checking With the scheduler enabled we don't need that any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:16:59 -05:00
Christian König	a27de35caa	drm/amdgpu: remove the ring lock v2 It's not needed any more because all access goes through the scheduler now. v2: Update commit message. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:16:58 -05:00
Alex Deucher	ea5e4c8731	drm/amdgpu: remove some more semaphore leftovers No longer needed since semaphores were removed. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <David1.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:16:51 -05:00
Christian König	41f2d99056	drm/amdgpu: fix next_rptr handling for debugfs That somehow got lost. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-22 14:44:08 -05:00
Christian König	8120b61fdf	drm/amdgpu: move ring_from_fence to common code Going to need that elsewhere as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-30 01:54:07 -04:00
Christian König	b7e4dad3e1	drm/amdgpu: remove old lockup detection infrastructure It didn't worked to well anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>	2015-10-21 11:35:12 -04:00
Alex Deucher	c113ea1c4f	drm/amdgpu: rework sdma structures Rework the sdma structures in the driver to consolidate all of the sdma info into a single structure and allow for asics that may have different numbers of sdma instances. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-14 16:16:36 -04:00
Christian König	4f839a243d	drm/amdgpu: more scheduler cleanups v2 Embed the scheduler into the ring structure instead of allocating it. Use the ring name directly instead of the id. v2: rebased, whitespace cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com>	2015-09-23 17:23:39 -04:00
Christian König	5ec92a7692	drm/amdgpu: cleanup fence queue init v2 Move the fence related stuff into amdgpu_fence.c v2: rework commit message, cause this is actually not a bug Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>	2015-09-23 17:23:38 -04:00
Christian König	72d7668b5b	drm/amdgpu: export reservation_object from dmabuf to ttm (v2) Adds an extra argument to amdgpu_bo_create, which is only used in amdgpu_prime.c. Port of radeon commit `831b6966a6`. v2: fix up kfd. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-23 17:23:34 -04:00

1 2

54 Commits