linux

Author	SHA1	Message	Date
Alex Deucher	a069a9eb73	drm/amdgpu: fix a warning in amdgpu_ras.c (v2) drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_fs_init’: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1284:2: warning: ignoring return value of ‘sysfs_create_group’, declared with attribute warn_unused_result [-Wunused-result] 1284 \| sysfs_create_group(&adev->dev->kobj, &group); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ v2: just print an error for sysfs group creation failure Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 17:03:22 -04:00
Guchun Chen	c3d4d45db2	drm/amdgpu: clean up ras sysfs creation (v2) Merge ras sysfs creation together by calling sysfs_create_group once, as sysfs_update_group may not work properly as expected. v2: improve commit message Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 17:03:22 -04:00
Tiecheng Zhou	b602ca5f31	drm/amdgpu: stop data_exchange work thread before reset In FLR routine, init_data_exchange is called at reset_sriov while fini_data_exchange is not. This will duplicating work thread. So call fini_data_exchange before reset for SRIOV Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com> Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 17:03:22 -04:00
Bokun Zhang	519b8b76f0	drm/amdgpu: Implement new guest side VF2PF message transaction (v2) - Refactor the driver code to use amdgpu_virt_read_pf2vf_data and amdgpu_virt_write_vf2pf_data instead of writing all code in one function (which is the old amdgpu_virt_init_data_exchange) - Adding a new transaction method for VF2PF message between host and guest driver. Guest side will periodically update VF2PF message in the framebuffer. In the new header, we include guest ucode information, guest framebuffer usage, and engine usage - Clean up the old macros since they will cause compile error if the new transaction method is used v2: squash in build fix Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 17:03:22 -04:00
Bokun Zhang	1721bc1b2a	drm/amdgpu: Update VF2PF interface - Update guest side VF2PF interface header file Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 16:55:44 -04:00
John Clements	265c280a48	drm/amdgpu: disable sienna chichlid UMC RAS disable UMC RAS in lieu of stability issues on certain sku Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 16:55:26 -04:00
Alex Deucher	d5cc02d97a	drm/amdgpu: add an auto setting to the noretry parameter This allows us to set different defaults on a per asic basis. This way we can enable noretry on dGPUs where it can increase performance in certain cases and disable it on chips where it can be problematic. For now the default is 0 for all asics, but we may want to try and enable it again for newer dGPUs. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 16:55:21 -04:00
Alex Deucher	9b498efae2	drm/amdgpu: store noretry parameter per driver instance This will allow us to have different defaults per asic in a future patch. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 16:55:16 -04:00
Emily.Deng	884dcf3c87	drm/amdgpu: Remove some useless code Signed-off-by: Emily.Deng <Emily.Deng@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 16:54:16 -04:00
Jingwen Chen	162b786f0f	drm/amd: Skip not used microcode loading in SRIOV smc, sdma, sos, ta and asd fw is not used in SRIOV. Skip them to accelerate sw_init for navi12. v2: skip above fw in SRIOV for vega10 and sienna_cichlid v3: directly skip psp fw loading in SRIOV Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 16:54:00 -04:00
Jiansong Chen	84d244a364	drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc. Remove gpu_info fw support for sienna_cichlid etc., since the information can be retrieved from discovery binary. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 16:53:17 -04:00
Thomas Zimmermann	246cb7e49a	drm/amdgpu: Introduce GEM object functions GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in amdgpu. The only exception is gem_prime_mmap, which is non-trivial to convert. v3: * remove amdgpu_object.c from patch (Christian) v2: * move object-function instance to amdgpu_gem.c (Christian) * set callbacks in amdgpu_gem_object_create() (Christian) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200923102159.24084-2-tzimmermann@suse.de	2020-09-25 09:19:42 +02:00
Dave Airlie	3a08446b31	drm/amdgpu/ttm: handle tt moves properly. The core move code currently handles use_tt moves, for amdgpu this was being handled also in the driver, but not using the same paths. If moving between TT/SYSTEM (all the use_tt paths on amdgpu) use the core move function. Eventually the core will be flipped over to calling the driver. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200924051845.397177-4-airlied@gmail.com	2020-09-25 05:48:00 +10:00
Christian König	4671078eb8	drm/amdgpu: switch over to the new pin interface Stop using TTM_PL_FLAG_NO_EVICT. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/391617/?series=81973&rev=1	2020-09-24 16:16:50 +02:00
Dave Airlie	6ea6be7708	drm-misc-next for 5.10: UAPI Changes: Cross-subsystem Changes: Core Changes: - dev: More devm_drm convertions and removal of drm_dev_init Driver Changes: - i915: selftests improvements - panfrost: support for Amlogic SoC - vc4: one fix -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX2jGxQAKCRDj7w1vZxhR xR3DAQCiZOnaxVcY49iG4343Z1aHHaIEShbnB0bDdaWstn7kiQD/UXBXUoOSFoFQ FkTsW31JsdXNnWP5e6/eJd2Lb6waVAA= =VlsU -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2020-09-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.10: UAPI Changes: Cross-subsystem Changes: - virtio: Merged a PR for patches that will affect drm/virtio Core Changes: - dev: More devm_drm convertions and removal of drm_dev_init - atomic: Split out drm_atomic_helper_calc_timestamping_constants of drm_atomic_helper_update_legacy_modeset_state - ttm: More rework Driver Changes: - i915: selftests improvements - panfrost: support for Amlogic SoC - vc4: one fix - tree-wide: conversions to devm_drm_dev_alloc, - ast: simplifications of the atomic modesetting code - panfrost: multiple fixes - vc4: multiple fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200921152956.2gxnsdgxmwhvjyut@gilmour.lan	2020-09-23 09:52:24 +10:00
Bernard Zhao	f349f772b0	drm/amd: fix typoes in comments Change the comment typo: "programm" -> "program". Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-22 17:37:38 -04:00
Stanley.Yang	78f0aef11f	drm/amdgpu: fix hdp register access error mmHDP_READ_CACHE_INVALIDATE register is in HDP not in NBIO Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-22 17:37:38 -04:00
Stanley.Yang	3f975d0f71	drm/amdgpu: update athub interrupt harvesting handle GCEA/MMHUB EA error should not result to DF freeze, this is fixed in next generation, but for some reasons the GCEA/MMHUB EA error will result to DF freeze in previous generation, diver should avoid to indicate GCEA/MMHUB EA error as hw fatal error in kernel message by read GCEA/MMHUB err status registers. Changed from V1: make query_ras_error_status function more general make read mmhub er status register more friendly Changed from V2: move ras error status query function into do_recovery workqueue Changed from V3: remove useless code from V2, print GCEA error status instance number Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-22 17:37:38 -04:00
Liu Shixin	c24a3c0505	drm/amdgpu/gmc9: simplify the return expression of gmc_v9_0_suspend Simplify the return expression. Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-22 17:37:37 -04:00
Qinglang Miao	da51e50d45	drm/amdgpu: simplify the return expression Simplify the return expression. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-22 17:37:37 -04:00
Qinglang Miao	d94c8250c6	drm/amdgpu/mes: simplify the return expression of mes_v10_1_ring_init Simplify the return expression. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-22 17:37:37 -04:00
Emily.Deng	36499e4c77	drm/amdgpu: Fix dead lock issue for vblank Always start vblank timer, but only calls vblank function when vblank is enabled. This is used to fix the dead lock issue. When drm_crtc_vblank_off want to disable vblank, it first get event_lock, and then call hrtimer_cancel, but hrtimer_cancel want to wait timer handler function finished. Timer handler also want to aquire event_lock in drm_handle_vblank. Signed-off-by: Emily.Deng <Emily.Deng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-22 12:25:15 -04:00
Felix Kuehling	c7651b7358	drm/amdgpu: Fix handling of KFD initialization failures Remember KFD module initializaton status in a global variable. Skip KFD device probing when the module was not initialized. Other amdgpu_amdkfd calls are then protected by the adev->kfd.dev check. Also print a clear error message when KFD disables itself. Amdgpu continues its initialization even when KFD failed. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-22 12:24:11 -04:00
Luben Tuikov	df2ce4596c	drm/amdgpu: Convert to using devm_drm_dev_alloc() (v2) Convert to using devm_drm_dev_alloc(), as drm_dev_init() is going away. v2: Remove drm_dev_put() since a) devres doesn't do refcounting, see Documentation/driver-api/driver-model/devres.rst, Section 4, paragraph 1; and since b) devres acts as garbage collector when the DRM device's parent's devres "action" callback is called to free the container device (amdgpu_device), which embeds the DRM dev. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200918132505.2316382-4-daniel.vetter@ffwll.ch	2020-09-21 10:44:46 +02:00
Alex Deucher	b4ebd0827f	drm/amdgpu: remove experimental flag from navi12 Navi12 has worked fine for a while now. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 23:06:58 -04:00
Likun Gao	fc08ce66c0	drm/amdgpu: add device ID for sienna_cichlid (v2) Add device ID for sienna_cichlid. v2: squash in additional device ids. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 23:06:30 -04:00
Alex Deucher	8a410da6aa	drm/amdgpu: use the AV1 defines for VCN 3.0 Switch from magic numbers to defines for AV1 clockgating. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 23:06:10 -04:00
Philip Yang	1d0e16ac1a	drm/amdgpu: prevent double kfree ttm->sg Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace: [ 420.932812] kernel BUG at /build/linux-do9eLF/linux-4.15.0/mm/slub.c:295! [ 420.934182] invalid opcode: 0000 [#1] SMP NOPTI [ 420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE [ 420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 1.5.4 07/09/2020 [ 420.952887] RIP: 0010:__slab_free+0x180/0x2d0 [ 420.954419] RSP: 0018:ffffbe426291fa60 EFLAGS: 00010246 [ 420.955963] RAX: ffff9e29263e9c30 RBX: ffff9e29263e9c30 RCX: 000000018100004b [ 420.957512] RDX: ffff9e29263e9c30 RSI: fffff3d33e98fa40 RDI: ffff9e297e407a80 [ 420.959055] RBP: ffffbe426291fb00 R08: 0000000000000001 R09: ffffffffc0d39ade [ 420.960587] R10: ffffbe426291fb20 R11: ffff9e49ffdd4000 R12: ffff9e297e407a80 [ 420.962105] R13: fffff3d33e98fa40 R14: ffff9e29263e9c30 R15: ffff9e2954464fd8 [ 420.963611] FS: 00007fa2ea097780(0000) GS:ffff9e297e840000(0000) knlGS:0000000000000000 [ 420.965144] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 420.966663] CR2: 00007f16bfffefb8 CR3: 0000001ff0c62000 CR4: 0000000000340ee0 [ 420.968193] Call Trace: [ 420.969703] ? __page_cache_release+0x3c/0x220 [ 420.971294] ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu] [ 420.972789] kfree+0x168/0x180 [ 420.974353] ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu] [ 420.975850] ? kfree+0x168/0x180 [ 420.977403] amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu] [ 420.978888] ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm] [ 420.980357] ttm_tt_destroy.part.11+0x4f/0x60 [amdttm] [ 420.981814] ttm_tt_destroy+0x13/0x20 [amdttm] [ 420.983273] ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm] [ 420.984725] ttm_bo_release+0x1c9/0x360 [amdttm] [ 420.986167] amdttm_bo_put+0x24/0x30 [amdttm] [ 420.987663] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 420.989165] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10 [amdgpu] [ 420.990666] kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu] Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 23:04:22 -04:00
Alex Deucher	d34c7b7b6b	drm/amdgpu: remove experimental flag from navi12 Navi12 has worked fine for a while now. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 21:22:17 -04:00
Likun Gao	61278d14bb	drm/amdgpu: add device ID for sienna_cichlid (v2) Add device ID for sienna_cichlid. v2: squash in additional device ids. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 18:01:59 -04:00
Alex Deucher	d9ed8cb5aa	drm/amdgpu: use the AV1 defines for VCN 3.0 Switch from magic numbers to defines for AV1 clockgating. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 18:01:53 -04:00
Alex Deucher	4192f7b576	drm/amdgpu: unmap register bar on device init failure We never unmapped the regiser BAR on failure. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 18:00:08 -04:00
Philip Yang	c8e74b17c1	drm/amdgpu: prevent double kfree ttm->sg Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace: [ 420.932812] kernel BUG at /build/linux-do9eLF/linux-4.15.0/mm/slub.c:295! [ 420.934182] invalid opcode: 0000 [#1] SMP NOPTI [ 420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE [ 420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 1.5.4 07/09/2020 [ 420.952887] RIP: 0010:__slab_free+0x180/0x2d0 [ 420.954419] RSP: 0018:ffffbe426291fa60 EFLAGS: 00010246 [ 420.955963] RAX: ffff9e29263e9c30 RBX: ffff9e29263e9c30 RCX: 000000018100004b [ 420.957512] RDX: ffff9e29263e9c30 RSI: fffff3d33e98fa40 RDI: ffff9e297e407a80 [ 420.959055] RBP: ffffbe426291fb00 R08: 0000000000000001 R09: ffffffffc0d39ade [ 420.960587] R10: ffffbe426291fb20 R11: ffff9e49ffdd4000 R12: ffff9e297e407a80 [ 420.962105] R13: fffff3d33e98fa40 R14: ffff9e29263e9c30 R15: ffff9e2954464fd8 [ 420.963611] FS: 00007fa2ea097780(0000) GS:ffff9e297e840000(0000) knlGS:0000000000000000 [ 420.965144] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 420.966663] CR2: 00007f16bfffefb8 CR3: 0000001ff0c62000 CR4: 0000000000340ee0 [ 420.968193] Call Trace: [ 420.969703] ? __page_cache_release+0x3c/0x220 [ 420.971294] ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu] [ 420.972789] kfree+0x168/0x180 [ 420.974353] ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu] [ 420.975850] ? kfree+0x168/0x180 [ 420.977403] amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu] [ 420.978888] ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm] [ 420.980357] ttm_tt_destroy.part.11+0x4f/0x60 [amdttm] [ 420.981814] ttm_tt_destroy+0x13/0x20 [amdttm] [ 420.983273] ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm] [ 420.984725] ttm_bo_release+0x1c9/0x360 [amdttm] [ 420.986167] amdttm_bo_put+0x24/0x30 [amdttm] [ 420.987663] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 420.989165] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10 [amdgpu] [ 420.990666] kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu] Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 17:56:38 -04:00
Luben Tuikov	5aea5327ea	drm/amdgpu: No sysfs, not an error condition Not being able to create amdgpu sysfs attributes is not a fatal error warranting not to continue to try to bring up the display. Thus, if we get an error trying to create amdgpu sysfs attrs, report it and continue on to try to bring up a display. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 17:56:31 -04:00
Jiansong Chen	24b763d0fb	drm/amdgpu: declare ta firmware for navy_flounder The firmware provided via MODULE_FIRMWARE appears in the module information. External tools(eg. dracut) may use the list of fw files to include them as appropriate in an initramfs, thus missing declaration will lead to request firmware failure in boot time. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 17:56:17 -04:00
Shirish S	0eaa801242	amdgpu/gmc_v9: Warn if SDPIF_MMIO_CNTRL_0 is not set With IOMMU enabled, if SDPIF_MMIO_CNTRL_0 is not set appropriately the system hangs without any trace during S3. To ease debug and to ensure that the failure, if any, was caused by a race conditions that disabled write access to SDPIF_MMIO_CNTRL_0 register, warn the user about it. Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 17:49:04 -04:00
Dave Airlie	e46f468fef	drm/ttm: drop special pipeline accel cleanup function. The two accel cleanup paths were mostly the same once refactored. Just pass a bool to say if the evictions are to be pipelined. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200917064132.148521-2-airlied@gmail.com	2020-09-18 06:23:06 +10:00
Dave Airlie	cae515f4a5	drm/ttm/drivers: call the bind function directly. Now the bind functions have all the protection explicitly the drivers can just call them directly, and the api can be unexported Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-5-airlied@gmail.com	2020-09-18 06:16:03 +10:00
Dave Airlie	37bff6542c	drm/ttm: move unbind into the tt destroy. This moves unbind into the driver side on destroy paths. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-4-airlied@gmail.com	2020-09-18 06:15:24 +10:00
Dave Airlie	7626168fd1	drm/ttm: flip tt destroy ordering. Call the driver first and have it call the common code cleanup. This is useful later to fix unbind. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-3-airlied@gmail.com	2020-09-18 06:14:41 +10:00
Dave Airlie	0b988ca1c7	drm/ttm: protect against reentrant bind in the drivers This moves the generic tracking into the drivers and protects against reentrancy in the drivers. It fixes up radeon and agp to be able to query the bound status as that is required. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-2-airlied@gmail.com	2020-09-18 06:14:00 +10:00
Fenghua Yu	c7b6bac9c7	drm, iommu: Change type of pasid to u32 PASID is defined as a few different types in iommu including "int", "u32", and "unsigned int". To be consistent and to match with uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". No PASID type change in uapi although it defines PASID as __u64 in some places. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com	2020-09-17 19:21:16 +02:00
Jiansong Chen	e60c27f1ff	drm/amdgpu: declare ta firmware for navy_flounder The firmware provided via MODULE_FIRMWARE appears in the module information. External tools(eg. dracut) may use the list of fw files to include them as appropriate in an initramfs, thus missing declaration will lead to request firmware failure in boot time. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-17 00:12:36 -04:00
Dave Airlie	9e9a153bdf	drm/ttm: move ttm binding/unbinding out of ttm_tt paths. Move these up to the bo level, moving ttm_tt to just being backing store. Next step is to move the bound flag out. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-6-airlied@gmail.com	2020-09-16 09:35:30 +10:00
Dave Airlie	2040ec970e	drm/ttm: split populate out from binding. Drivers have to call populate themselves now before binding. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-5-airlied@gmail.com	2020-09-16 09:34:54 +10:00
Dave Airlie	7eec915138	drm/ttm/tt: add wrappers to set tt state. This adds 2 getters and 4 setters, however unbound and populated are currently the same thing, this will change, it also drops a BUG_ON that seems not that useful. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-2-airlied@gmail.com	2020-09-16 09:33:24 +10:00
Andrey Grodzovsky	5367eb6d8a	drm/amdgpu: Include sienna_cichlid in USBC PD FW support. Create sysfs interface also for sienna_cichlid. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 18:25:40 -04:00
Alex Deucher	f4075be882	drm/amdgpu/gmc9: remove mmhub client duplicated case Copy paste typo. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Alex Deucher	ea68573d40	drm/amdgpu: Fail to load on RAVEN if SME is active Due to hardware bugs, scatter/gather display on raven requires a 1:1 IOMMU mapping, however, SME (System Memory Encryption) requires an indirect IOMMU mapping because the encryption bit is beyond the DMA mask of the chip. As such, the two are incompatible. Acked-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Zheng Bin	724dc53b92	drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v4_0.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1003:4-9: WARNING: Comparison to bool drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1083:5-11: WARNING: Comparison to bool Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Zheng Bin	8f00d1fc9d	drm/amd/amdgpu: fix comparison pointer to bool warning in amdgpu_atpx_handler.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c:619:15-49: WARNING: Comparison to bool drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c:629:15-49: WARNING: Comparison to bool Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Zheng Bin	3d0c75afdc	drm/amd/amdgpu: fix comparison pointer to bool warning in uvd_v6_0.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c:1243:14-25: WARNING: Comparison to bool Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Zheng Bin	e66cdf250e	drm/amd/amdgpu: fix comparison pointer to bool warning in si.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/si.c:1342:5-10: WARNING: Comparison to bool Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Zheng Bin	4bbbe77c15	drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v5_2.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:562:5-11: WARNING: Comparison to bool Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Zheng Bin	960a06ff91	drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v5_0.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c:619:5-11: WARNING: Comparison to bool Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Zheng Bin	89cf8b0637	drm/amd/amdgpu: fix comparison pointer to bool warning in gfx_v10_0.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:3563:5-31: WARNING: Comparison to bool Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Zheng Bin	7b3fa67d6e	drm/amd/amdgpu: fix comparison pointer to bool warning in gfx_v9_0.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2805:5-11: WARNING: Comparison to bool Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:43 -04:00
Jonathan Kim	7c679ef667	drm/amdgpu: stop resetting xgmi perfmons on disable Disabling perf events does not specify reset in ABI so stop doing it in hardware. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:42 -04:00
Oak Zeng	719a6513fb	drm/amdgpu: More accurate description of a function param Add more accurate description of the pe parameter of function amdgpu_vm_sdma_udpate and amdgpu_vm_cpu_update Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Christian Konig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:42 -04:00
Oak Zeng	91b5900507	drm/amdgpu: Add comment to function amdgpu_ttm_alloc_gart Add comments to refect what function does Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Christian Konig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:42 -04:00
Andrey Grodzovsky	ce87c98db4	drm/amdgpu: Include sienna_cichlid in USBC PD FW support. Create sysfs interface also for sienna_cichlid. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:42 -04:00
Mukul Joshi	62f6b1162e	drm/amdgpu: Enable SDMA utilization for Arcturus SDMA utilization calculations are enabled/disabled by writing to SDMAx_PUB_DUMMY_REG2 register. Currently, enable this only for Arcturus. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:39 -04:00
Aurabindo Pillai	5d1c59c479	drm/amdgpu: Move existing pflip fields into separate struct [Why&How] To refactor DM IRQ management, all fields used by IRQ is best moved to a separate struct so that main amdgpu_crtc struct need not be changed Location of the new struct shall be in DM Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:39 -04:00
John Clements	9c7e2ceb1d	drm/amdgpu: Update RAS init handling Output RAS init status If RAS init fails, teardown RAS context Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:39 -04:00
Changfeng	f399d4de2d	drm/amdgpu: add ta DTM/HDCP print in amdgpu_firmware_info for apu It needs to add ta DTM/HDCP print to get HDCP/DTM version info when cat amdgpu_firmware_info Signed-off-by: Changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:52:39 -04:00
Andrey Grodzovsky	7cbbc745dc	drm/amdgpu: Minor checkpatch fix Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:25:29 -04:00
Andrey Grodzovsky	6894305c97	drm/amdgpu: Disable DPC for XGMI for now. XGMI support is more complicated than single device support as questions of synchronization between the device recovering from PCI error and other members of the hive are required. Leaving this for next round. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:25:22 -04:00
Andrey Grodzovsky	7ac71382e9	drm/amdgpu: Trim amdgpu_pci_slot_reset by reusing code. Reuse exsisting functions from GPU recovery to avoid code duplications. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:25:16 -04:00
Andrey Grodzovsky	c1dd4aa624	drm/amdgpu: Fix consecutive DPC recovery failures. Cache the PCI state on boot and before each case where we might loose it. v2: Add pci_restore_state while caching the PCI state to avoid breaking PCI core logic for stuff like suspend/resume. v3: Extract pci_restore_state from amdgpu_device_cache_pci_state to avoid superflous restores during GPU resets and suspend/resumes. v4: Style fixes. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:25:04 -04:00
Andrey Grodzovsky	362c7b91c1	drm/amdgpu: Fix SMU error failure Wait for HW/PSP initiated ASIC reset to complete before starting the recovery operations. v2: Remove typo Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:24:55 -04:00
Andrey Grodzovsky	acd89fca67	drm/amdgpu: Block all job scheduling activity during DPC recovery DPC recovery involves ASIC reset just as normal GPU recovery so block SW GPU schedulers and wait on all concurrent GPU resets. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:24:48 -04:00
Andrey Grodzovsky	bf36b52e78	drm/amdgpu: Avoid accessing HW when suspending SW state At this point the ASIC is already post reset by the HW/PSP so the HW not in proper state to be configured for suspension, some blocks might be even gated and so best is to avoid touching it. v2: Rename in_dpc to more meaningful name Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:24:39 -04:00
Andrey Grodzovsky	c9a6b82f45	drm/amdgpu: Implement DPC recovery Add PCI Downstream Port Containment (DPC) with basic recovery functionality v2: remove pci_save_state to avoid breaking suspend/resume v3: Fix style comments v4: Improve description. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:24:32 -04:00
Liu ChengZhe	2a9787dcf5	drm/amdgpu: Do gpu recovery when no job is running In function flr_work, we should do gpu recovery when no job is running. Fix the logic by inverting it. v2: modify the description Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-15 17:24:18 -04:00
Christian König	9c3006a4cc	drm/ttm: remove available_caching Instead of letting TTM make an educated guess based on some mask all drivers should just specify what caching they want for their CPU mappings. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/390207/	2020-09-15 16:05:19 +02:00
Christian König	0fe438cec9	drm/ttm: remove default caching As far as I can tell this was never used either and we just always fallback to the order cached > wc > uncached anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/390142/	2020-09-15 16:03:44 +02:00
Maxime Ripard	00af6729b5	Merge drm/drm-next into drm-misc-next Paul Cercueil needs some patches in -rc5 to apply new patches for ingenic properly. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2020-09-14 18:11:40 +02:00
Christian König	48e07c23cb	drm/ttm: nuke memory type flags It's not supported to specify more than one of those flags. So it never made sense to make this a flag in the first place. Nuke the flags and specify directly which memory type to use. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/389826/?series=81551&rev=1	2020-09-11 13:31:23 +02:00
Gerd Hoffmann	707d561f77	drm: allow limiting the scatter list size. Add drm_device argument to drm_prime_pages_to_sg(), so we can call dma_max_mapping_size() to figure the segment size limit and call into __sg_alloc_table_from_pages() with the correct limit. This fixes virtio-gpu with sev. Possibly it'll fix other bugs too given that drm seems to totaly ignore segment size limits so far ... v2: place max_segment in drm driver not gem object. v3: move max_segment next to the other gem fields. v4: just use dma_max_mapping_size(). Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20200907112425.15610-2-kraxel@redhat.com	2020-09-09 07:58:56 +02:00
Dave Airlie	877d8c0743	Merge tag 'topic/nouveau-i915-dp-helpers-and-cleanup-2020-08-31-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next UAPI Changes: None Cross-subsystem Changes: * Moves a bunch of miscellaneous DP code from the i915 driver into a set of shared DRM DP helpers Core Changes: * New DRM DP helpers (see above) Driver Changes: * Implements usage of the aforementioned DP helpers in the nouveau driver, along with some other various HPD related cleanup for nouveau Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/11e59ebdea7ee4f46803a21fe9b21443d2b9c401.camel@redhat.com	2020-09-09 12:27:13 +10:00
Dave Airlie	5d26eba988	drm/amdgpu/ttm: move to driver backend binding funcs Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200907204630.1406528-9-airlied@gmail.com	2020-09-09 08:30:24 +10:00
Dave Airlie	ecfe6953fa	drm/ttm: introduce ttm_bo_move_null This pattern is cut-n-pasted across 4 drivers, switch it to a WARN_ON instead, as BUG_ON is considered a bad idea usually. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200907204630.1406528-2-airlied@gmail.com	2020-09-09 08:28:53 +10:00
Christian König	54d04ea8cd	drm/ttm: merge offset and base in ttm_bus_placement This is used by TTM to communicate the physical address which should be used with ioremap(), ioremap_wc(). We don't need to separate the base and offset in any way here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/389457/	2020-09-08 10:43:30 +02:00
Dave Airlie	0c8d22fcae	Merge tag 'amd-drm-next-5.10-2020-09-03' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.10-2020-09-03: amdgpu: - RAS fixes - Sienna Cichlid updates - Navy Flounder updates - DCE6 (SI) support in DC - Enable plane rotation - Rework pre-OS vram reservation handling during driver init - Add standard interface to dump GPU metrics table from SMU - Rework tiling and tmz state handling in atomic commits - Pstate fixes - Add voltage and power hwmon interfaces for renoir - SW CTF fixes - S/G display fix for Raven - Print client strings for vmfaults for vega and newer - Manual fan control fixes - Display updates - Reorg power management directory structure - Misc bug fixes - Misc code cleanups amdkfd: - Topology fixes - Add SMI events for thermal throttling and GPU resets radeon: - switch from pci_* to dma_* for dma allocations - PLL fix Scheduler: - Clean up priority levels UAPI: - amdgpu INFO IOCTL query update for TMZ state https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049 - amdkfd SMI event interface updates https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/therm_thrott From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903222921.4152-1-alexander.deucher@amd.com	2020-09-08 16:40:13 +10:00
Dave Airlie	ce5c207c6b	Linux 5.9-rc4 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9VerweHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGhc4H/iHD6qLdB36gZB6K oc2nJyrqyWitv4ti2Mnt5PA7o4wX4l6nnr1QvoaJ4BRs5Ja1czRvb2XDmdzqAoIA xITGoafqaAeDfxQ91bWrJsVN0pCRKiGwddXlU7TWmqw/riAkfOqi6GYKvav4biJH +n1mUPQb1M2IbRFsqkAS+ebKHq3CWaRvzKOEneS88nGlL5u31S9NAru8Ru/fkxRn 6CwGcs1XRaBPYaZAhdfIb0NuatUlpkhPC9yhNS9up6SqrWmK3m65vmFVng6H0eCF fwn1jVztboY/XcNAi5sM9ExpQCql6WLQEEktVikqRDojC8fVtSx6W55tPt7qeaoO Z6t4/DA= =bcA4 -----END PGP SIGNATURE----- Merge tag 'v5.9-rc4' into drm-next Backmerge 5.9-rc4 as there is a nasty qxl conflict that needs to be resolved. Signed-off-by: Dave Airlie <airlied@redhat.com>	2020-09-08 14:41:40 +10:00
Dave Airlie	0a667b5007	drm/ttm: remove bdev from ttm_tt I want to split this structure up and use it differently, step one remove bdev pointer from it and pass it explicitly. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200826014428.828392-4-airlied@gmail.com	2020-09-08 06:39:21 +10:00
Alex Deucher	11bc98bd71	drm/amdgpu/mmhub2.0: print client id string for mmhub Print the name of the client rather than the number. This makes it easier to debug what block is causing the fault. Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:48:34 -04:00
Alex Deucher	02f23f5f7c	drm/amdgpu/gmc9: print client id string for mmhub Print the name of the client rather than the number. This makes it easier to debug what block is causing the fault. Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:48:29 -04:00
Alex Deucher	93fabd84c9	drm/amdgpu/gmc10: print client id string for gfxhub Print the name of the client rather than the number. This makes it easier to debug what block is causing the fault. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:48:27 -04:00
Alex Deucher	be99ecbfff	drm/amdgpu/gmc9: print client id string for gfxhub Print the name of the client rather than the number. This makes it easier to debug what block is causing the fault. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:48:23 -04:00
Ye Bin	2d37949dc3	drm/amdgpu/gfx10: Delete some duplicated argument to '\|' 1. gfx_v10_0_soft_reset GRBM_STATUS__SPI_BUSY_MASK 2. gfx_v10_0_update_gfx_clock_gating AMD_CG_SUPPORT_GFX_CGLS Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:48:18 -04:00
Changfeng	6627d1c1a8	drm/amdgpu: add ta firmware load in psp_v12_0 for renoir It needs to load renoir_ta firmware because hdcp is enabled by default for renoir now. This can avoid error:DTM TA is not initialized Signed-off-by: Changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:48:11 -04:00
Christian König	ee354ff1c7	drm/amdgpu: fix max_entries calculation v4 Calculate the correct value for max_entries or we might run after the page_address array. v2: Xinhui pointed out we don't need the shift v3: use local copy of start and simplify some calculation v4: fix the case that we map less VA range than BO size Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `1e691e2444` drm/amdgpu: stop allocating dummy GTT nodes Reviewed-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:48:04 -04:00
Luben Tuikov	1625951a3a	drm/amdgpu: Remove superfluous NULL check The DRM device is a static member of the amdgpu device structure and as such always exists, so long as the PCI and thus the amdgpu device exist. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:47:55 -04:00
Alex Sierra	abb6fccbb4	drm/amdgpu: enable ih1 ih2 for Arcturus only Enable multi-ring ih1 and ih2 for Arcturus only. For Navi10 family multi-ring has been disabled. Apparently, having multi-ring enabled in Navi was causing continus page fault interrupts. Further investigation is needed to get to the root cause. Related issue link: https://gitlab.freedesktop.org/drm/amd/-/issues/1279 Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:47:48 -04:00
xinhui pan	3d7248d7ce	drm/amdgpu: Fix a redundant kfree drm_dev_alloc() alloc dev and set managed.final_kfree to dev to free itself. Now from commit 5cdd68498918("drm/amdgpu: Embed drm_device into amdgpu_device (v3)") we alloc adev and ddev is just a member of it. So drm_dev_release try to free a wrong pointer then. Also driver's release trys to free adev, but drm_dev_release will access dev after call drvier's release. To fix it, remove driver's release and set managed.final_kfree to adev. [ 36.269348] BUG: unable to handle page fault for address: ffffa0c279940028 [ 36.276841] #PF: supervisor read access in kernel mode [ 36.282434] #PF: error_code(0x0000) - not-present page [ 36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 800ffff8066bf060 [ 36.296868] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI [ 36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G O 5.9.0-rc2+ #46 [ 36.310670] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019 [ 36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm] [ 36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 be 00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 8b 7f 18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5 [ 36.347217] RSP: 0018:ffffb9424141fce0 EFLAGS: 00010282 [ 36.352931] RAX: 0000000000000006 RBX: ffffa0c279940010 RCX: 0000000000000006 [ 36.360718] RDX: ffffffffc0419f5a RSI: 0000000000000200 RDI: ffffa0c279940010 [ 36.368503] RBP: ffffb9424141fd10 R08: 0000000000000001 R09: 0000000000000001 [ 36.376304] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0c279940010 [ 36.384070] R13: ffffffffc0e2a000 R14: ffffa0c26924e220 R15: fffffffffffffff2 [ 36.391845] FS: 00007fc4a277b740(0000) GS:ffffa0c288e00000(0000) knlGS:0000000000000000 [ 36.400669] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 36.406937] CR2: ffffa0c279940028 CR3: 0000000792304006 CR4: 00000000003706e0 [ 36.414732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 36.422550] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 36.430354] Call Trace: [ 36.433044] drm_dev_put.part.0+0x40/0x60 [drm] [ 36.438017] drm_dev_put+0x13/0x20 [drm] [ 36.442398] amdgpu_pci_remove+0x56/0x60 [amdgpu] [ 36.447528] pci_device_remove+0x3e/0xb0 [ 36.451807] device_release_driver_internal+0xff/0x1d0 [ 36.457416] device_release_driver+0x12/0x20 [ 36.462094] pci_stop_bus_device+0x70/0xa0 [ 36.466588] pci_stop_and_remove_bus_device_locked+0x1b/0x30 [ 36.472786] remove_store+0x7b/0x90 [ 36.476614] dev_attr_store+0x17/0x30 [ 36.480646] sysfs_kf_write+0x4b/0x60 [ 36.484655] kernfs_fop_write+0xe8/0x1d0 [ 36.488952] vfs_write+0xf5/0x230 [ 36.492562] ksys_write+0x70/0xf0 [ 36.496206] __x64_sys_write+0x1a/0x20 [ 36.500292] do_syscall_64+0x38/0x90 [ 36.504219] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: xinhui pan <xinhui.pan@amd.com> Acked-by: Alex Deucher <alexancer.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:47:33 -04:00
Dennis Li	81202807ae	drm/amdgpu: block ring buffer access during GPU recovery When GPU is in reset, its status isn't stable and ring buffer also need be reset when resuming. Therefore driver should protect GPU recovery thread from ring buffer accessed by other threads. Otherwise GPU will randomly hang during recovery. v2: correct indent Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:46:55 -04:00
Nirmoy Das	bc21585f3f	drm/amdgpu: disable gpu-sched load balance for uvd On hardware with multiple uvd instances, dependent uvd jobs may get scheduled to different uvd instances. Because uvd_enc jobs retain hw context, dependent jobs should always run on the same uvd instance. This patch disables GPU scheduler's load balancer for a context that binds jobs from the same context to a uvd instance. v2: Squash in uvd_enc fix Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:46:54 -04:00
Dave Airlie	05010c1e2f	drm/amdgpu/ttm: remove unused parameter to move blit Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200826014428.828392-2-airlied@gmail.com	2020-08-31 12:41:50 +10:00
Nirmoy Das	e230ac1118	drm/amdgpu: fix compiler warnings Fixes below compiler warnings: CC [M] drivers/gpu/drm/amd/amdgpu/amdgpu_device.o drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration] 381 \| void static inline amdgpu_mm_wreg_mmio(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t acc_flags) \| ^~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function ‘amdgpu_device_fini’: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3381:6: warning: variable ‘r’ set but not used [-Wunused-but-set-variable] 3381 \| int r; \| ^ Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-28 14:00:23 -04:00
Linus Torvalds	5ec06b5c0d	drm fixes for 5.9-rc3 core: - Take modeset bkl for legacy drivers. dp_mst: - Allow null crtc in dp_mst. i915: - Fix command parser desc matching with masks amdgpu: - Misc display fixes - Backlight fixes - MPO fix for DCN1 - Fixes for Sienna Cichlid - Fixes for Navy Flounder - Vega SW CTF fixes - SMU fix for Raven - Fix a possible overflow in INFO ioctl - Gfx10 clockgating fix msm: - opp/bw scaling patch followup - frequency restoring fux - vblank in atomic commit fix - dpu modesetting fixes - fencing fix etnaviv: - scheduler interaction fix - gpu init regression fix exynos: - Just drop __iommu annotation to fix sparse warning. omap: - locking state fix. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJfSGXpAAoJEAx081l5xIa+bs4QAJ6dQb0Llv9TyUBZL/FTTbc2 iv8PDzA55aoBn7qI00mkUN9M9elfAV75dfLcn9h1fJg/EN3UZXkPCxFAcsscN+gm 0TXTU9SmgqeDDK2RFxyvSGbuJbfuoOiXMbERdd+sYupUh7FowckDFVrOvbRcEe// PX3klUuQRSRxsgShrruJjaLdpqA+saZ12fhoE3eagsvlaFFAuVz8GY2zj533yaw9 zjgDTkOWiO4Lw2/X10dmjEoxa5Nn26ECF4y+iFih7Uw2e+tu+ZVaL+/PcVPlJk5a NdlDAgU1gS8U9c0lSbrLNGGwBRXj1899FIRMS55uLwIBSmdZlOkh/wEOmnC+c1uK kbrUfYlU6dXLjzUOTd2d8GQx3F9OTFxOXoZjshMYlryf2RsVl4kImBlpdAC3TT7p B/Qk4xbU3uOuDzMgF76b1wT5XiFoHKcPrxNcf5L1tkqULdmYioB78hs4uQij5Bh0 p6ynfR8rb19J4m+ctI84HqfG6ZbloBgGDZLzIe37lHsvG6xKG7/VjcclBV9BHBXs mgG0h+CeogY14a9JWeZ2dQFH8wT77QT5G0/ODQo1r9+OXM1i10DGCZyihrNJf2Nl //4uIDlgq695Lxd+h7FMuupWLpvAvnooDaUvnnNfVGsQVFsEhagl3R+TCGaFULHJ hi1vfCSZEBV+tCeiKSRU =ZmIT -----END PGP SIGNATURE----- Merge tag 'drm-fixes-2020-08-28' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "As expected a bit of an rc3 uptick, amdgpu and msm are the main ones, one msm patch was from the merge window, but had dependencies and we dropped it until the other tree had landed. Otherwise it's a couple of fixes for core, and etnaviv, and single i915, exynos, omap fixes. I'm still tracking the Sandybridge gpu relocations issue, if we don't see much movement I might just queue up the reverts. I'll talk to Daniel next week once he's back from holidays. core: - Take modeset bkl for legacy drivers dp_mst: - Allow null crtc in dp_mst i915: - Fix command parser desc matching with masks amdgpu: - Misc display fixes - Backlight fixes - MPO fix for DCN1 - Fixes for Sienna Cichlid - Fixes for Navy Flounder - Vega SW CTF fixes - SMU fix for Raven - Fix a possible overflow in INFO ioctl - Gfx10 clockgating fix msm: - opp/bw scaling patch followup - frequency restoring fux - vblank in atomic commit fix - dpu modesetting fixes - fencing fix etnaviv: - scheduler interaction fix - gpu init regression fix exynos: - Just drop __iommu annotation to fix sparse warning omap: - locking state fix" * tag 'drm-fixes-2020-08-28' of git://anongit.freedesktop.org/drm/drm: (41 commits) drm/amd/display: Fix memleak in amdgpu_dm_mode_config_init drm/amdgpu: disable runtime pm for navy_flounder drm/amd/display: Retry AUX write when fail occurs drm/amdgpu: Fix buffer overflow in INFO ioctl drm/amd/powerplay: Fix hardmins not being sent to SMU for RV drm/amdgpu: use MODE1 reset for navy_flounder by default drm/amd/pm: correct the thermal alert temperature limit settings drm/amdgpu: add asd fw check before loading asd drm/amd/display: Keep current gain when ABM disable immediately drm/amd/display: Fix passive dongle mistaken as active dongle in EDID emulation drm/amd/display: Revert HDCP disable sequence change drm/amd/display: Send DISPLAY_OFF after power down on boot drm/amdgpu/gfx10: refine mgcg setting drm/amd/pm: correct Vega20 swctf limit setting drm/amd/pm: correct Vega12 swctf limit setting drm/amd/pm: correct Vega10 swctf limit setting drm/amd/pm: set VCN pg per instances drm/amd/pm: enable run_btc callback for sienna_cichlid drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in amdgpu_dm_update_backlight_caps drm/amd/display: Reject overlay plane configurations in multi-display scenarios ...	2020-08-28 09:46:48 -07:00
Dave Airlie	cbc2e82932	drm-misc-next for 5.10: UAPI Changes: Cross-subsystem Changes: Core Changes: - ttm: various cleanups and reworks of the API Driver Changes: - ast: various cleanups - gma500: A few fixes, conversion to GPIOd API - hisilicon: Change of maintainer, various reworks - ingenic: Clock handling and formats support improvements - mcde: improvements to the DSI support - mgag200: Support G200 desktop cards - mxsfb: Support the i.MX7 and i.MX8M and the alpha plane - panfrost: support devfreq - ps8640: Retrieve the EDID from eDP control, misc improvements - tidss: Add a workaround for AM65xx YUV formats handling - virtio: a few cleanups, support for virtio-gpu exported resources - bridges: Support the chained bridges on more drivers, new bridges: Toshiba TC358762, Toshiba TC358775, Lontium LT9611 - panels: Convert to dev_ based logging, read orientation from the DT, various fixes, new panels: Mantix MLAF057WE51-X, Chefree CH101OLHLWH-002, Powertip PH800480T013, KingDisplay KD116N21-30NV-A010 -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX0fXGwAKCRDj7w1vZxhR xTmMAQDPmfSsBLLNnDxu4++zFrQ7OKmNSHCkVr4nAQ/yg3GVPQEAuRw6qPwPWuV3 +jEPxaQSSmHOhx/jXfolV1tJaE/FHgA= =WYoO -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2020-08-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.10: UAPI Changes: Cross-subsystem Changes: Core Changes: - ttm: various cleanups and reworks of the API Driver Changes: - ast: various cleanups - gma500: A few fixes, conversion to GPIOd API - hisilicon: Change of maintainer, various reworks - ingenic: Clock handling and formats support improvements - mcde: improvements to the DSI support - mgag200: Support G200 desktop cards - mxsfb: Support the i.MX7 and i.MX8M and the alpha plane - panfrost: support devfreq - ps8640: Retrieve the EDID from eDP control, misc improvements - tidss: Add a workaround for AM65xx YUV formats handling - virtio: a few cleanups, support for virtio-gpu exported resources - bridges: Support the chained bridges on more drivers, new bridges: Toshiba TC358762, Toshiba TC358775, Lontium LT9611 - panels: Convert to dev_ based logging, read orientation from the DT, various fixes, new panels: Mantix MLAF057WE51-X, Chefree CH101OLHLWH-002, Powertip PH800480T013, KingDisplay KD116N21-30NV-A010 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200827155517.do6emeacetpturli@gilmour.lan	2020-08-28 12:38:06 +10:00
Jiawei	4cd2a96d3a	drm/amdgpu: simplify hw status clear/set logic Optimize code to iterate less loops in amdgpu_device_ip_reinit_early_sriov() Signed-off-by: Jiawei <Jiawei.Gu@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-27 10:09:07 -04:00
Guchun Chen	0bbb5462d3	drm/amdgpu: correct SE number for arcturus gfx ras When resetting EDC related register, all CUs needs to be visited, otherwise, garbage data from EDC register of missed SEs would present. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Jiansong Chen	faeefe4e54	drm/amdgpu: disable runtime pm for navy_flounder Disable runtime pm for navy_flounder temporarily. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Alex Deucher	cf851f3ff8	drm/amdgpu: Fix buffer overflow in INFO ioctl The values for "se_num" and "sh_num" come from the user in the ioctl. They can be in the 0-255 range but if they're more than AMDGPU_GFX_MAX_SE (4) or AMDGPU_GFX_MAX_SH_PER_SE (2) then it results in an out of bounds read. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Alex Deucher	c997e8e26c	drm/amdgpu: report DC not supported if virtual display is enabled (v2) Virtual display is non-atomic so report false to avoid checking atomic state and other atomic things at runtime. v2: squash into the sr-iov check Acked-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Jiansong Chen	22dd44f47c	drm/amdgpu: use MODE1 reset for navy_flounder by default Switch default gpu reset method to MODE1 for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Stanley.Yang	5436ab94cd	drm/amdkfd: fix set kfd node ras properties value The ctx->features are new RAS implementation which is only available for Vega20 and onwards, it is not available for vega10, vega10 should follow legacy ECC implementation. Changed from V1: wrap function to initialize kfd node properties Changed from V2: remove wrap function and SDMA SRAM ECC check Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Tao Zhou	c56c90f413	drm/amdgpu: add asd fw check before loading asd asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Alex Deucher	4d2997ab21	drm/amdgpu: add a wrapper for atom asic_init This allows us to add asic specific workarounds for atom asic init while keeping the adev specifics out of the atombios parser code. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Alex Deucher	a71737313e	drm/amdgpu: add pre_asic_init callback for navi Nothing to do for this family. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Alex Deucher	b0a2db9b48	drm/amdgpu: add pre_asic_init callback for SOC15 We need to restore some registers prior to running asic init to work around a firmware bug. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Alex Deucher	cff6c7f91a	drm/amdgpu: add pre_asic_init callback for VI Nothing to do for this family. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Alex Deucher	819515c7f3	drm/amdgpu: add pre_asic_init callback for CIK Nothing to do for this family. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Alex Deucher	632d9f9492	drm/amdgpu: add pre_asic_init callback for SI Nothing to do for this family. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Alex Deucher	9737a923c9	drm/amdgpu: add an asic callback for pre asic init This callback can be used by asics that need to do something special prior to calling atom asic init. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Alex Deucher	f8646661f7	drm/amdgpu: fix up DCHUBBUB_SDPIF_MMIO_CNTRL_0 handling Properly define this register using a relative offset rather than an absolute offset and use the proper SOC15 macros to access it. It's also DCN, not DCE, so remove it from the DCE12 header. No functional change. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Felix Kuehling	332f6e1e98	drm/amdkfd: call amdgpu_amdkfd_get_hive_id directly No need to use a function pointer because the implementation is not ASIC-specific. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Felix Kuehling	817154c1a2	drm/amdkfd: call amdgpu_amdkfd_get_unique_id directly No need to use a function pointer because the implementation is not ASIC-specific. This fixes missing support due to a missing function pointer on Arcturus. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Youling Tang	a590a83d74	gpu: amd: Remove duplicate semicolons at the end of line Remove duplicate semicolons at the end of line. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:18 -04:00
Jiansong Chen	d3bbba79eb	drm/amdgpu/gfx10: refine mgcg setting 1. enable ENABLE_CGTS_LEGACY to fix specviewperf11 random hang. 2. remove obsolete RLC_CGTT_SCLK_OVERRIDE workaround. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:17 -04:00
Huang Rui	6127896f4a	drm/amdkfd: implement the dGPU fallback path for apu (v6) We still have a few iommu issues which need to address, so force raven as "dgpu" path for the moment. This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled or ACPI CRAT table not correct. v2: Use ignore_crat parameter to decide whether it will go with IOMMUv2. v3: Align with existed thunk, don't change the way of raven, only renoir will use "dgpu" path by default. v4: don't update global ignore_crat in the driver, and revise fallback function if CRAT is broken. v5: refine acpi crat good but no iommu support case, and rename the title. v6: fix the issue of dGPU initialized firstly, just modify the report value in the node_show(). Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:17 -04:00
Luben Tuikov	8aba21b751	drm/amdgpu: Embed drm_device into amdgpu_device (v3) a) Embed struct drm_device into struct amdgpu_device. b) Modify the inline-f drm_to_adev() accordingly. c) Modify the inline-f adev_to_drm() accordingly. d) Eliminate the use of drm_device.dev_private, in amdgpu. e) Switch from using drm_dev_alloc() to drm_dev_init(). f) Add a DRM driver release function, which frees the container amdgpu_device after all krefs on the contained drm_device have been released. v2: Split out adding adev_to_drm() into its own patch (previous commit), making this patch more succinct and clear. More detailed commit description. v3: squash in fix to call drmm_add_final_kfree() to avoid a warning. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:17 -04:00
Jiansong Chen	82dff839c9	drm/amdgpu: disable runtime pm for navy_flounder Disable runtime pm for navy_flounder temporarily. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 15:45:52 -04:00
Alex Deucher	b5b97cab55	drm/amdgpu: Fix buffer overflow in INFO ioctl The values for "se_num" and "sh_num" come from the user in the ioctl. They can be in the 0-255 range but if they're more than AMDGPU_GFX_MAX_SE (4) or AMDGPU_GFX_MAX_SH_PER_SE (2) then it results in an out of bounds read. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-08-26 15:45:51 -04:00
Jiansong Chen	75947544c8	drm/amdgpu: use MODE1 reset for navy_flounder by default Switch default gpu reset method to MODE1 for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 15:45:51 -04:00
Tao Zhou	14e4f3bd81	drm/amdgpu: add asd fw check before loading asd asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 15:27:52 -04:00
Jiansong Chen	de7a1b0b87	drm/amdgpu/gfx10: refine mgcg setting 1. enable ENABLE_CGTS_LEGACY to fix specviewperf11 random hang. 2. remove obsolete RLC_CGTT_SCLK_OVERRIDE workaround. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-08-26 15:17:16 -04:00
Luben Tuikov	4a580877bd	drm/amdgpu: Get DRM dev from adev by inline-f Add a static inline adev_to_drm() to obtain the DRM device pointer from an amdgpu_device pointer. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 13:06:06 -04:00
Luben Tuikov	1348969ab6	drm/amdgpu: drm_device to amdgpu_device by inline-f (v2) Get the amdgpu_device from the DRM device by use of an inline function, drm_to_adev(). The inline function resolves a pointer to struct drm_device to a pointer to struct amdgpu_device. v2: Use a typed visible static inline function instead of an invisible macro. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 13:06:06 -04:00
Prike.Liang	50166d1ce5	drm/amdgpu: enable HDP clock gatting Enabe HDP SD/DS clock gatting in Renoir series. Signed-off-by: Prike.Liang <Prike.Liang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 13:06:05 -04:00
Prike.Liang	d844812b28	drm/amdgpu: enable ATHUB clock gatting Enable ATHUB clock gatting set in Renoir series. Signed-off-by: Prike.Liang <Prike.Liang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 13:06:05 -04:00
Dennis Li	08ebb485f0	drm/amdgpu: annotate a false positive recursive locking Re-apply commit `72e14ebf9f` [ 584.110304] ============================================ [ 584.110590] WARNING: possible recursive locking detected [ 584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G OE [ 584.111164] -------------------------------------------- [ 584.111456] kworker/38:1/553 is trying to acquire lock: [ 584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.112112] but task is already holding lock: [ 584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.113068] other info that might help us debug this: [ 584.113689] Possible unsafe locking scenario: [ 584.114350] CPU0 [ 584.114685] ---- [ 584.115014] lock(&adev->reset_sem); [ 584.115349] lock(&adev->reset_sem); [ 584.115678] * DEADLOCK * [ 584.116624] May be due to missing lock nesting notation [ 584.117284] 4 locks held by kworker/38:1/553: [ 584.117616] #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630 [ 584.117967] #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630 [ 584.118358] #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [ 584.118786] #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.119222] stack backtrace: [ 584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 [ 584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 584.121638] Call Trace: [ 584.122050] dump_stack+0x98/0xd5 [ 584.122499] __lock_acquire+0x1139/0x16e0 [ 584.122931] ? trace_hardirqs_on+0x3b/0xf0 [ 584.123358] ? cancel_delayed_work+0xa6/0xc0 [ 584.123771] lock_acquire+0xb8/0x1c0 [ 584.124197] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599] down_write+0x49/0x120 [ 584.125032] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125472] amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125910] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [ 584.126367] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [ 584.126789] process_one_work+0x29e/0x630 [ 584.127208] worker_thread+0x3c/0x3f0 [ 584.127621] ? __kthread_parkme+0x61/0x90 [ 584.128014] kthread+0x12f/0x150 [ 584.128402] ? process_one_work+0x630/0x630 [ 584.128790] ? kthread_park+0x90/0x90 [ 584.129174] ret_from_fork+0x3a/0x50 Each adev has owned lock_class_key to avoid false positive recursive locking. v2: 1. register adev->lock_key into lockdep, otherwise lockdep will report the below warning [ 1216.705820] BUG: key ffff890183b647d0 has not been registered! [ 1216.705924] ------------[ cut here ]------------ [ 1216.705972] DEBUG_LOCKS_WARN_ON(1) [ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210 v3: change to use down_write_nest_lock to annotate the false dead-lock warning. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 13:05:52 -04:00
Dennis Li	d95e8e97e2	drm/amdgpu: refine create and release logic of hive info Change to dynamically create and release hive info object, which help driver support more hives in the future. v2: Change to save hive object pointer in adev, to avoid locking xgmi_mutex every time when calling amdgpu_get_xgmi_hive. v3: 1. Change type of hive object pointer in adev from void* to amdgpu_hive_info*. 2. remove unnecessary variable initialization. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:24:14 -04:00
Dennis Li	aac891685d	drm/amdgpu: refine message print for devices of hive Using dev_xxx instead of DRM_xxx/pr_xxx to indicate which device of a hive is the message for. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:24:06 -04:00
Dennis Li	cbfd17f7ba	drm/amdgpu: fix the nullptr issue when reenter GPU recovery in single gpu system, if driver reenter gpu recovery, amdgpu_device_lock_adev will return false, but hive is nullptr now. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:23:54 -04:00
Dennis Li	6049db43d6	drm/amdgpu: change reset lock from mutex to rw_semaphore clients don't need reset-lock for synchronization when no GPU recovery. v2: change to return the return value of down_read_killable. v3: if GPU recovery begin, VF ignore FLR notification. Reviewed-by: Monk Liu <monk.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:23:48 -04:00
Lukas Bulwahn	5049a05269	drm/amd/display: remove unintended executable mode Besides the intended change, commit `4cc1178e16` ("drm/amdgpu: replace DRM prefix with PCI device info for gfx/mmhub") also set the source files mmhub_v1_0.c and gfx_v9_4.c to be executable, i.e., changed fromold mode 644 to new mode 755. Commit `241b2ec931` ("drm/amd/display: Add dcn30 Headers (v2)") added the four header files {dpcs,dcn}_3_0_0_{offset,sh_mask}.h as executable, i.e., mode 755. Set to the usual modes for source and headers files and clean up those mistakes. No functional change. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:23:02 -04:00
Dennis Li	53b3f8f40e	drm/amdgpu: refine codes to avoid reentering GPU recovery if other threads have holden the reset lock, recovery will fail to try_lock. Therefore we introduce atomic hive->in_reset and adev->in_gpu_reset, to avoid reentering GPU recovery. v2: drop "? true : false" in the definition of amdgpu_in_reset Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:22:56 -04:00
Dave Airlie	ebb21aa188	drm/ttm: drop bus.size from bus placement. This is always calculated the same, and only used in a couple of places. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200811074658.58309-2-airlied@gmail.com	2020-08-24 17:06:08 +10:00
Dave Airlie	098754fe3c	drm/ttm: init mem->bus in common code. The drivers all do the same thing here. Reviewed-by: Christian König <christian.koenig@amd.com> for both. Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200811074658.58309-1-airlied@gmail.com	2020-08-24 17:00:48 +10:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Dave Airlie	ba9086a6df	Merge tag 'amd-drm-fixes-5.9-2020-08-20' of git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.9-2020-08-20: amdgpu: - Fixes for Navy Flounder - Misc display fixes - RAS fix amdkfd: - SDMA fix for renoir Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200820041938.3928-1-alexander.deucher@amd.com	2020-08-21 10:17:52 +10:00
Jiansong Chen	da2446b66b	Revert "drm/amdgpu: disable gfxoff for navy_flounder" This reverts commit `9c9b17a7d1`. Newly released sdma fw (51.52) provides a fix for the issue. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-19 23:55:04 -04:00
Dave Airlie	485d41b092	Merge tag 'amd-drm-fixes-5.9-2020-08-12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.9-2020-08-12: amdgpu: - Fix allocation size - SR-IOV fixes - Vega20 SMU feature state caching fix - Fix custom pptable handling - Arcturus golden settings update - Several display fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200813033610.4008-1-alexander.deucher@amd.com	2020-08-19 13:56:28 +10:00
Leo Liu	cdab4211f6	drm/amdgpu/jpeg: remove redundant check when it returns Fix warning from kernel test robot v2: remove the local variable as well Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 18:22:16 -04:00
jqdeng	8e1d88f948	drm/amdgpu: Limit the error info print rate Use function printk_ratelimit to limit the print rate. Signed-off-by: jqdeng <Emily.Deng@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 18:22:10 -04:00
jqdeng	9a1cddd637	drm/amdgpu: Fix repeatly flr issue Only for no job running test case need to do recover in flr notification. For having job in mirror list, then let guest driver to hit job timeout, and then do recover. Signed-off-by: jqdeng <Emily.Deng@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 18:22:02 -04:00
Jiansong Chen	332d790365	Revert "drm/amdgpu: disable gfxoff for navy_flounder" This reverts commit `ba4e049e63`. Newly released sdma fw (51.52) provides a fix for the issue. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 18:20:34 -04:00
Luben Tuikov	9af5e21dac	drm/scheduler: Remove priority macro INVALID (v2) Remove DRM_SCHED_PRIORITY_INVALID. We no longer carry around an invalid priority and cut it off at the source. Backwards compatibility behaviour of AMDGPU CTX IOCTL passing in garbage for context priority from user space and then mapping that to DRM_SCHED_PRIORITY_NORMAL is preserved. v2: Revert "res" --> "r" and "prio" --> "priority". Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 18:20:26 -04:00
Luben Tuikov	e2d732fdb7	drm/scheduler: Scheduler priority fixes (v2) Remove DRM_SCHED_PRIORITY_LOW, as it was used in only one place. Rename and separate by a line DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT as it represents a (total) count of said priorities and it is used as such in loops throughout the code. (0-based indexing is the the count number.) Remove redundant word HIGH in priority names, and rename KERNEL to HIGH, as it really means that, high. v2: Add back KERNEL and remove SW and HW, in lieu of a single HIGH between NORMAL and KERNEL. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 18:20:17 -04:00
Huang Rui	34174b89bf	drm/amdkfd: fix the wrong sdma instance query for renoir Renoir only has one sdma instance, it will get failed once query the sdma1 registers. So use switch-case instead of static register array. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 17:54:12 -04:00
Bhawanpreet Lakha	0a668aee0a	drm/amdgpu: parse ta firmware for navy_flounder Use the same case as sienna_cichlid Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 17:53:50 -04:00
Guchun Chen	1a68d96f81	drm/amdgpu: fix NULL pointer access issue when unloading driver When unloading driver by "modprobe -r amdgpu", one NULL pointer dereference bug occurs in ras debugfs releasing. The cause is the duplicated debugfs_remove, as drm debugfs_root dir has been cleaned up already by drm_minor_unregister. BUG: kernel NULL pointer dereference, address: 00000000000000a0 PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 11 PID: 1526 Comm: modprobe Tainted: G OE 5.6.0-guchchen #1 Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 RIP: 0010:down_write+0x15/0x40 Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92 d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3 RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0 RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0 R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40 R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000 FS: 00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: simple_recursive_removal+0x63/0x370 ? debugfs_remove+0x60/0x60 debugfs_remove+0x40/0x60 amdgpu_ras_fini+0x82/0x230 [amdgpu] ? __kernfs_remove.part.17+0x101/0x1f0 ? kernfs_name_hash+0x12/0x80 amdgpu_device_fini+0x1c0/0x580 [amdgpu] amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu] amdgpu_pci_remove+0x36/0x60 [amdgpu] pci_device_remove+0x3b/0xb0 device_release_driver_internal+0xe5/0x1c0 driver_detach+0x46/0x90 bus_remove_driver+0x58/0xd0 pci_unregister_driver+0x29/0x90 amdgpu_exit+0x11/0x25 [amdgpu] __x64_sys_delete_module+0x13d/0x210 do_syscall_64+0x5f/0x250 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 17:00:46 -04:00
Jiansong Chen	9c9b17a7d1	drm/amdgpu: disable gfxoff for navy_flounder gfxoff is temporarily disabled for navy_flounder, since at present the feature has broken some basic amdgpu test. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 16:58:59 -04:00
Maxime Ripard	d85ddd1318	Linux 5.9-rc1 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl85kWkeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGGPwIAJpEmEBkMoQ+KARK PaaVDQW9fwAlC1nThMpGv/m8Ym7KbfLkTgEJQiQyNv3pDDhyLP8jvcZcscIkfs4s 56IMjFndRHWNeCVu9YPXWmAEp/WycZNC7YVPu0j1bI9VgvaHvbHOqUWzxB716RbY K4TFprJEA3sotNm0vdda2NgSlSup/0NVKiP2LwQPjkwH+Kf6/Ol1j2uxbWywEo75 BdW5LreDtUoJ7W5BeX8GJ0IVgWdyxBV61eVbaINNY3EOPc7+uMGOgR9oHeGWRceH V4ELYww5yjizUDtKFvVTc/k0tj+Rq73mtOADdaF0YWItqxtDBvAcdKIpC0KYzVaa 2fB+rts= =9Pnj -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXzvIXwAKCRDj7w1vZxhR xYXLAQC80uF6JkpBeNyuewyY7CDadDG1qDchDmYquwGVDnO+HwEAmvL84csLcxBy ah3UMOKUyWz5Sahlg48ZIaaUhRaulwE= =Lu/B -----END PGP SIGNATURE----- Merge v5.9-rc1 into drm-misc-next Sam needs 5.9-rc1 to have dev_err_probe in to merge some patches. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2020-08-18 14:14:25 +02:00
Kevin Wang	4444457450	drm/amdgpu: add condition check for trace_amdgpu_cs() v1: add trace event enabled check to avoid nop loop when submit multi ibs in amdgpu_cs_ioctl() function. v2: add a new wrapper function to trace all amdgpu cs ibs. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-17 14:06:37 -04:00
Kevin Wang	736b172978	drm/amdgpu: fix amdgpu_bo_release_notify() comment error fix amdgpu_bo_release_notify() comment error. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-17 14:06:28 -04:00
Huang Rui	d95c42a150	drm/amdkfd: fix the wrong sdma instance query for renoir Renoir only has one sdma instance, it will get failed once query the sdma1 registers. So use switch-case instead of static register array. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-17 14:06:13 -04:00
Alex Deucher	11043b7a99	drm/amdgpu: note what type of reset we are using When we reset the GPU, note what type of reset will be used. This makes debugging different reset scenarios more clear as the driver may use different reset methods depending on conditions on the system. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 17:03:20 -04:00
Alex Deucher	bddbacc9e0	drm/amdgpu: print where we get the vbios image from ACPI, ROM, PCI BAR, etc. Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 17:03:20 -04:00
Bhawanpreet Lakha	31e726ca3d	drm/amdgpu: parse ta firmware for navy_flounder Use the same case as sienna_cichlid Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:41 -04:00
James Zhu	ac1128c996	drm/amdgpu/vcn3.0: only SIENNA_CICHLID need specify instance for dec/enc Only SIENNA_CICHLID(VCN3) has 2 unsymmetrical instances, there're less codecs on instance 1, we use 0 for decode and 1 for encode. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:41 -04:00
Evan Quan	e098bc9612	drm/amd/pm: optimize the power related source code layout The target is to provide a clear entry point(for power routines). Also this can help to maintain a clear view about the frameworks used on different ASICs. Hopefully all these can make power part more friendly to play with. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:41 -04:00
Evan Quan	e9372d2371	drm/amd/powerplay: put those exposed power interfaces in amdgpu_dpm.c As other power interfaces. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:41 -04:00
Evan Quan	20d3c28ce4	drm/amd/powerplay: optimize i2c bus access implementation The caller needs not care about the internal details how the powerplay API implemented. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:41 -04:00
Evan Quan	70bdb6ed22	drm/amd/powerplay: drop unnecessary pp_funcs checker It's redundant. Also, the callers should not care about the implementation details. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:41 -04:00
Evan Quan	b89e9eb681	drm/amd/powerplay: optimize amdgpu_dpm_set_clockgating_by_smu() implementation Cover the implementation details from outside(of power part). Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:41 -04:00
Guchun Chen	ae2bf61ff3	drm/amdgpu: guard ras debugfs creation/removal based on CONFIG_DEBUG_FS It can avoid potential build warn/error when CONFIG_DEBUG_FS is not set. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Guchun Chen	2e2f5dd514	drm/amdgpu: fix NULL pointer access issue when unloading driver When unloading driver by "modprobe -r amdgpu", one NULL pointer dereference bug occurs in ras debugfs releasing. The cause is the duplicated debugfs_remove, as drm debugfs_root dir has been cleaned up already by drm_minor_unregister. BUG: kernel NULL pointer dereference, address: 00000000000000a0 PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 11 PID: 1526 Comm: modprobe Tainted: G OE 5.6.0-guchchen #1 Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 RIP: 0010:down_write+0x15/0x40 Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92 d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3 RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0 RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0 R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40 R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000 FS: 00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: simple_recursive_removal+0x63/0x370 ? debugfs_remove+0x60/0x60 debugfs_remove+0x40/0x60 amdgpu_ras_fini+0x82/0x230 [amdgpu] ? __kernfs_remove.part.17+0x101/0x1f0 ? kernfs_name_hash+0x12/0x80 amdgpu_device_fini+0x1c0/0x580 [amdgpu] amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu] amdgpu_pci_remove+0x36/0x60 [amdgpu] pci_device_remove+0x3b/0xb0 device_release_driver_internal+0xe5/0x1c0 driver_detach+0x46/0x90 bus_remove_driver+0x58/0xd0 pci_unregister_driver+0x29/0x90 amdgpu_exit+0x11/0x25 [amdgpu] __x64_sys_delete_module+0x13d/0x210 do_syscall_64+0x5f/0x250 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Christian König	f1403342eb	drm/amdgpu: revert "fix system hang issue during GPU reset" The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit `df9c8d1aa2`. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Evan Quan	9f979a49e2	drm/amd/powerplay: enable swSMU mgpu fan boost support Enable mgpu fan boost feature on swSMU routines. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Evan Quan	f10bb940d8	drm/amd/powerplay: optimize the interface for mgpu fan boost enablement Cover the implementation details from outside(of power). Also preparing for expanding this to swSMU. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Jiansong Chen	ba4e049e63	drm/amdgpu: disable gfxoff for navy_flounder gfxoff is temporarily disabled for navy_flounder, since at present the feature has broken some basic amdgpu test. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Oak Zeng	9fb1506eb6	drm/amdgpu: Use function pointer for some mmhub functions Add more function pointers to amdgpu_mmhub_funcs. ASIC specific implementation of most mmhub functions are called from a general function pointer, instead of calling different function for different ASIC. Simplify the code by deleting duplicate functions Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Nirmoy Das	2f53072434	drm/amdgpu: pass NULL pointer instead of 0 Fixes: `c030f2e416` ("drm/amdgpu: add amdgpu_ras.c to support ras (v2)") Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Dennis Li	72e14ebf9f	drm/amdgpu: annotate a false positive recursive locking [ 584.110304] ============================================ [ 584.110590] WARNING: possible recursive locking detected [ 584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G OE [ 584.111164] -------------------------------------------- [ 584.111456] kworker/38:1/553 is trying to acquire lock: [ 584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.112112] but task is already holding lock: [ 584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.113068] other info that might help us debug this: [ 584.113689] Possible unsafe locking scenario: [ 584.114350] CPU0 [ 584.114685] ---- [ 584.115014] lock(&adev->reset_sem); [ 584.115349] lock(&adev->reset_sem); [ 584.115678] * DEADLOCK * [ 584.116624] May be due to missing lock nesting notation [ 584.117284] 4 locks held by kworker/38:1/553: [ 584.117616] #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630 [ 584.117967] #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630 [ 584.118358] #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [ 584.118786] #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.119222] stack backtrace: [ 584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 [ 584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 584.121638] Call Trace: [ 584.122050] dump_stack+0x98/0xd5 [ 584.122499] __lock_acquire+0x1139/0x16e0 [ 584.122931] ? trace_hardirqs_on+0x3b/0xf0 [ 584.123358] ? cancel_delayed_work+0xa6/0xc0 [ 584.123771] lock_acquire+0xb8/0x1c0 [ 584.124197] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599] down_write+0x49/0x120 [ 584.125032] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125472] amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125910] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [ 584.126367] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [ 584.126789] process_one_work+0x29e/0x630 [ 584.127208] worker_thread+0x3c/0x3f0 [ 584.127621] ? __kthread_parkme+0x61/0x90 [ 584.128014] kthread+0x12f/0x150 [ 584.128402] ? process_one_work+0x630/0x630 [ 584.128790] ? kthread_park+0x90/0x90 [ 584.129174] ret_from_fork+0x3a/0x50 Each adev has owned lock_class_key to avoid false positive recursive locking. v2: 1. register adev->lock_key into lockdep, otherwise lockdep will report the below warning [ 1216.705820] BUG: key ffff890183b647d0 has not been registered! [ 1216.705924] ------------[ cut here ]------------ [ 1216.705972] DEBUG_LOCKS_WARN_ON(1) [ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210 v3: change to use down_write_nest_lock to annotate the false dead-lock warning. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Wenhui Sheng	a4322e1881	drm/amdgpu: add debugfs interface for RAP test After amdgpu driver loading successfully, we can use RAP debugfs interface <debugfs_dir>/dri/xxx/rap_test to trigger RAP test. Currently only L0 validate test is supported. v2: refine amdgpu_rap.h Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Guchun Chen <Guchun.Chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Wenhui Sheng	8602692b6f	drm/amdgpu: enable RAP TA load Enable the RAP TA loading path and add RAP test trigger interface. v2: fix potential mem leak issue Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Guchun Chen <Guchun.Chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:39 -04:00
Wenhui Sheng	a189d0ae0c	drm/amdgpu: add RAP TA header file The RAP TA contains tests used to verify if RAP(Register Access Policy), or otherwise known as Security Policy is applied correctly by PSP BL&TOS. The RAP test is a measure to ensure that we reduce the avenue of complexity and mistakes when dealing with RAP in post-si execution, where debugging failures related to RAP is quite difficult and expensive. v2: add introduction for RAP TA Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Guchun Chen <Guchun.Chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:39 -04:00
Tianci.Yin	425a78f43b	drm/amdgpu: reconfigure spm golden settings on Navi1x after GFXOFF exit(v3) On Navi1x, the SPM golden settings are lost after GFXOFF enter/exit, so reconfigure the golden settings after GFXOFF exit. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:39 -04:00
Tianci.Yin	d58fe3cf11	drm/amdgpu: add interface amdgpu_gfx_init_spm_golden for Navi1x On Navi1x, the SPM golden settings are lost after GFXOFF enter/exit, so reconfiguration is needed. Make the configuration code as an interface for future use. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:29 -04:00
Guchun Chen	66459e1db2	drm/amdgpu: add debugfs node to toggle ras error cnt harvest Before ras recovery is issued, user could operate this debugfs node to enable/disable the harvest of all RAS IPs' ras error count registers, which will help keep hardware's registers' status instead of cleaning up them. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:12:47 -04:00
Guchun Chen	f75e94d868	drm/amdgpu: bypass querying ras error count registers Once ras recovery is issued by ras sync flood interrupt or ras controller interrupt, add this guard to bypass or execute ras error count register harvest of all IPs. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:12:22 -04:00
Linus Torvalds	ea6ec77437	drm fixes for 5.9-rc1 core: - Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi - Remove null check for kfree in drm_dev_release. - Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition. - re-added docs for drm_gem_flink_ioctl() - add orientation quirk for ASUS T103HAF ttm: - ttm: fix page-offset calculation within TTM - revert patch causing vmwgfx regressions fbcon: - Fix a fbcon OOB read in fbdev, found by syzbot. vga: - Mark vga_tryget static as it's not used elsewhere. amdgpu: - Re-add spelling typo fix - Sienna Cichlid fixes - Navy Flounder fixes - DC fixes - SMU i2c fix - Power fixes vmwgfx: - regression fixes for modesetting crashes - misc fixes xlnx: - Small fixes to xlnx. omap: - Fix mode initialization in omap_connector_mode_valid(). - force runtime PM suspend on system suspend tidss: - fix modeset init for DPI panels -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJfM3QrAAoJEAx081l5xIa+4H8P/0YLHHK52yuYqyYEtvfehlDC i3ur47QdQS/6fkspVSbVndsToQS+D9ZNAZFUUSumVr1kR40yGC6oGZGmo+UWYH+T AiZC1LHmz8hsGMni40uCb+ble1mtcsbqjAkb5hG/9RxPfBiunQ6OJJGU7X2FoB3e TZNWwYKyF/Y5f/OlKdRbi9qfszWwWb/+ZW94q9ER1MjlcwrW3EqgRrkRGT3hH4UO AvzYhgGlzz7FXwQic3mJw3Sirm6NnbuUv9aRPRBCGm9i/BDSRHFpIcXIuLuMsiOk L/WtdYodMDbhB0xRKwDQzf9TIUxSwulrXO0Jfo8CVKknyHha62fz2bMGpf+4ron2 3WrZmV+5w0Q0qcfbh1EgHkVXmDry+mtFws1dvAtvyNp7GSI675tNI2qH3cgLfvQX FgMrfbmOp+f+ohXL7fUw+J/7cUjrVYZxtkCAEctB/6B44INjFdRuwQfP98x5CcmX CpZHLJKsMI+Y6r6Jno5foIQJIZJrDiUJLhRE2sN7cWv54+2gmHXbqsLAfMMpB6Mc 2gMZNIwB+qSVjULR+MHvzC3iZHHTkD3hijkYshb8+s5pICGo+eq2v4S/r31C4Wn1 DpToox1t/RPY+WPS5dbi95zxg1QDwr/BsExTCKTUB2k+ovoWE6byPvt2Vslh+F/K i+LrkziEqKb/BhyGA7z9 =dr7v -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "This has a few vmwgfx regression fixes we hit from the merge window (one in TTM), it also has a bunch of amdgpu fixes along with a scattering everywhere else. core: - Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi - Remove null check for kfree in drm_dev_release. - Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition. - re-added docs for drm_gem_flink_ioctl() - add orientation quirk for ASUS T103HAF ttm: - ttm: fix page-offset calculation within TTM - revert patch causing vmwgfx regressions fbcon: - Fix a fbcon OOB read in fbdev, found by syzbot. vga: - Mark vga_tryget static as it's not used elsewhere. amdgpu: - Re-add spelling typo fix - Sienna Cichlid fixes - Navy Flounder fixes - DC fixes - SMU i2c fix - Power fixes vmwgfx: - regression fixes for modesetting crashes - misc fixes xlnx: - Small fixes to xlnx. omap: - Fix mode initialization in omap_connector_mode_valid(). - force runtime PM suspend on system suspend tidss: - fix modeset init for DPI panels" * tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm: (70 commits) drm/ttm: revert "drm/ttm: make TT creation purely optional v3" drm/vmwgfx: fix spelling mistake "Cant" -> "Can't" drm/vmwgfx: fix spelling mistake "Cound" -> "Could" drm/vmwgfx/ldu: Use drm_mode_config_reset drm/vmwgfx/sou: Use drm_mode_config_reset drm/vmwgfx/stdu: Use drm_mode_config_reset drm/vmwgfx: Fix two list_for_each loop exit tests drm/vmwgfx: Use correct vmw_legacy_display_unit pointer drm/vmwgfx: Use struct_size() helper drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3) drm/amd/powerplay: update swSMU VCN/JPEG PG logics drm/amdgpu: use mode1 reset by default for sienna_cichlid drm/amdgpu/smu: rework i2c adpater registration drm/amd/display: Display goes blank after inst drm/amd/display: Change null plane state swizzle mode to 4kb_s drm/amd/display: Use helper function to check for HDMI signal drm/amd/display: AMD OUI (DPCD 0x00300) skipped on some sink drm/amd/display: Fix logger context drm/amd/display: populate new dml variable ...	2020-08-12 11:53:01 -07:00
Thomas Zimmermann	534b1f9071	Merge drm/drm-next into drm-misc-next Backmerging drm-next into drm-misc-next for nouveau and panel updates. Resolves a conflict between ttm and nouveau, where struct ttm_mem_res got renamed to struct ttm_resource. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2020-08-12 20:42:08 +02:00
Christian König	b2458726b3	drm/ttm: give resource functions their own [ch] files This is a separate object we work within TTM. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/384338/?series=80346&rev=1	2020-08-12 15:51:03 +02:00
Christian König	e92ae67d6e	drm/ttm: rename ttm_resource_manager_func callbacks The names get/put are associated with reference counting in the Linux kernel, use alloc/free instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/384340/?series=80346&rev=1	2020-08-12 15:50:51 +02:00
Arunpravin	0cf0ee983b	drm/amdgpu: Enable P2P dmabuf over XGMI Access the exported P2P dmabuf over XGMI, if available. Otherwise, fall back to the existing PCIe method. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arunpravin <apaneers@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-11 11:47:35 -04:00
Oleg Vasilev	65bf2cf95d	drm/amdgpu: utilize subconnector property for DP through atombios Since DP-specific information is stored in driver's structures, every driver needs to implement subconnector property by itself. v2: rebase v3: renamed a function call Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: David (ChunMing) Zhou <David1.Zhou@amd.com> Cc: amd-gfx@lists.freedesktop.org Signed-off-by: Jeevan B <jeevan.b@intel.com> Signed-off-by: Oleg Vasilev <oleg.vasilev@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1587732655-17544-4-git-send-email-jeevan.b@intel.com	2020-08-11 14:19:41 +02:00
Dave Airlie	16e6eea29d	Merge tag 'amd-drm-fixes-5.9-2020-08-07' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-fixes-5.9-2020-08-07: amdgpu: - Re-add spelling typo fix - Sienna Cichlid fixes - Navy Flounder fixes - DC fixes - SMU i2c fix - Power fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200807222843.3909-1-alexander.deucher@amd.com	2020-08-11 13:08:45 +10:00
Linus Torvalds	97d052ea3f	A set of locking fixes and updates: - Untangle the header spaghetti which causes build failures in various situations caused by the lockdep additions to seqcount to validate that the write side critical sections are non-preemptible. - The seqcount associated lock debug addons which were blocked by the above fallout. seqcount writers contrary to seqlock writers must be externally serialized, which usually happens via locking - except for strict per CPU seqcounts. As the lock is not part of the seqcount, lockdep cannot validate that the lock is held. This new debug mechanism adds the concept of associated locks. sequence count has now lock type variants and corresponding initializers which take a pointer to the associated lock used for writer serialization. If lockdep is enabled the pointer is stored and write_seqcount_begin() has a lockdep assertion to validate that the lock is held. Aside of the type and the initializer no other code changes are required at the seqcount usage sites. The rest of the seqcount API is unchanged and determines the type at compile time with the help of _Generic which is possible now that the minimal GCC version has been moved up. Adding this lockdep coverage unearthed a handful of seqcount bugs which have been addressed already independent of this. While generaly useful this comes with a Trojan Horse twist: On RT kernels the write side critical section can become preemtible if the writers are serialized by an associated lock, which leads to the well known reader preempts writer livelock. RT prevents this by storing the associated lock pointer independent of lockdep in the seqcount and changing the reader side to block on the lock when a reader detects that a writer is in the write side critical section. - Conversion of seqcount usage sites to associated types and initializers. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8xmPYTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoTuQEACyzQCjU8PgehPp9oMqWzaX2fcVyuZO QU2yw6gmz2oTz3ZHUNwdW8UnzGh2OWosK3kDruoD9FtSS51lER1/ISfSPCGfyqxC KTjOcB1Kvxwq/3LcCx7Zi3ZxWApat74qs3EhYhKtEiQ2Y9xv9rLq8VV1UWAwyxq0 eHpjlIJ6b6rbt+ARslaB7drnccOsdK+W/roNj4kfyt+gezjBfojGRdMGQNMFcpnv shuTC+vYurAVIiVA/0IuizgHfwZiXOtVpjVoEWaxg6bBH6HNuYMYzdSa/YrlDkZs n/aBI/Xkvx+Eacu8b1Zwmbzs5EnikUK/2dMqbzXKUZK61eV4hX5c2xrnr1yGWKTs F/juh69Squ7X6VZyKVgJ9RIccVueqwR2EprXWgH3+RMice5kjnXH4zURp0GHALxa DFPfB6fawcH3Ps87kcRFvjgm6FBo0hJ1AxmsW1dY4ACFB9azFa2euW+AARDzHOy2 VRsUdhL9CGwtPjXcZ/9Rhej6fZLGBXKr8uq5QiMuvttp4b6+j9FEfBgD4S6h8csl AT2c2I9LcbWqyUM9P4S7zY/YgOZw88vHRuDH7tEBdIeoiHfrbSBU7EQ9jlAKq/59 f+Htu2Io281c005g7DEeuCYvpzSYnJnAitj5Lmp/kzk2Wn3utY1uIAVszqwf95Ul 81ppn2KlvzUK8g== =7Gj+ -----END PGP SIGNATURE----- Merge tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Thomas Gleixner: "A set of locking fixes and updates: - Untangle the header spaghetti which causes build failures in various situations caused by the lockdep additions to seqcount to validate that the write side critical sections are non-preemptible. - The seqcount associated lock debug addons which were blocked by the above fallout. seqcount writers contrary to seqlock writers must be externally serialized, which usually happens via locking - except for strict per CPU seqcounts. As the lock is not part of the seqcount, lockdep cannot validate that the lock is held. This new debug mechanism adds the concept of associated locks. sequence count has now lock type variants and corresponding initializers which take a pointer to the associated lock used for writer serialization. If lockdep is enabled the pointer is stored and write_seqcount_begin() has a lockdep assertion to validate that the lock is held. Aside of the type and the initializer no other code changes are required at the seqcount usage sites. The rest of the seqcount API is unchanged and determines the type at compile time with the help of _Generic which is possible now that the minimal GCC version has been moved up. Adding this lockdep coverage unearthed a handful of seqcount bugs which have been addressed already independent of this. While generally useful this comes with a Trojan Horse twist: On RT kernels the write side critical section can become preemtible if the writers are serialized by an associated lock, which leads to the well known reader preempts writer livelock. RT prevents this by storing the associated lock pointer independent of lockdep in the seqcount and changing the reader side to block on the lock when a reader detects that a writer is in the write side critical section. - Conversion of seqcount usage sites to associated types and initializers" * tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) locking/seqlock, headers: Untangle the spaghetti monster locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header x86/headers: Remove APIC headers from <asm/smp.h> seqcount: More consistent seqprop names seqcount: Compress SEQCNT_LOCKNAME_ZERO() seqlock: Fold seqcount_LOCKNAME_init() definition seqlock: Fold seqcount_LOCKNAME_t definition seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g hrtimer: Use sequence counter with associated raw spinlock kvm/eventfd: Use sequence counter with associated spinlock userfaultfd: Use sequence counter with associated spinlock NFSv4: Use sequence counter with associated spinlock iocost: Use sequence counter with associated spinlock raid5: Use sequence counter with associated spinlock vfs: Use sequence counter with associated spinlock timekeeping: Use sequence counter with associated raw spinlock xfrm: policy: Use sequence counters with associated lock netfilter: nft_set_rbtree: Use sequence counter with associated rwlock netfilter: conntrack: Use sequence counter with associated spinlock sched: tasks: Use sequence counter with associated spinlock ...	2020-08-10 19:07:44 -07:00
Dave Airlie	c44264f9f7	Linux 5.8 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8nLmkeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGBEkH/RJAnEan4gcdkBDf 2xS0yk4XLMjKZwbz61VSeKMUBkGCsh1cWsaAtJJAIVYj/6o/7mld01sZCnOASLnJ ET0nXL2NiT/f+prYkTE5qeYH225/Yfh5jgmrqZtx/uXFCwgE5Nzi3f72IXQDmCR+ kmpNhNox3YqQTKXhv7DXobDKcO0n8nZavnhxmA9SBZn2h9RHvmvJghD0UOfLjMpA 1SbknaE67n5JN/JjI6TkYWk4nuJmqfvmBL5IYVDEZYO4UlM5Bqzhw0XN7Ax70K3M KRK/eiqRmNwun5MxWnbzQU7t7iTgVmzjHLTpWGcM3V4blgGXC3uhjc+p/R8KTQUE bIydSzs= =fDeo -----END PGP SIGNATURE----- Merge tag 'v5.8' into drm-next I need to backmerge 5.8 as I've got a bunch of fixes sitting on an rc7 base that I want to land. Signed-off-by: Dave Airlie <airlied@redhat.com>	2020-08-11 11:58:31 +10:00
shiwu.zhang	97a9b60fa3	drm/amdgpu: update gc golden register for arcturus Update golden setting to improve performance on HPC and ML apps Signed-off-by: shiwu.zhang <shiwu.zhang@amd.com> Tested-by: gang.long <gang.long@amd.com> Reviewed-by: guchun.chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-10 18:08:21 -04:00
Liu ChengZhe	d5bbb4761c	drm/amdgpu: Skip some registers config for SRIOV Some registers are not accessible to virtual function setup, so skip their initialization when in VF-SRIOV mode. v2: move SRIOV VF check into specify functions; modify commit description and comment. Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-10 18:06:35 -04:00
Christophe JAILLET	78484d7c74	drm: amdgpu: Use the correct size when allocating memory When 'sgt' is allocated, we must allocated 'sizeof(sgt)' bytes instead of 'sizeof(sg)'. The sizeof(sg) is bigger than sizeof(*sgt) so this wastes memory but it won't lead to corruption. Fixes: `f44ffd677f` ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-08-10 18:06:10 -04:00
Monk Liu	bcca629806	drm/amdgpu: fix reload KMD hang on GFX10 KIQ GFX10 KIQ will hang if we try below steps: modprobe amdgpu rmmod amdgpu modprobe amdgpu sched_hw_submission=4 Due to KIQ is always living there even after KMD unloaded thus when doing the realod KIQ will crash upon its register being programed by different values with the previous loading (the config like HQD addr, ring size, is easily changed if we alter the sched_hw_submission) the fix is we must inactive KIQ first before touching any of its registgers Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-10 17:26:52 -04:00
shiwu.zhang	5a58abf5ed	drm/amdgpu: update gc golden register for arcturus Update golden setting to improve performance on HPC and ML apps Signed-off-by: shiwu.zhang <shiwu.zhang@amd.com> Tested-by: gang.long <gang.long@amd.com> Reviewed-by: guchun.chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-10 17:26:52 -04:00
Liu ChengZhe	1d44732619	drm/amdgpu: Skip some registers config for SRIOV Some registers are not accessible to virtual function setup, so skip their initialization when in VF-SRIOV mode. v2: move SRIOV VF check into specify functions; modify commit description and comment. Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-10 17:26:52 -04:00
Christophe JAILLET	5068ed578e	drm: amdgpu: Use the correct size when allocating memory When 'sgt' is allocated, we must allocated 'sizeof(sgt)' bytes instead of 'sizeof(sg)'. The sizeof(sg) is bigger than sizeof(*sgt) so this wastes memory but it won't lead to corruption. Fixes: `f44ffd677f` ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-10 17:19:12 -04:00
Dave Airlie	373627930f	drm/amdgpu/ttm: drop the adev link from vram mgr There is no need for that now since it's embedded. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806233459.4057784-3-airlied@gmail.com	2020-08-10 10:33:50 +10:00
Dave Airlie	4f297b9c82	drm/amdgpu/ttm: move vram/gtt mgr allocations to mman. Christian suggested this and it makes sense. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806233459.4057784-2-airlied@gmail.com	2020-08-10 10:33:35 +10:00
Likun Gao	f2e2573c08	drm/amdgpu: use mode1 reset by default for sienna_cichlid Swith default gpu reset method for sienna_cichlid to MODE1 reset. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-07 17:47:25 -04:00
Dennis Li	94561899dd	drm/amdgpu: unlock mutex on error Make sure to unlock the mutex when error happen v2: 1. correct syntax error in the commit comments 2. remove change-Id Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-07 17:31:26 -04:00
Likun Gao	ca6fd7a668	drm/amdgpu: use mode1 reset by default for sienna_cichlid Swith default gpu reset method for sienna_cichlid to MODE1 reset. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-07 17:29:29 -04:00
Christian König	77f47d2395	drm/amdgpu: make sure userptr ttm is allocated We need to allocate that manually now. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Link: https://patchwork.freedesktop.org/patch/384330/	2020-08-07 16:02:43 +02:00
Jiansong Chen	a15383893f	drm/amdgpu: enable GFXOFF for navy_flounder Enable GFXOFF for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:45:46 -04:00
Liu ChengZhe	d392aa02db	drm amdgpu: Skip tmr load for SRIOV 1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation; 2. Check pointer before release firmware. v2: use CHIP_SIENNA_CICHLID instead v3: remove local "bool ret"; fix grammer issue v4: use my name instead of "root" v5: fix grammer issue and indent issue Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:44:59 -04:00
Liu ChengZhe	6b6124bb4a	drm/amdgpu: fix PSP autoload twice in FLR Assigning false to block->status.hw overwrites PSP's previous hardware status, which causes the PSP to Resume operation after hardware init. Remove this assignment and let the PSP execute Resume operation when it is told to. v2: Remove the braces. v3: Modify the description. Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:44:41 -04:00
Jiansong Chen	278a4b5f62	drm/amdgpu: update GC golden setting for navy_flounder Update GC golden setting for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:43:58 -04:00
Huang Rui	a676a97623	drm/amdgpu: skip crit temperature values on APU (v2) It doesn't expose PPTable descriptor on APU platform. So max/min temperature values cannot be got from APU platform. v2: Stoney needs to skip crit temperature as well. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:43:15 -04:00
Boyuan Zhang	bbf16f530c	drm/amdgpu: update dec ring test for VCN 3.0 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:33:18 -04:00
James Zhu	fa40a00900	drm/amdgpu/jpeg3.0: remove extra asic type check jpeg ip block is already selected based on ASIC type during set_ip_blocks. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:30:03 -04:00
Likun Gao	3da0a2dfbd	drm/amdgpu: update golden setting for sienna_cichlid Update golden setting for sienna_cichlid. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:28:29 -04:00
Guchun Chen	ee10e06eb0	drm/amdgpu: add printing after executing page reservation to eeprom This will tell users if the faulty page has been written to external eeprom device in dmesg log. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:24:11 -04:00
John Clements	ebfbd1c2ca	drm/amdgpu: expand sienna chichlid reg access support Added dedicated 64bit reg read/write support Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:23:20 -04:00
Colin Ian King	2342ef4e60	drm/amdgpu: fix spelling mistake "Falied" -> "Failed" There is a spelling mistake in a DRM_ERROR error message. Fix it. This got lost in a merge, restore the fix. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `4afaa61db9`)	2020-08-06 16:16:57 -04:00
Pierre-Eric Pelloux-Prayer	16c642ec3f	drm/amdgpu: new ids flag for tmz (v2) Allows UMD to know if TMZ is supported and enabled. This commit also bumps KMS_DRIVER_MINOR because if we don't UMD can't tell if "ids_flags & AMDGPU_IDS_FLAGS_TMZ == 0" means "tmz is not enabled" or "tmz may be enabled but the kernel doesn't report it". v2: use amdgpu_is_tmz() and reworded commit message. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:45:52 -04:00
Evan Quan	25c933b1c4	drm/amd/powerplay: add new sysfs interface for retrieving gpu metrics(V2) A new interface for UMD to retrieve gpu metrics data. V2: rich the documentation Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:43:56 -04:00
Colin Ian King	c16ce56240	drm/amdgpu: fix spelling mistake "paramter" -> "parameter" There is a spelling mistake in a dev_warn message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:43:26 -04:00
Philip Yang	b80f050ff2	drm/amdkfd: option to disable system mem limit If multiple process share system memory through /dev/shm, KFD allocate memory should not fail if it reaches the system memory limit because one copy of physical system memory are shared by multiple process. Add module parameter no_system_mem_limit to provide user option to disable system memory limit check at runtime using sysfs or during driver module init using kernel boot argument. By default the system memory limit is on. Print out debug message to warn user if KFD allocate memory failed because system memory reaches limit. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:43:08 -04:00
Ingo Molnar	a703f3633f	Merge branch 'WIP.locking/seqlocks' into locking/urgent Pick up the full seqlock series PeterZ is working on. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-08-06 10:16:38 +02:00
Dave Airlie	2966141ad2	drm/ttm: rename ttm_mem_reg to ttm_resource. This name better reflects what the object does. I didn't rename all the pointers it seemed too messy. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-60-airlied@gmail.com	2020-08-06 13:19:21 +10:00
Dave Airlie	9de59bc201	drm/ttm: rename ttm_mem_type_manager -> ttm_resource_manager. This name makes a lot more sense, since these are about managing driver resources rather than just memory ranges. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-59-airlied@gmail.com	2020-08-06 13:12:40 +10:00
Dave Airlie	90a0489a71	drm/ttm: drop type manager has_type under driver control, this flag isn't needed anymore, remove the API that used to access it, and consoldiate with the used api. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-56-airlied@gmail.com	2020-08-06 13:12:40 +10:00
Dave Airlie	7541ce1a6f	drm/ttm: drop man->bdev link. This link isn't needed anymore, drop it from the init interface. Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-54-airlied@gmail.com	2020-08-06 13:12:40 +10:00
Dave Airlie	a29050c4cd	drm/amdgpu/ttm: remove man->bdev references. Just store the device in the private so the link can be removed from the manager Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-53-airlied@gmail.com	2020-08-06 13:12:40 +10:00
Dave Airlie	37205891d8	drm/ttm: make ttm_range_man_init/takedown take type + args This makes it easier to move these to a driver allocated system Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-47-airlied@gmail.com	2020-08-06 13:12:39 +10:00
Dave Airlie	0af135b892	drm/amdgpu/ttm: use bo manager subclassing for vram/gtt mgrs Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-46-airlied@gmail.com	2020-08-06 13:12:21 +10:00
Linus Torvalds	8186749621	drm next for 5.9-rc1 core: - add user def flag to cmd line modes - dma_fence_wait added might_sleep - dma-fence lockdep annotations - indefinite fences are bad documentation - gem CMA functions used in more drivers - struct mutex removal - more drm_ debug macro usage - set/drop master api fixes - fix for drm/mm hole size comparison - drm/mm remove invalid entry optimization - optimise drm/mm hole handling - VRR debugfs added - uncompressed AFBC modifier support - multiple display id blocks in EDID - multiple driver sg handling fixes - __drm_atomic_helper_crtc_reset in all drivers - managed vram helpers ttm: - ttm_mem_reg handling cleanup - remove bo offset field - drop CMA memtype flag - drop mappable flag xilinx: - New Xilinx ZynqMP DisplayPort Subsystem driver nouveau: - add CRC support - start using NVIDIA published class header files - convert all push buffer emission to new macros - Proper push buffer space management for EVO/NVD channels. - firmware loading fixes - 2MiB system memory pages support on Pascal and newer vkms: - larget cursor support i915: - Rocketlake platform enablement - Early DG1 enablement - Numerous GEM refactorings - DP MST fixes - FBC, PSR, Cursor, Color, Gamma fixes - TGL, RKL, EHL workaround updates - TGL 8K display support fixes - SDVO/HDMI/DVI fixes amdgpu: - Initial support for Sienna Cichlid GPU - Initial support for Navy Flounder GPU - SI UVD/VCE support - expose rotation property - Add support for unique id on Arcturus - Enable runtime PM on vega10 boards that support BACO - Skip BAR resizing if the bios already did id - Major swSMU code cleanup - Fixes for DCN bandwidth calculations amdkfd: - Track SDMA usage per process - SMI events interface radeon: - Default to on chip GART for AGP boards on all arches - Runtime PM reference count fixes msm: - headers regenerated causing churn - a650/a640 display and GPU enablement - dpu dither support for 6bpc panels - dpu cursor fix - dsi/mdp5 enablement for sdm630/sdm636/sdm66 tegra: - video capture prep support - reflection support mediatek: - convert mtk_dsi to bridge API meson: - FBC support sun4i: - iommu support rockchip: - register locking fix - per-pixel alpha support PX30 VOP - mgag200: - ported to simple and shmem helpers - device init cleanups - use managed pci functions - dropped hw cursor support ast: - use managed pci functions - use managed VRAM helpers - rework cursor support malidp: - dev_groups support hibmc: - refactor hibmc_drv_vdac: vc4: - create TXP CRTC imx: - error path fixes and cleanups etnaviv: - clock handling and error handling cleanups - use pin_user_pages -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJfK1atAAoJEAx081l5xIa+vDkQAJvl/mjbEA7fDy8Ysa0cgPLI 8nI4Bo/MaxkyRfUcP8+f/n3QQrRME37C0xa/Mn6SG1oFAdlovPwDqmDr5kjhkrMI geo8oJb2Q+AsrJr+ejpuF+iq0FxWi64bLbwJFJ2nBet+lHTMzoPceWeq3gG1Vvfl h6PV4B/9TjrnbhcKLIQSEmJ0kZp9uMkDBF/iynVn4+AKAkG1rQNjigdTH48IFPoz 28KuqG0B4NWu648zYXhjsN0kD3Dxjv3YOH+FsoWQpQa9icCTySYbySsQ7l0/XvA3 4BPtP3rWMhU46FHTBkWF72WQR4F0B4wm5DJJKMeG4vb1mXakOqAKcAq7JAbka+wL PBIiU+AcAKRSiwHmYDuDWtDoSpvYncuo0p3IvNP5hhih+7hqCnLIULDWS+V8AUzW 39usS/DXsVKk/HGYIYC89cRwsqWYD4c7edzOBdPQxW4LNYCD2gXezLJ5TeeR2lih y9JIVnPiluWleOovs4W3BoZNRuLc1rHBO6COToXjlme/48Z+sRHBAoge6UZurqRN jr+e60cS7n/DOeJQuNf4UHZnK48Pc24+3kVfMHlX+OKn8VuKPGr+USkeHV/NYL/B USiKCAxkkZM0dxerSb1/Ra9kGnchf0QBpA6Fsem8kV61Z4GVc+K6xJWg7KXB6n/3 7ZyalUKLwlOCz9sYsCCe =Yvtd -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "New xilinx displayport driver, AMD support for two new GPUs (more header files), i915 initial support for RocketLake and some work on their DG1 (discrete chip). The core also grew some lockdep annotations to try and constrain what drivers do with dma-fences, and added some documentation on why the idea of indefinite fences doesn't work. The long list is below. I do have some fixes trees outstanding, but I'll follow up with those later. core: - add user def flag to cmd line modes - dma_fence_wait added might_sleep - dma-fence lockdep annotations - indefinite fences are bad documentation - gem CMA functions used in more drivers - struct mutex removal - more drm_ debug macro usage - set/drop master api fixes - fix for drm/mm hole size comparison - drm/mm remove invalid entry optimization - optimise drm/mm hole handling - VRR debugfs added - uncompressed AFBC modifier support - multiple display id blocks in EDID - multiple driver sg handling fixes - __drm_atomic_helper_crtc_reset in all drivers - managed vram helpers ttm: - ttm_mem_reg handling cleanup - remove bo offset field - drop CMA memtype flag - drop mappable flag xilinx: - New Xilinx ZynqMP DisplayPort Subsystem driver nouveau: - add CRC support - start using NVIDIA published class header files - convert all push buffer emission to new macros - Proper push buffer space management for EVO/NVD channels. - firmware loading fixes - 2MiB system memory pages support on Pascal and newer vkms: - larger cursor support i915: - Rocketlake platform enablement - Early DG1 enablement - Numerous GEM refactorings - DP MST fixes - FBC, PSR, Cursor, Color, Gamma fixes - TGL, RKL, EHL workaround updates - TGL 8K display support fixes - SDVO/HDMI/DVI fixes amdgpu: - Initial support for Sienna Cichlid GPU - Initial support for Navy Flounder GPU - SI UVD/VCE support - expose rotation property - Add support for unique id on Arcturus - Enable runtime PM on vega10 boards that support BACO - Skip BAR resizing if the bios already did id - Major swSMU code cleanup - Fixes for DCN bandwidth calculations amdkfd: - Track SDMA usage per process - SMI events interface radeon: - Default to on chip GART for AGP boards on all arches - Runtime PM reference count fixes msm: - headers regenerated causing churn - a650/a640 display and GPU enablement - dpu dither support for 6bpc panels - dpu cursor fix - dsi/mdp5 enablement for sdm630/sdm636/sdm66 tegra: - video capture prep support - reflection support mediatek: - convert mtk_dsi to bridge API meson: - FBC support sun4i: - iommu support rockchip: - register locking fix - per-pixel alpha support PX30 VOP mgag200: - ported to simple and shmem helpers - device init cleanups - use managed pci functions - dropped hw cursor support ast: - use managed pci functions - use managed VRAM helpers - rework cursor support malidp: - dev_groups support hibmc: - refactor hibmc_drv_vdac: vc4: - create TXP CRTC imx: - error path fixes and cleanups etnaviv: - clock handling and error handling cleanups - use pin_user_pages" * tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm: (1747 commits) drm/msm: use kthread_create_worker instead of kthread_run drm/msm/mdp5: Add MDP5 configuration for SDM636/660 drm/msm/dsi: Add DSI configuration for SDM660 drm/msm/mdp5: Add MDP5 configuration for SDM630 drm/msm/dsi: Add phy configuration for SDM630/636/660 drm/msm/a6xx: add A640/A650 hwcg drm/msm/a6xx: hwcg tables in gpulist drm/msm/dpu: add SM8250 to hw catalog drm/msm/dpu: add SM8150 to hw catalog drm/msm/dpu: intf timing path for displayport drm/msm/dpu: set missing flush bits for INTF_2 and INTF_3 drm/msm/dpu: don't use INTF_INPUT_CTRL feature on sdm845 drm/msm/dpu: move some sspp caps to dpu_caps drm/msm/dpu: update UBWC config for sm8150 and sm8250 drm/msm/dpu: use right setup_blend_config for sm8150 and sm8250 drm/msm/a6xx: set ubwc config for A640 and A650 drm/msm/adreno: un-open-code some packets drm/msm: sync generated headers drm/msm/a6xx: add build_bw_table for A640/A650 drm/msm/a6xx: fix crashstate capture for A650 ...	2020-08-05 19:50:06 -07:00
Dave Airlie	6c28aed6e5	drm/amdgfx/ttm: use wrapper to get ttm memory managers Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-38-airlied@gmail.com	2020-08-06 12:32:03 +10:00
Dave Airlie	6fe1c54353	drm/amdgpu/ttm: use new takedown path Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-28-airlied@gmail.com	2020-08-06 12:32:03 +10:00
Dave Airlie	158d20d185	drm/amdgpu/ttm: init managers from the driver side. Use new init calls to unwrap manager init Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-16-airlied@gmail.com	2020-08-06 12:31:36 +10:00
Dave Airlie	46bca88bbd	drm/ttm/amdgpu: consolidate ttm reserve paths Drop the WARN_ON and consolidate the two paths into one. Use the consolidate slowpath in the execbuf utils code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-6-airlied@gmail.com	2020-08-06 12:16:31 +10:00
Alex Deucher	87ded5caee	drm/amdgpu: move vram usage by vbios to mman (v2) It's related to the memory manager so move it there. v2: inline the structure Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	72de33f8f7	drm/amdgpu: move IP discovery data to mman It's related to the memory manager so move it there. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	cacbbe7c00	drm/amdgpu: move stolen memory from gmc to mman It's more related to memory management than memory controller. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	7438ae6e52	drm/amdgpu/gmc: disable keep_stolen_vga_memory on arcturus I suspect the only reason this was set was to avoid touching the display related registers on arcturus. Someone should double check this on arcturus with S3. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	14b18937cb	drm/amdgpu: drop the CPU pointers for the stolen vga bos We never use them. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	7348c20a4e	drm/amdgpu/gmc10: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	7b885f0eb4	drm/amdgpu/gmc9: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	3853626d2c	drm/amdgpu/gmc8: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	71755699b5	drm/amdgpu/gmc7: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	422fe8d27d	drm/amdgpu/gmc6: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	dd285c5df9	drm/amdgpu/gmc: add new helper to get the FB size used by pre-OS console This adds a new gmc callback to get the size reserved by the pre-OS console and provides a helper function for use by gmc IP drivers. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	0635019412	drm/amdgpu: add support for extended stolen vga memory This will allow us to split the allocation for systems where we have to keep the stolen memory around to avoid S3 issues. This way we don't waste as much memory and still avoid any screen artifacts during the bios to driver transition. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	5db62dc8d4	drm/amdgpu: move keep stolen memory check into gmc core Rather than leaving this as a gmc v9 specific hack. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	fcbc92e2e1	drm/amdgpu: move stolen vga bo from amdgpu to amdgpu.gmc Since that is where we store the other data related to the stolen vga memory. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	81b54fb7a2	drm/amdgpu: use a define for the memory size of the vga emulator Rather than open coding it everywhere. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	adb5be8122	drm/amdgpu: use create_at for the stolen pre-OS buffer Should be functionally the same since nothing else is allocated at that point, but let's be exact. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	37912e963d	drm/amdgpu: handle bo size 0 in amdgpu_bo_create_kernel_at (v2) Just return early to match other bo_create functions. v2: check if the bo_ptr is NULL rather than checking the size. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
John Clements	4bfb74282f	drm/amdgpu: added RAS EEPROM device support check updated RAS EEPROM init/threshold sequences to check for device support Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:18 -04:00
John Clements	0ad7a64d69	drm/amdgpu: enable RAS support for sienna cichlid enabled GECC error injection and query support Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:08 -04:00
Monk Liu	a300de40f6	drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5) what: the MQD's save and restore of KCQ (kernel compute queue) cost lots of clocks during world switch which impacts a lot to multi-VF performance how: introduce a paramter to control the number of KCQ to avoid performance drop if there is no kernel compute queue needed notes: this paramter only affects gfx 8/9/10 v2: refine namings v3: choose queues for each ring to that try best to cross pipes evenly. v4: fix indentation some cleanupsin the gfx_compute_queue_acquire() v5: further fix on indentations more cleanupsin gfx_compute_queue_acquire() TODO: in the future we will let hypervisor driver to set this paramter automatically thus no need for user to configure it through modprobe in virtual machine Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:29 -04:00
Guchun Chen	9b856defbe	drm/amdgpu: update eeprom once specifying one bigger threshold(v3) During driver's probe, when it hits bad gpu tag in eeprom i2c init calling(the tag was set when reported bad page reaches bad page threshold in last driver's working loop), there are some strategys to deal with the cases: 1. when the module parameter amdgpu_bad_page_threshold = 0, that means page retirement feature is disabled, so just resetting the eeprom is fine. 2. When amdgpu_bad_page_threshold is not 0, and moreover, user sets one bigger valid data in order to make current boot up succeeds, correct eeprom header tag and do not break booting. 3. For other cases, driver's probe will be broken. v2: Just update eeprom header tag instead of resetting the whole table header when user sets one bigger threshold data. v3: Use dev_info/dev_err to print PCI device information, which helps in mGPU case. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:25 -04:00
Guchun Chen	a219ecbb83	drm/amdgpu: disable page reservation when amdgpu_bad_page_threshold = 0 When amdgpu_bad_page_threshold = 0, bad page reservation stuffs are skipped in either UMC ECC irq or page retirement calling of sync flood isr. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:20 -04:00
Guchun Chen	f848159b57	drm/amdgpu: decouple sysfs creating of bad page node Bad page information should not be exposed by sysfs when bad page retirement is disabled, so decouple it from ras sysfs group creating, and add one guard before creating. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:16 -04:00
Guchun Chen	eb0c3cd48f	drm/amdgpu: add one definition for RAS's sysfs/debugfs name(v2) Add one definition for the RAS module's FS name. It's used in both debugfs and sysfs cases. v2: Use static variable instead of macro definition. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:08 -04:00
Guchun Chen	bf0b91b78f	drm/amdgpu: restore ras flags when user resets eeprom(v2) RAS flags needs to be cleaned as well when user requires one clean eeprom. v2: RAS flags shall be restored after eeprom reset succeeds. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:04 -04:00
Guchun Chen	e8fbaf0342	drm/amdgpu: break GPU recovery once it's in bad state(v4) When GPU executes recovery and retriving bad GPU tag from external eerpom device, the recovery will be broken and error message is printed as well for user's awareness. v2: Refine warning message in threshold reaching case, and fix spelling typo. v3: Fix explicit calling of bad gpu. v4: Rename function names. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:54 -04:00
Guchun Chen	9c06f91ff2	drm/amdgpu: schedule ras recovery when reaching bad page threshold(v2) Once the bad page saved to eeprom reaches the configured threshold, ras recovery will be issued to notify user. v2: Fix spelling typo. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:46 -04:00
Guchun Chen	35cd2cdadb	drm/amdgpu: skip bad page reservation once issuing from eeprom write Once the ras recovery is issued from eeprom write itself, bad page reservation should be ignored, otherwise, recursive calling of writting to eeprom would happen. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:38 -04:00
Guchun Chen	b82e65a935	drm/amdgpu: break driver init process when it's bad GPU(v5) When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check. v2: Fix spelling typo, correct the condition to detect bad gpu tag and refine error message. v3: Refine function argument name. v4: Fix missing check of returning value of i2c initialization error case. v5: Use dev_err to print PCI information in dmesg instead of DRM_ERROR. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:31 -04:00
Guchun Chen	1d6a9d122d	drm/amdgpu: add bad gpu tag definition This tag will be hired for bad gpu detection in eeprom's access. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:16 -04:00
Guchun Chen	c84d46707e	drm/amdgpu: validate bad page threshold in ras(v3) Bad page threshold value should be valid in the range between -1 and max records length of eeprom. It could determine when saved bad pages exceed threshold value, and proceed corresponding actions. v2: When using the default typical value, it should be min value between typical value and eeprom max records length. v3: drop the case of setting bad_page_cnt_threshold to be 0xFFFFFFFF, as it confuses user. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:25:58 -04:00
Guchun Chen	acc0204cdb	drm/amdgpu: add bad page count threshold in module parameter(v3) bad_page_threshold could be configured to enable/disable the associated bad page retirement feature in RAS. When it's -1, ras will use typical bad page failure value to handle bad page retirement. When it's 0, disable bad page retirement, and no bad page will be recorded and saved. For other valid value, driver will use this manual value as the threshold value of totoal bad pages. v2: correct documentation of this parameter. v3: remove confused statement in documentation. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:25:41 -04:00
Christian König	1a3fb59085	drm/ttm: remove the init_mem_type callback It is a very strange concept to call a function which just calls back the caller for the functions parameters. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/382085/	2020-07-31 17:14:37 +02:00
Christian König	473633540c	drm/amdgpu: stop implementing init_mem_type Instead just initialize the memory type parameters before calling ttm_bo_init_mm. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/382084/	2020-07-31 17:13:34 +02:00
Christian König	be1213a341	drm/ttm: remove TTM_MEMTYPE_FLAG_FIXED v2 Instead use a boolean field in the memory manager structure. Also invert the meaning of the field since the use of a TT structure is the special case here. v2: cleanup zero init. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/382079/	2020-07-31 17:13:09 +02:00
Christian König	418d2ad1ac	drm/ttm: initialize the system domain with defaults v2 Instead of repeating that in each driver. v2: keep the caching limitation for VMWGFX for now. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/382078/	2020-07-31 17:12:51 +02:00
Alex Deucher	2456c290a7	Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers" This regressed some working configurations so revert it. Will fix this properly for 5.9 and backport then. This reverts commit `38e0c89a19`. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-07-30 15:36:44 -04:00
Jiansong Chen	74b3595913	drm/amdgpu: enable GFXOFF for navy_flounder Enable GFXOFF for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Liu ChengZhe	f61772cd13	drm amdgpu: Skip tmr load for SRIOV 1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation; 2. Check pointer before release firmware. v2: use CHIP_SIENNA_CICHLID instead v3: remove local "bool ret"; fix grammer issue v4: use my name instead of "root" v5: fix grammer issue and indent issue Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Liu ChengZhe	392cf6a739	drm/amdgpu: fix PSP autoload twice in FLR Assigning false to block->status.hw overwrites PSP's previous hardware status, which causes the PSP to Resume operation after hardware init. Remove this assignment and let the PSP execute Resume operation when it is told to. v2: Remove the braces. v3: Modify the description. Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Peilin Ye	8e32628592	drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl() Compiler leaves a 4-byte hole near the end of `dev_info`, causing amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace when `size` is greater than 356. In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which unfortunately does not initialize that 4-byte hole. Fix it by using memset() instead. Cc: stable@vger.kernel.org Fixes: `c193fa91b9` ("drm/amdgpu: information leak in amdgpu_info_ioctl()") Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Jiansong Chen	defa489636	drm/amdgpu: update GC golden setting for navy_flounder Update GC golden setting for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
John Clements	01eee24fce	drm/amdgpu: enable umc 8.7 functions in gmc v10 add support for umc 8.7 initialization add umc 8.7 source to makefile Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:33 -04:00
Huang Rui	35dab589de	drm/amdgpu: skip crit temperature values on APU (v2) It doesn't expose PPTable descriptor on APU platform. So max/min temperature values cannot be got from APU platform. v2: Stoney needs to skip crit temperature as well. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 14:14:07 -04:00
Alex Deucher	87004abfbc	Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers" This regressed some working configurations so revert it. Will fix this properly for 5.9 and backport then. This reverts commit `38e0c89a19`. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-07-30 11:03:28 -04:00
Peilin Ye	543e8669ed	drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl() Compiler leaves a 4-byte hole near the end of `dev_info`, causing amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace when `size` is greater than 356. In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which unfortunately does not initialize that 4-byte hole. Fix it by using memset() instead. Cc: stable@vger.kernel.org Fixes: `c193fa91b9` ("drm/amdgpu: information leak in amdgpu_info_ioctl()") Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 11:02:10 -04:00
Ahmed S. Darwish	cd29f22019	dma-buf: Use sequence counter with associated wound/wait mutex A sequence counter write side critical section must be protected by some form of locking to serialize writers. If the serialization primitive is not disabling preemption implicitly, preemption has to be explicitly disabled before entering the sequence counter write side critical section. The dma-buf reservation subsystem uses plain sequence counters to manage updates to reservations. Writer serialization is accomplished through a wound/wait mutex. Acquiring a wound/wait mutex does not disable preemption, so this needs to be done manually before and after the write side critical section. Use the newly-added seqcount_ww_mutex_t instead: - It associates the ww_mutex with the sequence count, which enables lockdep to validate that the write side critical section is properly serialized. - It removes the need to explicitly add preempt_disable/enable() around the write side critical section because the write_begin/end() functions for this new data type automatically do this. If lockdep is disabled this ww_mutex lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://lkml.kernel.org/r/20200720155530.1173732-13-a.darwish@linutronix.de	2020-07-29 16:14:25 +02:00
Dave Airlie	08bb88cfc4	drm/ttm: make ttm_tt unbind function return void. The return value just led to BUG_ON, I think if a driver wants to BUG_ON here it can do it itself. (don't BUG_ON). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728040003.20398-1-airlied@gmail.com	2020-07-29 09:43:06 +10:00
Alex Deucher	6cd3c6798a	drm/amdgpu/si: initial support for GPU reset Ported from radeon. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-28 09:22:57 -04:00
Mauro Rossi	64200c468f	drm/amdgpu: enable DC support for SI parts (v2) [Why] amdgpu_device.c requires changes for SI chipsets support si.c require changes for Display Manager IP block enabling [How] amdgpu_device.c: add SI families in amdgpu_device_asic_has_dc_support() si.c: changes in si_set_ip_blocks() for Display Manager IP blocks enablement (v1) NOTE: As per Kaveri and older amdgpu.dc=1 kernel cmdline is required (v2) fix for bc011f9350 ("drm/amdgpu: Change SI/CI gfx/sdma/smu init sequence") remove CHIP_HAINAN support since it does not have physical DCE6 module Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-28 09:22:48 -04:00
Paul Menzel	7427a7a0b3	drm/amdgpu: Change type of module param `ppfeaturemask` to hexint The newly added hexint helper is more convenient for bitmasks. Before: $ more /sys/module/amdgpu/parameters/ppfeaturemask 4294950911 After: $ more /sys/module/amdgpu/parameters/ppfeaturemask 0xffffbfff Cc: amd-gfx@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/374724/	2020-07-28 13:44:54 +02:00
John Clements	48ef409c25	drm/amdgpu: add support for umc 8.7 ras functions added support for umc 8.7 error reporting and query Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:23:00 -04:00
Evan Quan	81b41ff5d2	drm/amd/powerplay: revise the outputs layout of amdgpu_pm_info debugfs The current outputs of amdgpu_pm_info debugfs come with clock gating status and followed by current clock/power information. However the clock gating status retrieving may pull GFX out of CG status. That will make the succeeding clock/power information retrieving inaccurate. To overcome this and be with minimum impact, the outputs are updated to show current clock/power information first. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:22:37 -04:00
James Zhu	1df67a4ece	Revert "drm/amdgpu/vcn3.0: remove extra asic type check" This reverts commit 058c07201ec7d373fc6a0a570b38a8a9d62c29fb. Chip NAVY_FLOUNDER uses vcn3.0, but it has only one VCN instance. Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:22:23 -04:00
Mukul Joshi	2c2b0d880f	drm/amdkfd: Add thermal throttling SMI event Add support for reporting thermal throttling events through SMI. Also, add a counter to count the number of throttling interrupts observed and report the count in the SMI event message. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:50 -04:00
Dennis Li	df9c8d1aa2	drm/amdgpu: fix system hang issue during GPU reset when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:37 -04:00
Boyuan Zhang	c5079f35c0	drm/amdgpu: update dec ring test for VCN 3.0 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:16 -04:00
James Zhu	309182389e	drm/amdgpu/vcn3.0: remove extra asic type check vcn ip block is already selected based on ASIC type during set_ip_blocks. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:05 -04:00
James Zhu	156589f74d	drm/amdgpu/jpeg3.0: remove extra asic type check jpeg ip block is already selected based on ASIC type during set_ip_blocks. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:00 -04:00
James Zhu	8214617aaf	drm/amdgpu: Remove extra asic type check vcn ip block is already selected based on ASIC type during set_ip_blocks Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:20:35 -04:00
James Zhu	de7fe7e87a	drm/amdgpu/jpeg: Remove extra asic type check jpeg ip block is already selected based on ASIC type during set_ip_blocks. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:20:22 -04:00
Dave Airlie	92be423922	Merge tag 'amd-drm-next-5.9-2020-07-24' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.9-2020-07-24: amdgpu: - Misc sienna cichlid fixes - Final bits of swSMU cleanup - Misc display fixes - Misc VCN fixes - Eeprom i2c cleanup - Drop amd vrr_range debugfs in favor of core drm Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200724205712.3913-1-alexander.deucher@amd.com	2020-07-27 12:32:13 +10:00
Dave Airlie	41206a073c	Linux 5.8-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG 0+eXO4A= =9EpR -----END PGP SIGNATURE----- Merge v5.8-rc6 into drm-next I've got a silent conflict + two trees based on fixes to merge. Fixes a silent merge with amdgpu Signed-off-by: Dave Airlie <airlied@redhat.com>	2020-07-24 08:48:05 +10:00
Alex Deucher	ccda42a462	drm/amdgpu/powerplay: add some documentation about memory clock We expose the actual memory controller clock rate in Linux, not the effective memory clock of the DRAMs. To translate it, it follows the following formula: Clock conversion (Mhz): HBM: effective_memory_clock = memory_controller_clock * 1 G5: effective_memory_clock = memory_controller_clock * 1 G6: effective_memory_clock = memory_controller_clock * 2 DRAM data rate (MT/s): HBM: effective_memory_clock * 2 = data_rate G5: effective_memory_clock * 4 = data_rate G6: effective_memory_clock * 8 = data_rate Bandwidth (MB/s): data_rate * vram_bit_width / 8 = memory_bandwidth Some examples: G5 on RX460: memory_controller_clock = 1750 Mhz effective_memory_clock = 1750 Mhz * 1 = 1750 Mhz data rate = 1750 * 4 = 7000 MT/s memory_bandwidth = 7000 * 128 bits / 8 = 112000 MB/s G6 on RX5600: memory_controller_clock = 900 Mhz effective_memory_clock = 900 Mhz * 2 = 1800 Mhz data rate = 1800 * 8 = 14400 MT/s memory_bandwidth = 14400 * 192 bits / 8 = 345600 MB/s Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-23 10:45:16 -04:00
John Clements	c5a4ef3e20	drm/amdgpu: move umc specific macros to header certain umc macros are common across umc versions Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-23 10:45:00 -04:00

... 4 5 6 7 8 ...

8299 Commits