linux

Author	SHA1	Message	Date
Yintian Tao	d2790e10d3	drm/amdgpu: no need to clean debugfs at amdgpu drm_minor_unregister will invoke drm_debugfs_cleanup to clean all the child node under primary minor node. We don't need to invoke amdgpu_debugfs_fini and amdgpu_debugfs_regs_cleanup to clean agian. Otherwise, it will raise the NULL pointer like below. [ 45.046029] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 45.047256] PGD 0 P4D 0 [ 45.047713] Oops: 0002 [#1] SMP PTI [ 45.048198] CPU: 0 PID: 2796 Comm: modprobe Tainted: G W OE 4.18.0-15-generic #16~18.04.1-Ubuntu [ 45.049538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 45.050651] RIP: 0010:down_write+0x1f/0x40 [ 45.051194] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 ce d9 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 85 d2 74 05 e8 53 1c ff ff 65 48 8b 04 25 00 5c 01 [ 45.053702] RSP: 0018:ffffad8f4133fd40 EFLAGS: 00010246 [ 45.054384] RAX: 00000000000000a8 RBX: 00000000000000a8 RCX: ffffa011327dd814 [ 45.055349] RDX: ffffffff00000001 RSI: 0000000000000001 RDI: 00000000000000a8 [ 45.056346] RBP: ffffad8f4133fd48 R08: 0000000000000000 R09: ffffffffc0690a00 [ 45.057326] R10: ffffad8f4133fd58 R11: 0000000000000001 R12: ffffa0113cff0300 [ 45.058266] R13: ffffa0113c0a0000 R14: ffffffffc0c02a10 R15: ffffa0113e5c7860 [ 45.059221] FS: 00007f60d46f9540(0000) GS:ffffa0113fc00000(0000) knlGS:0000000000000000 [ 45.060809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.061826] CR2: 00000000000000a8 CR3: 0000000136250004 CR4: 00000000003606f0 [ 45.062913] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.064404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 45.065897] Call Trace: [ 45.066426] debugfs_remove+0x36/0xa0 [ 45.067131] amdgpu_debugfs_ring_fini+0x15/0x20 [amdgpu] [ 45.068019] amdgpu_debugfs_fini+0x2c/0x50 [amdgpu] [ 45.068756] amdgpu_pci_remove+0x49/0x70 [amdgpu] [ 45.069439] pci_device_remove+0x3e/0xc0 [ 45.070037] device_release_driver_internal+0x18a/0x260 [ 45.070842] driver_detach+0x3f/0x80 [ 45.071325] bus_remove_driver+0x59/0xd0 [ 45.071850] driver_unregister+0x2c/0x40 [ 45.072377] pci_unregister_driver+0x22/0xa0 [ 45.073043] amdgpu_exit+0x15/0x57c [amdgpu] [ 45.073683] __x64_sys_delete_module+0x146/0x280 [ 45.074369] do_syscall_64+0x5a/0x120 [ 45.074916] entry_SYSCALL_64_after_hwframe+0x44/0xa9 v2: remove all debugfs cleanup/fini code at amdgpu v3: squash in unused variable removal Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:22 -05:00
Jacob He	460c484f24	drm/amdgpu: Initialize SPM_VMID with 0xf (v2) SPM_VMID is a global resource, SPM access the video memory according to SPM_VMID. The initial valude of SPM_VMID is 0 which is used by kernel. That means UMD can overwrite the memory of VMID0 by enabling SPM, that is really dangerous. Initialize SPM_VMID with 0xf, it messes up other user mode process at most. v2: squash in indentation fix Signed-off-by: Jacob He <jacob.he@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:21 -05:00
Emily Deng	89510a2737	drm/amdgpu/sriov: Use kiq to copy the gpu clock For vega10 sriov, the register is blocked, use copy data command to fix the issue. v2: Rename amdgpu_kiq_read_clock to gfx_v9_0_kiq_read_clock. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:21 -05:00
Yong Zhao	fd7d08bad7	drm/amdkfd: Make get_tile_config() generic Given we can query all the asic specific information from amdgpu_gfx_config, we can make get_tile_config() generic. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:20 -05:00
Yong Zhao	94b5c215ce	drm/amdgpu: Add num_banks and num_ranks to gfx config structure The two members will be used by KFD later. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:20 -05:00
Dave Airlie	a2ae604da7	Merge tag 'amd-drm-next-5.7-2020-02-26' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.7-2020-02-26: amdgpu: - Rework VM update handling in preparation for HMM support - HDCP srm support - PSR fixes - DC watermark fixes - OLED panel support - SR-IOV fixes - BACO fixes - Optimize debugging vram access - RAS fixes - Use BACO for runtime pm - HDCP fixes - XGMI fixes - DDC fixes - DC clock programming optimizations and fixes - PSP fw loading sequence updates - Drop DRIVER_USE_AGP - Remove legacy drm load and unload callbacks amdkfd: - Add runtime pm support radeon: - Drop DRIVER_USE_AGP Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227043142.4075-1-alexander.deucher@amd.com	2020-02-28 15:40:26 +10:00
Christian König	bd2275eeed	dma-buf: drop dynamic_mapping flag Instead use the pin() callback to detect dynamic DMA-buf handling. Since amdgpu is now migrated it doesn't make much sense to keep the extra flag. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353997/?series=73646&rev=1	2020-02-27 14:58:01 +01:00
Christian König	a448cb003e	drm/amdgpu: implement amdgpu_gem_prime_move_notify v2 Implement the importer side of unpinned DMA-buf handling. v2: update page tables immediately Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353998/?series=73646&rev=1	2020-02-27 14:58:01 +01:00
Christian König	2d4dad2734	drm/amdgpu: add amdgpu_dma_buf_pin/unpin v2 This implements the exporter side of unpinned DMA-buf handling. v2: fix minor coding style issues Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353999/?series=73646&rev=1	2020-02-27 14:58:01 +01:00
Christian König	4993ba0263	drm/amdgpu: use allowed_domains for exported DMA-bufs Avoid that we ping/pong the buffers when we stop to pin DMA-buf exports by using the allowed domains for exported buffers. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353996/?series=73646&rev=1	2020-02-27 14:58:01 +01:00
Christian König	bb42df4662	dma-buf: add dynamic DMA-buf handling v15 On the exporter side we add optional explicit pinning callbacks. Which are called when the importer doesn't implement dynamic handling, move notification or need the DMA-buf locked in place for its use case. On the importer side we add an optional move_notify callback. This callback is used by the exporter to inform the importers that their mappings should be destroyed as soon as possible. This allows the exporter to provide the mappings without the need to pin the backing store. v2: don't try to invalidate mappings when the callback is NULL, lock the reservation obj while using the attachments, add helper to set the callback v3: move flag for invalidation support into the DMA-buf, use new attach_info structure to set the callback v4: use importer_priv field instead of mangling exporter priv. v5: drop invalidation_supported flag v6: squash together with pin/unpin changes v7: pin/unpin takes an attachment now v8: nuke dma_buf_attachment_(map\|unmap)_locked, everything is now handled backward compatible v9: always cache when export/importer don't agree on dynamic handling v10: minimal style cleanup v11: drop automatically re-entry avoidance v12: rename callback to move_notify v13: add might_lock in appropriate places v14: rebase on separated locking change v15: add EXPERIMENTAL flag, some more code comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353993/?series=73646&rev=1	2020-02-27 14:58:00 +01:00
Alex Deucher	c6385e503a	drm/amdgpu: drop legacy drm load and unload callbacks We've moved the debugfs handling into a centralized place so we can remove the legacy load an unload callbacks. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:13 -05:00
Alex Deucher	405a1f9090	drm/amdgpu/display: split dp connector registration (v4) Split into init and register functions to avoid a segfault in some configs when the load/unload callbacks are removed. v2: - add back accidently dropped has_aux setting - set dev in late_register v3: - fix dp cec ordering v4: - squash in kdev reference fix Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:13 -05:00
Alex Deucher	d090e7db5a	drm/amdgpu/display: move debugfs init into core amdgpu debugfs (v2) In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for display. v2: add config guard for DC Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Harry Wentland <harry.wentland@amd.com> (v1) Acked-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:13 -05:00
Alex Deucher	4074892967	drm/amdgpu: don't call drm_connector_register for non-MST ports The core does this for us now. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:13 -05:00
Alex Deucher	fd23cfcc2e	drm/amdgpu/ring: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for rings. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	cd9e29e717	drm/amdgpu/firmware: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for firmware. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	f9d64e6c4a	drm/amdgpu/regs: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for register access files. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	3f5cea671c	drm/amdgpu/gem: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for gem. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	24038d581c	drm/amdgpu/fence: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for fence handling. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	15997544a3	drm/amdgpu/sa: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for SA (sub allocator). Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	a4c5b1bb7b	drm/amdgpu/pm: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for pm. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	c5820361da	drm/amdgpu/ttm: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for ttm. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	923ffa6b02	drm/amdgpu: rename amdgpu_debugfs_preempt_cleanup to amdgpu_debugfs_fini. It will be used for other things in the future. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Yong Zhao	8bdab6bb1c	drm/amdgpu: Increase timout on emulator to tenfold instead of twice Since emulators are slower, sometime some operations like flushing tlb through FM need more than twice the regular timout of 100ms, so increase the timeout to 1s on emulators. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:03 -05:00
Yong Zhao	e694530418	drm/amdkfd: Avoid ambiguity by indicating it's cp queue The queues represented in queue_bitmap are only CP queues. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:20:05 -05:00
Divya Shikre	0c663695a6	drm/amd: Extend ROCt to surface UUID for devices that have them Devices from Arcturus onwards will have their UUID exposed to Thunk. Adding neccessary functions to the kernel to propagate the uuid. Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:18:17 -05:00
Kent Russell	944effd337	drm/amdgpu: Fix check for DPM when returning max clock pp_funcs may not exist, while dpm may be enabled. This change ensures that KFD topology will report the same as pp_dpm_sclk, as the conditions for reporting them will be the same. Otherwise, we may see the issue where KFD reports "100MHz" in topology as the max speed, while DPM is working correctly. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:42 -05:00
Rohit Khaire	75ddb640e1	drm/amdgpu: Don't write GCVM_L2_CNTL* regs on navi12 VF This change disables programming of GCVM_L2_CNTL* regs on VF. Signed-off-by: Rohit Khaire <Rohit.Khaire@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Daniel Vetter	f3ed67395d	drm/amdgpu: Drop DRIVER_USE_AGP This doesn't do anything except auto-init drm_agp support when you call drm_get_pci_dev(). Which amdgpu stopped doing with commit `b58c11314a` Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Jun 2 17:16:31 2017 -0400 drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev No idea whether this was intentional or accidental breakage, but I guess anyone who manages to boot a this modern gpu behind an agp bridge deserves a price. A price I never expect anyone to ever collect :-) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Xiaojie Yuan <xiaojie.yuan@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: "Tianci.Yin" <tianci.yin@amd.com> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Tom St Denis	669e2f91e4	drm/amd/amdgpu: Add gfxoff debugfs entry Write a 32-bit value of zero to disable GFXOFF and write a 32-bit value of non-zero to enable GFXOFF. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Nirmoy Das	c6fc97f9bc	drm/amdgpu: use amdgpu_ring_test_helper when possible amdgpu_ring_test_helper already handles ring->sched.ready correctly Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Christian König	42e5fee65e	drm/amdgpu: add VM update fences back to the root PD v2 Add update fences to the root PD while mapping BOs. Otherwise PDs freed during the mapping won't wait for updates to finish and can cause corruptions. v2: rebased on drm-misc-next Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `90b69cdc5f` drm/amdgpu: stop adding VM updates fences to the resv obj Reviewed-by: xinhui pan <xinhui.pan@amd.com> Tested-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Nirmoy Das	6f9f960472	drm/amdgpu: cleanup amdgpu_ring_fini cleanup amdgpu_ring_fini to check the prerequisites before changing ring->sched.ready Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
John Clements	ef1caf48bd	drm/amdgpu: Add Arcturus D342 page retire support Check Arcturus SKU type to select I2C address of page retirement EEPROM Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Hawking Zhang	938065d4cb	drm/amdgpu: toggle DF-Cstate to protect DF reg access driver needs to take DF out Cstate before any DF register access. otherwise, the DF register may not be accessible. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Hawking Zhang	19744f5f2d	drm/amdgpu: move get_xgmi_relative_phy_addr to amdgpu_xgmi.c centralize all the xgmi related function to amdgpu_xgmi.c Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Hawking Zhang	53e0f1e6be	drm/amdgpu: add dpm helper function for DF Cstate control The helper function hides software smu and legacy powerplay implementation for DF Cstate control. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Evan Quan	995da6cc4c	drm/amdgpu: update psp firmwares loading sequence V2 For those ASICs with DF Cstate management centralized to PMFW, TMR setup should be performed between pmfw loading and other non-psp firmwares loading. V2: skip possible SMU firmware reloading Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
xinhui pan	f4a3c42b5c	drm/amdgpu: Remove kfd eviction fence before release bo (v2) No need to trigger eviction as the memory mapping will not be used anymore. All pt/pd bos share same resv, hence the same shared eviction fence. Everytime page table is freed, the fence will be signled and that cuases kfd unexcepted evictions. v2: squash in 32 bit fix CC: Christian König <christian.koenig@amd.com> CC: Felix Kuehling <felix.kuehling@amd.com> CC: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Daniel Vetter	8a3bddf67c	drm/amdgpu: Drop DRIVER_USE_AGP This doesn't do anything except auto-init drm_agp support when you call drm_get_pci_dev(). Which amdgpu stopped doing with commit `b58c11314a` Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Jun 2 17:16:31 2017 -0400 drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev No idea whether this was intentional or accidental breakage, but I guess anyone who manages to boot a this modern gpu behind an agp bridge deserves a price. A price I never expect anyone to ever collect :-) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Xiaojie Yuan <xiaojie.yuan@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: "Tianci.Yin" <tianci.yin@amd.com> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-26 14:02:41 -05:00
Shirish S	a3ed353cf8	amdgpu/gmc_v9: save/restore sdpif regs during S3 fixes S3 issue with IOMMU + S/G enabled @ 64M VRAM. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-25 11:30:42 -05:00
Felix Kuehling	b80cd524ac	drm/amdgpu: Improve Vega20 XGMI TLB flush workaround Using a heavy-weight TLB flush once is not sufficient. Concurrent memory accesses in the same TLB cache line can re-populate TLB entries from stale texture cache (TC) entries while the heavy-weight TLB flush is in progress. To fix this race condition, perform another TLB flush after the heavy-weight one, when TC is known to be clean. Move the workaround into the low-level TLB flushing functions. This way they apply to amdgpu as well, and KIQ-based TLB flush only needs to synchronize once. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:57 -05:00
Monk Liu	82c4ebfa35	drm/amdgpu: fix psp ucode not loaded in bare-metal for bare-metal we alawys need to load sys/sos/kdb Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:49 -05:00
Shirish S	c2ecd79bec	amdgpu/gmc_v9: save/restore sdpif regs during S3 fixes S3 issue with IOMMU + S/G enabled @ 64M VRAM. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:26 -05:00
Alex Deucher	91aeda1811	drm/amdgpu/discovery: make the discovery code less chatty Make the IP block base output debug only. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:25 -05:00
Monk Liu	6325b38d89	drm/amdgpu: fix colliding of preemption what: some os preemption path is messed up with world switch preemption fix: cleanup those logics so os preemption not mixed with world switch this patch is a general fix for issues comes from SRIOV MCBP, but there is still UMD side issues not resovlved yet, so this patch cannot fix all world switch bug. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:25 -05:00
Monk Liu	f77a9c920a	drm/amdgpu: cleanup some incorrect reg access for SRIOV 1) we shouldn't load PSP kdb and sys/sos for VF, they are supposed to be handled by hypervisor 2) ih reroute doesn't work on VF thus we should avoid calling it, besides VF should not use those PSP register sets for PF 3) shouldn't load SMU ucode under SRIOV, otherwise PSP would report error Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:25 -05:00
Evan Quan	14008574a3	drm/amdgpu: drop the non-sense firmware version check on arcturus As the firmware versions of arcturus are different from other gfx9 ASICs. And the warning("CP firmware version too old, please update!") caused by this check can be eliminated. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
changzhu	f61f01b14d	drm/amdgpu: add is_raven_kicker judgement for raven1 The rlc version of raven_kicer_rlc is different from the legacy rlc version of raven_rlc. So it needs to add a judgement function for raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
Guchun Chen	3cd4f61859	drm/amdgpu: record non-zero error counter info in NBIO before resetting GPU When NBIO's RAS error happens, before trigging GPU reset, it's needed to record error counter information, which can correct the error counter value missed issue when reading from debugfs. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
Guchun Chen	313c8fd33e	drm/amdgpu: log on non-zero error conter per IP before GPU reset Once sync flood interrupt is triggered by RAS error, before actual GPU recovery job, it's necessary to log on and print non-zero error counter, this will help user knows where the RAS error source is from quickly. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
changzhu	debcf83770	drm/amdgpu: add is_raven_kicker judgement for raven1 The rlc version of raven_kicer_rlc is different from the legacy rlc version of raven_rlc. So it needs to add a judgement function for raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:33:42 -05:00
Maxime Ripard	28f2aff1ca	Linux 5.6-rc2 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5JsVQeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGHZEH+wddtJO4dZk5TZdF KZB2w2ldsCvrBZAmas1TVvm8ncvUf+ATUZcSIzvZ3YbLmxsuLF2Cz3kD+n+36Mvy ejwq8Scl7jwnouYps/Gfd6rRj/uCafqST4qp15GMGeiy2ST4A8dJrv5IAgZhD8/N SN1bSr1AXpZ2JlEzzLDQ/NdVoNMS6IzCOsaINZcc60/XQoQZFRBWamMJFqu+CmXD SBJOybQNFJhziy45cGZSAl+67sSCcoPftwTs0Stu4CJsvFWRb3MsbNTDS51Hjcc4 3tdgGOhoNXzyzZr96MEAHmiaW4VLQv0PGgUfOajE35viMz48OrwCTru8Kbuae3XM YHV4qJk= =yiem -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXkpecgAKCRDj7w1vZxhR xU4CAQD/8ZzfYroiiEI7FHEgvnuldGYZqA9kbNAO2Vueeac0OgD+Jojewdxy+pJ9 dxfA/POdtzx3hOdN+U1YgaDC0hbXwww= =XgKq -----END PGP SIGNATURE----- Merge v5.6-rc2 into drm-misc-next Lyude needs some patches in 5.6-rc2 and we didn't bring drm-misc-next forward yet, so it looks like a good occasion. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2020-02-17 10:34:34 +01:00
Alex Deucher	b08c3ed609	drm/amdgpu/gfx10: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Alex Deucher	120cf95930	drm/amdgpu/gfx9: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Alex Deucher	c657b936ea	drm/amdgpu/soc15: fix xclk for raven It's 25 Mhz (refclk / 4). This fixes the interpretation of the rlc clock counter. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Bhawanpreet Lakha	c6f8c44044	drm/amd/display: fix dtm unloading there was a type in the terminate command. We should be calling psp_dtm_unload() instead of psp_hdcp_unload() Fixes: `143f230533` ("drm/amdgpu: psp DTM init") Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-14 12:58:58 -05:00
Thomas Zimmermann	e3eff4b5d9	drm/amdgpu: Convert to CRTC VBLANK callbacks VBLANK callbacks in struct drm_driver are deprecated in favor of their equivalents in struct drm_crtc_funcs. Convert amdgpu over. v2: * don't wrap existing functions; change signature instead Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-6-tzimmermann@suse.de	2020-02-13 13:08:13 +01:00
Thomas Zimmermann	ea702333e5	drm/amdgpu: Convert to struct drm_crtc_helper_funcs.get_scanout_position() The callback struct drm_driver.get_scanout_position() is deprecated in favor of struct drm_crtc_helper_funcs.get_scanout_position(). Convert amdgpu over. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-5-tzimmermann@suse.de	2020-02-13 13:08:13 +01:00
Dan Carpenter	434cbcb1bd	drm/amdgpu: return -EFAULT if copy_to_user() fails The copy_to_user() function returns the number of bytes remaining to be copied, but we want to return a negative error code to the user. Fixes: `030d5b97a5` ("drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:41 -05:00
Alex Deucher	72b4c01d66	drm/amdgpu/gfx10: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:41 -05:00
Alex Deucher	e5f134958d	drm/amdgpu/gfx9: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:40 -05:00
Alex Deucher	b90c4d667c	drm/amdgpu/soc15: fix xclk for raven It's 25 Mhz (refclk / 4). This fixes the interpretation of the rlc clock counter. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:40 -05:00
Bhawanpreet Lakha	c786530b21	drm/amd/display: fix dtm unloading there was a type in the terminate command. We should be calling psp_dtm_unload() instead of psp_hdcp_unload() Fixes: `143f230533` ("drm/amdgpu: psp DTM init") Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:03:22 -05:00
Alex Deucher	4fdda2e66d	drm/amdgpu/runpm: enable runpm on baco capable VI+ asics Seems to work reliably on VI+ except for a few so enable runpm barring those where baco for runtime power management is not supported. [rajneesh] Picked https://patchwork.freedesktop.org/patch/335402/ to enable runtime pm with baco for kfd. Also fixed a checkpatch warning and added extra checks for VEGA20 and ARCTURUS. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:01:03 -05:00
Rajneesh Bhardwaj	9593f4d6a6	drm/amdkfd: refactor runtime pm for baco So far the kfd driver implemented same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd aquires an atomic lock that prevents any more user processes to create queues and interact with kfd driver and amd gpu. This mechanism created problem when amdgpu device is runtime suspended with BACO enabled. Any application that relies on kfd driver fails to load because the driver reports a locked kfd device since gpu is runtime suspended. However, in an ideal case, when gpu is runtime suspended the kfd driver should be able to: - auto resume amdgpu driver whenever a client requests compute service - prevent runtime suspend for amdgpu while kfd is in use This change refactors the amdgpu and amdkfd drivers to support BACO and runtime power management. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:00:54 -05:00
Rajneesh Bhardwaj	70bedd68e7	drm/amdgpu: Fix missing error check in suspend amdgpu_device_suspend might return an error code since it can be called from both runtime and system suspend flows. Add the missing return code in case of a failure. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:00:25 -05:00
Guchun Chen	a934f9d866	drm/amdgpu: correct comment to clear up the confusion Former comment looks to be one intended behavior in code, actually it's not. So correct it. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:40:12 -05:00
James Zhu	b5336bfd6f	drm/amdgpu/vcn2.5: fix warning Fix warning during switching to dpg pause mode for VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:37:22 -05:00
Guchun Chen	2cabe0d4cd	drm/amdgpu: limit GDS clearing workaround in cold boot sequence GDS clear workaround will cause gfx failure in suspend/resume case. [ 98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] ERROR late_init of IP block <gfx_v9_0> failed -110 [ 98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110 [ 98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110 As this workaround is specific to the HW bug of GDS's ECC error existing in cold boot up, so bypass this workaround in suspend/ resume case after booting up. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:37:02 -05:00
Jonathan Kim	46d1da733f	drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf hwc->conf was designated specifically for AMD APU IOMMU purposes. This could cause problems in performance and/or function since APU IOMMU implementation is elsewhere. Also hwc->conf and hwc->config are different members of an anonymous union so hwc->conf aliases as hw->last_tag. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:35:54 -05:00
James Zhu	f4d0242b7b	drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1 Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode power off issue on instance 1. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:10:36 -05:00
Alex Deucher	f0f7ddfc34	drm/amdgpu: add flag for runtime suspend So we know whether we in are in runtime suspend or system suspend. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:52 -05:00
xinhui pan	a6605c43f9	drm/amdgpu: Do not move root PT bo to relocated list As root PD has no parent, we just need move its status to idle. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> CC: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:38 -05:00
Guchun Chen	278628fa46	drm/amdgpu: correct comment to clear up the confusion Former comment looks to be one intended behavior in code, actually it's not. So correct it. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:31 -05:00
James Zhu	3b4a18a355	drm/amdgpu/vcn2.5: fix warning Fix warning during switching to dpg pause mode for VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:49:23 -05:00
Guchun Chen	ea6f0931c1	drm/amdgpu: limit GDS clearing workaround in cold boot sequence GDS clear workaround will cause gfx failure in suspend/resume case. [ 98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] ERROR late_init of IP block <gfx_v9_0> failed -110 [ 98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110 [ 98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110 As this workaround is specific to the HW bug of GDS's ECC error existing in cold boot up, so bypass this workaround in suspend/ resume case after booting up. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:49:16 -05:00
Linus Torvalds	c16b99d6c5	drm fixes for 5.6-rc1 tegra: - merge window regression fixes nouveau: - couple of volta/turing modesetting fixes amdgpu: - EDC fixes for Arcturus - GDDR6 memory training fixe - Fix for reading gfx clockgating registers while in GFXOFF state - i2c freq fixes - Misc display fixes - TLB invalidation fix when using semaphores - VCN 2.5 instancing fixes - Switch raven1 gfxoff to a blacklist - Coreboot workaround for KV/KB - Root cause dongle fixes for display and revert workaround - Enable GPU reset for renoir and navi - Navi overclocking fixes - Fix up confusing warnings in display clock validation on raven amdkfd: - SDMA fix radeon: - Misc LUT fixes -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJePNasAAoJEAx081l5xIa+axgP+wZbUkJZYt75p5thhptp2CY3 235yIZQcDy0T+J/34Agv0e5DfK2hCafPIVv7dIV1A9AjWxhVzIoZmD2mHSLKnxVb MsfemKsrm1WO9KezDkDIEOgkN5rY811xTIsg89VSjyiiPXQmwAcXQBbmIBHg9YtT 7ttek1afJv7h0HD4W2JdTxLEeKn9lpZHUXESXz+xoE8OUMDizMgmbt8nWXlbFpcw Tx1EmxZ3jbCYGsrcQBOLURZiwzOb2zbXJ9o8BgNdomvtI69l2/3mMonavw+IgVq+ aiHMgWViVsDDP8ijn8uBa7EEqtGWuqTRaL/Ei71w2MGkG7S3Nz4PxbLbl5P9NHAc D3OsiaXGdNhoSGfGD6sr7TlOIEo7gjLGKJ3fMW15XmBF4isb0lwlnIB7dTiU9Zaq UBMpNA9/Vtty2MtT1SkVKQrC+Mujl4lAYssWhcrh1/RX0Ij6do4aFswy17dbVg3Q LIigj1quIfNPXONb0fwkg5JGQyOMnrYs0Q6SkrhQfSR2UzsNWnGoBRyROcZipNnQ STaM8F6eEi7RbFUaT0uWL71GBxM00PyjQgpf5fnrOiKp6rfgK3B3ThpsVoAQ2OW4 CnodlCa1uxIkIgQdcoG/3vT5B6nzpdhgRVjHt+N5sraLjOZFZHxex0YmJAYXcN+/ QR8ARNgILJvA14xCsOT7 =C0nE -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Just some fixes for this merge window: the tegra changes fix some regressions in the merge, nouveau has a few modesetting fixes. The amdgpu fixes are bit bigger, but they contain a couple of weeks of fixes, and don't seem to contain anything that isn't really a fix. Summary: tegra: - merge window regression fixes nouveau: - couple of volta/turing modesetting fixes amdgpu: - EDC fixes for Arcturus - GDDR6 memory training fixe - Fix for reading gfx clockgating registers while in GFXOFF state - i2c freq fixes - Misc display fixes - TLB invalidation fix when using semaphores - VCN 2.5 instancing fixes - Switch raven1 gfxoff to a blacklist - Coreboot workaround for KV/KB - Root cause dongle fixes for display and revert workaround - Enable GPU reset for renoir and navi - Navi overclocking fixes - Fix up confusing warnings in display clock validation on raven amdkfd: - SDMA fix radeon: - Misc LUT fixes" * tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm: (90 commits) gpu: host1x: Set DMA direction only for DMA-mapped buffer objects drm/tegra: Reuse IOVA mapping where possible drm/tegra: Relax IOMMU usage criteria on old Tegra drm/amd/dm/mst: Ignore payload update failures drm/amdgpu: update default voltage for boot od table for navi1x drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2) drm/amdgpu: fetch default VDDC curve voltages (v2) drm/amdgpu/smu_v11_0: Correct behavior of restoring default tables (v2) drm/amdgpu/navi10: add OD_RANGE for navi overclocking drm/amdgpu/navi: fix index for OD MCLK drm/amd/display: Fix HW/SW state mismatch drm/amd/display: Fix a typo when computing dsc configuration drm/amd/powerplay: fix navi10 system intermittent reboot issue V2 drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode drm/amd/display: Only enable cursor on pipes that need it drm/nouveau/kms/gv100-: avoid sending a core update until the first modeset drm/nouveau/kms/gv100-: move window ownership setup into modesetting path drm/nouveau/disp/gv100-: halt NV_PDISP_FE_RM_INTR_STAT_CTRL_DISP_ERROR storms ...	2020-02-07 12:46:08 -08:00
Christian König	dd1ab79910	drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_access_memory v2 Make use of the better performance here as well. This patch is only compile tested! v2: fix calculation bug pointed out by Felix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:39 -05:00
Christian König	030d5b97a5	drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read This speeds up the access quite a bit from 2.2 MB/s to 2.9 MB/s on 32bit and 12,8 MB/s on 64bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:33 -05:00
Christian König	c12b84d6e0	drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2 This should speed up debugging VRAM access a lot. v2: add HDP flush/invalidate Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:25 -05:00
Christian König	ce05ac56e6	drm/amdgpu: optimize amdgpu_device_vram_access a bit. Only write the _HI register when necessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:17 -05:00
Jonathan Kim	42d708db8e	drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf hwc->conf was designated specifically for AMD APU IOMMU purposes. This could cause problems in performance and/or function since APU IOMMU implementation is elsewhere. Also hwc->conf and hwc->config are different members of an anonymous union so hwc->conf aliases as hw->last_tag. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:05 -05:00
James Zhu	80ff3e10c8	drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1 Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode power off issue on instance 1. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:44:51 -05:00
Jack Zhang	86b93fd62d	drm/amdgpu/sriov Don't send msg when smu suspend For sriov and pp_onevf_mode, do not send message to set smu status, because smu doesn't support these messages under VF. Besides, it should skip smu_suspend when pp_onevf_mode is disabled. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:44:24 -05:00
Hawking Zhang	0b9d37609a	drm/amdgpu: move xgmi init/fini to xgmi_add/remove_device call (v2) For sriov, psp ip block has to be initialized before ih block for the dynamic register programming interface that needed for vf ih ring buffer. On the other hand, current psp ip block hw_init function will initialize xgmi session which actaully depends on interrupt to return session context. This results an empty xgmi ta session id and later failures on all the xgmi ta cmd invoked from vf. xgmi ta session initialization has to be done after ih ip block hw_init call. to unify xgmi session init/fini for both bare-metal sriov virtualization use scenario, move xgmi ta init to xgmi_add_device call, and accordingly terminate xgmi ta session in xgmi_remove_device call. The existing suspend/resume sequence will not be changed. v2: squash in return fix from Nirmoy Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-06 15:04:36 -05:00
Christian König	9f3cc18d19	drm/amdgpu: rework synchronization of VM updates v4 If provided we only sync to the BOs reservation object and no longer to the root PD. v2: update comment, cleanup amdgpu_bo_sync_wait_resv v3: use correct reservation object while clearing v4: fix typo in amdgpu_bo_sync_wait_resv Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	4939d973b6	drm/amdgpu: simplify and fix amdgpu_sync_resv No matter what we always need to sync to moves. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	fe6796ac12	drm/amdgpu: allow higher level PD invalidations Allow partial invalidation on unallocated PDs. This is useful when we need to silence faults to stop interrupt floods on Vega. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	7d28efe0c3	drm/amdgpu: return EINVAL instead of ENOENT in the VM code That we can't find a PD above the root is expected can only happen if we try to update a larger range than actually managed by the VM. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	bfcd6c69e4	drm/amdgpu: fix parentheses in amdgpu_vm_update_ptes For the root PD mask can be 0xffffffff as well which would overrun to 0 if we don't cast it before we add one. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	46cf5f7626	drm/amdgpu: make sure to never allocate PDs/PTs for invalidations Make sure that we never allocate a page table for an invalidation operation. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	55cdd4e9b9	drm/amdgpu: drop unnecessary restriction for huge root PDEs The root PD can also contain huge PDEs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	ef48d4b39e	drm/amdgpu: stop using amdgpu_bo_gpu_offset in the VM backend We need to update page tables without any lock held. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	5d3196605d	drm/amdgpu: rework job synchronization v2 For unlocked page table updates we need to be able to sync to fences of a specific VM. v2: use SYNC_ALWAYS in the UVD code Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	114fbc3195	drm/amdgpu: use the VM as job owner For HMM we need to rework how VM synchronization works, so instead of the filp use VM as job owner. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	81417bea87	drm/amdgpu: explicitly sync VM update to PDs/PTs Explicitly sync VM updates to the moving fence in PDs and PTs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Nathan Chancellor	968162204a	drm/amdgpu: Fix implicit enum conversion in gfx_v9_4_ras_error_inject Clang warns: ../drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c:967:35: warning: implicit conversion from enumeration type 'enum amdgpu_ras_block' to different enumeration type 'enum ta_ras_block' [-Wenum-conversion] block_info.block_id = info->head.block; ~ ~~~~~~~~~~~^~~~~ 1 warning generated. Use the function added in commit `828cfa2909` ("drm/amdgpu: Fix amdgpu ras to ta enums conversion") that handles this conversion explicitly. Fixes: `4c461d89db` ("drm/amdgpu: add RAS support for the gfx block of Arcturus") Link: https://github.com/ClangBuiltLinux/linux/issues/849 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:43 -05:00
Stephen Rothwell	7044cb6c20	amdgpu: using vmalloc requires includeing vmalloc.h Fixes: `240c811ccd` ("drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2)") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:42 -05:00
Nirmoy Das	977f7e1068	drm/amdgpu: allocate entities on demand Currently we pre-allocate entities and fences for all the HW IPs on context creation and some of which are might never be used. This patch tries to resolve entity/fences wastage by creating entity only when needed. v2: allocate memory for entity and fences together Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:42 -05:00
Joseph Greathouse	18c6b74e7c	drm/amdgpu: Enable DISABLE_BARRIER_WAITCNT for Arcturus In previous gfx9 parts, S_BARRIER shader instructions are implicitly S_WAITCNT 0 instructions as well. This setting turns off that mechanism in Arcturus and beyond. With this, shaders must follow the ISA guide insofar as putting in explicit S_WAITCNT operations even after an S_BARRIER. v2: Fix patch title to list component Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:27 -05:00
Linus Torvalds	9f68e3655a	drm pull for 5.6-rc1 uapi: - dma-buf heaps added (and fixed) - command line add support for panel oreientation - command line allow overriding penguin count drm: - mipi dsi definition updates - lockdep annotations for dma_resv - remove dma-buf kmap/kunmap support - constify fb_ops in all fbdev drivers - MST fix for daisy chained hotplug- - CTA-861-G modes with VIC >= 193 added - fix drm_panel_of_backlight export - LVDS decoder support - more device based logging support - scanline alighment for dumb buffers - MST DSC helpers scheduler: - documentation fixes - job distribution improvements panel: - Logic PD type 28 panel support - Jimax8729d MIPI-DSI - igenic JZ4770 - generic DSI devicetree bindings - sony acx424AKP panel - Leadtek LTK500HD1829 - xinpeng XPP055C272 - AUO B116XAK01 - GiantPlus GPM940B0 - BOE NV140FHM-N49 - Satoz SAT050AT40H12R2 - Sharp LS020B1DD01D panels. ttm: - use blocking WW lock i915: - hw/uapi state separation - Lock annotation improvements - selftest improvements - ICL/TGL DSI VDSC support - VBT parsing improvments - Display refactoring - DSI updates + fixes - HDCP 2.2 for CFL - CML PCI ID fixes - GLK+ fbc fix - PSR fixes - GEN/GT refactor improvments - DP MST fixes - switch context id alloc to xarray - workaround updates - LMEM debugfs support - tiled monitor fixes - ICL+ clock gating programming removed - DP MST disable sequence fixed - LMEM discontiguous object maps - prefaulting for discontiguous objects - use LMEM for dumb buffers if possible - add LMEM mmap support amdgpu: - enable sync object timelines for vulkan - MST atomic routines - enable MST DSC support - add DMCUB display microengine support - DC OEM i2c support - Renoir DC fixes - Initial HDCP 2.x support - BACO support for Arcturus - Use BACO for runtime PM power save - gfxoff on navi10 - gfx10 golden updates and fixes - DCN support on POWER - GFXOFF for raven1 refresh - MM engine idle handlers cleanup - 10bpc EDP panel fixes - renoir watermark fixes - SR-IOV fixes - Arcturus VCN fixes - GDDR6 training fixes - freesync fixes - Pollock support amdkfd: - unify more codepath with amdgpu - use KIQ to setup HIQ rather than MMIO radeon: - fix vma fault handler race - PPC DMA fix - register check fixes for r100/r200 nouveau: - mmap_sem vs dma_resv fix - rewrite the ACR secure boot code for Turing - TU10x graphics engine support (TU11x pending) - Page kind mapping for turing - 10-bit LUT support - GP10B Tegra fixes - HD audio regression fix hisilicon/hibmc: - use generic fbdev code and helpers rockchip: - dsi/px30 support virtio: - fb damage support - static some functions vc4: - use dma_resv lock wrappers msm: - use dma_resv lock wrappers - sc7180 display + DSI support - a618 support - UBWC support improvements vmwgfx: - updates + new logging uapi exynos: - enable/disable callback cleanups etnaviv: - use dma_resv lock wrappers atmel-hlcdc: - clock fixes mediatek: - cmdq support - non-smooth cursor fixes - ctm property support sun4i: - suspend support - A64 mipi dsi support rcar-du: - Color management module support - LVDS encoder dual-link support - R8A77980 support analogic: - add support for an6345 ast: - atomic modeset support - primary plane garbage fix arcgpu: - fixes for fourcc handling tegra: - minor fixes and improvments mcde: - vblank support meson: - OSD1 plane AFBC commit gma500: - add pageflip support - reomve global drm_dev komeda: - tweak debugfs output - d32 support - runtime PM suppotr udl: - use generic shmem helpers - cleanup and fixes -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJeMm6RAAoJEAx081l5xIa+vN8P/0j4jEOv+KIinAhoH+LG3EpD m2TUuu5OQIoBrcCoWOgFBk3wqYpw6PdMBdkXh+5sE5lfeBynp8oC3Bin+QsHJE05 eGBpZtHe+70MQb0Eha+Aic0hchvBKzRnq6i0MYSIHn6afs76dLmF8knTjycxrvV5 Xu1Z3WDmjzqgWF9ja5JCD6fby11seP5RrwObYKVikO35QQyJJwGSGKgu5rq/pByK /n0PCnCOINuL0Lz6J9qexdh/0/XYFQilRC31GJNlKbDSFuECF0GOEzEE/xUBW/pI dLh2YwIIygm18Gar9PgvMwXJn3BfzQ0qEJsf+HlQeNw9iLgbHpp2AsTxHTE87OGe R/y85taW3jGjPsNOKZOeLpvg/Ro8l8ZipLApvDCG2O22DThg/cd6NDjZxl1FJfRH acDG/JdgPo5MbdRAH/cM1WuFS9gEM+0BeSQ5gCjtPakF+X4Vz+ABFDLMRJoaejkJ q8DG32TQXELQx0RMghsqK7YCWGfl+2alA1u9w6TgJh9Rq4iVckvpDeqAZnK1Adkc 87g957Tl0n6FA4wJj/t5jrceiLRMJAm/rBK+R3GZNfWrgx4bHbCmb4fZDZsrFzph nbAjNJ5kOchrFCaRR47ULby6+Q14MAFbkWq4Crfu4YDdzUkTPpep6pi2GIe8w0rV P0hdYOYJf6LUda0utuQX =oFrI -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Davbe Airlie: "This is the main pull request for graphics for 5.6. Usual selection of changes all over. I've got one outstanding vmwgfx pull that touches mm so kept it separate until after all of this lands. I'll try and get it to you soon after this, but it might be early next week (nothing wrong with code, just my schedule is messy) This also hits a lot of fbdev drivers with some cleanups. Other notables: - vulkan timeline semaphore support added to syncobjs - nouveau turing secureboot/graphics support - Displayport MST display stream compression support Detailed summary: uapi: - dma-buf heaps added (and fixed) - command line add support for panel oreientation - command line allow overriding penguin count drm: - mipi dsi definition updates - lockdep annotations for dma_resv - remove dma-buf kmap/kunmap support - constify fb_ops in all fbdev drivers - MST fix for daisy chained hotplug- - CTA-861-G modes with VIC >= 193 added - fix drm_panel_of_backlight export - LVDS decoder support - more device based logging support - scanline alighment for dumb buffers - MST DSC helpers scheduler: - documentation fixes - job distribution improvements panel: - Logic PD type 28 panel support - Jimax8729d MIPI-DSI - igenic JZ4770 - generic DSI devicetree bindings - sony acx424AKP panel - Leadtek LTK500HD1829 - xinpeng XPP055C272 - AUO B116XAK01 - GiantPlus GPM940B0 - BOE NV140FHM-N49 - Satoz SAT050AT40H12R2 - Sharp LS020B1DD01D panels. ttm: - use blocking WW lock i915: - hw/uapi state separation - Lock annotation improvements - selftest improvements - ICL/TGL DSI VDSC support - VBT parsing improvments - Display refactoring - DSI updates + fixes - HDCP 2.2 for CFL - CML PCI ID fixes - GLK+ fbc fix - PSR fixes - GEN/GT refactor improvments - DP MST fixes - switch context id alloc to xarray - workaround updates - LMEM debugfs support - tiled monitor fixes - ICL+ clock gating programming removed - DP MST disable sequence fixed - LMEM discontiguous object maps - prefaulting for discontiguous objects - use LMEM for dumb buffers if possible - add LMEM mmap support amdgpu: - enable sync object timelines for vulkan - MST atomic routines - enable MST DSC support - add DMCUB display microengine support - DC OEM i2c support - Renoir DC fixes - Initial HDCP 2.x support - BACO support for Arcturus - Use BACO for runtime PM power save - gfxoff on navi10 - gfx10 golden updates and fixes - DCN support on POWER - GFXOFF for raven1 refresh - MM engine idle handlers cleanup - 10bpc EDP panel fixes - renoir watermark fixes - SR-IOV fixes - Arcturus VCN fixes - GDDR6 training fixes - freesync fixes - Pollock support amdkfd: - unify more codepath with amdgpu - use KIQ to setup HIQ rather than MMIO radeon: - fix vma fault handler race - PPC DMA fix - register check fixes for r100/r200 nouveau: - mmap_sem vs dma_resv fix - rewrite the ACR secure boot code for Turing - TU10x graphics engine support (TU11x pending) - Page kind mapping for turing - 10-bit LUT support - GP10B Tegra fixes - HD audio regression fix hisilicon/hibmc: - use generic fbdev code and helpers rockchip: - dsi/px30 support virtio: - fb damage support - static some functions vc4: - use dma_resv lock wrappers msm: - use dma_resv lock wrappers - sc7180 display + DSI support - a618 support - UBWC support improvements vmwgfx: - updates + new logging uapi exynos: - enable/disable callback cleanups etnaviv: - use dma_resv lock wrappers atmel-hlcdc: - clock fixes mediatek: - cmdq support - non-smooth cursor fixes - ctm property support sun4i: - suspend support - A64 mipi dsi support rcar-du: - Color management module support - LVDS encoder dual-link support - R8A77980 support analogic: - add support for an6345 ast: - atomic modeset support - primary plane garbage fix arcgpu: - fixes for fourcc handling tegra: - minor fixes and improvments mcde: - vblank support meson: - OSD1 plane AFBC commit gma500: - add pageflip support - reomve global drm_dev komeda: - tweak debugfs output - d32 support - runtime PM suppotr udl: - use generic shmem helpers - cleanup and fixes" * tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm: (1998 commits) drm/nouveau/fb/gp102-: allow module to load even when scrubber binary is missing drm/nouveau/acr: return error when registering LSF if ACR not supported drm/nouveau/disp/gv100-: not all channel types support reporting error codes drm/nouveau/disp/nv50-: prevent oops when no channel method map provided drm/nouveau: support synchronous pushbuf submission drm/nouveau: signal pending fences when channel has been killed drm/nouveau: reject attempts to submit to dead channels drm/nouveau: zero vma pointer even if we only unreference it rather than free drm/nouveau: Add HD-audio component notifier support drm/nouveau: fix build error without CONFIG_IOMMU_API drm/nouveau/kms/nv04: remove set but not used variable 'width' drm/nouveau/kms/nv50: remove set but not unused variable 'nv_connector' drm/nouveau/mmu: fix comptag memory leak drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc drm/nouveau/pmu/gm20b,gp10b: Fix Falcon bootstrapping drm/exynos: Rename Exynos to lowercase drm/exynos: change callback names drm/mst: Don't do atomic checks over disabled managers drm/amdgpu: add the lost mutex_init back drm/amd/display: skip opp blank or unblank if test pattern enabled ...	2020-01-30 08:04:01 -08:00
Alex Deucher	2cb44fb093	drm/amdgpu: enable GPU reset by default on renoir Everything is in place. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Alex Deucher	658c663947	drm/amdgpu: enable GPU reset by default on Navi Has been working fine for a while. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Christian König	77171eade8	drm/amdgpu: add coreboot workaround for KV/KB Coreboot seems to have a problem correctly setting up access to the stolen VRAM on KV/KB. Use the direct access only when necessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-and-tested-by: Fredrik Bruhn <fredrik.bruhn@unibap.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Alex Deucher	276cc92945	drm/amdgpu: original raven doesn't support full asic reset So don't use it. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Alex Deucher	7af2a5771e	drm/amdgpu: attempt to enable gfxoff on more raven1 boards (v2) Switch to a blacklist so we can disable specific boards that are problematic. v2: make the blacklist non-raven specific. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Colin Ian King	b20dcd72c1	drm/amd/amdgpu: fix spelling mistake "to" -> "too" There is a spelling mistake in a DRM_ERROR message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
xinhui pan	f583cc57ba	drm/amdgpu: initialize bo_va_list when add gws to process bo_va_list is list_head, so initialize it. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	55bbb747ec	drm/amdgpu/vcn: use inst_idx relacing inst Use inst_idx relacing inst in SOC15_DPG_MODE macro to avoid confusion. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	a455573214	drm/amdgpu/vcn: fix typo error Fix typo error, should be inst_idx instead of inst. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	326b523eeb	drm/amdgpu/vcn: fix vcn2.5 instance issue Fix vcn2.5 instance issue, vcn0 and vcn1 have same register offset Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	62884a7bf3	drm/amdgpu/vcn2.5: fix a bug for the 2nd vcn instance (v2) Fix a bug for the 2nd vcn instance at start and stop. v2: squash in unused label removal. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	b650121726	drm/amdgpu/vcn: Share vcn_v2_0_dec_ring_test_ring to vcn2.5 Share vcn_v2_0_dec_ring_test_ring to vcn2.5 to support vcn software ring. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
Felix Kuehling	fa34edbed4	drm/amdgpu: Use the correct flush_type in flush_gpu_tlb_pasid The flush_type was incorrectly hard-coded to 0 when calling falling back to MMIO-based invalidation in flush_gpu_tlb_pasid. Fixes: `ea930000a6` ("drm/amdgpu: export function to flush TLB via pasid") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
Felix Kuehling	37c58ddf57	drm/amdgpu: Fix TLB invalidation request when using semaphore Use a more meaningful variable name for the invalidation request that is distinct from the tmp variable that gets overwritten when acquiring the invalidation semaphore. Fixes: `4ed8a03740` ("drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:36 -05:00
Chris Wilson	7e13ad8964	drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count Since drm_global_mutex is a true global mutex across devices, we don't want to acquire it unless absolutely necessary. For maintaining the device local open_count, we can use atomic operations on the counter itself, except when making the transition to/from 0. Here, we tackle the easy portion of delaying acquiring the drm_global_mutex for the final release by using atomic_dec_and_mutex_lock(), leaving the global serialisation across the device opens. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Thomas Hellström <thellstrom@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200124130107.125404-1-chris@chris-wilson.co.uk	2020-01-24 17:41:34 +00:00
Alex Deucher	23fe1390c7	drm/amdgpu: remove the experimental flag for renoir Should work properly with the latest sbios on 5.5 and newer kernels. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-23 12:14:53 -05:00
Nirmoy Das	63e3ab9a82	drm/amdgpu: individualize fence allocation per entity Allocate fences for each entity and remove ctx->fences reference as fences should be bound to amdgpu_ctx_entity instead amdgpu_ctx. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
Tianci.Yin	7db1d560a4	Revert "drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training enabled(V5)" This reverts commit `9e44147862`. The patch will be replaced with a better solution, revert it. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
Tianci.Yin	240c811ccd	drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2) [why] In GDDR6 BIST training, a certain mount of bottom VRAM will be encroached by UMC, that causes problems(like GTT corrupted and page fault observed). [how] Saving the content of this bottom VRAM to system memory before training, and restoring it after training to avoid VRAM corruption. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
Nirmoy Das	a9d4fe2fd6	drm/amdgpu: remove unnecessary conversion to bool Better clean that up before some automation starts to complain about it Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
Dennis Li	4c461d89db	drm/amdgpu: add RAS support for the gfx block of Arcturus Implement functions to do the RAS error injection and query EDC counter. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:36:30 -05:00
Dennis Li	504c5e72d7	drm/amdgpu: abstract EDC counter clear to a separated function 1. Add IP prefix for the IP related codes. 2. Refactor the code to clear EDC counter. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:36:14 -05:00
Dennis Li	5e66403e4d	drm/amdgpu: refine the security check for RAS functions To avoid calling RAS related functions when RAS feature isn't supported in hardware. Change to check supported features, instead of checking asic type. v2: reuse amdgpu_ras_is_supported function, instead of introducing a new flag for hardware ras feature. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:36:04 -05:00
Dennis Li	39aa0ef163	drm/amdgpu: enable RAS feature for more mmhub sub-blocks of Acrturus Compared with Vg20, the size of mmhub range is changed from 2 to 8. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:35:56 -05:00
chen gong	e3cd03603d	drm/amdgpu: read gfx register using RREG32_KIQ macro Reading CP_MEM_SLP_CNTL register with RREG32_SOC15 macro will lead to hang when GPU is in "gfxoff" state. I do a uniform substitution here. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:29 -05:00
chen gong	c68dbcd8f9	drm/amdgpu: add kiq version interface for RREG32/WREG32 Reading some registers by mmio will result in hang when GPU is in "gfxoff" state.This problem can be solved by GPU in "ring command packages" way. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:22 -05:00
chen gong	d33a99c4b6	drm/amdgpu: provide a generic function interface for reading/writing register by KIQ Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c, and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and flexible. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:14 -05:00
John Clements	a6c44d2538	drm/amdgpu: added support to get mGPU DRAM base resolves issue with RAS error injection in mGPU configuration Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:07 -05:00
Alex Sierra	36a1707afd	drm/amdgpu: modify packet size for pm4 flush tlbs [Why] PM4 packet size for flush message was oversized. [How] Packet size adjusted to allocate flush + fence packets. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:33:52 -05:00
Pan, Xinhui	bd05221123	drm/amdgpu: add the lost mutex_init back Initialize notifier_lock. Bug: https://gitlab.freedesktop.org/drm/amd/issues/1016 Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-17 16:03:09 -05:00
Hawking Zhang	e9d4cf918f	drm/amdgpu: add arcturus to gpu recovery check code path support check if dirver should try gpu recovery for arcturus Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:40:54 -05:00
Hawking Zhang	93af20f74e	drm/amdgpu: check if driver should try recovery in ras recovery path To allow the flexibilty for user to disable gpu recovery in RAS recovery path by module parameter amdgpu_gpu_recovery Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:40:47 -05:00
Evan Quan	2ac0d68697	drm/amd/powerplay: a quick fix for the deadlock issue below NFO: task ocltst:2028 blocked for more than 120 seconds. Tainted: G OE 5.0.0-37-generic #40~18.04.1-Ubuntu echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. cltst D 0 2028 2026 0x00000000 all Trace: __schedule+0x2c0/0x870 schedule+0x2c/0x70 schedule_preempt_disabled+0xe/0x10 __mutex_lock.isra.9+0x26d/0x4e0 __mutex_lock_slowpath+0x13/0x20 ? __mutex_lock_slowpath+0x13/0x20 mutex_lock+0x2f/0x40 amdgpu_dpm_set_powergating_by_smu+0x64/0xe0 [amdgpu] gfx_v8_0_enable_gfx_static_mg_power_gating+0x3c/0x70 [amdgpu] gfx_v8_0_set_powergating_state+0x66/0x260 [amdgpu] amdgpu_device_ip_set_powergating_state+0x62/0xb0 [amdgpu] pp_dpm_force_performance_level+0xe7/0x100 [amdgpu] amdgpu_set_dpm_forced_performance_level+0x129/0x330 [amdgpu] Fixes: `a64c9e15e6` ("drm/amd/powerplay: cleanup the interfaces for powergate setting through SMU") Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-by: Rui Teng <Rui.Teng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:38:23 -05:00
Huang Rui	0e5b7a9528	drm/amdgpu: only set cp active field for kiq queue The mec ucode will set the CP_HQD_ACTIVE bit while the queue is mapped by MAP_QUEUES packet. So we only need set cp active field for kiq queue. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:38:16 -05:00
Alex Deucher	27414cd42a	drm/amdgpu/pm: clean up return types count is size_t so don't use negative values. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:38:02 -05:00
James Zhu	0c0dab86d9	drm/amdgpu/vcn2.5: implement indirect DPG SRAM mode Implement indirect DPG SRAM mode for vcn2.5 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:37:47 -05:00
James Zhu	8484df9601	drm/amdgpu/vcn2.5: add dpg pause mode Add dpg pause mode support for vcn2.5 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:37:41 -05:00
James Zhu	d2a2c64f53	drm/amdgpu/vcn2.5: add DPG mode start and stop Add DPG mode start and stop functions for vcn2.5 v2: Correct firmware ucode index in vcn_v2_5_mc_resume_dpg_mode Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:37:34 -05:00
James Zhu	45cec87cd6	drm/amdgpu/vcn: move macro from vcn2.0 to share amdgpu_vcn (v2) Move macro from vcn2.0 to amdgpu_vcn to share with vcn2.5 v2: squash in macro fix Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:37:34 -05:00
James Zhu	5db86843e8	drm/amdgpu/vcn: support multiple instance direct SRAM read and write (v2) Add multiple instance direct SRAM read and write support for vcn2.5 v2: squash in indexing fix Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:36:47 -05:00
James Zhu	597e6ac3a7	drm/amdgpu/vcn: support multiple-instance dpg pause mode Add multiple-instance dpg pause mode support for VCN2.5 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:51 -05:00
Tianci.Yin	9e44147862	drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training enabled(V5) [why] In dual GPUs scenario, stolen_size is assigned to zero on the secondary GPU, since there is no pre-OS console using that memory. Then the bottom region of VRAM was allocated as GTT, unfortunately a small region of bottom VRAM was encroached by UMC firmware during GDDR6 BIST training, this cause page fault. [how] Forcing stolen_size to 3MB, then the bottom region of VRAM was allocated as stolen memory, GTT corruption avoid. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:37 -05:00
Tianci.Yin	6a1094ab68	drm/amdgpu/gfx10: update gfx golden settings for navi14 remove registers: mmSPI_CONFIG_CNTL add registers: mmSPI_CONFIG_CNTL_1 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:30 -05:00
Tianci.Yin	7b7041f892	drm/amdgpu/gfx10: update gfx golden settings remove registers: mmSPI_CONFIG_CNTL add registers: mmSPI_CONFIG_CNTL_1 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:23 -05:00
shaoyunl	b4df2823ec	drm/amdgpu: check rlc_g firmware pointer is valid before using it In SRIOV, rlc_g firmware is loaded by host, guest driver won't load it which will cause the rlc_fw pointer is null Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:17 -05:00
Christian König	971fe55545	drm/amdgpu: drop amdgpu_job.owner Entirely unused. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:10 -05:00
Nirmoy Das	55414ad5c9	drm/amdgpu: error out on entity with no run queue Disabled HW IP's entity initialized with NULL rq. We should not process any submit request from userspace for a disabled HW IP. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:03 -05:00
Huang Rui	8eee00f615	drm/amdkfd: use map_queues for hiq on gfx v10 as well To align with gfx v9, we use the map_queues packet to load hiq MQD. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:34:57 -05:00
Aaron Liu	35cd89d5a6	drm/amdkfd: use kiq to load the mqd of hiq queue for gfx v9 (v6) There is an issue that CP will check the HIQ queue to be configured and mapped with KIQ ring, otherwise, it will be unable to read back the secure buffer while the gfxoff is enabled even with trusted IP blocks. v1 -> v2: - Fix to remove surplus set_resources packets. - Fill the whole configuration in MQD. - Change the author as Aaron because he addressed the key point of this issue. - Add kiq ring lock. v2 -> v3: - Free the lock while in error return case. - Remove the programming only needed by the queue is unmapped. v3 -> v4: - Remove doorbell programming because it's used for restarting queue. - Remove CP scheduler programming because map_queue packet will handle this. v4 -> v5: - Remove cp_hqd_active because mec ucode will enable it while use map_queues. - Revise goto out_unlock. - Correct the right doorbell offset for HIQ that kfd driver assigned in the packet. v5 -> v6: - Merge Arcturus fix into this patch because it will get oops in Arcturus platform. Reported-by: Lisa Saturday <Lisa.Saturday@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:34:50 -05:00
Alex Sierra	d175e9acf6	drm/amdgpu: flush TLB functions removal from kfd2kgd interface [Why] kfd2kgd interface will be deprecated. This removal only covers TLB invalidation for now. They have been replaced in amdgpu_amdkfd API. [How] TLB invalidate functions removed from the different amdkfd_gfx_v* versions. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:34:42 -05:00
Alex Sierra	ffa022696f	drm/amdgpu: GPU TLB flush API moved to amdgpu_amdkfd [Why] TLB flush method has been deprecated using kfd2kgd interface. This implementation is now on the amdgpu_amdkfd API. [How] TLB flush functions now implemented in amdgpu_amdkfd. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:34:33 -05:00
Alex Sierra	ea930000a6	drm/amdgpu: export function to flush TLB via pasid This can be used directly from amdgpu and amdkfd to invalidate TLB through pasid. It supports gmc v7, v8, v9 and v10. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:34:27 -05:00
Alex Sierra	4f01f1e58e	drm/amdgpu: replace kcq enable/disable functions on gfx_v9 [Why] There are HW-indpendent functions that enables and disables kcq. These functions use the kiq_pm4_funcs implementation. [How] Local kcq enable and disable functions removed and replace it by the generic kcq enable under amdgpu_gfx Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:34:19 -05:00
Alex Sierra	58e508b6be	drm/amdgpu: implement tlbs invalidate on gfx9 gfx10 tlbs invalidate pointer function added to kiq_pm4_funcs struct. This way, tlb flush can be done through kiq member. TLBs invalidatation implemented for gfx9 and gfx10. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:34:11 -05:00
Alex Sierra	f167ea6a14	drm/amdgpu: kiq pm4 function implementation for gfx_v9 Functions implemented from kiq_pm4_funcs struct members for gfx_v9 version. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:34:04 -05:00
Alex Sierra	a269e44989	drm/amdgpu: Avoid reclaim fs while eviction lock [Why] Avoid reclaim filesystem while eviction lock is held called from MMU notifier. [How] Setting PF_MEMALLOC_NOFS flags while eviction mutex is locked. Using memalloc_nofs_save / memalloc_nofs_restore API. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:33:16 -05:00
Christian König	5e791166d3	drm/ttm: nuke invalidate_caches callback Another completely unused feature. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/348265/	2020-01-16 16:35:07 +01:00
Aaron Liu	f2360e333b	drm/amdgpu: update goldensetting for renoir Update mmSDMA0_UTCL1_WATERMK golden setting for renoir. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-15 17:52:21 -05:00
Alex Deucher	a9ffe2a983	drm/amdgpu/debugfs: properly handle runtime pm If driver debugfs files are accessed, power up the GPU when necessary. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:20:34 -05:00
Alex Deucher	b9a9294b91	drm/amdgpu/pm: properly handle runtime pm If power management sysfs or debugfs files are accessed, power up the GPU when necessary. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:20:34 -05:00
Flora Cui	f81110b852	drm/amdgpu: add header file for macro SZ_1M Fixes: `4dee6e4ca5` ("drm/amdgpu: use linux size macro to simplify ONE_Kib & One_Mib") Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:58 -05:00
Alex Deucher	a2e4b418c6	drm/amdgpu/psp: declare navi1x ta firmware So that it gets included in the initrd. At the moment this is optional firmware that contains support for HDCP. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:58 -05:00
Joseph Greathouse	22d39fe729	drm/amdgpu: Match TC hash settings to DF settings (v2) On Arcturus, data fabric hashing is set by the VBIOS, and affects which addresses map to which memory channels. The gfx core's caches also need to know this mapping, but the hash settings for these these caches is set by the driver. This change queries the DF to understand how the VBIOS configured DF, then matches the TC hash configuration bits to do the same thing. v2: squash in warning fix Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:58 -05:00
Joseph Greathouse	bdf84a80e0	drm/amdgpu: Create generic DF struct in adev The only data fabric information the adev struct currently contains is a function pointer table. In the near future, we will be adding some cached DF information into adev. As such, this patch creates a new amdgpu_df struct for adev. Right now, it only containst the old function pointer table, but new stuff will be added soon. Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:41 -05:00
John Clements	eee2eabafe	drm/amdgpu: preserve RSMU UMC index mode state between UMC RAS err register access restore previous RSMU UMC index mode state Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:10 -05:00
John Clements	9c8c81fe7d	drm/amdgpu: disable XGMI TA unload for arcturus in event of GPU reset, XGMI TA unload causes unrecoverable GPU hang Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:10 -05:00
Aaron Liu	d8459d1b7f	drm/amdgpu: update goldensetting for renoir Update mmSDMA0_UTCL1_WATERMK golden setting for renoir. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:10 -05:00
Alex Deucher	1499bcc7a2	drm/amdgpu/gmc10: free stolen memory in late_init We don't need to store the pre-OS console memory after the driver has loaded so free it. Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:10 -05:00
Alex Deucher	bbde7162f7	drm/amdgpu/gmc10: remove dead code Leftover from bring up. We look up the actual pre-OS memory usage value later in the same function. Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:10 -05:00
Alex Deucher	403c1ef0d2	drm/amdgpu: enable S/G display on PCO and RV2 (v2) It should work on all Raven variants, but some users have reported issues with original Raven with IOMMU enabled. So far there have been no issues observed with PCO or RV2. v2: split out the dm init and domain changes into separate patches. Acked-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:09 -05:00
Alex Deucher	d44394a9e1	drm/amdgpu/gfx9: remove unused sdma headers All of the sdma stuff these were used for moves to the sdma code, so remove them. Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:09 -05:00
Hawking Zhang	49da2ccd2d	drm/amdgpu: check sdma ras funcs pointer before accessing sdma ras funcs are not supported by ASIC prior to vega20 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:09 -05:00
Guchun Chen	5d4667ec33	drm/amdgpu: calculate MCUMC_ADDRT0 per asic's UMC offset Hardcoded offset is not friendly. And another benifit of this patch is to keep read and write access to this register be consistent with other similar UMC regsiters in this file. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:09 -05:00
Tiecheng Zhou	df5e984c8b	drm/amdgpu/sriov: workaround on rev_id for Navi12 under sriov guest vm gets 0xffffffff when reading RCC_DEV0_EPF0_STRAP0, as a consequence, the rev_id and external_rev_id are wrong. workaround it by hardcoding the rev_id to 0, which is the default value. v2. add comment in the code Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:09 -05:00
Hawking Zhang	5e62db9df6	drm/amdgpu: read sdma edc counter to clear the counters SDMA edc counter registers were added in gfx edc counters array. When querying gfx error counter in that array, there is no way to differentiate sdma instance number for different asic and then results to NULL pointer access when trying to read sdma register base address for instances greater than 2 on Vega20. In addition, this also results to wrong gfx error counters since it actually added sdma edc counters. Therefore, sdma edc counter registers should be separated from gfx edc counter regsiter array and only get initialized when driver tries to enable sdma ras. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Hawking Zhang	1dd5ead294	drm/amdgpu: add ras_late_init and ras_fini for sdma v4 move ras_late_init and ras_fini to sdma_ras_funcs table Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Hawking Zhang	3e81ee9a78	drm/amdgpu: support error reporting for sdma ip block invoke sdma query_ras_error_count to get sdma single bit error count Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Hawking Zhang	93070deb58	drm/amdgpu: add query_ras_error_count function for sdma v4 query_ras_error_count function will be invoked to query single bit error count detected in sdma ip block Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Leo Liu	e7ddb87848	drm/amdgpu: enable VCN2.5 IP block for Arcturus With default PSP FW loading Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Leo Liu	2d6605911d	drm/amdgpu/vcn2.5: fix PSP FW loading for the second instance ucodes for instances are from different location Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Nirmoy Das	5021e9a831	drm/amdgpu: catch amdgpu_irq_add_id failure Do not ignore amdgpu_irq_add_id return value while registering VMC page fault interrupt. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Evan Quan	9530273ec9	drm/amd/powerplay: cover the powerplay implementation details V3 This can save users much troubles. As they do not actually need to care whether swSMU or traditional powerplay routine should be used. V2: apply the fixes to vi.c and cik.c also V3: squash in oops fix Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Yong Zhao	a434b94c5a	drm/amdkfd: Improve function get_sdma_rlc_reg_offset() (v2) The SOC15_REG_OFFSET() macro needs to dereference adev->reg_offset[IP] pointer, which is sometimes NULL when there are fewer than 8 sdma engines. Avoid that by not initializing the array regardless. v2: squash in warning fixes Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:17:05 -05:00
Alex Deucher	2cacd20e91	Revert "drm/amdgpu: Set no-retry as default." This reverts commit `51bfac71ca`. This causes stability issues on some raven boards. Revert for now until a proper fix is completed. Bug: https://gitlab.freedesktop.org/drm/amd/issues/934 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206017 Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:04:02 -05:00
Alex Deucher	48ccd5ffe5	drm/amdgpu/gfx: simplify old firmware warning Put it on one line to avoid whitespace issues when printing in the log. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:03:54 -05:00
Alex Deucher	5677c52090	drm/amdgpu/gmc10: use common invalidation engine helper Rather than open coding it. This also changes the free masks to better reflect the usage by other components. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:03:47 -05:00
Alex Deucher	bdbe90f04d	drm/amdgpu/gmc: move invaliation bitmap setup to common code So it can be shared with newer GMC versions. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:03:42 -05:00
John Clements	c8aa6ae30c	drm/amdgpu: updated UMC error address record with correct channel index defined macros for repetitive for loops Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:02:08 -05:00
John Clements	0ee51f1d94	drm/amdgpu: resolved bug in UMC RAS CE query switch CE counter register access' to use SMN disable UMC indexing mode Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:02:01 -05:00
Evan Quan	a64c9e15e6	drm/amd/powerplay: cleanup the interfaces for powergate setting through SMU Provided an unified entry point. And fixed the confusing that the API usage is conflict with what the naming implies. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:01:54 -05:00
Zhigang Luo	25344d7e98	drm/amd/amdgpu: L1 Policy(5/5) - removed IH_CHICKEN from VF Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Jane Jian <jane.jian@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:00:46 -05:00
Zhigang Luo	2ee9403e81	drm/amd/amdgpu: L1 Policy(3/5) - removed ECC interrupt from VF Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Jane Jian <jane.jian@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:00:40 -05:00
Zhigang Luo	08546895bc	drm/amd/amdgpu: L1 Policy(2/5) - removed GC GRBM violations from gfxhub Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Jane Jian <jane.jian@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:00:33 -05:00
Zhigang Luo	20bf2f6fef	drm/amd/amdgpu: L1 Policy(1/5) - removed VM settings for mmhub and gfxhub from VF Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Jane Jian <jane.jian@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 12:00:17 -05:00
John Clements	61130c7432	drm/amdgpu: removed GFX RAS support check in UMC ECC callback enable GPU recovery in event of uncorrectable UMC error Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:59:37 -05:00
John Clements	097dc53ee9	drm/amdgpu: added function to wait for PSP BL availability reduced duplicate code increased wait time for PSP BL readiness Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:58:50 -05:00
Kevin Wang	4dee6e4ca5	drm/amdgpu: use linux size macro to simplify ONE_Kib & One_Mib replace internal size macro with linux size macro Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:58:44 -05:00
John Clements	bd68fb94b3	drm/amdgpu: resolve bug in UMC 6 error counter query iterate over all error counter registers in SMN space removed support error counter access via MMIO Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:58:37 -05:00
Jack Zhang	895bd048fb	amd/amdgpu/sriov tdr enablement with pp_onevf_mode Under sriov and pp_onevf mode, 1.take resume instead of hw_init for smc recover to avoid potential memory leak. 2.add return condition inside smc resume function for sriov_pp_onevf_mode and pm_enabled param. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:58:19 -05:00
Jack Zhang	c2a801af31	amd/amdgpu/sriov enable onevf mode for ARCTURUS VF Before, initialization of smu ip block would be skipped for sriov ASICs. But if there's only one VF being used, guest driver should be able to dump some HW info such as clks, temperature,etc. To solve this, now after onevf mode is enabled, host driver will notify guest. If it's onevf mode, guest will do smu hw_init and skip some steps in normal smu hw_init flow because host driver has already done it for smu. With this fix, guest app can talk with smu and dump hw information from smu. v2: refine the logic for pm_enabled.Skip hw_init by not changing pm_enabled. v3: refine is_support_sw_smu and fix some indentation issue. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:58:10 -05:00
John Clements	955c712007	drm/amdgpu: update UMC 6.1 RAS error counter register access path use proper method for SMN register access Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:57:48 -05:00
John Clements	34e48caee4	drm/amdgpu: amalgamated PSP TA invoke functions reduce duplicate code Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:57:15 -05:00
John Clements	1f455f2580	drm/amdgpu: amalgamate PSP TA load/unload functions reduce duplicate code Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:57:09 -05:00
John Clements	e6e193c00d	drm/amdgpu: by default output PSP ret status in event of cmd failure update log level from DRM_DEBUG_DRIVER to DRM_WARN Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:57:01 -05:00
Evan Quan	0753e56e9a	drm/amdgpu: correct RLC firmwares loading sequence Per confirmation with RLC firmware team, the RLC should be unhalted after all RLC related firmwares uploaded. However, in fact the RLC is unhalted immediately after RLCG firmware uploaded. And that may causes unexpected PSP hang on loading the succeeding RLC save restore list related firmwares. So, we correct the firmware loading sequence to load RLC save restore list related firmwares before RLCG ucode. That will help to get around this issue. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:56:52 -05:00
Evan Quan	b8ab58f350	drm/amd/powerplay: add check for baco support on Arcturus This is used to determine whether runtime pm can be supported or not. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:56:43 -05:00
Alex Deucher	c00ca07f64	Revert "drm/amdgpu: simplify ATPX detection" This reverts commit `f5fda6d89a`. You can't use BASE_CLASS in pci_get_class. Bug: https://gitlab.freedesktop.org/drm/amd/issues/995 Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:55:56 -05:00
Guchun Chen	8831fa6e89	drm/amdgpu: simplify function return logic Former return logic is redundant. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:55:29 -05:00
Chunming Zhou	b6025eeaa1	drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu Can expose it now that the khronos has exposed the vlk extension. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Flora Cui <Flora.Cui@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:54:30 -05:00
Chunming Zhou	db4ff423cd	drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu Can expose it now that the khronos has exposed the vlk extension. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Flora Cui <Flora.Cui@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-01-07 11:33:32 -05:00
Alex Deucher	7aec9ec1cf	Revert "drm/amdgpu: Set no-retry as default." This reverts commit `51bfac71ca`. This causes stability issues on some raven boards. Revert for now until a proper fix is completed. Bug: https://gitlab.freedesktop.org/drm/amd/issues/934 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206017 Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-01-07 11:27:03 -05:00
Evan Quan	969e115292	drm/amdgpu: correct RLC firmwares loading sequence Per confirmation with RLC firmware team, the RLC should be unhalted after all RLC related firmwares uploaded. However, in fact the RLC is unhalted immediately after RLCG firmware uploaded. And that may causes unexpected PSP hang on loading the succeeding RLC save restore list related firmwares. So, we correct the firmware loading sequence to load RLC save restore list related firmwares before RLCG ucode. That will help to get around this issue. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-01-01 09:26:09 -05:00
changzhu	e0c6381235	drm/amdgpu: enable gfxoff for raven1 refresh When smu version is larger than 0x41e2b, it will load raven_kicker_rlc.bin.To enable gfxoff for raven_kicker_rlc.bin,it needs to avoid adev->pm.pp_feature &= ~PP_GFXOFF_MASK when it loads raven_kicker_rlc.bin. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-29 02:00:27 -05:00
Alex Deucher	5d30ed3c2c	Revert "drm/amdgpu: simplify ATPX detection" This reverts commit `f5fda6d89a`. You can't use BASE_CLASS in pci_get_class. Bug: https://gitlab.freedesktop.org/drm/amd/issues/995 Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-29 01:53:27 -05:00
zhengbin	e95cd6b2ac	drm/amdgpu: use true, false for bool variable in amdgpu_psp.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:674:2-26: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:794:1-25: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:897:2-36: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1016:1-35: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1087:2-34: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1177:1-33: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 15:00:02 -05:00
zhengbin	c5b2bd5d39	drm/amdgpu: use true, false for bool variable in amdgpu_debugfs.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:132:2-10: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:140:2-10: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:142:13-21: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 15:00:01 -05:00
zhengbin	2a9b90ae47	drm/amdgpu: use true, false for bool variable in amdgpu_device.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3961:1-19: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3981:1-19: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 15:00:01 -05:00
zhengbin	6df3dab619	drm/amdgpu: use true, false for bool variable in mxgpu_nv.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:255:2-20: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:267:2-20: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 15:00:01 -05:00
zhengbin	eb28038cc6	drm/amdgpu: use true, false for bool variable in mxgpu_ai.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:253:2-20: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:265:2-20: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 15:00:00 -05:00
Guchun Chen	46cf2fecf5	drm/amdgpu: add missed return value set for error case Return value should be set when going to error handle tag for error case, this can avoid potential invalid array access by upper caller. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:59:51 -05:00
Frank.Min	55d62fe10f	drm/amdgpu: remove FB location config for sriov FB location is already programmed by HV driver for arcutus so remove this part Signed-off-by: Frank.Min <Frank.Min@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:59:43 -05:00
Frank.Min	fdf57ba690	drm/amdgpu: enable xgmi init for sriov use case 1. enable xgmi ta initialization for sriov 2. enable xgmi initialization for sriov Signed-off-by: Frank.Min <Frank.Min@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:59:37 -05:00
Tianci.Yin	33a9a5ab1e	drm/amdgpu: remove memory training p2c buffer reservation(V2) IP discovery TMR(occupied the top VRAM with size DISCOVERY_TMR_SIZE) has been reserved, and the p2c buffer is in the range of this TMR, so the p2c buffer reservation is unnecessary. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:59:28 -05:00
Tianci.Yin	8d40002fee	drm/amdgpu: update the method to get fb_loc of memory training(V4) The method of getting fb_loc changed from parsing VBIOS to taking certain offset from top of VRAM Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:59:20 -05:00
Ma Feng	7eca40066f	drm/amdgpu: Remove unneeded variable 'ret' in navi10_ih.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/navi10_ih.c:113:5-8: Unneeded variable: "ret". Return "0" on line 182 Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ma Feng <mafeng.ma@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:59:09 -05:00
Ma Feng	e3c00faa7a	drm/amdgpu: Remove unneeded variable 'ret' in amdgpu_device.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1036:5-8: Unneeded variable: "ret". Return "0" on line 1079 Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ma Feng <mafeng.ma@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:58:56 -05:00
James Zhu	57cb635bb4	drm/amdgpu/gfx: Add mmSDMA2-7_EDC_COUNTER to support Arcturus Add mmSDMA2-7_EDC_COUNTER to support Arcturus Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:56:57 -05:00
James Zhu	107ab06136	drm/amdgpu/gfx: Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:56:51 -05:00
James Zhu	d8c61373e0	drm/amdgpu/gfx: Replace ARRAY_SIZE with size variable Replace ARRAY_SIZE with size variables to support different ASICs. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:56:45 -05:00
John Clements	1e2c6d5582	drm/amdgpu: Added ASIC specific check in gmc v9.0 ECC interrupt programming sequence Devices newer then VEGA10/12 shall have these programming sequences performed by PSP BL Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:56:27 -05:00
Frank.Min	56ca8628ac	drm/amdgpu: enlarge agp_start address into 48bit max range of the agp aperture is 48 bits, so enlarge agp_start address into 48bit with all bits set Signed-off-by: Frank.Min <Frank.Min@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:56:21 -05:00
Jane Jian	8adf5d2184	drm/amdgpu: disable VCN2.5 ib test for Arcturus sriov currently using TMR loading VCN fw MMSCH would fail to init after FLR, just disable ib test for temporarily daily testing, continuing debug with mm team. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:56:15 -05:00
Le Ma	0a96afc7c5	drm/amdgpu: fix ctx init failure for asics without gfx ring This workaround does not affect other asics because amdgpu only need expose one gfx sched to user for now. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:56:09 -05:00
Jonathan Kim	a7843c0379	drm/amdgpu: attempt xgmi perfmon re-arm on failed arm The DF routines to arm xGMI performance will attempt to re-arm both on performance monitoring start and read on initial failure to arm. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:56:02 -05:00
Luben Tuikov	ce73516d42	drm/amdgpu: simplify padding calculations (v2) Simplify padding calculations. v2: Comment update and spacing. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:55:44 -05:00
Jane Jian	ab5999dea0	drm/amdgpu: enable VCN0 and VCN1 sriov instances support for Arcturus v1: compared to bare-metal: sriov support psp loading VCN firmware; only one encoding ring would be used in each instance. v2: keep unchange for bare-metal VCN2.5 hw_init, just add a flag with sriov and also remove multiple lines. v3: squash in warning fix Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-19 10:10:05 -05:00
Jane Jian	b40953c2ba	drm/amdgpu: skip VCN2.5 power gating and clock gating for sriov Arcturus v1: skip gating in serveral called functions by power gating and clock gating v2: from suggestion, skip setting gate in both set function, which is where it being called. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:33:26 -05:00
Jane Jian	d83c7a07a7	drm/amdgpu: update VCN1(dual instances) fw types ID and VCN ip block type Previously there is no VCN1 type ID in psp gfx interface. Also add VCN ip block type unless the reinit after FLR for sriov would fail. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:33:26 -05:00
Jane Jian	7daaebfea5	drm/amdgpu: add VCN2.5 sriov start for Arctrus Use MMSCH V1 to finish Memory Controller programming as well as start MMSCH to do VCN2.5 initialization. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:33:26 -05:00
Jane Jian	95f1b55b67	drm/amdgpu: add VCN2.5 MMSCH start for Arcturus Use MMSCH to do the initialization since MMSCH manages VCN2.5 instances and its world switch. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:33:26 -05:00
Guchun Chen	fb71a336cd	drm/amdgpu: move umc offset to one new header file for Arcturus Code refactor and no functional change. Fixes: `4cf781c24c` ("drm/amdgpu: Added RAS UMC error query support for Arcturus") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:33:01 -05:00
Alex Deucher	719423f670	drm/amdgpu: wait for all rings to drain before runtime suspending Add a safety check to runtime suspend to make sure all outstanding fences have signaled before we suspend. Doesn't fix any known issue. We already do this via the fence driver suspend function, but we just force completion rather than bailing. This bails on runtime suspend so we can try again later once the fences are signaled to avoid missing any outstanding work. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:13 -05:00
Andrey Grodzovsky	c96cf2823d	drm/amdgpu: Switch from system_highpri_wq to system_unbound_wq This is to avoid queueing jobs to same CPU during XGMI hive reset because there is a strict timeline for when the reset commands must reach all the GPUs in the hive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:13 -05:00
Andrey Grodzovsky	c6a6e2db99	drm/amdgpu: Redo XGMI reset synchronization. Use task barrier in XGMI hive to synchronize ASIC resets across devices in XGMI hive. v2: Return right away with a warning if no xgmi hive, update doc. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:13 -05:00
Andrey Grodzovsky	f33a8770cd	drm/amdgpu: Add task barrier to XGMI hive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:13 -05:00
Andrey Grodzovsky	041a62bc06	drm/amdgpu: reverts commit `ce316fa55e`. In preparation for doing XGMI reset synchronization using task barrier. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Leo Liu	f06a58db92	drm/amdgpu/vcn: remove unnecessary included headers Esp. VCN1.0 headers should not be here v2: add back the <linux/module.h> to keep consistent. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Monk Liu	5a7489a7e1	drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV issues: MEC is ruined by the amdkfd_pre_reset after VF FLR done fix: amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR, the correct sequence is do amdkfd_pre_reset before VF FLR but there is a limitation to block this sequence: if we do pre_reset() before VF FLR, it would go KIQ way to do register access and stuck there, because KIQ probably won't work by that time (e.g. you already made GFX hang) so the best way right now is to simply remove it. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Monk Liu	1512d064f5	drm/amdgpu: fix double gpu_recovery for NV of SRIOV issues: gpu_recover() is re-entered by the mailbox interrupt handler mxgpu_nv.c fix: we need to bypass the gpu_recover() invoke in mailbox interrupt as long as the timeout is not infinite (thus the TDR will be triggered automatically after time out, no need to invoke gpu_recover() through mailbox interrupt. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Nirmoy Das	f880799d7f	amd/amdgpu: add sched array to IPs with multiple run-queues This sched array can be passed on to entity creation routine instead of manually creating such sched array on every context creation. v2: squash in missing break fix Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Nirmoy Das	0c88b43032	drm/amdgpu: replace vm_pte's run-queue list with drm gpu scheds list drm_sched_entity_init() takes drm gpu scheduler list instead of drm_sched_rq list. This makes conversion of drm_sched_rq list to drm gpu scheduler list unnecessary Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Nirmoy Das	b3ac17667f	drm/scheduler: rework entity creation Entity currently keeps a copy of run_queue list and modify it in drm_sched_entity_set_priority(). Entities shouldn't modify run_queue list. Use drm_gpu_scheduler list instead of drm_sched_rq list in drm_sched_entity struct. In this way we can select a runqueue based on entity/ctx's priority for a drm scheduler. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Alex Deucher	45a80abebc	drm/amdgpu/pm_runtime: update usage count in fence handling Increment the usage count in emit fence, and decrement in process fence to make sure the GPU is always considered in use while there are fences outstanding. We always wait for the engines to drain in runtime suspend, but in practice that only covers short lived jobs for gfx. This should cover us for longer lived fences. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:11 -05:00
zhengbin	374bf7bd6a	drm/amdgpu: Remove unneeded semicolon in amdgpu_ras.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:318:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:11 -05:00
zhengbin	640f079325	drm/amdgpu: Remove unneeded semicolon in gfx_v10_0.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:1967:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:11 -05:00
zhengbin	2111a5f715	drm/amdgpu: Remove unneeded semicolon in amdgpu_pmu.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:110:3-4: Unneeded semicolon drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:133:2-3: Unneeded semicolon drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:163:2-3: Unneeded semicolon drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:191:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:11 -05:00
Alex Deucher	42a9938e1e	drm/amdgpu/sdma5: make ring tests less chatty We already did this for older generations. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:11 -05:00
Alex Deucher	e47c9bce46	drm/amdgpu/gfx10: make ring tests less chatty We already did this for older generations. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:11 -05:00
Leo Liu	5e1e89eead	drm/amdgpu/vcn: remove JPEG related code from idle handler and begin use For VCN2.0 and above, VCN has been separated from JPEG Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:07 -05:00
Leo Liu	d58ed70778	drm/amdgpu/vcn1.0: use its own idle handler and begin use funcs Because VCN1.0 power management and DPG mode are managed together with JPEG1.0 under both HW and FW, so separated them from general VCN code. Also the multiple instances case got removed, since VCN1.0 HW just have a single instance. v2: override work func with vcn1.0's own Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:07 -05:00
changzhu	aaff8b448d	drm/amdgpu: enable gfxoff for raven1 refresh When smu version is larger than 0x41e2b, it will load raven_kicker_rlc.bin.To enable gfxoff for raven_kicker_rlc.bin,it needs to avoid adev->pm.pp_feature &= ~PP_GFXOFF_MASK when it loads raven_kicker_rlc.bin. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:07 -05:00
Emily Deng	8973d9ec8f	drm/amdgpu/sriov: Tonga sriov also need load firmware with smu Fix Tonga sriov load driver fail issue. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewd-by Yintian Tao <Yintian.tao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:06 -05:00
Guchun Chen	6193462409	drm/amdgpu: drop useless BACO arg in amdgpu_ras_reset_gpu BACO reset mode strategy is determined by latter func when calling amdgpu_ras_reset_gpu. So not to confuse audience, drop it. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:06 -05:00
Yong Zhao	d7f72fe482	drm/amdgpu: Add CU info print log The log will be useful for easily getting the CU info on various emulation models or ASICs. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:06 -05:00
Yong Zhao	ad5901df88	drm/amdkfd: Use Arcturus specific set_vm_context_page_table_base() Since Arcturus has it own function pointer, we can move Arcturus specific logic to there rather than leaving it entangled with other GFX9 chips. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:06 -05:00
Daniel Vetter	be452c4e8d	Merge tag 'drm-next-5.6-2019-12-11' of git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.6-2019-12-11: amdgpu: - Add MST atomic routines - Add support for DMCUB (new helper microengine for displays) - Add OEM i2c support in DC - Use vstartup for vblank events on DCN - Simplify Kconfig for DC - Renoir fixes for DC - Clean up function pointers in DC - Initial support for HDCP 2.x - Misc code cleanups - GFX10 fixes - Rework JPEG engine handling for VCN - Add clock and power gating support for JPEG - BACO support for Arcturus - Cleanup PSP ring handling - Add framework for using BACO with runtime pm to save power - Move core pci state handling out of the driver for pm ops - Allow guest power control in 1 VF case with SR-IOV - SR-IOV fixes - RAS fixes - Support for power metrics on renoir - Golden settings updates for gfx10 - Enable gfxoff on supported navi10 skus - Update MAINTAINERS amdkfd: - Clean up generational gfx code - Fixes for gfx10 - DIQ fixes - Share more code with amdgpu radeon: - PPC DMA fix - Register checker fixes for r1xx/r2xx - Misc cleanups From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211223020.7510-1-alexander.deucher@amd.com	2019-12-17 18:47:46 +01:00
Daniel Vetter	6c56e8adc0	drm-misc-next for v5.6: UAPI Changes: - Add support for DMA-BUF HEAPS. Cross-subsystem Changes: - mipi dsi definition updates, pulled into drm-intel as well. - Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim. - Remove support for dma-buf kmap/kunmap. - Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well. Core Changes: - Small cleanups to ttm. - Fix SCDC definition. - Assorted cleanups to core. - Add todo to remove load/unload hooks, and use generic fbdev emulation. - Assorted documentation updates. - Use blocking ww lock in ttm fault handler. - Remove drm_fb_helper_fbdev_setup/teardown. - Warning fixes with W=1 for atomic. - Use drm_debug_enabled() instead of drm_debug flag testing in various drivers. - Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted) - Various kconfig indentation fixes in core and drivers. - Fix freeing transactions in dp-mst correctly. - Sean Paul is steping down as core maintainer. :-( - Add lockdep annotations for atomic locks vs dma-resv. - Prevent use-after-free for a bad job in drm_scheduler. - Fill out all block sizes in the P01x and P210 definitions. - Avoid division by zero in drm/rect, and fix bounds. - Add drm/rect selftests. - Add aspect ratio and alternate clocks for HDMI 4k modes. - Add todo for drm_framebuffer_funcs and fb_create cleanup. - Drop DRM_AUTH for prime import/export ioctls. - Clear DP-MST payload id tables downstream when initializating. - Fix for DSC throughput definition. - Add extra FEC definitions. - Fix fake offset in drm_gem_object_funs.mmap. - Stop using encoder->bridge in core directly - Handle bridge chaining slightly better. - Add backlight support to drm/panel, and use it in many panel drivers. - Increase max number of y420 modes from 128 to 256, as preparation to add the new modes. Driver Changes: - Small fixes all over. - Fix documentation in vkms. - Fix mmap_sem vs dma_resv in nouveau. - Small cleanup in komeda. - Add page flip support in gma500 for psb/cdv. - Add ddc symlink in the connector sysfs directory for many drivers. - Add support for analogic an6345, and fix small bugs in it. - Add atomic modesetting support to ast. - Fix radeon fault handler VMA race. - Switch udl to use generic shmem helpers. - Unconditional vblank handling for mcde. - Miscellaneous fixes to mcde. - Tweak debug output from komeda using debugfs. - Add gamma and color transform support to komeda for DOU-IPS. - Add support for sony acx424AKP panel. - Various small cleanups to gma500. - Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation. - Add support for Logic PD Type 28 panel. - Use drm_panel_* wrapper functions in exynos/tegra/msm. - Add devicetree bindings for generic DSI panels. - Don't include drm_pci.h directly in many drivers. - Add support for begin/end_cpu_access in udmabuf. - Stop using drm_get_pci_dev in gma500 and mga200. - Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access. - Add devfreq thermal support to panfrost. - Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager. - meson: Add support for OSD1 plane AFBC commit. - Stop displaying garbage when toggling ast primary plane on/off. - More cleanups and fixes to UDL. - Add D32 suport to komeda. - Remove globle copy of drm_dev in gma500. - Add support for Boe Himax8279d MIPI-DSI LCD panel. - Add support for ingenic JZ4770 panel. - Small null pointer deference fix in ingenic. - Remove support for the special tfp420 driver, as there is a generic way to do it. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl34lkkACgkQ/lWMcqZw E8M76g//WRYl9fWnV063s44FBVJYjGxaus0vQJSGidaPCIE6Ep6TNjXp8DVzV82M HR79P9glL02DC9B8pflioNNXdIRGSVk/FJcKVB2seFAqEFCAknvWDM/X/y+mOUpp fUeFl+Znlwx3YlM8f4Qujdbm+CbTewfbya4VAWeWd8XG2V8jfq5cmODPPlUMNenZ J6Ja+W3ph741uSIfAKaP69LVJgOcuUjXINE4SWhRk/i5QF3GIRej/A7ZjWGLQ/t2 2zUUF7EiCzhPomM40H3ddKtXb4ZjNJuc5pOD4GpxR8ciNbe2gUOHEZ5aenwYBdsU 5MwbxNKyMbKXATtn3yv3fSc4jH3DtmEKpmovONeO8ZDBrQBnxeYa3tQvfkNghA2f acoZMzYUImV+ft6DMIgpXppASvo7mQYDAbLPOGEJ9E44AL4UP00jesEjnK5FOHSR 3BEzGUnK/6QL5zFNPni8YZQ8dan4jDIno1mqIV+cQ4WCGlaKckzIWO6243Bf13b/ kROSJpgWkiK6Ngq0ofhD0MHyT/m1QnqUzWRKTJhRtPflSWRBsDZqWCQ5Vx1QlNIE /HfTNbTpXWwa+5wXbbB8TkDw5t9cQGnR+QcrEd9HgoIec7B5Re8rx9i0TJAT4N05 03RCQCecSfD8gwKd2wgaFIpFGRl9lTdLYSpffSmyL2X5a20lZhM= =b15X -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2019-12-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.6: UAPI Changes: - Add support for DMA-BUF HEAPS. Cross-subsystem Changes: - mipi dsi definition updates, pulled into drm-intel as well. - Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim. - Remove support for dma-buf kmap/kunmap. - Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well. Core Changes: - Small cleanups to ttm. - Fix SCDC definition. - Assorted cleanups to core. - Add todo to remove load/unload hooks, and use generic fbdev emulation. - Assorted documentation updates. - Use blocking ww lock in ttm fault handler. - Remove drm_fb_helper_fbdev_setup/teardown. - Warning fixes with W=1 for atomic. - Use drm_debug_enabled() instead of drm_debug flag testing in various drivers. - Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted) - Various kconfig indentation fixes in core and drivers. - Fix freeing transactions in dp-mst correctly. - Sean Paul is steping down as core maintainer. :-( - Add lockdep annotations for atomic locks vs dma-resv. - Prevent use-after-free for a bad job in drm_scheduler. - Fill out all block sizes in the P01x and P210 definitions. - Avoid division by zero in drm/rect, and fix bounds. - Add drm/rect selftests. - Add aspect ratio and alternate clocks for HDMI 4k modes. - Add todo for drm_framebuffer_funcs and fb_create cleanup. - Drop DRM_AUTH for prime import/export ioctls. - Clear DP-MST payload id tables downstream when initializating. - Fix for DSC throughput definition. - Add extra FEC definitions. - Fix fake offset in drm_gem_object_funs.mmap. - Stop using encoder->bridge in core directly - Handle bridge chaining slightly better. - Add backlight support to drm/panel, and use it in many panel drivers. - Increase max number of y420 modes from 128 to 256, as preparation to add the new modes. Driver Changes: - Small fixes all over. - Fix documentation in vkms. - Fix mmap_sem vs dma_resv in nouveau. - Small cleanup in komeda. - Add page flip support in gma500 for psb/cdv. - Add ddc symlink in the connector sysfs directory for many drivers. - Add support for analogic an6345, and fix small bugs in it. - Add atomic modesetting support to ast. - Fix radeon fault handler VMA race. - Switch udl to use generic shmem helpers. - Unconditional vblank handling for mcde. - Miscellaneous fixes to mcde. - Tweak debug output from komeda using debugfs. - Add gamma and color transform support to komeda for DOU-IPS. - Add support for sony acx424AKP panel. - Various small cleanups to gma500. - Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation. - Add support for Logic PD Type 28 panel. - Use drm_panel_* wrapper functions in exynos/tegra/msm. - Add devicetree bindings for generic DSI panels. - Don't include drm_pci.h directly in many drivers. - Add support for begin/end_cpu_access in udmabuf. - Stop using drm_get_pci_dev in gma500 and mga200. - Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access. - Add devfreq thermal support to panfrost. - Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager. - meson: Add support for OSD1 plane AFBC commit. - Stop displaying garbage when toggling ast primary plane on/off. - More cleanups and fixes to UDL. - Add D32 suport to komeda. - Remove globle copy of drm_dev in gma500. - Add support for Boe Himax8279d MIPI-DSI LCD panel. - Add support for ingenic JZ4770 panel. - Small null pointer deference fix in ingenic. - Remove support for the special tfp420 driver, as there is a generic way to do it. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com	2019-12-17 13:57:54 +01:00
Dave Airlie	d16f0f6140	Merge tag 'drm-fixes-5.5-2019-12-12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes drm-fixes-5.5-2019-12-12: amdgpu: - DC fixes for renoir - Gfx8 fence flush align with mesa - Power profile fix for arcturus - Freesync fix - DC I2c over aux fix - DC aux defer fix - GPU reset fix - GPUVM invalidation semaphore fixes for PCO and SR-IOV - Golden settings updates for gfx10 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191212223211.8034-1-alexander.deucher@amd.com	2019-12-13 14:50:01 +10:00
changzhu	f271fe1856	drm/amdgpu: add invalidate semaphore limit for SRIOV in gmc10 It may fail to load guest driver in round 2 when using invalidate semaphore for SRIOV. So it needs to avoid using invalidate semaphore for SRIOV. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-12 16:13:48 -05:00
changzhu	90f6452ca5	drm/amdgpu: add invalidate semaphore limit for SRIOV and picasso in gmc9 It may fail to load guest driver in round 2 or cause Xstart problem when using invalidate semaphore for SRIOV or picasso. So it needs avoid using invalidate semaphore for SRIOV and picasso. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-12 16:13:24 -05:00
changzhu	413fc385a5	drm/amdgpu: avoid using invalidate semaphore for picasso It may cause timeout waiting for sem acquire in VM flush when using invalidate semaphore for picasso. So it needs to avoid using invalidate semaphore for piasso. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-12 09:39:39 -05:00
Alex Deucher	a680aea00d	Revert "drm/amdgpu: dont schedule jobs while in reset" This reverts commit `f2efc6e600`. This was fixed properly for 5.5, but came back via 5.4 merge into drm-next, so revert it again. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-12 09:39:17 -05:00
Alex Deucher	ad808910be	drm/amdgpu: fix license on Kconfig and Makefiles amdgpu is MIT licensed. Fixes: `ec8f24b7fa` ("treewide: Add SPDX license identifier - Makefile/Kconfig") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:08 -05:00
Simon Ser	93b09a9a89	drm/amdgpu: log when amdgpu.dc=1 but ASIC is unsupported This makes it easier to figure out whether the kernel parameter has been taken into account. Signed-off-by: Simon Ser <contact@emersion.fr> Cc: Harry Wentland <hwentlan@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:08 -05:00
Leo Liu	3504bd45a9	drm/amdgpu: fix JPEG instance checking when ctx init Use proper structure. Fixes: `0388aee766` ("drm/amdgpu: use the JPEG structure for general driver support") Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:08 -05:00
Leo Liu	21a174f5ad	drm/amdgpu: fix VCN2.x number of irq types The JPEG irq type has been moved to its own structure Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:08 -05:00
Tianci.Yin	89ed5a5211	drm/amdgpu/gfx10: update gfx golden settings for navi14 add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2 Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Tianci.Yin	eaec03f206	drm/amdgpu/gfx10: update gfx golden settings add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2 Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Kevin Wang	d549991ce5	drm/amdgpu: enable gfxoff feature for navi10 asic enable gfxoff feature for some navi10 asics Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Tianci.Yin	5f5202bf69	drm/amdgpu/gfx10: update gfx golden settings for navi14 add registers: mmSPI_CONFIG_CNTL Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Tianci.Yin	d4117354c8	drm/amdgpu/gfx10: update gfx golden settings add registers: mmSPI_CONFIG_CNTL Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Yintian Tao	c9ffa427db	drm/amd/powerplay: enable pp one vf mode for vega10 Originally, due to the restriction from PSP and SMU, VF has to send message to hypervisor driver to handle powerplay change which is complicated and redundant. Currently, SMU and PSP can support VF to directly handle powerplay change by itself. Therefore, the old code about the handshake between VF and PF to handle powerplay will be removed and VF will use new the registers below to handshake with SMU. mmMP1_SMN_C2PMSG_101: register to handle SMU message mmMP1_SMN_C2PMSG_102: register to handle SMU parameter mmMP1_SMN_C2PMSG_103: register to handle SMU response v2: remove module parameter pp_one_vf v3: fix the parens v4: forbid vf to change smu feature v5: use hwmon_attributes_visible to skip sepicified hwmon atrribute v6: change skip condition at vega10_copy_table_to_smc Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
John Clements	4cf781c24c	drm/amdgpu: Added RAS UMC error query support for Arcturus Updated UMC 6.1 function set to support UMC 6.1.1 and 6.1.2 devices Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
changzhu	418899d615	drm/amdgpu: avoid using invalidate semaphore for picasso It may cause timeout waiting for sem acquire in VM flush when using invalidate semaphore for picasso. So it needs to avoid using invalidate semaphore for piasso. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Le Ma	feffbaac36	drm/amdgpu: add condition to enable baco for ras recovery Switch to baco reset method for ras recovery if the PMFW supported. If not, keep the original reset method. v2: revise the condition Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Alex Deucher	bd95c14452	drm/amdgpu: fix license on Kconfig and Makefiles amdgpu is MIT licensed. Fixes: `ec8f24b7fa` ("treewide: Add SPDX license identifier - Makefile/Kconfig") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 14:29:38 -05:00
Tianci.Yin	69897d3425	drm/amdgpu/gfx10: update gfx golden settings for navi14 add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2 Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 14:29:38 -05:00
Tianci.Yin	847b0d8795	drm/amdgpu/gfx10: update gfx golden settings add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2 Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 14:29:38 -05:00
Tianci.Yin	5714a2026f	drm/amdgpu/gfx10: update gfx golden settings for navi14 add registers: mmSPI_CONFIG_CNTL Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 14:29:38 -05:00
Tianci.Yin	02cca5769f	drm/amdgpu/gfx10: update gfx golden settings add registers: mmSPI_CONFIG_CNTL Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 14:29:38 -05:00
Hawking Zhang	0d6f39bb77	drm/amdgpu: fix resume failures due to psp fw loading sequence change (v3) this fix the regression caused by asd/ta loading sequence adjustment recently. asd/ta loading was move out from hw_start and should also be applied to psp_resume. otherwise those fw loading will be ignored in resume phase. v2: add the mutex unlock for asd loading failure case v3: merge the error handling to failed tag Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-09 17:03:38 -05:00
Thong Thai	d515959125	Revert "drm/amdgpu: enable VCN DPG on Raven and Raven2" This reverts commit `a4840d91c9`. Reverting due to power efficiency issues seen on Raven 1 and 2 when DPG mode is enabled. Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-09 17:02:33 -05:00
Christian König	b4ff0f8a85	drm/amdgpu: add VM eviction lock v3 This allows to invalidate VM entries without taking the reservation lock. v3: use -EBUSY Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-09 17:02:26 -05:00
Christian König	90b69cdc5f	drm/amdgpu: stop adding VM updates fences to the resv obj Don't add the VM update fences to the resv object and remove the handling to stop implicitely syncing to them. Ongoing updates prevent page tables from being evicted and we manually block for all updates to complete before releasing PDs and PTS. This way we can do updates even without the resv obj locked. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-09 17:02:15 -05:00
Christian König	e095fc17bb	drm/amdgpu: explicitely sync to VM updates v2 Allows us to reduce the overhead while syncing to fences a bit. v2: also drop adev parameter from the functions Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-09 17:02:07 -05:00
Christian König	6ceeb144b1	drm/amdgpu: move VM eviction decision into amdgpu_vm.c When a page tables needs to be evicted the VM code should decide if that is possible or not. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-09 17:01:51 -05:00
Linus Torvalds	7ada90eb9c	drm msm + fixes for 5.5-rc1 msm-next: - OCMEM support for a3xx and a4xx GPUs. - a510 support + display support core: - mst payload deletion fix i915: - uapi alignment fix - fix for power usage regression due to security fixes - change default preemption timeout to 640ms from 100ms - EHL voltage level display fixes - TGL DGL PHY fix - gvt - MI_ATOMIC cmd parser fix, CFL non-priv warning - CI spotted deadlock fix - EHL port D programming fix amdgpu: - VRAM lost fixes on BACO for CI/VI - navi14 DC fixes - misc SR-IOV, gfx10 fixes - XGMI fixes for arcturus - SRIOV fixes amdkfd: - KFD on ppc64le enabled - page table optimisations radeon: - fix for r1xx/2xx register checker. tegra: - displayport regression fixes - DMA API regression fixes mgag200: - fix devices that can't scanout except at 0 addr omap: - fix dma_addr refcounting -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJd6cqnAAoJEAx081l5xIa+YR0P/A0LkilEbSnF/k7zKDjm0HN8 JGsf9ZfQRGA2y8URoLRtNdFjZfyuTSpiDSxsbDI0ShBhRimGHyCSxAJXO42vp8q3 jE57jBoaTSiGtagSO3nxrc1vQP7CfUpaggC2ilKSmcVvTrlqip6iPx7s2PoNyQYc GRVUhkcylnZK5UrMiE8Yz/iNcy3Mh0X8bJQKXMEYxpW2KA3SL4qxuRlYIxXEoMyB 4MlWEV09wHTduf1uYuKdusHjILgR5EiVOdmbvpM92obqZOTokt5/S20TEdhFqiy0 0IHxuEkgVx+trXzGFbmqgh2I7BZvZIbKVCSnBT4AXAvUEJ99kGTdEP0I6uOp2lsC 1DCm+7/hcI8BlwmwC9N6ogUwoAzKn7DNc1urcet/0QVbnZLZlueUK/6fSgUNnUYe miOeMNBmfHr83b75MpnNxYVoyz5S+/DFbtUplYKqxgjDYfiWWceSSE47NB+IHAiI RVpz3AxGpKaw4/w5l2q8VuToWZxdO85TNjgVCTmKfwlYjIbEuveWpZNFqO/GHMm9 x50f4ZYVOjU2TEPnLQNTIJOgv71JrTpoAdFzPVwCeWUf4h4Y4lVLgTLvdG1JLcw+ k9BrA5z2R0kjzPtabRhS6WfSjpgSbY3DgY9hfi+HIUmKvZq4fdtAbBlp1oGSXJ9N zkVrs9eE6Ahkcndi6ZV9 =3cs2 -----END PGP SIGNATURE----- Merge tag 'drm-next-2019-12-06' of git://anongit.freedesktop.org/drm/drm Pull more drm updates from Dave Airlie: "Rob pointed out I missed his pull request for msm-next, it's been in next for a while outside of my tree so shouldn't cause any unexpected issues, it has some OCMEM support in drivers/soc that is acked by other maintainers as it's outside my tree. Otherwise it's a usual fixes pull, i915, amdgpu, the main ones, with some tegra, omap, mgag200 and one core fix. Summary: msm-next: - OCMEM support for a3xx and a4xx GPUs. - a510 support + display support core: - mst payload deletion fix i915: - uapi alignment fix - fix for power usage regression due to security fixes - change default preemption timeout to 640ms from 100ms - EHL voltage level display fixes - TGL DGL PHY fix - gvt - MI_ATOMIC cmd parser fix, CFL non-priv warning - CI spotted deadlock fix - EHL port D programming fix amdgpu: - VRAM lost fixes on BACO for CI/VI - navi14 DC fixes - misc SR-IOV, gfx10 fixes - XGMI fixes for arcturus - SRIOV fixes amdkfd: - KFD on ppc64le enabled - page table optimisations radeon: - fix for r1xx/2xx register checker. tegra: - displayport regression fixes - DMA API regression fixes mgag200: - fix devices that can't scanout except at 0 addr omap: - fix dma_addr refcounting" * tag 'drm-next-2019-12-06' of git://anongit.freedesktop.org/drm/drm: (100 commits) drm/dp_mst: Correct the bug in drm_dp_update_payload_part1() drm/omap: fix dma_addr refcounting drm/tegra: Run hub cleanup on ->remove() drm/tegra: sor: Make the +5V HDMI supply optional drm/tegra: Silence expected errors on IOMMU attach drm/tegra: vic: Export module device table drm/tegra: sor: Implement system suspend/resume drm/tegra: Use proper IOVA address for cursor image drm/tegra: gem: Remove premature import restrictions drm/tegra: gem: Properly pin imported buffers drm/tegra: hub: Remove bogus connection mutex check ia64: agp: Replace empty define with do while agp: Add bridge parameter documentation agp: remove unused variable num_segments agp: move AGPGART_MINOR to include/linux/miscdevice.h agp: remove unused variable size in agp_generic_create_gatt_table drm/dp_mst: Fix build on systems with STACKTRACE_SUPPORT=n drm/radeon: fix r1xx/r2xx register checker for POT textures drm/amdgpu: fix GFX10 missing CSIB set(v3) drm/amdgpu: should stop GFX ring in hw_fini ...	2019-12-06 10:28:09 -08:00
Gerd Hoffmann	b3fac52c51	drm: share address space for dma bufs Use the shared address space of the drm device (see drm_open() in drm_file.c) for dma-bufs too. That removes a difference betweem drm device mmap vmas and dma-buf mmap vmas and fixes corner cases like dropping ptes (using madvise(DONTNEED) for example) not working properly. Also remove amdgpu driver's private dmabuf update. It is not needed any more now that we are doing this for everybody. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: http://patchwork.freedesktop.org/patch/msgid/20191127092523.5620-3-kraxel@redhat.com	2019-12-06 11:18:11 +01:00
Pierre-Eric Pelloux-Prayer	bf26da927a	drm/amdgpu: add cache flush workaround to gfx8 emit_fence The same workaround is used for gfx7. Both PAL and Mesa use it for gfx8 too, so port this commit to gfx_v8_0_ring_emit_fence_gfx. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 17:54:55 -05:00
Guchun Chen	6e807535da	drm/amdgpu: add check before enabling/disabling broadcast mode When security violation from new vbios happens, data fabric is risky to stop working. So prevent the direct access to DF mmFabricConfigAccessControl from the new vbios and onwards. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 17:54:14 -05:00
Le Ma	76434f75d4	drm/amdgpu: reduce redundant uvd context lost warning message Move the print out of uvd instance loop in amdgpu_uvd_suspend v2: drop unnecessary brackets v3: grab ras_intr state once for multiple times use Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:26:20 -05:00
Le Ma	00eaa57172	drm/amdgpu: clear err_event_athub flag after reset exit Otherwise next err_event_athub error cannot call gpu reset. And following resume sequence will not be affected by this flag. v2: create function to clear amdgpu_ras_in_intr for modularity of ras driver Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:26:11 -05:00
Le Ma	b823821f22	drm/amdgpu: support full gpu reset workflow when ras err_event_athub occurs This athub fatal error can be recovered by baco without system-level reboot, so add a mode to use baco for the recovery. Not affect the default psp reset situations for now. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:26:03 -05:00
Le Ma	ce316fa55e	drm/amdgpu: add concurrent baco reset support for XGMI Currently each XGMI node reset wq does not run in parrallel if bound to same cpu. Make change to bound the xgmi_reset_work item to different cpus. XGMI requires all nodes enter into baco within very close proximity before any node exit baco. So schedule the xgmi_reset_work wq twice for enter/exit baco respectively. To use baco for XGMI, PMFW supported for baco on XGMI needs to be involved. The case that PSP reset and baco reset coexist within an XGMI hive never exist and is not in the consideration. v2: define use_baco flag to simplify the code for xgmi baco sequence Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:25:53 -05:00
Le Ma	7a22677b95	drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper This operation is needed when baco entry/exit for ras recovery Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:25:45 -05:00
Le Ma	5c39d600e3	drm/amdgpu: clear uncorrectable parity error status bit This should be cleared during every nbif uncorrectable error cleanup work. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:25:35 -05:00
Le Ma	28f87950d9	drm/amdgpu: clear ras controller status registers when interrupt occurs To fix issue that ras controller interrupt cannot be triggered anymore after one time nbif uncorrectable error. And error count is stored in nbif ras object for query. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:25:27 -05:00
Le Ma	f2a79be1c0	drm/amdgpu: export amdgpu_ras_find_obj to use externally Change it to external interface. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:25:20 -05:00
Le Ma	4a2d93565a	drm/amdgpu: remove ras global recovery handling from ras_controller_int handler v2: add notification when ras controller interrupt generates Signed-off-by: Le Ma <Le.Ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:25:11 -05:00
Pierre-Eric Pelloux-Prayer	b456c93253	drm/amdgpu: add cache flush workaround to gfx8 emit_fence The same workaround is used for gfx7. Both PAL and Mesa use it for gfx8 too, so port this commit to gfx_v8_0_ring_emit_fence_gfx. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:24:58 -05:00
James Zhu	f83f5a1e11	drm/amdgpu/gfx: Improvement on EDC GPR workarounds SPI limits total CS waves in flight per SE to no more than 32 * num_cu and we need to stuff 40 waves on a CU to completely clean the SGPR. This is accomplished in the WR by cleaning the SE in two steps, half of the CU per step. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:24:28 -05:00
Guchun Chen	79c4c8ea91	drm/amdgpu: add check before enabling/disabling broadcast mode When security violation from new vbios happens, data fabric is risky to stop working. So prevent the direct access to DF mmFabricConfigAccessControl from the new vbios and onwards. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:24:09 -05:00
Jani Nikula	b6ff753a0c	drm: constify fb ops across all drivers Now that the fbops member of struct fb_info is const, we can start making the ops const as well. Cc: dri-devel@lists.freedesktop.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/59b43629ac60031c5bbf961d8c49695019bc9c6f.1575390740.git.jani.nikula@intel.com	2019-12-05 10:57:42 +02:00
Linus Torvalds	c3bed3b20e	pci-v5.5-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl3leXUUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vyY3g/9FAVVdPEaadNtAhQ/zIxcjozDovKq 0q7yOA3aTBTUoNEinm88an6p0dcC4gNKtGukXmzVH2Hhxm9kLRdtpZGYY00tpLUB 9rI7XsgwwHa+hLwsHbIs507sKGFGy5FLr0ChTTGLDEMppnEvjA2hZooYmcB/OgrC LlFcwbNKGOk/Si9u2bF2nLO0JDoVHnwzpF99saew/nqc7Lfj9e9IPZFom+VjPBUh AOvRp2H7uBN+WQlpLeFeMDDoeXh34lX0kYqIV/cVkXVnknDGYKV2CBTg2aeX7jd0 QiPHZh6zlW8zNQgaCZRiBAbatVEOnRMRJ++yiqB8hBYp1LMXm6kJ01YSQpXkugoY Vp9dtzzTARWV/XkKwD4brw9ZEmIDnO+Ed2x2VbUkPJVcXAvzSQWAx82IU0Iuqmcb 9qr6U2Zf/Xk5aFlGPYVH8QOG+QqzIbZNRQ7NlhDlITyW4P6QPu0mw374yYP2wDGL sP5YSS3YGa0sQcEgDtVnd4z+WTZI4AwXLPaeaLkDhdfHp2FsERUY4TrPs33J99xw og4EyokVFzjYzlnBPU6WWn7LL+jj5ccXkL3MA4DR4FJOnNGHh7NXfQUH56rrgsq7 F9/8shL5DuTbQkde1uSyUG9Iq/RigVLlV5DQavFm3dSXvZi0E16t5alC5URNTzk7 at8Bogn53QhlmYc= =uUXw -----END PGP SIGNATURE----- Merge tag 'pci-v5.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Warn if a host bridge has no NUMA info (Yunsheng Lin) - Add PCI_STD_NUM_BARS for the number of standard BARs (Denis Efremov) Resource management: - Fix boot-time Embedded Controller GPE storm caused by incorrect resource assignment after ACPI Bus Check Notification (Mika Westerberg) - Protect pci_reassign_bridge_resources() against concurrent addition/removal (Benjamin Herrenschmidt) - Fix bridge dma_ranges resource list cleanup (Rob Herring) - Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters to control the MMIO and prefetchable MMIO window sizes of hotplug bridges independently (Nicholas Johnson) - Fix MMIO/MMIO_PREF window assignment that assigned more space than desired (Nicholas Johnson) - Only enforce bus numbers from bridge EA if the bridge has EA devices downstream (Subbaraya Sundeep) - Consolidate DT "dma-ranges" parsing and convert all host drivers to use shared parsing (Rob Herring) Error reporting: - Restore AER capability after resume (Mayurkumar Patel) - Add PoisonTLPBlocked AER counter (Rajat Jain) - Use for_each_set_bit() to simplify AER code (Andy Shevchenko) - Fix AER kernel-doc (Andy Shevchenko) - Add "pcie_ports=dpc-native" parameter to allow native use of DPC even if platform didn't grant control over AER (Olof Johansson) Hotplug: - Avoid returning prematurely from sysfs requests to enable or disable a PCIe hotplug slot (Lukas Wunner) - Don't disable interrupts twice when suspending hotplug ports (Mika Westerberg) - Fix deadlocks when PCIe ports are hot-removed while suspended (Mika Westerberg) Power management: - Remove unnecessary ASPM locking (Bjorn Helgaas) - Add support for disabling L1 PM Substates (Heiner Kallweit) - Allow re-enabling Clock PM after it has been disabled (Heiner Kallweit) - Add sysfs attributes for controlling ASPM link states (Heiner Kallweit) - Remove CONFIG_PCIEASPM_DEBUG, including "link_state" and "clk_ctl" sysfs files (Heiner Kallweit) - Avoid AMD FCH XHCI USB PME# from D0 defect that prevents wakeup on USB 2.0 or 1.1 connect events (Kai-Heng Feng) - Move power state check out of pci_msi_supported() (Bjorn Helgaas) - Fix incorrect MSI-X masking on resume and revert related nvme quirk for Kingston NVME SSD running FW E8FK11.T (Jian-Hong Pan) - Always return devices to D0 when thawing to fix hibernation with drivers like mlx4 that used legacy power management (previously we only did it for drivers with new power management ops) (Dexuan Cui) - Clear PCIe PME Status even for legacy power management (Bjorn Helgaas) - Fix PCI PM documentation errors (Bjorn Helgaas) - Use dev_printk() for more power management messages (Bjorn Helgaas) - Apply D2 delay as milliseconds, not microseconds (Bjorn Helgaas) - Convert xen-platform from legacy to generic power management (Bjorn Helgaas) - Removed unused .resume_early() and .suspend_late() legacy power management hooks (Bjorn Helgaas) - Rearrange power management code for clarity (Rafael J. Wysocki) - Decode power states more clearly ("4" or "D4" really refers to "D3cold") (Bjorn Helgaas) - Notice when reading PM Control register returns an error (~0) instead of interpreting it as being in D3hot (Bjorn Helgaas) - Add missing link delays required by the PCIe spec (Mika Westerberg) Virtualization: - Move pci_prg_resp_pasid_required() to CONFIG_PCI_PRI (Bjorn Helgaas) - Allow VFs to use PRI (the PF PRI is shared by the VFs, but the code previously didn't recognize that) (Kuppuswamy Sathyanarayanan) - Allow VFs to use PASID (the PF PASID capability is shared by the VFs, but the code previously didn't recognize that) (Kuppuswamy Sathyanarayanan) - Disconnect PF and VF ATS enablement, since ATS in PFs and associated VFs can be enabled independently (Kuppuswamy Sathyanarayanan) - Cache PRI and PASID capability offsets (Kuppuswamy Sathyanarayanan) - Cache the PRI PRG Response PASID Required bit (Bjorn Helgaas) - Consolidate ATS declarations in linux/pci-ats.h (Krzysztof Wilczynski) - Remove unused PRI and PASID stubs (Bjorn Helgaas) - Removed unnecessary EXPORT_SYMBOL_GPL() from ATS, PRI, and PASID interfaces that are only used by built-in IOMMU drivers (Bjorn Helgaas) - Hide PRI and PASID state restoration functions used only inside the PCI core (Bjorn Helgaas) - Add a DMA alias quirk for the Intel VCA NTB (Slawomir Pawlowski) - Serialize sysfs sriov_numvfs reads vs writes (Pierre Crégut) - Update Cavium ACS quirk for ThunderX2 and ThunderX3 (George Cherian) - Fix the UPDCR register address in the Intel ACS quirk (Steffen Liebergeld) - Unify ACS quirk implementations (Bjorn Helgaas) Amlogic Meson host bridge driver: - Fix meson PERST# GPIO polarity problem (Remi Pommarel) - Add DT bindings for Amlogic Meson G12A (Neil Armstrong) - Fix meson clock names to match DT bindings (Neil Armstrong) - Add meson support for Amlogic G12A SoC with separate shared PHY (Neil Armstrong) - Add meson extended PCIe PHY functions for Amlogic G12A USB3+PCIe combo PHY (Neil Armstrong) - Add arm64 DT for Amlogic G12A PCIe controller node (Neil Armstrong) - Add commented-out description of VIM3 USB3/PCIe mux in arm64 DT (Neil Armstrong) Broadcom iProc host bridge driver: - Invalidate iProc PAXB address mapping before programming it (Abhishek Shah) - Fix iproc-msi and mvebu __iomem annotations (Ben Dooks) Cadence host bridge driver: - Refactor Cadence PCIe host controller to use as a library for both host and endpoint (Tom Joseph) Freescale Layerscape host bridge driver: - Add layerscape LS1028a support (Xiaowei Bao) Intel VMD host bridge driver: - Add VMD bus 224-255 restriction decode (Jon Derrick) - Add VMD 8086:9A0B device ID (Jon Derrick) - Remove Keith from VMD maintainer list (Keith Busch) Marvell ARMADA 3700 / Aardvark host bridge driver: - Use LTSSM state to build link training flag since Aardvark doesn't implement the Link Training bit (Remi Pommarel) - Delay before training Aardvark link in case PERST# was asserted before the driver probe (Remi Pommarel) - Fix Aardvark issues with Root Control reads and writes (Remi Pommarel) - Don't rely on jiffies in Aardvark config access path since interrupts may be disabled (Remi Pommarel) - Fix Aardvark big-endian support (Grzegorz Jaszczyk) Marvell ARMADA 370 / XP host bridge driver: - Make mvebu_pci_bridge_emul_ops static (Ben Dooks) Microsoft Hyper-V host bridge driver: - Add hibernation support for Hyper-V virtual PCI devices (Dexuan Cui) - Track Hyper-V pci_protocol_version per-hbus, not globally (Dexuan Cui) - Avoid kmemleak false positive on hv hbus buffer (Dexuan Cui) Mobiveil host bridge driver: - Change mobiveil csr_read()/write() function names that conflict with riscv arch functions (Kefeng Wang) NVIDIA Tegra host bridge driver: - Fix Tegra CLKREQ dependency programming (Vidya Sagar) Renesas R-Car host bridge driver: - Remove unnecessary header include from rcar (Andrew Murray) - Tighten register index checking for rcar inbound range programming (Marek Vasut) - Fix rcar inbound range alignment calculation to improve packing of multiple entries (Marek Vasut) - Update rcar MACCTLR setting to match documentation (Yoshihiro Shimoda) - Clear bit 0 of MACCTLR before PCIETCTLR.CFINIT per manual (Yoshihiro Shimoda) - Add Marek Vasut and Yoshihiro Shimoda as R-Car maintainers (Simon Horman) Rockchip host bridge driver: - Make rockchip 0V9 and 1V8 power regulators non-optional (Robin Murphy) Socionext UniPhier host bridge driver: - Set uniphier to host (RC) mode always (Kunihiko Hayashi) Endpoint drivers: - Fix endpoint driver sign extension problem when shifting page number to phys_addr_t (Alan Mikhak) Misc: - Add NumaChip SPDX header (Krzysztof Wilczynski) - Replace EXTRA_CFLAGS with ccflags-y (Krzysztof Wilczynski) - Remove unused includes (Krzysztof Wilczynski) - Removed unused sysfs attribute groups (Ben Dooks) - Remove PTM and ASPM dependencies on PCIEPORTBUS (Bjorn Helgaas) - Add PCIe Link Control 2 register field definitions to replace magic numbers in AMDGPU and Radeon CIK/SI (Bjorn Helgaas) - Fix incorrect Link Control 2 Transmit Margin usage in AMDGPU and Radeon CIK/SI PCIe Gen3 link training (Bjorn Helgaas) - Use pcie_capability_read_word() instead of pci_read_config_word() in AMDGPU and Radeon CIK/SI (Frederick Lawler) - Remove unused pci_irq_get_node() Greg Kroah-Hartman) - Make asm/msi.h mandatory and simplify PCI_MSI_IRQ_DOMAIN Kconfig (Palmer Dabbelt, Michal Simek) - Read all 64 bits of Switchtec part_event_bitmap (Logan Gunthorpe) - Fix erroneous intel-iommu dependency on CONFIG_AMD_IOMMU (Bjorn Helgaas) - Fix bridge emulation big-endian support (Grzegorz Jaszczyk) - Fix dwc find_next_bit() usage (Niklas Cassel) - Fix pcitest.c fd leak (Hewenliang) - Fix typos and comments (Bjorn Helgaas) - Fix Kconfig whitespace errors (Krzysztof Kozlowski)" * tag 'pci-v5.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (160 commits) PCI: Remove PCI_MSI_IRQ_DOMAIN architecture whitelist asm-generic: Make msi.h a mandatory include/asm header Revert "nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T" PCI/MSI: Fix incorrect MSI-X masking on resume PCI/MSI: Move power state check out of pci_msi_supported() PCI/MSI: Remove unused pci_irq_get_node() PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer PCI: hv: Change pci_protocol_version to per-hbus PCI: hv: Add hibernation support PCI: hv: Reorganize the code in preparation of hibernation MAINTAINERS: Remove Keith from VMD maintainer PCI/ASPM: Remove PCIEASPM_DEBUG Kconfig option and related code PCI/ASPM: Add sysfs attributes for controlling ASPM link states PCI: Fix indentation drm/radeon: Prefer pcie_capability_read_word() drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions drm/radeon: Correct Transmit Margin masks drm/amdgpu: Prefer pcie_capability_read_word() PCI: uniphier: Set mode register to host mode drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions ...	2019-12-03 13:58:22 -08:00
Monk Liu	4905880b45	drm/amdgpu: fix GFX10 missing CSIB set(v3) still need to init csb even for SRIOV v2: drop init_pg() for gfx10 at all since PG and GFX off feature will be fully controled by RLC and SMU fw for gfx10 v3: drop the flush_gpu_tlb lines since we consider it is only usefull in emulation Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:58:08 -05:00
Monk Liu	cd05b51aaa	drm/amdgpu: should stop GFX ring in hw_fini To align with the scheme from gfx9 disabling GFX ring after VM shutdown could avoid garbage data be fetched to GFX RB which may lead to unnecessary screw up on GFX Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:57:54 -05:00
Monk Liu	dacf56e45d	drm/amdgpu: do autoload right after MEC loaded for SRIOV VF since we don't have RLCG ucode loading and no SRlist as well Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:57:39 -05:00
Monk Liu	6294017fe3	drm/amdgpu: skip rlc ucode loading for SRIOV gfx10 Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:57:17 -05:00
Monk Liu	747d4f715f	drm/amdgpu: fix calltrace during kmd unload(v3) issue: kernel would report a warning from a double unpin during the driver unloading on the CSB bo why: we unpin it during hw_fini, and there will be another unpin in sw_fini on CSB bo. fix: actually we don't need to pin/unpin it during hw_init/fini since it is created with kernel pinned, we only need to fullfill the CSB again during hw_init to prevent CSB/VRAM lost after S3 v2: get_csb in init_rlc so hw_init() will make CSIB content back even after reset or s3 v3: use bo_create_kernel instead of bo_create_reserved for CSB otherwise the bo_free_kernel() on CSB is not aligned and would lead to its internal reserve pending there forever take care of gfx7/8 as well Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:57:00 -05:00
Xiaojie Yuan	fa2b93e39b	drm/amdgpu/gfx10: unlock srbm_mutex after queue programming finish srbm_mutex is to guarantee atomicity for r/w of gfx indexed registers Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:52:55 -05:00
John Clements	f0312f45a0	drm/amdgpu: Added ASIC specific checks in gfxhub V1.1 get XGMI info Added max hive/node info checks per supported ASIC Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:52:11 -05:00
Monk Liu	e2195f7d0e	drm/amdgpu: use CPU to flush vmhub if sched stopped otherwse the flush_gpu_tlb will hang if we unload the KMD becuase the schedulers already stopped Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:39:21 -05:00
Yong Zhao	6dcab16b41	drm/amdkfd: Contain MMHUB number in mmhub_v9_4_setup_vm_pt_regs() Adjust the exposed function prototype so that the caller does not need to know the MMHUB number. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:08:24 -05:00
Hawking Zhang	7091b60cad	drm/amdgpu: load np fw prior before loading the TAs Platform TAs will independently toggle DF Cstate. for instance, get/set topology from xgmi ta. do error injection from ras ta. In such case, PMFW needs to be loaded before TAs so that all the subsequent Cstate calls recieved by PSP FW can be routed to PMFW. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:08:10 -05:00
Hawking Zhang	71e5f0cb93	drm/amdgpu: unload asd in psp hw de-init phase issue unload_ta_cmd to tOS to unload asd driver Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:08:03 -05:00
Hawking Zhang	c64ab8280e	drm/amdgpu: drop asd shared memory asd shared memory is not needed since drivers doesn't invoke any further cmd to asd directly after the asd loading. trust application is the one who needs to talk to asd after the initialization Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-03 11:07:56 -05:00
Emily Deng	0ea203a912	drm/amdgpu/sriov: No need the event 3 and 4 now As will call unload kms when initialize fail, and the unload kms will send event 3 and 4, so don't need event 3 and 4 in device init. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:39:28 -05:00
John Clements	031514956b	drm/amdgpu: Added ASIC specific checks in gfxhub V1.1 get XGMI info Added max hive/node info checks per supported ASIC Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:39:20 -05:00
Bhawanpreet Lakha	a7f4ba7a6c	drm/amd/display: Load TA firmware for navi10/12/14 load the ta firmware for navi10/12/14. This is already being done for raven/picasso and is needed for supporting hdcp on navi Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:39:13 -05:00
Yintian Tao	7c868b592d	drm/amdgpu: not remove sysfs if not create sysfs When load amdgpu failed before create pm_sysfs and ucode_sysfs, the pm_sysfs and ucode_sysfs should not be removed. Otherwise, there will be warning call trace just like below. [ 24.836386] [drm] VCE initialized successfully. [ 24.841352] amdgpu 0000:00:07.0: amdgpu_device_ip_init failed [ 25.370383] amdgpu 0000:00:07.0: Fatal error during GPU init [ 25.889575] [drm] amdgpu: finishing device. [ 26.069128] amdgpu 0000:00:07.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring kiq_2.1.0 test failed (-110) [ 26.070110] [drm:gfx_v9_0_hw_fini [amdgpu]] ERROR KCQ disable failed [ 26.200309] [TTM] Finalizing pool allocator [ 26.200314] [TTM] Finalizing DMA pool allocator [ 26.200349] [TTM] Zone kernel: Used memory at exit: 0 KiB [ 26.200351] [TTM] Zone dma32: Used memory at exit: 0 KiB [ 26.200353] [drm] amdgpu: ttm finalized [ 26.205329] ------------[ cut here ]------------ [ 26.205330] sysfs group 'fw_version' not found for kobject '0000:00:07.0' [ 26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256 sysfs_remove_group+0x80/0x90 [ 26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE) drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables x_tables autofs4 8139too psmouse 8139cp mii i2c_piix4 pata_acpi floppy [ 26.205369] CPU: 0 PID: 1228 Comm: modprobe Tainted: G OE 5.2.0-rc1 #1 [ 26.205370] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 26.205372] RIP: 0010:sysfs_remove_group+0x80/0x90 [ 26.205374] Code: e8 35 b9 ff ff 5b 41 5c 41 5d 5d c3 48 89 df e8 f6 b5 ff ff eb c6 49 8b 55 00 49 8b 34 24 48 c7 c7 48 7a 70 98 e8 60 63 d3 ff <0f> 0b eb d7 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 [ 26.205375] RSP: 0018:ffffbee242b0b908 EFLAGS: 00010282 [ 26.205376] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 [ 26.205377] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff97ad6f817380 [ 26.205377] RBP: ffffbee242b0b920 R08: ffffffff98f520c4 R09: 00000000000002b3 [ 26.205378] R10: ffffbee242b0b8f8 R11: 00000000000002b3 R12: ffffffffc0e58240 [ 26.205379] R13: ffff97ad6d1fe0b0 R14: ffff97ad4db954c8 R15: ffff97ad4db7fff0 [ 26.205380] FS: 00007ff3d8a1c4c0(0000) GS:ffff97ad6f800000(0000) knlGS:0000000000000000 [ 26.205381] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 26.205381] CR2: 00007f9b2ef1df04 CR3: 000000042aab8001 CR4: 00000000003606f0 [ 26.205384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 26.205385] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 26.205385] Call Trace: [ 26.205461] amdgpu_ucode_sysfs_fini+0x18/0x20 [amdgpu] [ 26.205518] amdgpu_device_fini+0x3b4/0x560 [amdgpu] [ 26.205573] amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu] [ 26.205623] amdgpu_driver_load_kms+0xcd/0x250 [amdgpu] [ 26.205637] drm_dev_register+0x12b/0x1c0 [drm] [ 26.205695] amdgpu_pci_probe+0x12a/0x1e0 [amdgpu] [ 26.205699] local_pci_probe+0x47/0xa0 [ 26.205701] pci_device_probe+0x106/0x1b0 [ 26.205704] really_probe+0x21a/0x3f0 [ 26.205706] driver_probe_device+0x11c/0x140 [ 26.205707] device_driver_attach+0x58/0x60 [ 26.205709] __driver_attach+0xc3/0x140 Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:39:05 -05:00
Monk Liu	d5939e4db5	drm/amdgpu: fix GFX10 missing CSIB set(v3) still need to init csb even for SRIOV v2: drop init_pg() for gfx10 at all since PG and GFX off feature will be fully controled by RLC and SMU fw for gfx10 v3: drop the flush_gpu_tlb lines since we consider it is only usefull in emulation Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:38:56 -05:00
Monk Liu	eb529b8e46	drm/amdgpu: should stop GFX ring in hw_fini To align with the scheme from gfx9 disabling GFX ring after VM shutdown could avoid garbage data be fetched to GFX RB which may lead to unnecessary screw up on GFX Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:38:49 -05:00
Monk Liu	6de40f02b3	drm/amdgpu: do autoload right after MEC loaded for SRIOV VF since we don't have RLCG ucode loading and no SRlist as well Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:38:43 -05:00
Monk Liu	1797ec7ffd	drm/amdgpu: skip rlc ucode loading for SRIOV gfx10 Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:38:37 -05:00
Monk Liu	82a829dc8c	drm/amdgpu: fix calltrace during kmd unload(v3) issue: kernel would report a warning from a double unpin during the driver unloading on the CSB bo why: we unpin it during hw_fini, and there will be another unpin in sw_fini on CSB bo. fix: actually we don't need to pin/unpin it during hw_init/fini since it is created with kernel pinned, we only need to fullfill the CSB again during hw_init to prevent CSB/VRAM lost after S3 v2: get_csb in init_rlc so hw_init() will make CSIB content back even after reset or s3 v3: use bo_create_kernel instead of bo_create_reserved for CSB otherwise the bo_free_kernel() on CSB is not aligned and would lead to its internal reserve pending there forever take care of gfx7/8 as well Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:38:27 -05:00
Monk Liu	869aebc7ba	drm/amdgpu: use CPU to flush vmhub if sched stopped otherwse the flush_gpu_tlb will hang if we unload the KMD becuase the schedulers already stopped Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:38:03 -05:00
James Zhu	45317d5ffb	drm/amdgpu/gfx: Increase dispatch packet number For Arcturus, increase dispatch packet number to stress scheduler. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:37:57 -05:00
James Zhu	2255d7f36e	drm/amdgpu/gfx: Clear more EDC cnt Clear SDMA and HDP EDC counter in GPR workarounds. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:37:51 -05:00
Xiaojie Yuan	858054f761	drm/amdgpu/gfx10: remove outdated comments Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:37:44 -05:00
Xiaojie Yuan	a5e82d0b95	drm/amdgpu/gfx10: unlock srbm_mutex after queue programming finish srbm_mutex is to guarantee atomicity for r/w of gfx indexed registers Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:37:29 -05:00
Linus Torvalds	aa32f11691	hmm related patches for 5.5 This is another round of bug fixing and cleanup. This time the focus is on the driver pattern to use mmu notifiers to monitor a VA range. This code is lifted out of many drivers and hmm_mirror directly into the mmu_notifier core and written using the best ideas from all the driver implementations. This removes many bugs from the drivers and has a very pleasing diffstat. More drivers can still be converted, but that is for another cycle. - A shared branch with RDMA reworking the RDMA ODP implementation - New mmu_interval_notifier API. This is focused on the use case of monitoring a VA and simplifies the process for drivers - A common seq-count locking scheme built into the mmu_interval_notifier API usable by drivers that call get_user_pages() or hmm_range_fault() with the VA range - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen GntDev drivers to the new API. This deletes a lot of wonky driver code. - Two improvements for hmm_range_fault(), from testing done by Ralph -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl3cCjQACgkQOG33FX4g mxpp8xAAiR9iOdT28m/tx1GF31XludrMhRZVIiz0vmCIxIiAkWekWEfAEVm9PDnh wdrxTJohSs+B65AK3sfToOM3AIuNCuFVWmbbHI5qmOO76vaSvcZa905Z++pNsawO Bn8mgRCprYoFHcxWLvTvnA5U0g1S2BSSOwBSZI43CbEnVvHjYAR6MnvRqfGMk+NF bf8fTk/x+fl0DCemhynlBLuJkogzoE2Hgl0yPY5bFna4PktOxdpa1yPaQsiqZ7e6 2s2NtM3pbMBJk0W42q5BU+aPhiqfxFFszasPSLBduXrD2xDsG76HJdHj5VydKmfL nelG4BvqJozXTEZWvTEePYhCqaZ41eJZ7Asw8BXtmacVqE5mDlTXo/Zdgbz7yEOR mI5MVyjD5rauZJldUOWXbwrPoWVFRvboauehiSgqvxvT9HvlFp9GKObSuu4gubBQ mzxs4t48tPhA7bswLmw0/pETSogFuVDfaB7hsyY0gi8EwxMFMpw2qFypm1PEEF+C BuUxCSShzvNKrraNe5PWaNNFd3AzIwAOWJHE+poH4bCoXQVr5nA+rq2gnHkdY5vq /xrBCyxkf0U05YoFGYembPVCInMehzp9Xjy8V+SueSvCg2/TYwGDCgGfsbe9dNOP Bc40JpS7BDn5w9nyLUJmOx7jfruNV6kx1QslA7NDDrB/rzOlsEc= =Hj8a -----END PGP SIGNATURE----- Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "This is another round of bug fixing and cleanup. This time the focus is on the driver pattern to use mmu notifiers to monitor a VA range. This code is lifted out of many drivers and hmm_mirror directly into the mmu_notifier core and written using the best ideas from all the driver implementations. This removes many bugs from the drivers and has a very pleasing diffstat. More drivers can still be converted, but that is for another cycle. - A shared branch with RDMA reworking the RDMA ODP implementation - New mmu_interval_notifier API. This is focused on the use case of monitoring a VA and simplifies the process for drivers - A common seq-count locking scheme built into the mmu_interval_notifier API usable by drivers that call get_user_pages() or hmm_range_fault() with the VA range - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen GntDev drivers to the new API. This deletes a lot of wonky driver code. - Two improvements for hmm_range_fault(), from testing done by Ralph" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap mm/hmm: make full use of walk_page_range() xen/gntdev: use mmu_interval_notifier_insert mm/hmm: remove hmm_mirror and related drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror drm/amdgpu: Call find_vma under mmap_sem nouveau: use mmu_interval_notifier instead of hmm_mirror nouveau: use mmu_notifier directly for invalidate_range_start drm/radeon: use mmu_interval_notifier_insert RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv RDMA/odp: Use mmu_interval_notifier_insert() mm/hmm: define the pre-processor related parts of hmm.h even if disabled mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror mm/mmu_notifier: add an interval tree notifier mm/mmu_notifier: define the header pre-processor parts even if disabled mm/hmm: allow snapshot of the special zero page	2019-11-30 10:33:14 -08:00
Bjorn Helgaas	e87eb585d3	Merge branch 'pci/misc' - Add NumaChip SPDX header (Krzysztof Wilczynski) - Replace EXTRA_CFLAGS with ccflags-y (Krzysztof Wilczynski) - Remove unused includes (Krzysztof Wilczynski) - Avoid AMD FCH XHCI USB PME# from D0 defect that prevents wakeup on USB 2.0 or 1.1 connect events (Kai-Heng Feng) - Removed unused sysfs attribute groups (Ben Dooks) - Remove PTM and ASPM dependencies on PCIEPORTBUS (Bjorn Helgaas) - Add PCIe Link Control 2 register field definitions to replace magic numbers in AMDGPU and Radeon CIK/SI (Bjorn Helgaas) - Fix incorrect Link Control 2 Transmit Margin usage in AMDGPU and Radeon CIK/SI PCIe Gen3 link training (Bjorn Helgaas) - Use pcie_capability_read_word() instead of pci_read_config_word() in AMDGPU and Radeon CIK/SI (Frederick Lawler) * pci/misc: drm/radeon: Prefer pcie_capability_read_word() drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions drm/radeon: Correct Transmit Margin masks drm/amdgpu: Prefer pcie_capability_read_word() drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions drm/amdgpu: Correct Transmit Margin masks PCI: Add #defines for Enter Compliance, Transmit Margin PCI: Allow building PCIe things without PCIEPORTBUS PCI: Remove PCIe Kconfig dependencies on PCI PCI/ASPM: Remove dependency on PCIEPORTBUS PCI/PTM: Remove dependency on PCIEPORTBUS PCI/PTM: Remove spurious "d" from granularity message PCI: sysfs: Remove unused attribute groups x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect PCI: Remove unused includes and superfluous struct declaration x86/PCI: Replace deprecated EXTRA_CFLAGS with ccflags-y x86/PCI: Correct SPDX comment style x86/PCI: Add NumaChip SPDX GPL-2.0 to replace COPYING boilerplate	2019-11-28 08:54:32 -06:00
Dan Carpenter	f4618fe9c2	drm/amdgpu: Fix a bug in jpeg_v1_0_start() Originally the last WREG32_SOC15() was a part of the if statement block but the curly braces are on the wrong line. Fixes: `bb0db70f3f` ("drm/amdgpu: separate JPEG1.0 code out from VCN1.0") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 14:51:04 -05:00
Alex Deucher	5149f08275	drm/amdgpu: flag vram lost on baco reset for VI/CIK VI/CIK BACO was inflight when this fix landed for SOC15/NV. Add the fix to VI/CIK as well. Acked-by: Evan Quan <evan.quan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 14:51:03 -05:00
Alex Deucher	de185019bc	drm/amdgpu: move pci handling out of pm ops The documentation says the that PCI core handles this for you unless you choose to implement it. Just rely on the PCI core to handle the pci specific bits. Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 14:51:03 -05:00
Hawking Zhang	be3e73ea7d	drm/amdgpu: apply gpr/gds workaround before enabling GFX EDC mode gfx memory should be initialized before enabling DED and FUE field in mmGB_EDC_MODE Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 14:51:03 -05:00
Jack Zhang	e416fdb6a3	drm/amd/amdgpu/sriov skip jpeg ip block for ARCTURUS VF Currently ARCTURUS VF doesn't support jpeg ip block. Skip jpeg ip block in case guest driver load fail. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Zhexi Zhang <zhexi.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 14:51:03 -05:00
Felix Kuehling	9f890f3044	drm/amdgpu: Optimize KFD page table reservation Be less pessimistic about estimated page table use for KFD. Most allocations use 2MB pages and therefore need less VRAM for page tables. This allows more VRAM to be used for applications especially on large systems with many GPUs and hundreds of GB of system memory. Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB Old page table reservation per GPU: 1GB New page table reservation per GPU: 32MB Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 14:51:03 -05:00
Felix Kuehling	b72ff1909c	drm/amdgpu: Raise KFD unpinned system memory limit Allow KFD applications to use more unpinned system memory through HMM. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 14:51:03 -05:00
Felix Kuehling	29a39c90ba	drm/amdgpu: Optimize KFD page table reservation Be less pessimistic about estimated page table use for KFD. Most allocations use 2MB pages and therefore need less VRAM for page tables. This allows more VRAM to be used for applications especially on large systems with many GPUs and hundreds of GB of system memory. Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB Old page table reservation per GPU: 1GB New page table reservation per GPU: 32MB Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 12:24:07 -05:00
Alex Deucher	dea8b90029	drm/amdgpu: flag vram lost on baco reset for VI/CIK VI/CIK BACO was inflight when this fix landed for SOC15/NV. Add the fix to VI/CIK as well. Acked-by: Evan Quan <evan.quan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 12:22:31 -05:00
John Clements	5985ebbe78	drm/amdgpu: Resolved offchip EEPROM I/O issue Updated target I2C address Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 12:20:22 -05:00
Nathan Chancellor	a63141e317	drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG Commit `b0f3cd3191` ("drm/amdgpu: remove unnecessary JPEG2.0 code from VCN2.0") introduced a new clang warning in the vcn_v2_0_stop function: ../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: warning: variable 'r' is used uninitialized whenever 'while' loop exits because its condition is false [-Wsometimes-uninitialized] SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note: expanded from macro 'SOC15_WAIT_ON_RREG' while ((tmp_ & (mask)) != (expected_value)) { \ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1083:6: note: uninitialized use occurs here if (r) ^ ../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: note: remove the condition if it is always true SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r); ^ ../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note: expanded from macro 'SOC15_WAIT_ON_RREG' while ((tmp_ & (mask)) != (expected_value)) { \ ^ ../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1072:7: note: initialize the variable 'r' to silence this warning int r; ^ = 0 1 warning generated. To prevent warnings like this from happening in the future, make the SOC15_WAIT_ON_RREG macro initialize its ret variable before the while loop that can time out. This macro's return value is always checked so it should set ret in both the success and fail path. Link: https://github.com/ClangBuiltLinux/linux/issues/776 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 09:48:57 -05:00
John Clements	8633f126bf	drm/amdgpu: Resolved offchip EEPROM I/O issue Updated target I2C address Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-25 11:20:10 -05:00
Oak Zeng	3d3f9ba8c4	drm/amdgpu: Apply noretry setting for mmhub9.4 Config the translation retry behavior according to noretry kernel parameter Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Suggested-by: Jay Cornwall <Jay.Cornwall@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-25 11:19:55 -05:00
Jason Gunthorpe	81fa1af31b	drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror Convert the collision-retry lock around hmm_range_fault to use the one now provided by the mmu_interval notifier. Although this driver does not seem to use the collision retry lock that hmm provides correctly, it can still be converted over to use the mmu_interval_notifier api instead of hmm_mirror without too much trouble. This also deletes another place where a driver is associating additional data (struct amdgpu_mn) with a mmu_struct. Link: https://lore.kernel.org/r/20191112202231.3856-13-jgg@ziepe.ca Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:45 -04:00
Jason Gunthorpe	62914a99de	drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror Remove the interval tree in the driver and rely on the tree maintained by the mmu_notifier for delivering mmu_notifier invalidation callbacks. For some reason amdgpu has a very complicated arrangement where it tries to prevent duplicate entries in the interval_tree, this is not necessary, each amdgpu_bo can be its own stand alone entry. interval_tree already allows duplicates and overlaps in the tree. Also, there is no need to remove entries upon a release callback, the mmu_interval API safely allows objects to remain registered beyond the lifetime of the mm. The driver only has to stop touching the pages during release. Link: https://lore.kernel.org/r/20191112202231.3856-12-jgg@ziepe.ca Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:45 -04:00
Jason Gunthorpe	a9ae8731e6	drm/amdgpu: Call find_vma under mmap_sem find_vma() must be called under the mmap_sem, reorganize this code to do the vma check after entering the lock. Further, fix the unlocked use of struct task_struct's mm, instead use the mm from hmm_mirror which has an active mm_grab. Also the mm_grab must be converted to a mm_get before acquiring mmap_sem or calling find_vma(). Fixes: `66c45500bf` ("drm/amdgpu: use new HMM APIs and helpers") Fixes: `0919195f2b` ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads") Link: https://lore.kernel.org/r/20191112202231.3856-11-jgg@ziepe.ca Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
changzhu	f920d1bb9c	drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10 It may lose gpuvm invalidate acknowldege state across power-gating off cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire before invalidation and semaphore release after invalidation. After adding semaphore acquire before invalidation, the semaphore register become read-only if another process try to acquire semaphore. Then it will not be able to release this semaphore. Then it may cause deadlock problem. If this deadlock problem happens, it needs a semaphore firmware fix. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2019-11-22 14:55:19 -05:00
changzhu	6c2c897237	drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub SW must acquire/release one of the vm_invalidate_eng_sem around the invalidation req/ack. Through this way,it can avoid losing invalidate acknowledge state across power-gating off cycle. To use vm_invalidate_eng_sem, it needs to initialize vm_invalidate_eng*_sem firstly. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2019-11-22 14:55:19 -05:00
Jack Zhang	1b34de7c3f	drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF. After rlcg fw 2.1, kmd driver starts to load extra fw for LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw because all rlcg related fw have been loaded by host driver. Guest driver would load the three fw fail without this change. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:55:19 -05:00
Jack Zhang	ef1c0cbcd1	drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF Currently the three features haven't been enabled at SRIOV, it would trigger guest driver load fail with the bare-metal path of the three features. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:55:19 -05:00
Xiaojie Yuan	210b3b3c75	drm/amdgpu/gfx10: re-init clear state buffer after gpu reset This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x. clear state buffer (resides in vram) is corrupted after 1st baco reset, upon gfxoff exit, CPF gets garbage header in CSIB and hangs. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2019-11-22 14:55:12 -05:00
Stephen Rothwell	a3511321fd	merge fix for "ftrace: Rework event_create_dir()" Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:55:12 -05:00
Jay Cornwall	57fb0ab2f1	drm/amdgpu: Update Arcturus golden registers Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:55:12 -05:00
Xiaojie Yuan	908a28be09	drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access Fixes: `0900a9efdb` ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)") Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:55:12 -05:00
Xiaojie Yuan	1e902a6d32	drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt 50us is not enough to wait for cp ready after gpu reset on some navi asics. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Suggested-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2019-11-22 14:54:56 -05:00
Alex Deucher	5e18d2b14c	Revert "drm/amd/display: enable S/G for RAVEN chip" This reverts commit `1c42591591`. S/G display is not stable with the IOMMU enabled on some platforms. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:11 -05:00
Alex Deucher	8fc4134413	drm/amdgpu: disable gfxoff on original raven There are still combinations of sbios and firmware that are not stable. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:11 -05:00
Alex Deucher	5355d7e054	drm/amdgpu: remove experimental flag for Navi14 5.4 and newer works fine with navi14. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:10 -05:00
Alex Deucher	70f7eb639e	drm/amdgpu: disable gfxoff when using register read interface When gfxoff is enabled, accessing gfx registers via MMIO can lead to a hang. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497 Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:10 -05:00
Sam Bobroff	3d0e3ce52c	drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2 The INTERRUPT_CNTL2 register expects a valid DMA address, but is currently set with a GPU MC address. This can cause problems on systems that detect the resulting DMA read from an invalid address (found on a Power8 guest). Instead, use the DMA address of the dummy page because it will always be safe. Fixes: `27ae10641e` ("drm/amdgpu: add interupt handler implementation for si v3") Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:10 -05:00
Alex Deucher	f8a69a8022	drm/amdgpu/nv: add asic func for fetching vbios from rom directly Needed as a fallback if the vbios can't be fetched by other means. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:10 -05:00
Yintian Tao	c0e21ea1d0	drm/amdgpu: put flush_delayed_work at first There is one regression from 042f3d7b745cd76aa To put flush_delayed_work after adev->shutdown = true which will make amdgpu_ih_process not response the irq At last, all ib ring tests will be failed just like below [drm] amdgpu: finishing device. [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback timer expired on ring comp_1.0.0 [drm] Fence fallback timer expired on ring comp_1.1.0 [drm] Fence fallback timer expired on ring comp_1.2.0 [drm] Fence fallback timer expired on ring comp_1.3.0 [drm] Fence fallback timer expired on ring comp_1.0.1 amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.1.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.2.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.3.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on sdma0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on sdma1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on uvd_enc_0.0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on vce0 (-110). [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] ERROR ib ring test failed (-110). v2: replace cancel_delayed_work_sync() with flush_delayed_work() Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:10 -05:00
Leo Liu	4e20f6550b	drm/amdgpu/vcn2.5: fix the enc loop with hw fini Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:10 -05:00
Xiaojie Yuan	0900a9efdb	drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2) 1. no need to allocate an extra member for 'mqd_backup' array 2. backup/restore mqd to/from the correct 'mqd_backup' array slot v2: warning fix (Alex) Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:35:10 -05:00
zhengbin	e9c5dbc1a2	drm/amdgpu: Use ARRAY_SIZE for sos_old_versions Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/psp_v3_1.c:182:40-41: WARNING: Use ARRAY_SIZE Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
changzhu	4ed8a03740	drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10 It may lose gpuvm invalidate acknowldege state across power-gating off cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire before invalidation and semaphore release after invalidation. After adding semaphore acquire before invalidation, the semaphore register become read-only if another process try to acquire semaphore. Then it will not be able to release this semaphore. Then it may cause deadlock problem. If this deadlock problem happens, it needs a semaphore firmware fix. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
changzhu	dab5ef2722	drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub SW must acquire/release one of the vm_invalidate_eng_sem around the invalidation req/ack. Through this way,it can avoid losing invalidate acknowledge state across power-gating off cycle. To use vm_invalidate_eng_sem, it needs to initialize vm_invalidate_eng*_sem firstly. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Jack Zhang	c348ad46b0	drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF. After rlcg fw 2.1, kmd driver starts to load extra fw for LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw because all rlcg related fw have been loaded by host driver. Guest driver would load the three fw fail without this change. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Jack Zhang	edc2176d51	drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF Currently the three features haven't been enabled at SRIOV, it would trigger guest driver load fail with the bare-metal path of the three features. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Xiaojie Yuan	c25edaaf75	drm/amdgpu/gfx10: re-init clear state buffer after gpu reset This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x. clear state buffer (resides in vram) is corrupted after 1st baco reset, upon gfxoff exit, CPF gets garbage header in CSIB and hangs. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Stephen Rothwell	6a93b58e5f	merge fix for "ftrace: Rework event_create_dir()" Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Colin Ian King	2e77541bd1	drm/amdgpu: remove redundant assignment to pointer write_frame The pointer write_frame is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Alex Deucher	562b49fcd0	drm/amdgpu: simplify runtime suspend In the standard _PR3 case, the pci core handles the pci state. The driver only needs to handle it in the legacy ATPX case. This may fix issues with runtime suspend/resume on certain hybrid graphics laptops. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Jay Cornwall	6e04b2248d	drm/amdgpu: Update Arcturus golden registers Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Dennis Li	f6c3623b7b	drm/amdgpu: implement querying ras error count for mmhub9.4 Get mmhub error counter by accessing EDC_CNT registers. v2: Add mmhub_v9_4_ prefix for local static variable and function Signed-off-by: Dennis Li <dennis.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:27:11 -05:00
Dennis Li	8781e5df11	drm/amdgpu: refine query function of mmhub EDC counter in vg20 Add codes to print the detail EDC info for the subblock of mmhub v2: Move the EDC_CNT registers' defintion from mmhub_9_4 header files to mmhub_1_0 ones. Add mmhub_v1_0_ prefix for the local static variable and function. v3: squash in DC fix Signed-off-by: Dennis Li <dennis.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:26:46 -05:00
Dennis Li	46f719696e	drm/amdgpu: define soc15_ras_field_entry for reuse The struct soc15_ras_field_entry will be reused by other IPs, such as mmhub and gc v2: rename ras_subblock_regs to gc_ras_fields_vg20, because the future asic maybe have a different table. Signed-off-by: Dennis Li <dennis.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:23:09 -05:00
Xiaojie Yuan	d98a07aea6	drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access Fixes: `0bb419c76b` ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)") Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:22:56 -05:00
Xiaojie Yuan	387d40fd6f	drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt 50us is not enough to wait for cp ready after gpu reset on some navi asics. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Suggested-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:20:30 -05:00
Alex Sierra	b4672c8a84	amd/amdgpu: force to trigger a no-retry-fault after a retry-fault Only for the debugger use case. [why] Avoid endless translation retries, after an invalid address access has been issued to the GPU. Instead, the trap handler is forced to enter by generating a no-retry-fault. A s_trap instruction is inserted in the debugger case to let the wave to enter trap handler to save context. [how] Intentionally using an invalid flag combination (F and P set at the same time) to trigger a no-retry-fault, after a retry-fault happens. This is only valid under compute context. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:20:23 -05:00
Alex Sierra	f43ef951f6	drm/amdgpu: add flag to indicate amdgpu vm context Flag added to indicate if the amdgpu vm context is used for compute or graphics. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-22 14:20:07 -05:00
Frederick Lawler	88027c89ea	drm/amdgpu: Prefer pcie_capability_read_word() Commit `8c0d3a02c1` ("PCI: Add accessors for PCI Express Capability") added accessors for the PCI Express Capability so that drivers didn't need to be aware of differences between v1 and v2 of the PCI Express Capability. Replace pci_read_config_word() and pci_write_config_word() calls with pcie_capability_read_word() and pcie_capability_write_word(). [bhelgaas: fix a couple remaining instances in cik.c] Link: https://lore.kernel.org/r/20191118003513.10852-1-fred@fredlawl.com Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-21 11:15:49 -06:00
Bjorn Helgaas	35e768e296	drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions Replace hard-coded magic numbers with the descriptive PCI_EXP_LNKCTL2 definitions. No functional change intended. Link: https://lore.kernel.org/r/20191112173503.176611-4-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-21 07:52:34 -06:00
Bjorn Helgaas	19d7a95a8b	drm/amdgpu: Correct Transmit Margin masks Previously we masked PCIe Link Control 2 register values with "7 << 9", which was apparently intended to be the Transmit Margin field, but instead was the high order bit of Transmit Margin, the Enter Modified Compliance bit, and the Compliance SOS bit. Correct the mask to "7 << 7", which is the Transmit Margin field. Link: https://lore.kernel.org/r/20191112173503.176611-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-21 07:52:34 -06:00
Dave Airlie	c22fe762ba	Merge tag 'drm-next-5.5-2019-11-15' of git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.5-2019-11-15: amdgpu: - Fix AVFS handling on SMU7 parts with custom power tables - Enable Overdrive sysfs interface for Navi parts - Fix power limit handling on smu11 parts - Fix pcie link sysfs output for Navi - Probably cancel MM worker threads on shutdown radeon: - Cleanup for ppc change Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191115163516.3714-1-alexander.deucher@amd.com	2019-11-21 08:52:27 +10:00
Alex Deucher	72f058b723	drm/amdgpu: enable runtime pm on BACO capable boards if runpm=1 BACO - Bus Active, Chip Off Everything is in place now. Not enabled by default yet. You still have to specify runpm=1. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:54 -05:00
Alex Deucher	3840c5bcc2	drm/amdgpu: disentangle runtime pm and vga_switcheroo Originally we only supported runtime pm on PX/HG laptops so vga_switcheroo and runtime pm are sort of entangled. Attempt to logically separate them. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:53 -05:00
Alex Deucher	6ae6c7d404	drm/amdgpu: start to disentangle boco from runtime pm BACO - Bus Active, Chip Off BOCO - Bus Off, Chip Off We originally only supported runtime pm on PX/HG laptops so most of the runtime pm code looks for this. Add a new flag to check for runtime pm enablement and use this rather than checking for PX/HG. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:53 -05:00
Alex Deucher	1913431728	drm/amdgpu: add baco support to runtime suspend/resume BACO - Bus Active, Chip Off This adds the necessary support to the runtime suspend and resume functions to handle boards that support baco. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:52 -05:00
Alex Deucher	361dbd01a1	drm/amdgpu: add helpers for baco entry and exit BACO - Bus Active, Chip Off Will be used for runtime pm. Entry will enter the BACO state (chip off). Exit will exit the BACO state (chip on). Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:52 -05:00
Alex Deucher	11520f2708	drm/amdgpu: split swSMU baco_reset into enter and exit BACO - Bus Active, Chip Off So we can use it for power savings rather than just reset. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:51 -05:00
Alex Deucher	b97e9d47e5	drm/amdgpu: add additional boco checks to runtime suspend/resume (v2) BACO - Bus Active, Chip Off BOCO - Bus Off, Chip Off We will take slightly different paths for boco and baco. v2: fold together two consecutive if clauses Reviewed-by: Evan Quan <evan.quan@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:50 -05:00
Alex Deucher	31af062acf	drm/amdgpu: rename amdgpu_device_is_px to amdgpu_device_supports_boco (v2) BACO - Bus Active, Chip Off BOCO - Bus Off, Chip Off To better match what we are checking for and to align with amdgpu_device_supports_baco. BOCO is used on PowerXpress/Hybrid Graphics systems and BACO is used on desktop dGPU boards. v2: fix typo in documentation Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:50 -05:00
Alex Deucher	a69cba42b1	drm/amdgpu: add a amdgpu_device_supports_baco helper BACO - Bus Active, Chip Off To check if a device supports BACO or not. This will be used in determining when to enable runtime pm. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:49 -05:00
Alex Deucher	ac7426169e	drm/amdgpu: add supports_baco callback for NV asics. BACO - Bus Active, Chip Off Check the BACO capabilities from the powerplay table. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:48 -05:00
Alex Deucher	e45ed9435f	drm/amdgpu: add supports_baco callback for VI asics. BACO - Bus Active, Chip Off Check the BACO capabilities from the powerplay table. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:48 -05:00
Alex Deucher	0d0c07ee07	drm/amdgpu: add supports_baco callback for CIK asics. BACO - Bus Active, Chip Off Check the BACO capabilities from the powerplay table. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:47 -05:00
Alex Deucher	3670c242e3	drm/amdgpu: add supports_baco callback for SI asics. BACO - Bus Active, Chip Off Not supported. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:47 -05:00
Alex Deucher	988eb9ff3e	drm/amdgpu: add supports_baco callback for soc15 asics. (v2) BACO - Bus Active, Chip Off Check the BACO capabilities from the powerplay table. v2: drop unrelated struct cleanup Reviewed-by: Evan Quan <evan.quan@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:46 -05:00
Alex Deucher	69d5436d4d	drm/amdgpu: add asic callback for BACO support BACO - Bus Active, Chip Off Used to check whether the device supports BACO. This will be used to enable runtime pm on devices which support BACO. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:46 -05:00
Hawking Zhang	858a2bbad6	drm/amdgpu: pull ras controller int status only when ras enabled ras_controller_irq and athub_err_event_irq are only registered when PCIE_BIF ras is marked as supported. as the result, the driver also just need pull the int status in such case. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 12:09:23 -05:00
Hawking Zhang	5bdd0b72d6	drm/amdgpu: switch to common helper func for psp cmd submission Drop all the IP specific cmd_submit callback function and use the common helper instead Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 12:09:13 -05:00
Hawking Zhang	cc65176e51	drm/amdgpu: add helper func for psp ring cmd submission Except for ring wptr update, the psp ring cmd submission function shouldn't be IP specific one. Create a common helper function to be shared for all the ASICs. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 12:09:06 -05:00
Hawking Zhang	13a390a6f9	drm/amdgpu: add psp funcs for ring write pointer read/write The ring write pointer regsiter update is the only part that is IP specific ones in psp_cmd_submit function. Add two callbacks for wptr read/write so that we unify the psp_cmd_submit function for all the ASICs. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 12:08:58 -05:00
Evan Quan	0a650c1d35	drm/amd/powerplay: add Arcturus baco reset support Enable baco reset support on Arcturus. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 12:08:38 -05:00
Alex Deucher	2aa87ba568	Revert "drm/amd/display: enable S/G for RAVEN chip" This reverts commit `1c42591591`. S/G display is not stable with the IOMMU enabled on some platforms. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:54 -05:00
Alex Deucher	3f2a06ac81	drm/amdgpu: disable gfxoff on original raven There are still combinations of sbios and firmware that are not stable. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:54 -05:00
Alex Deucher	b62d955426	drm/amdgpu: remove experimental flag for Navi14 5.4 and newer works fine with navi14. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:54 -05:00
Alex Deucher	ca9317b918	drm/amdgpu: disable gfxoff when using register read interface When gfxoff is enabled, accessing gfx registers via MMIO can lead to a hang. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497 Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:54 -05:00
zhengbin	1664194925	drm/amdgpu: remove not needed memset Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c:64:13-31: WARNING: dma_alloc_coherent use in ih -> ring already zeroes out memory, so memset is not needed Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:54 -05:00
Sam Bobroff	b992691d45	drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2 The INTERRUPT_CNTL2 register expects a valid DMA address, but is currently set with a GPU MC address. This can cause problems on systems that detect the resulting DMA read from an invalid address (found on a Power8 guest). Instead, use the DMA address of the dummy page because it will always be safe. Fixes: `27ae10641e` ("drm/amdgpu: add interupt handler implementation for si v3") Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:54 -05:00
Alex Deucher	29bc37b410	drm/amdgpu/nv: add asic func for fetching vbios from rom directly Needed as a fallback if the vbios can't be fetched by other means. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:51 -05:00
Alex Deucher	761e09230c	drm/amdgpu/soc15: move struct definition around to align with other soc15 asics Move reset_method next to reset callback to match the struct layout and the other definition in this file. Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:51 -05:00
Yintian Tao	d0d13fe874	drm/amdgpu: put flush_delayed_work at first There is one regression from 042f3d7b745cd76aa To put flush_delayed_work after adev->shutdown = true which will make amdgpu_ih_process not response the irq At last, all ib ring tests will be failed just like below [drm] amdgpu: finishing device. [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback timer expired on ring comp_1.0.0 [drm] Fence fallback timer expired on ring comp_1.1.0 [drm] Fence fallback timer expired on ring comp_1.2.0 [drm] Fence fallback timer expired on ring comp_1.3.0 [drm] Fence fallback timer expired on ring comp_1.0.1 amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.1.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.2.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.3.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on sdma0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on sdma1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on uvd_enc_0.0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on vce0 (-110). [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] ERROR ib ring test failed (-110). v2: replace cancel_delayed_work_sync() with flush_delayed_work() Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:51 -05:00
Leo Liu	4effa8dbc1	drm/amdgpu/vcn2.5: fix the enc loop with hw fini Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:51 -05:00
Leo Liu	8c74e59049	drm/amdgpu: enable Arcturus JPEG2.5 block It also doen't care about FW loading type, so enabling it directly. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	e89e2237e8	drm/amdgpu: enable Arcturus CG for VCN and JPEG blocks Arcturus VCN and JPEG only got CG support, and no PG support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	14f43e8f88	drm/amdgpu: move JPEG2.5 out from VCN2.5 And clean up the duplicated stuff Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	5be45a26c9	drm/amdgpu: enable JPEG2.0 for Navi1x and Renoir By adding JPEG IP block to the family Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	52f2e779ad	drm/amdgpu: add driver support for JPEG2.0 and above By using JPEG IP block type Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	474b6d296f	drm/amdgpu: enable JPEG2.0 dpm By using its own enabling function Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	099d66e43f	drm/amdgpu: add PG and CG for JPEG2.0 And enable them for Navi1x and Renoir Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	b0f3cd3191	drm/amdgpu: remove unnecessary JPEG2.0 code from VCN2.0 They are no longer needed, using from JPEG2.0 instead. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	6ac2724110	drm/amdgpu: add JPEG v2.0 function supports It got separated from VCN2.0 with a new jpeg_v2_0_ip_block Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	2eb167293f	drm/amdgpu: add JPEG common functions to amdgpu_jpeg They will be used for JPEG2.0 and later. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	0388aee766	drm/amdgpu: use the JPEG structure for general driver support JPEG1.0 will be functional along with VCN1.0 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:49 -05:00
Leo Liu	bb0db70f3f	drm/amdgpu: separate JPEG1.0 code out from VCN1.0 For VCN1.0, the separation is just in code wise, JPEG1.0 HW is still included in the VCN1.0 HW. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:49 -05:00
Leo Liu	9d9cc9b8fe	drm/amdgpu: add amdgpu_jpeg and JPEG tests It will be used for all versions of JPEG eventually. Previous JPEG tests will be removed later since they are still used by JPEG2.x. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:49 -05:00
Leo Liu	88a1c40a04	drm/amdgpu: add JPEG HW IP and SW structures It will be used for JPEG IP 1.0, 2.0, 2.5 and later. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:49 -05:00
Xiaojie Yuan	0bb419c76b	drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2) 1. no need to allocate an extra member for 'mqd_backup' array 2. backup/restore mqd to/from the correct 'mqd_backup' array slot v2: warning fix (Alex) Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:49 -05:00
Hawking Zhang	9e612c11a7	drm/amdgpu: init umc functions for arcturus umc ras reuse vg20 umc functions for arcturus umc ras Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 09:47:36 -05:00
Hawking Zhang	baaeb610b1	drm/amdgpu: enable ras capablity check on arcturus check hw ras capablity via atomfirmware Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 09:47:30 -05:00
Jani Nikula	f139a62c7a	drm/amdgpu: use drm_debug_enabled() to check for debug categories Allow better abstraction of the drm_debug global variable in the future. No functional changes. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: David (ChunMing) Zhou <David1.Zhou@amd.com> Cc: amd-gfx@lists.freedesktop.org Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Sean Paul <sean@poorly.run> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e30ddbd627be1fe1c67dd007e0d36f81a094bc91.1572258936.git.jani.nikula@intel.com	2019-11-14 14:08:39 +02:00
Dave Airlie	0990ca235d	Merge tag 'drm-next-5.5-2019-11-08' of git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.5-2019-11-08: amdgpu: - Enable VCN dynamic powergating on RV/RV2 - Fixes for Navi14 - Misc Navi fixes - Fix MSI-X tear down - Misc Arturus fixes - Fix xgmi powerstate handling - Documenation fixes scheduler: - Fix static code checker warning - Fix possible thread reactivation while thread is stopped - Avoid cleanup if thread is parked radeon: - SI dpm fix ported from amdgpu Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191108212713.5078-1-alexander.deucher@amd.com	2019-11-14 08:43:07 +10:00
yu kuai	9e089a29c6	drm/amdgpu: remove set but not used variable 'invalid' Fixes gcc '-Wunused-but-set-variable' warning: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c: In function ‘amdgpu_amdkfd_evict_userptr’: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:1665:6: warning: variable ‘invalid’ set but not used [-Wunused-but-set-variable] 'invalid' is never used, so can be removed. Thus 'atomic_inc_return' can be replaced as 'atomic_inc' Fixes: `5ae0283e83` ("drm/amdgpu: Add userptr support for KFD") Signed-off-by: yu kuai <yukuai3@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-13 15:29:46 -05:00

... 7 8 9 10 11 ...

7400 Commits