linux

Author	SHA1	Message	Date
Chengming Gui	dec7880579	drm/amd/amdgpu: Correct gfx10's CG sequence Incorrect CG sequence will cause gfx timedout, if we keep switching power profile mode (enter profile mod such as PEAK will disable CG, exit profile mode EXIT will enable CG) when run Vulkan test case(case used for test: vkexample). Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-07 14:01:06 -04:00
Linus Torvalds	86f26a77cb	pci-v5.7-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl6GTQMUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vy3PhAAmqpYBRobOsG8QbmKDjoJEFtkqdvD z6+4zf/R+hF11RyXjMDwihIe8d+tkQ4eAaYu6Oh5PrTyanz0G0PgeCrivZeytULk thqQIWzDQMVA5vN/2/Vy8s5s+3HzP8z/MZOFScJ7+xA1MndXptPRTNmFUbjx+GAv x8/pTp0u9AF6m7itX65DxXvwkzjWamt+Ar4Yx2IcuKAU/M5RtfuZO3PpDnqn7/wk JFlkRoYeFB6qNnnkPdeyPHl9dALhuhzgdTyklQEnKVW3nf3xThYDhcEwdh6kBQgl 0dH8lL5LXy7PKGN8RES4wB0Vqndw/HlsCF5O4wkkfItbnbJxGJtS139e5973m0ud sgWvF4yJAT2jCKhIeNz34sePQJMyWALhv0XzZCsJ0YeGHsrV1jrHELkwUT1+eIsT 3UV0iZ6aL06zQJDyKUbbIcQzEQ/wwBC+x9VgsyL54K1quCQZ1N1Nl/dvrb4cRG9m m9EhJK/brDf4c0uFlOmMTSxV1t5J+z6ZSQnh1ShD/o5yBsxqN6q5brDT6LEs+jbM LsIkA18jJOd4OyiDs98YiFKvIfFQbQ0LEBQpJwhF0snvfBFMMbUYN/T/NYneWON/ F0TpkFoP7PXDuq55iNaLdnObfzrpC9kdzUyWvePUvjxIl55bkf+/qtUny+H48t4L dNggvW052d7BHes= =deWu -----END PGP SIGNATURE----- Merge tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Revert sysfs "rescan" renames that broke apps (Kelsey Skunberg) - Add more 32 GT/s link speed decoding and improve the implementation (Yicong Yang) Resource management: - Add support for sizing programmable host bridge apertures and fix a related alpha Nautilus regression (Ivan Kokshaysky) Interrupts: - Add boot interrupt quirk mechanism for Xeon chipsets and document boot interrupts (Sean V Kelley) PCIe native device hotplug: - When possible, disable in-band presence detect and use PDS (Alexandru Gagniuc) - Add DMI table for devices that don't use in-band presence detection but don't advertise that correctly (Stuart Hayes) - Fix hang when powering slots up/down via sysfs (Lukas Wunner) - Fix an MSI interrupt race (Stuart Hayes) Virtualization: - Add ACS quirks for Zhaoxin devices (Raymond Pang) Error handling: - Add Error Disconnect Recover (EDR) support so firmware can report devices disconnected via DPC and we can try to recover (Kuppuswamy Sathyanarayanan) Peer-to-peer DMA: - Add Intel Sky Lake-E Root Ports B, C, D to the whitelist (Andrew Maier) ASPM: - Reduce severity of common clock config message (Chris Packham) - Clear the correct bits when enabling L1 substates, so we don't go to the wrong state (Yicong Yang) Endpoint framework: - Replace EPF linkup ops with notifier call chain and improve locking (Kishon Vijay Abraham I) - Fix concurrent memory allocation in OB address region (Kishon Vijay Abraham I) - Move PF function number assignment to EPC core to support multiple function creation methods (Kishon Vijay Abraham I) - Fix issue with clearing configfs "start" entry (Kunihiko Hayashi) - Fix issue with endpoint MSI-X ignoring BAR Indicator and Table Offset (Kishon Vijay Abraham I) - Add support for testing DMA transfers (Kishon Vijay Abraham I) - Add support for testing > 10 endpoint devices (Kishon Vijay Abraham I) - Add support for tests to clear IRQ (Kishon Vijay Abraham I) - Add common DT schema for endpoint controllers (Kishon Vijay Abraham I) Amlogic Meson PCIe controller driver: - Add DT bindings for AXG PCIe PHY, shared MIPI/PCIe analog PHY (Remi Pommarel) - Add Amlogic AXG PCIe PHY, AXG MIPI/PCIe analog PHY drivers (Remi Pommarel) Cadence PCIe controller driver: - Add Root Complex/Endpoint DT schema for Cadence PCIe (Kishon Vijay Abraham I) Intel VMD host bridge driver: - Add two VMD Device IDs that require bus restriction mode (Sushma Kalakota) Mobiveil PCIe controller driver: - Refactor and modularize mobiveil driver (Hou Zhiqiang) - Add support for Mobiveil GPEX Gen4 host (Hou Zhiqiang) Microsoft Hyper-V host bridge driver: - Add support for Hyper-V PCI protocol version 1.3 and PCI_BUS_RELATIONS2 (Long Li) - Refactor to prepare for virtual PCI on non-x86 architectures (Boqun Feng) - Fix memory leak in hv_pci_probe()'s error path (Dexuan Cui) NVIDIA Tegra PCIe controller driver: - Use pci_parse_request_of_pci_ranges() (Rob Herring) - Add support for endpoint mode and related DT updates (Vidya Sagar) - Reduce -EPROBE_DEFER error message log level (Thierry Reding) Qualcomm PCIe controller driver: - Restrict class fixup to specific Qualcomm devices (Bjorn Andersson) Synopsys DesignWare PCIe controller driver: - Refactor core initialization code for endpoint mode (Vidya Sagar) - Fix endpoint MSI-X to use correct table address (Kishon Vijay Abraham I) TI DRA7xx PCIe controller driver: - Fix MSI IRQ handling (Vignesh Raghavendra) TI Keystone PCIe controller driver: - Allow AM654 endpoint to raise MSI-X interrupt (Kishon Vijay Abraham I) Miscellaneous: - Quirk ASMedia XHCI USB to avoid "PME# from D0" defect (Kai-Heng Feng) - Use ioremap(), not phys_to_virt(), for platform ROM to fix video ROM mapping with CONFIG_HIGHMEM (Mikel Rychliski)" * tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (96 commits) misc: pci_endpoint_test: remove duplicate macro PCI_ENDPOINT_TEST_STATUS PCI: tegra: Print -EPROBE_DEFER error message at debug level misc: pci_endpoint_test: Use full pci-endpoint-test name in request_irq() misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices tools: PCI: Add 'e' to clear IRQ misc: pci_endpoint_test: Add ioctl to clear IRQ misc: pci_endpoint_test: Avoid using module parameter to determine irqtype PCI: keystone: Allow AM654 PCIe Endpoint to raise MSI-X interrupt PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments misc: pci_endpoint_test: Add support to get DMA option from userspace tools: PCI: Add 'd' command line option to support DMA misc: pci_endpoint_test: Use streaming DMA APIs for buffer allocation PCI: endpoint: functions/pci-epf-test: Print throughput information PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data PCI: pciehp: Fix MSI interrupt race PCI: pciehp: Fix indefinite wait on sysfs requests PCI: endpoint: Fix clearing start entry in configfs PCI: tegra: Add support for PCIe endpoint mode in Tegra194 PCI: sysfs: Revert "rescan" file renames ...	2020-04-03 14:25:02 -07:00
Aaron Ma	5932d260a8	drm/amdgpu: Fix oops when pp_funcs is unset in ACPI event On ARCTURUS and RENOIR, powerplay is not supported yet. When plug in or unplug power jack, ACPI event will issue. Then kernel NULL pointer BUG will be triggered. Check for NULL pointers before calling. Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-04-03 17:24:10 -04:00
Likun Gao	b74fb888f4	drm/amdgpu: change SH MEM alignment mode for gfx10 Change SH_MEM_CONFIG Alignment mode to Automatic, as: 1)OGL fn_amd_compute_shader will failed with unaligned mode. 2)The default alignment mode was defined to automatic on gfx10 specification. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-03 17:23:21 -04:00
Alex Sierra	193cce34a1	amdgpu/drm: remove psp access on navi10 for sriov Navi ASICs don't require to access through PSP to osssys registers. This on SR-IOV configuration. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-03 17:01:25 -04:00
Bhawanpreet Lakha	8913f7ff05	drm/amd/display: Guard calls to hdcp_ta and dtm_ta [Why] The buffer used when calling psp is a shared buffer. If we have multiple calls at the same time we can overwrite the buffer. [How] Add mutex to guard the shared buffer. Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-03 17:01:18 -04:00
Linus Torvalds	50a5de895d	hmm related patches for 5.7 This series focuses on corner case bug fixes and general clarity improvements to hmm_range_fault(). - 9 bug fixes - Allow pgmap to track the 'owner' of a DEVICE_PRIVATE - in this case the owner tells the driver if it can understand the DEVICE_PRIVATE page or not. Use this to resolve a bug in nouveau where it could touch DEVICE_PRIVATE pages from other drivers. - Remove a bunch of dead, redundant or unused code and flags - Clarity improvements to hmm_range_fault() -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl6CUDEACgkQOG33FX4g mxqXgQ/+I2o4QIq86xTddRCesINab73sA4bdvIYgixojTBvo5a9Tu9JO3nmzJOEi mmUXsdN+MnyTdnrkcfbEvo0b5ktCGlwGYQmcRv9MrB1QvSChg6Zy3F3cWR+dsv5c Mo3z0sWRx+//dVSI0p1BMhd5rym0nZCh/rUO94C/XgZ3YoC+MjX4lKeCMvjJpZFc DDEN3Z+zjOvjeiT9TMNMmkPnZSY+N4RQwTRKek5l95QwcHyFy5SBI+uzOHyE0lVw KG5+yk+TFRMoiHjZh4BFqkibQbVKG0qMKiOymIncTr3kL4DK0Hwf/4Zk4DMKucEb Rs2B7C+UShiyIOzdjbnBsqPiHevAhR3nOpozJy3x/Z6fLapVoIlVx3UJJ/djuxZa 7+H2VzxKz1xGRDH4Js/WD0smZ8jisA04vW5THJVFr0Vd2sqKBo3G/5ay2Kw9NX7T MRII1KkA/luZbmM6WLEQkliVuNkMCsVUU3hiY96tLI7uN73M9fQUe6s3NubeoFxi UKg0zuMsAlsJxkHGeY+IMHtUR/law+k9/aZoB4idMGwV/i7KiiolbRcB0H5yn4L5 OV+4uxVBFS/n37sCpMwpx5WJtgbPzww3b6Cd0hfIUqnQzSOjP40Pl5ZX/c/2M7ps bUdZjve5j653YeC/7NBPDprvzbfmyyxJdZZZzzXLQl5kyL2o9LE= =UEpV -----END PGP SIGNATURE----- Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "This series focuses on corner case bug fixes and general clarity improvements to hmm_range_fault(). It arose from a review of hmm_range_fault() by Christoph, Ralph and myself. hmm_range_fault() is being used by these 'SVM' style drivers to non-destructively read the page tables. It is very similar to get_user_pages() except that the output is an array of PFNs and per-pfn flags, and it has various modes of reading. This is necessary before RDMA ODP can be converted, as we don't want to have weird corner case regressions, which is still a looking forward item. Ralph has a nice tester for this routine, but it is waiting for feedback from the selftests maintainers. Summary: - 9 bug fixes - Allow pgmap to track the 'owner' of a DEVICE_PRIVATE - in this case the owner tells the driver if it can understand the DEVICE_PRIVATE page or not. Use this to resolve a bug in nouveau where it could touch DEVICE_PRIVATE pages from other drivers. - Remove a bunch of dead, redundant or unused code and flags - Clarity improvements to hmm_range_fault()" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (25 commits) mm/hmm: return error for non-vma snapshots mm/hmm: do not set pfns when returning an error code mm/hmm: do not unconditionally set pfns when returning EBUSY mm/hmm: use device_private_entry_to_pfn() mm/hmm: remove HMM_FAULT_SNAPSHOT mm/hmm: remove unused code and tidy comments mm/hmm: return the fault type from hmm_pte_need_fault() mm/hmm: remove pgmap checking for devmap pages mm/hmm: check the device private page owner in hmm_range_fault() mm: simplify device private page handling in hmm_range_fault mm: handle multiple owners of device private pages in migrate_vma memremap: add an owner field to struct dev_pagemap mm: merge hmm_vma_do_fault into into hmm_vma_walk_hole_ mm/hmm: don't handle the non-fault case in hmm_vma_walk_hole_() mm/hmm: simplify hmm_vma_walk_hugetlb_entry() mm/hmm: remove the unused HMM_FAULT_ALLOW_RETRY flag mm/hmm: don't provide a stub for hmm_range_fault() mm/hmm: do not check pmd_protnone twice in hmm_vma_handle_pmd() mm/hmm: add missing call to hmm_pte_need_fault in HMM_PFN_SPECIAL handling mm/hmm: return -EFAULT when setting HMM_PFN_ERROR on requested valid pages ...	2020-04-01 17:57:52 -07:00
Linus Torvalds	f365ab31ef	drm for 5.7-rc1 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJehCE6AAoJEAx081l5xIa+bKAQAJj49Hj3WvSJV6cSl3LmgwPV IcWpR6LLVrCOQiS600NyXnbv6lTmCtnIfwEneQqm5ltQvJk38QcKQvSua6+ETi9f 0hl7IiytLwv2R/pS0g9jgSsKmbeP2bDwBAR44vK6pcK6WOgCrkpoL1F4YZEDUILT ewnRb52afF3Hw8PSG0lwgBYd7G0uto49t3nt86LjGftJgB3wPFlluVOzLHTeEh0w FZEyKuqS8hq8EZFfG1Gu1hS2ylO9y1VgYxiv18jDyRb8jUPq+gzqH6roTPRIronA whGZgC6SkyZ9NthCLBu4ITbO9wStAHoawzFfax25QwwuoOyrikuvGy3PfEUu+ixL bW2UtYK6BHLGnvZChH6E7i9J6qQbNYCn3Ty5bB1KsY06sHVoP1jYUBSicPN2ELWc 9KaBI+WROBNEkge/rizwUFfD/u0MZaaSRsVSlGDdHkD7IFj2tDPhfNWdXZQl4EwR JnndT6cu97htf7tkid7RASpJ/hnwJTb1hg0Cc9kPblOrGqEDF0K5845FLR9VptGl 5c2/KyKM/GaI793fP1TG76uDegBhV9337mUF1ZW3c6gCE7QXncZXM5jrIFRE9q2L IbvuyUYRof1gW0r1R0WVJYAi2CRBeMd7qgkQjMrbTw0o7FiYDJXB7dc5pYj2um2v mSVHikC2S1rbUdH0xbWM =I0vz -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-04-01' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "This is the main drm pull request for 5.7-rc1. Highlights: - i915 enables Tigerlake by default - i915 and amdgpu have initial OLED backlight support [ Jani Nikula pipes up and points out that we've had a bunch of "initial support" code for a long time already, but only now Lyude made it actually work on real world machines ] - vmwgfx add support to enable OpenGL 4 userspace - zero length arrays are mostly removed. Detailed summary: new driver: - tidss: TI Keystone platform display subsystem core: - new drm device warn macros - mode config valid for memory constrained devices - bridge bus format negotation - consolidated fake vblank event handling - dma_alloc related cleanups - drop get_crtc callback - dp: DP1.4 EDID corruption test - EDID CEA detailed timings improvements - relicense some code to dual GPL2/MIT - convert core vblank support to per-crtc support - rework drm_global_mutex - bridge rework to allow omap_dss custom driver removeal - remove drm_fb_helper connector interrfaces - zero-length array removal scheduler: - support for modifying the sched list - revert job distribution optimization - helper to pick least loaded scheduler - race condition fix mst: - various fixes - remove register_connector callback i915: - uapi to allows userspace specific CS ring buffer sizes - Tigerlake enablement patches + Tigerlake enabled by default - new sysfs entries for engine properties - display/logging refactors - eDP/DP fixes for DPCD - Gen7 back to aliasing-ppgtt - Gen8+ irq refactor - Avoid globals - GEM locking fixes and simplifications - Ice Lake and Elkhart Lake fixes and workarounds - Baytrail/Haswell instability fix - GVT - VFIO edid better support amdgpu: - Rework VM update handling in preparation for HMM support - drm load/unload removal fixups - USB-C PD firmware updates - HDCP srm support - Navi/renoir PM watermark fixes - OLED panel support - Optimize debugging vram access - Use BACO for runtime pm - DC clock programming optimizations and fixes - PSP fw loading sequence updates - Drop DRIVER_USE_AGP - Remove legacy drm load and unload callbacks - ACP Kconfig fix - Lots of fixes across the driver amdkfd: - runtime pm support - more gfx config details in amdgpu radeon: - drop DRIVER_USE_AGP vmwgfx: - Disable DMA when SEV encryption in use - Shader Model 5 support - needed for GL4 support msm: - DPU resource manager refactor - dpu using atomic global state mediatek: - MT8183 DPI support etnaviv: - out-of-bounds read fix - expose feature flags for GC400 STM32MP1 SoC - runtime suspend entry fix - dma32 zone fix hisilicon: - mode selection fixes meson: - YUV420 support lima: - add support for heap buffers tinydrm: - removal of owner field - explicit DT dependency removal - YAML schema conversion tegra: - misc cleanups tidss: - new driver virtio: - better batching of notifications to host - memory handling reworked - shmem + gpu context fixes hibmc: - add gamma_set support - improve DPMS support pl111: - Integrator IM-PD1 support sun4i: - LVDS support for A20 + A33 - DSI panel handling improvements" * tag 'drm-next-2020-04-01' of git://anongit.freedesktop.org/drm/drm: (1537 commits) drm/i915/display: Fix mode private_flags comparison at atomic_check drm/i915/gt: Stage the transfer of the virtual breadcrumb drm/i915/gt: Select the deepest available parking mode for rc6 drm/i915: Avoid live-lock with i915_vma_parked() drm/i915/gt: Treat idling as a RPS downclock event drm/i915/gt: Cancel a hung context if already closed drm/i915: Use explicit flag to mark unreachable intel_context drm/amdgpu: don't try to reserve training bo for sriov (v2) drm/amdgpu/smu11: add support for SMU AC/DC interrupts drm/amdgpu/swSMU: handle manual AC/DC notifications drm/amdgpu/swSMU: handle DC controlled by GPIO for navi1x drm/amdgpu/swSMU: set AC/DC mode based on the current system state (v2) drm/amdgpu/swSMU: correct the bootup power source for Navi1X (v2) drm/amdgpu/swSMU: use the smu11 power source helper for navi1x drm/amdgpu/smu11: add a helper to set the power source drm/amd/swSMU: add callback to set AC/DC power source (v2) drm/scheduler: fix rare NULL ptr race drm/amdgpu: fix the coverage issue to clear ArcVPGRs drm/amd/display: Fix pageflip event race condition for DCN. drm/[radeon\|amdgpu]: Remove HAINAN board from max_sclk override check ...	2020-04-01 15:24:20 -07:00
Linus Torvalds	69c1fd9726	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "My attempt to revitalize trivial queue I've been neglecting for years (what a disaster that was for this world, right? :) ) with patches collected from backlog that were still relevant and not applied elsewhere in the meantime" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: err.h: remove deprecated PTR_RET for good blk-mq: Fix typo in comment x86/boot: Fix comment spelling sh: mach-highlander: Fix comment spelling s390/dasd: Fix comment spelling mfd: wm8994: Fix comment spelling docs: Add reference in binfmt-misc.rst genirq: fix kerneldoc comment for irq_desc drm/amdgpu: fix two documentation mismatch issues HID: fix Kconfig word ordering list/hashtable: minor documentation corrections.	2020-04-01 14:52:59 -07:00
Colin Ian King	a500194e73	drm/amdgpu/vcn: fix spelling mistake "fimware" -> "firmware" There is a spelling mistake in a dev_err error message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:46 -04:00
Christian König	82c416b13c	drm/amdgpu: fix and cleanup amdgpu_gem_object_close v4 The problem is that we can't add the clear fence to the BO when there is an exclusive fence on it since we can't guarantee the the clear fence will complete after the exclusive one. To fix this refactor the function and also add the exclusive fence as shared to the resv object. v2: fix warning v3: add excl fence as shared instead v4: squash in fix for fence handling in amdgpu_gem_object_close Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:46 -04:00
James Zhu	e520859cde	drm/amdgpu: enable VCN2.5 DPG mode for Arcturus Enable VCN2.5 DPG mode for arcturus after below items are applied. ASD: 0x21000023 SOS: 0x17003B VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16 VBIOS: 23 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
James Zhu	c97e3076eb	drm/amdgpu/vcn2.5: Add firmware w/r ptr reset sync Add firmware write/read point reset sync through shared memory Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
James Zhu	9352141027	drm/amdgpu/vcn2.0: Add firmware w/r ptr reset sync Add firmware write/read point reset sync through shared memory Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
James Zhu	2c68f0e377	drm/amdgpu/vcn: Add firmware share memory support Added firmware share memory support for VCN. Current multiple queue mode is enabled only. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
James Zhu	ad9469fb5b	drm/amdgpu/vcn2.5: stall DPG when WPTR/RPTR reset Add vcn dpg harware synchronization to fix race condition issue between vcn driver and hardware. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
James Zhu	ef563ff403	drm/amdgpu/vcn2.0: stall DPG when WPTR/RPTR reset Add vcn dpg harware synchronization to fix race condition issue between vcn driver and hardware. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
James Zhu	e3b41d82da	drm/amdgpu/vcn: fix race condition issue for dpg unpause mode switch Couldn't only rely on enc fence to decide switching to dpg unpaude mode. Since a enc thread may not schedule a fence in time during multiple threads running situation. v3: 1. Rename enc_submission_cnt to dpg_enc_submission_cnt 2. Add dpg_enc_submission_cnt check in idle_work_handler v4: Remove extra counter check, and reduce counter before idle work schedule Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
James Zhu	bd718638b8	drm/amdgpu/vcn: fix race condition issue for vcn start Fix race condition issue when multiple vcn starts are called. v2: Removed checking the return value of cancel_delayed_work_sync() to prevent possible races here. v3: Add total_submission_cnt to avoid gate power unexpectedly. v4: Remove extra counter check, and reduce counter before idle work schedule Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
Yintian Tao	17e137f27c	drm/amdgpu: skip access sdma_v5_0 registers under SRIOV (v2) Due to the new L1.0b0c011b policy, many SDMA registers are blocked which raise the violation warning. There are total 6 pair register needed to be skipped when driver init and de-init. mmSDMA0/1_CNTL mmSDMA0/1_F32_CNTL mmSDMA0/1_UTCL1_PAGE mmSDMA0/1_UTCL1_CNTL mmSDMA0/1_CHICKEN_BITS, mmSDMA0/1_SEM_WAIT_FAIL_TIMER_CNTL v2: squash in warning fix Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
Christian König	1675c3a24d	drm/amdgpu: stop disable the scheduler during HW fini When we stop the HW for example for GPU reset we should not stop the front-end scheduler. Otherwise we run into intermediate failures during command submission. The scheduler should only be stopped in very few cases: 1. We can't get the hardware working in ring or IB test after a GPU reset. 2. The KIQ scheduler is not used in the front-end and should be disabled during GPU reset. 3. In amdgpu_ring_fini() when the driver unloads. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Test-by: Dennis Li <dennis.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:45 -04:00
Alex Sierra	9e94ff3386	drm/amdgpu: reroute VMC and UMD to IH ring 1 for oss v5 [Why] Due Page faults can easily overwhelm the interrupt handler. So to make sure that we never lose valuable interrupts on the primary ring we re-route page faults to IH ring 1. It also facilitates the recovery page process, since it's already running from a process context. This is valid for Arcturus and future Navi generation GPUs. [How] Setting IH_CLIENT_CFG_DATA for VMC and UMD IH clients. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:44 -04:00
Alex Sierra	0ab176e69c	drm/amdgpu: call psp to program ih cntl in SR-IOV for Navi call psp to program ih cntl in SR-IOV if supported on Navi and Arcturus. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:44 -04:00
Alex Sierra	ab51801206	drm/amdgpu: enable IH ring 1 and ring 2 for navi Support added into IH to enable ring1 and ring2 for navi10_ih. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:44 -04:00
Alex Sierra	b635ae8744	drm/amdgpu: ih doorbell size of range changed for nbio v7.4 [Why] nbio v7.4 size of ih doorbell range is 64 bit. This requires 2 DWords per register. [How] Change ih doorbell size from 2 to 4. This means two Dwords per ring. Current configuration uses two ih rings. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:44 -04:00
Alex Sierra	04cdac5c17	drm/amdgpu: infinite retries fix from UTLC1 RB SDMA [Why] Previously these registers were set to 0. This was causing an infinite retry on the UTCL1 RB, preventing higher priority RB such as paging RB. [How] Set to one the SDMAx_UTLC1_TIMEOUT registers for all SDMAs on Vega10, Vega12, Vega20 and Arcturus. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:44 -04:00
Evan Quan	a9d82d2f91	drm/amdgpu: fix non-pointer dereference for non-RAS supported Backtrace on gpu recover test on Navi10. [ 1324.516681] RIP: 0010:amdgpu_ras_set_error_query_ready+0x15/0x20 [amdgpu] [ 1324.523778] Code: 4c 89 f7 e8 cd a2 a0 d8 e9 99 fe ff ff 45 31 ff e9 91 fe ff ff 0f 1f 44 00 00 55 48 85 ff 48 89 e5 74 0e 48 8b 87 d8 2b 01 00 <40> 88 b0 38 01 00 00 5d c3 66 90 0f 1f 44 00 00 55 31 c0 48 85 ff [ 1324.543452] RSP: 0018:ffffaa1040e4bd28 EFLAGS: 00010286 [ 1324.549025] RAX: 0000000000000000 RBX: ffff911198b20000 RCX: 0000000000000000 [ 1324.556217] RDX: 00000000000c0a01 RSI: 0000000000000000 RDI: ffff911198b20000 [ 1324.563514] RBP: ffffaa1040e4bd28 R08: 0000000000001000 R09: ffff91119d0028c0 [ 1324.570804] R10: ffffffff9a606b40 R11: 0000000000000000 R12: 0000000000000000 [ 1324.578413] R13: ffffaa1040e4bd70 R14: ffff911198b20000 R15: 0000000000000000 [ 1324.586464] FS: 00007f4441cbf540(0000) GS:ffff91119ed80000(0000) knlGS:0000000000000000 [ 1324.595434] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1324.601345] CR2: 0000000000000138 CR3: 00000003fcdf8004 CR4: 00000000003606e0 [ 1324.608694] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1324.616303] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1324.623678] Call Trace: [ 1324.626270] amdgpu_device_gpu_recover+0x6e7/0xc50 [amdgpu] [ 1324.632018] ? seq_printf+0x4e/0x70 [ 1324.636652] amdgpu_debugfs_gpu_recover+0x50/0x80 [amdgpu] [ 1324.643371] seq_read+0xda/0x420 [ 1324.647601] full_proxy_read+0x5c/0x90 [ 1324.652426] __vfs_read+0x1b/0x40 [ 1324.656734] vfs_read+0x8e/0x130 [ 1324.660981] ksys_read+0xa7/0xe0 [ 1324.665201] __x64_sys_read+0x1a/0x20 [ 1324.669907] do_syscall_64+0x57/0x1c0 [ 1324.674517] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1324.680654] RIP: 0033:0x7f44417cf081 Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:44 -04:00
Tom St Denis	c76c1a4297	drm/amd/amdgpu: Include headers for PWR and SMUIO registers Clean up the smu10, smu12, and gfx9 drivers to use headers for registers instead of hardcoding in the C source files. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:44 -04:00
xinhui pan	c8e42d5785	drm/amdgpu: implement more ib pools (v2) We have three ib pools, they are normal, VM, direct pools. Any jobs which schedule IBs without dependence on gpu scheduler should use DIRECT pool. Any jobs schedule direct VM update IBs should use VM pool. Any other jobs use NORMAL pool. v2: squash in coding style fix Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:44 -04:00
Jiawei	b7b2a316b9	drm/amdgpu: extend compute job timeout extend compute lockup timeout to 60000 for SR-IOV. Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Jiawei <Jiawei.Gu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Emily Deng	ad31da434e	drm/amdgpu: No need support vcn decode As no need to support vcn decode feature, so disable the ring for SR-IOV. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Monk Liu	2f2941324c	drm/amdgpu: postpone entering fullaccess mode if host support new handshake we only need to enter fullaccess_mode in ip_init() part, otherwise we need to do it before reading vbios (becuase host prepares vbios for VF only after received REQ_GPU_INIT event under legacy handshake) Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Monk Liu	dffa11b4f7	drm/amdgpu: adjust sequence of ip_discovery init and timeout_setting what: 1)move timtout setting before ip_early_init to reduce exclusive mode cost for SRIOV 2)move ip_discovery_init() to inside of amdgpu_discovery_reg_base_init() it is a prepare for the later upcoming patches. why: in later upcoming patches we would use a new mailbox event -- "req_gpu_init_data", which is a callback hooked in adev->virt.ops and this callback send a new event "REQ_GPU_INIT_DAT" to host to notify host to do some preparation like "IP discovery/vbios on the VF FB" and this callback must be: A) invoked after set_ip_block() because virt.ops is configured during set_ip_block() B) invoked before ip_discovery_init() becausen ip_discovery_init() need host side prepares everything in VF FB first. current place of ip_discovery_init() is before we can invoke callback of adev->virt.ops, thus we must move ip_discovery_init() to a place after the adev->virt.ops all settle done, and the perfect place is in amdgpu_discovery_reg_base_init() Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Monk Liu	122078de16	drm/amdgpu: equip new req_init_data handshake by this new handshake host side can prepare vbios/ip-discovery and pf&vf exchange data upon recieving this request without stopping world switch. this way the world switch is less impacted by VF's exclusive mode request Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Monk Liu	ff1f03a7b8	drm/amdgpu: use static mmio offset for NV mailbox what: with the new "req_init_data" handshake we need to use mailbox before do IP discovery, so in mxgpu_nv.c file the original SOC15_REG method won'twork because that depends on IP discovery complete first. how: so the solution is to always use static MMIO offset for NV+ mailbox registers. HW team confirm us all MAILBOX registers will be at the same offset for all ASICs, no IP discovery needed for those registers Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Monk Liu	aa53bc2edb	drm/amdgpu: introduce new request and its function 1) modify xgpu_nv_send_access_requests to support new idh request 2) introduce new function: req_gpu_init_data() which is used to notify host to prepare vbios/ip-discovery/pfvf exchange Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Monk Liu	c27cbdd2d0	drm/amdgpu: introduce new idh_request/event enum new idh_request and ihd_event to prepare for the new handshake protocol implementation later Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Monk Liu	4d130238a7	drm/amdgpu: cleanup idh event/req for NV headers 1) drop the headers from AI in mxgpu_nv.c, should refer to mxgpu_nv.h 2) the IDH_EVENT_MAX is not used and not aligned with host side so drop it 3) the IDH_TEXT_MESSAG was provided in host but not defined in guest Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Chen Zhou	955df04e3b	drm/amdgpu/uvd7: remove unnecessary conversion to bool The conversion to bool is not needed, remove it. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Emily Deng	d73cd70127	drm/amdgpu: Ignore the not supported error from psp As the VCN firmware will not use vf vmr now. And new psp policy won't support set tmr now. For driver compatible issue, ignore the not support error. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Emily Deng	6bc8cdde57	drm/amdgpu: Add 4k resolution for virtual display Add 4k resolution for virtual connector. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Emily Deng	02f6efb478	drm/amdgpu: Virtual display need to support multiple ctrcs The crtc num is determined by virtual_display parameter. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
John Clements	61380faa4b	drm/amdgpu: disable ras query and iject during gpu reset added flag to ras context to indicate if ras query functionality is ready Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
John Clements	66399248fe	drm/amdgpu: added xgmi ras error reset sequence added mechanism to clear xgmi ras status inbetween error queries Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Monk Liu	3aa0115d23	drm/amdgpu: cleanup all virtualization detection routine we need to move virt detection much earlier because: 1) HW team confirms us that RCC_IOV_FUNC_IDENTIFIER will always be at DE5 (dw) mmio offset from vega10, this way there is no need to implement detect_hw_virt() routine in each nbio/chip file. for VI SRIOV chip (tonga & fiji), the BIF_IOV_FUNC_IDENTIFIER is at 0x1503 2) we need to acknowledged we are SRIOV VF before we do IP discovery because the IP discovery content will be updated by host everytime after it recieved a new coming "REQ_GPU_INIT_DATA" request from guest (there will be patches for this new handshake soon). Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Monk Liu	b89659b783	drm/amdgpu: amends feature bits for MM bandwidth mgr Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Monk Liu	8884532a6e	drm/amdgpu: purge ip_discovery headers those two headers are not needed for ip discovery Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Kent Russell	714309f0f3	drm/amdgpu: Fix FRU data checking Ensure that when we memcpy, we don't end up copying more data than the struct supports. For now, this is 16 characters for product number and serial number, and 32 chars for product name Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Kent Russell	358e00e0ad	drm/amdgpu: Expose TA FW version in fw_version file Reporting the fw_version just returns 0, the actual version is kept as ta_*_ucode_version. This is the same as the feature reported in the amdgpu_firmware_info debugfs file. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
John Clements	fabe01d7bb	drm/amdgpu: disabled fru eeprom access added asic support checking function to be filled in by supported asic types Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Kent Russell	bd607166af	drm/amdgpu: Enable reading FRU chip via I2C v3 Allow for reading of information like manufacturer, product number and serial number from the FRU chip. Report the serial number as the new sysfs file serial_number. Note that this only works on server cards, as consumer cards do not feature the FRU chip, which contains this information. v2: Add documentation to amdgpu.rst, add helper functions, rename functions for consistency, fix bad starting offset v3: Remove testing definitions Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:41 -04:00
Christian König	8523f8875b	drm/amdgpu: improve amdgpu_gem_info debugfs file Note if a buffer was imported using peer2peer. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/359296	2020-04-01 09:02:45 +02:00
Christian König	f44ffd677f	drm/amdgpu: add support for exporting VRAM using DMA-buf v3 We should be able to do this now after checking all the prerequisites. v2: fix entrie count in the sgt v3: manually construct the sg Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/359295	2020-04-01 09:02:45 +02:00
Christian König	48262cd949	drm/amdgpu: add checks if DMA-buf P2P is supported Check if we can do peer2peer on the PCIe bus. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/359294	2020-04-01 09:02:45 +02:00
Christian König	57b7b62f5a	drm/amdgpu: note that we can handle peer2peer DMA-buf Importing should work out of the box. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/359293	2020-04-01 09:02:45 +02:00
Kevin Wang	987ed8e938	drm/amdgpu: fix hpd bo size calculation error the HPD bo size calculation error. the "mem.size" can't present actual BO size all time. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <Christian.Koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-31 12:26:14 -04:00
Dave Airlie	5fc0df93fc	Linux 5.6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl6BIG4eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGlHUH/RCFve2sfHRPjRW+ xR5SaLVAw6XKvtKBq7yvKmHEwqNJnL79IHyqqtSrtfFr2FfaH/KvYiCbbAezvSrM np0udGu7STKGd21CWuyEZJudyhXkOwMRNiFiCXWp7rs35oh8T0TpJxMzo2Nc1nLk JFQPqAP6OSvq4IkWEywKQI+Au3Z1IBf83xVjZ1s+MKPQHYD49x2hc4cQntL5/cnm a3DoR2iBkYiGZCZ9dDqAqJTnMQIiCbACdZXgGjNRUpdyA/dtAjsMl11NPYHm8TA2 3AHBupAK50WBZGad6xv2qKQyScsmoJG2mv92QjlOFz0Tpiu6rLnDlLYREDVB6YH6 qbLDsc8= =XEIU -----END PGP SIGNATURE----- Merge v5.6 into drm-next msm needed rc6, so I just went and merged release (msm has been in drm-next outside of this tree) Signed-off-by: Dave Airlie <airlied@redhat.com>	2020-03-31 15:15:47 +10:00
Mikel Rychliski	72e0ef0e5f	PCI: Use ioremap(), not phys_to_virt() for platform ROM On some EFI systems, the video BIOS is provided by the EFI firmware. The boot stub code stores the physical address of the ROM image in pdev->rom. Currently we attempt to access this pointer using phys_to_virt(), which doesn't work with CONFIG_HIGHMEM. On these systems, attempting to load the radeon module on a x86_32 kernel can result in the following: BUG: unable to handle page fault for address: 3e8ed03c #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page *pde = 00000000 Oops: 0000 [#1] PREEMPT SMP CPU: 0 PID: 317 Comm: systemd-udevd Not tainted 5.6.0-rc3-next-20200228 #2 Hardware name: Apple Computer, Inc. MacPro1,1/Mac-F4208DC8, BIOS MP11.88Z.005C.B08.0707021221 07/02/07 EIP: radeon_get_bios+0x5ed/0xe50 [radeon] Code: 00 00 84 c0 0f 85 12 fd ff ff c7 87 64 01 00 00 00 00 00 00 8b 47 08 8b 55 b0 e8 1e 83 e1 d6 85 c0 74 1a 8b 55 c0 85 d2 74 13 <80> 38 55 75 0e 80 78 01 aa 0f 84 a4 03 00 00 8d 74 26 00 68 dc 06 EAX: 3e8ed03c EBX: 00000000 ECX: 3e8ed03c EDX: 00010000 ESI: 00040000 EDI: eec04000 EBP: eef3fc60 ESP: eef3fbe0 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010206 CR0: 80050033 CR2: 3e8ed03c CR3: 2ec77000 CR4: 000006d0 Call Trace: r520_init+0x26/0x240 [radeon] radeon_device_init+0x533/0xa50 [radeon] radeon_driver_load_kms+0x80/0x220 [radeon] drm_dev_register+0xa7/0x180 [drm] radeon_pci_probe+0x10f/0x1a0 [radeon] pci_device_probe+0xd4/0x140 Fix the issue by updating all drivers which can access a platform provided ROM. Instead of calling the helper function pci_platform_rom() which uses phys_to_virt(), call ioremap() directly on the pdev->rom. radeon_read_platform_bios() previously directly accessed an __iomem pointer. Avoid this by calling memcpy_fromio() instead of kmemdup(). pci_platform_rom() now has no remaining callers, so remove it. Link: https://lore.kernel.org/r/20200319021623.5426-1-mikel@mikelr.com Signed-off-by: Mikel Rychliski <mikel@mikelr.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-30 09:52:23 -05:00
Wolfram Sang	0bf6595049	drm/amdgpu: convert to use i2c_new_client_device() Move away from the deprecated API. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200326211005.13301-2-wsa+renesas@sang-engineering.com	2020-03-28 22:47:22 +01:00
Jason Gunthorpe	6bfef2f919	mm/hmm: remove HMM_FAULT_SNAPSHOT Now that flags are handled on a fine-grained per-page basis this global flag is redundant and has a confusing overlap with the pfn_flags_mask and default_flags. Normalize the HMM_FAULT_SNAPSHOT behavior into one place. Callers needing the SNAPSHOT behavior should set a pfn_flags_mask and default_flags that always results in a cleared HMM_PFN_VALID. Then no pages will be faulted, and HMM_FAULT_SNAPSHOT is not a special flow that overrides the masking mechanism. As this is the last flag, also remove the flags argument. If future flags are needed they can be part of the struct hmm_range function arguments. Link: https://lore.kernel.org/r/20200327200021.29372-5-jgg@ziepe.ca Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 20:19:24 -03:00
Dave Airlie	5117c363eb	drm-misc-fixes for v5.6: - SG fixes for prime, radeon and amdgpu. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl58tmgACgkQ/lWMcqZw E8NIyBAAoKOKyKwTgzTF+AQvkFsAkRcE5wi7lrJTdbS26susSTstDFOmouHcgfdi gxRHwq2BHUfPDV2qUTp9v5dkQDvH9ZrDVaeuT0vK9f9PT+5EEpiO4mlmpq5o2CpH 5Pn1NAd58vP0BZmLl9MOrJQMKRC7Y7rSAaNXjQu9KfWV1CtqPu/OBm2cbRZC/7Rs nRcWRV2H+MSPzzSeGCA8MpcPvQiVEGtGjm3TFtH5Q4EFuE0ILIvf/cqWGLtiIkh2 QRyE/+bLokuzZc2XerQPf5zxQDCqXc1NPCWXwlAKUUkcIDF3lQ5ewxW6MZ8AExqx Sn84+5z/BMlIqjHptODeZaWXLXgUnt7G0iE5aKVlQ14yKgJOejtq2N05XmzhEcLS H5WiLW9qIdCKH7C8joZFtb6LAPEq48ubJgYO77G02JSYO/UnB7qBnxTgyEL4Sl2O OskTdFTNG4ayVCJkFEgZpU0Xb41H/wIwB1HPcD0QSkHPGmGamIBoy7IoEpfmyWZF vN/Ucw0INJMORzr+/sqNSHPnzNhT1MRorVdWMgk/5zcUWn/KD+pQfQrE72UXQtAy +Q84lkjhCTGOAVINZGbuC3CkfTdNqqrHTM+IqHBGU6oZ75HUb0N4VfeLQeESBoK6 kFEQYtB6EL6GMt7d6Pj+qTFXShh1pdWDIqKW2Kswz5nTGqwWgFo= =wtId -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2020-03-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v5.6: - SG fixes for prime, radeon and amdgpu. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ef10e822-76dd-125d-ec1f-9a78c5f76bc3@linux.intel.com	2020-03-27 12:33:23 +10:00
Christoph Hellwig	17ffdc4829	mm: simplify device private page handling in hmm_range_fault Remove the HMM_PFN_DEVICE_PRIVATE flag, no driver has ever set this flag on input, and the only place that uses it on output can be trivially changed to use is_device_private_page(). This removes the ability to request that device_private pages are faulted back into system memory. Link: https://lore.kernel.org/r/20200316193216.920734-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 14:33:38 -03:00
Monk Liu	e862b08b46	drm/amdgpu: don't try to reserve training bo for sriov (v2) 1) SRIOV guest KMD doesn't care training buffer 2) if we resered training buffer that will overlap with IP discovery reservation because training buffer is at vram_size - 0x8000 and IP discovery is at ()vram_size - 0x10000 => vram_size -1) v2: squash in warning fix from Nirmoy Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-25 17:04:35 -04:00
Alex Deucher	9644bf5f4a	drm/amdgpu/swSMU: handle manual AC/DC notifications For boards that do not support automatic AC/DC transitions in firmware, manually tell the firmware when the status changes. Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-25 17:00:11 -04:00
Dennis Li	10cda519ef	drm/amdgpu: fix the coverage issue to clear ArcVPGRs Set ComputePGMRSRC1.VGPRS as 0x3f to clear all ArcVGPRs. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-25 17:00:11 -04:00
Yassine Oudjana	c7e5587964	drm/[radeon\|amdgpu]: Remove HAINAN board from max_sclk override check Works stable without the overrides. Signed-off-by: Yassine Oudjana <y.oudjana@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-25 16:58:40 -04:00
Zhigang Luo	728b3d0533	Revert "drm/amdgpu: add CAP fw loading" This reverts commit `29e2501f8a`. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-25 16:58:40 -04:00
Shane Francis	0199172f93	drm/amdgpu: fix scatter-gather mapping with user pages Calls to dma_map_sg may return less segments / entries than requested if they fall on page bounderies. The old implementation did not support this use case. Fixes: `be62dbf554` ("iommu/amd: Convert AMD iommu driver to the dma-iommu api") Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206461 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206895 Bug: https://gitlab.freedesktop.org/drm/amd/issues/1056 Signed-off-by: Shane Francis <bigbeeshane@gmail.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200325090741.21957-3-bigbeeshane@gmail.com Cc: stable@vger.kernel.org	2020-03-25 12:10:40 -04:00
shaoyunl	02be064823	drm/amdgpu/sriov : Don't resume RLCG for SRIOV guest RLCG is enabled by host driver, no need to enable it in guest for none-PSP load path Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-20 10:45:00 -04:00
John Clements	43c4d57618	drm/amdgpu: protect RAS sysfs during GPU reset MMHub EDC becomes dirty after BACO reset EDC registers should be cleared early on in reset phase Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-20 10:45:00 -04:00
Colin Ian King	8cd296082c	drm: amd: fix spelling mistake "shoudn't" -> "shouldn't" There are spelling mistakes in pr_err messages and a comment. Fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-19 00:03:05 -04:00
Nathan Chancellor	931971280c	drm/amdgpu: Remove unnecessary variable shadow in gfx_v9_0_rlcg_wreg clang warns: drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:754:6: warning: variable 'shadow' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (offset == grbm_cntl \|\| offset == grbm_idx) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:757:6: note: uninitialized use occurs here if (shadow) { ^~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:754:2: note: remove the 'if' if its condition is always true if (offset == grbm_cntl \|\| offset == grbm_idx) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:738:13: note: initialize the variable 'shadow' to silence this warning bool shadow; ^ = 0 1 warning generated. shadow is only assigned in one condition and used as the condition for another if statement; combine the two if statements and remove shadow to make the code cleaner and resolve this warning. Fixes: `2e0cc4d48b` ("drm/amdgpu: revise RLCG access path") Link: https://github.com/ClangBuiltLinux/linux/issues/936 Suggested-by: Joe Perches <joe@perches.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-19 00:03:05 -04:00
James Zhu	6c1cb08e3a	drm/amdgpu: fix typo for vcn2.5/jpeg2.5 idle check fix typo for vcn2.5/jpeg2.5 idle check Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-19 00:03:05 -04:00
James Zhu	23edf7f1a8	drm/amdgpu: fix typo for vcn2/jpeg2 idle check fix typo for vcn2/jpeg2 idle check Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-19 00:03:05 -04:00
James Zhu	5e31fa6821	drm/amdgpu: fix typo for vcn1 idle check fix typo for vcn1 idle check Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-19 00:03:05 -04:00
Zhigang Luo	29e2501f8a	drm/amdgpu: add CAP fw loading The CAP fw is for enabling driver compatibility. Currently, it only enabled for vega10 VF. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-19 00:03:05 -04:00
Yintian Tao	31d0271d45	drm/amdgpu: miss PRT case when bo update Originally, only the PTE valid is taken in consider. The PRT case is missied when bo update which raise problem. We need add condition for PRT case. v2: add PRT condition for amdgpu_vm_bo_update_mapping, too v3: fix one typo error Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-19 00:03:04 -04:00
James Zhu	a3c33e7a4a	drm/amdgpu: fix typo for vcn2.5/jpeg2.5 idle check fix typo for vcn2.5/jpeg2.5 idle check Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-18 18:21:57 -04:00
James Zhu	b5689d22aa	drm/amdgpu: fix typo for vcn2/jpeg2 idle check fix typo for vcn2/jpeg2 idle check Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-18 18:21:45 -04:00
James Zhu	acfc62dc68	drm/amdgpu: fix typo for vcn1 idle check fix typo for vcn1 idle check Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-18 18:21:18 -04:00
Wang Xiayang	aad7012c31	drm/amdgpu: fix two documentation mismatch issues The function name mentioned in the documentational comments mismatches the actual one. The mismatch may make trouble for automatic documentation generation. One of the erronous name has seen to be misused and fixed in commit `bc5ab2d29b` ("drm/amdgpu: fix typo in function sdma_v4_0_page_resume"). There is apparently no functional change in the patch. Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2020-03-17 19:02:28 +01:00
Nirmoy Das	4ff7d8ba4c	drm/amdgpu: disable gpu_sched load balancer for vcn jobs VCN HW doesn't support dynamic load balance on multiple instances for a context. This patch initializes VNC entities with only one drm_gpu_scheduler picked by drm_sched_pick_best(). Picking a drm_gpu_scheduler using drm_sched_pick_best() ensures that we do load balance among multiple contexts but not among multiple jobs in a context. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-16 16:21:32 -04:00
Andrey Grodzovsky	9015d60c9e	drm/amdgpu: Move EEPROM I2C adapter to amdgpu_device Puts the i2c adapter in common place for sharing by RAS and upcoming data read from FRU EEPROM feature. v2: Move i2c adapter to amdgpu_pm and rename it. v3: Move i2c adapter init to ASIC specific code and get rid of the switch case in amdgpu_device Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-16 16:21:32 -04:00
xinhui pan	57210c19e4	drm_amdgpu: Add job fence to resv conditionally Job fence on page table should be a shared one, so add it to the root page talbe bo resv. last_delayed field is not needed anymore. so remove it. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-16 16:21:32 -04:00
Nirmoy Das	79cb2719be	drm/amdgpu: fix switch-case indentation Fix switch-case indentation in amdgpu_ctx_init_entity() Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-16 16:18:14 -04:00
Monk Liu	2e0cc4d48b	drm/amdgpu: revise RLCG access path what changed: 1)provide new implementation interface for the rlcg access path 2)put SQ_CMD/SQ_IND_INDEX to GFX9 RLCG path to let debugfs's reg_op function can access reg that need RLCG path help now even debugfs's reg_op can used to dump wave. tested-by: Monk Liu <monk.liu@amd.com> tested-by: Zhou pengju <pengju.zhou@amd.com> Signed-off-by: Zhou pengju <pengju.zhou@amd.com> Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-16 16:17:55 -04:00
Dennis Li	93cdb48eca	drm/amdgpu: add codes to clear AccVGPR for arcturus AccVGPRs are newly added in arcturus. Before reading these registers, they should be initialized. Otherwise edc error happens, when RAS is enabled. v2: reuse the existing logical to calculate register size Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:36 -04:00
Joe Perches	2541f95c17	AMD KFD: Use fallthrough; Convert the various uses of fallthrough comments to fallthrough; Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/ Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:35 -04:00
Stanley.Yang	c1509f3f6f	drm/amdgpu: fix warning in ras_debugfs_create_all() Fix the warning "warn: variable dereferenced before check 'obj' (see line 1131)" by removing unnecessary checks as amdgpu_ras_debugfs_create_all() is only called from amdgpu_debugfs_init() where obj member in con->head list is not NULL. Use list_for_each_entry() instead list_for_each_entry_safe() as obj do not to be freeing or removing from list during this process. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:34 -04:00
Evan Quan	565d194155	drm/amdgpu: add fbdev suspend/resume on gpu reset This can fix the baco reset failure seen on Navi10. And this should be a low risk fix as the same sequence is already used for system suspend/resume. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:34 -04:00
Guchun Chen	88474ccad5	drm/amdgpu: update ras capability's query based on mem ecc configuration RAS support capability needs to be updated on top of different memeory ECC enablement, and remove redundant memory ecc check in gmc module for vega20 and arcturus. v2: check HBM ECC enablement and set ras mask accordingly. v3: avoid to invoke atomfirmware interface to query twice. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:34 -04:00
Tom St Denis	6397ec580d	drm/amd/amdgpu: Fix GPR read from debugfs (v2) The offset into the array was specified in bytes but should be in terms of 32-bit words. Also prevent large reads that would also cause a buffer overread. v2: Read from correct offset from internal storage buffer. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:34 -04:00
Stanley.Yang	17cb04f2a6	drm/amdgpu: use amdgpu_ras.h in amdgpu_debugfs.c include amdgpu_ras.h head file instead of use extern ras_debugfs_create_all function Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:34 -04:00
Hawking Zhang	06dcd7eb83	drm/amdgpu: check GFX RAS capability before reset counters disallow the logical to be enabled on platforms that don't support gfx ras at this stage, like sriov skus, dgpu with legacy ras.etc Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:33 -04:00
John Clements	c2c6f816a8	drm/amdgpu: resolve failed error inject msg invoking an error injection successfully will cause an at_event intterrupt that will occur before the invoke sequence can complete causing an invalid error Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:33 -04:00
Jack Zhang	5f87611582	drm/amdgpu/sriov refine vcn_v2_5_early_init func refine the assignment for vcn.num_vcn_inst, vcn.harvest_config, vcn.num_enc_rings in VF Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 11:52:33 -04:00
Evan Quan	063e768ebd	drm/amdgpu: add fbdev suspend/resume on gpu reset This can fix the baco reset failure seen on Navi10. And this should be a low risk fix as the same sequence is already used for system suspend/resume. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-13 09:20:31 -04:00
Tom St Denis	5bbc6604a6	drm/amd/amdgpu: Fix GPR read from debugfs (v2) The offset into the array was specified in bytes but should be in terms of 32-bit words. Also prevent large reads that would also cause a buffer overread. v2: Read from correct offset from internal storage buffer. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-03-13 09:20:31 -04:00
Dave Airlie	69ddce0970	Merge tag 'amd-drm-next-5.7-2020-03-10' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.7-2020-03-10: amdgpu: - SR-IOV fixes - Fix up fallout from drm load/unload callback removal - Navi, renoir power management watermark fixes - Refactor smu parameter handling - Display FEC fixes - Display DCC fixes - HDCP fixes - Add support for USB-C PD firmware updates - Pollock detection fix - Rework compute ring priority handling - RAS fixes - Misc cleanups amdkfd: - Consolidate more gfx config details in amdgpu - Consolidate bo alloc flags - Improve code comments - SDMA MQD fixes - Misc cleanups gpu scheduler: - Add suport for modifying the sched list uapi: - Clarify comments about GEM_CREATE flags that are not used by userspace. The kernel driver has always prevented userspace from using these. They are only used internally in the kernel driver. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200310212748.4519-1-alexander.deucher@amd.com	2020-03-13 09:09:11 +10:00
Dave Airlie	9e12da086e	drm-misc-next for 5.7: UAPI Changes: Cross-subsystem Changes: Core Changes: Driver Changes: - fb-helper: Remove drm_fb_helper_{add,add_all,remove}_one_connector - fbdev: some cleanups and dead-code removal - Conversions to simple-encoder - zero-length array removal - Panel: panel-dpi support in panel-simple, Novatek NT35510, Elida KD35T133, -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXmZKhwAKCRDj7w1vZxhR xUgxAQDB1kkf1xQdU7rdw344vaaMf270qBeG+GNX/py3h9pbnwEA7XQvbB1wWBec hR629PO+csE0dWcFkGi8d5kpdWQCOQY= =PRn3 -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2020-03-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.7: UAPI Changes: Cross-subsystem Changes: Core Changes: Driver Changes: - fb-helper: Remove drm_fb_helper_{add,add_all,remove}_one_connector - fbdev: some cleanups and dead-code removal - Conversions to simple-encoder - zero-length array removal - Panel: panel-dpi support in panel-simple, Novatek NT35510, Elida KD35T133, Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200309135439.dicfnbo4ikj4tkz7@gilmour	2020-03-12 12:42:56 +10:00
Dave Airlie	d3bd37f587	Linux 5.6-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5lkYceHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGpHQH/RJrzcaZHo4lw88m Jf7vBZ9DYUlRgqE0pxTHWmodNObKRqpwOUGflUcWbb/7GD2LQUfeqhSECVQyTID9 N9y7FcPvx321Qhc3EkZ24DBYk0+DQ0K2FVUrSa/PxO0n7czxxXWaLRDmlSULEd3R D4pVs3zEWOBXJHUAvUQ5R+lKfkeWKNeeepeh+rezuhpdWFBRNz4Jjr5QUJ8od5xI sIwobYmESJqTRVBHqW8g2T2/yIsFJ78GCXs8DZLe1wxh40UbxdYDTA0NDDTHKzK6 lxzBgcmKzuge+1OVmzxLouNWMnPcjFlVgXWVerpSy3/SIFFkzzUWeMbqm6hKuhOn wAlcIgI= =VQUc -----END PGP SIGNATURE----- Merge v5.6-rc5 into drm-next Requested my mripard for some misc patches that need this as a base. Signed-off-by: Dave Airlie <airlied@redhat.com>	2020-03-11 07:27:21 +10:00
Feifei Xu	5d11e37c02	drm/amdgpu/runpm: disable runpm on Vega10 Some framework test will fail if enable runpm on Vega10. Disable it untill issue fixed. Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Tested-by: Kyle Chen <Kyle.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-10 15:55:18 -04:00
Tao Zhou	204eaac625	drm/amdgpu: call ras_debugfs_create_all in debugfs_init and remove each ras IP's own debugfs creation this is required to fix ras when the driver does not use the drm load and unload callbacks due to ordering issues with the drm device node. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-10 15:55:11 -04:00
Tao Zhou	f9317014ea	drm/amdgpu: add function to creat all ras debugfs node centralize all debugfs creation in one place for ras this is required to fix ras when the driver does not use the drm load and unload callbacks due to ordering issues with the drm device node. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-10 15:55:02 -04:00
xinhui pan	9fe58d0bbd	drm/amdgpu: Correct the condition of warning while bo release Only kernel bo has kfd eviction fence. This warning is to give a notice that kfd only remove eviction fence on individual bos. Tested-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-10 15:54:42 -04:00
Yong Zhao	1d251d9008	drm/amdkfd: Consolidate duplicated bo alloc flags ALLOC_MEM_FLAGS_* used are the same as the KFD_IOC_ALLOC_MEM_FLAGS_, but they are interweavedly used in kernel driver, resulting in bad readability. For example, KFD_IOC_ALLOC_MEM_FLAGS_COHERENT is not referenced in kernel, and it functions implicitly in kernel through ALLOC_MEM_FLAGS_COHERENT, causing unnecessary confusion. Replace all occurrences of ALLOC_MEM_FLAGS_ with KFD_IOC_ALLOC_MEM_FLAGS_* to solve the problem. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-10 15:54:34 -04:00
Nirmoy Das	ea29221d1d	drm/amdgpu: do not set nil entry in compute_prio_sched If there are no high priority compute queues available then set normal priority sched array to compute_prio_sched[AMDGPU_GFX_PIPE_PRIO_HIGH] Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-10 15:54:07 -04:00
Hawking Zhang	f1c2cd3f8f	drm/amdgpu: correct ROM_INDEX/DATA offset for VEGA20 The ROMC_INDEX/DATA offset was changed to e4/e5 since from smuio_v11 (vega20/arcturus). Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Tested-by: Candice Li <Candice.Li@amd.com> Reviewed-by: Candice Li <Candice.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-09 16:42:28 -04:00
Nirmoy Das	552b80d740	drm/amdgpu: remove unused functions AMDGPU statically sets priority for compute queues at initialization so remove all the functions responsible for changing compute queue priority dynamically. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-09 13:51:48 -04:00
Nirmoy Das	2316a86bde	drm/amdgpu: change hw sched list on ctx priority override Switch to appropriate sched list for an entity on priority override. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-09 13:51:42 -04:00
Nirmoy Das	33abcb1f5a	drm/amdgpu: set compute queue priority at mqd_init We were changing compute ring priority while rings were being used before every job submission which is not recommended. This patch sets compute queue priority at mqd initialization for gfx8, gfx9 and gfx10. Policy: make queue 0 of each pipe as high priority compute queue High/normal priority compute sched lists are generated from set of high/normal priority compute queues. At context creation, entity of compute queue get a sched list from high or normal priority depending on ctx->priority Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-09 13:51:24 -04:00
Andrey Grodzovsky	97f6a21bfa	drm/amdgpu: Enter low power state if CRTC active. CRTC in DPMS state off calls for low power state entry. Support both atomic mode setting and pre-atomic mode setting. v2: move comment Acked-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-09 13:50:52 -04:00
Monk Liu	cc9f2fba37	drm/amdgpu: disable clock/power gating for SRIOV and disable MC resum in VCN2.0 as well those are not concerned by VF driver Singed-off-by: darlington Opara <darlington.opara@amd.com> Signed-off-by: Jinage Zhao <jiange.zhao@amd.com> Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:40:30 -05:00
Monk Liu	68430c6be5	drm/amdgpu: cleanup ring/ib test for SRIOV vcn2.0 (v2) support IB test on dec/enc ring disable ring test on dec/enc ring (MMSCH limitation) v2: squash in unused variable warning fix Singed-off-by: darlington Opara <darlington.opara@amd.com> Signed-off-by: Jinage Zhao <jiange.zhao@amd.com> Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:40:30 -05:00
Monk Liu	dd26858a9c	drm/amdgpu: implement initialization part on VCN2.0 for SRIOV something need to do for VCN2.0 enablement on SRIOV: 1)use one dec ring and one enc ring 2)allocate MM table for MMSCH usage 3)implement SRIOV version vcn_start which orgnize vcn programing with patcket format and implement start mmsch for to run those packet 4)doorbell is changed for SRIOV Singed-off-by: darlington Opara <darlington.opara@amd.com> Signed-off-by: Jinage Zhao <jiange.zhao@amd.com> Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:34:56 -05:00
Monk Liu	fe44249186	drm/amdgpu: disable jpeg block for SRIOV MMSCH doesn't support jpeg ring on SRIOV Signed-off-by: Jinage Zhao <jiange.zhao@amd.com> Singed-off-by: darlington Opara <darlington.opara@amd.com> Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:34:49 -05:00
Monk Liu	3569b6d19e	drm/amdgpu: introduce mmsch v2.0 header Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:34:42 -05:00
Yong Zhao	fa5bde8056	drm/amdgpu: Use better names to reflect it is CP MQD buffer Add "CP" to AMDGPU_GEM_CREATE_MQD_GFX9 to indicate it is only for CP MQD buffer. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:34:18 -05:00
Andrey Grodzovsky	90f88cdd7c	drm/amdgpu: Fix GPU reset error. Problem: During GU reset PSP's sysfs was being wrongly reinitilized during call to amdgpu_device_ip_late_init which was failing with duplicate error. Fix: Move psp_sysfs_init to psp_sw_init to avoid this. Add guards in sysfs file's read and write hook agains premature call if PSP is not finished initialization. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:32:24 -05:00
Jacob He	5e208eb62b	drm/amdgpu: Update SPM_VMID with the job's vmid when application reserves the vmid SPM access the video memory according to SPM_VMID. It should be updated with the job's vmid right before the job is scheduled. SPM_VMID is a global resource Signed-off-by: Jacob He <jacob.he@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:32:16 -05:00
John Clements	1a2172b5ee	drm/amdgpu: update page retirement sequence check UMC status and exit prior to making and erroneus register access this resolved unexpected behaviour with UMC indexing mode broadcasting writes Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:32:06 -05:00
Guchun Chen	d38c3ac716	drm/amdgpu: toggle DF-Cstate when accessing UMC ras error related registers On arcturus, DF-Cstate needs to be toggled off/on before and after accessing UMC error counter and error address registers, otherwise, clearing such registers may fail. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:31:59 -05:00
John Clements	1b3460a8b1	drm/amdgpu: increase atombios cmd timeout mitigates race condition on BACO reset between GPU bootcode and driver reload Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:31:51 -05:00
Hawking Zhang	a61f41b177	drm/amdgpu: enable PCS error report on arcturus add arcturus xgmi/wafl pcs err status group to support PCS error detection and report on arcturus Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:31:43 -05:00
Hawking Zhang	ec01fe2dbf	drm/amdgpu: enable PCS error report on VG20 Now driver will report XGMI/WAFL PCS error through sysfs xgmi_wafl_err_count node on Vega20 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:31:35 -05:00
Hawking Zhang	18f36157f2	drm/amdgpu: add helper funcs to detect PCS error Since from vega20, hardware supports run-time detect and report XGMI/WAFL PCS ras error. Add helper functions to walkthrough every type of ras error and report it if any. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-06 14:31:28 -05:00
Pankaj Bharadiya	ff1f62d35b	drm: Remove drm_fb_helper add, add all and remove connector calls drm_fb_helper_{add,remove}_one_connector() and drm_fb_helper_single_add_all_connectors() are dummy functions now and serve no purpose. Hence remove their calls. This is the preparatory step for removing the drm_fb_helper_{add,remove}_one_connector() functions from drm_fb_helper.h This removal is done using below sementic patch and unused variable compilation warnings are fixed manually. @@ @@ - drm_fb_helper_single_add_all_connectors(...); @@ expression e1; statement S; @@ - e1 = drm_fb_helper_single_add_all_connectors(...); - S @@ @@ - drm_fb_helper_add_one_connector(...); @@ @@ - drm_fb_helper_remove_one_connector(...); Changes since v1: * Squashed warning fixes into the patch that introduced the warnings (into 5/7) (Laurent, Emil, Lyude) Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200305120434.111091-6-pankaj.laxminarayan.bharadiya@intel.com	2020-03-06 14:19:58 +01:00
Pankaj Bharadiya	2dea2d1182	drm: Remove unused arg from drm_fb_helper_init The max connector argument for drm_fb_helper_init() isn't used anymore hence remove it. All the drm_fb_helper_init() calls are modified with below sementic patch. @@ expression E1, E2, E3; @@ - drm_fb_helper_init(E1,E2, E3) + drm_fb_helper_init(E1,E2) Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200305120434.111091-2-pankaj.laxminarayan.bharadiya@intel.com	2020-03-06 14:19:57 +01:00
Tianci.Yin	194bcf35bc	drm/amdgpu: disable 3D pipe 1 on Navi1x [why] CP firmware decide to skip setting the state for 3D pipe 1 for Navi1x as there is no use case. [how] Disable 3D pipe 1 on Navi1x. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-03-05 09:41:55 -05:00
Yintian Tao	2ab7e274b8	drm/amdgpu: clean wptr on wb when gpu recovery The TDR will be randomly failed due to compute ring test failure. If the compute ring wptr & 0x7ff(ring_buf_mask) is 0x100 then after map mqd the compute ring rptr will be synced with 0x100. And the ring test packet size is also 0x100. Then after invocation of amdgpu_ring_commit, the cp will not really handle the packet on the ring buffer because rptr is equal to wptr. Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:50:07 -05:00
Andrey Grodzovsky	6863d60732	drm/amdgpu: Wrap clflush_cache_range with x86 ifdef To avoid compile errors on other platforms. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:33:30 -05:00
Andrey Grodzovsky	57430471e2	drm/amdgpu: Add support for USBC PD FW download Starts USBC PD FW download and reads back the latest FW version. v2: Move sysfs file creation to late init Add locking around PSP calls to avoid concurrent access to PSP's C2P registers Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:33:24 -05:00
Andrey Grodzovsky	0dc93fd117	drm/amdgpu: Add USBC PD FW load to PSP 11 Add the programming sequence. v2: Change donwload wait loop to more efficient. Move C2PMSG_CMD_GFX_USB_PD_FW_VER defintion v3: Fix lack of loop counter increment typo v4: Remove superflous status reg read Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:33:16 -05:00
Andrey Grodzovsky	95860efc44	drm/amdgpu: Add USBC PD FW load interface to PSP. Used to load power Delivery FW to PSP. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:33:09 -05:00
Hawking Zhang	1a0dd3d928	drm/amdgpu: correct ROM_INDEX/DATA offset for VEGA20 The ROMC_INDEX/DATA offset was changed to e4/e5 since from smuio_v11 (vega20/arcturus). Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Tested-by: Candice Li <Candice.Li@amd.com> Reviewed-by: Candice Li <Candice.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:33:01 -05:00
Hawking Zhang	4a89ad9b39	drm/amdgpu: add reset_ras_error_count function for HDP HDP ras error counters are dirty ones after cold reboot Read operation is needed to reset them to 0 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:32:54 -05:00
Hawking Zhang	279375c331	drm/amdgpu: add reset_ras_error_count function for GFX GFX ras error counters are dirty ones after cold reboot Read operation is needed to reset them to 0 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:32:47 -05:00
Hawking Zhang	fe5211f19a	drm/amdgpu: add reset_ras_error_count function for MMHUB MMHUB ras error counters are dirty ones after cold reboot Read operation is needed to reset them to 0 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:32:40 -05:00
Hawking Zhang	86153f1be2	drm/amdgpu: add reset_ras_error_count function for SDMA SDMA ras error counters are dirty ones after cold reboot Read operation is needed to reset them to 0 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:32:32 -05:00
jianzh	e7429606bb	drm/amdgpu/sriov: Use VF-accessible register for gpu_clock_count Navi12 VK CTS subtest timestamp.calibrated.dev_domain_test failed because mmRLC_CAPTURE_GPU_CLOCK_COUNT register cannot be written in VF due to security policy. Solution: use a VF-accessible timestamp register pair mmGOLDEN_TSC_COUNT_LOWER/UPPER for SRIOV case. v2: according to Deucher Alexander's advice, switch to mmGOLDEN_TSC_COUNT_LOWER/UPPER for both bare metal and SRIOV. Signed-off-by: jianzh <Jiange.Zhao@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:32:23 -05:00
Tiecheng Zhou	8a43cf88b7	drm/amdgpu/sriov: skip programing some regs with new L1 policy With new L1 policy, some regs are blocked at guest and they are programed at host side. So skip programing the regs under sriov. the regs are: GCMC_VM_FB_LOCATION_TOP GCMC_VM_FB_LOCATION_BASE MMMC_VM_FB_LOCATION_TOP MMMC_VM_FB_LOCATION_BASE GCMC_VM_SYSTEM_APERTURE_HIGH_ADDR GCMC_VM_SYSTEM_APERTURE_LOW_ADDR MMMC_VM_SYSTEM_APERTURE_HIGH_ADDR MMMC_VM_SYSTEM_APERTURE_LOW_ADDR HDP_NONSURFACE_BASE HDP_NONSURFACE_BASE_HI GCMC_VM_AGP_TOP GCMC_VM_AGP_BOT GCMC_VM_AGP_BASE Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:31:54 -05:00
Samir Dhume	022b651816	drm/amdgpu: Rearm IRQ in Navi10 SR-IOV if IRQ lost Ported from Vega10. SDMA stress tests sometimes see IRQ lost. Signed-off-by: Samir Dhume <samir.dhume@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:28:34 -05:00
Monk Liu	341dfe9073	drm/amdgpu: stop using sratch_reg in IB test scratch_reg0 is used by RLCG for register access usage in SRIOV case. both CP firmware and driver can invoke RLCG to do certain register access (through scratch_reg0/1/2/3) but rlcg now dosen't have race concern so if two clients are in parallel doing the RLCG reg access then we are colliding, GFX IB test is a runtime work, so it is forbidden to use scrach_reg0/1/2/3 during IB test period note: Although we can only have this change for SRIOV, but looks it doesn't worth the effort to differentiate bare-metal with SRIOV on the GFX ib test Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:28:25 -05:00
Monk Liu	752c683dbb	drm/amdgpu: fix IB test MCBP bug 1)for gfx IB test we shouldn't insert DE meta data 2)we should make sure IB test finished before we send event 3 to hypervisor otherwise the IDLE from event 3 will preempt IB test, which is not designed as a compatible structure for MCBP Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:28:11 -05:00
Tianci.Yin	f091c1c70e	drm/amdgpu: disable 3D pipe 1 on Navi1x [why] CP firmware decide to skip setting the state for 3D pipe 1 for Navi1x as there is no use case. [how] Disable 3D pipe 1 on Navi1x. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:28:00 -05:00
Chengming Gui	0cf64555fe	drm/amdgpu: Add debugfs interface to set arbitrary sclk for navi14 (v2) add debugfs interface amdgpu_force_sclk to set arbitrary sclk for navi14 v2: Add lock Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:27:50 -05:00
Rohit Khaire	1da7d4a8ab	drm/amdgpu: Write blocked CP registers using RLC on VF This change programs CP_ME_CNTL and RLC_CSIB_* through RLC Signed-off-by: Rohit Khaire <Rohit.Khaire@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:26:34 -05:00
Yintian Tao	1d21a84661	drm/amdgpu: clean wptr on wb when gpu recovery The TDR will be randomly failed due to compute ring test failure. If the compute ring wptr & 0x7ff(ring_buf_mask) is 0x100 then after map mqd the compute ring rptr will be synced with 0x100. And the ring test packet size is also 0x100. Then after invocation of amdgpu_ring_commit, the cp will not really handle the packet on the ring buffer because rptr is equal to wptr. Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:25:57 -05:00
Maxime Ripard	83794ee6c1	Merge drm/drm-next into drm-misc-next Daniel needs a few commits from drm-next. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2020-03-04 08:56:28 +01:00
Yintian Tao	6c26d558bf	drm/amdgpu: release drm_device after amdgpu_driver_unload_kms If we release drm_device before amdgpu_driver_unload_kms, then it will raise the error below. Therefore, we need to place it before amdgpu_driver_unload_kms. [ 43.055736] Memory manager not clean during takedown. [ 43.055777] WARNING: CPU: 1 PID: 2807 at /build/linux-hwe-9KJ07q/linux-hwe-4.18.0/drivers/gpu/drm/drm_mm.c:913 drm_mm_takedown+0x24/0x30 [drm] [ 43.055778] Modules linked in: amdgpu(OE-) amd_sched(OE) amdttm(OE) amdkcl(OE) amd_iommu_v2 drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic nfit kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm ghash_clmulni_intel snd_seq_midi snd_seq_midi_event pcbc snd_rawmidi snd_seq snd_seq_device aesni_intel snd_timer joydev aes_x86_64 crypto_simd cryptd glue_helper snd soundcore input_leds mac_hid serio_raw qemu_fw_cfg binfmt_misc sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic floppy usbhid psmouse hid i2c_piix4 e1000 pata_acpi [ 43.055819] CPU: 1 PID: 2807 Comm: modprobe Tainted: G OE 4.18.0-15-generic #16~18.04.1-Ubuntu [ 43.055820] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 43.055830] RIP: 0010:drm_mm_takedown+0x24/0x30 [drm] [ 43.055831] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48 83 c7 38 48 39 c7 75 02 f3 c3 55 48 c7 c7 38 33 80 c0 48 89 e5 e8 1c 41 ec d0 <0f> 0b 5d c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 [ 43.055857] RSP: 0018:ffffae33c1393d28 EFLAGS: 00010286 [ 43.055859] RAX: 0000000000000000 RBX: ffff9651b4a29800 RCX: 0000000000000006 [ 43.055860] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff9651bfc964b0 [ 43.055861] RBP: ffffae33c1393d28 R08: 00000000000002a6 R09: 0000000000000004 [ 43.055861] R10: ffffae33c1393d20 R11: 0000000000000001 R12: ffff9651ba6cb000 [ 43.055863] R13: ffff9651b7f40000 R14: ffffffffc0de3a10 R15: ffff9651ba5c6460 [ 43.055864] FS: 00007f1d3c08d540(0000) GS:ffff9651bfc80000(0000) knlGS:0000000000000000 [ 43.055865] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.055866] CR2: 00005630a5831640 CR3: 000000012e274004 CR4: 00000000003606e0 [ 43.055870] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 43.055871] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 43.055871] Call Trace: [ 43.055885] drm_vma_offset_manager_destroy+0x1b/0x30 [drm] [ 43.055894] drm_gem_destroy+0x19/0x40 [drm] [ 43.055903] drm_dev_fini+0x7f/0x90 [drm] [ 43.055911] drm_dev_release+0x2b/0x40 [drm] [ 43.055919] drm_dev_unplug+0x64/0x80 [drm] [ 43.055994] amdgpu_pci_remove+0x39/0x70 [amdgpu] [ 43.055998] pci_device_remove+0x3e/0xc0 [ 43.056001] device_release_driver_internal+0x18a/0x260 [ 43.056003] driver_detach+0x3f/0x80 [ 43.056004] bus_remove_driver+0x59/0xd0 [ 43.056006] driver_unregister+0x2c/0x40 [ 43.056008] pci_unregister_driver+0x22/0xa0 [ 43.056087] amdgpu_exit+0x15/0x57c [amdgpu] [ 43.056090] __x64_sys_delete_module+0x146/0x280 [ 43.056094] do_syscall_64+0x5a/0x120 v2: put drm_dev_put after pci_set_drvdata Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:22 -05:00
Yintian Tao	d2790e10d3	drm/amdgpu: no need to clean debugfs at amdgpu drm_minor_unregister will invoke drm_debugfs_cleanup to clean all the child node under primary minor node. We don't need to invoke amdgpu_debugfs_fini and amdgpu_debugfs_regs_cleanup to clean agian. Otherwise, it will raise the NULL pointer like below. [ 45.046029] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 45.047256] PGD 0 P4D 0 [ 45.047713] Oops: 0002 [#1] SMP PTI [ 45.048198] CPU: 0 PID: 2796 Comm: modprobe Tainted: G W OE 4.18.0-15-generic #16~18.04.1-Ubuntu [ 45.049538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 45.050651] RIP: 0010:down_write+0x1f/0x40 [ 45.051194] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 ce d9 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 85 d2 74 05 e8 53 1c ff ff 65 48 8b 04 25 00 5c 01 [ 45.053702] RSP: 0018:ffffad8f4133fd40 EFLAGS: 00010246 [ 45.054384] RAX: 00000000000000a8 RBX: 00000000000000a8 RCX: ffffa011327dd814 [ 45.055349] RDX: ffffffff00000001 RSI: 0000000000000001 RDI: 00000000000000a8 [ 45.056346] RBP: ffffad8f4133fd48 R08: 0000000000000000 R09: ffffffffc0690a00 [ 45.057326] R10: ffffad8f4133fd58 R11: 0000000000000001 R12: ffffa0113cff0300 [ 45.058266] R13: ffffa0113c0a0000 R14: ffffffffc0c02a10 R15: ffffa0113e5c7860 [ 45.059221] FS: 00007f60d46f9540(0000) GS:ffffa0113fc00000(0000) knlGS:0000000000000000 [ 45.060809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.061826] CR2: 00000000000000a8 CR3: 0000000136250004 CR4: 00000000003606f0 [ 45.062913] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.064404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 45.065897] Call Trace: [ 45.066426] debugfs_remove+0x36/0xa0 [ 45.067131] amdgpu_debugfs_ring_fini+0x15/0x20 [amdgpu] [ 45.068019] amdgpu_debugfs_fini+0x2c/0x50 [amdgpu] [ 45.068756] amdgpu_pci_remove+0x49/0x70 [amdgpu] [ 45.069439] pci_device_remove+0x3e/0xc0 [ 45.070037] device_release_driver_internal+0x18a/0x260 [ 45.070842] driver_detach+0x3f/0x80 [ 45.071325] bus_remove_driver+0x59/0xd0 [ 45.071850] driver_unregister+0x2c/0x40 [ 45.072377] pci_unregister_driver+0x22/0xa0 [ 45.073043] amdgpu_exit+0x15/0x57c [amdgpu] [ 45.073683] __x64_sys_delete_module+0x146/0x280 [ 45.074369] do_syscall_64+0x5a/0x120 [ 45.074916] entry_SYSCALL_64_after_hwframe+0x44/0xa9 v2: remove all debugfs cleanup/fini code at amdgpu v3: squash in unused variable removal Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:22 -05:00
Jacob He	460c484f24	drm/amdgpu: Initialize SPM_VMID with 0xf (v2) SPM_VMID is a global resource, SPM access the video memory according to SPM_VMID. The initial valude of SPM_VMID is 0 which is used by kernel. That means UMD can overwrite the memory of VMID0 by enabling SPM, that is really dangerous. Initialize SPM_VMID with 0xf, it messes up other user mode process at most. v2: squash in indentation fix Signed-off-by: Jacob He <jacob.he@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:21 -05:00
Emily Deng	89510a2737	drm/amdgpu/sriov: Use kiq to copy the gpu clock For vega10 sriov, the register is blocked, use copy data command to fix the issue. v2: Rename amdgpu_kiq_read_clock to gfx_v9_0_kiq_read_clock. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:21 -05:00
Yong Zhao	fd7d08bad7	drm/amdkfd: Make get_tile_config() generic Given we can query all the asic specific information from amdgpu_gfx_config, we can make get_tile_config() generic. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:20 -05:00
Yong Zhao	94b5c215ce	drm/amdgpu: Add num_banks and num_ranks to gfx config structure The two members will be used by KFD later. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-28 16:59:20 -05:00
Dave Airlie	a2ae604da7	Merge tag 'amd-drm-next-5.7-2020-02-26' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.7-2020-02-26: amdgpu: - Rework VM update handling in preparation for HMM support - HDCP srm support - PSR fixes - DC watermark fixes - OLED panel support - SR-IOV fixes - BACO fixes - Optimize debugging vram access - RAS fixes - Use BACO for runtime pm - HDCP fixes - XGMI fixes - DDC fixes - DC clock programming optimizations and fixes - PSP fw loading sequence updates - Drop DRIVER_USE_AGP - Remove legacy drm load and unload callbacks amdkfd: - Add runtime pm support radeon: - Drop DRIVER_USE_AGP Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227043142.4075-1-alexander.deucher@amd.com	2020-02-28 15:40:26 +10:00
Christian König	bd2275eeed	dma-buf: drop dynamic_mapping flag Instead use the pin() callback to detect dynamic DMA-buf handling. Since amdgpu is now migrated it doesn't make much sense to keep the extra flag. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353997/?series=73646&rev=1	2020-02-27 14:58:01 +01:00
Christian König	a448cb003e	drm/amdgpu: implement amdgpu_gem_prime_move_notify v2 Implement the importer side of unpinned DMA-buf handling. v2: update page tables immediately Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353998/?series=73646&rev=1	2020-02-27 14:58:01 +01:00
Christian König	2d4dad2734	drm/amdgpu: add amdgpu_dma_buf_pin/unpin v2 This implements the exporter side of unpinned DMA-buf handling. v2: fix minor coding style issues Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353999/?series=73646&rev=1	2020-02-27 14:58:01 +01:00
Christian König	4993ba0263	drm/amdgpu: use allowed_domains for exported DMA-bufs Avoid that we ping/pong the buffers when we stop to pin DMA-buf exports by using the allowed domains for exported buffers. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353996/?series=73646&rev=1	2020-02-27 14:58:01 +01:00
Christian König	bb42df4662	dma-buf: add dynamic DMA-buf handling v15 On the exporter side we add optional explicit pinning callbacks. Which are called when the importer doesn't implement dynamic handling, move notification or need the DMA-buf locked in place for its use case. On the importer side we add an optional move_notify callback. This callback is used by the exporter to inform the importers that their mappings should be destroyed as soon as possible. This allows the exporter to provide the mappings without the need to pin the backing store. v2: don't try to invalidate mappings when the callback is NULL, lock the reservation obj while using the attachments, add helper to set the callback v3: move flag for invalidation support into the DMA-buf, use new attach_info structure to set the callback v4: use importer_priv field instead of mangling exporter priv. v5: drop invalidation_supported flag v6: squash together with pin/unpin changes v7: pin/unpin takes an attachment now v8: nuke dma_buf_attachment_(map\|unmap)_locked, everything is now handled backward compatible v9: always cache when export/importer don't agree on dynamic handling v10: minimal style cleanup v11: drop automatically re-entry avoidance v12: rename callback to move_notify v13: add might_lock in appropriate places v14: rebase on separated locking change v15: add EXPERIMENTAL flag, some more code comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/353993/?series=73646&rev=1	2020-02-27 14:58:00 +01:00
Alex Deucher	c6385e503a	drm/amdgpu: drop legacy drm load and unload callbacks We've moved the debugfs handling into a centralized place so we can remove the legacy load an unload callbacks. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:13 -05:00
Alex Deucher	405a1f9090	drm/amdgpu/display: split dp connector registration (v4) Split into init and register functions to avoid a segfault in some configs when the load/unload callbacks are removed. v2: - add back accidently dropped has_aux setting - set dev in late_register v3: - fix dp cec ordering v4: - squash in kdev reference fix Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:13 -05:00
Alex Deucher	d090e7db5a	drm/amdgpu/display: move debugfs init into core amdgpu debugfs (v2) In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for display. v2: add config guard for DC Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Harry Wentland <harry.wentland@amd.com> (v1) Acked-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:13 -05:00
Alex Deucher	4074892967	drm/amdgpu: don't call drm_connector_register for non-MST ports The core does this for us now. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:13 -05:00
Alex Deucher	fd23cfcc2e	drm/amdgpu/ring: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for rings. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	cd9e29e717	drm/amdgpu/firmware: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for firmware. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	f9d64e6c4a	drm/amdgpu/regs: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for register access files. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	3f5cea671c	drm/amdgpu/gem: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for gem. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	24038d581c	drm/amdgpu/fence: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for fence handling. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	15997544a3	drm/amdgpu/sa: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for SA (sub allocator). Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	a4c5b1bb7b	drm/amdgpu/pm: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for pm. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	c5820361da	drm/amdgpu/ttm: move debugfs init into core amdgpu debugfs In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for ttm. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Alex Deucher	923ffa6b02	drm/amdgpu: rename amdgpu_debugfs_preempt_cleanup to amdgpu_debugfs_fini. It will be used for other things in the future. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:12 -05:00
Yong Zhao	8bdab6bb1c	drm/amdgpu: Increase timout on emulator to tenfold instead of twice Since emulators are slower, sometime some operations like flushing tlb through FM need more than twice the regular timout of 100ms, so increase the timeout to 1s on emulators. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:03 -05:00
Yong Zhao	e694530418	drm/amdkfd: Avoid ambiguity by indicating it's cp queue The queues represented in queue_bitmap are only CP queues. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:20:05 -05:00
Divya Shikre	0c663695a6	drm/amd: Extend ROCt to surface UUID for devices that have them Devices from Arcturus onwards will have their UUID exposed to Thunk. Adding neccessary functions to the kernel to propagate the uuid. Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:18:17 -05:00
Kent Russell	944effd337	drm/amdgpu: Fix check for DPM when returning max clock pp_funcs may not exist, while dpm may be enabled. This change ensures that KFD topology will report the same as pp_dpm_sclk, as the conditions for reporting them will be the same. Otherwise, we may see the issue where KFD reports "100MHz" in topology as the max speed, while DPM is working correctly. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:42 -05:00
Rohit Khaire	75ddb640e1	drm/amdgpu: Don't write GCVM_L2_CNTL* regs on navi12 VF This change disables programming of GCVM_L2_CNTL* regs on VF. Signed-off-by: Rohit Khaire <Rohit.Khaire@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Daniel Vetter	f3ed67395d	drm/amdgpu: Drop DRIVER_USE_AGP This doesn't do anything except auto-init drm_agp support when you call drm_get_pci_dev(). Which amdgpu stopped doing with commit `b58c11314a` Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Jun 2 17:16:31 2017 -0400 drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev No idea whether this was intentional or accidental breakage, but I guess anyone who manages to boot a this modern gpu behind an agp bridge deserves a price. A price I never expect anyone to ever collect :-) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Xiaojie Yuan <xiaojie.yuan@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: "Tianci.Yin" <tianci.yin@amd.com> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Tom St Denis	669e2f91e4	drm/amd/amdgpu: Add gfxoff debugfs entry Write a 32-bit value of zero to disable GFXOFF and write a 32-bit value of non-zero to enable GFXOFF. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Nirmoy Das	c6fc97f9bc	drm/amdgpu: use amdgpu_ring_test_helper when possible amdgpu_ring_test_helper already handles ring->sched.ready correctly Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Christian König	42e5fee65e	drm/amdgpu: add VM update fences back to the root PD v2 Add update fences to the root PD while mapping BOs. Otherwise PDs freed during the mapping won't wait for updates to finish and can cause corruptions. v2: rebased on drm-misc-next Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `90b69cdc5f` drm/amdgpu: stop adding VM updates fences to the resv obj Reviewed-by: xinhui pan <xinhui.pan@amd.com> Tested-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
Nirmoy Das	6f9f960472	drm/amdgpu: cleanup amdgpu_ring_fini cleanup amdgpu_ring_fini to check the prerequisites before changing ring->sched.ready Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:33 -05:00
John Clements	ef1caf48bd	drm/amdgpu: Add Arcturus D342 page retire support Check Arcturus SKU type to select I2C address of page retirement EEPROM Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Hawking Zhang	938065d4cb	drm/amdgpu: toggle DF-Cstate to protect DF reg access driver needs to take DF out Cstate before any DF register access. otherwise, the DF register may not be accessible. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Hawking Zhang	19744f5f2d	drm/amdgpu: move get_xgmi_relative_phy_addr to amdgpu_xgmi.c centralize all the xgmi related function to amdgpu_xgmi.c Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Hawking Zhang	53e0f1e6be	drm/amdgpu: add dpm helper function for DF Cstate control The helper function hides software smu and legacy powerplay implementation for DF Cstate control. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Evan Quan	995da6cc4c	drm/amdgpu: update psp firmwares loading sequence V2 For those ASICs with DF Cstate management centralized to PMFW, TMR setup should be performed between pmfw loading and other non-psp firmwares loading. V2: skip possible SMU firmware reloading Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
xinhui pan	f4a3c42b5c	drm/amdgpu: Remove kfd eviction fence before release bo (v2) No need to trigger eviction as the memory mapping will not be used anymore. All pt/pd bos share same resv, hence the same shared eviction fence. Everytime page table is freed, the fence will be signled and that cuases kfd unexcepted evictions. v2: squash in 32 bit fix CC: Christian König <christian.koenig@amd.com> CC: Felix Kuehling <felix.kuehling@amd.com> CC: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:17:32 -05:00
Daniel Vetter	8a3bddf67c	drm/amdgpu: Drop DRIVER_USE_AGP This doesn't do anything except auto-init drm_agp support when you call drm_get_pci_dev(). Which amdgpu stopped doing with commit `b58c11314a` Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Jun 2 17:16:31 2017 -0400 drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev No idea whether this was intentional or accidental breakage, but I guess anyone who manages to boot a this modern gpu behind an agp bridge deserves a price. A price I never expect anyone to ever collect :-) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Xiaojie Yuan <xiaojie.yuan@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: "Tianci.Yin" <tianci.yin@amd.com> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-26 14:02:41 -05:00
Shirish S	a3ed353cf8	amdgpu/gmc_v9: save/restore sdpif regs during S3 fixes S3 issue with IOMMU + S/G enabled @ 64M VRAM. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-25 11:30:42 -05:00
Felix Kuehling	b80cd524ac	drm/amdgpu: Improve Vega20 XGMI TLB flush workaround Using a heavy-weight TLB flush once is not sufficient. Concurrent memory accesses in the same TLB cache line can re-populate TLB entries from stale texture cache (TC) entries while the heavy-weight TLB flush is in progress. To fix this race condition, perform another TLB flush after the heavy-weight one, when TC is known to be clean. Move the workaround into the low-level TLB flushing functions. This way they apply to amdgpu as well, and KIQ-based TLB flush only needs to synchronize once. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:57 -05:00
Monk Liu	82c4ebfa35	drm/amdgpu: fix psp ucode not loaded in bare-metal for bare-metal we alawys need to load sys/sos/kdb Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:49 -05:00
Shirish S	c2ecd79bec	amdgpu/gmc_v9: save/restore sdpif regs during S3 fixes S3 issue with IOMMU + S/G enabled @ 64M VRAM. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:26 -05:00
Alex Deucher	91aeda1811	drm/amdgpu/discovery: make the discovery code less chatty Make the IP block base output debug only. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:25 -05:00
Monk Liu	6325b38d89	drm/amdgpu: fix colliding of preemption what: some os preemption path is messed up with world switch preemption fix: cleanup those logics so os preemption not mixed with world switch this patch is a general fix for issues comes from SRIOV MCBP, but there is still UMD side issues not resovlved yet, so this patch cannot fix all world switch bug. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:25 -05:00
Monk Liu	f77a9c920a	drm/amdgpu: cleanup some incorrect reg access for SRIOV 1) we shouldn't load PSP kdb and sys/sos for VF, they are supposed to be handled by hypervisor 2) ih reroute doesn't work on VF thus we should avoid calling it, besides VF should not use those PSP register sets for PF 3) shouldn't load SMU ucode under SRIOV, otherwise PSP would report error Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-25 11:01:25 -05:00
Evan Quan	14008574a3	drm/amdgpu: drop the non-sense firmware version check on arcturus As the firmware versions of arcturus are different from other gfx9 ASICs. And the warning("CP firmware version too old, please update!") caused by this check can be eliminated. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
changzhu	f61f01b14d	drm/amdgpu: add is_raven_kicker judgement for raven1 The rlc version of raven_kicer_rlc is different from the legacy rlc version of raven_rlc. So it needs to add a judgement function for raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
Guchun Chen	3cd4f61859	drm/amdgpu: record non-zero error counter info in NBIO before resetting GPU When NBIO's RAS error happens, before trigging GPU reset, it's needed to record error counter information, which can correct the error counter value missed issue when reading from debugfs. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
Guchun Chen	313c8fd33e	drm/amdgpu: log on non-zero error conter per IP before GPU reset Once sync flood interrupt is triggered by RAS error, before actual GPU recovery job, it's necessary to log on and print non-zero error counter, this will help user knows where the RAS error source is from quickly. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
changzhu	debcf83770	drm/amdgpu: add is_raven_kicker judgement for raven1 The rlc version of raven_kicer_rlc is different from the legacy rlc version of raven_rlc. So it needs to add a judgement function for raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:33:42 -05:00
Maxime Ripard	28f2aff1ca	Linux 5.6-rc2 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5JsVQeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGHZEH+wddtJO4dZk5TZdF KZB2w2ldsCvrBZAmas1TVvm8ncvUf+ATUZcSIzvZ3YbLmxsuLF2Cz3kD+n+36Mvy ejwq8Scl7jwnouYps/Gfd6rRj/uCafqST4qp15GMGeiy2ST4A8dJrv5IAgZhD8/N SN1bSr1AXpZ2JlEzzLDQ/NdVoNMS6IzCOsaINZcc60/XQoQZFRBWamMJFqu+CmXD SBJOybQNFJhziy45cGZSAl+67sSCcoPftwTs0Stu4CJsvFWRb3MsbNTDS51Hjcc4 3tdgGOhoNXzyzZr96MEAHmiaW4VLQv0PGgUfOajE35viMz48OrwCTru8Kbuae3XM YHV4qJk= =yiem -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXkpecgAKCRDj7w1vZxhR xU4CAQD/8ZzfYroiiEI7FHEgvnuldGYZqA9kbNAO2Vueeac0OgD+Jojewdxy+pJ9 dxfA/POdtzx3hOdN+U1YgaDC0hbXwww= =XgKq -----END PGP SIGNATURE----- Merge v5.6-rc2 into drm-misc-next Lyude needs some patches in 5.6-rc2 and we didn't bring drm-misc-next forward yet, so it looks like a good occasion. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2020-02-17 10:34:34 +01:00
Alex Deucher	b08c3ed609	drm/amdgpu/gfx10: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Alex Deucher	120cf95930	drm/amdgpu/gfx9: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Alex Deucher	c657b936ea	drm/amdgpu/soc15: fix xclk for raven It's 25 Mhz (refclk / 4). This fixes the interpretation of the rlc clock counter. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Bhawanpreet Lakha	c6f8c44044	drm/amd/display: fix dtm unloading there was a type in the terminate command. We should be calling psp_dtm_unload() instead of psp_hdcp_unload() Fixes: `143f230533` ("drm/amdgpu: psp DTM init") Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-14 12:58:58 -05:00
Thomas Zimmermann	e3eff4b5d9	drm/amdgpu: Convert to CRTC VBLANK callbacks VBLANK callbacks in struct drm_driver are deprecated in favor of their equivalents in struct drm_crtc_funcs. Convert amdgpu over. v2: * don't wrap existing functions; change signature instead Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-6-tzimmermann@suse.de	2020-02-13 13:08:13 +01:00
Thomas Zimmermann	ea702333e5	drm/amdgpu: Convert to struct drm_crtc_helper_funcs.get_scanout_position() The callback struct drm_driver.get_scanout_position() is deprecated in favor of struct drm_crtc_helper_funcs.get_scanout_position(). Convert amdgpu over. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-5-tzimmermann@suse.de	2020-02-13 13:08:13 +01:00
Dan Carpenter	434cbcb1bd	drm/amdgpu: return -EFAULT if copy_to_user() fails The copy_to_user() function returns the number of bytes remaining to be copied, but we want to return a negative error code to the user. Fixes: `030d5b97a5` ("drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:41 -05:00
Alex Deucher	72b4c01d66	drm/amdgpu/gfx10: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:41 -05:00
Alex Deucher	e5f134958d	drm/amdgpu/gfx9: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:40 -05:00
Alex Deucher	b90c4d667c	drm/amdgpu/soc15: fix xclk for raven It's 25 Mhz (refclk / 4). This fixes the interpretation of the rlc clock counter. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:40 -05:00
Bhawanpreet Lakha	c786530b21	drm/amd/display: fix dtm unloading there was a type in the terminate command. We should be calling psp_dtm_unload() instead of psp_hdcp_unload() Fixes: `143f230533` ("drm/amdgpu: psp DTM init") Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:03:22 -05:00
Alex Deucher	4fdda2e66d	drm/amdgpu/runpm: enable runpm on baco capable VI+ asics Seems to work reliably on VI+ except for a few so enable runpm barring those where baco for runtime power management is not supported. [rajneesh] Picked https://patchwork.freedesktop.org/patch/335402/ to enable runtime pm with baco for kfd. Also fixed a checkpatch warning and added extra checks for VEGA20 and ARCTURUS. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:01:03 -05:00
Rajneesh Bhardwaj	9593f4d6a6	drm/amdkfd: refactor runtime pm for baco So far the kfd driver implemented same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd aquires an atomic lock that prevents any more user processes to create queues and interact with kfd driver and amd gpu. This mechanism created problem when amdgpu device is runtime suspended with BACO enabled. Any application that relies on kfd driver fails to load because the driver reports a locked kfd device since gpu is runtime suspended. However, in an ideal case, when gpu is runtime suspended the kfd driver should be able to: - auto resume amdgpu driver whenever a client requests compute service - prevent runtime suspend for amdgpu while kfd is in use This change refactors the amdgpu and amdkfd drivers to support BACO and runtime power management. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:00:54 -05:00
Rajneesh Bhardwaj	70bedd68e7	drm/amdgpu: Fix missing error check in suspend amdgpu_device_suspend might return an error code since it can be called from both runtime and system suspend flows. Add the missing return code in case of a failure. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:00:25 -05:00
Guchun Chen	a934f9d866	drm/amdgpu: correct comment to clear up the confusion Former comment looks to be one intended behavior in code, actually it's not. So correct it. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:40:12 -05:00
James Zhu	b5336bfd6f	drm/amdgpu/vcn2.5: fix warning Fix warning during switching to dpg pause mode for VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:37:22 -05:00
Guchun Chen	2cabe0d4cd	drm/amdgpu: limit GDS clearing workaround in cold boot sequence GDS clear workaround will cause gfx failure in suspend/resume case. [ 98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] ERROR late_init of IP block <gfx_v9_0> failed -110 [ 98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110 [ 98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110 As this workaround is specific to the HW bug of GDS's ECC error existing in cold boot up, so bypass this workaround in suspend/ resume case after booting up. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:37:02 -05:00
Jonathan Kim	46d1da733f	drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf hwc->conf was designated specifically for AMD APU IOMMU purposes. This could cause problems in performance and/or function since APU IOMMU implementation is elsewhere. Also hwc->conf and hwc->config are different members of an anonymous union so hwc->conf aliases as hw->last_tag. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:35:54 -05:00
James Zhu	f4d0242b7b	drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1 Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode power off issue on instance 1. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:10:36 -05:00
Alex Deucher	f0f7ddfc34	drm/amdgpu: add flag for runtime suspend So we know whether we in are in runtime suspend or system suspend. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:52 -05:00
xinhui pan	a6605c43f9	drm/amdgpu: Do not move root PT bo to relocated list As root PD has no parent, we just need move its status to idle. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> CC: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:38 -05:00
Guchun Chen	278628fa46	drm/amdgpu: correct comment to clear up the confusion Former comment looks to be one intended behavior in code, actually it's not. So correct it. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:31 -05:00
James Zhu	3b4a18a355	drm/amdgpu/vcn2.5: fix warning Fix warning during switching to dpg pause mode for VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:49:23 -05:00
Guchun Chen	ea6f0931c1	drm/amdgpu: limit GDS clearing workaround in cold boot sequence GDS clear workaround will cause gfx failure in suspend/resume case. [ 98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] ERROR late_init of IP block <gfx_v9_0> failed -110 [ 98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110 [ 98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110 As this workaround is specific to the HW bug of GDS's ECC error existing in cold boot up, so bypass this workaround in suspend/ resume case after booting up. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:49:16 -05:00
Linus Torvalds	c16b99d6c5	drm fixes for 5.6-rc1 tegra: - merge window regression fixes nouveau: - couple of volta/turing modesetting fixes amdgpu: - EDC fixes for Arcturus - GDDR6 memory training fixe - Fix for reading gfx clockgating registers while in GFXOFF state - i2c freq fixes - Misc display fixes - TLB invalidation fix when using semaphores - VCN 2.5 instancing fixes - Switch raven1 gfxoff to a blacklist - Coreboot workaround for KV/KB - Root cause dongle fixes for display and revert workaround - Enable GPU reset for renoir and navi - Navi overclocking fixes - Fix up confusing warnings in display clock validation on raven amdkfd: - SDMA fix radeon: - Misc LUT fixes -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJePNasAAoJEAx081l5xIa+axgP+wZbUkJZYt75p5thhptp2CY3 235yIZQcDy0T+J/34Agv0e5DfK2hCafPIVv7dIV1A9AjWxhVzIoZmD2mHSLKnxVb MsfemKsrm1WO9KezDkDIEOgkN5rY811xTIsg89VSjyiiPXQmwAcXQBbmIBHg9YtT 7ttek1afJv7h0HD4W2JdTxLEeKn9lpZHUXESXz+xoE8OUMDizMgmbt8nWXlbFpcw Tx1EmxZ3jbCYGsrcQBOLURZiwzOb2zbXJ9o8BgNdomvtI69l2/3mMonavw+IgVq+ aiHMgWViVsDDP8ijn8uBa7EEqtGWuqTRaL/Ei71w2MGkG7S3Nz4PxbLbl5P9NHAc D3OsiaXGdNhoSGfGD6sr7TlOIEo7gjLGKJ3fMW15XmBF4isb0lwlnIB7dTiU9Zaq UBMpNA9/Vtty2MtT1SkVKQrC+Mujl4lAYssWhcrh1/RX0Ij6do4aFswy17dbVg3Q LIigj1quIfNPXONb0fwkg5JGQyOMnrYs0Q6SkrhQfSR2UzsNWnGoBRyROcZipNnQ STaM8F6eEi7RbFUaT0uWL71GBxM00PyjQgpf5fnrOiKp6rfgK3B3ThpsVoAQ2OW4 CnodlCa1uxIkIgQdcoG/3vT5B6nzpdhgRVjHt+N5sraLjOZFZHxex0YmJAYXcN+/ QR8ARNgILJvA14xCsOT7 =C0nE -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Just some fixes for this merge window: the tegra changes fix some regressions in the merge, nouveau has a few modesetting fixes. The amdgpu fixes are bit bigger, but they contain a couple of weeks of fixes, and don't seem to contain anything that isn't really a fix. Summary: tegra: - merge window regression fixes nouveau: - couple of volta/turing modesetting fixes amdgpu: - EDC fixes for Arcturus - GDDR6 memory training fixe - Fix for reading gfx clockgating registers while in GFXOFF state - i2c freq fixes - Misc display fixes - TLB invalidation fix when using semaphores - VCN 2.5 instancing fixes - Switch raven1 gfxoff to a blacklist - Coreboot workaround for KV/KB - Root cause dongle fixes for display and revert workaround - Enable GPU reset for renoir and navi - Navi overclocking fixes - Fix up confusing warnings in display clock validation on raven amdkfd: - SDMA fix radeon: - Misc LUT fixes" * tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm: (90 commits) gpu: host1x: Set DMA direction only for DMA-mapped buffer objects drm/tegra: Reuse IOVA mapping where possible drm/tegra: Relax IOMMU usage criteria on old Tegra drm/amd/dm/mst: Ignore payload update failures drm/amdgpu: update default voltage for boot od table for navi1x drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2) drm/amdgpu: fetch default VDDC curve voltages (v2) drm/amdgpu/smu_v11_0: Correct behavior of restoring default tables (v2) drm/amdgpu/navi10: add OD_RANGE for navi overclocking drm/amdgpu/navi: fix index for OD MCLK drm/amd/display: Fix HW/SW state mismatch drm/amd/display: Fix a typo when computing dsc configuration drm/amd/powerplay: fix navi10 system intermittent reboot issue V2 drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode drm/amd/display: Only enable cursor on pipes that need it drm/nouveau/kms/gv100-: avoid sending a core update until the first modeset drm/nouveau/kms/gv100-: move window ownership setup into modesetting path drm/nouveau/disp/gv100-: halt NV_PDISP_FE_RM_INTR_STAT_CTRL_DISP_ERROR storms ...	2020-02-07 12:46:08 -08:00
Christian König	dd1ab79910	drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_access_memory v2 Make use of the better performance here as well. This patch is only compile tested! v2: fix calculation bug pointed out by Felix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:39 -05:00
Christian König	030d5b97a5	drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read This speeds up the access quite a bit from 2.2 MB/s to 2.9 MB/s on 32bit and 12,8 MB/s on 64bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:33 -05:00
Christian König	c12b84d6e0	drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2 This should speed up debugging VRAM access a lot. v2: add HDP flush/invalidate Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:25 -05:00
Christian König	ce05ac56e6	drm/amdgpu: optimize amdgpu_device_vram_access a bit. Only write the _HI register when necessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:17 -05:00
Jonathan Kim	42d708db8e	drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf hwc->conf was designated specifically for AMD APU IOMMU purposes. This could cause problems in performance and/or function since APU IOMMU implementation is elsewhere. Also hwc->conf and hwc->config are different members of an anonymous union so hwc->conf aliases as hw->last_tag. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:05 -05:00
James Zhu	80ff3e10c8	drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1 Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode power off issue on instance 1. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:44:51 -05:00
Jack Zhang	86b93fd62d	drm/amdgpu/sriov Don't send msg when smu suspend For sriov and pp_onevf_mode, do not send message to set smu status, because smu doesn't support these messages under VF. Besides, it should skip smu_suspend when pp_onevf_mode is disabled. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:44:24 -05:00
Hawking Zhang	0b9d37609a	drm/amdgpu: move xgmi init/fini to xgmi_add/remove_device call (v2) For sriov, psp ip block has to be initialized before ih block for the dynamic register programming interface that needed for vf ih ring buffer. On the other hand, current psp ip block hw_init function will initialize xgmi session which actaully depends on interrupt to return session context. This results an empty xgmi ta session id and later failures on all the xgmi ta cmd invoked from vf. xgmi ta session initialization has to be done after ih ip block hw_init call. to unify xgmi session init/fini for both bare-metal sriov virtualization use scenario, move xgmi ta init to xgmi_add_device call, and accordingly terminate xgmi ta session in xgmi_remove_device call. The existing suspend/resume sequence will not be changed. v2: squash in return fix from Nirmoy Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-06 15:04:36 -05:00
Christian König	9f3cc18d19	drm/amdgpu: rework synchronization of VM updates v4 If provided we only sync to the BOs reservation object and no longer to the root PD. v2: update comment, cleanup amdgpu_bo_sync_wait_resv v3: use correct reservation object while clearing v4: fix typo in amdgpu_bo_sync_wait_resv Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	4939d973b6	drm/amdgpu: simplify and fix amdgpu_sync_resv No matter what we always need to sync to moves. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	fe6796ac12	drm/amdgpu: allow higher level PD invalidations Allow partial invalidation on unallocated PDs. This is useful when we need to silence faults to stop interrupt floods on Vega. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	7d28efe0c3	drm/amdgpu: return EINVAL instead of ENOENT in the VM code That we can't find a PD above the root is expected can only happen if we try to update a larger range than actually managed by the VM. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	bfcd6c69e4	drm/amdgpu: fix parentheses in amdgpu_vm_update_ptes For the root PD mask can be 0xffffffff as well which would overrun to 0 if we don't cast it before we add one. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	46cf5f7626	drm/amdgpu: make sure to never allocate PDs/PTs for invalidations Make sure that we never allocate a page table for an invalidation operation. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	55cdd4e9b9	drm/amdgpu: drop unnecessary restriction for huge root PDEs The root PD can also contain huge PDEs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	ef48d4b39e	drm/amdgpu: stop using amdgpu_bo_gpu_offset in the VM backend We need to update page tables without any lock held. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	5d3196605d	drm/amdgpu: rework job synchronization v2 For unlocked page table updates we need to be able to sync to fences of a specific VM. v2: use SYNC_ALWAYS in the UVD code Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	114fbc3195	drm/amdgpu: use the VM as job owner For HMM we need to rework how VM synchronization works, so instead of the filp use VM as job owner. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	81417bea87	drm/amdgpu: explicitly sync VM update to PDs/PTs Explicitly sync VM updates to the moving fence in PDs and PTs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Nathan Chancellor	968162204a	drm/amdgpu: Fix implicit enum conversion in gfx_v9_4_ras_error_inject Clang warns: ../drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c:967:35: warning: implicit conversion from enumeration type 'enum amdgpu_ras_block' to different enumeration type 'enum ta_ras_block' [-Wenum-conversion] block_info.block_id = info->head.block; ~ ~~~~~~~~~~~^~~~~ 1 warning generated. Use the function added in commit `828cfa2909` ("drm/amdgpu: Fix amdgpu ras to ta enums conversion") that handles this conversion explicitly. Fixes: `4c461d89db` ("drm/amdgpu: add RAS support for the gfx block of Arcturus") Link: https://github.com/ClangBuiltLinux/linux/issues/849 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:43 -05:00
Stephen Rothwell	7044cb6c20	amdgpu: using vmalloc requires includeing vmalloc.h Fixes: `240c811ccd` ("drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2)") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:42 -05:00
Nirmoy Das	977f7e1068	drm/amdgpu: allocate entities on demand Currently we pre-allocate entities and fences for all the HW IPs on context creation and some of which are might never be used. This patch tries to resolve entity/fences wastage by creating entity only when needed. v2: allocate memory for entity and fences together Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:42 -05:00
Joseph Greathouse	18c6b74e7c	drm/amdgpu: Enable DISABLE_BARRIER_WAITCNT for Arcturus In previous gfx9 parts, S_BARRIER shader instructions are implicitly S_WAITCNT 0 instructions as well. This setting turns off that mechanism in Arcturus and beyond. With this, shaders must follow the ISA guide insofar as putting in explicit S_WAITCNT operations even after an S_BARRIER. v2: Fix patch title to list component Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:27 -05:00
Linus Torvalds	9f68e3655a	drm pull for 5.6-rc1 uapi: - dma-buf heaps added (and fixed) - command line add support for panel oreientation - command line allow overriding penguin count drm: - mipi dsi definition updates - lockdep annotations for dma_resv - remove dma-buf kmap/kunmap support - constify fb_ops in all fbdev drivers - MST fix for daisy chained hotplug- - CTA-861-G modes with VIC >= 193 added - fix drm_panel_of_backlight export - LVDS decoder support - more device based logging support - scanline alighment for dumb buffers - MST DSC helpers scheduler: - documentation fixes - job distribution improvements panel: - Logic PD type 28 panel support - Jimax8729d MIPI-DSI - igenic JZ4770 - generic DSI devicetree bindings - sony acx424AKP panel - Leadtek LTK500HD1829 - xinpeng XPP055C272 - AUO B116XAK01 - GiantPlus GPM940B0 - BOE NV140FHM-N49 - Satoz SAT050AT40H12R2 - Sharp LS020B1DD01D panels. ttm: - use blocking WW lock i915: - hw/uapi state separation - Lock annotation improvements - selftest improvements - ICL/TGL DSI VDSC support - VBT parsing improvments - Display refactoring - DSI updates + fixes - HDCP 2.2 for CFL - CML PCI ID fixes - GLK+ fbc fix - PSR fixes - GEN/GT refactor improvments - DP MST fixes - switch context id alloc to xarray - workaround updates - LMEM debugfs support - tiled monitor fixes - ICL+ clock gating programming removed - DP MST disable sequence fixed - LMEM discontiguous object maps - prefaulting for discontiguous objects - use LMEM for dumb buffers if possible - add LMEM mmap support amdgpu: - enable sync object timelines for vulkan - MST atomic routines - enable MST DSC support - add DMCUB display microengine support - DC OEM i2c support - Renoir DC fixes - Initial HDCP 2.x support - BACO support for Arcturus - Use BACO for runtime PM power save - gfxoff on navi10 - gfx10 golden updates and fixes - DCN support on POWER - GFXOFF for raven1 refresh - MM engine idle handlers cleanup - 10bpc EDP panel fixes - renoir watermark fixes - SR-IOV fixes - Arcturus VCN fixes - GDDR6 training fixes - freesync fixes - Pollock support amdkfd: - unify more codepath with amdgpu - use KIQ to setup HIQ rather than MMIO radeon: - fix vma fault handler race - PPC DMA fix - register check fixes for r100/r200 nouveau: - mmap_sem vs dma_resv fix - rewrite the ACR secure boot code for Turing - TU10x graphics engine support (TU11x pending) - Page kind mapping for turing - 10-bit LUT support - GP10B Tegra fixes - HD audio regression fix hisilicon/hibmc: - use generic fbdev code and helpers rockchip: - dsi/px30 support virtio: - fb damage support - static some functions vc4: - use dma_resv lock wrappers msm: - use dma_resv lock wrappers - sc7180 display + DSI support - a618 support - UBWC support improvements vmwgfx: - updates + new logging uapi exynos: - enable/disable callback cleanups etnaviv: - use dma_resv lock wrappers atmel-hlcdc: - clock fixes mediatek: - cmdq support - non-smooth cursor fixes - ctm property support sun4i: - suspend support - A64 mipi dsi support rcar-du: - Color management module support - LVDS encoder dual-link support - R8A77980 support analogic: - add support for an6345 ast: - atomic modeset support - primary plane garbage fix arcgpu: - fixes for fourcc handling tegra: - minor fixes and improvments mcde: - vblank support meson: - OSD1 plane AFBC commit gma500: - add pageflip support - reomve global drm_dev komeda: - tweak debugfs output - d32 support - runtime PM suppotr udl: - use generic shmem helpers - cleanup and fixes -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJeMm6RAAoJEAx081l5xIa+vN8P/0j4jEOv+KIinAhoH+LG3EpD m2TUuu5OQIoBrcCoWOgFBk3wqYpw6PdMBdkXh+5sE5lfeBynp8oC3Bin+QsHJE05 eGBpZtHe+70MQb0Eha+Aic0hchvBKzRnq6i0MYSIHn6afs76dLmF8knTjycxrvV5 Xu1Z3WDmjzqgWF9ja5JCD6fby11seP5RrwObYKVikO35QQyJJwGSGKgu5rq/pByK /n0PCnCOINuL0Lz6J9qexdh/0/XYFQilRC31GJNlKbDSFuECF0GOEzEE/xUBW/pI dLh2YwIIygm18Gar9PgvMwXJn3BfzQ0qEJsf+HlQeNw9iLgbHpp2AsTxHTE87OGe R/y85taW3jGjPsNOKZOeLpvg/Ro8l8ZipLApvDCG2O22DThg/cd6NDjZxl1FJfRH acDG/JdgPo5MbdRAH/cM1WuFS9gEM+0BeSQ5gCjtPakF+X4Vz+ABFDLMRJoaejkJ q8DG32TQXELQx0RMghsqK7YCWGfl+2alA1u9w6TgJh9Rq4iVckvpDeqAZnK1Adkc 87g957Tl0n6FA4wJj/t5jrceiLRMJAm/rBK+R3GZNfWrgx4bHbCmb4fZDZsrFzph nbAjNJ5kOchrFCaRR47ULby6+Q14MAFbkWq4Crfu4YDdzUkTPpep6pi2GIe8w0rV P0hdYOYJf6LUda0utuQX =oFrI -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Davbe Airlie: "This is the main pull request for graphics for 5.6. Usual selection of changes all over. I've got one outstanding vmwgfx pull that touches mm so kept it separate until after all of this lands. I'll try and get it to you soon after this, but it might be early next week (nothing wrong with code, just my schedule is messy) This also hits a lot of fbdev drivers with some cleanups. Other notables: - vulkan timeline semaphore support added to syncobjs - nouveau turing secureboot/graphics support - Displayport MST display stream compression support Detailed summary: uapi: - dma-buf heaps added (and fixed) - command line add support for panel oreientation - command line allow overriding penguin count drm: - mipi dsi definition updates - lockdep annotations for dma_resv - remove dma-buf kmap/kunmap support - constify fb_ops in all fbdev drivers - MST fix for daisy chained hotplug- - CTA-861-G modes with VIC >= 193 added - fix drm_panel_of_backlight export - LVDS decoder support - more device based logging support - scanline alighment for dumb buffers - MST DSC helpers scheduler: - documentation fixes - job distribution improvements panel: - Logic PD type 28 panel support - Jimax8729d MIPI-DSI - igenic JZ4770 - generic DSI devicetree bindings - sony acx424AKP panel - Leadtek LTK500HD1829 - xinpeng XPP055C272 - AUO B116XAK01 - GiantPlus GPM940B0 - BOE NV140FHM-N49 - Satoz SAT050AT40H12R2 - Sharp LS020B1DD01D panels. ttm: - use blocking WW lock i915: - hw/uapi state separation - Lock annotation improvements - selftest improvements - ICL/TGL DSI VDSC support - VBT parsing improvments - Display refactoring - DSI updates + fixes - HDCP 2.2 for CFL - CML PCI ID fixes - GLK+ fbc fix - PSR fixes - GEN/GT refactor improvments - DP MST fixes - switch context id alloc to xarray - workaround updates - LMEM debugfs support - tiled monitor fixes - ICL+ clock gating programming removed - DP MST disable sequence fixed - LMEM discontiguous object maps - prefaulting for discontiguous objects - use LMEM for dumb buffers if possible - add LMEM mmap support amdgpu: - enable sync object timelines for vulkan - MST atomic routines - enable MST DSC support - add DMCUB display microengine support - DC OEM i2c support - Renoir DC fixes - Initial HDCP 2.x support - BACO support for Arcturus - Use BACO for runtime PM power save - gfxoff on navi10 - gfx10 golden updates and fixes - DCN support on POWER - GFXOFF for raven1 refresh - MM engine idle handlers cleanup - 10bpc EDP panel fixes - renoir watermark fixes - SR-IOV fixes - Arcturus VCN fixes - GDDR6 training fixes - freesync fixes - Pollock support amdkfd: - unify more codepath with amdgpu - use KIQ to setup HIQ rather than MMIO radeon: - fix vma fault handler race - PPC DMA fix - register check fixes for r100/r200 nouveau: - mmap_sem vs dma_resv fix - rewrite the ACR secure boot code for Turing - TU10x graphics engine support (TU11x pending) - Page kind mapping for turing - 10-bit LUT support - GP10B Tegra fixes - HD audio regression fix hisilicon/hibmc: - use generic fbdev code and helpers rockchip: - dsi/px30 support virtio: - fb damage support - static some functions vc4: - use dma_resv lock wrappers msm: - use dma_resv lock wrappers - sc7180 display + DSI support - a618 support - UBWC support improvements vmwgfx: - updates + new logging uapi exynos: - enable/disable callback cleanups etnaviv: - use dma_resv lock wrappers atmel-hlcdc: - clock fixes mediatek: - cmdq support - non-smooth cursor fixes - ctm property support sun4i: - suspend support - A64 mipi dsi support rcar-du: - Color management module support - LVDS encoder dual-link support - R8A77980 support analogic: - add support for an6345 ast: - atomic modeset support - primary plane garbage fix arcgpu: - fixes for fourcc handling tegra: - minor fixes and improvments mcde: - vblank support meson: - OSD1 plane AFBC commit gma500: - add pageflip support - reomve global drm_dev komeda: - tweak debugfs output - d32 support - runtime PM suppotr udl: - use generic shmem helpers - cleanup and fixes" * tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm: (1998 commits) drm/nouveau/fb/gp102-: allow module to load even when scrubber binary is missing drm/nouveau/acr: return error when registering LSF if ACR not supported drm/nouveau/disp/gv100-: not all channel types support reporting error codes drm/nouveau/disp/nv50-: prevent oops when no channel method map provided drm/nouveau: support synchronous pushbuf submission drm/nouveau: signal pending fences when channel has been killed drm/nouveau: reject attempts to submit to dead channels drm/nouveau: zero vma pointer even if we only unreference it rather than free drm/nouveau: Add HD-audio component notifier support drm/nouveau: fix build error without CONFIG_IOMMU_API drm/nouveau/kms/nv04: remove set but not used variable 'width' drm/nouveau/kms/nv50: remove set but not unused variable 'nv_connector' drm/nouveau/mmu: fix comptag memory leak drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc drm/nouveau/pmu/gm20b,gp10b: Fix Falcon bootstrapping drm/exynos: Rename Exynos to lowercase drm/exynos: change callback names drm/mst: Don't do atomic checks over disabled managers drm/amdgpu: add the lost mutex_init back drm/amd/display: skip opp blank or unblank if test pattern enabled ...	2020-01-30 08:04:01 -08:00
Alex Deucher	2cb44fb093	drm/amdgpu: enable GPU reset by default on renoir Everything is in place. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Alex Deucher	658c663947	drm/amdgpu: enable GPU reset by default on Navi Has been working fine for a while. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Christian König	77171eade8	drm/amdgpu: add coreboot workaround for KV/KB Coreboot seems to have a problem correctly setting up access to the stolen VRAM on KV/KB. Use the direct access only when necessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-and-tested-by: Fredrik Bruhn <fredrik.bruhn@unibap.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Alex Deucher	276cc92945	drm/amdgpu: original raven doesn't support full asic reset So don't use it. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Alex Deucher	7af2a5771e	drm/amdgpu: attempt to enable gfxoff on more raven1 boards (v2) Switch to a blacklist so we can disable specific boards that are problematic. v2: make the blacklist non-raven specific. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Colin Ian King	b20dcd72c1	drm/amd/amdgpu: fix spelling mistake "to" -> "too" There is a spelling mistake in a DRM_ERROR message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
xinhui pan	f583cc57ba	drm/amdgpu: initialize bo_va_list when add gws to process bo_va_list is list_head, so initialize it. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	55bbb747ec	drm/amdgpu/vcn: use inst_idx relacing inst Use inst_idx relacing inst in SOC15_DPG_MODE macro to avoid confusion. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	a455573214	drm/amdgpu/vcn: fix typo error Fix typo error, should be inst_idx instead of inst. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	326b523eeb	drm/amdgpu/vcn: fix vcn2.5 instance issue Fix vcn2.5 instance issue, vcn0 and vcn1 have same register offset Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	62884a7bf3	drm/amdgpu/vcn2.5: fix a bug for the 2nd vcn instance (v2) Fix a bug for the 2nd vcn instance at start and stop. v2: squash in unused label removal. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
James Zhu	b650121726	drm/amdgpu/vcn: Share vcn_v2_0_dec_ring_test_ring to vcn2.5 Share vcn_v2_0_dec_ring_test_ring to vcn2.5 to support vcn software ring. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
Felix Kuehling	fa34edbed4	drm/amdgpu: Use the correct flush_type in flush_gpu_tlb_pasid The flush_type was incorrectly hard-coded to 0 when calling falling back to MMIO-based invalidation in flush_gpu_tlb_pasid. Fixes: `ea930000a6` ("drm/amdgpu: export function to flush TLB via pasid") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:44 -05:00
Felix Kuehling	37c58ddf57	drm/amdgpu: Fix TLB invalidation request when using semaphore Use a more meaningful variable name for the invalidation request that is distinct from the tmp variable that gets overwritten when acquiring the invalidation semaphore. Fixes: `4ed8a03740` ("drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:36 -05:00
Chris Wilson	7e13ad8964	drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count Since drm_global_mutex is a true global mutex across devices, we don't want to acquire it unless absolutely necessary. For maintaining the device local open_count, we can use atomic operations on the counter itself, except when making the transition to/from 0. Here, we tackle the easy portion of delaying acquiring the drm_global_mutex for the final release by using atomic_dec_and_mutex_lock(), leaving the global serialisation across the device opens. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Thomas Hellström <thellstrom@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200124130107.125404-1-chris@chris-wilson.co.uk	2020-01-24 17:41:34 +00:00
Alex Deucher	23fe1390c7	drm/amdgpu: remove the experimental flag for renoir Should work properly with the latest sbios on 5.5 and newer kernels. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-23 12:14:53 -05:00
Nirmoy Das	63e3ab9a82	drm/amdgpu: individualize fence allocation per entity Allocate fences for each entity and remove ctx->fences reference as fences should be bound to amdgpu_ctx_entity instead amdgpu_ctx. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
Tianci.Yin	7db1d560a4	Revert "drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training enabled(V5)" This reverts commit `9e44147862`. The patch will be replaced with a better solution, revert it. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
Tianci.Yin	240c811ccd	drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2) [why] In GDDR6 BIST training, a certain mount of bottom VRAM will be encroached by UMC, that causes problems(like GTT corrupted and page fault observed). [how] Saving the content of this bottom VRAM to system memory before training, and restoring it after training to avoid VRAM corruption. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
Nirmoy Das	a9d4fe2fd6	drm/amdgpu: remove unnecessary conversion to bool Better clean that up before some automation starts to complain about it Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
Dennis Li	4c461d89db	drm/amdgpu: add RAS support for the gfx block of Arcturus Implement functions to do the RAS error injection and query EDC counter. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:36:30 -05:00
Dennis Li	504c5e72d7	drm/amdgpu: abstract EDC counter clear to a separated function 1. Add IP prefix for the IP related codes. 2. Refactor the code to clear EDC counter. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:36:14 -05:00
Dennis Li	5e66403e4d	drm/amdgpu: refine the security check for RAS functions To avoid calling RAS related functions when RAS feature isn't supported in hardware. Change to check supported features, instead of checking asic type. v2: reuse amdgpu_ras_is_supported function, instead of introducing a new flag for hardware ras feature. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:36:04 -05:00
Dennis Li	39aa0ef163	drm/amdgpu: enable RAS feature for more mmhub sub-blocks of Acrturus Compared with Vg20, the size of mmhub range is changed from 2 to 8. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:35:56 -05:00
chen gong	e3cd03603d	drm/amdgpu: read gfx register using RREG32_KIQ macro Reading CP_MEM_SLP_CNTL register with RREG32_SOC15 macro will lead to hang when GPU is in "gfxoff" state. I do a uniform substitution here. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:29 -05:00
chen gong	c68dbcd8f9	drm/amdgpu: add kiq version interface for RREG32/WREG32 Reading some registers by mmio will result in hang when GPU is in "gfxoff" state.This problem can be solved by GPU in "ring command packages" way. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:22 -05:00
chen gong	d33a99c4b6	drm/amdgpu: provide a generic function interface for reading/writing register by KIQ Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c, and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and flexible. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:14 -05:00
John Clements	a6c44d2538	drm/amdgpu: added support to get mGPU DRAM base resolves issue with RAS error injection in mGPU configuration Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:07 -05:00
Alex Sierra	36a1707afd	drm/amdgpu: modify packet size for pm4 flush tlbs [Why] PM4 packet size for flush message was oversized. [How] Packet size adjusted to allocate flush + fence packets. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:33:52 -05:00
Pan, Xinhui	bd05221123	drm/amdgpu: add the lost mutex_init back Initialize notifier_lock. Bug: https://gitlab.freedesktop.org/drm/amd/issues/1016 Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-17 16:03:09 -05:00
Hawking Zhang	e9d4cf918f	drm/amdgpu: add arcturus to gpu recovery check code path support check if dirver should try gpu recovery for arcturus Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:40:54 -05:00
Hawking Zhang	93af20f74e	drm/amdgpu: check if driver should try recovery in ras recovery path To allow the flexibilty for user to disable gpu recovery in RAS recovery path by module parameter amdgpu_gpu_recovery Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:40:47 -05:00
Evan Quan	2ac0d68697	drm/amd/powerplay: a quick fix for the deadlock issue below NFO: task ocltst:2028 blocked for more than 120 seconds. Tainted: G OE 5.0.0-37-generic #40~18.04.1-Ubuntu echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. cltst D 0 2028 2026 0x00000000 all Trace: __schedule+0x2c0/0x870 schedule+0x2c/0x70 schedule_preempt_disabled+0xe/0x10 __mutex_lock.isra.9+0x26d/0x4e0 __mutex_lock_slowpath+0x13/0x20 ? __mutex_lock_slowpath+0x13/0x20 mutex_lock+0x2f/0x40 amdgpu_dpm_set_powergating_by_smu+0x64/0xe0 [amdgpu] gfx_v8_0_enable_gfx_static_mg_power_gating+0x3c/0x70 [amdgpu] gfx_v8_0_set_powergating_state+0x66/0x260 [amdgpu] amdgpu_device_ip_set_powergating_state+0x62/0xb0 [amdgpu] pp_dpm_force_performance_level+0xe7/0x100 [amdgpu] amdgpu_set_dpm_forced_performance_level+0x129/0x330 [amdgpu] Fixes: `a64c9e15e6` ("drm/amd/powerplay: cleanup the interfaces for powergate setting through SMU") Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-by: Rui Teng <Rui.Teng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:38:23 -05:00
Huang Rui	0e5b7a9528	drm/amdgpu: only set cp active field for kiq queue The mec ucode will set the CP_HQD_ACTIVE bit while the queue is mapped by MAP_QUEUES packet. So we only need set cp active field for kiq queue. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:38:16 -05:00
Alex Deucher	27414cd42a	drm/amdgpu/pm: clean up return types count is size_t so don't use negative values. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:38:02 -05:00
James Zhu	0c0dab86d9	drm/amdgpu/vcn2.5: implement indirect DPG SRAM mode Implement indirect DPG SRAM mode for vcn2.5 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:37:47 -05:00
James Zhu	8484df9601	drm/amdgpu/vcn2.5: add dpg pause mode Add dpg pause mode support for vcn2.5 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:37:41 -05:00
James Zhu	d2a2c64f53	drm/amdgpu/vcn2.5: add DPG mode start and stop Add DPG mode start and stop functions for vcn2.5 v2: Correct firmware ucode index in vcn_v2_5_mc_resume_dpg_mode Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:37:34 -05:00
James Zhu	45cec87cd6	drm/amdgpu/vcn: move macro from vcn2.0 to share amdgpu_vcn (v2) Move macro from vcn2.0 to amdgpu_vcn to share with vcn2.5 v2: squash in macro fix Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:37:34 -05:00
James Zhu	5db86843e8	drm/amdgpu/vcn: support multiple instance direct SRAM read and write (v2) Add multiple instance direct SRAM read and write support for vcn2.5 v2: squash in indexing fix Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:36:47 -05:00
James Zhu	597e6ac3a7	drm/amdgpu/vcn: support multiple-instance dpg pause mode Add multiple-instance dpg pause mode support for VCN2.5 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:51 -05:00
Tianci.Yin	9e44147862	drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training enabled(V5) [why] In dual GPUs scenario, stolen_size is assigned to zero on the secondary GPU, since there is no pre-OS console using that memory. Then the bottom region of VRAM was allocated as GTT, unfortunately a small region of bottom VRAM was encroached by UMC firmware during GDDR6 BIST training, this cause page fault. [how] Forcing stolen_size to 3MB, then the bottom region of VRAM was allocated as stolen memory, GTT corruption avoid. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:37 -05:00
Tianci.Yin	6a1094ab68	drm/amdgpu/gfx10: update gfx golden settings for navi14 remove registers: mmSPI_CONFIG_CNTL add registers: mmSPI_CONFIG_CNTL_1 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:30 -05:00
Tianci.Yin	7b7041f892	drm/amdgpu/gfx10: update gfx golden settings remove registers: mmSPI_CONFIG_CNTL add registers: mmSPI_CONFIG_CNTL_1 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:23 -05:00
shaoyunl	b4df2823ec	drm/amdgpu: check rlc_g firmware pointer is valid before using it In SRIOV, rlc_g firmware is loaded by host, guest driver won't load it which will cause the rlc_fw pointer is null Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:17 -05:00
Christian König	971fe55545	drm/amdgpu: drop amdgpu_job.owner Entirely unused. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:10 -05:00
Nirmoy Das	55414ad5c9	drm/amdgpu: error out on entity with no run queue Disabled HW IP's entity initialized with NULL rq. We should not process any submit request from userspace for a disabled HW IP. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:35:03 -05:00

... 4 5 6 7 8 ...

7400 Commits