Commit Graph

7117 Commits

Author SHA1 Message Date
Jonathan Kim
42d708db8e drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf
hwc->conf was designated specifically for AMD APU IOMMU purposes.  This
could cause problems in performance and/or function since APU IOMMU
implementation is elsewhere.  Also hwc->conf and hwc->config are
different members of an anonymous union so hwc->conf aliases as
hw->last_tag.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:45:05 -05:00
James Zhu
80ff3e10c8 drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1
Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode
power off issue on instance 1.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:44:51 -05:00
Jack Zhang
86b93fd62d drm/amdgpu/sriov Don't send msg when smu suspend
For sriov and pp_onevf_mode, do not send message to set smu
status, because smu doesn't support these messages under VF.

Besides, it should skip smu_suspend when pp_onevf_mode is disabled.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:44:24 -05:00
Hawking Zhang
0b9d37609a drm/amdgpu: move xgmi init/fini to xgmi_add/remove_device call (v2)
For sriov, psp ip block has to be initialized before
ih block for the dynamic register programming interface
that needed for vf ih ring buffer. On the other hand,
current psp ip block hw_init function will initialize
xgmi session which actaully depends on interrupt to
return session context. This results an empty xgmi ta
session id and later failures on all the xgmi ta cmd
invoked from vf. xgmi ta session initialization has to
be done after ih ip block hw_init call.

to unify xgmi session init/fini for both bare-metal
sriov virtualization use scenario, move xgmi ta init
to xgmi_add_device call, and accordingly terminate xgmi
ta session in xgmi_remove_device call.

The existing suspend/resume sequence will not be changed.

v2: squash in return fix from Nirmoy

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-06 15:04:36 -05:00
Christian König
9f3cc18d19 drm/amdgpu: rework synchronization of VM updates v4
If provided we only sync to the BOs reservation
object and no longer to the root PD.

v2: update comment, cleanup amdgpu_bo_sync_wait_resv
v3: use correct reservation object while clearing
v4: fix typo in amdgpu_bo_sync_wait_resv

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
4939d973b6 drm/amdgpu: simplify and fix amdgpu_sync_resv
No matter what we always need to sync to moves.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
fe6796ac12 drm/amdgpu: allow higher level PD invalidations
Allow partial invalidation on unallocated PDs. This is useful when we
need to silence faults to stop interrupt floods on Vega.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
7d28efe0c3 drm/amdgpu: return EINVAL instead of ENOENT in the VM code
That we can't find a PD above the root is expected can only happen if
we try to update a larger range than actually managed by the VM.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
bfcd6c69e4 drm/amdgpu: fix parentheses in amdgpu_vm_update_ptes
For the root PD mask can be 0xffffffff as well which would
overrun to 0 if we don't cast it before we add one.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
46cf5f7626 drm/amdgpu: make sure to never allocate PDs/PTs for invalidations
Make sure that we never allocate a page table for an invalidation operation.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
55cdd4e9b9 drm/amdgpu: drop unnecessary restriction for huge root PDEs
The root PD can also contain huge PDEs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
ef48d4b39e drm/amdgpu: stop using amdgpu_bo_gpu_offset in the VM backend
We need to update page tables without any lock held.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
5d3196605d drm/amdgpu: rework job synchronization v2
For unlocked page table updates we need to be able
to sync to fences of a specific VM.

v2: use SYNC_ALWAYS in the UVD code

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
114fbc3195 drm/amdgpu: use the VM as job owner
For HMM we need to rework how VM synchronization works, so instead of the filp use VM as job owner.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
81417bea87 drm/amdgpu: explicitly sync VM update to PDs/PTs
Explicitly sync VM updates to the moving fence in PDs and PTs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Nathan Chancellor
968162204a drm/amdgpu: Fix implicit enum conversion in gfx_v9_4_ras_error_inject
Clang warns:

../drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c:967:35: warning: implicit
conversion from enumeration type 'enum amdgpu_ras_block' to different
enumeration type 'enum ta_ras_block' [-Wenum-conversion]
        block_info.block_id = info->head.block;
                            ~ ~~~~~~~~~~~^~~~~
1 warning generated.

Use the function added in commit 828cfa2909 ("drm/amdgpu: Fix amdgpu
ras to ta enums conversion") that handles this conversion explicitly.

Fixes: 4c461d89db ("drm/amdgpu: add RAS support for the gfx block of Arcturus")
Link: https://github.com/ClangBuiltLinux/linux/issues/849
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:43 -05:00
Stephen Rothwell
7044cb6c20 amdgpu: using vmalloc requires includeing vmalloc.h
Fixes: 240c811ccd ("drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2)")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:42 -05:00
Nirmoy Das
977f7e1068 drm/amdgpu: allocate entities on demand
Currently we pre-allocate entities and fences for all the HW IPs on
context creation and some of which are might never be used.

This patch tries to resolve entity/fences wastage by creating entity
only when needed.

v2: allocate memory for entity and fences together

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:42 -05:00
Joseph Greathouse
18c6b74e7c drm/amdgpu: Enable DISABLE_BARRIER_WAITCNT for Arcturus
In previous gfx9 parts, S_BARRIER shader instructions are implicitly
S_WAITCNT 0 instructions as well. This setting turns off that
mechanism in Arcturus and beyond. With this, shaders must follow the
ISA guide insofar as putting in explicit S_WAITCNT operations even
after an S_BARRIER.

v2: Fix patch title to list component

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:27 -05:00
Linus Torvalds
9f68e3655a drm pull for 5.6-rc1
uapi:
 - dma-buf heaps added (and fixed)
 - command line add support for panel oreientation
 - command line allow overriding penguin count
 
 drm:
 - mipi dsi definition updates
 - lockdep annotations for dma_resv
 - remove dma-buf kmap/kunmap support
 - constify fb_ops in all fbdev drivers
 - MST fix for daisy chained hotplug-
 - CTA-861-G modes with VIC >= 193 added
 - fix drm_panel_of_backlight export
 - LVDS decoder support
 - more device based logging support
 - scanline alighment for dumb buffers
 - MST DSC helpers
 
 scheduler:
 - documentation fixes
 - job distribution improvements
 
 panel:
 - Logic PD type 28 panel support
 - Jimax8729d MIPI-DSI
 - igenic JZ4770
 - generic DSI devicetree bindings
 - sony acx424AKP panel
 - Leadtek LTK500HD1829
 - xinpeng XPP055C272
 - AUO B116XAK01
 - GiantPlus GPM940B0
 - BOE NV140FHM-N49
 - Satoz SAT050AT40H12R2
 - Sharp LS020B1DD01D panels.
 
 ttm:
 - use blocking WW lock
 
 i915:
 - hw/uapi state separation
 - Lock annotation improvements
 - selftest improvements
 - ICL/TGL DSI VDSC support
 - VBT parsing improvments
 - Display refactoring
 - DSI updates + fixes
 - HDCP 2.2 for CFL
 - CML PCI ID fixes
 - GLK+ fbc fix
 - PSR fixes
 - GEN/GT refactor improvments
 - DP MST fixes
 - switch context id alloc to xarray
 - workaround updates
 - LMEM debugfs support
 - tiled monitor fixes
 - ICL+ clock gating programming removed
 - DP MST disable sequence fixed
 - LMEM discontiguous object maps
 - prefaulting for discontiguous objects
 - use LMEM for dumb buffers if possible
 - add LMEM mmap support
 
 amdgpu:
 - enable sync object timelines for vulkan
 - MST atomic routines
 - enable MST DSC support
 - add DMCUB display microengine support
 - DC OEM i2c support
 - Renoir DC fixes
 - Initial HDCP 2.x support
 - BACO support for Arcturus
 - Use BACO for runtime PM power save
 - gfxoff on navi10
 - gfx10 golden updates and fixes
 - DCN support on POWER
 - GFXOFF for raven1 refresh
 - MM engine idle handlers cleanup
 - 10bpc EDP panel fixes
 - renoir watermark fixes
 - SR-IOV fixes
 - Arcturus VCN fixes
 - GDDR6 training fixes
 - freesync fixes
 - Pollock support
 
 amdkfd:
 - unify more codepath with amdgpu
 - use KIQ to setup HIQ rather than MMIO
 
 radeon:
 - fix vma fault handler race
 - PPC DMA fix
 - register check fixes for r100/r200
 
 nouveau:
 - mmap_sem vs dma_resv fix
 - rewrite the ACR secure boot code for Turing
 - TU10x graphics engine support (TU11x pending)
 - Page kind mapping for turing
 - 10-bit LUT support
 - GP10B Tegra fixes
 - HD audio regression fix
 
 hisilicon/hibmc:
 - use generic fbdev code and helpers
 
 rockchip:
 - dsi/px30 support
 
 virtio:
 - fb damage support
 - static some functions
 
 vc4:
 - use dma_resv lock wrappers
 
 msm:
 - use dma_resv lock wrappers
 - sc7180 display + DSI support
 - a618 support
 - UBWC support improvements
 
 vmwgfx:
 - updates + new logging uapi
 
 exynos:
 - enable/disable callback cleanups
 
 etnaviv:
 - use dma_resv lock wrappers
 
 atmel-hlcdc:
 - clock fixes
 
 mediatek:
 - cmdq support
 - non-smooth cursor fixes
 - ctm property support
 
 sun4i:
 - suspend support
 - A64 mipi dsi support
 
 rcar-du:
 - Color management module support
 - LVDS encoder dual-link support
 - R8A77980 support
 
 analogic:
 - add support for an6345
 
 ast:
 - atomic modeset support
 - primary plane garbage fix
 
 arcgpu:
 - fixes for fourcc handling
 
 tegra:
 - minor fixes and improvments
 
 mcde:
 - vblank support
 
 meson:
 - OSD1 plane AFBC commit
 
 gma500:
 - add pageflip support
 - reomve global drm_dev
 
 komeda:
 - tweak debugfs output
 - d32 support
 - runtime PM suppotr
 
 udl:
 - use generic shmem helpers
 - cleanup and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJeMm6RAAoJEAx081l5xIa+vN8P/0j4jEOv+KIinAhoH+LG3EpD
 m2TUuu5OQIoBrcCoWOgFBk3wqYpw6PdMBdkXh+5sE5lfeBynp8oC3Bin+QsHJE05
 eGBpZtHe+70MQb0Eha+Aic0hchvBKzRnq6i0MYSIHn6afs76dLmF8knTjycxrvV5
 Xu1Z3WDmjzqgWF9ja5JCD6fby11seP5RrwObYKVikO35QQyJJwGSGKgu5rq/pByK
 /n0PCnCOINuL0Lz6J9qexdh/0/XYFQilRC31GJNlKbDSFuECF0GOEzEE/xUBW/pI
 dLh2YwIIygm18Gar9PgvMwXJn3BfzQ0qEJsf+HlQeNw9iLgbHpp2AsTxHTE87OGe
 R/y85taW3jGjPsNOKZOeLpvg/Ro8l8ZipLApvDCG2O22DThg/cd6NDjZxl1FJfRH
 acDG/JdgPo5MbdRAH/cM1WuFS9gEM+0BeSQ5gCjtPakF+X4Vz+ABFDLMRJoaejkJ
 q8DG32TQXELQx0RMghsqK7YCWGfl+2alA1u9w6TgJh9Rq4iVckvpDeqAZnK1Adkc
 87g957Tl0n6FA4wJj/t5jrceiLRMJAm/rBK+R3GZNfWrgx4bHbCmb4fZDZsrFzph
 nbAjNJ5kOchrFCaRR47ULby6+Q14MAFbkWq4Crfu4YDdzUkTPpep6pi2GIe8w0rV
 P0hdYOYJf6LUda0utuQX
 =oFrI
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Davbe Airlie:
 "This is the main pull request for graphics for 5.6. Usual selection of
  changes all over.

  I've got one outstanding vmwgfx pull that touches mm so kept it
  separate until after all of this lands. I'll try and get it to you
  soon after this, but it might be early next week (nothing wrong with
  code, just my schedule is messy)

  This also hits a lot of fbdev drivers with some cleanups.

  Other notables:
   - vulkan timeline semaphore support added to syncobjs
   - nouveau turing secureboot/graphics support
   - Displayport MST display stream compression support

  Detailed summary:

  uapi:
   - dma-buf heaps added (and fixed)
   - command line add support for panel oreientation
   - command line allow overriding penguin count

  drm:
   - mipi dsi definition updates
   - lockdep annotations for dma_resv
   - remove dma-buf kmap/kunmap support
   - constify fb_ops in all fbdev drivers
   - MST fix for daisy chained hotplug-
   - CTA-861-G modes with VIC >= 193 added
   - fix drm_panel_of_backlight export
   - LVDS decoder support
   - more device based logging support
   - scanline alighment for dumb buffers
   - MST DSC helpers

  scheduler:
   - documentation fixes
   - job distribution improvements

  panel:
   - Logic PD type 28 panel support
   - Jimax8729d MIPI-DSI
   - igenic JZ4770
   - generic DSI devicetree bindings
   - sony acx424AKP panel
   - Leadtek LTK500HD1829
   - xinpeng XPP055C272
   - AUO B116XAK01
   - GiantPlus GPM940B0
   - BOE NV140FHM-N49
   - Satoz SAT050AT40H12R2
   - Sharp LS020B1DD01D panels.

  ttm:
   - use blocking WW lock

  i915:
   - hw/uapi state separation
   - Lock annotation improvements
   - selftest improvements
   - ICL/TGL DSI VDSC support
   - VBT parsing improvments
   - Display refactoring
   - DSI updates + fixes
   - HDCP 2.2 for CFL
   - CML PCI ID fixes
   - GLK+ fbc fix
   - PSR fixes
   - GEN/GT refactor improvments
   - DP MST fixes
   - switch context id alloc to xarray
   - workaround updates
   - LMEM debugfs support
   - tiled monitor fixes
   - ICL+ clock gating programming removed
   - DP MST disable sequence fixed
   - LMEM discontiguous object maps
   - prefaulting for discontiguous objects
   - use LMEM for dumb buffers if possible
   - add LMEM mmap support

  amdgpu:
   - enable sync object timelines for vulkan
   - MST atomic routines
   - enable MST DSC support
   - add DMCUB display microengine support
   - DC OEM i2c support
   - Renoir DC fixes
   - Initial HDCP 2.x support
   - BACO support for Arcturus
   - Use BACO for runtime PM power save
   - gfxoff on navi10
   - gfx10 golden updates and fixes
   - DCN support on POWER
   - GFXOFF for raven1 refresh
   - MM engine idle handlers cleanup
   - 10bpc EDP panel fixes
   - renoir watermark fixes
   - SR-IOV fixes
   - Arcturus VCN fixes
   - GDDR6 training fixes
   - freesync fixes
   - Pollock support

  amdkfd:
   - unify more codepath with amdgpu
   - use KIQ to setup HIQ rather than MMIO

  radeon:
   - fix vma fault handler race
   - PPC DMA fix
   - register check fixes for r100/r200

  nouveau:
   - mmap_sem vs dma_resv fix
   - rewrite the ACR secure boot code for Turing
   - TU10x graphics engine support (TU11x pending)
   - Page kind mapping for turing
   - 10-bit LUT support
   - GP10B Tegra fixes
   - HD audio regression fix

  hisilicon/hibmc:
   - use generic fbdev code and helpers

  rockchip:
   - dsi/px30 support

  virtio:
   - fb damage support
   - static some functions

  vc4:
   - use dma_resv lock wrappers

  msm:
   - use dma_resv lock wrappers
   - sc7180 display + DSI support
   - a618 support
   - UBWC support improvements

  vmwgfx:
   - updates + new logging uapi

  exynos:
   - enable/disable callback cleanups

  etnaviv:
   - use dma_resv lock wrappers

  atmel-hlcdc:
   - clock fixes

  mediatek:
   - cmdq support
   - non-smooth cursor fixes
   - ctm property support

  sun4i:
   - suspend support
   - A64 mipi dsi support

  rcar-du:
   - Color management module support
   - LVDS encoder dual-link support
   - R8A77980 support

  analogic:
   - add support for an6345

  ast:
   - atomic modeset support
   - primary plane garbage fix

  arcgpu:
   - fixes for fourcc handling

  tegra:
   - minor fixes and improvments

  mcde:
   - vblank support

  meson:
   - OSD1 plane AFBC commit

  gma500:
   - add pageflip support
   - reomve global drm_dev

  komeda:
   - tweak debugfs output
   - d32 support
   - runtime PM suppotr

  udl:
   - use generic shmem helpers
   - cleanup and fixes"

* tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm: (1998 commits)
  drm/nouveau/fb/gp102-: allow module to load even when scrubber binary is missing
  drm/nouveau/acr: return error when registering LSF if ACR not supported
  drm/nouveau/disp/gv100-: not all channel types support reporting error codes
  drm/nouveau/disp/nv50-: prevent oops when no channel method map provided
  drm/nouveau: support synchronous pushbuf submission
  drm/nouveau: signal pending fences when channel has been killed
  drm/nouveau: reject attempts to submit to dead channels
  drm/nouveau: zero vma pointer even if we only unreference it rather than free
  drm/nouveau: Add HD-audio component notifier support
  drm/nouveau: fix build error without CONFIG_IOMMU_API
  drm/nouveau/kms/nv04: remove set but not used variable 'width'
  drm/nouveau/kms/nv50: remove set but not unused variable 'nv_connector'
  drm/nouveau/mmu: fix comptag memory leak
  drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc
  drm/nouveau/pmu/gm20b,gp10b: Fix Falcon bootstrapping
  drm/exynos: Rename Exynos to lowercase
  drm/exynos: change callback names
  drm/mst: Don't do atomic checks over disabled managers
  drm/amdgpu: add the lost mutex_init back
  drm/amd/display: skip opp blank or unblank if test pattern enabled
  ...
2020-01-30 08:04:01 -08:00
Alex Deucher
2cb44fb093 drm/amdgpu: enable GPU reset by default on renoir
Everything is in place.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Alex Deucher
658c663947 drm/amdgpu: enable GPU reset by default on Navi
Has been working fine for a while.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Christian König
77171eade8 drm/amdgpu: add coreboot workaround for KV/KB
Coreboot seems to have a problem correctly setting up access to the stolen VRAM
on KV/KB. Use the direct access only when necessary.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-and-tested-by: Fredrik Bruhn <fredrik.bruhn@unibap.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Alex Deucher
276cc92945 drm/amdgpu: original raven doesn't support full asic reset
So don't use it.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Alex Deucher
7af2a5771e drm/amdgpu: attempt to enable gfxoff on more raven1 boards (v2)
Switch to a blacklist so we can disable specific boards
that are problematic.

v2: make the blacklist non-raven specific.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Colin Ian King
b20dcd72c1 drm/amd/amdgpu: fix spelling mistake "to" -> "too"
There is a spelling mistake in a DRM_ERROR message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
xinhui pan
f583cc57ba drm/amdgpu: initialize bo_va_list when add gws to process
bo_va_list is list_head, so initialize it.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
55bbb747ec drm/amdgpu/vcn: use inst_idx relacing inst
Use inst_idx relacing inst in SOC15_DPG_MODE macro to avoid confusion.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
a455573214 drm/amdgpu/vcn: fix typo error
Fix typo error, should be inst_idx instead of inst.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
326b523eeb drm/amdgpu/vcn: fix vcn2.5 instance issue
Fix vcn2.5 instance issue, vcn0 and vcn1 have same register offset

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
62884a7bf3 drm/amdgpu/vcn2.5: fix a bug for the 2nd vcn instance (v2)
Fix a bug for the 2nd vcn instance at start and stop.

v2: squash in unused label removal.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
b650121726 drm/amdgpu/vcn: Share vcn_v2_0_dec_ring_test_ring to vcn2.5
Share vcn_v2_0_dec_ring_test_ring to vcn2.5 to support
vcn software ring.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
Felix Kuehling
fa34edbed4 drm/amdgpu: Use the correct flush_type in flush_gpu_tlb_pasid
The flush_type was incorrectly hard-coded to 0 when calling falling back
to MMIO-based invalidation in flush_gpu_tlb_pasid.

Fixes: ea930000a6 ("drm/amdgpu: export function to flush TLB via pasid")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
Felix Kuehling
37c58ddf57 drm/amdgpu: Fix TLB invalidation request when using semaphore
Use a more meaningful variable name for the invalidation request
that is distinct from the tmp variable that gets overwritten when
acquiring the invalidation semaphore.

Fixes: 4ed8a03740 ("drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:36 -05:00
Chris Wilson
7e13ad8964 drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
Since drm_global_mutex is a true global mutex across devices, we don't
want to acquire it unless absolutely necessary. For maintaining the
device local open_count, we can use atomic operations on the counter
itself, except when making the transition to/from 0. Here, we tackle the
easy portion of delaying acquiring the drm_global_mutex for the final
release by using atomic_dec_and_mutex_lock(), leaving the global
serialisation across the device opens.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200124130107.125404-1-chris@chris-wilson.co.uk
2020-01-24 17:41:34 +00:00
Alex Deucher
23fe1390c7 drm/amdgpu: remove the experimental flag for renoir
Should work properly with the latest sbios on 5.5 and newer
kernels.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-23 12:14:53 -05:00
Nirmoy Das
63e3ab9a82 drm/amdgpu: individualize fence allocation per entity
Allocate fences for each entity and remove ctx->fences reference as
fences should be bound to amdgpu_ctx_entity instead amdgpu_ctx.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Tianci.Yin
7db1d560a4 Revert "drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training enabled(V5)"
This reverts commit 9e44147862.

The patch will be replaced with a better solution, revert it.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Tianci.Yin
240c811ccd drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2)
[why]
In GDDR6 BIST training, a certain mount of bottom VRAM will be encroached by
UMC, that causes problems(like GTT corrupted and page fault observed).

[how]
Saving the content of this bottom VRAM to system memory before training, and
restoring it after training to avoid VRAM corruption.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Nirmoy Das
a9d4fe2fd6 drm/amdgpu: remove unnecessary conversion to bool
Better clean that up before some automation starts to complain about it

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Dennis Li
4c461d89db drm/amdgpu: add RAS support for the gfx block of Arcturus
Implement functions to do the RAS error injection and
query EDC counter.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:36:30 -05:00
Dennis Li
504c5e72d7 drm/amdgpu: abstract EDC counter clear to a separated function
1. Add IP prefix for the IP related codes.
2. Refactor the code to clear EDC counter.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:36:14 -05:00
Dennis Li
5e66403e4d drm/amdgpu: refine the security check for RAS functions
To avoid calling RAS related functions when RAS feature isn't
supported in hardware. Change to check supported features, instead
of checking asic type.

v2: reuse amdgpu_ras_is_supported function, instead of introducing
a new flag for hardware ras feature.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:36:04 -05:00
Dennis Li
39aa0ef163 drm/amdgpu: enable RAS feature for more mmhub sub-blocks of Acrturus
Compared with Vg20, the size of mmhub range is changed from 2 to 8.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:35:56 -05:00
chen gong
e3cd03603d drm/amdgpu: read gfx register using RREG32_KIQ macro
Reading CP_MEM_SLP_CNTL register with RREG32_SOC15 macro will lead to
hang when GPU is in "gfxoff" state.
I do a uniform substitution here.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:29 -05:00
chen gong
c68dbcd8f9 drm/amdgpu: add kiq version interface for RREG32/WREG32
Reading some registers by mmio will result in hang when GPU is in
"gfxoff" state.This problem can be solved by GPU in "ring command
packages" way.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:22 -05:00
chen gong
d33a99c4b6 drm/amdgpu: provide a generic function interface for reading/writing register by KIQ
Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c,
and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and
flexible.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:14 -05:00
John Clements
a6c44d2538 drm/amdgpu: added support to get mGPU DRAM base
resolves issue with RAS error injection in mGPU configuration

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:07 -05:00
Alex Sierra
36a1707afd drm/amdgpu: modify packet size for pm4 flush tlbs
[Why]
PM4 packet size for flush message was oversized.

[How]
Packet size adjusted to allocate flush + fence packets.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:33:52 -05:00
Pan, Xinhui
bd05221123 drm/amdgpu: add the lost mutex_init back
Initialize notifier_lock.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/1016
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-17 16:03:09 -05:00
Hawking Zhang
e9d4cf918f drm/amdgpu: add arcturus to gpu recovery check code path
support check if dirver should try gpu recovery for
arcturus

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:40:54 -05:00
Hawking Zhang
93af20f74e drm/amdgpu: check if driver should try recovery in ras recovery path
To allow the flexibilty for user to disable gpu recovery
in RAS recovery path by module parameter amdgpu_gpu_recovery

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:40:47 -05:00
Evan Quan
2ac0d68697 drm/amd/powerplay: a quick fix for the deadlock issue below
NFO: task ocltst:2028 blocked for more than 120 seconds.
     Tainted: G           OE     5.0.0-37-generic #40~18.04.1-Ubuntu
echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cltst          D    0  2028   2026 0x00000000
all Trace:
__schedule+0x2c0/0x870
schedule+0x2c/0x70
schedule_preempt_disabled+0xe/0x10
__mutex_lock.isra.9+0x26d/0x4e0
__mutex_lock_slowpath+0x13/0x20
? __mutex_lock_slowpath+0x13/0x20
mutex_lock+0x2f/0x40
amdgpu_dpm_set_powergating_by_smu+0x64/0xe0 [amdgpu]
gfx_v8_0_enable_gfx_static_mg_power_gating+0x3c/0x70 [amdgpu]
gfx_v8_0_set_powergating_state+0x66/0x260 [amdgpu]
amdgpu_device_ip_set_powergating_state+0x62/0xb0 [amdgpu]
pp_dpm_force_performance_level+0xe7/0x100 [amdgpu]
amdgpu_set_dpm_forced_performance_level+0x129/0x330 [amdgpu]

Fixes: a64c9e15e6 ("drm/amd/powerplay: cleanup the interfaces for powergate setting through SMU")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: Rui Teng <Rui.Teng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:38:23 -05:00
Huang Rui
0e5b7a9528 drm/amdgpu: only set cp active field for kiq queue
The mec ucode will set the CP_HQD_ACTIVE bit while the queue is mapped by
MAP_QUEUES packet. So we only need set cp active field for kiq queue.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:38:16 -05:00
Alex Deucher
27414cd42a drm/amdgpu/pm: clean up return types
count is size_t so don't use negative values.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:38:02 -05:00
James Zhu
0c0dab86d9 drm/amdgpu/vcn2.5: implement indirect DPG SRAM mode
Implement indirect DPG SRAM mode for vcn2.5

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:47 -05:00
James Zhu
8484df9601 drm/amdgpu/vcn2.5: add dpg pause mode
Add dpg pause mode support for vcn2.5

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:41 -05:00
James Zhu
d2a2c64f53 drm/amdgpu/vcn2.5: add DPG mode start and stop
Add DPG mode start and stop functions for vcn2.5

v2: Correct firmware ucode index in vcn_v2_5_mc_resume_dpg_mode

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:34 -05:00
James Zhu
45cec87cd6 drm/amdgpu/vcn: move macro from vcn2.0 to share amdgpu_vcn (v2)
Move macro from vcn2.0 to amdgpu_vcn to share with vcn2.5

v2: squash in macro fix

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:34 -05:00
James Zhu
5db86843e8 drm/amdgpu/vcn: support multiple instance direct SRAM read and write (v2)
Add multiple instance direct SRAM read and write support for vcn2.5

v2: squash in indexing fix

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:36:47 -05:00
James Zhu
597e6ac3a7 drm/amdgpu/vcn: support multiple-instance dpg pause mode
Add multiple-instance dpg pause mode support for VCN2.5

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:51 -05:00
Tianci.Yin
9e44147862 drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training enabled(V5)
[why]
In dual GPUs scenario, stolen_size is assigned to zero on the secondary GPU,
since there is no pre-OS console using that memory. Then the bottom region of
VRAM was allocated as GTT, unfortunately a small region of bottom VRAM was
encroached by UMC firmware during GDDR6 BIST training, this cause page fault.

[how]
Forcing stolen_size to 3MB, then the bottom region of VRAM was
allocated as stolen memory, GTT corruption avoid.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:37 -05:00
Tianci.Yin
6a1094ab68 drm/amdgpu/gfx10: update gfx golden settings for navi14
remove registers: mmSPI_CONFIG_CNTL
add registers: mmSPI_CONFIG_CNTL_1

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:30 -05:00
Tianci.Yin
7b7041f892 drm/amdgpu/gfx10: update gfx golden settings
remove registers: mmSPI_CONFIG_CNTL
add registers: mmSPI_CONFIG_CNTL_1

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:23 -05:00
shaoyunl
b4df2823ec drm/amdgpu: check rlc_g firmware pointer is valid before using it
In SRIOV, rlc_g firmware is loaded by host, guest driver won't load it which will
cause the rlc_fw pointer is null

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:17 -05:00
Christian König
971fe55545 drm/amdgpu: drop amdgpu_job.owner
Entirely unused.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:10 -05:00
Nirmoy Das
55414ad5c9 drm/amdgpu: error out on entity with no run queue
Disabled HW IP's entity initialized with NULL rq. We should not
process any submit request from userspace for a disabled HW IP.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:03 -05:00
Huang Rui
8eee00f615 drm/amdkfd: use map_queues for hiq on gfx v10 as well
To align with gfx v9, we use the map_queues packet to load hiq MQD.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:57 -05:00
Aaron Liu
35cd89d5a6 drm/amdkfd: use kiq to load the mqd of hiq queue for gfx v9 (v6)
There is an issue that CP will check the HIQ queue to be configured and mapped
with KIQ ring, otherwise, it will be unable to read back the secure buffer while
the gfxoff is enabled even with trusted IP blocks.

v1 -> v2:
- Fix to remove surplus set_resources packets.
- Fill the whole configuration in MQD.
- Change the author as Aaron because he addressed the key point of this issue.
- Add kiq ring lock.

v2 -> v3:
- Free the lock while in error return case.
- Remove the programming only needed by the queue is unmapped.

v3 -> v4:
- Remove doorbell programming because it's used for restarting queue.
- Remove CP scheduler programming because map_queue packet will handle this.

v4 -> v5:
- Remove cp_hqd_active because mec ucode will enable it while use map_queues.
- Revise goto out_unlock.
- Correct the right doorbell offset for HIQ that kfd driver assigned in the
  packet.

v5 -> v6:
- Merge Arcturus fix into this patch because it will get oops in Arcturus
  platform.

Reported-by: Lisa Saturday <Lisa.Saturday@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:50 -05:00
Alex Sierra
d175e9acf6 drm/amdgpu: flush TLB functions removal from kfd2kgd interface
[Why]
kfd2kgd interface will be deprecated. This removal only covers TLB
invalidation for now. They have been replaced in amdgpu_amdkfd API.

[How]
TLB invalidate functions removed from the different amdkfd_gfx_v*
versions.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:42 -05:00
Alex Sierra
ffa022696f drm/amdgpu: GPU TLB flush API moved to amdgpu_amdkfd
[Why]
TLB flush method has been deprecated using kfd2kgd interface.
This implementation is now on the amdgpu_amdkfd API.

[How]
TLB flush functions now implemented in amdgpu_amdkfd.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:33 -05:00
Alex Sierra
ea930000a6 drm/amdgpu: export function to flush TLB via pasid
This can be used directly from amdgpu and amdkfd to invalidate
TLB through pasid.
It supports gmc v7, v8, v9 and v10.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:27 -05:00
Alex Sierra
4f01f1e58e drm/amdgpu: replace kcq enable/disable functions on gfx_v9
[Why]
There are HW-indpendent functions that enables and disables kcq. These functions use
the kiq_pm4_funcs implementation.

[How]
Local kcq enable and disable functions removed and replace it by the generic kcq
enable under amdgpu_gfx

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:19 -05:00
Alex Sierra
58e508b6be drm/amdgpu: implement tlbs invalidate on gfx9 gfx10
tlbs invalidate pointer function added to kiq_pm4_funcs struct.
This way, tlb flush can be done through kiq member.
TLBs invalidatation implemented for gfx9 and gfx10.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:11 -05:00
Alex Sierra
f167ea6a14 drm/amdgpu: kiq pm4 function implementation for gfx_v9
Functions implemented from kiq_pm4_funcs struct members
for gfx_v9 version.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:04 -05:00
Alex Sierra
a269e44989 drm/amdgpu: Avoid reclaim fs while eviction lock
[Why]
Avoid reclaim filesystem while eviction lock is held called from
MMU notifier.

[How]
Setting PF_MEMALLOC_NOFS flags while eviction mutex is locked.
Using memalloc_nofs_save / memalloc_nofs_restore API.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:33:16 -05:00
Christian König
5e791166d3 drm/ttm: nuke invalidate_caches callback
Another completely unused feature.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/348265/
2020-01-16 16:35:07 +01:00
Aaron Liu
f2360e333b drm/amdgpu: update goldensetting for renoir
Update mmSDMA0_UTCL1_WATERMK golden setting for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-15 17:52:21 -05:00
Alex Deucher
a9ffe2a983 drm/amdgpu/debugfs: properly handle runtime pm
If driver debugfs files are accessed, power up the GPU
when necessary.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:20:34 -05:00
Alex Deucher
b9a9294b91 drm/amdgpu/pm: properly handle runtime pm
If power management sysfs or debugfs files are accessed,
power up the GPU when necessary.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:20:34 -05:00
Flora Cui
f81110b852 drm/amdgpu: add header file for macro SZ_1M
Fixes: 4dee6e4ca5 ("drm/amdgpu: use linux size macro to simplify ONE_Kib & One_Mib")
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:58 -05:00
Alex Deucher
a2e4b418c6 drm/amdgpu/psp: declare navi1x ta firmware
So that it gets included in the initrd.  At the moment
this is optional firmware that contains support for HDCP.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:58 -05:00
Joseph Greathouse
22d39fe729 drm/amdgpu: Match TC hash settings to DF settings (v2)
On Arcturus, data fabric hashing is set by the VBIOS, and
affects which addresses map to which memory channels. The
gfx core's caches also need to know this mapping, but the
hash settings for these these caches is set by the driver.

This change queries the DF to understand how the VBIOS
configured DF, then matches the TC hash configuration bits
to do the same thing.

v2: squash in warning fix

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:58 -05:00
Joseph Greathouse
bdf84a80e0 drm/amdgpu: Create generic DF struct in adev
The only data fabric information the adev struct currently
contains is a function pointer table. In the near future,
we will be adding some cached DF information into adev. As
such, this patch creates a new amdgpu_df struct for adev.
Right now, it only containst the old function pointer table,
but new stuff will be added soon.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:41 -05:00
John Clements
eee2eabafe drm/amdgpu: preserve RSMU UMC index mode state
between UMC RAS err register access restore previous RSMU UMC index mode state

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
John Clements
9c8c81fe7d drm/amdgpu: disable XGMI TA unload for arcturus
in event of GPU reset, XGMI TA unload causes unrecoverable GPU hang

Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Aaron Liu
d8459d1b7f drm/amdgpu: update goldensetting for renoir
Update mmSDMA0_UTCL1_WATERMK golden setting for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Alex Deucher
1499bcc7a2 drm/amdgpu/gmc10: free stolen memory in late_init
We don't need to store the pre-OS console memory after
the driver has loaded so free it.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Alex Deucher
bbde7162f7 drm/amdgpu/gmc10: remove dead code
Leftover from bring up.  We look up the actual pre-OS memory usage
value later in the same function.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Alex Deucher
403c1ef0d2 drm/amdgpu: enable S/G display on PCO and RV2 (v2)
It should work on all Raven variants, but some users have
reported issues with original Raven with IOMMU enabled.
So far there have been no issues observed with PCO or RV2.

v2: split out the dm init and domain changes into separate
    patches.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Alex Deucher
d44394a9e1 drm/amdgpu/gfx9: remove unused sdma headers
All of the sdma stuff these were used for moves to
the sdma code, so remove them.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Hawking Zhang
49da2ccd2d drm/amdgpu: check sdma ras funcs pointer before accessing
sdma ras funcs are not supported by ASIC prior
to vega20

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Guchun Chen
5d4667ec33 drm/amdgpu: calculate MCUMC_ADDRT0 per asic's UMC offset
Hardcoded offset is not friendly. And another benifit of this
patch is to keep read and write access to this register be
consistent with other similar UMC regsiters  in this file.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Tiecheng Zhou
df5e984c8b drm/amdgpu/sriov: workaround on rev_id for Navi12 under sriov
guest vm gets 0xffffffff when reading RCC_DEV0_EPF0_STRAP0,
as a consequence, the rev_id and external_rev_id are wrong.

workaround it by hardcoding the rev_id to 0, which is the default value.

v2. add comment in the code

Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Hawking Zhang
5e62db9df6 drm/amdgpu: read sdma edc counter to clear the counters
SDMA edc counter registers were added in gfx edc counters
array. When querying gfx error counter in that array, there
is no way to differentiate sdma instance number for different
asic and then results to NULL pointer access when trying to
read sdma register base address for instances greater
than 2 on Vega20.
In addition, this also results to wrong gfx error counters
since it actually added sdma edc counters.
Therefore, sdma edc counter registers should be separated
from gfx edc counter regsiter array and only get initialized
when driver tries to enable sdma ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Hawking Zhang
1dd5ead294 drm/amdgpu: add ras_late_init and ras_fini for sdma v4
move ras_late_init and ras_fini to sdma_ras_funcs table

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Hawking Zhang
3e81ee9a78 drm/amdgpu: support error reporting for sdma ip block
invoke sdma query_ras_error_count to get sdma single
bit error count

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Hawking Zhang
93070deb58 drm/amdgpu: add query_ras_error_count function for sdma v4
query_ras_error_count function will be invoked to query
single bit error count detected in sdma ip block

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Leo Liu
e7ddb87848 drm/amdgpu: enable VCN2.5 IP block for Arcturus
With default PSP FW loading

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Leo Liu
2d6605911d drm/amdgpu/vcn2.5: fix PSP FW loading for the second instance
ucodes for instances are from different location

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Nirmoy Das
5021e9a831 drm/amdgpu: catch amdgpu_irq_add_id failure
Do not ignore amdgpu_irq_add_id return value while registering
VMC page fault interrupt.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Evan Quan
9530273ec9 drm/amd/powerplay: cover the powerplay implementation details V3
This can save users much troubles. As they do not
actually need to care whether swSMU or traditional
powerplay routine should be used.

V2: apply the fixes to vi.c and cik.c also
V3: squash in oops fix

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Yong Zhao
a434b94c5a drm/amdkfd: Improve function get_sdma_rlc_reg_offset() (v2)
The SOC15_REG_OFFSET() macro needs to dereference adev->reg_offset[IP]
pointer, which is sometimes NULL when there are fewer than 8 sdma engines.
Avoid that by not initializing the array regardless.

v2: squash in warning fixes

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:17:05 -05:00
Alex Deucher
2cacd20e91 Revert "drm/amdgpu: Set no-retry as default."
This reverts commit 51bfac71ca.

This causes stability issues on some raven boards.  Revert
for now until a proper fix is completed.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/934
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206017
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:04:02 -05:00
Alex Deucher
48ccd5ffe5 drm/amdgpu/gfx: simplify old firmware warning
Put it on one line to avoid whitespace issues when
printing in the log.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:03:54 -05:00
Alex Deucher
5677c52090 drm/amdgpu/gmc10: use common invalidation engine helper
Rather than open coding it.  This also changes the free masks
to better reflect the usage by other components.

Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:03:47 -05:00
Alex Deucher
bdbe90f04d drm/amdgpu/gmc: move invaliation bitmap setup to common code
So it can be shared with newer GMC versions.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:03:42 -05:00
John Clements
c8aa6ae30c drm/amdgpu: updated UMC error address record with correct channel index
defined macros for repetitive for loops

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:02:08 -05:00
John Clements
0ee51f1d94 drm/amdgpu: resolved bug in UMC RAS CE query
switch CE counter register access' to use SMN

disable UMC indexing mode

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:02:01 -05:00
Evan Quan
a64c9e15e6 drm/amd/powerplay: cleanup the interfaces for powergate setting through SMU
Provided an unified entry point. And fixed the confusing that the API
usage is conflict with what the naming implies.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:01:54 -05:00
Zhigang Luo
25344d7e98 drm/amd/amdgpu: L1 Policy(5/5) - removed IH_CHICKEN from VF
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Jane Jian <jane.jian@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:00:46 -05:00
Zhigang Luo
2ee9403e81 drm/amd/amdgpu: L1 Policy(3/5) - removed ECC interrupt from VF
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Jane Jian <jane.jian@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:00:40 -05:00
Zhigang Luo
08546895bc drm/amd/amdgpu: L1 Policy(2/5) - removed GC GRBM violations from gfxhub
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Jane Jian <jane.jian@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:00:33 -05:00
Zhigang Luo
20bf2f6fef drm/amd/amdgpu: L1 Policy(1/5) - removed VM settings for mmhub and gfxhub from VF
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Jane Jian <jane.jian@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:00:17 -05:00
John Clements
61130c7432 drm/amdgpu: removed GFX RAS support check in UMC ECC callback
enable GPU recovery in event of uncorrectable UMC error

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:59:37 -05:00
John Clements
097dc53ee9 drm/amdgpu: added function to wait for PSP BL availability
reduced duplicate code

increased wait time for PSP BL readiness

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:50 -05:00
Kevin Wang
4dee6e4ca5 drm/amdgpu: use linux size macro to simplify ONE_Kib & One_Mib
replace internal size macro with linux size macro

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:44 -05:00
John Clements
bd68fb94b3 drm/amdgpu: resolve bug in UMC 6 error counter query
iterate over all error counter registers in SMN space

removed support error counter access via MMIO

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:37 -05:00
Jack Zhang
895bd048fb amd/amdgpu/sriov tdr enablement with pp_onevf_mode
Under sriov and pp_onevf mode,
1.take resume instead of hw_init for smc recover to avoid
potential memory leak.

2.add return condition inside smc resume function for
sriov_pp_onevf_mode and pm_enabled param.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:19 -05:00
Jack Zhang
c2a801af31 amd/amdgpu/sriov enable onevf mode for ARCTURUS VF
Before, initialization of smu ip block would be skipped
for sriov ASICs. But if there's only one VF being used,
guest driver should be able to dump some HW info such as
clks, temperature,etc.

To solve this, now after onevf mode is enabled, host
driver will notify guest. If it's onevf mode, guest will
do smu hw_init and skip some steps in normal smu hw_init
flow because host driver has already done it for smu.

With this fix, guest app can talk with smu and dump hw
information from smu.

v2: refine the logic for pm_enabled.Skip hw_init by not
changing pm_enabled.
v3: refine is_support_sw_smu and fix some indentation
issue.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:10 -05:00
John Clements
955c712007 drm/amdgpu: update UMC 6.1 RAS error counter register access path
use proper method for SMN register access

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:57:48 -05:00
John Clements
34e48caee4 drm/amdgpu: amalgamated PSP TA invoke functions
reduce duplicate code

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:57:15 -05:00
John Clements
1f455f2580 drm/amdgpu: amalgamate PSP TA load/unload functions
reduce duplicate code

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:57:09 -05:00
John Clements
e6e193c00d drm/amdgpu: by default output PSP ret status in event of cmd failure
update log level from DRM_DEBUG_DRIVER to DRM_WARN

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:57:01 -05:00
Evan Quan
0753e56e9a drm/amdgpu: correct RLC firmwares loading sequence
Per confirmation with RLC firmware team, the RLC should
be unhalted after all RLC related firmwares uploaded.
However, in fact the RLC is unhalted immediately after
RLCG firmware uploaded. And that may causes unexpected
PSP hang on loading the succeeding RLC save restore
list related firmwares.
So, we correct the firmware loading sequence to load
RLC save restore list related firmwares before RLCG
ucode. That will help to get around this issue.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:56:52 -05:00
Evan Quan
b8ab58f350 drm/amd/powerplay: add check for baco support on Arcturus
This is used to determine whether runtime pm can be
supported or not.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:56:43 -05:00
Alex Deucher
c00ca07f64 Revert "drm/amdgpu: simplify ATPX detection"
This reverts commit f5fda6d89a.

You can't use BASE_CLASS in pci_get_class.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/995
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:55:56 -05:00
Guchun Chen
8831fa6e89 drm/amdgpu: simplify function return logic
Former return logic is redundant.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:55:29 -05:00
Chunming Zhou
b6025eeaa1 drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu
Can expose it now that the khronos has exposed the
vlk extension.

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:54:30 -05:00
Chunming Zhou
db4ff423cd drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu
Can expose it now that the khronos has exposed the
vlk extension.

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-01-07 11:33:32 -05:00
Alex Deucher
7aec9ec1cf Revert "drm/amdgpu: Set no-retry as default."
This reverts commit 51bfac71ca.

This causes stability issues on some raven boards.  Revert
for now until a proper fix is completed.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/934
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206017
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-01-07 11:27:03 -05:00
Evan Quan
969e115292 drm/amdgpu: correct RLC firmwares loading sequence
Per confirmation with RLC firmware team, the RLC should
be unhalted after all RLC related firmwares uploaded.
However, in fact the RLC is unhalted immediately after
RLCG firmware uploaded. And that may causes unexpected
PSP hang on loading the succeeding RLC save restore
list related firmwares.
So, we correct the firmware loading sequence to load
RLC save restore list related firmwares before RLCG
ucode. That will help to get around this issue.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-01-01 09:26:09 -05:00
changzhu
e0c6381235 drm/amdgpu: enable gfxoff for raven1 refresh
When smu version is larger than 0x41e2b, it will load
raven_kicker_rlc.bin.To enable gfxoff for raven_kicker_rlc.bin,it
needs to avoid adev->pm.pp_feature &= ~PP_GFXOFF_MASK when it loads
raven_kicker_rlc.bin.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-29 02:00:27 -05:00
Alex Deucher
5d30ed3c2c Revert "drm/amdgpu: simplify ATPX detection"
This reverts commit f5fda6d89a.

You can't use BASE_CLASS in pci_get_class.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/995
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-29 01:53:27 -05:00
zhengbin
e95cd6b2ac drm/amdgpu: use true, false for bool variable in amdgpu_psp.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:674:2-26: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:794:1-25: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:897:2-36: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1016:1-35: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1087:2-34: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1177:1-33: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 15:00:02 -05:00
zhengbin
c5b2bd5d39 drm/amdgpu: use true, false for bool variable in amdgpu_debugfs.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:132:2-10: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:140:2-10: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:142:13-21: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 15:00:01 -05:00
zhengbin
2a9b90ae47 drm/amdgpu: use true, false for bool variable in amdgpu_device.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3961:1-19: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3981:1-19: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 15:00:01 -05:00
zhengbin
6df3dab619 drm/amdgpu: use true, false for bool variable in mxgpu_nv.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:255:2-20: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:267:2-20: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 15:00:01 -05:00
zhengbin
eb28038cc6 drm/amdgpu: use true, false for bool variable in mxgpu_ai.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:253:2-20: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:265:2-20: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 15:00:00 -05:00
Guchun Chen
46cf2fecf5 drm/amdgpu: add missed return value set for error case
Return value should be set when going to error handle tag
for error case, this can avoid potential invalid array
access by upper caller.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:51 -05:00
Frank.Min
55d62fe10f drm/amdgpu: remove FB location config for sriov
FB location is already programmed by HV driver
for arcutus so remove this part

Signed-off-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:43 -05:00
Frank.Min
fdf57ba690 drm/amdgpu: enable xgmi init for sriov use case
1. enable xgmi ta initialization for sriov
2. enable xgmi initialization for sriov

Signed-off-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:37 -05:00
Tianci.Yin
33a9a5ab1e drm/amdgpu: remove memory training p2c buffer reservation(V2)
IP discovery TMR(occupied the top VRAM with size DISCOVERY_TMR_SIZE)
has been reserved, and the p2c buffer is in the range of this TMR, so
the p2c buffer reservation is unnecessary.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:28 -05:00
Tianci.Yin
8d40002fee drm/amdgpu: update the method to get fb_loc of memory training(V4)
The method of getting fb_loc changed from parsing VBIOS to
taking certain offset from top of VRAM

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:20 -05:00
Ma Feng
7eca40066f drm/amdgpu: Remove unneeded variable 'ret' in navi10_ih.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/navi10_ih.c:113:5-8: Unneeded variable: "ret". Return "0" on line 182

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ma Feng <mafeng.ma@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:09 -05:00
Ma Feng
e3c00faa7a drm/amdgpu: Remove unneeded variable 'ret' in amdgpu_device.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1036:5-8: Unneeded variable: "ret". Return "0" on line 1079

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ma Feng <mafeng.ma@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:58:56 -05:00
James Zhu
57cb635bb4 drm/amdgpu/gfx: Add mmSDMA2-7_EDC_COUNTER to support Arcturus
Add mmSDMA2-7_EDC_COUNTER to support Arcturus

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:56:57 -05:00
James Zhu
107ab06136 drm/amdgpu/gfx: Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus
Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:56:51 -05:00
James Zhu
d8c61373e0 drm/amdgpu/gfx: Replace ARRAY_SIZE with size variable
Replace ARRAY_SIZE with size variables to support
different ASICs.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:56:45 -05:00
John Clements
1e2c6d5582 drm/amdgpu: Added ASIC specific check in gmc v9.0 ECC interrupt programming sequence
Devices newer then VEGA10/12 shall have these programming sequences performed by PSP BL

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:56:27 -05:00
Frank.Min
56ca8628ac drm/amdgpu: enlarge agp_start address into 48bit
max range of the agp aperture is 48 bits, so
enlarge agp_start address into 48bit with all bits set

Signed-off-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:56:21 -05:00
Jane Jian
8adf5d2184 drm/amdgpu: disable VCN2.5 ib test for Arcturus sriov
currently using TMR loading VCN fw MMSCH would fail
to init after FLR, just disable ib test for temporarily
daily testing, continuing debug with mm team.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:56:15 -05:00
Le Ma
0a96afc7c5 drm/amdgpu: fix ctx init failure for asics without gfx ring
This workaround does not affect other asics because amdgpu only need expose
one gfx sched to user for now.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:56:09 -05:00
Jonathan Kim
a7843c0379 drm/amdgpu: attempt xgmi perfmon re-arm on failed arm
The DF routines to arm xGMI performance will attempt to re-arm both on
performance monitoring start and read on initial failure to arm.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:56:02 -05:00
Luben Tuikov
ce73516d42 drm/amdgpu: simplify padding calculations (v2)
Simplify padding calculations.

v2: Comment update and spacing.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:55:44 -05:00
Jane Jian
ab5999dea0 drm/amdgpu: enable VCN0 and VCN1 sriov instances support for Arcturus
v1: compared to bare-metal: sriov support psp loading VCN firmware; only one
encoding ring would be used in each instance.
v2: keep unchange for bare-metal VCN2.5 hw_init, just add a flag with sriov
and also remove multiple lines.
v3: squash in warning fix

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-19 10:10:05 -05:00
Jane Jian
b40953c2ba drm/amdgpu: skip VCN2.5 power gating and clock gating for sriov Arcturus
v1: skip gating in serveral called functions by power gating and clock gating
v2: from suggestion, skip setting gate in both set function, which is where
it being called.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:33:26 -05:00
Jane Jian
d83c7a07a7 drm/amdgpu: update VCN1(dual instances) fw types ID and VCN ip block type
Previously there is no VCN1 type ID in psp gfx interface. Also add VCN ip
block type unless the reinit after FLR for sriov would fail.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:33:26 -05:00
Jane Jian
7daaebfea5 drm/amdgpu: add VCN2.5 sriov start for Arctrus
Use MMSCH V1 to finish Memory Controller
programming as well as start MMSCH to do
VCN2.5 initialization.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:33:26 -05:00
Jane Jian
95f1b55b67 drm/amdgpu: add VCN2.5 MMSCH start for Arcturus
Use MMSCH to do the initialization since MMSCH
manages VCN2.5 instances and its world switch.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:33:26 -05:00
Guchun Chen
fb71a336cd drm/amdgpu: move umc offset to one new header file for Arcturus
Code refactor and no functional change.

Fixes: 4cf781c24c ("drm/amdgpu: Added RAS UMC error query support for Arcturus")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:33:01 -05:00
Alex Deucher
719423f670 drm/amdgpu: wait for all rings to drain before runtime suspending
Add a safety check to runtime suspend to make sure all outstanding
fences have signaled before we suspend.  Doesn't fix any known issue.

We already do this via the fence driver suspend function, but we
just force completion rather than bailing.  This bails on runtime
suspend so we can try again later once the fences are signaled to
avoid missing any outstanding work.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:13 -05:00
Andrey Grodzovsky
c96cf2823d drm/amdgpu: Switch from system_highpri_wq to system_unbound_wq
This is to avoid queueing jobs to same CPU during XGMI hive reset
because there is a strict timeline for when the reset commands
must reach all the GPUs in the hive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:13 -05:00
Andrey Grodzovsky
c6a6e2db99 drm/amdgpu: Redo XGMI reset synchronization.
Use task barrier in XGMI hive to synchronize ASIC resets
across devices in XGMI hive.

v2: Return right away with a warning if no xgmi hive, update doc.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:13 -05:00
Andrey Grodzovsky
f33a8770cd drm/amdgpu: Add task barrier to XGMI hive.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:13 -05:00
Andrey Grodzovsky
041a62bc06 drm/amdgpu: reverts commit ce316fa55e.
In preparation for doing XGMI reset synchronization using task barrier.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Leo Liu
f06a58db92 drm/amdgpu/vcn: remove unnecessary included headers
Esp. VCN1.0 headers should not be here

v2: add back the <linux/module.h> to keep consistent.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Monk Liu
5a7489a7e1 drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV
issues:
MEC is ruined by the amdkfd_pre_reset after VF FLR done

fix:
amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR,
the correct sequence is do amdkfd_pre_reset before VF FLR but there is
a limitation to block this sequence:
if we do pre_reset() before VF FLR, it would go KIQ way to do register
access and stuck there, because KIQ probably won't work by that time
(e.g. you already made GFX hang)

so the best way right now is to simply remove it.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Monk Liu
1512d064f5 drm/amdgpu: fix double gpu_recovery for NV of SRIOV
issues:
gpu_recover() is re-entered by the mailbox interrupt
handler mxgpu_nv.c

fix:
we need to bypass the gpu_recover() invoke in mailbox
interrupt as long as the timeout is not infinite (thus the TDR
will be triggered automatically after time out, no need to invoke
gpu_recover() through mailbox interrupt.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Nirmoy Das
f880799d7f amd/amdgpu: add sched array to IPs with multiple run-queues
This sched array can be passed on to entity creation routine
instead of manually creating such sched array on every context creation.

v2: squash in missing break fix

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Nirmoy Das
0c88b43032 drm/amdgpu: replace vm_pte's run-queue list with drm gpu scheds list
drm_sched_entity_init() takes drm gpu scheduler list instead of
drm_sched_rq list. This makes conversion of drm_sched_rq list
to drm gpu scheduler list unnecessary

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Nirmoy Das
b3ac17667f drm/scheduler: rework entity creation
Entity currently keeps a copy of run_queue list and modify it in
drm_sched_entity_set_priority(). Entities shouldn't modify run_queue
list. Use drm_gpu_scheduler list instead of drm_sched_rq list
in drm_sched_entity struct. In this way we can select a runqueue based
on entity/ctx's priority for a  drm scheduler.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Alex Deucher
45a80abebc drm/amdgpu/pm_runtime: update usage count in fence handling
Increment the usage count in emit fence, and decrement in
process fence to make sure the GPU is always considered in
use while there are fences outstanding.  We always wait for
the engines to drain in runtime suspend, but in practice
that only covers short lived jobs for gfx.  This should
cover us for longer lived fences.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:11 -05:00
zhengbin
374bf7bd6a drm/amdgpu: Remove unneeded semicolon in amdgpu_ras.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:318:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:11 -05:00
zhengbin
640f079325 drm/amdgpu: Remove unneeded semicolon in gfx_v10_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:1967:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:11 -05:00
zhengbin
2111a5f715 drm/amdgpu: Remove unneeded semicolon in amdgpu_pmu.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:110:3-4: Unneeded semicolon
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:133:2-3: Unneeded semicolon
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:163:2-3: Unneeded semicolon
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:191:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:11 -05:00
Alex Deucher
42a9938e1e drm/amdgpu/sdma5: make ring tests less chatty
We already did this for older generations.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:11 -05:00
Alex Deucher
e47c9bce46 drm/amdgpu/gfx10: make ring tests less chatty
We already did this for older generations.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:11 -05:00
Leo Liu
5e1e89eead drm/amdgpu/vcn: remove JPEG related code from idle handler and begin use
For VCN2.0 and above, VCN has been separated from JPEG

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:07 -05:00
Leo Liu
d58ed70778 drm/amdgpu/vcn1.0: use its own idle handler and begin use funcs
Because VCN1.0 power management and DPG mode are managed together with
JPEG1.0 under both HW and FW, so separated them from general VCN code.
Also the multiple instances case got removed, since VCN1.0 HW just have
a single instance.

v2: override work func with vcn1.0's own

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:07 -05:00
changzhu
aaff8b448d drm/amdgpu: enable gfxoff for raven1 refresh
When smu version is larger than 0x41e2b, it will load
raven_kicker_rlc.bin.To enable gfxoff for raven_kicker_rlc.bin,it
needs to avoid adev->pm.pp_feature &= ~PP_GFXOFF_MASK when it loads
raven_kicker_rlc.bin.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:07 -05:00
Emily Deng
8973d9ec8f drm/amdgpu/sriov: Tonga sriov also need load firmware with smu
Fix Tonga sriov load driver fail issue.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewd-by Yintian Tao <Yintian.tao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:06 -05:00
Guchun Chen
6193462409 drm/amdgpu: drop useless BACO arg in amdgpu_ras_reset_gpu
BACO reset mode strategy is determined by latter func when
calling amdgpu_ras_reset_gpu. So not to confuse audience, drop
it.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:06 -05:00
Yong Zhao
d7f72fe482 drm/amdgpu: Add CU info print log
The log will be useful for easily getting the CU info on various
emulation models or ASICs.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:06 -05:00
Yong Zhao
ad5901df88 drm/amdkfd: Use Arcturus specific set_vm_context_page_table_base()
Since Arcturus has it own function pointer, we can move Arcturus
specific logic to there rather than leaving it entangled with
other GFX9 chips.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:06 -05:00
Daniel Vetter
be452c4e8d Merge tag 'drm-next-5.6-2019-12-11' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.6-2019-12-11:

amdgpu:
- Add MST atomic routines
- Add support for DMCUB (new helper microengine for displays)
- Add OEM i2c support in DC
- Use vstartup for vblank events on DCN
- Simplify Kconfig for DC
- Renoir fixes for DC
- Clean up function pointers in DC
- Initial support for HDCP 2.x
- Misc code cleanups
- GFX10 fixes
- Rework JPEG engine handling for VCN
- Add clock and power gating support for JPEG
- BACO support for Arcturus
- Cleanup PSP ring handling
- Add framework for using BACO with runtime pm to save power
- Move core pci state handling out of the driver for pm ops
- Allow guest power control in 1 VF case with SR-IOV
- SR-IOV fixes
- RAS fixes
- Support for power metrics on renoir
- Golden settings updates for gfx10
- Enable gfxoff on supported navi10 skus
- Update MAINTAINERS

amdkfd:
- Clean up generational gfx code
- Fixes for gfx10
- DIQ fixes
- Share more code with amdgpu

radeon:
- PPC DMA fix
- Register checker fixes for r1xx/r2xx
- Misc cleanups

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211223020.7510-1-alexander.deucher@amd.com
2019-12-17 18:47:46 +01:00
Daniel Vetter
6c56e8adc0 drm-misc-next for v5.6:
UAPI Changes:
 - Add support for DMA-BUF HEAPS.
 
 Cross-subsystem Changes:
 - mipi dsi definition updates, pulled into drm-intel as well.
 - Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
 - Remove support for dma-buf kmap/kunmap.
 - Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.
 
 Core Changes:
 - Small cleanups to ttm.
 - Fix SCDC definition.
 - Assorted cleanups to core.
 - Add todo to remove load/unload hooks, and use generic fbdev emulation.
 - Assorted documentation updates.
 - Use blocking ww lock in ttm fault handler.
 - Remove drm_fb_helper_fbdev_setup/teardown.
 - Warning fixes with W=1 for atomic.
 - Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
 - Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
 - Various kconfig indentation fixes in core and drivers.
 - Fix freeing transactions in dp-mst correctly.
 - Sean Paul is steping down as core maintainer. :-(
 - Add lockdep annotations for atomic locks vs dma-resv.
 - Prevent use-after-free for a bad job in drm_scheduler.
 - Fill out all block sizes in the P01x and P210 definitions.
 - Avoid division by zero in drm/rect, and fix bounds.
 - Add drm/rect selftests.
 - Add aspect ratio and alternate clocks for HDMI 4k modes.
 - Add todo for drm_framebuffer_funcs and fb_create cleanup.
 - Drop DRM_AUTH for prime import/export ioctls.
 - Clear DP-MST payload id tables downstream when initializating.
 - Fix for DSC throughput definition.
 - Add extra FEC definitions.
 - Fix fake offset in drm_gem_object_funs.mmap.
 - Stop using encoder->bridge in core directly
 - Handle bridge chaining slightly better.
 - Add backlight support to drm/panel, and use it in many panel drivers.
 - Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.
 
 Driver Changes:
 - Small fixes all over.
 - Fix documentation in vkms.
 - Fix mmap_sem vs dma_resv in nouveau.
 - Small cleanup in komeda.
 - Add page flip support in gma500 for psb/cdv.
 - Add ddc symlink in the connector sysfs directory for many drivers.
 - Add support for analogic an6345, and fix small bugs in it.
 - Add atomic modesetting support to ast.
 - Fix radeon fault handler VMA race.
 - Switch udl to use generic shmem helpers.
 - Unconditional vblank handling for mcde.
 - Miscellaneous fixes to mcde.
 - Tweak debug output from komeda using debugfs.
 - Add gamma and color transform support to komeda for DOU-IPS.
 - Add support for sony acx424AKP panel.
 - Various small cleanups to gma500.
 - Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
 - Add support for Logic PD Type 28 panel.
 - Use drm_panel_* wrapper functions in exynos/tegra/msm.
 - Add devicetree bindings for generic DSI panels.
 - Don't include drm_pci.h directly in many drivers.
 - Add support for begin/end_cpu_access in udmabuf.
 - Stop using drm_get_pci_dev in gma500 and mga200.
 - Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
 - Add devfreq thermal support to panfrost.
 - Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
 - meson: Add support for OSD1 plane AFBC commit.
 - Stop displaying garbage when toggling ast primary plane on/off.
 - More cleanups and fixes to UDL.
 - Add D32 suport to komeda.
 - Remove globle copy of drm_dev in gma500.
 - Add support for Boe Himax8279d MIPI-DSI LCD panel.
 - Add support for ingenic JZ4770 panel.
 - Small null pointer deference fix in ingenic.
 - Remove support for the special tfp420 driver, as there is a generic way to do it.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl34lkkACgkQ/lWMcqZw
 E8M76g//WRYl9fWnV063s44FBVJYjGxaus0vQJSGidaPCIE6Ep6TNjXp8DVzV82M
 HR79P9glL02DC9B8pflioNNXdIRGSVk/FJcKVB2seFAqEFCAknvWDM/X/y+mOUpp
 fUeFl+Znlwx3YlM8f4Qujdbm+CbTewfbya4VAWeWd8XG2V8jfq5cmODPPlUMNenZ
 J6Ja+W3ph741uSIfAKaP69LVJgOcuUjXINE4SWhRk/i5QF3GIRej/A7ZjWGLQ/t2
 2zUUF7EiCzhPomM40H3ddKtXb4ZjNJuc5pOD4GpxR8ciNbe2gUOHEZ5aenwYBdsU
 5MwbxNKyMbKXATtn3yv3fSc4jH3DtmEKpmovONeO8ZDBrQBnxeYa3tQvfkNghA2f
 acoZMzYUImV+ft6DMIgpXppASvo7mQYDAbLPOGEJ9E44AL4UP00jesEjnK5FOHSR
 3BEzGUnK/6QL5zFNPni8YZQ8dan4jDIno1mqIV+cQ4WCGlaKckzIWO6243Bf13b/
 kROSJpgWkiK6Ngq0ofhD0MHyT/m1QnqUzWRKTJhRtPflSWRBsDZqWCQ5Vx1QlNIE
 /HfTNbTpXWwa+5wXbbB8TkDw5t9cQGnR+QcrEd9HgoIec7B5Re8rx9i0TJAT4N05
 03RCQCecSfD8gwKd2wgaFIpFGRl9lTdLYSpffSmyL2X5a20lZhM=
 =b15X
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-12-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.6:

UAPI Changes:
- Add support for DMA-BUF HEAPS.

Cross-subsystem Changes:
- mipi dsi definition updates, pulled into drm-intel as well.
- Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
- Remove support for dma-buf kmap/kunmap.
- Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.

Core Changes:
- Small cleanups to ttm.
- Fix SCDC definition.
- Assorted cleanups to core.
- Add todo to remove load/unload hooks, and use generic fbdev emulation.
- Assorted documentation updates.
- Use blocking ww lock in ttm fault handler.
- Remove drm_fb_helper_fbdev_setup/teardown.
- Warning fixes with W=1 for atomic.
- Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
- Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
- Various kconfig indentation fixes in core and drivers.
- Fix freeing transactions in dp-mst correctly.
- Sean Paul is steping down as core maintainer. :-(
- Add lockdep annotations for atomic locks vs dma-resv.
- Prevent use-after-free for a bad job in drm_scheduler.
- Fill out all block sizes in the P01x and P210 definitions.
- Avoid division by zero in drm/rect, and fix bounds.
- Add drm/rect selftests.
- Add aspect ratio and alternate clocks for HDMI 4k modes.
- Add todo for drm_framebuffer_funcs and fb_create cleanup.
- Drop DRM_AUTH for prime import/export ioctls.
- Clear DP-MST payload id tables downstream when initializating.
- Fix for DSC throughput definition.
- Add extra FEC definitions.
- Fix fake offset in drm_gem_object_funs.mmap.
- Stop using encoder->bridge in core directly
- Handle bridge chaining slightly better.
- Add backlight support to drm/panel, and use it in many panel drivers.
- Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.

Driver Changes:
- Small fixes all over.
- Fix documentation in vkms.
- Fix mmap_sem vs dma_resv in nouveau.
- Small cleanup in komeda.
- Add page flip support in gma500 for psb/cdv.
- Add ddc symlink in the connector sysfs directory for many drivers.
- Add support for analogic an6345, and fix small bugs in it.
- Add atomic modesetting support to ast.
- Fix radeon fault handler VMA race.
- Switch udl to use generic shmem helpers.
- Unconditional vblank handling for mcde.
- Miscellaneous fixes to mcde.
- Tweak debug output from komeda using debugfs.
- Add gamma and color transform support to komeda for DOU-IPS.
- Add support for sony acx424AKP panel.
- Various small cleanups to gma500.
- Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
- Add support for Logic PD Type 28 panel.
- Use drm_panel_* wrapper functions in exynos/tegra/msm.
- Add devicetree bindings for generic DSI panels.
- Don't include drm_pci.h directly in many drivers.
- Add support for begin/end_cpu_access in udmabuf.
- Stop using drm_get_pci_dev in gma500 and mga200.
- Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
- Add devfreq thermal support to panfrost.
- Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
- meson: Add support for OSD1 plane AFBC commit.
- Stop displaying garbage when toggling ast primary plane on/off.
- More cleanups and fixes to UDL.
- Add D32 suport to komeda.
- Remove globle copy of drm_dev in gma500.
- Add support for Boe Himax8279d MIPI-DSI LCD panel.
- Add support for ingenic JZ4770 panel.
- Small null pointer deference fix in ingenic.
- Remove support for the special tfp420 driver, as there is a generic way to do it.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
2019-12-17 13:57:54 +01:00
Dave Airlie
d16f0f6140 Merge tag 'drm-fixes-5.5-2019-12-12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
drm-fixes-5.5-2019-12-12:

amdgpu:
- DC fixes for renoir
- Gfx8 fence flush align with mesa
- Power profile fix for arcturus
- Freesync fix
- DC I2c over aux fix
- DC aux defer fix
- GPU reset fix
- GPUVM invalidation semaphore fixes for PCO and SR-IOV
- Golden settings updates for gfx10

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191212223211.8034-1-alexander.deucher@amd.com
2019-12-13 14:50:01 +10:00
changzhu
f271fe1856 drm/amdgpu: add invalidate semaphore limit for SRIOV in gmc10
It may fail to load guest driver in round 2 when using invalidate
semaphore for SRIOV. So it needs to avoid using invalidate semaphore
for SRIOV.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-12 16:13:48 -05:00
changzhu
90f6452ca5 drm/amdgpu: add invalidate semaphore limit for SRIOV and picasso in gmc9
It may fail to load guest driver in round 2 or cause Xstart problem
when using invalidate semaphore for SRIOV or picasso. So it needs avoid
using invalidate semaphore for SRIOV and picasso.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-12 16:13:24 -05:00
changzhu
413fc385a5 drm/amdgpu: avoid using invalidate semaphore for picasso
It may cause timeout waiting for sem acquire in VM flush when using
invalidate semaphore for picasso. So it needs to avoid using invalidate
semaphore for piasso.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-12 09:39:39 -05:00
Alex Deucher
a680aea00d Revert "drm/amdgpu: dont schedule jobs while in reset"
This reverts commit f2efc6e600.

This was fixed properly for 5.5, but came back via 5.4 merge
into drm-next, so revert it again.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-12 09:39:17 -05:00
Alex Deucher
ad808910be drm/amdgpu: fix license on Kconfig and Makefiles
amdgpu is MIT licensed.

Fixes: ec8f24b7fa ("treewide: Add SPDX license identifier - Makefile/Kconfig")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:08 -05:00
Simon Ser
93b09a9a89 drm/amdgpu: log when amdgpu.dc=1 but ASIC is unsupported
This makes it easier to figure out whether the kernel parameter has been
taken into account.

Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:08 -05:00
Leo Liu
3504bd45a9 drm/amdgpu: fix JPEG instance checking when ctx init
Use proper structure.

Fixes: 0388aee766 ("drm/amdgpu: use the JPEG structure for general driver support")
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:08 -05:00
Leo Liu
21a174f5ad drm/amdgpu: fix VCN2.x number of irq types
The JPEG irq type has been moved to its own structure

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:08 -05:00
Tianci.Yin
89ed5a5211 drm/amdgpu/gfx10: update gfx golden settings for navi14
add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
Tianci.Yin
eaec03f206 drm/amdgpu/gfx10: update gfx golden settings
add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
Kevin Wang
d549991ce5 drm/amdgpu: enable gfxoff feature for navi10 asic
enable gfxoff feature for some navi10 asics

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
Tianci.Yin
5f5202bf69 drm/amdgpu/gfx10: update gfx golden settings for navi14
add registers: mmSPI_CONFIG_CNTL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
Tianci.Yin
d4117354c8 drm/amdgpu/gfx10: update gfx golden settings
add registers: mmSPI_CONFIG_CNTL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
Yintian Tao
c9ffa427db drm/amd/powerplay: enable pp one vf mode for vega10
Originally, due to the restriction from PSP and SMU, VF has
to send message to hypervisor driver to handle powerplay
change which is complicated and redundant. Currently, SMU
and PSP can support VF to directly handle powerplay
change by itself. Therefore, the old code about the handshake
between VF and PF to handle powerplay will be removed and VF
will use new the registers below to handshake with SMU.
mmMP1_SMN_C2PMSG_101: register to handle SMU message
mmMP1_SMN_C2PMSG_102: register to handle SMU parameter
mmMP1_SMN_C2PMSG_103: register to handle SMU response

v2: remove module parameter pp_one_vf
v3: fix the parens
v4: forbid vf to change smu feature
v5: use hwmon_attributes_visible to skip sepicified hwmon atrribute
v6: change skip condition at vega10_copy_table_to_smc

Signed-off-by: Yintian Tao <yttao@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
John Clements
4cf781c24c drm/amdgpu: Added RAS UMC error query support for Arcturus
Updated UMC 6.1 function set to support UMC 6.1.1 and 6.1.2 devices

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
changzhu
418899d615 drm/amdgpu: avoid using invalidate semaphore for picasso
It may cause timeout waiting for sem acquire in VM flush when using
invalidate semaphore for picasso. So it needs to avoid using invalidate
semaphore for piasso.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
Le Ma
feffbaac36 drm/amdgpu: add condition to enable baco for ras recovery
Switch to baco reset method for ras recovery if the PMFW supported.
If not, keep the original reset method.

v2: revise the condition

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 15:22:07 -05:00
Alex Deucher
bd95c14452 drm/amdgpu: fix license on Kconfig and Makefiles
amdgpu is MIT licensed.

Fixes: ec8f24b7fa ("treewide: Add SPDX license identifier - Makefile/Kconfig")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 14:29:38 -05:00
Tianci.Yin
69897d3425 drm/amdgpu/gfx10: update gfx golden settings for navi14
add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 14:29:38 -05:00
Tianci.Yin
847b0d8795 drm/amdgpu/gfx10: update gfx golden settings
add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 14:29:38 -05:00
Tianci.Yin
5714a2026f drm/amdgpu/gfx10: update gfx golden settings for navi14
add registers: mmSPI_CONFIG_CNTL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 14:29:38 -05:00
Tianci.Yin
02cca5769f drm/amdgpu/gfx10: update gfx golden settings
add registers: mmSPI_CONFIG_CNTL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11 14:29:38 -05:00
Hawking Zhang
0d6f39bb77 drm/amdgpu: fix resume failures due to psp fw loading sequence change (v3)
this fix the regression caused by asd/ta loading sequence
adjustment recently. asd/ta loading was move out from
hw_start and should also be applied to psp_resume.
otherwise those fw loading will be ignored in resume phase.

v2: add the mutex unlock for asd loading failure case
v3: merge the error handling to failed tag

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-09 17:03:38 -05:00
Thong Thai
d515959125 Revert "drm/amdgpu: enable VCN DPG on Raven and Raven2"
This reverts commit a4840d91c9.

Reverting due to power efficiency issues seen on Raven 1 and 2
when DPG mode is enabled.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-09 17:02:33 -05:00
Christian König
b4ff0f8a85 drm/amdgpu: add VM eviction lock v3
This allows to invalidate VM entries without taking the reservation lock.

v3: use -EBUSY

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-09 17:02:26 -05:00
Christian König
90b69cdc5f drm/amdgpu: stop adding VM updates fences to the resv obj
Don't add the VM update fences to the resv object and remove
the handling to stop implicitely syncing to them.

Ongoing updates prevent page tables from being evicted and we manually
block for all updates to complete before releasing PDs and PTS.

This way we can do updates even without the resv obj locked.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-09 17:02:15 -05:00
Christian König
e095fc17bb drm/amdgpu: explicitely sync to VM updates v2
Allows us to reduce the overhead while syncing to fences a bit.

v2: also drop adev parameter from the functions

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-09 17:02:07 -05:00
Christian König
6ceeb144b1 drm/amdgpu: move VM eviction decision into amdgpu_vm.c
When a page tables needs to be evicted the VM code should
decide if that is possible or not.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-09 17:01:51 -05:00
Linus Torvalds
7ada90eb9c drm msm + fixes for 5.5-rc1
msm-next:
 - OCMEM support for a3xx and a4xx GPUs.
 - a510 support + display support
 
 core:
 - mst payload deletion fix
 
 i915:
 - uapi alignment fix
 - fix for power usage regression due to security fixes
 - change default preemption timeout to 640ms from 100ms
 - EHL voltage level display fixes
 - TGL DGL PHY fix
 - gvt - MI_ATOMIC cmd parser fix, CFL non-priv warning
 - CI spotted deadlock fix
 - EHL port D programming fix
 
 amdgpu:
 - VRAM lost fixes on BACO for CI/VI
 - navi14 DC fixes
 - misc SR-IOV, gfx10 fixes
 - XGMI fixes for arcturus
 - SRIOV fixes
 
 amdkfd:
 - KFD on ppc64le enabled
 - page table optimisations
 
 radeon:
 - fix for r1xx/2xx register checker.
 
 tegra:
 - displayport regression fixes
 - DMA API regression fixes
 
 mgag200:
 - fix devices that can't scanout except at 0 addr
 
 omap:
 - fix dma_addr refcounting
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJd6cqnAAoJEAx081l5xIa+YR0P/A0LkilEbSnF/k7zKDjm0HN8
 JGsf9ZfQRGA2y8URoLRtNdFjZfyuTSpiDSxsbDI0ShBhRimGHyCSxAJXO42vp8q3
 jE57jBoaTSiGtagSO3nxrc1vQP7CfUpaggC2ilKSmcVvTrlqip6iPx7s2PoNyQYc
 GRVUhkcylnZK5UrMiE8Yz/iNcy3Mh0X8bJQKXMEYxpW2KA3SL4qxuRlYIxXEoMyB
 4MlWEV09wHTduf1uYuKdusHjILgR5EiVOdmbvpM92obqZOTokt5/S20TEdhFqiy0
 0IHxuEkgVx+trXzGFbmqgh2I7BZvZIbKVCSnBT4AXAvUEJ99kGTdEP0I6uOp2lsC
 1DCm+7/hcI8BlwmwC9N6ogUwoAzKn7DNc1urcet/0QVbnZLZlueUK/6fSgUNnUYe
 miOeMNBmfHr83b75MpnNxYVoyz5S+/DFbtUplYKqxgjDYfiWWceSSE47NB+IHAiI
 RVpz3AxGpKaw4/w5l2q8VuToWZxdO85TNjgVCTmKfwlYjIbEuveWpZNFqO/GHMm9
 x50f4ZYVOjU2TEPnLQNTIJOgv71JrTpoAdFzPVwCeWUf4h4Y4lVLgTLvdG1JLcw+
 k9BrA5z2R0kjzPtabRhS6WfSjpgSbY3DgY9hfi+HIUmKvZq4fdtAbBlp1oGSXJ9N
 zkVrs9eE6Ahkcndi6ZV9
 =3cs2
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-12-06' of git://anongit.freedesktop.org/drm/drm

Pull more drm updates from Dave Airlie:
 "Rob pointed out I missed his pull request for msm-next, it's been in
  next for a while outside of my tree so shouldn't cause any unexpected
  issues, it has some OCMEM support in drivers/soc that is acked by
  other maintainers as it's outside my tree.

  Otherwise it's a usual fixes pull, i915, amdgpu, the main ones, with
  some tegra, omap, mgag200 and one core fix.

  Summary:

  msm-next:
   - OCMEM support for a3xx and a4xx GPUs.
   - a510 support + display support

  core:
   - mst payload deletion fix

  i915:
   - uapi alignment fix
   - fix for power usage regression due to security fixes
   - change default preemption timeout to 640ms from 100ms
   - EHL voltage level display fixes
   - TGL DGL PHY fix
   - gvt - MI_ATOMIC cmd parser fix, CFL non-priv warning
   - CI spotted deadlock fix
   - EHL port D programming fix

  amdgpu:
   - VRAM lost fixes on BACO for CI/VI
   - navi14 DC fixes
   - misc SR-IOV, gfx10 fixes
   - XGMI fixes for arcturus
   - SRIOV fixes

  amdkfd:
   - KFD on ppc64le enabled
   - page table optimisations

  radeon:
   - fix for r1xx/2xx register checker.

  tegra:
   - displayport regression fixes
   - DMA API regression fixes

  mgag200:
   - fix devices that can't scanout except at 0 addr

  omap:
   - fix dma_addr refcounting"

* tag 'drm-next-2019-12-06' of git://anongit.freedesktop.org/drm/drm: (100 commits)
  drm/dp_mst: Correct the bug in drm_dp_update_payload_part1()
  drm/omap: fix dma_addr refcounting
  drm/tegra: Run hub cleanup on ->remove()
  drm/tegra: sor: Make the +5V HDMI supply optional
  drm/tegra: Silence expected errors on IOMMU attach
  drm/tegra: vic: Export module device table
  drm/tegra: sor: Implement system suspend/resume
  drm/tegra: Use proper IOVA address for cursor image
  drm/tegra: gem: Remove premature import restrictions
  drm/tegra: gem: Properly pin imported buffers
  drm/tegra: hub: Remove bogus connection mutex check
  ia64: agp: Replace empty define with do while
  agp: Add bridge parameter documentation
  agp: remove unused variable num_segments
  agp: move AGPGART_MINOR to include/linux/miscdevice.h
  agp: remove unused variable size in agp_generic_create_gatt_table
  drm/dp_mst: Fix build on systems with STACKTRACE_SUPPORT=n
  drm/radeon: fix r1xx/r2xx register checker for POT textures
  drm/amdgpu: fix GFX10 missing CSIB set(v3)
  drm/amdgpu: should stop GFX ring in hw_fini
  ...
2019-12-06 10:28:09 -08:00
Gerd Hoffmann
b3fac52c51 drm: share address space for dma bufs
Use the shared address space of the drm device (see drm_open() in
drm_file.c) for dma-bufs too.  That removes a difference betweem drm
device mmap vmas and dma-buf mmap vmas and fixes corner cases like
dropping ptes (using madvise(DONTNEED) for example) not working
properly.

Also remove amdgpu driver's private dmabuf update.  It is not needed
any more now that we are doing this for everybody.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20191127092523.5620-3-kraxel@redhat.com
2019-12-06 11:18:11 +01:00
Pierre-Eric Pelloux-Prayer
bf26da927a drm/amdgpu: add cache flush workaround to gfx8 emit_fence
The same workaround is used for gfx7.
Both PAL and Mesa use it for gfx8 too, so port this commit to
gfx_v8_0_ring_emit_fence_gfx.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 17:54:55 -05:00
Guchun Chen
6e807535da drm/amdgpu: add check before enabling/disabling broadcast mode
When security violation from new vbios happens, data fabric is
risky to stop working. So prevent the direct access to DF
mmFabricConfigAccessControl from the new vbios and onwards.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 17:54:14 -05:00
Le Ma
76434f75d4 drm/amdgpu: reduce redundant uvd context lost warning message
Move the print out of uvd instance loop in amdgpu_uvd_suspend

v2: drop unnecessary brackets
v3: grab ras_intr state once for multiple times use

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:26:20 -05:00
Le Ma
00eaa57172 drm/amdgpu: clear err_event_athub flag after reset exit
Otherwise next err_event_athub error cannot call gpu reset. And following
resume sequence will not be affected by this flag.

v2: create function to clear amdgpu_ras_in_intr for modularity of ras driver

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:26:11 -05:00
Le Ma
b823821f22 drm/amdgpu: support full gpu reset workflow when ras err_event_athub occurs
This athub fatal error can be recovered by baco without system-level reboot,
so add a mode to use baco for the recovery. Not affect the default psp reset
situations for now.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:26:03 -05:00
Le Ma
ce316fa55e drm/amdgpu: add concurrent baco reset support for XGMI
Currently each XGMI node reset wq does not run in parrallel if bound to same
cpu. Make change to bound the xgmi_reset_work item to different cpus.

XGMI requires all nodes enter into baco within very close proximity before
any node exit baco. So schedule the xgmi_reset_work wq twice for enter/exit
baco respectively.

To use baco for XGMI, PMFW supported for baco on XGMI needs to be involved.

The case that PSP reset and baco reset coexist within an XGMI hive never exist
and is not in the consideration.

v2: define use_baco flag to simplify the code for xgmi baco sequence

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:25:53 -05:00
Le Ma
7a22677b95 drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper
This operation is needed when baco entry/exit for ras recovery

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:25:45 -05:00
Le Ma
5c39d600e3 drm/amdgpu: clear uncorrectable parity error status bit
This should be cleared during every nbif uncorrectable error cleanup work.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:25:35 -05:00
Le Ma
28f87950d9 drm/amdgpu: clear ras controller status registers when interrupt occurs
To fix issue that ras controller interrupt cannot be triggered anymore after
one time nbif uncorrectable error. And error count is stored in nbif ras object
for query.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:25:27 -05:00
Le Ma
f2a79be1c0 drm/amdgpu: export amdgpu_ras_find_obj to use externally
Change it to external interface.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:25:20 -05:00
Le Ma
4a2d93565a drm/amdgpu: remove ras global recovery handling from ras_controller_int handler
v2: add notification when ras controller interrupt generates

Signed-off-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:25:11 -05:00
Pierre-Eric Pelloux-Prayer
b456c93253 drm/amdgpu: add cache flush workaround to gfx8 emit_fence
The same workaround is used for gfx7.
Both PAL and Mesa use it for gfx8 too, so port this commit to
gfx_v8_0_ring_emit_fence_gfx.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:24:58 -05:00
James Zhu
f83f5a1e11 drm/amdgpu/gfx: Improvement on EDC GPR workarounds
SPI limits total CS waves in flight per SE to no more than 32 * num_cu and
we need to stuff 40 waves on a CU to completely clean the SGPR. This is
accomplished in the WR by cleaning the SE in two steps, half of the CU per
step.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:24:28 -05:00
Guchun Chen
79c4c8ea91 drm/amdgpu: add check before enabling/disabling broadcast mode
When security violation from new vbios happens, data fabric is
risky to stop working. So prevent the direct access to DF
mmFabricConfigAccessControl from the new vbios and onwards.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:24:09 -05:00
Jani Nikula
b6ff753a0c drm: constify fb ops across all drivers
Now that the fbops member of struct fb_info is const, we can start
making the ops const as well.

Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/59b43629ac60031c5bbf961d8c49695019bc9c6f.1575390740.git.jani.nikula@intel.com
2019-12-05 10:57:42 +02:00
Linus Torvalds
c3bed3b20e pci-v5.5-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl3leXUUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vyY3g/9FAVVdPEaadNtAhQ/zIxcjozDovKq
 0q7yOA3aTBTUoNEinm88an6p0dcC4gNKtGukXmzVH2Hhxm9kLRdtpZGYY00tpLUB
 9rI7XsgwwHa+hLwsHbIs507sKGFGy5FLr0ChTTGLDEMppnEvjA2hZooYmcB/OgrC
 LlFcwbNKGOk/Si9u2bF2nLO0JDoVHnwzpF99saew/nqc7Lfj9e9IPZFom+VjPBUh
 AOvRp2H7uBN+WQlpLeFeMDDoeXh34lX0kYqIV/cVkXVnknDGYKV2CBTg2aeX7jd0
 QiPHZh6zlW8zNQgaCZRiBAbatVEOnRMRJ++yiqB8hBYp1LMXm6kJ01YSQpXkugoY
 Vp9dtzzTARWV/XkKwD4brw9ZEmIDnO+Ed2x2VbUkPJVcXAvzSQWAx82IU0Iuqmcb
 9qr6U2Zf/Xk5aFlGPYVH8QOG+QqzIbZNRQ7NlhDlITyW4P6QPu0mw374yYP2wDGL
 sP5YSS3YGa0sQcEgDtVnd4z+WTZI4AwXLPaeaLkDhdfHp2FsERUY4TrPs33J99xw
 og4EyokVFzjYzlnBPU6WWn7LL+jj5ccXkL3MA4DR4FJOnNGHh7NXfQUH56rrgsq7
 F9/8shL5DuTbQkde1uSyUG9Iq/RigVLlV5DQavFm3dSXvZi0E16t5alC5URNTzk7
 at8Bogn53QhlmYc=
 =uUXw
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Warn if a host bridge has no NUMA info (Yunsheng Lin)

   - Add PCI_STD_NUM_BARS for the number of standard BARs (Denis
     Efremov)

  Resource management:

   - Fix boot-time Embedded Controller GPE storm caused by incorrect
     resource assignment after ACPI Bus Check Notification (Mika
     Westerberg)

   - Protect pci_reassign_bridge_resources() against concurrent
     addition/removal (Benjamin Herrenschmidt)

   - Fix bridge dma_ranges resource list cleanup (Rob Herring)

   - Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters to control
     the MMIO and prefetchable MMIO window sizes of hotplug bridges
     independently (Nicholas Johnson)

   - Fix MMIO/MMIO_PREF window assignment that assigned more space than
     desired (Nicholas Johnson)

   - Only enforce bus numbers from bridge EA if the bridge has EA
     devices downstream (Subbaraya Sundeep)

   - Consolidate DT "dma-ranges" parsing and convert all host drivers to
     use shared parsing (Rob Herring)

  Error reporting:

   - Restore AER capability after resume (Mayurkumar Patel)

   - Add PoisonTLPBlocked AER counter (Rajat Jain)

   - Use for_each_set_bit() to simplify AER code (Andy Shevchenko)

   - Fix AER kernel-doc (Andy Shevchenko)

   - Add "pcie_ports=dpc-native" parameter to allow native use of DPC
     even if platform didn't grant control over AER (Olof Johansson)

  Hotplug:

   - Avoid returning prematurely from sysfs requests to enable or
     disable a PCIe hotplug slot (Lukas Wunner)

   - Don't disable interrupts twice when suspending hotplug ports (Mika
     Westerberg)

   - Fix deadlocks when PCIe ports are hot-removed while suspended (Mika
     Westerberg)

  Power management:

   - Remove unnecessary ASPM locking (Bjorn Helgaas)

   - Add support for disabling L1 PM Substates (Heiner Kallweit)

   - Allow re-enabling Clock PM after it has been disabled (Heiner
     Kallweit)

   - Add sysfs attributes for controlling ASPM link states (Heiner
     Kallweit)

   - Remove CONFIG_PCIEASPM_DEBUG, including "link_state" and "clk_ctl"
     sysfs files (Heiner Kallweit)

   - Avoid AMD FCH XHCI USB PME# from D0 defect that prevents wakeup on
     USB 2.0 or 1.1 connect events (Kai-Heng Feng)

   - Move power state check out of pci_msi_supported() (Bjorn Helgaas)

   - Fix incorrect MSI-X masking on resume and revert related nvme quirk
     for Kingston NVME SSD running FW E8FK11.T (Jian-Hong Pan)

   - Always return devices to D0 when thawing to fix hibernation with
     drivers like mlx4 that used legacy power management (previously we
     only did it for drivers with new power management ops) (Dexuan Cui)

   - Clear PCIe PME Status even for legacy power management (Bjorn
     Helgaas)

   - Fix PCI PM documentation errors (Bjorn Helgaas)

   - Use dev_printk() for more power management messages (Bjorn Helgaas)

   - Apply D2 delay as milliseconds, not microseconds (Bjorn Helgaas)

   - Convert xen-platform from legacy to generic power management (Bjorn
     Helgaas)

   - Removed unused .resume_early() and .suspend_late() legacy power
     management hooks (Bjorn Helgaas)

   - Rearrange power management code for clarity (Rafael J. Wysocki)

   - Decode power states more clearly ("4" or "D4" really refers to
     "D3cold") (Bjorn Helgaas)

   - Notice when reading PM Control register returns an error (~0)
     instead of interpreting it as being in D3hot (Bjorn Helgaas)

   - Add missing link delays required by the PCIe spec (Mika Westerberg)

  Virtualization:

   - Move pci_prg_resp_pasid_required() to CONFIG_PCI_PRI (Bjorn
     Helgaas)

   - Allow VFs to use PRI (the PF PRI is shared by the VFs, but the code
     previously didn't recognize that) (Kuppuswamy Sathyanarayanan)

   - Allow VFs to use PASID (the PF PASID capability is shared by the
     VFs, but the code previously didn't recognize that) (Kuppuswamy
     Sathyanarayanan)

   - Disconnect PF and VF ATS enablement, since ATS in PFs and
     associated VFs can be enabled independently (Kuppuswamy
     Sathyanarayanan)

   - Cache PRI and PASID capability offsets (Kuppuswamy Sathyanarayanan)

   - Cache the PRI PRG Response PASID Required bit (Bjorn Helgaas)

   - Consolidate ATS declarations in linux/pci-ats.h (Krzysztof
     Wilczynski)

   - Remove unused PRI and PASID stubs (Bjorn Helgaas)

   - Removed unnecessary EXPORT_SYMBOL_GPL() from ATS, PRI, and PASID
     interfaces that are only used by built-in IOMMU drivers (Bjorn
     Helgaas)

   - Hide PRI and PASID state restoration functions used only inside the
     PCI core (Bjorn Helgaas)

   - Add a DMA alias quirk for the Intel VCA NTB (Slawomir Pawlowski)

   - Serialize sysfs sriov_numvfs reads vs writes (Pierre Crégut)

   - Update Cavium ACS quirk for ThunderX2 and ThunderX3 (George
     Cherian)

   - Fix the UPDCR register address in the Intel ACS quirk (Steffen
     Liebergeld)

   - Unify ACS quirk implementations (Bjorn Helgaas)

  Amlogic Meson host bridge driver:

   - Fix meson PERST# GPIO polarity problem (Remi Pommarel)

   - Add DT bindings for Amlogic Meson G12A (Neil Armstrong)

   - Fix meson clock names to match DT bindings (Neil Armstrong)

   - Add meson support for Amlogic G12A SoC with separate shared PHY
     (Neil Armstrong)

   - Add meson extended PCIe PHY functions for Amlogic G12A USB3+PCIe
     combo PHY (Neil Armstrong)

   - Add arm64 DT for Amlogic G12A PCIe controller node (Neil Armstrong)

   - Add commented-out description of VIM3 USB3/PCIe mux in arm64 DT
     (Neil Armstrong)

  Broadcom iProc host bridge driver:

   - Invalidate iProc PAXB address mapping before programming it
     (Abhishek Shah)

   - Fix iproc-msi and mvebu __iomem annotations (Ben Dooks)

  Cadence host bridge driver:

   - Refactor Cadence PCIe host controller to use as a library for both
     host and endpoint (Tom Joseph)

  Freescale Layerscape host bridge driver:

   - Add layerscape LS1028a support (Xiaowei Bao)

  Intel VMD host bridge driver:

   - Add VMD bus 224-255 restriction decode (Jon Derrick)

   - Add VMD 8086:9A0B device ID (Jon Derrick)

   - Remove Keith from VMD maintainer list (Keith Busch)

  Marvell ARMADA 3700 / Aardvark host bridge driver:

   - Use LTSSM state to build link training flag since Aardvark doesn't
     implement the Link Training bit (Remi Pommarel)

   - Delay before training Aardvark link in case PERST# was asserted
     before the driver probe (Remi Pommarel)

   - Fix Aardvark issues with Root Control reads and writes (Remi
     Pommarel)

   - Don't rely on jiffies in Aardvark config access path since
     interrupts may be disabled (Remi Pommarel)

   - Fix Aardvark big-endian support (Grzegorz Jaszczyk)

  Marvell ARMADA 370 / XP host bridge driver:

   - Make mvebu_pci_bridge_emul_ops static (Ben Dooks)

  Microsoft Hyper-V host bridge driver:

   - Add hibernation support for Hyper-V virtual PCI devices (Dexuan
     Cui)

   - Track Hyper-V pci_protocol_version per-hbus, not globally (Dexuan
     Cui)

   - Avoid kmemleak false positive on hv hbus buffer (Dexuan Cui)

  Mobiveil host bridge driver:

   - Change mobiveil csr_read()/write() function names that conflict
     with riscv arch functions (Kefeng Wang)

  NVIDIA Tegra host bridge driver:

   - Fix Tegra CLKREQ dependency programming (Vidya Sagar)

  Renesas R-Car host bridge driver:

   - Remove unnecessary header include from rcar (Andrew Murray)

   - Tighten register index checking for rcar inbound range programming
     (Marek Vasut)

   - Fix rcar inbound range alignment calculation to improve packing of
     multiple entries (Marek Vasut)

   - Update rcar MACCTLR setting to match documentation (Yoshihiro
     Shimoda)

   - Clear bit 0 of MACCTLR before PCIETCTLR.CFINIT per manual
     (Yoshihiro Shimoda)

   - Add Marek Vasut and Yoshihiro Shimoda as R-Car maintainers (Simon
     Horman)

  Rockchip host bridge driver:

   - Make rockchip 0V9 and 1V8 power regulators non-optional (Robin
     Murphy)

  Socionext UniPhier host bridge driver:

   - Set uniphier to host (RC) mode always (Kunihiko Hayashi)

  Endpoint drivers:

   - Fix endpoint driver sign extension problem when shifting page
     number to phys_addr_t (Alan Mikhak)

  Misc:

   - Add NumaChip SPDX header (Krzysztof Wilczynski)

   - Replace EXTRA_CFLAGS with ccflags-y (Krzysztof Wilczynski)

   - Remove unused includes (Krzysztof Wilczynski)

   - Removed unused sysfs attribute groups (Ben Dooks)

   - Remove PTM and ASPM dependencies on PCIEPORTBUS (Bjorn Helgaas)

   - Add PCIe Link Control 2 register field definitions to replace magic
     numbers in AMDGPU and Radeon CIK/SI (Bjorn Helgaas)

   - Fix incorrect Link Control 2 Transmit Margin usage in AMDGPU and
     Radeon CIK/SI PCIe Gen3 link training (Bjorn Helgaas)

   - Use pcie_capability_read_word() instead of pci_read_config_word()
     in AMDGPU and Radeon CIK/SI (Frederick Lawler)

   - Remove unused pci_irq_get_node() Greg Kroah-Hartman)

   - Make asm/msi.h mandatory and simplify PCI_MSI_IRQ_DOMAIN Kconfig
     (Palmer Dabbelt, Michal Simek)

   - Read all 64 bits of Switchtec part_event_bitmap (Logan Gunthorpe)

   - Fix erroneous intel-iommu dependency on CONFIG_AMD_IOMMU (Bjorn
     Helgaas)

   - Fix bridge emulation big-endian support (Grzegorz Jaszczyk)

   - Fix dwc find_next_bit() usage (Niklas Cassel)

   - Fix pcitest.c fd leak (Hewenliang)

   - Fix typos and comments (Bjorn Helgaas)

   - Fix Kconfig whitespace errors (Krzysztof Kozlowski)"

* tag 'pci-v5.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (160 commits)
  PCI: Remove PCI_MSI_IRQ_DOMAIN architecture whitelist
  asm-generic: Make msi.h a mandatory include/asm header
  Revert "nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T"
  PCI/MSI: Fix incorrect MSI-X masking on resume
  PCI/MSI: Move power state check out of pci_msi_supported()
  PCI/MSI: Remove unused pci_irq_get_node()
  PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer
  PCI: hv: Change pci_protocol_version to per-hbus
  PCI: hv: Add hibernation support
  PCI: hv: Reorganize the code in preparation of hibernation
  MAINTAINERS: Remove Keith from VMD maintainer
  PCI/ASPM: Remove PCIEASPM_DEBUG Kconfig option and related code
  PCI/ASPM: Add sysfs attributes for controlling ASPM link states
  PCI: Fix indentation
  drm/radeon: Prefer pcie_capability_read_word()
  drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions
  drm/radeon: Correct Transmit Margin masks
  drm/amdgpu: Prefer pcie_capability_read_word()
  PCI: uniphier: Set mode register to host mode
  drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
  ...
2019-12-03 13:58:22 -08:00
Monk Liu
4905880b45 drm/amdgpu: fix GFX10 missing CSIB set(v3)
still need to init csb even for SRIOV

v2:
drop init_pg() for gfx10 at all since
PG and GFX off feature will be fully controled
by RLC and SMU fw for gfx10

v3:
drop the flush_gpu_tlb lines since we consider
it is only usefull in emulation

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:58:08 -05:00
Monk Liu
cd05b51aaa drm/amdgpu: should stop GFX ring in hw_fini
To align with the scheme from gfx9

disabling GFX ring after VM shutdown could avoid
garbage data be fetched to GFX RB which may lead
to unnecessary screw up on GFX

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:57:54 -05:00
Monk Liu
dacf56e45d drm/amdgpu: do autoload right after MEC loaded for SRIOV VF
since we don't have RLCG ucode loading and no SRlist as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:57:39 -05:00
Monk Liu
6294017fe3 drm/amdgpu: skip rlc ucode loading for SRIOV gfx10
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:57:17 -05:00
Monk Liu
747d4f715f drm/amdgpu: fix calltrace during kmd unload(v3)
issue:
kernel would report a warning from a double unpin
during the driver unloading on the CSB bo

why:
we unpin it during hw_fini, and there will be another
unpin in sw_fini on CSB bo.

fix:
actually we don't need to pin/unpin it during
hw_init/fini since it is created with kernel pinned,
we only need to fullfill the CSB again during hw_init
to prevent CSB/VRAM lost after S3

v2:
get_csb in init_rlc so hw_init() will make CSIB content
back even after reset or s3

v3:
use bo_create_kernel instead of bo_create_reserved for CSB
otherwise the bo_free_kernel() on CSB is not aligned and
would lead to its internal reserve pending there forever

take care of gfx7/8 as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:57:00 -05:00
Xiaojie Yuan
fa2b93e39b drm/amdgpu/gfx10: unlock srbm_mutex after queue programming finish
srbm_mutex is to guarantee atomicity for r/w of gfx indexed registers

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:52:55 -05:00
John Clements
f0312f45a0 drm/amdgpu: Added ASIC specific checks in gfxhub V1.1 get XGMI info
Added max hive/node info checks per supported ASIC

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:52:11 -05:00
Monk Liu
e2195f7d0e drm/amdgpu: use CPU to flush vmhub if sched stopped
otherwse the flush_gpu_tlb will hang if we unload the
KMD becuase the schedulers already stopped

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:39:21 -05:00
Yong Zhao
6dcab16b41 drm/amdkfd: Contain MMHUB number in mmhub_v9_4_setup_vm_pt_regs()
Adjust the exposed function prototype so that the caller does not need
to know the MMHUB number.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:08:24 -05:00
Hawking Zhang
7091b60cad drm/amdgpu: load np fw prior before loading the TAs
Platform TAs will independently toggle DF Cstate.
for instance, get/set topology from xgmi ta. do error
injection from ras ta. In such case, PMFW needs to be
loaded before TAs so that all the subsequent Cstate
calls recieved by PSP FW can be routed to PMFW.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:08:10 -05:00
Hawking Zhang
71e5f0cb93 drm/amdgpu: unload asd in psp hw de-init phase
issue unload_ta_cmd to tOS to unload asd driver

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:08:03 -05:00
Hawking Zhang
c64ab8280e drm/amdgpu: drop asd shared memory
asd shared memory is not needed since drivers doesn't
invoke any further cmd to asd directly after the asd
loading. trust application is the one who needs
to talk to asd after the initialization

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-03 11:07:56 -05:00
Emily Deng
0ea203a912 drm/amdgpu/sriov: No need the event 3 and 4 now
As will call unload kms when initialize fail, and the unload kms will
send event 3 and 4, so don't need event 3 and 4 in device init.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:39:28 -05:00
John Clements
031514956b drm/amdgpu: Added ASIC specific checks in gfxhub V1.1 get XGMI info
Added max hive/node info checks per supported ASIC

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:39:20 -05:00
Bhawanpreet Lakha
a7f4ba7a6c drm/amd/display: Load TA firmware for navi10/12/14
load the ta firmware for navi10/12/14.
This is already being done for raven/picasso and
is needed for supporting hdcp on navi

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:39:13 -05:00
Yintian Tao
7c868b592d drm/amdgpu: not remove sysfs if not create sysfs
When load amdgpu failed before create pm_sysfs and ucode_sysfs,
the pm_sysfs and ucode_sysfs should not be removed.
Otherwise, there will be warning call trace just like below.
[   24.836386] [drm] VCE initialized successfully.
[   24.841352] amdgpu 0000:00:07.0: amdgpu_device_ip_init failed
[   25.370383] amdgpu 0000:00:07.0: Fatal error during GPU init
[   25.889575] [drm] amdgpu: finishing device.
[   26.069128] amdgpu 0000:00:07.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[   26.070110] [drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
[   26.200309] [TTM] Finalizing pool allocator
[   26.200314] [TTM] Finalizing DMA pool allocator
[   26.200349] [TTM] Zone  kernel: Used memory at exit: 0 KiB
[   26.200351] [TTM] Zone   dma32: Used memory at exit: 0 KiB
[   26.200353] [drm] amdgpu: ttm finalized
[   26.205329] ------------[ cut here ]------------
[   26.205330] sysfs group 'fw_version' not found for kobject '0000:00:07.0'
[   26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256 sysfs_remove_group+0x80/0x90
[   26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE) drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables x_tables autofs4 8139too psmouse 8139cp mii i2c_piix4 pata_acpi floppy
[   26.205369] CPU: 0 PID: 1228 Comm: modprobe Tainted: G           OE     5.2.0-rc1 #1
[   26.205370] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   26.205372] RIP: 0010:sysfs_remove_group+0x80/0x90
[   26.205374] Code: e8 35 b9 ff ff 5b 41 5c 41 5d 5d c3 48 89 df e8 f6 b5 ff ff eb c6 49 8b 55 00 49 8b 34 24 48 c7 c7 48 7a 70 98 e8 60 63 d3 ff <0f> 0b eb d7 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55
[   26.205375] RSP: 0018:ffffbee242b0b908 EFLAGS: 00010282
[   26.205376] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[   26.205377] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff97ad6f817380
[   26.205377] RBP: ffffbee242b0b920 R08: ffffffff98f520c4 R09: 00000000000002b3
[   26.205378] R10: ffffbee242b0b8f8 R11: 00000000000002b3 R12: ffffffffc0e58240
[   26.205379] R13: ffff97ad6d1fe0b0 R14: ffff97ad4db954c8 R15: ffff97ad4db7fff0
[   26.205380] FS:  00007ff3d8a1c4c0(0000) GS:ffff97ad6f800000(0000) knlGS:0000000000000000
[   26.205381] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   26.205381] CR2: 00007f9b2ef1df04 CR3: 000000042aab8001 CR4: 00000000003606f0
[   26.205384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   26.205385] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   26.205385] Call Trace:
[   26.205461]  amdgpu_ucode_sysfs_fini+0x18/0x20 [amdgpu]
[   26.205518]  amdgpu_device_fini+0x3b4/0x560 [amdgpu]
[   26.205573]  amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu]
[   26.205623]  amdgpu_driver_load_kms+0xcd/0x250 [amdgpu]
[   26.205637]  drm_dev_register+0x12b/0x1c0 [drm]
[   26.205695]  amdgpu_pci_probe+0x12a/0x1e0 [amdgpu]
[   26.205699]  local_pci_probe+0x47/0xa0
[   26.205701]  pci_device_probe+0x106/0x1b0
[   26.205704]  really_probe+0x21a/0x3f0
[   26.205706]  driver_probe_device+0x11c/0x140
[   26.205707]  device_driver_attach+0x58/0x60
[   26.205709]  __driver_attach+0xc3/0x140

Signed-off-by: Yintian Tao <yttao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:39:05 -05:00