Commit Graph

7117 Commits

Author SHA1 Message Date
Monk Liu
d5939e4db5 drm/amdgpu: fix GFX10 missing CSIB set(v3)
still need to init csb even for SRIOV

v2:
drop init_pg() for gfx10 at all since
PG and GFX off feature will be fully controled
by RLC and SMU fw for gfx10

v3:
drop the flush_gpu_tlb lines since we consider
it is only usefull in emulation

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:38:56 -05:00
Monk Liu
eb529b8e46 drm/amdgpu: should stop GFX ring in hw_fini
To align with the scheme from gfx9

disabling GFX ring after VM shutdown could avoid
garbage data be fetched to GFX RB which may lead
to unnecessary screw up on GFX

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:38:49 -05:00
Monk Liu
6de40f02b3 drm/amdgpu: do autoload right after MEC loaded for SRIOV VF
since we don't have RLCG ucode loading and no SRlist as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:38:43 -05:00
Monk Liu
1797ec7ffd drm/amdgpu: skip rlc ucode loading for SRIOV gfx10
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:38:37 -05:00
Monk Liu
82a829dc8c drm/amdgpu: fix calltrace during kmd unload(v3)
issue:
kernel would report a warning from a double unpin
during the driver unloading on the CSB bo

why:
we unpin it during hw_fini, and there will be another
unpin in sw_fini on CSB bo.

fix:
actually we don't need to pin/unpin it during
hw_init/fini since it is created with kernel pinned,
we only need to fullfill the CSB again during hw_init
to prevent CSB/VRAM lost after S3

v2:
get_csb in init_rlc so hw_init() will make CSIB content
back even after reset or s3

v3:
use bo_create_kernel instead of bo_create_reserved for CSB
otherwise the bo_free_kernel() on CSB is not aligned and
would lead to its internal reserve pending there forever

take care of gfx7/8 as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:38:27 -05:00
Monk Liu
869aebc7ba drm/amdgpu: use CPU to flush vmhub if sched stopped
otherwse the flush_gpu_tlb will hang if we unload the
KMD becuase the schedulers already stopped

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:38:03 -05:00
James Zhu
45317d5ffb drm/amdgpu/gfx: Increase dispatch packet number
For Arcturus, increase dispatch packet number to stress scheduler.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:37:57 -05:00
James Zhu
2255d7f36e drm/amdgpu/gfx: Clear more EDC cnt
Clear SDMA and HDP EDC counter in GPR workarounds.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:37:51 -05:00
Xiaojie Yuan
858054f761 drm/amdgpu/gfx10: remove outdated comments
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:37:44 -05:00
Xiaojie Yuan
a5e82d0b95 drm/amdgpu/gfx10: unlock srbm_mutex after queue programming finish
srbm_mutex is to guarantee atomicity for r/w of gfx indexed registers

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:37:29 -05:00
Linus Torvalds
aa32f11691 hmm related patches for 5.5
This is another round of bug fixing and cleanup. This time the focus is on
 the driver pattern to use mmu notifiers to monitor a VA range. This code
 is lifted out of many drivers and hmm_mirror directly into the
 mmu_notifier core and written using the best ideas from all the driver
 implementations.
 
 This removes many bugs from the drivers and has a very pleasing
 diffstat. More drivers can still be converted, but that is for another
 cycle.
 
 - A shared branch with RDMA reworking the RDMA ODP implementation
 
 - New mmu_interval_notifier API. This is focused on the use case of
   monitoring a VA and simplifies the process for drivers
 
 - A common seq-count locking scheme built into the mmu_interval_notifier
   API usable by drivers that call get_user_pages() or hmm_range_fault()
   with the VA range
 
 - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen GntDev
   drivers to the new API. This deletes a lot of wonky driver code.
 
 - Two improvements for hmm_range_fault(), from testing done by Ralph
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl3cCjQACgkQOG33FX4g
 mxpp8xAAiR9iOdT28m/tx1GF31XludrMhRZVIiz0vmCIxIiAkWekWEfAEVm9PDnh
 wdrxTJohSs+B65AK3sfToOM3AIuNCuFVWmbbHI5qmOO76vaSvcZa905Z++pNsawO
 Bn8mgRCprYoFHcxWLvTvnA5U0g1S2BSSOwBSZI43CbEnVvHjYAR6MnvRqfGMk+NF
 bf8fTk/x+fl0DCemhynlBLuJkogzoE2Hgl0yPY5bFna4PktOxdpa1yPaQsiqZ7e6
 2s2NtM3pbMBJk0W42q5BU+aPhiqfxFFszasPSLBduXrD2xDsG76HJdHj5VydKmfL
 nelG4BvqJozXTEZWvTEePYhCqaZ41eJZ7Asw8BXtmacVqE5mDlTXo/Zdgbz7yEOR
 mI5MVyjD5rauZJldUOWXbwrPoWVFRvboauehiSgqvxvT9HvlFp9GKObSuu4gubBQ
 mzxs4t48tPhA7bswLmw0/pETSogFuVDfaB7hsyY0gi8EwxMFMpw2qFypm1PEEF+C
 BuUxCSShzvNKrraNe5PWaNNFd3AzIwAOWJHE+poH4bCoXQVr5nA+rq2gnHkdY5vq
 /xrBCyxkf0U05YoFGYembPVCInMehzp9Xjy8V+SueSvCg2/TYwGDCgGfsbe9dNOP
 Bc40JpS7BDn5w9nyLUJmOx7jfruNV6kx1QslA7NDDrB/rzOlsEc=
 =Hj8a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull hmm updates from Jason Gunthorpe:
 "This is another round of bug fixing and cleanup. This time the focus
  is on the driver pattern to use mmu notifiers to monitor a VA range.
  This code is lifted out of many drivers and hmm_mirror directly into
  the mmu_notifier core and written using the best ideas from all the
  driver implementations.

  This removes many bugs from the drivers and has a very pleasing
  diffstat. More drivers can still be converted, but that is for another
  cycle.

   - A shared branch with RDMA reworking the RDMA ODP implementation

   - New mmu_interval_notifier API. This is focused on the use case of
     monitoring a VA and simplifies the process for drivers

   - A common seq-count locking scheme built into the
     mmu_interval_notifier API usable by drivers that call
     get_user_pages() or hmm_range_fault() with the VA range

   - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen
     GntDev drivers to the new API. This deletes a lot of wonky driver
     code.

   - Two improvements for hmm_range_fault(), from testing done by Ralph"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap
  mm/hmm: make full use of walk_page_range()
  xen/gntdev: use mmu_interval_notifier_insert
  mm/hmm: remove hmm_mirror and related
  drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
  drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
  drm/amdgpu: Call find_vma under mmap_sem
  nouveau: use mmu_interval_notifier instead of hmm_mirror
  nouveau: use mmu_notifier directly for invalidate_range_start
  drm/radeon: use mmu_interval_notifier_insert
  RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
  RDMA/odp: Use mmu_interval_notifier_insert()
  mm/hmm: define the pre-processor related parts of hmm.h even if disabled
  mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror
  mm/mmu_notifier: add an interval tree notifier
  mm/mmu_notifier: define the header pre-processor parts even if disabled
  mm/hmm: allow snapshot of the special zero page
2019-11-30 10:33:14 -08:00
Bjorn Helgaas
e87eb585d3 Merge branch 'pci/misc'
- Add NumaChip SPDX header (Krzysztof Wilczynski)

  - Replace EXTRA_CFLAGS with ccflags-y (Krzysztof Wilczynski)

  - Remove unused includes (Krzysztof Wilczynski)

  - Avoid AMD FCH XHCI USB PME# from D0 defect that prevents wakeup on USB
    2.0 or 1.1 connect events (Kai-Heng Feng)

  - Removed unused sysfs attribute groups (Ben Dooks)

  - Remove PTM and ASPM dependencies on PCIEPORTBUS (Bjorn Helgaas)

  - Add PCIe Link Control 2 register field definitions to replace magic
    numbers in AMDGPU and Radeon CIK/SI (Bjorn Helgaas)

  - Fix incorrect Link Control 2 Transmit Margin usage in AMDGPU and Radeon
    CIK/SI PCIe Gen3 link training (Bjorn Helgaas)

  - Use pcie_capability_read_word() instead of pci_read_config_word() in
    AMDGPU and Radeon CIK/SI (Frederick Lawler)

* pci/misc:
  drm/radeon: Prefer pcie_capability_read_word()
  drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions
  drm/radeon: Correct Transmit Margin masks
  drm/amdgpu: Prefer pcie_capability_read_word()
  drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
  drm/amdgpu: Correct Transmit Margin masks
  PCI: Add #defines for Enter Compliance, Transmit Margin
  PCI: Allow building PCIe things without PCIEPORTBUS
  PCI: Remove PCIe Kconfig dependencies on PCI
  PCI/ASPM: Remove dependency on PCIEPORTBUS
  PCI/PTM: Remove dependency on PCIEPORTBUS
  PCI/PTM: Remove spurious "d" from granularity message
  PCI: sysfs: Remove unused attribute groups
  x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect
  PCI: Remove unused includes and superfluous struct declaration
  x86/PCI: Replace deprecated EXTRA_CFLAGS with ccflags-y
  x86/PCI: Correct SPDX comment style
  x86/PCI: Add NumaChip SPDX GPL-2.0 to replace COPYING boilerplate
2019-11-28 08:54:32 -06:00
Dan Carpenter
f4618fe9c2 drm/amdgpu: Fix a bug in jpeg_v1_0_start()
Originally the last WREG32_SOC15() was a part of the if statement block
but the curly braces are on the wrong line.

Fixes: bb0db70f3f ("drm/amdgpu: separate JPEG1.0 code out from VCN1.0")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:04 -05:00
Alex Deucher
5149f08275 drm/amdgpu: flag vram lost on baco reset for VI/CIK
VI/CIK BACO was inflight when this fix landed for SOC15/NV.
Add the fix to VI/CIK as well.

Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Alex Deucher
de185019bc drm/amdgpu: move pci handling out of pm ops
The documentation says the that PCI core handles this
for you unless you choose to implement it.  Just rely
on the PCI core to handle the pci specific bits.

Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Hawking Zhang
be3e73ea7d drm/amdgpu: apply gpr/gds workaround before enabling GFX EDC mode
gfx memory should be initialized before enabling
DED and FUE field in mmGB_EDC_MODE

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Jack Zhang
e416fdb6a3 drm/amd/amdgpu/sriov skip jpeg ip block for ARCTURUS VF
Currently ARCTURUS VF doesn't support jpeg ip block.
Skip jpeg ip block in case guest driver load fail.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Zhexi Zhang <zhexi.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Felix Kuehling
9f890f3044 drm/amdgpu: Optimize KFD page table reservation
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.

Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU:  1GB
New page table reservation per GPU: 32MB

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Felix Kuehling
b72ff1909c drm/amdgpu: Raise KFD unpinned system memory limit
Allow KFD applications to use more unpinned system memory through
HMM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Felix Kuehling
29a39c90ba drm/amdgpu: Optimize KFD page table reservation
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.

Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU:  1GB
New page table reservation per GPU: 32MB

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 12:24:07 -05:00
Alex Deucher
dea8b90029 drm/amdgpu: flag vram lost on baco reset for VI/CIK
VI/CIK BACO was inflight when this fix landed for SOC15/NV.
Add the fix to VI/CIK as well.

Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 12:22:31 -05:00
John Clements
5985ebbe78 drm/amdgpu: Resolved offchip EEPROM I/O issue
Updated target I2C address

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 12:20:22 -05:00
Nathan Chancellor
a63141e317 drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG
Commit b0f3cd3191 ("drm/amdgpu: remove unnecessary JPEG2.0 code from
VCN2.0") introduced a new clang warning in the vcn_v2_0_stop function:

../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: warning: variable 'r'
is used uninitialized whenever 'while' loop exits because its condition
is false [-Wsometimes-uninitialized]
        SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
                while ((tmp_ & (mask)) != (expected_value)) {   \
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1083:6: note: uninitialized use
occurs here
        if (r)
            ^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: note: remove the
condition if it is always true
        SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
        ^
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
                while ((tmp_ & (mask)) != (expected_value)) {   \
                       ^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1072:7: note: initialize the
variable 'r' to silence this warning
        int r;
             ^
              = 0
1 warning generated.

To prevent warnings like this from happening in the future, make the
SOC15_WAIT_ON_RREG macro initialize its ret variable before the while
loop that can time out. This macro's return value is always checked so
it should set ret in both the success and fail path.

Link: https://github.com/ClangBuiltLinux/linux/issues/776
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 09:48:57 -05:00
John Clements
8633f126bf drm/amdgpu: Resolved offchip EEPROM I/O issue
Updated target I2C address

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-25 11:20:10 -05:00
Oak Zeng
3d3f9ba8c4 drm/amdgpu: Apply noretry setting for mmhub9.4
Config the translation retry behavior according to noretry
kernel parameter

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-25 11:19:55 -05:00
Jason Gunthorpe
81fa1af31b drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
Convert the collision-retry lock around hmm_range_fault to use the one now
provided by the mmu_interval notifier.

Although this driver does not seem to use the collision retry lock that
hmm provides correctly, it can still be converted over to use the
mmu_interval_notifier api instead of hmm_mirror without too much trouble.

This also deletes another place where a driver is associating additional
data (struct amdgpu_mn) with a mmu_struct.

Link: https://lore.kernel.org/r/20191112202231.3856-13-jgg@ziepe.ca
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23 19:56:45 -04:00
Jason Gunthorpe
62914a99de drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
Remove the interval tree in the driver and rely on the tree maintained by
the mmu_notifier for delivering mmu_notifier invalidation callbacks.

For some reason amdgpu has a very complicated arrangement where it tries
to prevent duplicate entries in the interval_tree, this is not necessary,
each amdgpu_bo can be its own stand alone entry. interval_tree already
allows duplicates and overlaps in the tree.

Also, there is no need to remove entries upon a release callback, the
mmu_interval API safely allows objects to remain registered beyond the
lifetime of the mm. The driver only has to stop touching the pages during
release.

Link: https://lore.kernel.org/r/20191112202231.3856-12-jgg@ziepe.ca
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23 19:56:45 -04:00
Jason Gunthorpe
a9ae8731e6 drm/amdgpu: Call find_vma under mmap_sem
find_vma() must be called under the mmap_sem, reorganize this code to
do the vma check after entering the lock.

Further, fix the unlocked use of struct task_struct's mm, instead use
the mm from hmm_mirror which has an active mm_grab. Also the mm_grab
must be converted to a mm_get before acquiring mmap_sem or calling
find_vma().

Fixes: 66c45500bf ("drm/amdgpu: use new HMM APIs and helpers")
Fixes: 0919195f2b ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads")
Link: https://lore.kernel.org/r/20191112202231.3856-11-jgg@ziepe.ca
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23 19:56:44 -04:00
changzhu
f920d1bb9c drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.

After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-22 14:55:19 -05:00
changzhu
6c2c897237 drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
SW must acquire/release one of the vm_invalidate_eng*_sem around the
invalidation req/ack. Through this way,it can avoid losing invalidate
acknowledge state across power-gating off cycle.
To use vm_invalidate_eng*_sem, it needs to initialize
vm_invalidate_eng*_sem firstly.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-22 14:55:19 -05:00
Jack Zhang
1b34de7c3f drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
After rlcg fw 2.1, kmd driver starts to load extra fw for
LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw
because all rlcg related fw have been loaded by host driver.
Guest driver would load the three fw fail without this change.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:19 -05:00
Jack Zhang
ef1c0cbcd1 drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF
Currently the three features haven't been enabled at SRIOV, it would
trigger guest driver load fail with the bare-metal path of the three
features.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:19 -05:00
Xiaojie Yuan
210b3b3c75 drm/amdgpu/gfx10: re-init clear state buffer after gpu reset
This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x.

clear state buffer (resides in vram) is corrupted after 1st baco reset,
upon gfxoff exit, CPF gets garbage header in CSIB and hangs.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-22 14:55:12 -05:00
Stephen Rothwell
a3511321fd merge fix for "ftrace: Rework event_create_dir()"
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:12 -05:00
Jay Cornwall
57fb0ab2f1 drm/amdgpu: Update Arcturus golden registers
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:12 -05:00
Xiaojie Yuan
908a28be09 drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access
Fixes: 0900a9efdb ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)")
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:12 -05:00
Xiaojie Yuan
1e902a6d32 drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt
50us is not enough to wait for cp ready after gpu reset on some navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-22 14:54:56 -05:00
Alex Deucher
5e18d2b14c Revert "drm/amd/display: enable S/G for RAVEN chip"
This reverts commit 1c42591591.

S/G display is not stable with the IOMMU enabled on some
platforms.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:11 -05:00
Alex Deucher
8fc4134413 drm/amdgpu: disable gfxoff on original raven
There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:11 -05:00
Alex Deucher
5355d7e054 drm/amdgpu: remove experimental flag for Navi14
5.4 and newer works fine with navi14.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Alex Deucher
70f7eb639e drm/amdgpu: disable gfxoff when using register read interface
When gfxoff is enabled, accessing gfx registers via MMIO
can lead to a hang.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497
Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Sam Bobroff
3d0e3ce52c drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
The INTERRUPT_CNTL2 register expects a valid DMA address, but is
currently set with a GPU MC address.  This can cause problems on
systems that detect the resulting DMA read from an invalid address
(found on a Power8 guest).

Instead, use the DMA address of the dummy page because it will always
be safe.

Fixes: 27ae10641e ("drm/amdgpu: add interupt handler implementation for si v3")
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Alex Deucher
f8a69a8022 drm/amdgpu/nv: add asic func for fetching vbios from rom directly
Needed as a fallback if the vbios can't be fetched by other means.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Yintian Tao
c0e21ea1d0 drm/amdgpu: put flush_delayed_work at first
There is one regression from 042f3d7b745cd76aa
To put flush_delayed_work after adev->shutdown = true
which will make amdgpu_ih_process not response the irq
At last, all ib ring tests will be failed just like below

[drm] amdgpu: finishing device.
[drm] Fence fallback timer expired on ring gfx
[drm] Fence fallback timer expired on ring comp_1.0.0
[drm] Fence fallback timer expired on ring comp_1.1.0
[drm] Fence fallback timer expired on ring comp_1.2.0
[drm] Fence fallback timer expired on ring comp_1.3.0
[drm] Fence fallback timer expired on ring comp_1.0.1
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110).
[drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).

v2: replace cancel_delayed_work_sync() with flush_delayed_work()

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Leo Liu
4e20f6550b drm/amdgpu/vcn2.5: fix the enc loop with hw fini
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Xiaojie Yuan
0900a9efdb drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)
1. no need to allocate an extra member for 'mqd_backup' array
2. backup/restore mqd to/from the correct 'mqd_backup' array slot

v2: warning fix (Alex)

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
zhengbin
e9c5dbc1a2 drm/amdgpu: Use ARRAY_SIZE for sos_old_versions
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/psp_v3_1.c:182:40-41: WARNING: Use ARRAY_SIZE

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
changzhu
4ed8a03740 drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.

After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
changzhu
dab5ef2722 drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
SW must acquire/release one of the vm_invalidate_eng*_sem around the
invalidation req/ack. Through this way,it can avoid losing invalidate
acknowledge state across power-gating off cycle.
To use vm_invalidate_eng*_sem, it needs to initialize
vm_invalidate_eng*_sem firstly.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Jack Zhang
c348ad46b0 drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
After rlcg fw 2.1, kmd driver starts to load extra fw for
LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw
because all rlcg related fw have been loaded by host driver.
Guest driver would load the three fw fail without this change.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Jack Zhang
edc2176d51 drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF
Currently the three features haven't been enabled at SRIOV, it would
trigger guest driver load fail with the bare-metal path of the three
features.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Xiaojie Yuan
c25edaaf75 drm/amdgpu/gfx10: re-init clear state buffer after gpu reset
This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x.

clear state buffer (resides in vram) is corrupted after 1st baco reset,
upon gfxoff exit, CPF gets garbage header in CSIB and hangs.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Stephen Rothwell
6a93b58e5f merge fix for "ftrace: Rework event_create_dir()"
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Colin Ian King
2e77541bd1 drm/amdgpu: remove redundant assignment to pointer write_frame
The pointer write_frame is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Alex Deucher
562b49fcd0 drm/amdgpu: simplify runtime suspend
In the standard _PR3 case, the pci core handles the pci state.
The driver only needs to handle it in the legacy ATPX case.

This may fix issues with runtime suspend/resume on certain
hybrid graphics laptops.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Jay Cornwall
6e04b2248d drm/amdgpu: Update Arcturus golden registers
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Dennis Li
f6c3623b7b drm/amdgpu: implement querying ras error count for mmhub9.4
Get mmhub error counter by accessing EDC_CNT registers.

v2: Add mmhub_v9_4_ prefix for local static variable and function

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Dennis Li
8781e5df11 drm/amdgpu: refine query function of mmhub EDC counter in vg20
Add codes to print the detail EDC info for the subblock of mmhub

v2: Move the EDC_CNT registers' defintion from mmhub_9_4 header
files to mmhub_1_0 ones. Add mmhub_v1_0_ prefix for the local
static variable and function.

v3: squash in DC fix

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:26:46 -05:00
Dennis Li
46f719696e drm/amdgpu: define soc15_ras_field_entry for reuse
The struct soc15_ras_field_entry will be reused by
other IPs, such as mmhub and gc

v2: rename ras_subblock_regs to gc_ras_fields_vg20,
because the future asic maybe have a different table.

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:23:09 -05:00
Xiaojie Yuan
d98a07aea6 drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access
Fixes: 0bb419c76b ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)")
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:22:56 -05:00
Xiaojie Yuan
387d40fd6f drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt
50us is not enough to wait for cp ready after gpu reset on some navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:20:30 -05:00
Alex Sierra
b4672c8a84 amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
Only for the debugger use case.

[why]
Avoid endless translation retries, after an invalid address access has
been issued to the GPU. Instead, the trap handler is forced to enter by
generating a no-retry-fault.
A s_trap instruction is inserted in the debugger case to let the wave to
enter trap handler to save context.

[how]
Intentionally using an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault, after a retry-fault happens. This is
only valid under compute context.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:20:23 -05:00
Alex Sierra
f43ef951f6 drm/amdgpu: add flag to indicate amdgpu vm context
Flag added to indicate if the amdgpu vm context is used for compute or
graphics.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:20:07 -05:00
Frederick Lawler
88027c89ea drm/amdgpu: Prefer pcie_capability_read_word()
Commit 8c0d3a02c1 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.

Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().

[bhelgaas: fix a couple remaining instances in cik.c]
Link: https://lore.kernel.org/r/20191118003513.10852-1-fred@fredlawl.com
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-21 11:15:49 -06:00
Bjorn Helgaas
35e768e296 drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
Replace hard-coded magic numbers with the descriptive PCI_EXP_LNKCTL2
definitions.  No functional change intended.

Link: https://lore.kernel.org/r/20191112173503.176611-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-21 07:52:34 -06:00
Bjorn Helgaas
19d7a95a8b drm/amdgpu: Correct Transmit Margin masks
Previously we masked PCIe Link Control 2 register values with "7 << 9",
which was apparently intended to be the Transmit Margin field, but instead
was the high order bit of Transmit Margin, the Enter Modified Compliance
bit, and the Compliance SOS bit.

Correct the mask to "7 << 7", which is the Transmit Margin field.

Link: https://lore.kernel.org/r/20191112173503.176611-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-21 07:52:34 -06:00
Dave Airlie
c22fe762ba Merge tag 'drm-next-5.5-2019-11-15' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-15:

amdgpu:
- Fix AVFS handling on SMU7 parts with custom power tables
- Enable Overdrive sysfs interface for Navi parts
- Fix power limit handling on smu11 parts
- Fix pcie link sysfs output for Navi
- Probably cancel MM worker threads on shutdown

radeon:
- Cleanup for ppc change

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115163516.3714-1-alexander.deucher@amd.com
2019-11-21 08:52:27 +10:00
Alex Deucher
72f058b723 drm/amdgpu: enable runtime pm on BACO capable boards if runpm=1
BACO - Bus Active, Chip Off

Everything is in place now.  Not enabled by default yet.  You
still have to specify runpm=1.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:54 -05:00
Alex Deucher
3840c5bcc2 drm/amdgpu: disentangle runtime pm and vga_switcheroo
Originally we only supported runtime pm on PX/HG laptops
so vga_switcheroo and runtime pm are sort of entangled.

Attempt to logically separate them.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:53 -05:00
Alex Deucher
6ae6c7d404 drm/amdgpu: start to disentangle boco from runtime pm
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

We originally only supported runtime pm on PX/HG
laptops so most of the runtime pm code looks for this.
Add a new flag to check for runtime pm enablement and
use this rather than checking for PX/HG.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:53 -05:00
Alex Deucher
1913431728 drm/amdgpu: add baco support to runtime suspend/resume
BACO - Bus Active, Chip Off

This adds the necessary support to the runtime suspend
and resume functions to handle boards that support
baco.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:52 -05:00
Alex Deucher
361dbd01a1 drm/amdgpu: add helpers for baco entry and exit
BACO - Bus Active, Chip Off

Will be used for runtime pm.  Entry will enter the BACO
state (chip off).  Exit will exit the BACO state (chip on).

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:52 -05:00
Alex Deucher
11520f2708 drm/amdgpu: split swSMU baco_reset into enter and exit
BACO - Bus Active, Chip Off

So we can use it for power savings rather than just reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:51 -05:00
Alex Deucher
b97e9d47e5 drm/amdgpu: add additional boco checks to runtime suspend/resume (v2)
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

We will take slightly different paths for boco and baco.

v2: fold together two consecutive if clauses

Reviewed-by: Evan Quan <evan.quan@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:50 -05:00
Alex Deucher
31af062acf drm/amdgpu: rename amdgpu_device_is_px to amdgpu_device_supports_boco (v2)
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

To better match what we are checking for and to align with
amdgpu_device_supports_baco.

BOCO is used on PowerXpress/Hybrid Graphics systems and BACO
is used on desktop dGPU boards.

v2: fix typo in documentation

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:50 -05:00
Alex Deucher
a69cba42b1 drm/amdgpu: add a amdgpu_device_supports_baco helper
BACO - Bus Active, Chip Off

To check if a device supports BACO or not.  This will be
used in determining when to enable runtime pm.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:49 -05:00
Alex Deucher
ac7426169e drm/amdgpu: add supports_baco callback for NV asics.
BACO - Bus Active, Chip Off

Check the BACO capabilities from the powerplay table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:48 -05:00
Alex Deucher
e45ed9435f drm/amdgpu: add supports_baco callback for VI asics.
BACO - Bus Active, Chip Off

Check the BACO capabilities from the powerplay table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:48 -05:00
Alex Deucher
0d0c07ee07 drm/amdgpu: add supports_baco callback for CIK asics.
BACO - Bus Active, Chip Off

Check the BACO capabilities from the powerplay table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:47 -05:00
Alex Deucher
3670c242e3 drm/amdgpu: add supports_baco callback for SI asics.
BACO - Bus Active, Chip Off

Not supported.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:47 -05:00
Alex Deucher
988eb9ff3e drm/amdgpu: add supports_baco callback for soc15 asics. (v2)
BACO - Bus Active, Chip Off

Check the BACO capabilities from the powerplay table.

v2: drop unrelated struct cleanup

Reviewed-by: Evan Quan <evan.quan@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:46 -05:00
Alex Deucher
69d5436d4d drm/amdgpu: add asic callback for BACO support
BACO - Bus Active, Chip Off

Used to check whether the device supports BACO.  This will
be used to enable runtime pm on devices which support BACO.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:46 -05:00
Hawking Zhang
858a2bbad6 drm/amdgpu: pull ras controller int status only when ras enabled
ras_controller_irq and athub_err_event_irq are only registered
when PCIE_BIF ras is marked as supported. as the result, the driver
also just need pull the int status in such case.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:09:23 -05:00
Hawking Zhang
5bdd0b72d6 drm/amdgpu: switch to common helper func for psp cmd submission
Drop all the IP specific cmd_submit callback function
and use the common helper instead

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:09:13 -05:00
Hawking Zhang
cc65176e51 drm/amdgpu: add helper func for psp ring cmd submission
Except for ring wptr update, the psp ring cmd submission
function shouldn't be IP specific one. Create a common
helper function to be shared for all the ASICs.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:09:06 -05:00
Hawking Zhang
13a390a6f9 drm/amdgpu: add psp funcs for ring write pointer read/write
The ring write pointer regsiter update is the only part that
is IP specific ones in psp_cmd_submit function.

Add two callbacks for wptr read/write so that we unify the
psp_cmd_submit function for all the ASICs.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:08:58 -05:00
Evan Quan
0a650c1d35 drm/amd/powerplay: add Arcturus baco reset support
Enable baco reset support on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:08:38 -05:00
Alex Deucher
2aa87ba568 Revert "drm/amd/display: enable S/G for RAVEN chip"
This reverts commit 1c42591591.

S/G display is not stable with the IOMMU enabled on some
platforms.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Alex Deucher
3f2a06ac81 drm/amdgpu: disable gfxoff on original raven
There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Alex Deucher
b62d955426 drm/amdgpu: remove experimental flag for Navi14
5.4 and newer works fine with navi14.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Alex Deucher
ca9317b918 drm/amdgpu: disable gfxoff when using register read interface
When gfxoff is enabled, accessing gfx registers via MMIO
can lead to a hang.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497
Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
zhengbin
1664194925 drm/amdgpu: remove not needed memset
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c:64:13-31: WARNING: dma_alloc_coherent use in ih -> ring already zeroes out memory,  so memset is not needed

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Sam Bobroff
b992691d45 drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
The INTERRUPT_CNTL2 register expects a valid DMA address, but is
currently set with a GPU MC address.  This can cause problems on
systems that detect the resulting DMA read from an invalid address
(found on a Power8 guest).

Instead, use the DMA address of the dummy page because it will always
be safe.

Fixes: 27ae10641e ("drm/amdgpu: add interupt handler implementation for si v3")
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Alex Deucher
29bc37b410 drm/amdgpu/nv: add asic func for fetching vbios from rom directly
Needed as a fallback if the vbios can't be fetched by other means.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Alex Deucher
761e09230c drm/amdgpu/soc15: move struct definition around to align with other soc15 asics
Move reset_method next to reset callback to match the struct layout and
the other definition in this file.

Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Yintian Tao
d0d13fe874 drm/amdgpu: put flush_delayed_work at first
There is one regression from 042f3d7b745cd76aa
To put flush_delayed_work after adev->shutdown = true
which will make amdgpu_ih_process not response the irq
At last, all ib ring tests will be failed just like below

[drm] amdgpu: finishing device.
[drm] Fence fallback timer expired on ring gfx
[drm] Fence fallback timer expired on ring comp_1.0.0
[drm] Fence fallback timer expired on ring comp_1.1.0
[drm] Fence fallback timer expired on ring comp_1.2.0
[drm] Fence fallback timer expired on ring comp_1.3.0
[drm] Fence fallback timer expired on ring comp_1.0.1
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110).
[drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).

v2: replace cancel_delayed_work_sync() with flush_delayed_work()

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Leo Liu
4effa8dbc1 drm/amdgpu/vcn2.5: fix the enc loop with hw fini
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Leo Liu
8c74e59049 drm/amdgpu: enable Arcturus JPEG2.5 block
It also doen't care about FW loading type, so enabling it directly.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
e89e2237e8 drm/amdgpu: enable Arcturus CG for VCN and JPEG blocks
Arcturus VCN and JPEG only got CG support, and no PG support

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
14f43e8f88 drm/amdgpu: move JPEG2.5 out from VCN2.5
And clean up the duplicated stuff

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
5be45a26c9 drm/amdgpu: enable JPEG2.0 for Navi1x and Renoir
By adding JPEG IP block to the family

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
52f2e779ad drm/amdgpu: add driver support for JPEG2.0 and above
By using JPEG IP block type

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
474b6d296f drm/amdgpu: enable JPEG2.0 dpm
By using its own enabling function

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
099d66e43f drm/amdgpu: add PG and CG for JPEG2.0
And enable them for Navi1x and Renoir

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
b0f3cd3191 drm/amdgpu: remove unnecessary JPEG2.0 code from VCN2.0
They are no longer needed, using from JPEG2.0 instead.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
6ac2724110 drm/amdgpu: add JPEG v2.0 function supports
It got separated from VCN2.0 with a new jpeg_v2_0_ip_block

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
2eb167293f drm/amdgpu: add JPEG common functions to amdgpu_jpeg
They will be used for JPEG2.0 and later.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
0388aee766 drm/amdgpu: use the JPEG structure for general driver support
JPEG1.0 will be functional along with VCN1.0

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Leo Liu
bb0db70f3f drm/amdgpu: separate JPEG1.0 code out from VCN1.0
For VCN1.0, the separation is just in code wise, JPEG1.0 HW is still
included in the VCN1.0 HW.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Leo Liu
9d9cc9b8fe drm/amdgpu: add amdgpu_jpeg and JPEG tests
It will be used for all versions of JPEG eventually. Previous
JPEG tests will be removed later since they are still used by
JPEG2.x.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Leo Liu
88a1c40a04 drm/amdgpu: add JPEG HW IP and SW structures
It will be used for JPEG IP 1.0, 2.0, 2.5 and later.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Xiaojie Yuan
0bb419c76b drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)
1. no need to allocate an extra member for 'mqd_backup' array
2. backup/restore mqd to/from the correct 'mqd_backup' array slot

v2: warning fix (Alex)

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Hawking Zhang
9e612c11a7 drm/amdgpu: init umc functions for arcturus umc ras
reuse vg20 umc functions for arcturus umc ras

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 09:47:36 -05:00
Hawking Zhang
baaeb610b1 drm/amdgpu: enable ras capablity check on arcturus
check hw ras capablity via atomfirmware

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 09:47:30 -05:00
Jani Nikula
f139a62c7a drm/amdgpu: use drm_debug_enabled() to check for debug categories
Allow better abstraction of the drm_debug global variable in the
future. No functional changes.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Sean Paul <sean@poorly.run>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e30ddbd627be1fe1c67dd007e0d36f81a094bc91.1572258936.git.jani.nikula@intel.com
2019-11-14 14:08:39 +02:00
Dave Airlie
0990ca235d Merge tag 'drm-next-5.5-2019-11-08' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-08:

amdgpu:
- Enable VCN dynamic powergating on RV/RV2
- Fixes for Navi14
- Misc Navi fixes
- Fix MSI-X tear down
- Misc Arturus fixes
- Fix xgmi powerstate handling
- Documenation fixes

scheduler:
- Fix static code checker warning
- Fix possible thread reactivation while thread is stopped
- Avoid cleanup if thread is parked

radeon:
- SI dpm fix ported from amdgpu

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108212713.5078-1-alexander.deucher@amd.com
2019-11-14 08:43:07 +10:00
yu kuai
9e089a29c6 drm/amdgpu: remove set but not used variable 'invalid'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c: In function
‘amdgpu_amdkfd_evict_userptr’:
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:1665:6: warning:
variable ‘invalid’ set but not used [-Wunused-but-set-variable]

'invalid' is never used, so can be removed. Thus 'atomic_inc_return'
can be replaced as 'atomic_inc'

Fixes: 5ae0283e83 ("drm/amdgpu: Add userptr support for KFD")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:46 -05:00
yu kuai
4f2922d12d drm/amdgpu: remove set but not used variable 'amdgpu_connector'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_display.c: In function
‘amdgpu_display_crtc_scaling_mode_fixup’:
drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:693:27: warning: variable
‘amdgpu_connector’ set but not used [-Wunused-but-set-variable]

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:46 -05:00
yu kuai
747a397d39 drm/amdgpu: remove set but not used variable 'mc_shared_chmap' from 'gfx_v6_0.c' and 'gfx_v7_0.c'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c: In function
‘gfx_v6_0_constants_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c:1579:6: warning: variable
‘mc_shared_chmap’ set but not used [-Wunused-but-set-variable]

drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c: In function
‘gfx_v7_0_gpu_early_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c:4262:6: warning: variable
‘mc_shared_chmap’ set but not used [-Wunused-but-set-variable]

Fixes: 2cd46ad223 ("drm/amdgpu: add graphic pipeline implementation for si v8")
Fixes: d93f3ca706 ("drm/amdgpu/gfx7: rework gpu_init()")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:46 -05:00
Colin Ian King
d785476c60 drm/amd/display: remove duplicated assignment to grph_obj_type
Variable grph_obj_type is being assigned twice, one of these is
redundant so remove it.

Addresses-Coverity: ("Evaluation order violation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
e98042db2c drm/amdgpu: remove set but not used variable 'mc_shared_chmap'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function
‘gfx_v8_0_gpu_early_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:1713:6: warning: variable
‘mc_shared_chmap’ set but not used [-Wunused-but-set-variable]

Fixes: 0bde3a95ea ("drm/amdgpu: split gfx8 gpu init into sw and hw parts")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
220ac8d144 drm/amdgpu: remove always false comparison in 'amdgpu_atombios_i2c_process_i2c_ch'
Fixes gcc '-Wtype-limits' warning:

drivers/gpu/drm/amd/amdgpu/atombios_i2c.c: In function
‘amdgpu_atombios_i2c_process_i2c_ch’:
drivers/gpu/drm/amd/amdgpu/atombios_i2c.c:79:11: warning: comparison is
always false due to limited range of data type [-Wtype-limits]

'num' is 'u8', so it will never be greater than 'TOM_MAX_HW_I2C_READ',
which is defined as 255. Therefore, the comparison can be removed.

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
d1d09dc417 drm/amdgpu: remove set but not used variable 'dig'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/atombios_dp.c: In function
‘amdgpu_atombios_dp_link_train’:
drivers/gpu/drm/amd/amdgpu/atombios_dp.c:716:34: warning: variable ‘dig’
set but not used [-Wunused-but-set-variable]

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
5bea7fedb7 drm/amdgpu: remove set but not used variable 'dig_connector'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/atombios_dp.c: In function
‘amdgpu_atombios_dp_get_panel_mode’:
drivers/gpu/drm/amd/amdgpu/atombios_dp.c:364:36: warning: variable
‘dig_connector’ set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
e8b74035d8 drm/amdgpu: add function parameter description in 'amdgpu_gart_bind'
Fixes gcc warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:313: warning: Function
parameter or member 'flags' not described in 'amdgpu_gart_bind'

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
b8b7213057 drm/amdgpu: add function parameter description in 'amdgpu_device_set_cg_state'
Fixes gcc warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1954: warning: Function
parameter or member 'state' not described in 'amdgpu_device_set_cg_state'

Fixes: e3ecdffac9 ("drm/amdgpu: add documentation for amdgpu_device.c")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
bae028e3e5 drm/amdgpu: remove 4 set but not used variable in amdgpu_atombios_get_connector_info_from_object_table
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c: In function
'amdgpu_atombios_get_connector_info_from_object_table':
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:376:26: warning: variable
'grph_obj_num' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:376:13: warning: variable
'grph_obj_id' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:341:37: warning: variable
'con_obj_type' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:341:24: warning: variable
'con_obj_num' set but not used [-Wunused-but-set-variable]

They are never used, so can be removed.

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha
b86a1aa36a drm/amd/display: rename DCN1_0 kconfig to DCN
Since dcn20 and dcn21 are under dcn1 it doesnt make sense to
have it named dcn1.

Change it to "dcn" to make it generic

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha
aca935c7cc drm/amd/display: Drop CONFIG_DRM_AMD_DC_DCN2_1 flag
[Why]

DCN21 is stable enough to be build by default. So drop the flags.

[How]

Remove them using the unifdef tool. The following commands were executed
in sequence:

$ find -name '*.c' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DCN2_1 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_1 '{}' ';'
$ find -name '*.h' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DCN2_1 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_1 '{}' ';'

In addition:

* Remove from kconfig, and replace any dependencies with DCN1_0.
* Remove from any makefiles.
* Fix and cleanup Renoir definitions in dal_asic_id.h
* Expand DCN1 ifdef to include DCN21 code in the following files:
    * clk_mgr/clk_mgr.c: dc_clk_mgr_create()
    * core/dc_resources.c: dc_create_resource_pool()
    * gpio/hw_factory.c: dal_hw_factory_init()
    * gpio/hw_translate.c: dal_hw_translate_init()

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha
1da37801a8 drm/amd/display: Drop CONFIG_DRM_AMD_DC_DCN2_0 and DSC_SUPPORTED
[Why]

DCN2 and DSC are stable enough to be build by default. So drop the flags.

[How]

Remove them using the unifdef tool. The following commands were executed
in sequence:

$ find -name '*.c' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DSC_SUPPORT -DCONFIG_DRM_AMD_DC_DCN2_0 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_0 '{}' ';'
$ find -name '*.h' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DSC_SUPPORT -DCONFIG_DRM_AMD_DC_DCN2_0 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_0 '{}' ';'

In addition:

* Remove from kconfig, and replace any dependencies with DCN1_0.
* Remove from any makefiles.
* Fix and cleanup NV defninitions in dal_asic_id.h
* Expand DCN1 ifdef to include DCN2 code in the following files:
    * clk_mgr/clk_mgr.c: dc_clk_mgr_create()
    * core/dc_resources.c: dc_create_resource_pool()
    * dce/dce_dmcu.c: dcn20_*lock_phy()
    * dce/dce_dmcu.c: dcn20_funcs
    * dce/dce_dmcu.c: dcn20_dmcu_create()
    * gpio/hw_factory.c: dal_hw_factory_init()
    * gpio/hw_translate.c: dal_hw_translate_init()

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Alex Deucher
622b2a0ab6 drm/amdgpu/vcn: finish delay work before release resources
flush/cancel delayed works before doing finalization
to avoid concurrently requests.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:42 -05:00
Nicholas Kazlauskas
976e51a7c0 drm/amdgpu: Add DMCUB to firmware query interface
The DMCUB firmware version can be read using the AMDGPU_INFO ioctl
or the amdgpu_firmware_info debugfs entry.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:42 -05:00
Nicholas Kazlauskas
2bd2a27ffc drm/amdgpu: Add PSP loading support for DMCUB ucode
DMCUB ucode requires secure loading through PSP. This is already
supported in PSP as GFX_FW_TYPE_DMUB, it just needs to be mapped from
AMDGPU_UCODE_ID_DMCUB to GFX_FW_TYPE_DMUB.

DMUB is a shorthand name for DMCUB and can be used interchangeably.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:42 -05:00
Nicholas Kazlauskas
02350f0bdf drm/amdgpu: Add ucode support for DMCUB
The DMCUB is a secondary DMCU (Display MicroController Unit) that has
its own separate firmware. It's required for DMCU support on Renoir.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:42 -05:00
Dave Airlie
77e0723bd2 Linux 5.4-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3IqJQeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOiUH+gOEDwid5OODaFAd
 CggXugdFIlBZefKqGVNW5sjgX8pxFWHXuEMC8iNb6QXtQZdFrI6LFf9hhUDmzQtm
 6y1LPxxEiTZjObMEsBNylb7tyzgujFHcAlp0Zro3w/HLCqmYTSP3FF46i2u6KZfL
 XhkpM4X7R7qxlfpdhlfESv/ElRGocZe6SwXfC7pcPo5flFcmkdu9ijqhNd/6CZ/h
 Nf9rTsD/wEDVUelFbgVN+LJzlaB0tsyc4Zbof07n8OsFZjhdEOop8gfM/kTBLcyY
 6bh66SfDScdsNnC/l8csbPjSZRx+i+nQs67DyhGNnsSAFgHBZdC4Tb/2mDCwhCLR
 dUvuYZc=
 =1N6F
 -----END PGP SIGNATURE-----

Merge v5.4-rc7 into drm-next

We have the i915 security fixes to backmerge, but first
let's clear the decks for other drivers to avoid a bigger
mess.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-11-14 05:53:10 +10:00
Jesse Zhang
9f87516764 drm/amd/amdgpu: finish delay works before release resources
flush/cancel delayed works before doing finalization
to avoid concurrently requests.

Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-11 17:38:13 -05:00
Hawking Zhang
51bd363857 drm/amdgpu: avoid upload corrupted ta ucode to psp
xgmi, ras, hdcp and dtm ta are actually separated ucode and
need to handled case by case to upload to psp.

We support the case that ta binary have one or multiple of
them built-in. As a result, the driver should check each ta
binariy's availablity before decide to upload them to psp.

In the terminate (unload) case, the driver will check the
context readiness before perform unload activity. It's fine
to keep it as is.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-11 17:38:13 -05:00
changzhu
eebc7f4d7f drm/amdgpu: allow direct upload save restore list for raven2
It will cause modprobe atombios stuck problem in raven2 if it doesn't
allow direct upload save restore list from gfx driver.
So it needs to allow direct upload save restore list for raven2
temporarily.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-08 12:29:52 -05:00
Andrey Grodzovsky
a28fda312a drm/amdgpu: Avoid accidental thread reactivation.
Problem:
During GPU reset we call the GPU scheduler to suspend it's
thread, those two functions in amdgpu also suspend and resume
the sceduler for their needs but this can collide with GPU
reset in progress and accidently restart a suspended thread
before time.

Fix:
Serialize with GPU reset.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Andrey Grodzovsky
7c55adb0a9 Revert "drm/amdgpu: dont schedule jobs while in reset"
This reverts commit 89b3d86403.

We will do a proper fix in next patch.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Jonathan Kim
cb5932f866 drm/amdgpu: fix vega20 pstate status change
vega20 only requires all devices be set to same pstate level for low
pstate and not high.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Evan Quan <Evan.Quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Kevin Wang
2af8153126 drm/amdgpu: fix sysfs interface pcie_replay_count error on navi asic
the asic callback function of get_pcie_replay_count is not implement on navi asic,
it will cause null pinter error when read this interface.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Emily Deng
e31dcdcfab drm/amdgpu: Need to disable msix when unloading driver
For driver reload test, it will report "can't enable
MSI (MSI-X already enabled)".

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Oak Zeng
f6baa07497 drm/amdgpu: Add comments to gmc structure
Explain fields like aper_base, agp_start etc. The definition
of those fields are confusing as they are from different view
(CPU or GPU). Add comments for easier understand.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <Alex.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Alex Deucher
77a3160221 drm/amdgpu/renoir: move gfxoff handling into gfx9 module
To properly handle the option parsing ordering.

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
changzhu
440a7a54e7 drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9
It needs to add warning to update firmware in gfx9
in case that firmware is too old to have function to
realize dummy read in cp firmware.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
changzhu
589b64a7e3 drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface.  This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.

For cp ucode, it has realized dummy read in cp firmware.It covers
the use of WAIT_REG_MEM operation 1 case only.So it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to
update firmware in case firmware is too old to have function to realize
dummy read in cp firmware.

For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is
moved to gfxhub in gfx10. So it needs to add dummy read in driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
Evan Quan
6a299d7aaa drm/amdgpu: register gpu instance before fan boost feature enablment
Otherwise, the feature enablement will be skipped due to wrong count.

Fixes: beff74bc6e ("drm/amdgpu: fix a race in GPU reset with IB test (v2)")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
Alex Deucher
ef177d11d6 drm/amdgpu: Improve RAS documentation (v2)
Clarify some areas, clean up formatting, add section for
unrecoverable error handling.

v2: fix grammatical errors

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Alex Deucher
ad4d81dc57 drm/amdgpu/renoir: move gfxoff handling into gfx9 module
To properly handle the option parsing ordering.

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Pan Bian
365f7f8db8 drm/amdgpu: fix double reference dropping
The reference to object fence is dropped at the end of the loop.
However, it is dropped again outside the loop. The reference can be
dropped immediately after calling dma_fence_wait() in the loop and
thus the dropping operation outside the loop can be removed.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Pan Bian
946ab8db69 drm/amdgpu: fix potential double drop fence reference
The object fence is not set to NULL after its reference is dropped. As a
result, its reference may be dropped again if error occurs after that,
which may lead to a use after free bug. To avoid the issue, fence is
explicitly set to NULL after dropping its reference.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Eric Huang
f88e2d1f8e drm/amdgpu: change read of GPU clock counter on Vega10 VF
Using unified VBIOS has performance drop in sriov environment.
The fix is switching to another register instead.

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
changzhu
11c6108934 drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9
It needs to add warning to update firmware in gfx9
in case that firmware is too old to have function to
realize dummy read in cp firmware.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
changzhu
a6522a5c63 drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface.  This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.

For cp ucode, it has realized dummy read in cp firmware.It covers
the use of WAIT_REG_MEM operation 1 case only.So it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to
update firmware in case firmware is too old to have function to realize
dummy read in cp firmware.

For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is
moved to gfxhub in gfx10. So it needs to add dummy read in driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Evan Quan
60599a0363 drm/amdgpu: perform p-state switch after the whole hive initialized
P-state switch should be performed after all devices from the hive
get initialized.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by:  Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Evan Quan
5c5b2ba006 drm/amdgpu: fix possible pstate switch race condition
Added lock protection so that the p-state switch will
be guarded to be sequential. Also update the hive
pstate only all device from the hive are in the same
state.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Evan Quan
b0adca4d50 drm/amdgpu: register gpu instance before fan boost feature enablment
Otherwise, the feature enablement will be skipped due to wrong count.

Fixes: beff74bc6e ("drm/amdgpu: fix a race in GPU reset with IB test (v2)")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Hawking Zhang
58f46d4b65 drm/amdgpu: disallow direct upload save restore list from gfx driver
Direct uploading save/restore list via mmio register writes breaks the security
policy. Instead, the driver should pass s&r list to psp.

For all the ASICs that use rlc v2_1 headers, the driver actually upload s&r list
twice, in non-psp ucode front door loading phase and gfx pg initialization phase.
The latter is not allowed.

VG12 is the only exception where the driver still keeps legacy approach for S&R
list uploading. In theory, this can be elimnated if we have valid srcntl ucode
for VG12.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Candice Li <Candice.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Emily Deng
224f82e5b7 drm/amdgpu/discovery: Need to free discovery memory
When unloading driver, need to free discovery memory.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Alex Deucher
8863baefaf drm/amdgpu/gpuvm: add some additional comments in amdgpu_vm_update_ptes
To better clarify what is happening in this function.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Alex Deucher
a4840d91c9 drm/amdgpu: enable VCN DPG on Raven and Raven2
It's safe to enable dynamic VCN powergating on raven and
raven2 for increased power savings.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:46 -05:00
Tianci.Yin
84e4e8205e drm/amdgpu: add navi14 PCI ID
Add the navi14 PCI device id.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:46 -05:00
Evan Quan
3e454860f2 drm/amd/powerplay: support xgmi pstate setting on powerplay routine V2
Add xgmi pstate setting on powerplay routine.

V2: split the change of is_support_sw_smu_xgmi into a separate patch

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:46 -05:00
Evan Quan
39ea6e5f9e drm/amdgpu: change pstate only after all XGMI device initialized
Pstate settings should be performed after all device of the
XGMI setup get initialized.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:46 -05:00
Tianci.Yin
5e200fb97a drm/amdgpu: add navi14 PCI ID
Add the navi14 PCI device id.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 15:35:08 -05:00
Shirish S
f2efc6e600 drm/amdgpu: dont schedule jobs while in reset
[Why]

doing kthread_park()/unpark() from drm_sched_entity_fini
while GPU reset is in progress defeats all the purpose of
drm_sched_stop->kthread_park.
If drm_sched_entity_fini->kthread_unpark() happens AFTER
drm_sched_stop->kthread_park nothing prevents from another
(third) thread to keep submitting job to HW which will be
picked up by the unparked scheduler thread and try to submit
to HW but fail because the HW ring is deactivated.

[How]
grab the reset lock before calling drm_sched_entity_fini()

Signed-off-by: Shirish S <shirish.s@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 15:26:53 -05:00
Alex Deucher
576daab3cd drm/amdgpu/arcturus: properly set BANK_SELECT and FRAGMENT_SIZE
These were not aligned for optimal performance for GPUVM.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 15:26:20 -05:00
Shirish S
89b3d86403 drm/amdgpu: dont schedule jobs while in reset
[Why]

doing kthread_park()/unpark() from drm_sched_entity_fini
while GPU reset is in progress defeats all the purpose of
drm_sched_stop->kthread_park.
If drm_sched_entity_fini->kthread_unpark() happens AFTER
drm_sched_stop->kthread_park nothing prevents from another
(third) thread to keep submitting job to HW which will be
picked up by the unparked scheduler thread and try to submit
to HW but fail because the HW ring is deactivated.

[How]
grab the reset lock before calling drm_sched_entity_fini()

Signed-off-by: Shirish S <shirish.s@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 14:20:24 -05:00
Alex Deucher
e2f619aa14 drm/amdgpu/arcturus: properly set BANK_SELECT and FRAGMENT_SIZE
These were not aligned for optimal performance for GPUVM.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 14:20:08 -05:00
Dave Airlie
8a86b00a43 Merge tag 'drm-next-5.5-2019-11-01' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-01:

amdgpu:
- Add EEPROM support for Arcturus
- Enable VCN encode support for Arcturus
- Misc PSP fixes
- Misc DC fixes
- swSMU cleanup

amdkfd:
- Misc cleanups
- Fix typo in cu bitmap parsing

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101190607.3763-1-alexander.deucher@amd.com
2019-11-04 10:22:53 +10:00
Dave Airlie
633aa7e53a drm-misc-next for 5.5:
UAPI Changes:
 -dma-buf: Introduce and revert dma-buf heap (Andrew/John/Sean)
 
 Cross-subsystem Changes:
 - None
 
 Core Changes:
 -dma-buf: add dynamic mapping to allow exporters to choose dma_resv lock
 	  state on mmap/munmap (Christian)
 -vram: add prepare/cleanup fb helpers to vram helpers (Thomas)
 -ttm: always keep bo's on the lru + ttm cleanups (Christian)
 -sched: allow a free_job routine to sleep (Steven)
 -fb_helper: remove unused drm_fb_helper_defio_init() (Thomas)
 
 Driver Changes:
 -bochs/hibmc/vboxvideo: Use new vram helpers for prepare/cleanup fb (Thomas)
 -amdgpu: Implement dma-buf import/export without drm helpers (Christian)
 -panfrost: Simplify devfreq integration in driver (Steven)
 
 Cc: Christian König <christian.koenig@amd.com>
 Cc: Thomas Zimmermann <tzimmermann@suse.de>
 Cc: Steven Price <steven.price@arm.com>
 Cc: Andrew F. Davis <afd@ti.com>
 Cc: John Stultz <john.stultz@linaro.org>
 Cc: Sean Paul <seanpaul@chromium.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl27NYYACgkQcywAJXLc
 r3k7SQgAy7MbT7MHg1NkObnWFKwbNxA0FuRV7HVuQ5AiVA5fbehGECNVI1hFZE3D
 kCyL5zRgzfrbBjI9zqclonXPmjPbD//f+ufUZJ2EaWHfE5k7LUYIEunpgWlCEgUR
 qqeuqjvx2+NY3gLZW6D8BIrUW79cKbggBvDa9QeE4aoMiI7/F3lpNog5LNo7TWCQ
 4SGgpDqtcJ2eSBneX80zLLVnI1CKfCSiWheZVoqCTRN5bfUftGwhbXb3N/JipG37
 cCblVR7D8FR7z0MQUtq2ql76tCn5SJJqloujkmUU06quYBHR5K1deB5BP+8HI1Fw
 /XMqWHFo0uX7h4hFIt1MVIJ92U+b+w==
 =ViNo
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-10-31' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.5:

UAPI Changes:
-dma-buf: Introduce and revert dma-buf heap (Andrew/John/Sean)

Cross-subsystem Changes:
- None

Core Changes:
-dma-buf: add dynamic mapping to allow exporters to choose dma_resv lock
	  state on mmap/munmap (Christian)
-vram: add prepare/cleanup fb helpers to vram helpers (Thomas)
-ttm: always keep bo's on the lru + ttm cleanups (Christian)
-sched: allow a free_job routine to sleep (Steven)
-fb_helper: remove unused drm_fb_helper_defio_init() (Thomas)

Driver Changes:
-bochs/hibmc/vboxvideo: Use new vram helpers for prepare/cleanup fb (Thomas)
-amdgpu: Implement dma-buf import/export without drm helpers (Christian)
-panfrost: Simplify devfreq integration in driver (Steven)

Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Steven Price <steven.price@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031193015.GA243509@art_vandelay
2019-11-04 09:28:51 +10:00
Alex Deucher
30ef5c7eab drm/amdgpu/gmc10: properly set BANK_SELECT and FRAGMENT_SIZE
These were not aligned for optimal performance for GPUVM.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-30 11:56:20 -04:00
Andrey Grodzovsky
57c0f58e9f drm/amdgpu: If amdgpu_ib_schedule fails return back the error.
Use ERR_PTR to return back the error happened during amdgpu_ib_schedule.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:56:15 -04:00
Tianci.Yin
47661f6dad drm/amdgpu/gfx10: update gfx golden settings for navi12
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:56:15 -04:00
Tianci.Yin
3dde767f14 drm/amdgpu/gfx10: update gfx golden settings for navi14
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:56:15 -04:00
Tianci.Yin
f52ebe1f88 drm/amdgpu/gfx10: update gfx golden settings
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-30 11:55:54 -04:00
Pierre-Eric Pelloux-Prayer
9bdf63d357 drm/amdgpu/sdma5: do not execute 0-sized IBs (v2)
This seems to help with https://bugs.freedesktop.org/show_bug.cgi?id=111481.

v2: insert a NOP instead of skipping all 0-sized IBs to avoid breaking older hw

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:55:20 -04:00
chen gong
e5574f61e9 drm/amdgpu: Fix SDMA hang when performing VKexample test
VKexample test hang during Occlusion/SDMA/Varia runs.
Clear XNACK_WATERMK in reg SDMA0_UTCL1_WATERMK to fix this issue.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-30 11:54:33 -04:00
Alex Deucher
46203a508f drm/amdgpu/gmc10: properly set BANK_SELECT and FRAGMENT_SIZE
These were not aligned for optimal performance for GPUVM.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:07:13 -04:00
Le Ma
361d66edc5 drm/amdgpu: fix no ACK from LDS read during stress test for Arcturus
Set mmSQ_CONFIG.DISABLE_SMEM_SOFT_CLAUSE as W/R.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:07:13 -04:00
HaiJun Chang
897110eed5 drm/amdgpu: fix gfx VF FLR test fail on navi
Cp wptr in wb buffer is outdated after VF FLR.
The outdated wptr may cause cp to execute unexpected packets.
Reset cp wptr in wb buffer.

Signed-off-by: HaiJun Chang <HaiJun.Chang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:07:13 -04:00
Le Ma
bff77e86a3 drm/amdgpu: bypass some cleanup work after err_event_athub (v2)
PSP lost connection when err_event_athub occurs. These cleanup work can be
skipped in BACO reset.

v2: squash in missing include (Alex)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:52 -04:00
Le Ma
8baaadba73 drm/amdgpu: clear UVD VCPU buffer when err_event_athub generated
The err_event_athub error will mess up the buffer and cause UVD resume hang.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Jiange Zhao
b4def3744b drm/amdgpu/SRIOV: SRIOV VF doesn't support BACO
SRIOV VF doesn't support BACO.

Only PF with BACO capability can do it.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Geert Uytterhoeven
44b582b32a drm/amdgpu: Remove superfluous void * cast in debugfs_create_file() call
There is no need to cast a typed pointer to a void pointer when calling
a function that accepts the latter.  Remove it, as the cast prevents
further compiler checks.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
YueHaibing
e4b116a2c0 drm/amdgpu: remove set but not used variable 'adev'
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:1221:24: warning: variable adev set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:488:24: warning: variable adev set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:547:24: warning: variable adev set but not used [-Wunused-but-set-variable]

It is never used, so can removed it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Yong Zhao
533bfcaea1 drm/amdkfd: Delete duplicated queue bit map reservation
The KIQ is on the second MEC and its reservation is covered in the
latter logic, so no need to reserve its bit twice.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Yong Zhao
55695b36c1 drm/amdkfd: Delete unnecessary pr_fmt switch
Given amdkfd.ko has been merged into amdgpu.ko, this switch is no
longer useful.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Dave Airlie
a24e4b09dc drm-misc-next for 5.5:
UAPI Changes:
 -syncobj: allow querying the last submitted timeline value (David)
 -fourcc: explicitly defineDRM_FORMAT_BIG_ENDIAN as unsigned (Adam)
 -omap: revert the OMAP_BO_* flags that were added -- no userspace (Sean)
 
 Cross-subsystem Changes:
 -MAINTAINERS: add Mihail as komeda co-maintainer (Mihail)
 
 Core Changes:
 -edid: a few cleanups, add AVI infoframe bar info (Ville)
 -todo: remove i915 device_link item and add difficulty levels (Daniel)
 -dp_helpers: add a few new helpers to parse dpcd (Thierry)
 
 Driver Changes:
 -gma500: fix a few memory disclosure leaks (Kangjie)
 -qxl: convert to use the new drm_gem_object_funcs.mmap (Gerd)
 -various: open code dp_link helpers in preparation for helper removal (Thierry)
 
 Cc: Chunming Zhou <david1.zhou@amd.com>
 Cc: Adam Jackson <ajax@redhat.com>
 Cc: Sean Paul <seanpaul@chromium.org>
 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
 Cc: Kangjie Lu <kjlu@umn.edu>
 Cc: Mihail Atanassov <mihail.atanassov@arm.com>
 Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
 Cc: Thierry Reding <treding@nvidia.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl2xxiYACgkQcywAJXLc
 r3mJowf/U9ANPuvF+lahI6IVKpE4KD8Pz+73uNhjmxq5TnmcA7eNJ2pb8HO38tU2
 lkchhaQEGZR96nVL962DkegEDu8p08RpYYwKv8r+sDV+zw/aviN/ANLSmTVtZ//m
 wzgUI7zIqF1WxKdFEzNmQVuY0hYd4fWBn6kGvw1jS/6xL2/3KR6hKVigBZwkICSt
 /ZiCZyxA7HhAlqzasn+PqGkuLVYv6NvFu4Ug6YG4nBOh57IrKmGt1a6cEUjGsHFf
 6Ets5wTPu2ydMRvY+v6rUDDRj6JJQph7Lv4hVKtg13FerJ1+OQ7xjhu4gIk1oNl/
 7zIgRWMVZj79ksMXyk6zgFD2rZAF3w==
 =Fm3U
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-10-24-2' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.5:

UAPI Changes:
-syncobj: allow querying the last submitted timeline value (David)
-fourcc: explicitly defineDRM_FORMAT_BIG_ENDIAN as unsigned (Adam)
-omap: revert the OMAP_BO_* flags that were added -- no userspace (Sean)

Cross-subsystem Changes:
-MAINTAINERS: add Mihail as komeda co-maintainer (Mihail)

Core Changes:
-edid: a few cleanups, add AVI infoframe bar info (Ville)
-todo: remove i915 device_link item and add difficulty levels (Daniel)
-dp_helpers: add a few new helpers to parse dpcd (Thierry)

Driver Changes:
-gma500: fix a few memory disclosure leaks (Kangjie)
-qxl: convert to use the new drm_gem_object_funcs.mmap (Gerd)
-various: open code dp_link helpers in preparation for helper removal (Thierry)

Cc: Chunming Zhou <david1.zhou@amd.com>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kangjie Lu <kjlu@umn.edu>
Cc: Mihail Atanassov <mihail.atanassov@arm.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024155535.GA10294@art_vandelay
2019-10-30 06:11:47 +10:00
Dave Airlie
60845e34f0 Merge tag 'drm-next-5.5-2019-10-25' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-10-25:

amdgpu:
- BACO support for CI and VI asics
- Quick memory training support for navi
- MSI-X support
- RAS fixes
- Display AVI infoframe fixes
- Display ref clock fixes for renoir
- Fix number of audio endpoints in renoir
- Fix for discovery tables
- Powerplay fixes
- Documentation fixes
- Misc cleanups

radeon:
- revert a PPC fix which broke x86

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025221020.203546-1-alexander.deucher@amd.com
2019-10-30 05:46:09 +10:00
Christian König
a39414716c drm/amdgpu: add independent DMA-buf import v9
Instead of relying on the DRM functions just implement our own import
functions. This prepares support for taking care of unpinned DMA-buf.

v2: enable for all exporters, not just amdgpu, fix invalidation
    handling, lock reservation object while setting callback
v3: change to new dma_buf attach interface
v4: split out from unpinned DMA-buf work
v5: rebased and cleanup on new DMA-buf interface
v6: squash with invalidation callback change,
    stop using _(map|unmap)_locked
v7: drop invalidations when the BO is already in system domain
v8: rebase on new DMA-buf patch and drop move notification
v9: cleanup comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/337948/
2019-10-28 16:59:43 +01:00
Christian König
6e6db2722c drm/amdgpu: add independent DMA-buf export v8
Add an DMA-buf export implementation independent of the DRM helpers.

This not only avoids the caching of DMA-buf mappings, but also
allows us to use the new dynamic locking approach.

This is also a prerequisite of unpinned DMA-buf handling.

v2: fix unintended recursion, remove debugging leftovers
v3: split out from unpinned DMA-buf work
v4: rebase on top of new no_sgt_cache flag
v5: fix some warnings by including amdgpu_dma_buf.h
v6: fix locking for non amdgpu exports
v7: rebased on new DMA-buf locking patch
v8: drop extra include

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/337949/
2019-10-28 16:59:35 +01:00
Wambui Karuga
f440ff44b1 drm/amd: correct "_LENTH" mispelling in constant
Correct the "_LENTH" mispelling in the AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
constant.

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-28 11:19:17 -04:00
Wambui Karuga
7e0ff20c7a drm/amd: declare amdgpu_exp_hw_support in amdgpu.h
Declare `amdgpu_exp_hw_support` as extern in amdgpu.h to address the
following sparse warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:118:5: warning: symbol 'amdgpu_exp_hw_support' was not declared. Should it be static?

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Suggested-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-28 11:19:17 -04:00
Andrey Grodzovsky
db5e65fcb3 drm/amdgpu: If amdgpu_ib_schedule fails return back the error.
Use ERR_PTR to return back the error happened during amdgpu_ib_schedule.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Tianci.Yin
dcc0fcff14 drm/amdgpu/gfx10: update gfx golden settings for navi12
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Tianci.Yin
21c943f35a drm/amdgpu/gfx10: update gfx golden settings for navi14
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Tianci.Yin
d753dc6ab2 drm/amdgpu/gfx10: update gfx golden settings
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Evan Quan
0525f29713 drm/amd/powerplay: skip unsupported clock limit settings on Arcturus V2
For Arcturus, clock limit settings on uclk/socclk/fclk domains
are not supported.

V2: simplify the code to support both SGPU and MGPU cases

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Marek Olšák
664fe85a2d drm/amdgpu: Allow reading more status registers on si/cik
Allow userspace to read the same status registers for every family.
Based on commit c7890fea, added any of these registers if defined in
the include files of each architecture.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Andrey Grodzovsky
121a2bc6ae drm/amdgpu: Move amdgpu_ras_recovery_init to after SMU ready.
For Arcturus the I2C traffic is done through SMU tables and so
we must postpone RAS recovery init to after they are ready
which is in amdgpu_device_ip_hw_init_phase2.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Andrey Grodzovsky
cf52ecc8b6 drm/amdgpu: Use ARCTURUS in RAS EEPROM.
Add Arcturus EEPROM/I2C support in generic EEPROM code.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Nirmoy Das
9f0256da6b drm/amdgpu: remove unused parameter in amdgpu_gfx_kiq_free_ring
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
James Zhu
8047266443 drm/amdgpu/vcn: Enable VCN2.5 encoding
After VCN2.5 firmware (Version ENC: 1.1  Revision: 11),
VCN2.5 encoding can work properly.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Pelloux-prayer, Pierre-eric
3f378758b8 drm/amdgpu/sdma5: do not execute 0-sized IBs (v2)
This seems to help with https://bugs.freedesktop.org/show_bug.cgi?id=111481.

v2: insert a NOP instead of skipping all 0-sized IBs to avoid breaking older hw

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
chen gong
5aed95bbdd drm/amdgpu: Fix SDMA hang when performing VKexample test
VKexample test hang during Occlusion/SDMA/Varia runs.
Clear XNACK_WATERMK in reg SDMA0_UTCL1_WATERMK to fix this issue.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Guchun Chen
52dd95f2b6 drm/amdgpu: define macros for retire page reservation
Easy for maintainance.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Guchun Chen
c688a06bc6 drm/amdgpu: refine reboot debugfs operation in ras case (v3)
Ras reboot debugfs node allows user one easy control to avoid
gpu recovery hang problem and directly reboot system per card
basis, after ras uncorrectable error happens. However, it is
one common entry, which should get rid of ras_ctrl node and
remove ip dependence when inputting by user. So add one new
auto_reboot node in ras debugfs dir to achieve this.

v2: in commit mssage, add justification why ras reboot debugfs
node is needed.
v3: use debugfs_create_bool to create debugfs file for boolean value

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Evan Quan
3697b339c6 drm/amd/powerplay: add lock protection for swSMU APIs V2
This is a quick and low risk fix. Those APIs which
are exposed to other IPs or to support sysfs/hwmon
interfaces or DAL will have lock protection. Meanwhile
no lock protection is enforced for swSMU internal used
APIs. Future optimization is needed.

V2: strip the lock protection for all swSMU internal APIs

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Feifei Xu <Feifei.Xu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:09 -04:00
Colin Ian King
d5e5c1bce1 drm/amdgpu/psp: fix spelling mistake "initliaze" -> "initialize"
There is a spelling mistake in a DRM_ERROR error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:07 -04:00
Xiaojie Yuan
73469970a9 drm/amdgpu/psp11: fix typo in comment
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:07 -04:00
Xiaojie Yuan
d7e7f1ea25 drm/amdgpu/psp11: wait for sOS ready for ring creation
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:07 -04:00
Pelloux-prayer, Pierre-eric
ee8bcc2333 drm/amdgpu: call amdgpu_vm_prt_fini before deleting the root PD
amdgpu_vm_prt_fini uses "vm->root.base.bo" so it must still be valid when
we call it.

Fixes: b65709a921 ("drm/amdgpu: reserve the root PD while freeing PASIDs")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:47:07 -04:00
Alex Deucher
17523bd00c drm/amdgpu/vce: make some functions static
They are not used outside of the file they are defined in.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:15:00 -04:00
Alex Deucher
569557e524 drm/amdgpu/vce: fix allocation size in enc ring test
We need to allocate a large enough buffer for the
feedback buffer, otherwise the IB test can overwrite
other memory.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:15:00 -04:00
chen gong
3a8b7d2761 drm/amdgpu/psp: declare PSP TA firmware
Add PSP TA firmware declaration for raven raven2 picasso

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:15:00 -04:00
Dave Airlie
3275a71e76 Merge tag 'drm-next-5.5-2019-10-09' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-10-09:

amdgpu:
- Additional RAS enablement for vega20
- RAS page retirement and bad page storage in EEPROM
- No GPU reset with unrecoverable RAS errors
- Reserve vram for page tables rather than trying to evict
- Fix issues with GPU reset and xgmi hives
- DC i2c over aux fixes
- Direct submission for clears, PTE/PDE updates
- Improvements to help support recoverable GPU page faults
- Silence harmless SAD block messages
- Clean up code for creating a bo at a fixed location
- Initial DC HDCP support
- Lots of documentation fixes
- GPU reset for renoir
- Add IH clockgating support for soc15 asics
- Powerplay improvements
- DC MST cleanups
- Add support for MSI-X
- Misc cleanups and bug fixes

amdkfd:
- Query KFD device info by asic type rather than pci ids
- Add navi14 support
- Add renoir support
- Add navi12 support
- gfx10 trap handler improvements
- pasid cleanups
- Check against device cgroup

ttm:
- Return -EBUSY with pipelining with no_gpu_wait

radeon:
- Silence harmless SAD block messages

device_cgroup:
- Export devcgroup_check_permission

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010041713.3412-1-alexander.deucher@amd.com
2019-10-26 05:56:57 +10:00
Christian König
97588b5b9a drm/ttm: remove pointers to globals
As the name says global memory and bo accounting is global. So it doesn't
make to much sense having pointers to global structures all around the code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/332879/
2019-10-25 11:40:51 +02:00
Christian König
9165fb879f drm/ttm: always keep BOs on the LRU
This allows blocking for BOs to become available
in the memory management.

Amdgpu is doing this for quite a while now during CS. Now
apply the new behavior to all drivers using TTM.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/332878/
2019-10-25 11:40:50 +02:00
Sean Paul
44bf67f32a Merge drm/drm-next into drm-misc-next
Parroting Daniel's backmerge justification from
2e79e22e09:

Thierry needs fd70c7755b ("drm/bridge: tc358767: fix max_tu_symbol
value") to be able to merge his dp_link patch series.

Signed-off-by: Sean Paul <seanpaul@chromium.org>
2019-10-23 11:14:11 -04:00
Daniel Vetter
2e79e22e09 Linux 5.4-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl2su/AeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvm4H/1jkheCrvB/GJS69
 wd18vizAg+eFmNCzxlGVhpQTKGymNRy+g6clnoli3cNJ3pSVKcYgVyB3oXaONIhp
 g/ANudnBjTdjqYgJzfLij5AGecrGwDpF3YL0kuKrCB63s2I/HwQGYy/aPrYY8emy
 gAYdaf1DGRu5/DIIB6soTo/TnpKoAyTE+XY5MaPSug++t/Flov19tlU40IZxXW94
 bjTXbm0yklrsIx+LL5mYYGGnygSTCF66JjFg1qhDCBQaS2MZ21h1ZgaOtGZTwZcc
 WgEiqLC5S1Iyj96zir1t78RcVQ4RzgvDbhUOgIqUFsYAO2wOicvxyFE3Hj8rPOKd
 uGgVPRM=
 =xgZa
 -----END PGP SIGNATURE-----

Merge v5.4-rc4 into drm-next

Thierry needs fd70c7755b ("drm/bridge: tc358767: fix max_tu_symbol
value") to be able to merge his dp_link patch series.

Some adjacent changes conflicts, plus some clashes in i915 due to
cherry-picking and git trying to be helpful and leaving both versions
in.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-10-23 12:10:05 +02:00
Alex Deucher
ee027828c4 drm/amdgpu/vce: fix allocation size in enc ring test
We need to allocate a large enough buffer for the
feedback buffer, otherwise the IB test can overwrite
other memory.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-17 17:12:34 -04:00
Christian König
de51a5019f drm/amdgpu: fix error handling in amdgpu_bo_list_create
We need to drop normal and userptr BOs separately.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 17:12:34 -04:00
Christian König
3122051edc drm/amdgpu: fix potential VM faults
When we allocate new page tables under memory
pressure we should not evict old ones.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 17:12:34 -04:00
Philip Yang
209620b422 drm/amdgpu: user pages array memory leak fix
user_pages array should always be freed after validation regardless if
user pages are changed after bo is created because with HMM change parse
bo always allocate user pages array to get user pages for userptr bo.

v2: remove unused local variable and amend commit

v3: add back get user pages in gem_userptr_ioctl, to detect application
bug where an userptr VMA is not ananymous memory and reject it.

Bugzilla: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1844962

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Joe Barnett <thejoe@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.3
2019-10-17 17:12:34 -04:00
Alex Deucher
c81fffc2c9 drm/amdgpu/vcn: fix allocation size in enc ring test
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

- Session info is 128K according to mesa
- Use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-17 17:12:34 -04:00
Alex Deucher
5d230bc91f drm/amdgpu/uvd7: fix allocation size in enc ring test (v2)
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
    - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-17 17:12:34 -04:00
Alex Deucher
ce584a8e28 drm/amdgpu/uvd6: fix allocation size in enc ring test (v2)
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
    - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-17 17:12:34 -04:00
Kevin Wang
2c2fdb8bca drm/amdgpu: fix amdgpu trace event print string format error
the trace event print string format error.
(use integer type to handle string)

before:
amdgpu_test_kev-1556  [002]   138.508781: amdgpu_cs_ioctl:
sched_job=8, timeline=gfx_0.0.0, context=177, seqno=1,
ring_name=ffff94d01c207bf0, num_ibs=2

after:
amdgpu_test_kev-1506  [004]   370.703783: amdgpu_cs_ioctl:
sched_job=12, timeline=gfx_0.0.0, context=234, seqno=2,
ring_name=gfx_0.0.0, num_ibs=1

change trace event list:
1.amdgpu_cs_ioctl
2.amdgpu_sched_run_job
3.amdgpu_ib_pipe_sync

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:32:00 -04:00
Tianci.Yin
367039bfb6 drm/amdgpu/psp: add psp memory training implementation(v3)
add memory training implementation code to save resume time.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:54 -04:00
Tianci.Yin
778e8c428f drm/amdgpu: reserve vram for memory training(v4)
memory training using specific fixed vram segment, reserve these
segments before anyone may allocate it.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:44 -04:00
Tianci.Yin
0586a0596a drm/amdgpu: add psp memory training callbacks and macro
add interface for memory training.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:38 -04:00
Tianci.Yin
efe4f00077 drm/amdgpu/atomfirmware: add memory training related helper functions(v3)
parse firmware to get memory training capability and fb location.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:31 -04:00
Tianci.Yin
a7d4c920f8 drm/amdgpu: introduce psp_v11_0_is_sos_alive interface(v2)
introduce psp_v11_0_is_sos_alive func for common use.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:19 -04:00
Tianci.Yin
e35e2b117f drm/amdgpu: add a generic fb accessing helper function(v3)
add a generic helper function for accessing framebuffer via MMIO

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:13 -04:00
Tianci.Yin
45cf454e4c drm/amdgpu: update amdgpu_discovery to handle revision
update amdgpu_discovery to get IP revision.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:06 -04:00
Alex Deucher
8c32d0438f drm/amdgpu/vcn: fix allocation size in enc ring test
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

- Session info is 128K according to mesa
- Use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:30:57 -04:00
Alex Deucher
b24c459f9f drm/amdgpu/uvd7: fix allocation size in enc ring test (v2)
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
    - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:30:55 -04:00
Alex Deucher
481bf82c97 drm/amdgpu/uvd6: fix allocation size in enc ring test (v2)
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
    - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:30:52 -04:00
Prike Liang
f839110157 drm/amdgpu: fix S3 failed as RLC safe mode entry stucked in polloing gfx acq
Fix gfx cgpg setting sequence for RLC deadlock at safe mode entry in polling gfx response.
The patch can fix VCN IB test failed and DAL get dispaly count failed issue.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:24:09 -04:00
Prike Liang
c8486eef2c drm/amdgpu: add GFX_PIPELINE capacity check for updating gfx cgpg
Before disable gfx pipeline power gating need check the flag AMD_PG_SUPPORT_GFX_PIPELINE.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:23:54 -04:00
Gerd Hoffmann
12067e0e89 drm/ttm: rename ttm_fbdev_mmap
Rename ttm_fbdev_mmap to ttm_bo_mmap_obj.  Move the vm_pgoff sanity
check to amdgpu_bo_fbdev_mmap (only ttm_fbdev_mmap user in tree).

The ttm_bo_mmap_obj function can now be used to map any buffer object.
This allows to implement &drm_gem_object_funcs.mmap in gem ttm helpers.

v3: patch added to series

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20191016115203.20095-8-kraxel@redhat.com
2019-10-17 13:59:16 +02:00
Alex Deucher
97c002be41 drm/amdgpu: enable BACO reset for SMU7 based dGPUs (v2)
Use BACO to reset the GPU if supported on SMU7 based
dGPUs.

v2: don't use baco on CI parts

Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:55:32 -04:00
Alex Deucher
5337aae9b5 drm/amdgpu/soc15: add support for baco reset with swSMU
Add support for vega20 when the swSMU path is used.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:55:31 -04:00
Alex Deucher
31fa2991f4 drm/amdgpu: remove in_baco_reset hack
It was a vega20 specific hack.  Check if we are in reset
and what reset method we are using.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Alex Deucher
f5fda6d89a drm/amdgpu: simplify ATPX detection
Use the base class rather than the specific class and drop
the second loop.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Alex Deucher
897483d8a0 drm/amdgpu: move gpu reset out of amdgpu_device_suspend
Move it into the caller.  There are cases were we don't
want it.  We need it for hibernation, but we don't need
it for runtime pm, so drop it for runtime pm.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Alex Deucher
803cc26d5c drm/amdgpu: move pci_save_state into suspend path
for amdgpu_device_suspend.  This follows the logic
in the resume path.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Andrey Grodzovsky
ed606f8a34 dmr/amdgpu: Fix crash on SRIOV for ERREVENT_ATHUB_INTERRUPT interrupt.
Ignre the ERREVENT_ATHUB_INTERRUPT for systems without RAS.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-and-tested-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:11 -04:00
Philip Yang
06f7f57e87 drm/amdgpu: user pages array memory leak fix
user_pages array should always be freed after validation regardless if
user pages are changed after bo is created because with HMM change parse
bo always allocate user pages array to get user pages for userptr bo.

v2: remove unused local variable and amend commit

v3: add back get user pages in gem_userptr_ioctl, to detect application
bug where an userptr VMA is not ananymous memory and reject it.

Bugzilla: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1844962

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Joe Barnett <thejoe@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:01 -04:00
Evan Quan
5bcc92407c drm/amd/powerplay: enable Arcturus runtime VCN dpm on/off
Enable runtime VCN DPM on/off on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:50:43 -04:00
Emily Deng
bcccee89f4 drm/amdgpu: Fix tdr3 could hang with slow compute issue
When index is 1, need to set compute ring timeout for sriov and passthrough.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:50:23 -04:00
Christian König
b2c18f0a9c drm/amdgpu: fix potential VM faults
When we allocate new page tables under memory
pressure we should not evict old ones.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:50:18 -04:00
Christian König
b146570010 drm/amdgpu: fix error handling in amdgpu_bo_list_create
We need to drop normal and userptr BOs separately.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:50:07 -04:00
Dennis Li
820924745b drm/amdgpu: add RAS support for VML2 and ATCL2
v1: Add codes to query the EDC count of VML2 & ATCL2
v2: Rename VML2/ATCL2 registers and drop their mask define
v3: Add back the ECC mask for VML2 registers

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:49:57 -04:00
Dennis Li
13ba03442a drm/amdgpu: change to query the actual EDC counter
For the potential request in the future, change to
query the actual EDC counter.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:49:44 -04:00
Le Ma
956f670509 drm/amdgpu/soc15: disable doorbell interrupt as part of BACO entry sequence
Workaround to make RAS recovery work in BACO reset.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:49:14 -04:00
Hans de Goede
402c60d7b0 drm/amdgpu: Bail earlier when amdgpu.cik_/si_support is not set to 1
Bail from the pci_driver probe function instead of from the drm_driver
load function.

This avoid /dev/dri/card0 temporarily getting registered and then
unregistered again, sending unwanted add / remove udev events to
userspace.

Specifically this avoids triggering the (userspace) bug fixed by this
plymouth merge-request:
https://gitlab.freedesktop.org/plymouth/plymouth/merge_requests/59

Note that despite that being a userspace bug, not sending unnecessary
udev events is a good idea in general.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1490490
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:49:07 -04:00
Xiaojie Yuan
5f6a556f98 drm/amdgpu/discovery: reserve discovery data at the top of VRAM
IP Discovery data is TMR fenced by the latest PSP BL,
so we need to reserve this region.

Tested on navi10/12/14 with VBIOS integrated with latest PSP BL.

v2: use DISCOVERY_TMR_SIZE macro as bo size
    use amdgpu_bo_create_kernel_at() to allocate bo

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:48:46 -04:00
Xiaojie Yuan
d12c50857c drm/amdgpu/sdma5: fix mask value of POLL_REGMEM packet for pipe sync
sdma will hang once sequence number to be polled reaches 0x1000_0000

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-11 21:32:06 -05:00
Hans de Goede
984d7a929a drm/amdgpu: Bail earlier when amdgpu.cik_/si_support is not set to 1
Bail from the pci_driver probe function instead of from the drm_driver
load function.

This avoid /dev/dri/card0 temporarily getting registered and then
unregistered again, sending unwanted add / remove udev events to
userspace.

Specifically this avoids triggering the (userspace) bug fixed by this
plymouth merge-request:
https://gitlab.freedesktop.org/plymouth/plymouth/merge_requests/59

Note that despite that being a userspace bug, not sending unnecessary
udev events is a good idea in general.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1490490
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-11 21:31:14 -05:00
chen gong
6696b8adb8 drm/amdgpu: Do not implement power-on for SDMA after do mode2 reset on Renoir
Find that ring sdma0 test failed if turn on SDMA powergating after do
mode2 reset.

Perhaps the mode2 reset does not reset the SDMA PG state, SDMA is
already powered up so there is no need to ask the SMU to power it up
again. So I skip this function for a moment.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:39:06 -05:00
Xiaojie Yuan
e8939b4a0d drm/amdgpu/sdma5: fix mask value of POLL_REGMEM packet for pipe sync
sdma will hang once sequence number to be polled reaches 0x1000_0000

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:39:06 -05:00
Nirmoy Das
b9ed69e6fd drm/amdgpu: fix memory leak
cleanup error handling code and make sure temporary info array
with the handles are freed by amdgpu_bo_list_put() on
idr_replace()'s failure.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:35 -05:00
Tao Zhou
6e4be98767 drm/amdgpu: avoid ras error injection for retired page
check whether a page is bad page before umc error injection, bad page
should not be accessed again

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:27 -05:00
Luben Tuikov
4e930d96c9 drm/amdgpu: Use the ALIGN() macro
Use the ALIGN() macro to set "num_dw" to a
multiple of 8, i.e. lower 3 bits cleared.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:20 -05:00
Alex Deucher
54e9ab2edb drm/amdgpu/ras: document the reboot ras option
We recently added it, but never documented it.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:18 -05:00
Alex Deucher
a20bfd0fd4 drm/amdgpu/ras: fix typos in documentation
Fix a couple of spelling typos.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:11 -05:00
Oak Zeng
f81b86a043 drm/amdgpu: Enable gfx cache probing on HDP write for arcturus
This allows gfx cache to be probed and invalidated (for none-dirty cache lines)
on a HDP write (from either another GPU or CPU). This should work only for the
memory mapped as RW memory type newly added for arcturus, to achieve some cache
coherence b/t multiple memory clients.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:24:19 -05:00
Oak Zeng
cb1545f710 drm/amdgpu: Clean up gmc_v9_0_gart_enable
Many logic in this function are HDP set up,
not gart set up. Moved those logic to gmc_v9_0_hw_init.
No functional change.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:24:12 -05:00
Marek Olšák
6f3bf46a7e drm/amdgpu: simplify gds_compute_max_wave_id computation
Use asic constants.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:23:35 -05:00
Dave Airlie
7ed093602e drm-misc-next for 5.5:
UAPI Changes:
 -Colorspace: Expose different prop values for DP vs. HDMI (Gwan-gyeong Mun)
 -fourcc: Add DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED (Raymond)
 -not_actually: s/ENOTSUPP/EOPNOTSUPP/ in drm_edid and drm_mipi_dbi. This should
     not reach userspace, but adding here to specifically call that out (Daniel)
 -i810: Prevent underflow in dispatch ioctls (Dan)
 -komeda: Add ACLK sysfs attribute (Mihail)
 -v3d: Allow userspace to clean up after render jobs (Iago)
 
 Cross-subsystem Changes:
 -MAINTAINERS:
  -Add Alyssa & Steven as panfrost reviewers (Rob)
  -Add Jernej as DE2 reviewer (Maxime)
  -Add Chen-Yu as Allwinner maintainer (Maxime)
 -staging: Make some stack arrays static const (Colin)
 
 Core Changes:
 -ttm: Allow drivers to specify their vma manager (to use gem mgr) (Gerd)
 -docs: Various fixes in connector/encoder/bridge docs (Daniel, Lyude, Laurent)
 -connector: Allow more than 3 possible encoders for a connector (José)
 -dp_cec: Allow a connector to be associated with a cec device (Dariusz)
 -various: Fix some compile/sparse warnings (Ville)
 -mm: Ensure mm node removals are properly serialised (Chris)
 -panel: Specify the type of panel for drm_panels for later use (Laurent)
 -panel: Use drm_panel_init to init device and funcs (Laurent)
 -mst: Refactors and cleanups in anticipation of suspend/resume support (Lyude)
 -vram:
  -Add lazy unmapping for gem bo's (Thomas)
  -Unify and rationalize vram mm and gem vram (Thomas)
  -Expose vmap and vunmap for gem vram objects (Thomas)
  -Allow objects to be pinned at the top of vram to avoid fragmentation (Thomas)
 
 Driver Changes:
 -various: Include drm_bridge.h instead of relying on drm_crtc.h (Boris)
 -ast/mgag200: Refactor show_cursor(), move cursor to top of video mem (Thomas)
 -komeda:
  -Add error event printing (behind CONFIG) and reg dump support (Lowry)
  -Add suspend/resume support (Lowry)
  -Workaround D71 shadow registers not flushing on disable (Lowry)
 -meson: Add suspend/resume support (Neil)
 -omap: Miscellaneous refactors and improvements (Tomi/Jyri)
 -panfrost/shmem: Silence lockdep by using mutex_trylock (Rob)
 -panfrost: Miscellaneous small fixes (Rob/Steven)
 -sti: Fix warnings (Benjamin/Linus)
 -sun4i:
  -Add vcc-dsi regulator to sun6i_mipi_dsi (Jagan)
  -A few patches to figure out the DRQ/start delay calc on dsi (Jagan/Icenowy)
 -virtio:
  -Add module param to switch resource reuse workaround on/off (Gerd)
  -Avoid calling vmexit while holding spinlock (Gerd)
  -Use gem shmem helpers instead of ttm (Gerd)
  -Accommodate command buffer allocations too big for cma (David)
 
 Cc: Rob Herring <robh@kernel.org>
 Cc: Maxime Ripard <mripard@kernel.org>
 Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
 Cc: Gerd Hoffmann <kraxel@redhat.com>
 Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
 Cc: Lyude Paul <lyude@redhat.com>
 Cc: José Roberto de Souza <jose.souza@intel.com>
 Cc: Dariusz Marcinkiewicz <darekm@google.com>
 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
 Cc: Raymond Smith <raymond.smith@arm.com>
 Cc: Chris Wilson <chris@chris-wilson.co.uk>
 Cc: Colin Ian King <colin.king@canonical.com>
 Cc: Thomas Zimmermann <tzimmermann@suse.de>
 Cc: Dan Carpenter <dan.carpenter@oracle.com>
 Cc: Mihail Atanassov <Mihail.Atanassov@arm.com>
 Cc: Lowry Li <Lowry.Li@arm.com>
 Cc: Neil Armstrong <narmstrong@baylibre.com>
 Cc: Jyri Sarha <jsarha@ti.com>
 Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
 Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
 Cc: Steven Price <steven.price@arm.com>
 Cc: Benjamin Gaignard <benjamin.gaignard@st.com>
 Cc: Linus Walleij <linus.walleij@linaro.org>
 Cc: Jagan Teki <jagan@amarulasolutions.com>
 Cc: Icenowy Zheng <icenowy@aosc.io>
 Cc: Iago Toral Quiroga <itoral@igalia.com>
 Cc: David Riley <davidriley@chromium.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl2d9h8ACgkQcywAJXLc
 r3ms5gf9HIFpqwJ16CqaRukSnpcBcDoYUM8DGrOic+vw2bw14BQwFqvEOqrCkKL4
 V6h/OCJlNFPtOcc1LvU/jeXxYf4AQWh/2qZeg+oee7HAGX5x8Y3f08GsEjO8+55t
 QvSVxCKVti04M1ErPRfKrM7KPVE+IC+KdY26nO8Bf5zDGeCAkiPIDrdh2aZGMRdC
 Eer0DJ96cgWW9LrhseCdj5nKwcR78DlbWa79zuPAss4LaBBbXqThNXYYzg/mZMKB
 +VYgzs48tGYKK1NXXJ6biVI3brHrM52bqv5JpIncD5HepF1oIartWOMnbAO7MAqh
 h/tgJWxL+4bnl9aqY87by1BtyVgl3w==
 =kaOE
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-10-09-2' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.5:

UAPI Changes:
-Colorspace: Expose different prop values for DP vs. HDMI (Gwan-gyeong Mun)
-fourcc: Add DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED (Raymond)
-not_actually: s/ENOTSUPP/EOPNOTSUPP/ in drm_edid and drm_mipi_dbi. This should
    not reach userspace, but adding here to specifically call that out (Daniel)
-i810: Prevent underflow in dispatch ioctls (Dan)
-komeda: Add ACLK sysfs attribute (Mihail)
-v3d: Allow userspace to clean up after render jobs (Iago)

Cross-subsystem Changes:
-MAINTAINERS:
 -Add Alyssa & Steven as panfrost reviewers (Rob)
 -Add Jernej as DE2 reviewer (Maxime)
 -Add Chen-Yu as Allwinner maintainer (Maxime)
-staging: Make some stack arrays static const (Colin)

Core Changes:
-ttm: Allow drivers to specify their vma manager (to use gem mgr) (Gerd)
-docs: Various fixes in connector/encoder/bridge docs (Daniel, Lyude, Laurent)
-connector: Allow more than 3 possible encoders for a connector (José)
-dp_cec: Allow a connector to be associated with a cec device (Dariusz)
-various: Fix some compile/sparse warnings (Ville)
-mm: Ensure mm node removals are properly serialised (Chris)
-panel: Specify the type of panel for drm_panels for later use (Laurent)
-panel: Use drm_panel_init to init device and funcs (Laurent)
-mst: Refactors and cleanups in anticipation of suspend/resume support (Lyude)
-vram:
 -Add lazy unmapping for gem bo's (Thomas)
 -Unify and rationalize vram mm and gem vram (Thomas)
 -Expose vmap and vunmap for gem vram objects (Thomas)
 -Allow objects to be pinned at the top of vram to avoid fragmentation (Thomas)

Driver Changes:
-various: Include drm_bridge.h instead of relying on drm_crtc.h (Boris)
-ast/mgag200: Refactor show_cursor(), move cursor to top of video mem (Thomas)
-komeda:
 -Add error event printing (behind CONFIG) and reg dump support (Lowry)
 -Add suspend/resume support (Lowry)
 -Workaround D71 shadow registers not flushing on disable (Lowry)
-meson: Add suspend/resume support (Neil)
-omap: Miscellaneous refactors and improvements (Tomi/Jyri)
-panfrost/shmem: Silence lockdep by using mutex_trylock (Rob)
-panfrost: Miscellaneous small fixes (Rob/Steven)
-sti: Fix warnings (Benjamin/Linus)
-sun4i:
 -Add vcc-dsi regulator to sun6i_mipi_dsi (Jagan)
 -A few patches to figure out the DRQ/start delay calc on dsi (Jagan/Icenowy)
-virtio:
 -Add module param to switch resource reuse workaround on/off (Gerd)
 -Avoid calling vmexit while holding spinlock (Gerd)
 -Use gem shmem helpers instead of ttm (Gerd)
 -Accommodate command buffer allocations too big for cma (David)

Cc: Rob Herring <robh@kernel.org>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Dariusz Marcinkiewicz <darekm@google.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Raymond Smith <raymond.smith@arm.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mihail Atanassov <Mihail.Atanassov@arm.com>
Cc: Lowry Li <Lowry.Li@arm.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Jyri Sarha <jsarha@ti.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Benjamin Gaignard <benjamin.gaignard@st.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Jagan Teki <jagan@amarulasolutions.com>
Cc: Icenowy Zheng <icenowy@aosc.io>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: David Riley <davidriley@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Thu 10 Oct 2019 01:00:47 AM AEST
# gpg:                using RSA key 732C002572DCAF79
# gpg: Can't check signature: public key not found

# Conflicts:
#	drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
#	drivers/gpu/drm/i915/i915_drv.c
#	drivers/gpu/drm/i915/i915_gem.c
#	drivers/gpu/drm/i915/i915_gem_gtt.c
#	drivers/gpu/drm/i915/i915_vma.c
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009150825.GA227673@art_vandelay
2019-10-11 09:30:53 +10:00
Nirmoy Das
083164dbdb drm/amdgpu: fix memory leak
cleanup error handling code and make sure temporary info array
with the handles are freed by amdgpu_bo_list_put() on
idr_replace()'s failure.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-09 11:45:59 -05:00
Ori Messinger
ad02e08e05 drm/amdgpu: Report vram vendor with sysfs (v3)
The vram vendor can be found as a separate sysfs file at:
/sys/class/drm/card[X]/device/mem_info_vram_vendor
The vram vendor is displayed as a string value.

v2: Use correct bit masking, and cache vram_vendor in gmc
v3: Drop unused functions for vram width, type, and vendor

Signed-off-by: Ori Messinger <ori.messinger@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:11:07 -05:00
YueHaibing
8f49c8220b drm/amdgpu: remove duplicated include from mmhub_v1_0.c
Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:10:23 -05:00
Alex Deucher
71f98027f2 drm/amdgpu: move amdgpu_device_get_job_timeout_settings
It's only used in amdgpu_device.c and the naming also
reflects that.  Move it there.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:10:20 -05:00
Felix Kuehling
1995b3a35f drm/amdgpu: Fix error handling in amdgpu_ras_recovery_init
Don't set a struct pointer to NULL before freeing its members. It's
hard to see what's happening due to a local pointer-to-pointer data
aliasing con->eh_data.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Philip Cox <Philip.Cox@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:09:33 -05:00
Colin Ian King
317a8d9eb6 drm/amdgpu: remove redundant variable r and redundant return statement
There is a return statement that is not reachable and a variable that
is not used.  Remove them.

Addresses-Coverity: ("Structurally dead code")
Fixes: de7b45babd ("drm/amdgpu: cleanup creating BOs at fixed location (v2)")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:14 -05:00
Colin Ian King
17cf678a33 drm/amdgpu: fix uninitialized variable pasid_mapping_needed
The boolean variable pasid_mapping_needed is not initialized and
there are code paths that do not assign it any value before it is
is read later.  Fix this by initializing pasid_mapping_needed to
false.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 6817bf283b ("drm/amdgpu: grab the id mgr lock while accessing passid_mapping")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:14 -05:00
Leo Liu
d0312d0dca drm/amdgpu: add code comment in vcn_v2_5_hw_init
Add a comment to VCN 2.5 encode ring

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:08 -05:00
Leo Liu
fd287c8cd2 drm/amdgpu/vcn: use amdgpu_ring_test_helper
Instead of amdgpu_ring_test_ring, so the helper function determines
whether the ring is ready

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:52:53 -05:00
Alex Deucher
8a745c7ff2 drm/amdgpu: improve MSI-X handling (v3)
Check the number of supported vectors and fall back to MSI if
we return or error or 0 MSI-X vectors.

v2: only allocate one vector.  We can't currently use more than
one anyway.

v3: install the irq on vector 0.

Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Shaoyun liu  <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 13:51:19 -05:00
Maxime Ripard
4092de1ba3
Merge drm/drm-next into drm-misc-next
We haven't done any backmerge for a while due to the merge window, and it
starts to become an issue for komeda. Let's bring 5.4-rc1 in.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2019-10-03 16:38:50 +02:00
Arnd Bergmann
324fb7adf6 drm/amdgpu: hide another #warning
An earlier patch of mine disabled some #warning statements
that get in the way of build testing, but then another
instance was added around the same time.

Remove that as well.

Fixes: b5203d16ae ("drm/amd/amdgpu: hide #warning for missing DC config")
Fixes: e1c14c4339 ("drm/amdgpu: Enable DC on Renoir")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
Arnd Bergmann
128a01f472 drm/amdgpu: make pmu support optional, again
When CONFIG_PERF_EVENTS is disabled, we cannot compile the pmu
portion of the amdgpu driver:

drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:48:38: error: no member named 'hw' in 'struct perf_event'
        struct hw_perf_event *hwc = &event->hw;
                                     ~~~~~  ^
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:51:13: error: no member named 'attr' in 'struct perf_event'
        if (event->attr.type != event->pmu->type)
            ~~~~~  ^
...

The same bug was already fixed by commit d155bef063 ("amdgpu: make pmu
support optional") but broken again by what looks like an incorrectly
rebased patch.

Fixes: 64f55e6292 ("drm/amdgpu: Add RAS EEPROM table.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
yu kuai
2e0db9dec2 drm/amdgpu: remove set but not used variable 'pipe'
Fixes gcc '-Wunused-but-set-variable' warning:

rivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c: In function
‘amdgpu_gfx_graphics_queue_acquire’:
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:234:16: warning:
variable ‘pipe’ set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
Navid Emamdoost
1104057562 drm/amdgpu: fix multiple memory leaks in acp_hw_init
In acp_hw_init there are some allocations that needs to be released in
case of failure:

1- adev->acp.acp_genpd should be released if any allocation attemp for
adev->acp.acp_cell, adev->acp.acp_res or i2s_pdata fails.
2- all of those allocations should be released if
mfd_add_hotplug_devices or pm_genpd_add_device fail.
3- Release is needed in case of time out values expire.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Alex Deucher
2c9a0c66d5 drm/amdgpu: don't increment vram lost if we are in hibernation
We reset the GPU as part of our hibernation sequence so we need
to make sure we don't mark vram as lost in that case.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111879
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
shaoyunl
bd660f4f11 drm/amdgpu : enable msix for amdgpu driver
We might used out of the msi resources in some cloud project
which have a lot gpu devices(including PF and VF), msix can
provide enough resources from system level view

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
e7956997b1 drm/amdgpu: Export setup_vm_pt_regs() logic for mmhub 2.0
The KFD code will call this function later.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
923c087a1f drm/amdgpu: Add the HDP flush support for Navi
The HDP flush support code was missing in the nbio and nv files.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
e392c887df drm/amdkfd: Use array to probe kfd2kgd_calls
This is the same idea as the kfd device info probe and move all the
probe control together for easy maintenance.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
47c5ab6ca0 drm/amdkfd: Delete unnecessary function declarations
Ajust the function sequences so that those function delcarations are not
needed any more.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
1456482bf8 drm/amdgpu: Delete useless header file reference
Those header file includes are not needed.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Jack Zhang
21889cec0a drm/amd/amdgpu/sriov ip block setting of Arcturus
Add ip block setting for Arcturus SRIOV

1.PSP need to be initialized before IH.
2.SMU doesn't need to be initialized at kmd driver.
3.Arcturus doesn't support DCE hardware,it needs to skip
  register access to DCE.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Marek Olšák
cf21e76a60 drm/amdgpu: return tcc_disabled_mask to userspace
UMDs need this for correct programming of harvested chips.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Lyude Paul
f8d2d39eb4 drm/amdgpu: Iterate through DRM connectors correctly
Currently, every single piece of code in amdgpu that loops through
connectors does it incorrectly and doesn't use the proper list iteration
helpers, drm_connector_list_iter_begin() and
drm_connector_list_iter_end(). Yeesh.

So, do that.

Cc: Juston Li <juston.li@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Prike Liang
88d802500a drm/amdkfd: fix kgd2kfd_device_init() definition conflict error
The patch c670707 drm/amd: Pass drm_device to kfd introduced this issue and
fix the following compiler error.

  CC [M]  drivers/gpu/drm/amd/amdgpu//../powerplay/smumgr/fiji_smumgr.o
drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.c:746:6: error: conflicting types for ‘kgd2kfd_device_init’
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
      ^
In file included from drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.c:23:0:
drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.h:253:6: note: previous declaration of ‘kgd2kfd_device_init’ was here
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
      ^
scripts/Makefile.build:273: recipe for target 'drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.o' failed
make[1]: *** [drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.o] Error 1

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Kenneth Feng
227f7d58d7 drm/amd/amdgpu: add IH cg support on soc15 project
enable/disable IH clock gating on soc15 projects.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
b2100ce1db drm/amdkfd: Use setup_vm_pt_regs function from base driver in KFD
This was done on GFX9 previously, now do it for GFX10.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
286b789e1e drm/amdgpu: Export setup_vm_pt_regs() logic for gfxhub 2.0
The KFD code will call this function later.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
56fc40aba4 drm/amdkfd: Eliminate get_atc_vmid_pasid_mapping_valid
get_atc_vmid_pasid_mapping_valid() is very similar to
get_atc_vmid_pasid_mapping_pasid(), so they can be merged into a new
function get_atc_vmid_pasid_mapping_info() to reduce register access
times. More importantly, getting the PASID and the valid bit atomically
with a single read fixes some potential race conditions where the
mapping changes between the two reads.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
d19eb6aca7 drm/amdkfd: Delete unused defines
They are not used anywhere.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Harish Kasiviswanathan
3a0c342392 drm/amd: Pass drm_device to kfd
kfd needs drm_device to call into drm_cgroup functions

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
b55a8b8b41 drm/amdkfd: Use better name for sdma queue non HWS path
The old name is prone to confusion. The register offset is for a RLC queue
rather than a SDMA engine. The value is not a base address, but a
register offset.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
9941a6bfbd drm/amdkfd: Delete useless SDMA register setting on non HWS path
HW folks have confirm that we should not touch RESUME_CTX of
SDMA*_GFX_CONTEXT_CNTL when manipulating RLC queues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Christian König
56f074d815 drm/amdgpu: restrict hotplug error message
We should print the error only when we are hotplugged and crash
basically all userspace applications.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Christian König
4a24652867 drm/amdgpu: once more fix amdgpu_bo_create_kernel_at
When CPU access is needed we should tell that to
amdgpu_bo_create_reserved() or otherwise the access is denied later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
3d8361b11c drm/amdgpu: add comments in ras interrupt callback
add comments to clarify why checking GFX IP BLOCK for each ras interrupt callback

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
ba08349214 drm/amdgpu: implement common gmc_ras_late_init
common gmc_ecc_late_init can be shared among all generations of gmc

v2: rename gmc_ecc_late_init to gmc_ras_late_init

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
be5b39d87a drm/amdgpu: move xgmi ras fini to xgmi block
it's more suitable to put xgmi ras fini in xgmi block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
196041205c drm/amdgpu: move mmhub ras fini to mmhub block
it's more suitable to put mmhub ras fini in mmhub block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
181c93e5ec drm/amdgpu: move umc ras fini to umc block
it's more suitable to put umc ras fini in umc block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
f2575941e6 drm/amdgpu: add ras fini for xgmi
add ras fini for xgmi to cleanup xgmi ras framework

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
de9bbd5273 drm/amdgpu: add ras fini for nbio
add a common nbio ras fini implementation to cleanup nbio ras framework

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
0771b0bf07 drm/amdgpu: simplify the access to eeprom_control struct
simplify the code of accessing to eeprom_control struct

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
41190cd733 drm/amdgpu: remove ih_info parameter of gfx_ras_late_init
gfx_ras_late_init can get the info by itself

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
56c54b25c3 drm/amdgpu: remove ih_info parameter of umc_ras_late_init
umc_ras_late_init can get the info by itself

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
e536c81850 drm/amdgpu: add common sdma_ras_fini function
sdma_ras_fini can be shared among all generations of sdma

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
3b7b7647be drm/amdgpu: add common gfx_ras_fini function
gfx_ras_fini can be shared among all generations of gfx

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
2adf13440a drm/amdgpu: add common gmc_ras_fini function
gmc_ras_fini can be shared among all generations of gmc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
65bc47a659 drm/amdgpu: move mmhub_ras_if from gmc to mmhub block
mmhub_ras_if is relevant to mmhub

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
d65bf1f8a7 drm/amdgpu: replace mmhub_funcs with mmhub.funcs
remove mmhub_funcs in adev

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
d3a5a121b8 drm/amdgpu: add common mmhub member for adev
put mmhub_funcs and ras_if pointer into mmhub struct

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
03740baab3 drm/amdgpu: move umc_ras_if from gmc to umc block
umc_ras_if is relevant to umc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
fc04e6b484 drm/amdgpu: refine sdma4 ras_data_cb
simplify code logic and refine return value

v2: remove unused error source code

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
4c65dd1041 drm/amdgpu: move sdma ecc functions to generic sdma file
sdma ras ecc functions can be reused among all sdma generations

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
725253ab9b drm/amdgpu: move gfx ecc functions to generic gfx file
gfx ras ecc common functions could be reused among all gfx generations

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
34cc4fd9ff drm/amdgpu: move umc ras irq functions to umc block
move umc ras irq functions from gmc v9 to generic umc block, these
functions are relevant to umc and they can be shared among all
generations of umc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Tao Zhou
f5f06e21e9 drm/amdgpu: update parameter of ras_ih_cb
change struct ras_err_data *err_data to void *err_data, align with
umc code and the callback's declaration in each ras block could
pay no attention to the structure type

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Monk Liu
e7da754b00 drm/amdgpu: fix an UMC hw arbitrator bug(v3)
issue:
the UMC6 h/w bug is that when MCLK is doing the switch
in the middle of a page access being preempted by high
priority client (e.g. DISPLAY) then UMC and the mclk switch
would stuck there due to deadlock

how:
fixed by disabling auto PreChg for UMC to avoid high
priority client preempting other client's access on
the same page, thus the deadlock could be avoided

v2:
put the patch in callback of UMC6
v3:
rename the callback to "init_registers"

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Marek Olšák
6de088a08d drm/amdgpu: remove gfx9 NGG
Never used.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Alex Deucher
631cdbd27e drm/amdgpu/atomfirmware: simplify the interface to get vram info
fetch both the vram type and width in one function call.  This
avoids having to parse the same data table twice to get the two
pieces of data.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Alex Deucher
bd5520273c drm/amdgpu/atomfirmware: use proper index for querying vram type (v3)
The index is stored in scratch register 4 after asic init.  Use
that index.  No functional change since all asics in a family
use the same type of vram (G5, G6, HBM) and that is all we use
at the monent, but if we ever need to query other info, we will
now have the proper index.

v2: module array is variable sized, handle that.
v3: fix off by one in array handling

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Shirish S
52510a4035 drm/amdgpu/psp: silence response status warning
log the response status related error to the driver's
debug log since  psp response status is not 0 even though
there was no problem while the command was submitted.

This warning misleads, hence this change.

Signed-off-by: Shirish S <shirish.s@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Jesse Zhang
df99ac0fcc drm/amd/amdgpu:Fix compute ring unable to detect hang.
When compute fence did not signal, compute ring cannot detect hardware hang
because its timeout value is set to be infinite by default.

In SR-IOV and passthrough mode, if user does not declare custome timeout
value for compute ring, then use gfx ring timeout value as default. So
that when there is a ture hardware hang, compute ring can detect it.

Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
chen gong
90a08351f7 drm/amdgpu: Use mode2 mode to perform GPU RESET for Renoir
Renoir need to use mode2 mode to implement GPU RESET

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
40463bdc22 drm/amdkfd: Sync gfx10 kfd2kgd_calls function pointers
get_hive_id was not set. Also, adjust the function setting sequence.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
c637b36aea drm/amdkfd: Fix NULL pointer dereference for set_scratch_backing_va()
Currently this function pointer is missing for GFX10. Considering it is
a void function since GFX9, fix it by checking the function pointer
before dereferencing it.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
812330eb69 drm/amdkfd: Add an error print if SDMA RLC is not idle
The message will be useful when troubleshooting the issues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
05ba0095fb drm/amdgpu: correct condition check for psp rlc autoload
Otherwise non-autoload case will go into the wrong routine and fail.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Hawking Zhang
1f01cd9905 drm/amdgpu: add command id in psp response failure message
For better clarification of issue.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
90c88dab8e drm/amdgpu: enable psp front door loading by default on Arcturus
Front door firmware loading is done via the psp rather than the
driver.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
9a018e5a85 drm/amdgpu: disable vcn ip block for front door loading on Arcturus
Needs more work to enable via front door loading.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Tianci.Yin
4db37544ce drm/amdgpu/gfx10: add support for wks firmware loading
load different cp firmware according to the DID and RID

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
f77c7109c0 drm/amdgpu/ras: fix and update the documentation for RAS
Add new sections to amdgpu.rst, fix up formatting issues,
add additional documentation to each section.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
a667b75c1e drm/amdgpu: fix documentation for amdgpu_pm.c
Fix DOC link name, clean up formatting in pp_dpm_* section.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
fc9c7f8470 drm/amdgpu/ih: fix documentation in amdgpu_irq_dispatch
Fix parameters.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
1d614ded87 drm/amdgpu/vm: fix up documentation in amdgpu_vm.c
Missing parameters, wrong comment type, etc.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
4d8e54d2b9 drm/amdgpu/mn: fix documentation for amdgpu_mn_read_lock
Document the new parameter.

Fixes: 93065ac753 ("mm, oom: distinguish blockable mode for mmu notifiers")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
ebc52c1692 drm/amdgpu: fix documentation for amdgpu_gem_prime_export
Drop extra function parameter.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
yu kuai
d0580c09c6 drm/amdgpu: remove excess function parameter description
Fixes gcc warning:

drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:431: warning: Excess function
parameter 'sw' description in 'vcn_v2_5_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:550: warning: Excess function
parameter 'sw' description in 'vcn_v2_5_enable_clock_gating'

Fixes: cbead2bdfc ("drm/amdgpu: add VCN2.5 VCPU start and stop")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Guchun Chen
e53aec7e41 drm/amdgpu: enable full ras by default
Enable full ras by default, user does not need to enable it by
boot parameter.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Jiange Zhao
57d4f3b7fd drm/amdgpu/SRIOV: add navi12 pci id for SRIOV (v2)
Add Navi12 PCI id support.

v2: flag as experimental for now (Alex)

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
7677b0dbce drm/amdgpu/gfx10: update gfx golden settings for navi14
update registers: mmUTCL1_CTRL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
aa4604b6e4 drm/amdgpu/gfx10: update gfx golden settings
update registers: mmUTCL1_CTRL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
ade9a34e7d drm/amdgpu: flag navi12 and 14 as experimental for 5.4
We can remove this later as things get closer to launch.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
01b40c98ed drm/amdgpu/psp: invalidate the hdp read cache before reading the psp response
Otherwise we may get stale data.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
e8186eeccb drm/amdgpu/psp: flush HDP write fifo after submitting cmds to the psp
We need to make sure the fifo is flushed before we ask the psp to
process the commands.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Guchun Chen
5222d26146 drm/amdgpu: remove redundant variable definition
No need to define the same variables in each loop of the function.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Guchun Chen
8a3e801f19 drm/amdgpu: avoid null pointer dereference
null ptr should be checked first to avoid null ptr access

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Hawking Zhang
fec6a08aae drm/amdgpu: do not init mec2 jt for renoir
For ASICs like renoir/arct, driver doesn't need to load mec2 jt.
when mec1 jt is loaded, mec2 jt will be loaded automatically
since the write is actaully broadcasted to both.

We need to more time to test other gfx9 asic. but for now we should
be able to draw conclusion that mec2 jt is not needed for renoir and
arct.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Hawking Zhang
2011eaea21 drm/amdgpu: add psp ip block for arct
enable psp block for firmware loading and other security
feature setup.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
a142ba8800 drm/amdgpu/ras: use GPU PAGE_SIZE/SHIFT for reserving pages
We are reserving vram pages so they should be aligned to the
GPU page size.

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Xiaojie Yuan
ec51d3facd drm/amdgpu/discovery: get gpu info from ip discovery table
except soc_bounding_box which is not integrated in discovery table yet

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tao Zhou
afa44809a4 drm/amdgpu: use GPU PAGE SHIFT for umc retired page
umc retired page belongs to vram and it should be aligned to gpu page
size

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
57516cdd74 drm/amdgpu: add navi12 pci id
Add Navi12 PCI id support.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tao Zhou
ae115c81ec drm/amdgpu: replace DRM_ERROR with DRM_WARN in ras_reserve_bad_pages
There are two cases of reserve error should be ignored:
1) a ras bad page has been allocated (used by someone);
2) a ras bad page has been reserved (duplicate error injection for one page);

DRM_ERROR is unnecessary for the failure of bad page reserve

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Adam Zerella
879e723df3 docs: drm/amdgpu: Resolve build warnings
Some of the documentation formatting could be improved
which will resolve some Sphinx amdgpu build warnings e.g

WARNING: Unexpected indentation.
WARNING: Block quote ends without a blank line; unexpected unindent.
WARNING: Inline emphasis start-string without end-string.

Signed-off-by: Adam Zerella <adam.zerella@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Alex Deucher
63b2b5e91b drm/amdgpu/vm: fix documentation for amdgpu_vm_bo_param
Add new parameters.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Bhawanpreet Lakha
143f230533 drm/amdgpu: psp DTM init
DTM is the display topology manager. This is needed to communicate with
psp about the display configurations.

This patch adds
    -Loading the firmware
    -The functions and definitions for communication with the firmware

v2: Fix formatting

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Bhawanpreet Lakha
ed19a9a2bb drm/amdgpu: psp HDCP init
This patch adds
-Loading the firmware
-The functions and definitions for communication with the firmware

v2: Fix formatting

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Arnd Bergmann
29174a4310 drm/amdgpu: hide another #warning
An earlier patch of mine disabled some #warning statements
that get in the way of build testing, but then another
instance was added around the same time.

Remove that as well.

Fixes: b5203d16ae ("drm/amd/amdgpu: hide #warning for missing DC config")
Fixes: e1c14c4339 ("drm/amdgpu: Enable DC on Renoir")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:33 -05:00
Arnd Bergmann
ec3e5c0f0c drm/amdgpu: make pmu support optional, again
When CONFIG_PERF_EVENTS is disabled, we cannot compile the pmu
portion of the amdgpu driver:

drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:48:38: error: no member named 'hw' in 'struct perf_event'
        struct hw_perf_event *hwc = &event->hw;
                                     ~~~~~  ^
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:51:13: error: no member named 'attr' in 'struct perf_event'
        if (event->attr.type != event->pmu->type)
            ~~~~~  ^
...

The same bug was already fixed by commit d155bef063 ("amdgpu: make pmu
support optional") but broken again by what looks like an incorrectly
rebased patch.

Fixes: 64f55e6292 ("drm/amdgpu: Add RAS EEPROM table.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:33 -05:00
Navid Emamdoost
57be09c6e8 drm/amdgpu: fix multiple memory leaks in acp_hw_init
In acp_hw_init there are some allocations that needs to be released in
case of failure:

1- adev->acp.acp_genpd should be released if any allocation attemp for
adev->acp.acp_cell, adev->acp.acp_res or i2s_pdata fails.
2- all of those allocations should be released if
mfd_add_hotplug_devices or pm_genpd_add_device fail.
3- Release is needed in case of time out values expire.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:33 -05:00
Marek Olšák
815fb4c9d7 drm/amdgpu: return tcc_disabled_mask to userspace
UMDs need this for correct programming of harvested chips.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:33 -05:00
Alex Deucher
49379032aa drm/amdgpu: don't increment vram lost if we are in hibernation
We reset the GPU as part of our hibernation sequence so we need
to make sure we don't mark vram as lost in that case.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111879
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:19 -05:00
Christian König
69f08e68af drm/amdgpu: revert "disable bulk moves for now"
This reverts commit a213c2c7e2.

The changes to fix this should have landed in 5.1.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zhou, David(ChunMing) <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:23:03 -05:00
Linus Torvalds
289991ce1c drm fixes for 5.4-rc1
core:
 - Some cleanups and fixes in the self-refresh helpers
 - Some cleanups and fixes in the atomic helpers
 
 amdgpu:
 - Fix a 64 bit divide
 - Prevent a memory leak in a failure case in dc
 - Load proper gfx firmware on navi14 variants
 - Add more navi12 and navi14 PCI ids
 - Misc fixes for renoir
 - Fix bandwidth issues with multiple displays on vega20
 - Support for Dali
 - Fix a possible oops with KFD on hawaii
 - Fix for backlight level after resume on some APUs
 - Other misc fixes
 
 panfrost:
 - Multiple panfrost fixes for regulator support and page fault handling
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdjZr6AAoJEAx081l5xIa+nboP+gPWzx45Q3IsbnaZmcdFTFEf
 +/XgScoFcv5Uhd3aXtrSYDvPnSyNXpGsV5ccE/FtxNd4G75n20tPFxGNhjzyXfdc
 B2x1IRgc82W1KxYwwDlmd+f/86h6uthFkh1ToKN3hsHWNm8Wu8AgoJnoWvqwluf9
 natSFnQPQIvcADpbpyk8FiNcXvMg0qrKQ8aj3uPxqUs4/ftigzez+5vYJOkktoEJ
 NFtlouVvIZejVo6l4Q5ebXXsol7On02iHUszpdJtb5FxMcBQwAafewCGln2622cA
 8ooWmekZNtoHUH3WmUlrs7TqPKtoOqIEkMO8UvCJDwBB4/ft8sJRPDKFgk547E/8
 Htv6MZXCSOT+/XxebM/wHijOg3MQVjPzO9s73YSmkytzGZVQ/Fgohl/6W+bN/xAm
 j/huS5ZozengAldFJHG4wruSk820Vzx736x2pk+9sbpf7PdFDIpuZus8U8wHc411
 hu3S2397IxyX4XswLg8BTaIOhCXwT7CluQ9mYD1THPgRzG5YPha8JelTcwwlVsD9
 2Cw6mCUAqydHHMboWQnEhRXhuhVfGlPAdJTsdyoI6zdXYqU/ThihJPBgG0wSq0y0
 fAsj/9NRqSzg6hk9vm1QdCeOthKOuAZ0PgLcVHI1RNSwEyrN8yOupVwe7+Mn+q2z
 UNbfr2qXGqKxn6rqUy2W
 =yCyF
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Fixes built up over the past 1.5 weeks or so, it's two weeks of
  amdgpu, some core cleanups and some panfrost fixes. I also finally
  figured out why my desktop was slow to do a bunch of stuff (someone
  gave it an IPv6 address which can't reach anything!).

  core:
   - Some cleanups and fixes in the self-refresh helpers
   - Some cleanups and fixes in the atomic helpers

  amdgpu:
   - Fix a 64 bit divide
   - Prevent a memory leak in a failure case in dc
   - Load proper gfx firmware on navi14 variants
   - Add more navi12 and navi14 PCI ids
   - Misc fixes for renoir
   - Fix bandwidth issues with multiple displays on vega20
   - Support for Dali
   - Fix a possible oops with KFD on hawaii
   - Fix for backlight level after resume on some APUs
   - Other misc fixes

  panfrost:
   - Multiple panfrost fixes for regulator support and page fault
     handling"

* tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm: (34 commits)
  drm/amd/display: prevent memory leak
  drm/amdgpu/gfx10: add support for wks firmware loading
  drm/amdgpu/display: include slab.h in dcn21_resource.c
  drm/amdgpu/display: fix 64 bit divide
  drm/panfrost: Prevent race when handling page fault
  drm/panfrost: Remove NULL checks for regulator
  drm/panfrost: Fix regulator_get_optional() misuse
  drm: Measure Self Refresh Entry/Exit times to avoid thrashing
  drm: Fix kerneldoc and remove unused struct member in self_refresh helper
  drm/atomic: Rename crtc_state->pageflip_flags to async_flip
  drm/atomic: Reject FLIP_ASYNC unconditionally
  drm/atomic: Take the atomic toys away from X
  drm/amdgpu: flag navi12 and 14 as experimental for 5.4
  drm/kms: Duct-tape for mode object lifetime checks
  drm/amdgpu: add navi12 pci id
  drm/amdgpu: add navi14 PCI ID for work station SKU
  drm/amdkfd: Swap trap temporary registers in gfx10 trap handler
  drm/amd/powerplay: implement sysfs for getting dpm clock
  drm/amd/display: Restore backlight brightness after system resume
  drm/amd/display: Implement voltage limitation for dali
  ...
2019-09-27 11:13:35 -07:00
Andrey Konovalov
35f3fc87be drm/amdgpu: untag user pointers
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.

In amdgpu_gem_userptr_ioctl() and amdgpu_amdkfd_gpuvm.c/init_user_pages()
an MMU notifier is set up with a (tagged) userspace pointer.  The untagged
address should be used so that MMU notifiers for the untagged address get
correctly matched up with the right BO.  This patch untag user pointers in
amdgpu_gem_userptr_ioctl() for the GEM case and in amdgpu_amdkfd_gpuvm_
alloc_memory_of_gpu() for the KFD case.  This also makes sure that an
untagged pointer is passed to amdgpu_ttm_tt_get_user_pages(), which uses
it for vma lookups.

Link: http://lkml.kernel.org/r/d684e1df08f2ecb6bc292e222b64fa9efbc26e69.1563904656.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Jens Wiklander <jens.wiklander@linaro.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:41 -07:00
Tianci.Yin
1e94b43813 drm/amdgpu/gfx10: add support for wks firmware loading
load different cp firmware according to the DID and RID

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-24 13:25:15 -05:00
Linus Torvalds
84da111de0 hmm related patches for 5.4
This is more cleanup and consolidation of the hmm APIs and the very
 strongly related mmu_notifier interfaces. Many places across the tree
 using these interfaces are touched in the process. Beyond that a cleanup
 to the page walker API and a few memremap related changes round out the
 series:
 
 - General improvement of hmm_range_fault() and related APIs, more
   documentation, bug fixes from testing, API simplification &
   consolidation, and unused API removal
 
 - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE, and
   make them internal kconfig selects
 
 - Hoist a lot of code related to mmu notifier attachment out of drivers by
   using a refcount get/put attachment idiom and remove the convoluted
   mmu_notifier_unregister_no_release() and related APIs.
 
 - General API improvement for the migrate_vma API and revision of its only
   user in nouveau
 
 - Annotate mmu_notifiers with lockdep and sleeping region debugging
 
 Two series unrelated to HMM or mmu_notifiers came along due to
 dependencies:
 
 - Allow pagemap's memremap_pages family of APIs to work without providing
   a struct device
 
 - Make walk_page_range() and related use a constant structure for function
   pointers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl1/nnkACgkQOG33FX4g
 mxqaRg//c6FqowV1pQlLutvAOAgMdpzfZ9eaaDKngy9RVQxz+k/MmJrdRH/p/mMA
 Pq93A1XfwtraGKErHegFXGEDk4XhOustVAVFwvjyXO41dTUdoFVUkti6ftbrl/rS
 6CT+X90jlvrwdRY7QBeuo7lxx7z8Qkqbk1O1kc1IOracjKfNJS+y6LTamy6weM3g
 tIMHI65PkxpRzN36DV9uCN5dMwFzJ73DWHp1b0acnDIigkl6u5zp6orAJVWRjyQX
 nmEd3/IOvdxaubAoAvboNS5CyVb4yS9xshWWMbH6AulKJv3Glca1Aa7QuSpBoN8v
 wy4c9+umzqRgzgUJUe1xwN9P49oBNhJpgBSu8MUlgBA4IOc3rDl/Tw0b5KCFVfkH
 yHkp8n6MP8VsRrzXTC6Kx0vdjIkAO8SUeylVJczAcVSyHIo6/JUJCVDeFLSTVymh
 EGWJ7zX2iRhUbssJ6/izQTTQyCH3YIyZ5QtqByWuX2U7ZrfkqS3/EnBW1Q+j+gPF
 Z2yW8iT6k0iENw6s8psE9czexuywa/Lttz94IyNlOQ8rJTiQqB9wLaAvg9hvUk7a
 kuspL+JGIZkrL3ouCeO/VA6xnaP+Q7nR8geWBRb8zKGHmtWrb5Gwmt6t+vTnCC2l
 olIDebrnnxwfBQhEJ5219W+M1pBpjiTpqK/UdBd92A4+sOOhOD0=
 =FRGg
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull hmm updates from Jason Gunthorpe:
 "This is more cleanup and consolidation of the hmm APIs and the very
  strongly related mmu_notifier interfaces. Many places across the tree
  using these interfaces are touched in the process. Beyond that a
  cleanup to the page walker API and a few memremap related changes
  round out the series:

   - General improvement of hmm_range_fault() and related APIs, more
     documentation, bug fixes from testing, API simplification &
     consolidation, and unused API removal

   - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE,
     and make them internal kconfig selects

   - Hoist a lot of code related to mmu notifier attachment out of
     drivers by using a refcount get/put attachment idiom and remove the
     convoluted mmu_notifier_unregister_no_release() and related APIs.

   - General API improvement for the migrate_vma API and revision of its
     only user in nouveau

   - Annotate mmu_notifiers with lockdep and sleeping region debugging

  Two series unrelated to HMM or mmu_notifiers came along due to
  dependencies:

   - Allow pagemap's memremap_pages family of APIs to work without
     providing a struct device

   - Make walk_page_range() and related use a constant structure for
     function pointers"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (75 commits)
  libnvdimm: Enable unit test infrastructure compile checks
  mm, notifier: Catch sleeping/blocking for !blockable
  kernel.h: Add non_block_start/end()
  drm/radeon: guard against calling an unpaired radeon_mn_unregister()
  csky: add missing brackets in a macro for tlb.h
  pagewalk: use lockdep_assert_held for locking validation
  pagewalk: separate function pointers from iterator data
  mm: split out a new pagewalk.h header from mm.h
  mm/mmu_notifiers: annotate with might_sleep()
  mm/mmu_notifiers: prime lockdep
  mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
  mm/mmu_notifiers: remove the __mmu_notifier_invalidate_range_start/end exports
  mm/hmm: hmm_range_fault() infinite loop
  mm/hmm: hmm_range_fault() NULL pointer bug
  mm/hmm: fix hmm_range_fault()'s handling of swapped out pages
  mm/mmu_notifiers: remove unregister_no_release
  RDMA/odp: remove ib_ucontext from ib_umem
  RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
  RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
  RDMA/mlx5: Use ib_umem_start instead of umem.address
  ...
2019-09-21 10:07:42 -07:00
Alex Deucher
e16a7cbced drm/amdgpu: flag navi12 and 14 as experimental for 5.4
We can remove this later as things get closer to launch.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-18 08:29:30 -05:00
Tianci.Yin
10e85054f9 drm/amdgpu: add navi12 pci id
Add Navi12 PCI id support.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:49:19 -05:00
Tianci.Yin
a82e163bca drm/amdgpu: add navi14 PCI ID for work station SKU
Add the navi14 PCI device id.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:45:17 -05:00
Felix Kuehling
dcafbd50f2 drm/amdgpu: Fix KFD-related kernel oops on Hawaii
Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a574 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:38:35 -05:00
Prike Liang
4f3a2c1077 drm/amd/amdgpu: power up sdma engine when S3 resume back
The sdma_v4 should be ungated when the IP resume back,
otherwise it will hang up and resume time out error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:37:58 -05:00
Trek
73d8e6c7b8 drm/amdgpu: Check for valid number of registers to read
Do not try to allocate any amount of memory requested by the user.
Instead limit it to 128 registers. Actually the longest series of
consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and
mmGB_MACROTILE_MODE0-15 (0x2644-0x2673).

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273
Signed-off-by: Trek <trek00@inbox.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:37:45 -05:00
Aaron Liu
df794f679b drm/amdgpu: remove program of lbpw for renoir
These is no LBPW on Renoir. So removing program of lbpw for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:26:24 -05:00
Andrey Grodzovsky
103efdc1ea drm/amdgpu: Remove clock gating restore.
Restoring clock gating break SMU opeartion afterwards, avoid
this until this further invistigated with SMU.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:25:01 -05:00
Christian König
de7b45babd drm/amdgpu: cleanup creating BOs at fixed location (v2)
The placement is something TTM/BO internal and the RAS code should
avoid touching that directly.

Add a helper to create a BO at a fixed location and use that instead.

v2: squash in fixes (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 08:06:54 -05:00
José Roberto de Souza
62afb4ad42 drm/connector: Allow max possible encoders to attach to a connector
Currently we restrict the number of encoders that can be linked to
a connector to 3, increase it to match the maximum number of encoders
that can be initialized(32).

To more effiently do that lets switch from an array of encoder ids to
bitmask.

v2: Fixing missed return on amdgpu_dm_connector_to_encoder()

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913232857.389834-2-jose.souza@intel.com
2019-09-16 15:13:53 -07:00
Andrey Grodzovsky
db338e1663 drm/amdgpu:Fix EEPROM checksum calculation.
Fix checksum calculation after manually resetting the table.
Unify reset and empty EEPROM init flow.
Protect the table reset with lock.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:44 -05:00
Guchun Chen
012dd14d1d drm/amdgpu: fix ras ctrl debugfs node leak
Use debugfs_remove_recursive to remove the whole debugfs
directory instead of removing the node one by one.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:38 -05:00
Christian König
1313dacfad drm/amdgpu: trace if a PD/PT update is done directly
This is usfull for debugging.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:32 -05:00
Christian König
bc51c1e56f drm/amdgpu: drop double HDP flush in the VM code
Already done in the CPU based backend code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:27 -05:00
Christian König
fc39d903eb drm/amdgpu: cleanup coding style in the VM code a bit
No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:13 -05:00
Christian König
03fb560f2e drm/amdgpu: revert "disable bulk moves for now"
This reverts commit a213c2c7e2.

The changes to fix this should have landed in 5.1.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zhou, David(ChunMing) <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:29:37 -05:00
Jiange Zhao
393993ac0c drm/amdgpu/SRIOV: Navi12 SRIOV VF gets GTT base
With changes in PSP and HV, SRIOV VF will handle

vram gtt location just like bare metal. There is

no need to differentiate it anymore.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:29:04 -05:00
Aaron Liu
28faa17ee8 drm/amdgpu: remove program of lbpw for renoir
These is no LBPW on Renoir. So removing program of lbpw for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:28:57 -05:00
zhong jiang
2032324682 drm/amdgpu: remove the redundant null checks
debugfs_remove and kfree has taken the null check in account.
hence it is unnecessary to check it. Just remove the condition.
No functional change.

This issue was detected by using the Coccinelle software.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:56 -05:00
Jean Delvare
ae2a349597 drm/amd: be quiet when no SAD block is found
It is fine for displays without audio functionality to not provide
any SAD block in their EDID. Do not log an error in that case,
just return quietly.

This fixes half of bug fdo#107825:
https://bugs.freedesktop.org/show_bug.cgi?id=107825

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:56 -05:00
Trek
13238d4fa6 drm/amdgpu: Check for valid number of registers to read
Do not try to allocate any amount of memory requested by the user.
Instead limit it to 128 registers. Actually the longest series of
consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and
mmGB_MACROTILE_MODE0-15 (0x2644-0x2673).

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273
Signed-off-by: Trek <trek00@inbox.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:56 -05:00
Kent Russell
3e103fc301 Revert "drm/amdgpu/nbio7.4: add hw bug workaround for vega20"
This reverts commit e01f2d4189.

VG20 did not require this workaround, as the fix is in the VBIOS.
Leave VG10/12 workaround as some older shipped cards do not have the
VBIOS fix in place, and the kernel workaround is required in those
situations

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
ec671737f8 drm/amdgpu: add graceful VM fault handling v3
Next step towards HMM support. For now just silence the retry fault and
optionally redirect the request to the dummy page.

v2: make sure the VM is not destroyed while we handle the fault.
v3: fix VM destroy check, cleanup comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
b65709a921 drm/amdgpu: reserve the root PD while freeing PASIDs
Free the pasid only while the root PD is reserved. This prevents use after
free in the page fault handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
061468c405 drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault
While handling a page fault we can't wait for other ongoing GPU
operations or we can potentially run into deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
0f6064d6af drm/amdgpu: allow direct submission of clears
For handling PD/PT clears directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
acb476f541 drm/amdgpu: allow direct submission of PTE updates
For handling PTE updates directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
807e299409 drm/amdgpu: allow direct submission of PDE updates v2
For handling PDE updates directly in the fault handler.

v2: fix typo in comment

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
47ca7efa4c drm/amdgpu: allow direct submission in the VM backends v2
This allows us to update page tables directly while in a page fault.

v2: use direct/delayed entities and still wait for moves

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
a2cf324785 drm/amdgpu: split the VM entity into direct and delayed
For page fault handling we need to use a direct update which can't be
blocked by ongoing user CS.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
6817bf283b drm/amdgpu: grab the id mgr lock while accessing passid_mapping
Need to make sure that we actually dropping the right fence.
Could be done with RCU as well, but to complicated for a fix.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:16:28 -05:00
Jiange Zhao
1b65782468 drm/amdgpu/SRIOV: Navi12 SRIOV VF doesn't load TOC
In SRIOV case, the autoload sequence is the same

as bare metal, except VF won't load TOC.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:16:21 -05:00
Jiange Zhao
a4ac7693f8 drm/amdgpu/SRIOV: Navi10/12 VF doesn't support SMU
In SRIOV case, SMU and powerplay are handled in HV.

VF shouldn't have control over SMU and powerplay.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:16:14 -05:00
Prike Liang
a90a24d581 drm/amd/amdgpu: power up sdma engine when S3 resume back
The sdma_v4 should be ungated when the IP resume back,
otherwise it will hang up and resume time out error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:16:07 -05:00
Jiange Zhao
b05b69036f drm/amdgpu: For Navi12 SRIOV VF, register mailbox functions
Mailbox functions and interrupts are only for Navi12 VF.

Register functions and irqs during initialization.

Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:15:20 -05:00
Jack Zhang
51c0f58e9f drm/amdgpu/sriov: add ring_stop before ring_create in psp v11 code
psp  v11 code missed ring stop in ring create function(VMR)
while psp v3.1 code had the code. This will cause VM destroy1
fail and psp ring create fail.

For SIOV-VF, ring_stop should not be deleted in ring_create
function.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:11:28 -05:00
Evan Quan
0e0b89c0d7 drm/amd/powerplay: properly set mp1 state for SW SMU suspend/reset routine
Set mp1 state properly for SW SMU suspend/reset routine.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:10:58 -05:00
Felix Kuehling
d950800e79 drm/amdgpu: Fix KFD-related kernel oops on Hawaii
Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a574 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:10:06 -05:00
Andrey Grodzovsky
708901a666 drm/amdgpu: Fix mutex lock from atomic context.
Problem:
amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
because writing to EEPROM during ASIC reset was unstable.
But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
directly from ISR context and so locking is not allowed. Also it's
irrelevant for this partilcular interrupt as this is generic RAS
interrupt and not memory errors specific.

Fix:
Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:59 -05:00
Jiange Zhao
3636169cc0 drm/amdgpu: Add SRIOV mailbox backend for Navi1x
Mimic the ones for Vega10, add mailbox backend for Navi1x

Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:52 -05:00
Guchun Chen
1a3f2e8c3c drm/amdgpu: implement ras query function for pcie bif
ras error query funtionality implementation

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:44 -05:00
Guchun Chen
d7bd680d40 drm/amdgpu: support pcie bif ras query and inject
Call pcie bif ras query/inject in amdgpu ras.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:30 -05:00
Guchun Chen
52652ef286 drm/amdgpu: add ras error query count interface for nbio
Add the interface query_ras_error_count for nbio.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:21 -05:00
Tianci.Yin
ff9d097193 drm/amdgpu: fix CPDMA hang in PRT mode for VEGA10
add and_mask since the programming logic of golden setting changed

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:14 -05:00
Hawking Zhang
f317035288 drm/amdgpu: enable error injection to XGMI block via debugfs
allow inject error to XGMI block via debugfs node ras_ctrl

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:08 -05:00
Hawking Zhang
029fbd437e drm/amdgpu: initialize ras structures for xgmi block (v2)
init ras common interface and fs node for xgmi block

v2: remove unnecessary physical node number check before
invoking amdgpu_xgmi_ras_late_init

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:08:51 -05:00
Andrey Grodzovsky
084fe13b2c drm/amdgpu: Allow to reset to EERPOM table.
The table grows quickly during debug/development effort when
multiple RAS errors are injected. Allow to avoid this by setting
table header back to empty if needed.

v2: Switch to debugfs entry instead of load time parameter.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:34 -05:00
Andrey Grodzovsky
d01b400b1a drm/amdgpu: Add amdgpu_ras_eeprom_reset_table
This will allow to reset the table on the fly.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:27 -05:00
Tao Zhou
d99659a062 drm/amdgpu: rename umc ras_init to err_cnt_init
this interface is related to specific version of umc, distinguish it
from ras_late_init

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:20 -05:00
Tao Zhou
4930aabe7c drm/amdgpu: move umc ras init to umc block
move umc ras init from ras module to umc block, generic ras module
should pay less attention to specific ras block.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:12 -05:00
Tao Zhou
86edcc7dba drm/amdgpu: move umc late init from gmc to umc block
umc late init is umc specific, it's more suitable to be put in umc block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:03 -05:00
Guchun Chen
1bd252c57b drm/amdgpu: remove duplicated header file include
amdgpu_ras.h is already included.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:05:29 -05:00
Shirish S
a35ad98bf9 drm/amdgpu: remove needless usage of #ifdef
define sched_policy in case CONFIG_HSA_AMD is not
enabled, with this there is no need to check for CONFIG_HSA_AMD
else where in driver code.

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:05:20 -05:00
Shirish S
8c9f69bc5c drm/amdgpu: fix build error without CONFIG_HSA_AMD
If CONFIG_HSA_AMD is not set, build fails:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function `amdgpu_device_ip_early_init':
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to `sched_policy'

Use CONFIG_HSA_AMD to guard this.

Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling policy")

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:05:07 -05:00
Andrey Grodzovsky
4d1337d2e9 drm/amdgpu: Avoid RAS recovery init when no RAS support.
Fixes driver load regression on APUs.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:59:35 -05:00
Christian König
cbfae36cea drm/amdgpu: cleanup PTE flag generation v3
Move the ASIC specific code into a new callback function.

v2: mask the flags for SI and CIK instead of a BUG_ON().
v3: remove last missed BUG_ON().

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:59:29 -05:00
Christian König
71776b6dae drm/amdgpu: cleanup mtype mapping
Unify how we map the UAPI flags to the PTE hardware flags for a mapping.

Only the MTYPE is actually ASIC dependent, all other flags should be
copied over 1 to 1 and ASIC differences are handled later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:59:21 -05:00
Tianci.Yin
1dd077bbba drm/amdgpu: add navi14 PCI ID for work station SKU
Add the navi14 PCI device id.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:59:14 -05:00
Philip Yang
cde85ac247 drm/amdgpu: check if nbio->ras_if exist
To avoid NULL function pointer access. This happens on VG10, reboot
command hangs and have to power off/on to reboot the machine. This is
serial console log:

[  OK  ] Reached target Unmount All Filesystems.
[  OK  ] Reached target Final Step.
         Starting Reboot...
[  305.696271] systemd-shutdown[1]: Syncing filesystems and block
devices.
[  306.947328] systemd-shutdown[1]: Sending SIGTERM to remaining
processes...
[  306.963920] systemd-journald[1722]: Received SIGTERM from PID 1
(systemd-shutdow).
[  307.322717] systemd-shutdown[1]: Sending SIGKILL to remaining
processes...
[  307.336472] systemd-shutdown[1]: Unmounting file systems.
[  307.454202] EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro
[  307.480523] systemd-shutdown[1]: All filesystems unmounted.
[  307.486537] systemd-shutdown[1]: Deactivating swaps.
[  307.491962] systemd-shutdown[1]: All swaps deactivated.
[  307.497624] systemd-shutdown[1]: Detaching loop devices.
[  307.504418] systemd-shutdown[1]: All loop devices detached.
[  307.510418] systemd-shutdown[1]: Detaching DM devices.
[  307.565907] sd 2:0:0:0: [sda] Synchronizing SCSI cache
[  307.731313] BUG: kernel NULL pointer dereference, address:
0000000000000000
[  307.738802] #PF: supervisor read access in kernel mode
[  307.744326] #PF: error_code(0x0000) - not-present page
[  307.749850] PGD 0 P4D 0
[  307.752568] Oops: 0000 [#1] SMP PTI
[  307.756314] CPU: 3 PID: 1 Comm: systemd-shutdow Not tainted
5.2.0-rc1-kfd-yangp #453
[  307.764644] Hardware name: ASUS All Series/Z97-PRO(Wi-Fi ac)/USB 3.1,
BIOS 9001 03/07/2016
[  307.773580] RIP: 0010:soc15_common_hw_fini+0x33/0xc0 [amdgpu]
[  307.779760] Code: 89 fb e8 60 f5 ff ff f6 83 50 df 01 00 04 75 3d 48
8b b3 90 7d 00 00 48 c7 c7 17 b8 530
[  307.799967] RSP: 0018:ffffac9483153d40 EFLAGS: 00010286
[  307.805585] RAX: 0000000000000000 RBX: ffff9eb299da0000 RCX:
0000000000000006
[  307.813261] RDX: 0000000000000000 RSI: ffff9eb29e3508a0 RDI:
ffff9eb29e350000
[  307.820935] RBP: ffff9eb299da0000 R08: 0000000000000000 R09:
0000000000000000
[  307.828609] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9eb299dbd1f8
[  307.836284] R13: ffffffffc04f8368 R14: ffff9eb29cebd130 R15:
0000000000000000
[  307.843959] FS:  00007f06721c9940(0000) GS:ffff9eb2a18c0000(0000)
knlGS:0000000000000000
[  307.852663] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  307.858842] CR2: 0000000000000000 CR3: 000000081d798005 CR4:
00000000001606e0
[  307.866516] Call Trace:
[  307.869169]  amdgpu_device_ip_suspend_phase2+0x80/0x110 [amdgpu]
[  307.875654]  ? amdgpu_device_ip_suspend_phase1+0x4d/0xd0 [amdgpu]
[  307.882230]  amdgpu_device_ip_suspend+0x2e/0x60 [amdgpu]
[  307.887966]  amdgpu_pci_shutdown+0x2f/0x40 [amdgpu]
[  307.893211]  pci_device_shutdown+0x31/0x60
[  307.897613]  device_shutdown+0x14c/0x1f0
[  307.901829]  kernel_restart+0xe/0x50
[  307.905669]  __do_sys_reboot+0x1df/0x210
[  307.909884]  ? task_work_run+0x73/0xb0
[  307.913914]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[  307.918970]  do_syscall_64+0x4a/0x1c0
[  307.922904]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  307.928336] RIP: 0033:0x7f0671cf8373
[  307.932176] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00
00 0f 1f 44 00 00 89 fa be 69 19 128
[  307.952384] RSP: 002b:00007ffdd1723d68 EFLAGS: 00000202 ORIG_RAX:
00000000000000a9
[  307.960527] RAX: ffffffffffffffda RBX: 0000000001234567 RCX:
00007f0671cf8373
[  307.968201] RDX: 0000000001234567 RSI: 0000000028121969 RDI:
00000000fee1dead
[  307.975875] RBP: 00007ffdd1723dd0 R08: 0000000000000000 R09:
0000000000000000
[  307.983550] R10: 0000000000000002 R11: 0000000000000202 R12:
00007ffdd1723dd8
[  307.991224] R13: 0000000000000000 R14: 0000001b00000004 R15:
00007ffdd17240c8
[  307.998901] Modules linked in: xt_MASQUERADE nfnetlink iptable_nat
xt_addrtype xt_conntrack nf_nat nf_cos
[  308.026505] CR2: 0000000000000000
[  308.039998] RIP: 0010:soc15_common_hw_fini+0x33/0xc0 [amdgpu]
[  308.046180] Code: 89 fb e8 60 f5 ff ff f6 83 50 df 01 00 04 75 3d 48
8b b3 90 7d 00 00 48 c7 c7 17 b8 530
[  308.066392] RSP: 0018:ffffac9483153d40 EFLAGS: 00010286
[  308.072013] RAX: 0000000000000000 RBX: ffff9eb299da0000 RCX:
0000000000000006
[  308.079689] RDX: 0000000000000000 RSI: ffff9eb29e3508a0 RDI:
ffff9eb29e350000
[  308.087366] RBP: ffff9eb299da0000 R08: 0000000000000000 R09:
0000000000000000
[  308.095042] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9eb299dbd1f8
[  308.102717] R13: ffffffffc04f8368 R14: ffff9eb29cebd130 R15:
0000000000000000
[  308.110394] FS:  00007f06721c9940(0000) GS:ffff9eb2a18c0000(0000)
knlGS:0000000000000000
[  308.119099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  308.125280] CR2: 0000000000000000 CR3: 000000081d798005 CR4:
00000000001606e0
[  308.135304] printk: systemd-shutdow: 3 output lines suppressed due to
ratelimiting
[  308.143518] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x00000009
[  308.151798] Kernel Offset: 0x15000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xff)
[  308.171775] ---[ end Kernel panic - not syncing: Attempted to kill
init! exitcode=0x00000009 ]---

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:58:52 -05:00
Xiaojie Yuan
bfa603aa5e drm/amdgpu: fix null pointer deref in firmware header printing
v2: declare as (struct common_firmware_header *) type because
    struct xxx_firmware_header inherits from it

When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an
unallocated amdgpu_firmware_info instance.

This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have
optimized out such 'defined but not referenced' variable.

[ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a
[ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0
[ 1120.818931] Oops: 0000 [#1] SMP
[ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid
[ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G           OE  ------------   3.10.0-1062.el7.x86_64 #1
[ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015
[ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000
[ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>]  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1120.990483] RSP: 0018:ffff991ee625f950  EFLAGS: 00010202
[ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000
[ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3
[ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000
[ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220
[ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30
[ 1121.032666] FS:  00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000
[ 1121.041000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0
[ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1121.068938] Call Trace:
[ 1121.071494]  [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu]
[ 1121.077886]  [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu]
[ 1121.085296]  [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu]
[ 1121.092534]  [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu]
[ 1121.099675]  [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu]
[ 1121.106888]  [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm]
[ 1121.113419]  [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140
[ 1121.120183]  [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu]
[ 1121.126919]  [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0
[ 1121.132622]  [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160
[ 1121.138607]  [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0
[ 1121.144766]  [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0
[ 1121.150507]  [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50
[ 1121.156422]  [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0
[ 1121.162213]  [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20
[ 1121.167771]  [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0
[ 1121.173590]  [<ffffffffa4eb4c94>] driver_register+0x64/0xf0
[ 1121.179345]  [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0
[ 1121.185593]  [<ffffffffc099f000>] ? 0xffffffffc099efff
[ 1121.190914]  [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu]
[ 1121.197101]  [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240
[ 1121.202901]  [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0
[ 1121.208598]  [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100
[ 1121.214894]  [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140
[ 1121.220698]  [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a
[ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7
[ 1121.247422] RIP  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1121.254432]  RSP <ffff991ee625f950>
[ 1121.258017] CR2: 000000000000000a
[ 1121.261427] ---[ end trace e98b35387ede75bd ]---

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Fixes: c5fb912653 ("drm/amdgpu: add firmware header printing for psp fw loading (v2)")
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:56:01 -05:00
Huang Rui
4042a18872 drm/amdkfd: enable renoir while device probes
This patch is to add asic flag to enable device probe during kfd init.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:50 -05:00
Huang Rui
aa978594cf drm/amdgpu: disable gfxoff while use no H/W scheduling policy
While gfxoff is enabled, the mmVM_XXX registers will be 0xfffffff while the GFX
is in "off" state. KFD queue creattion doesn't use ring based method, so it will
trigger a VM fault.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:41 -05:00
Felix Kuehling
7cae706193 drm/amdgpu: Disable retry faults in VMID0
There is no point retrying page faults in VMID0. Those faults are
always fatal.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-and-Tested-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:34 -05:00
Yong Zhao
4e66d7d215 drm/amdgpu: Add a kernel parameter for specifying the asic type
As more and more new asics start to reuse the old device IDs before
launch, there is a need to quickly override the existing asic type
corresponding to the reused device ID through a kernel parameter. With
this, engineers no longer need to rely on local hack patches,
facilitating cooperation across teams.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:25 -05:00
Alex Deucher
bb42eda284 drm/amdgpu/irq: check if nbio funcs exist
We need to check if the nbios funcs exist before
checking the individual pointers.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
2019-09-16 09:54:14 -05:00
Tao Zhou
1a6fc071e1 drm/amdgpu: move the call of ras recovery_init and bad page reserve to proper place
ras recovery_init should be called after ttm init,
bad page reserve should be put in front of gpu reset since i2c
may be unstable during gpu reset.
add cleanup for recovery_init and recovery_fini

v2: add more comment and print.
    remove cancel_work_sync in recovery_init.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:47 -05:00
Tao Zhou
87d2b92f1e drm/amdgpu: save umc error records
save umc error records to ras bad page array

v2: add bad pages before gpu reset
v3: add NULL check for adev->umc.funcs

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:40 -05:00
Tao Zhou
78ad00c903 drm/amdgpu: Hook EEPROM table to RAS
support eeprom records load and save for ras,
move EEPROM records storing to bad page reserving

v2: remove redundant check for con->eh_data

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:32 -05:00
Tao Zhou
9dc23a6325 drm/amdgpu: change ras bps type to eeprom table record structure
change bps type from retired page to eeprom table record, prepare for
saving umc error records to eeprom

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:26 -05:00
Andrey Grodzovsky
4bc2234077 drm/madgpu: Fix EEPROM Checksum calculation.
Fix typo which messed up the calculation.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:19 -05:00
Andrey Grodzovsky
4d25fba4e3 drm/amdgpu: Remove clock gating restore.
Restoring clock gating break SMU opeartion afterwards, avoid
this until this further invistigated with SMU.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:10 -05:00
Yong Zhao
050091ab6e drm/amdkfd: Query kfd device info by CHIP id instead of pci device id
This optimizes out the pci device id usage in KFD and makes the code
more maintainable.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:45 -05:00
Felix Kuehling
cd05c86510 drm/amdgpu: Disable page faults while reading user wptrs
These wptrs must be pinned and GPU accessible when this is called
from hqd_load functions. So they should never fault. This resolves
a circular lock dependency issue involving four locks including the
DQM lock and mmap_sem.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:38 -05:00
John Clements
337c200756 drm/amdgpu: clean up load TMR sequence
Removed redundant goto statement

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:11 -05:00
John Clements
4fb60b02fb drm/amdgpu: enable TA load support in Arcturus
Add support for loading XGMI/RAS FW

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:02 -05:00
Tao Zhou
c5b6e585b2 drm/amdgpu: change r type to int in gmc_v9_0_late_init
change r type from bool to int, suitable for both bool and int return
value

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:48:51 -05:00
Jack Zhang
f1d59e00ff drm/amd/amdgpu: add sw_fini interface for df_funcs
add sw_fini interface of df_funcs.
This interface will remove sysfs file of df_cntr_avail
function.

The old behavior only create sysfs of df_cntr_avail
in sw_init, but never remove it for lack of sw_fini
interface. With this,driver will report create
sysfs fail when it's loaded for the second time.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:42:15 -05:00
Hawking Zhang
9dc9134258 drm/amdgpu: init UMC & RSMU register base address
UMC RAS feature requires access to UMC & RSMU registers

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:42:08 -05:00
Hawking Zhang
1c70d3d9c4 drm/amdgpu/nbio: switch to amdgpu_nbio_ras_late_init helper function
amdgpu_nbio_ras_late_init is used to init nbio specfic
ras debugfs/sysfs node and nbio specific interrupt handler.
It can be shared among nbio generations

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:42:02 -05:00
Hawking Zhang
47930de4aa drm/amdgpu/mmhub: switch to amdgpu_mmhub_ras_late_init helper function
amdgpu_mmhub_ras_late_init is used to init mmhub specfic
ras debugfs/sysfs node and mmhub specific interrupt handler.
It can be shared among mmhub generations

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:55 -05:00
Hawking Zhang
bfcf62c2a5 drm/amdgpu/sdma: switch to amdgpu_sdma_ras_late_init helper function
amdgpu_sdma_ras_late_init is used to init sdma specfic
ras debugfs/sysfs node and sdma specific interrupt handler.
It can be shared among sdma generations

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:49 -05:00
Hawking Zhang
6caeee7a70 drm/amdgpu/gfx: switch to amdgpu_gfx_ras_late_init helper function
amdgpu_gfx_ras_late_init is used to init gfx specfic
ras debugfs/sysfs node and gfx specific interrupt handler.
It can be shared among gfx generations

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:42 -05:00
Hawking Zhang
a85eff14da drm/amdgpu/gmc: switch to amdgpu_gmc_ras_late_init helper function
amdgpu_gmc_ras_late_init is used to init gmc specfic
ras debugfs/sysfs node and gmc specific interrupt handler.
It can be shared among gmc generations.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:36 -05:00
Hawking Zhang
d094aea312 drm/amdgpu: set ip specific ras interface pointer to NULL after free it
to prevent access to dangling pointers

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:29 -05:00
Andrey Grodzovsky
d5ea093eeb dmr/amdgpu: Add system auto reboot to RAS.
In case of RAS error allow user configure auto system
reboot through ras_ctrl.
This is also part of the temproray work around for the RAS
hang problem.

v4: Use latest kernel API for disk sync.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:17 -05:00
Andrey Grodzovsky
7c6e68c777 drm/amdgpu: Avoid HW GPU reset for RAS.
Problem:
Under certain conditions, when some IP bocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until proper solution in PSP/SMU is ready:
When uncorrectable error happens the DF will unconditionally
broadcast error event packets to all its clients/slave upon
receiving fatal error event and freeze all its outbound queues,
err_event_athub interrupt  will be triggered.
In such case and we use this interrupt
to issue GPU reset. THe GPU reset code is modified for such case to avoid HW
reset, only stops schedulers, deatches all in progress and not yet scheduled
job's fences, set error code on them and signals.
Also reject any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on prevoius bug fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive memebers.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:05 -05:00
Andrey Grodzovsky
12ffa55da6 drm/amdgpu: Fix bugs in amdgpu_device_gpu_recover in XGMI case.
Issue 1:
In  XGMI case amdgpu_device_lock_adev for other devices in hive
was called to late, after access to their repsective schedulers.
So relocate the lock to the begining of accessing the other devs.

Issue 2:
Using amdgpu_device_ip_need_full_reset to switch the device list from
all devices in hive to the single 'master' device who owns this reset
call is wrong because when stopping schedulers we iterate all the devices
in hive but when restarting we will only reactivate the 'master' device.
Also, in case amdgpu_device_pre_asic_reset conlcudes that full reset IS
needed we then have to stop schedulers for all devices in hive and not
only the 'master' but with amdgpu_device_ip_need_full_reset  we
already missed the opprotunity do to so. So just remove this logic and
always stop and start all schedulers for all devices in hive.

Also minor cleanup and print fix.

v4: Minor coding style fix.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:39:05 -05:00
Christian König
43ce6bab7b drm/amdgpu: remove amdgpu_cs_try_evict
Trying to evict things from the current working set doesn't work that
well anymore because of per VM BOs.

Rely on reserving VRAM for page tables to avoid contention.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:38:56 -05:00
Christian König
9d1b3c7805 drm/amdgpu: reserve at least 4MB of VRAM for page tables v2
This hopefully helps reduce the contention for page tables.

v2: adjust maximum reported VRAM size as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:38:47 -05:00
Christian König
629be20395 drm/amdgpu: use moving fence instead of exclusive for VM updates
Make VM updates depend on the moving fence instead of the exclusive one.

Makes it less likely to actually have a dependency.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:38:38 -05:00
Hawking Zhang
39857252e5 drm/amdgpu: only apply gds clearing workaround when ras is supported
gds clearing workaround should only be applied on asics that support gfx ras

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:29 -05:00
Hawking Zhang
8bf2485aec drm/amdgpu: fix memory leak when ras is not supported on specific ip block
free ras_if if ras is not supported

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:22 -05:00
Hawking Zhang
4ce71be67b drm/amdgpu: check mmhub_funcs pointer before refering to it
mmhub callback functions are not initialized for all the ASICs

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:16 -05:00
Felix Kuehling
17da41bf00 drm/amdgpu: Remove unnecessary TLB workaround (v2)
This workaround is better handled in user mode in a way that doesn't
require allocating extra memory and breaking userptr BOs.

The TLB bug is a performance bug, not a functional or security bug.
Hence it is safe to remove this kernel part of the workaround to
allow a better workaround using only virtual address alignments in
user mode.

v2: Removed VI_BO_SIZE_ALIGN definition

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:08 -05:00
Felix Kuehling
e0253d083c drm/amdgpu: Use optimal mtypes and PTE bits for Arcturus
For compute VRAM allocations on Arturus use the new RW mtype
for non-coherent local memory, CC mtype for coherent local
memory and PTE_SNOOPED bit for invalidating non-dirty cache
lines on remote XGMI mappings.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:01 -05:00
Felix Kuehling
d0ba51b1ca drm/amdgpu: Determing PTE flags separately for each mapping (v3)
The same BO can be mapped with different PTE flags by different GPUs.
Therefore determine the PTE flags separately for each mapping instead
of storing them in the KFD buffer object.

Add a helper function to determine the PTE flags to be extended with
ASIC and memory-type-specific logic in subsequent commits.

v2: Split Arcturus-specific MTYPE changes into separate commit
v3: Fix return type of get_pte_flags to uint64_t

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:35:55 -05:00
Oak Zeng
093e48c04d drm/amdgpu: Support new arcturus mtype
Arcturus repurposed mtype WC to RW. Modify gmc functions
to support the new mtype

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:35:48 -05:00
Hawking Zhang
22e1d14fef drm/amdgpu: switch to amdgpu_ras_late_init for nbio v7_4 (v2)
call helper function in late init phase to handle ras init
for nbio ip block

v2: init local var r to 0 in case the function return failure
on asics that don't have ras_late_init implementation

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:05 -05:00
Hawking Zhang
9ad1dc295b drm/amdgpu: add ras_late_init callback function for nbio v7_4 (v3)
ras_late_init callback function will be used to do common ras
init in late init phase.

v2: call ras_late_fini to do cleanup when fails to enable interrupt

v3: rename sysfs/debugfs node name to pcie_bif_xxx

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:05 -05:00
Hawking Zhang
dda79907a7 drm/amdgpu: add mmhub ras_late_init callback function (v2)
The function will be called in late init phase to do mmhub
ras init

v2: check ras_late_init function pointer before invoking the
function

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:05 -05:00
Hawking Zhang
2452e7783c drm/amdgpu: switch to amdgpu_ras_late_init for gmc v9 block (v2)
call helper function in late init phase to handle ras init
for gmc ip block

v2: call ras_late_fini to do clean up when fail to enable interrupt

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:05 -05:00
Hawking Zhang
7d0a31e8cc drm/amdgpu: switch to amdgpu_ras_late_init for sdma v4 block (v2)
call helper function in late init phase to handle ras init
for sdma ip block

v2: call ras_late_fini to do clean up when fail to enable interrupt

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
63fa48db49 drm/amdgpu: switch to amdgpu_ras_late_init for gfx v9 block (v2)
call helper function in late init phase to handle ras init
for gfx ip block

v2: call ras_late_fini to do clean up when fail to enable interrupt

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
b293e891b0 drm/amdgpu: add helper function to do common ras_late_init/fini (v3)
In late_init for ras, the helper function will be used to
1). disable ras feature if the IP block is masked as disabled
2). send enable feature command if the ip block was masked as enabled
3). create debugfs/sysfs node per IP block
4). register interrupt handler

v2: check ih_info.cb to decide add interrupt handler or not

v3: add ras_late_fini for cleanup all the ras fs node and remove
interrupt handler

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
a344db8e5e drm/amdgpu: poll ras_controller_irq and err_event_athub_irq status
For the hardware that can not enable BIF ring for IH cookies for both
ras_controller_irq and err_event_athub_irq, the driver has to poll the
status register in irq handling and ack the hardware properly when there
is interrupt triggered

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
4e644fffb5 drm/amdgpu: add ras_controller and err_event_athub interrupt support
Ras controller interrupt and Ras err event athub interrupt are two dedicated
interrupts for RAS support.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
4241863afc drm/amdgpu/nbio: add functions to query ras specific interrupt status
ras_controller_interrupt and err_event_interrupt are ras specific interrupts.
add functions to check their status and ack them if they are generated. both
funcitons should only be invoked in ISR when BIF ring is disabled or even not
initialized.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:03 -05:00
Hawking Zhang
bebc076285 drm/amdgpu: switch to new amdgpu_nbio structure
no functional change, just switch to new structures

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:03 -05:00
Hawking Zhang
078ef4e932 drm/amdgpu: add new amdgpu nbio header file
More nbio funcitonalities will be added and nbio could
be treated as an ip block like gfx/sdma.etc

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:03 -05:00
Gerd Hoffmann
e7bf74d0aa drm/amdgpu: switch to gem vma offset manager
Pass gem vma_offset_manager to ttm_bo_device_init(), so ttm uses it
instead of its own embedded struct.  This makes some gem functions
(specifically drm_gem_object_lookup) work on ttm objects.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190905070509.22407-6-kraxel@redhat.com
2019-09-11 08:03:24 +02:00
Gerd Hoffmann
9d6f4484e8 drm/ttm: turn ttm_bo_device.vma_manager into a pointer
Rename the embedded struct vma_offset_manager, new name is _vma_manager.
ttm_bo_device.vma_manager changed to a pointer.

The ttm_bo_device_init() function gets an additional vma_manager
argument which allows to initialize ttm with a different vma manager.
When passing NULL the embedded _vma_manager is used.

All callers are updated to pass NULL, so the behavior doesn't change.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190905070509.22407-2-kraxel@redhat.com
2019-09-11 08:03:23 +02:00
Dave Airlie
9ed45a209a Merge tag 'drm-next-5.4-2019-08-30' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.4-2019-08-30:

amdgpu:
- Add DC support for Renoir
- Add some GPUVM hw bug workarounds
- add support for the smu11 i2c controller
- GPU reset vram lost bug fixes
- Navi1x powergating fixes
- Navi12 power fixes
- Renoir power fixes
- Misc bug fixes and cleanups

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830212650.5055-1-alexander.deucher@amd.com
2019-09-06 16:40:28 +10:00
Petr Cvek
20c14ee135 drm/amdgpu: Fix undefined dm_ip_block for navi12
There is missing "if defined" CONFIG_DRM_AMD_DC block for non DC
configurations. This will cause link error. The patch is fixing that.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=110979
Signed-off-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-30 15:37:17 -05:00
Aaron Liu
537e3bbfee drm/amdgpu: fix no interrupt issue for renoir emu (v2)
In renoir's vega10_ih model, there's a security change in mmIH_CHICKEN
register, that limits IH to use physical address (FBPA, GPA) directly.
Those chicken bits need to be programmed first.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-30 15:37:17 -05:00
Andrey Grodzovsky
0b2d2c2eec drm/amdgpu: Handle job is NULL use case in amdgpu_device_gpu_recover
This should be checked at all places job is accessed.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-30 15:02:39 -05:00
Roman Li
e1c14c4339 drm/amdgpu: Enable DC on Renoir
Enable DC support for renoir.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:34 -05:00
Prike Liang
334ffd0daa drm/amdgpu: Initialize and update SDMA power gating
Init SDMA HW base configuration and enable idle INT for rn.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00