Commit Graph

7400 Commits

Author SHA1 Message Date
Felix Kuehling
9f890f3044 drm/amdgpu: Optimize KFD page table reservation
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.

Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU:  1GB
New page table reservation per GPU: 32MB

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Felix Kuehling
b72ff1909c drm/amdgpu: Raise KFD unpinned system memory limit
Allow KFD applications to use more unpinned system memory through
HMM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Felix Kuehling
29a39c90ba drm/amdgpu: Optimize KFD page table reservation
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.

Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU:  1GB
New page table reservation per GPU: 32MB

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 12:24:07 -05:00
Alex Deucher
dea8b90029 drm/amdgpu: flag vram lost on baco reset for VI/CIK
VI/CIK BACO was inflight when this fix landed for SOC15/NV.
Add the fix to VI/CIK as well.

Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 12:22:31 -05:00
John Clements
5985ebbe78 drm/amdgpu: Resolved offchip EEPROM I/O issue
Updated target I2C address

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 12:20:22 -05:00
Nathan Chancellor
a63141e317 drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG
Commit b0f3cd3191 ("drm/amdgpu: remove unnecessary JPEG2.0 code from
VCN2.0") introduced a new clang warning in the vcn_v2_0_stop function:

../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: warning: variable 'r'
is used uninitialized whenever 'while' loop exits because its condition
is false [-Wsometimes-uninitialized]
        SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
                while ((tmp_ & (mask)) != (expected_value)) {   \
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1083:6: note: uninitialized use
occurs here
        if (r)
            ^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: note: remove the
condition if it is always true
        SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
        ^
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
                while ((tmp_ & (mask)) != (expected_value)) {   \
                       ^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1072:7: note: initialize the
variable 'r' to silence this warning
        int r;
             ^
              = 0
1 warning generated.

To prevent warnings like this from happening in the future, make the
SOC15_WAIT_ON_RREG macro initialize its ret variable before the while
loop that can time out. This macro's return value is always checked so
it should set ret in both the success and fail path.

Link: https://github.com/ClangBuiltLinux/linux/issues/776
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 09:48:57 -05:00
John Clements
8633f126bf drm/amdgpu: Resolved offchip EEPROM I/O issue
Updated target I2C address

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-25 11:20:10 -05:00
Oak Zeng
3d3f9ba8c4 drm/amdgpu: Apply noretry setting for mmhub9.4
Config the translation retry behavior according to noretry
kernel parameter

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-25 11:19:55 -05:00
Jason Gunthorpe
81fa1af31b drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
Convert the collision-retry lock around hmm_range_fault to use the one now
provided by the mmu_interval notifier.

Although this driver does not seem to use the collision retry lock that
hmm provides correctly, it can still be converted over to use the
mmu_interval_notifier api instead of hmm_mirror without too much trouble.

This also deletes another place where a driver is associating additional
data (struct amdgpu_mn) with a mmu_struct.

Link: https://lore.kernel.org/r/20191112202231.3856-13-jgg@ziepe.ca
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23 19:56:45 -04:00
Jason Gunthorpe
62914a99de drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
Remove the interval tree in the driver and rely on the tree maintained by
the mmu_notifier for delivering mmu_notifier invalidation callbacks.

For some reason amdgpu has a very complicated arrangement where it tries
to prevent duplicate entries in the interval_tree, this is not necessary,
each amdgpu_bo can be its own stand alone entry. interval_tree already
allows duplicates and overlaps in the tree.

Also, there is no need to remove entries upon a release callback, the
mmu_interval API safely allows objects to remain registered beyond the
lifetime of the mm. The driver only has to stop touching the pages during
release.

Link: https://lore.kernel.org/r/20191112202231.3856-12-jgg@ziepe.ca
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23 19:56:45 -04:00
Jason Gunthorpe
a9ae8731e6 drm/amdgpu: Call find_vma under mmap_sem
find_vma() must be called under the mmap_sem, reorganize this code to
do the vma check after entering the lock.

Further, fix the unlocked use of struct task_struct's mm, instead use
the mm from hmm_mirror which has an active mm_grab. Also the mm_grab
must be converted to a mm_get before acquiring mmap_sem or calling
find_vma().

Fixes: 66c45500bf ("drm/amdgpu: use new HMM APIs and helpers")
Fixes: 0919195f2b ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads")
Link: https://lore.kernel.org/r/20191112202231.3856-11-jgg@ziepe.ca
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23 19:56:44 -04:00
changzhu
f920d1bb9c drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.

After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-22 14:55:19 -05:00
changzhu
6c2c897237 drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
SW must acquire/release one of the vm_invalidate_eng*_sem around the
invalidation req/ack. Through this way,it can avoid losing invalidate
acknowledge state across power-gating off cycle.
To use vm_invalidate_eng*_sem, it needs to initialize
vm_invalidate_eng*_sem firstly.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-22 14:55:19 -05:00
Jack Zhang
1b34de7c3f drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
After rlcg fw 2.1, kmd driver starts to load extra fw for
LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw
because all rlcg related fw have been loaded by host driver.
Guest driver would load the three fw fail without this change.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:19 -05:00
Jack Zhang
ef1c0cbcd1 drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF
Currently the three features haven't been enabled at SRIOV, it would
trigger guest driver load fail with the bare-metal path of the three
features.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:19 -05:00
Xiaojie Yuan
210b3b3c75 drm/amdgpu/gfx10: re-init clear state buffer after gpu reset
This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x.

clear state buffer (resides in vram) is corrupted after 1st baco reset,
upon gfxoff exit, CPF gets garbage header in CSIB and hangs.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-22 14:55:12 -05:00
Stephen Rothwell
a3511321fd merge fix for "ftrace: Rework event_create_dir()"
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:12 -05:00
Jay Cornwall
57fb0ab2f1 drm/amdgpu: Update Arcturus golden registers
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:12 -05:00
Xiaojie Yuan
908a28be09 drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access
Fixes: 0900a9efdb ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)")
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:55:12 -05:00
Xiaojie Yuan
1e902a6d32 drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt
50us is not enough to wait for cp ready after gpu reset on some navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-22 14:54:56 -05:00
Alex Deucher
5e18d2b14c Revert "drm/amd/display: enable S/G for RAVEN chip"
This reverts commit 1c42591591.

S/G display is not stable with the IOMMU enabled on some
platforms.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:11 -05:00
Alex Deucher
8fc4134413 drm/amdgpu: disable gfxoff on original raven
There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:11 -05:00
Alex Deucher
5355d7e054 drm/amdgpu: remove experimental flag for Navi14
5.4 and newer works fine with navi14.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Alex Deucher
70f7eb639e drm/amdgpu: disable gfxoff when using register read interface
When gfxoff is enabled, accessing gfx registers via MMIO
can lead to a hang.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497
Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Sam Bobroff
3d0e3ce52c drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
The INTERRUPT_CNTL2 register expects a valid DMA address, but is
currently set with a GPU MC address.  This can cause problems on
systems that detect the resulting DMA read from an invalid address
(found on a Power8 guest).

Instead, use the DMA address of the dummy page because it will always
be safe.

Fixes: 27ae10641e ("drm/amdgpu: add interupt handler implementation for si v3")
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Alex Deucher
f8a69a8022 drm/amdgpu/nv: add asic func for fetching vbios from rom directly
Needed as a fallback if the vbios can't be fetched by other means.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Yintian Tao
c0e21ea1d0 drm/amdgpu: put flush_delayed_work at first
There is one regression from 042f3d7b745cd76aa
To put flush_delayed_work after adev->shutdown = true
which will make amdgpu_ih_process not response the irq
At last, all ib ring tests will be failed just like below

[drm] amdgpu: finishing device.
[drm] Fence fallback timer expired on ring gfx
[drm] Fence fallback timer expired on ring comp_1.0.0
[drm] Fence fallback timer expired on ring comp_1.1.0
[drm] Fence fallback timer expired on ring comp_1.2.0
[drm] Fence fallback timer expired on ring comp_1.3.0
[drm] Fence fallback timer expired on ring comp_1.0.1
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110).
[drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).

v2: replace cancel_delayed_work_sync() with flush_delayed_work()

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Leo Liu
4e20f6550b drm/amdgpu/vcn2.5: fix the enc loop with hw fini
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
Xiaojie Yuan
0900a9efdb drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)
1. no need to allocate an extra member for 'mqd_backup' array
2. backup/restore mqd to/from the correct 'mqd_backup' array slot

v2: warning fix (Alex)

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:35:10 -05:00
zhengbin
e9c5dbc1a2 drm/amdgpu: Use ARRAY_SIZE for sos_old_versions
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/psp_v3_1.c:182:40-41: WARNING: Use ARRAY_SIZE

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
changzhu
4ed8a03740 drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.

After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
changzhu
dab5ef2722 drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
SW must acquire/release one of the vm_invalidate_eng*_sem around the
invalidation req/ack. Through this way,it can avoid losing invalidate
acknowledge state across power-gating off cycle.
To use vm_invalidate_eng*_sem, it needs to initialize
vm_invalidate_eng*_sem firstly.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Jack Zhang
c348ad46b0 drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
After rlcg fw 2.1, kmd driver starts to load extra fw for
LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw
because all rlcg related fw have been loaded by host driver.
Guest driver would load the three fw fail without this change.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Jack Zhang
edc2176d51 drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF
Currently the three features haven't been enabled at SRIOV, it would
trigger guest driver load fail with the bare-metal path of the three
features.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Xiaojie Yuan
c25edaaf75 drm/amdgpu/gfx10: re-init clear state buffer after gpu reset
This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x.

clear state buffer (resides in vram) is corrupted after 1st baco reset,
upon gfxoff exit, CPF gets garbage header in CSIB and hangs.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Stephen Rothwell
6a93b58e5f merge fix for "ftrace: Rework event_create_dir()"
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Colin Ian King
2e77541bd1 drm/amdgpu: remove redundant assignment to pointer write_frame
The pointer write_frame is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Alex Deucher
562b49fcd0 drm/amdgpu: simplify runtime suspend
In the standard _PR3 case, the pci core handles the pci state.
The driver only needs to handle it in the legacy ATPX case.

This may fix issues with runtime suspend/resume on certain
hybrid graphics laptops.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Jay Cornwall
6e04b2248d drm/amdgpu: Update Arcturus golden registers
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Dennis Li
f6c3623b7b drm/amdgpu: implement querying ras error count for mmhub9.4
Get mmhub error counter by accessing EDC_CNT registers.

v2: Add mmhub_v9_4_ prefix for local static variable and function

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:27:11 -05:00
Dennis Li
8781e5df11 drm/amdgpu: refine query function of mmhub EDC counter in vg20
Add codes to print the detail EDC info for the subblock of mmhub

v2: Move the EDC_CNT registers' defintion from mmhub_9_4 header
files to mmhub_1_0 ones. Add mmhub_v1_0_ prefix for the local
static variable and function.

v3: squash in DC fix

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:26:46 -05:00
Dennis Li
46f719696e drm/amdgpu: define soc15_ras_field_entry for reuse
The struct soc15_ras_field_entry will be reused by
other IPs, such as mmhub and gc

v2: rename ras_subblock_regs to gc_ras_fields_vg20,
because the future asic maybe have a different table.

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:23:09 -05:00
Xiaojie Yuan
d98a07aea6 drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access
Fixes: 0bb419c76b ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)")
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:22:56 -05:00
Xiaojie Yuan
387d40fd6f drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt
50us is not enough to wait for cp ready after gpu reset on some navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:20:30 -05:00
Alex Sierra
b4672c8a84 amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
Only for the debugger use case.

[why]
Avoid endless translation retries, after an invalid address access has
been issued to the GPU. Instead, the trap handler is forced to enter by
generating a no-retry-fault.
A s_trap instruction is inserted in the debugger case to let the wave to
enter trap handler to save context.

[how]
Intentionally using an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault, after a retry-fault happens. This is
only valid under compute context.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:20:23 -05:00
Alex Sierra
f43ef951f6 drm/amdgpu: add flag to indicate amdgpu vm context
Flag added to indicate if the amdgpu vm context is used for compute or
graphics.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-22 14:20:07 -05:00
Frederick Lawler
88027c89ea drm/amdgpu: Prefer pcie_capability_read_word()
Commit 8c0d3a02c1 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.

Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().

[bhelgaas: fix a couple remaining instances in cik.c]
Link: https://lore.kernel.org/r/20191118003513.10852-1-fred@fredlawl.com
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-21 11:15:49 -06:00
Bjorn Helgaas
35e768e296 drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
Replace hard-coded magic numbers with the descriptive PCI_EXP_LNKCTL2
definitions.  No functional change intended.

Link: https://lore.kernel.org/r/20191112173503.176611-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-21 07:52:34 -06:00
Bjorn Helgaas
19d7a95a8b drm/amdgpu: Correct Transmit Margin masks
Previously we masked PCIe Link Control 2 register values with "7 << 9",
which was apparently intended to be the Transmit Margin field, but instead
was the high order bit of Transmit Margin, the Enter Modified Compliance
bit, and the Compliance SOS bit.

Correct the mask to "7 << 7", which is the Transmit Margin field.

Link: https://lore.kernel.org/r/20191112173503.176611-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-21 07:52:34 -06:00
Dave Airlie
c22fe762ba Merge tag 'drm-next-5.5-2019-11-15' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-15:

amdgpu:
- Fix AVFS handling on SMU7 parts with custom power tables
- Enable Overdrive sysfs interface for Navi parts
- Fix power limit handling on smu11 parts
- Fix pcie link sysfs output for Navi
- Probably cancel MM worker threads on shutdown

radeon:
- Cleanup for ppc change

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115163516.3714-1-alexander.deucher@amd.com
2019-11-21 08:52:27 +10:00
Alex Deucher
72f058b723 drm/amdgpu: enable runtime pm on BACO capable boards if runpm=1
BACO - Bus Active, Chip Off

Everything is in place now.  Not enabled by default yet.  You
still have to specify runpm=1.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:54 -05:00
Alex Deucher
3840c5bcc2 drm/amdgpu: disentangle runtime pm and vga_switcheroo
Originally we only supported runtime pm on PX/HG laptops
so vga_switcheroo and runtime pm are sort of entangled.

Attempt to logically separate them.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:53 -05:00
Alex Deucher
6ae6c7d404 drm/amdgpu: start to disentangle boco from runtime pm
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

We originally only supported runtime pm on PX/HG
laptops so most of the runtime pm code looks for this.
Add a new flag to check for runtime pm enablement and
use this rather than checking for PX/HG.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:53 -05:00
Alex Deucher
1913431728 drm/amdgpu: add baco support to runtime suspend/resume
BACO - Bus Active, Chip Off

This adds the necessary support to the runtime suspend
and resume functions to handle boards that support
baco.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:52 -05:00
Alex Deucher
361dbd01a1 drm/amdgpu: add helpers for baco entry and exit
BACO - Bus Active, Chip Off

Will be used for runtime pm.  Entry will enter the BACO
state (chip off).  Exit will exit the BACO state (chip on).

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:52 -05:00
Alex Deucher
11520f2708 drm/amdgpu: split swSMU baco_reset into enter and exit
BACO - Bus Active, Chip Off

So we can use it for power savings rather than just reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:51 -05:00
Alex Deucher
b97e9d47e5 drm/amdgpu: add additional boco checks to runtime suspend/resume (v2)
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

We will take slightly different paths for boco and baco.

v2: fold together two consecutive if clauses

Reviewed-by: Evan Quan <evan.quan@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:50 -05:00
Alex Deucher
31af062acf drm/amdgpu: rename amdgpu_device_is_px to amdgpu_device_supports_boco (v2)
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

To better match what we are checking for and to align with
amdgpu_device_supports_baco.

BOCO is used on PowerXpress/Hybrid Graphics systems and BACO
is used on desktop dGPU boards.

v2: fix typo in documentation

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:50 -05:00
Alex Deucher
a69cba42b1 drm/amdgpu: add a amdgpu_device_supports_baco helper
BACO - Bus Active, Chip Off

To check if a device supports BACO or not.  This will be
used in determining when to enable runtime pm.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:49 -05:00
Alex Deucher
ac7426169e drm/amdgpu: add supports_baco callback for NV asics.
BACO - Bus Active, Chip Off

Check the BACO capabilities from the powerplay table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:48 -05:00
Alex Deucher
e45ed9435f drm/amdgpu: add supports_baco callback for VI asics.
BACO - Bus Active, Chip Off

Check the BACO capabilities from the powerplay table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:48 -05:00
Alex Deucher
0d0c07ee07 drm/amdgpu: add supports_baco callback for CIK asics.
BACO - Bus Active, Chip Off

Check the BACO capabilities from the powerplay table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:47 -05:00
Alex Deucher
3670c242e3 drm/amdgpu: add supports_baco callback for SI asics.
BACO - Bus Active, Chip Off

Not supported.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:47 -05:00
Alex Deucher
988eb9ff3e drm/amdgpu: add supports_baco callback for soc15 asics. (v2)
BACO - Bus Active, Chip Off

Check the BACO capabilities from the powerplay table.

v2: drop unrelated struct cleanup

Reviewed-by: Evan Quan <evan.quan@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:46 -05:00
Alex Deucher
69d5436d4d drm/amdgpu: add asic callback for BACO support
BACO - Bus Active, Chip Off

Used to check whether the device supports BACO.  This will
be used to enable runtime pm on devices which support BACO.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:46 -05:00
Hawking Zhang
858a2bbad6 drm/amdgpu: pull ras controller int status only when ras enabled
ras_controller_irq and athub_err_event_irq are only registered
when PCIE_BIF ras is marked as supported. as the result, the driver
also just need pull the int status in such case.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:09:23 -05:00
Hawking Zhang
5bdd0b72d6 drm/amdgpu: switch to common helper func for psp cmd submission
Drop all the IP specific cmd_submit callback function
and use the common helper instead

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:09:13 -05:00
Hawking Zhang
cc65176e51 drm/amdgpu: add helper func for psp ring cmd submission
Except for ring wptr update, the psp ring cmd submission
function shouldn't be IP specific one. Create a common
helper function to be shared for all the ASICs.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:09:06 -05:00
Hawking Zhang
13a390a6f9 drm/amdgpu: add psp funcs for ring write pointer read/write
The ring write pointer regsiter update is the only part that
is IP specific ones in psp_cmd_submit function.

Add two callbacks for wptr read/write so that we unify the
psp_cmd_submit function for all the ASICs.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:08:58 -05:00
Evan Quan
0a650c1d35 drm/amd/powerplay: add Arcturus baco reset support
Enable baco reset support on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:08:38 -05:00
Alex Deucher
2aa87ba568 Revert "drm/amd/display: enable S/G for RAVEN chip"
This reverts commit 1c42591591.

S/G display is not stable with the IOMMU enabled on some
platforms.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Alex Deucher
3f2a06ac81 drm/amdgpu: disable gfxoff on original raven
There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Alex Deucher
b62d955426 drm/amdgpu: remove experimental flag for Navi14
5.4 and newer works fine with navi14.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Alex Deucher
ca9317b918 drm/amdgpu: disable gfxoff when using register read interface
When gfxoff is enabled, accessing gfx registers via MMIO
can lead to a hang.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497
Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
zhengbin
1664194925 drm/amdgpu: remove not needed memset
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c:64:13-31: WARNING: dma_alloc_coherent use in ih -> ring already zeroes out memory,  so memset is not needed

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Sam Bobroff
b992691d45 drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
The INTERRUPT_CNTL2 register expects a valid DMA address, but is
currently set with a GPU MC address.  This can cause problems on
systems that detect the resulting DMA read from an invalid address
(found on a Power8 guest).

Instead, use the DMA address of the dummy page because it will always
be safe.

Fixes: 27ae10641e ("drm/amdgpu: add interupt handler implementation for si v3")
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:54 -05:00
Alex Deucher
29bc37b410 drm/amdgpu/nv: add asic func for fetching vbios from rom directly
Needed as a fallback if the vbios can't be fetched by other means.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Alex Deucher
761e09230c drm/amdgpu/soc15: move struct definition around to align with other soc15 asics
Move reset_method next to reset callback to match the struct layout and
the other definition in this file.

Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Yintian Tao
d0d13fe874 drm/amdgpu: put flush_delayed_work at first
There is one regression from 042f3d7b745cd76aa
To put flush_delayed_work after adev->shutdown = true
which will make amdgpu_ih_process not response the irq
At last, all ib ring tests will be failed just like below

[drm] amdgpu: finishing device.
[drm] Fence fallback timer expired on ring gfx
[drm] Fence fallback timer expired on ring comp_1.0.0
[drm] Fence fallback timer expired on ring comp_1.1.0
[drm] Fence fallback timer expired on ring comp_1.2.0
[drm] Fence fallback timer expired on ring comp_1.3.0
[drm] Fence fallback timer expired on ring comp_1.0.1
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110).
[drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).

v2: replace cancel_delayed_work_sync() with flush_delayed_work()

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Leo Liu
4effa8dbc1 drm/amdgpu/vcn2.5: fix the enc loop with hw fini
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:51 -05:00
Leo Liu
8c74e59049 drm/amdgpu: enable Arcturus JPEG2.5 block
It also doen't care about FW loading type, so enabling it directly.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
e89e2237e8 drm/amdgpu: enable Arcturus CG for VCN and JPEG blocks
Arcturus VCN and JPEG only got CG support, and no PG support

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
14f43e8f88 drm/amdgpu: move JPEG2.5 out from VCN2.5
And clean up the duplicated stuff

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
5be45a26c9 drm/amdgpu: enable JPEG2.0 for Navi1x and Renoir
By adding JPEG IP block to the family

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
52f2e779ad drm/amdgpu: add driver support for JPEG2.0 and above
By using JPEG IP block type

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
474b6d296f drm/amdgpu: enable JPEG2.0 dpm
By using its own enabling function

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
099d66e43f drm/amdgpu: add PG and CG for JPEG2.0
And enable them for Navi1x and Renoir

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
b0f3cd3191 drm/amdgpu: remove unnecessary JPEG2.0 code from VCN2.0
They are no longer needed, using from JPEG2.0 instead.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
6ac2724110 drm/amdgpu: add JPEG v2.0 function supports
It got separated from VCN2.0 with a new jpeg_v2_0_ip_block

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
2eb167293f drm/amdgpu: add JPEG common functions to amdgpu_jpeg
They will be used for JPEG2.0 and later.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:50 -05:00
Leo Liu
0388aee766 drm/amdgpu: use the JPEG structure for general driver support
JPEG1.0 will be functional along with VCN1.0

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Leo Liu
bb0db70f3f drm/amdgpu: separate JPEG1.0 code out from VCN1.0
For VCN1.0, the separation is just in code wise, JPEG1.0 HW is still
included in the VCN1.0 HW.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Leo Liu
9d9cc9b8fe drm/amdgpu: add amdgpu_jpeg and JPEG tests
It will be used for all versions of JPEG eventually. Previous
JPEG tests will be removed later since they are still used by
JPEG2.x.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Leo Liu
88a1c40a04 drm/amdgpu: add JPEG HW IP and SW structures
It will be used for JPEG IP 1.0, 2.0, 2.5 and later.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Xiaojie Yuan
0bb419c76b drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)
1. no need to allocate an extra member for 'mqd_backup' array
2. backup/restore mqd to/from the correct 'mqd_backup' array slot

v2: warning fix (Alex)

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Hawking Zhang
9e612c11a7 drm/amdgpu: init umc functions for arcturus umc ras
reuse vg20 umc functions for arcturus umc ras

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 09:47:36 -05:00
Hawking Zhang
baaeb610b1 drm/amdgpu: enable ras capablity check on arcturus
check hw ras capablity via atomfirmware

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 09:47:30 -05:00
Jani Nikula
f139a62c7a drm/amdgpu: use drm_debug_enabled() to check for debug categories
Allow better abstraction of the drm_debug global variable in the
future. No functional changes.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Sean Paul <sean@poorly.run>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e30ddbd627be1fe1c67dd007e0d36f81a094bc91.1572258936.git.jani.nikula@intel.com
2019-11-14 14:08:39 +02:00
Dave Airlie
0990ca235d Merge tag 'drm-next-5.5-2019-11-08' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-08:

amdgpu:
- Enable VCN dynamic powergating on RV/RV2
- Fixes for Navi14
- Misc Navi fixes
- Fix MSI-X tear down
- Misc Arturus fixes
- Fix xgmi powerstate handling
- Documenation fixes

scheduler:
- Fix static code checker warning
- Fix possible thread reactivation while thread is stopped
- Avoid cleanup if thread is parked

radeon:
- SI dpm fix ported from amdgpu

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108212713.5078-1-alexander.deucher@amd.com
2019-11-14 08:43:07 +10:00
yu kuai
9e089a29c6 drm/amdgpu: remove set but not used variable 'invalid'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c: In function
‘amdgpu_amdkfd_evict_userptr’:
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:1665:6: warning:
variable ‘invalid’ set but not used [-Wunused-but-set-variable]

'invalid' is never used, so can be removed. Thus 'atomic_inc_return'
can be replaced as 'atomic_inc'

Fixes: 5ae0283e83 ("drm/amdgpu: Add userptr support for KFD")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:46 -05:00
yu kuai
4f2922d12d drm/amdgpu: remove set but not used variable 'amdgpu_connector'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_display.c: In function
‘amdgpu_display_crtc_scaling_mode_fixup’:
drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:693:27: warning: variable
‘amdgpu_connector’ set but not used [-Wunused-but-set-variable]

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:46 -05:00
yu kuai
747a397d39 drm/amdgpu: remove set but not used variable 'mc_shared_chmap' from 'gfx_v6_0.c' and 'gfx_v7_0.c'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c: In function
‘gfx_v6_0_constants_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c:1579:6: warning: variable
‘mc_shared_chmap’ set but not used [-Wunused-but-set-variable]

drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c: In function
‘gfx_v7_0_gpu_early_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c:4262:6: warning: variable
‘mc_shared_chmap’ set but not used [-Wunused-but-set-variable]

Fixes: 2cd46ad223 ("drm/amdgpu: add graphic pipeline implementation for si v8")
Fixes: d93f3ca706 ("drm/amdgpu/gfx7: rework gpu_init()")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:46 -05:00
Colin Ian King
d785476c60 drm/amd/display: remove duplicated assignment to grph_obj_type
Variable grph_obj_type is being assigned twice, one of these is
redundant so remove it.

Addresses-Coverity: ("Evaluation order violation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
e98042db2c drm/amdgpu: remove set but not used variable 'mc_shared_chmap'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function
‘gfx_v8_0_gpu_early_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:1713:6: warning: variable
‘mc_shared_chmap’ set but not used [-Wunused-but-set-variable]

Fixes: 0bde3a95ea ("drm/amdgpu: split gfx8 gpu init into sw and hw parts")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
220ac8d144 drm/amdgpu: remove always false comparison in 'amdgpu_atombios_i2c_process_i2c_ch'
Fixes gcc '-Wtype-limits' warning:

drivers/gpu/drm/amd/amdgpu/atombios_i2c.c: In function
‘amdgpu_atombios_i2c_process_i2c_ch’:
drivers/gpu/drm/amd/amdgpu/atombios_i2c.c:79:11: warning: comparison is
always false due to limited range of data type [-Wtype-limits]

'num' is 'u8', so it will never be greater than 'TOM_MAX_HW_I2C_READ',
which is defined as 255. Therefore, the comparison can be removed.

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
d1d09dc417 drm/amdgpu: remove set but not used variable 'dig'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/atombios_dp.c: In function
‘amdgpu_atombios_dp_link_train’:
drivers/gpu/drm/amd/amdgpu/atombios_dp.c:716:34: warning: variable ‘dig’
set but not used [-Wunused-but-set-variable]

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
5bea7fedb7 drm/amdgpu: remove set but not used variable 'dig_connector'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/atombios_dp.c: In function
‘amdgpu_atombios_dp_get_panel_mode’:
drivers/gpu/drm/amd/amdgpu/atombios_dp.c:364:36: warning: variable
‘dig_connector’ set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
e8b74035d8 drm/amdgpu: add function parameter description in 'amdgpu_gart_bind'
Fixes gcc warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:313: warning: Function
parameter or member 'flags' not described in 'amdgpu_gart_bind'

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
b8b7213057 drm/amdgpu: add function parameter description in 'amdgpu_device_set_cg_state'
Fixes gcc warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1954: warning: Function
parameter or member 'state' not described in 'amdgpu_device_set_cg_state'

Fixes: e3ecdffac9 ("drm/amdgpu: add documentation for amdgpu_device.c")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
yu kuai
bae028e3e5 drm/amdgpu: remove 4 set but not used variable in amdgpu_atombios_get_connector_info_from_object_table
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c: In function
'amdgpu_atombios_get_connector_info_from_object_table':
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:376:26: warning: variable
'grph_obj_num' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:376:13: warning: variable
'grph_obj_id' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:341:37: warning: variable
'con_obj_type' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:341:24: warning: variable
'con_obj_num' set but not used [-Wunused-but-set-variable]

They are never used, so can be removed.

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha
b86a1aa36a drm/amd/display: rename DCN1_0 kconfig to DCN
Since dcn20 and dcn21 are under dcn1 it doesnt make sense to
have it named dcn1.

Change it to "dcn" to make it generic

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha
aca935c7cc drm/amd/display: Drop CONFIG_DRM_AMD_DC_DCN2_1 flag
[Why]

DCN21 is stable enough to be build by default. So drop the flags.

[How]

Remove them using the unifdef tool. The following commands were executed
in sequence:

$ find -name '*.c' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DCN2_1 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_1 '{}' ';'
$ find -name '*.h' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DCN2_1 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_1 '{}' ';'

In addition:

* Remove from kconfig, and replace any dependencies with DCN1_0.
* Remove from any makefiles.
* Fix and cleanup Renoir definitions in dal_asic_id.h
* Expand DCN1 ifdef to include DCN21 code in the following files:
    * clk_mgr/clk_mgr.c: dc_clk_mgr_create()
    * core/dc_resources.c: dc_create_resource_pool()
    * gpio/hw_factory.c: dal_hw_factory_init()
    * gpio/hw_translate.c: dal_hw_translate_init()

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha
1da37801a8 drm/amd/display: Drop CONFIG_DRM_AMD_DC_DCN2_0 and DSC_SUPPORTED
[Why]

DCN2 and DSC are stable enough to be build by default. So drop the flags.

[How]

Remove them using the unifdef tool. The following commands were executed
in sequence:

$ find -name '*.c' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DSC_SUPPORT -DCONFIG_DRM_AMD_DC_DCN2_0 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_0 '{}' ';'
$ find -name '*.h' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DSC_SUPPORT -DCONFIG_DRM_AMD_DC_DCN2_0 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_0 '{}' ';'

In addition:

* Remove from kconfig, and replace any dependencies with DCN1_0.
* Remove from any makefiles.
* Fix and cleanup NV defninitions in dal_asic_id.h
* Expand DCN1 ifdef to include DCN2 code in the following files:
    * clk_mgr/clk_mgr.c: dc_clk_mgr_create()
    * core/dc_resources.c: dc_create_resource_pool()
    * dce/dce_dmcu.c: dcn20_*lock_phy()
    * dce/dce_dmcu.c: dcn20_funcs
    * dce/dce_dmcu.c: dcn20_dmcu_create()
    * gpio/hw_factory.c: dal_hw_factory_init()
    * gpio/hw_translate.c: dal_hw_translate_init()

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:44 -05:00
Alex Deucher
622b2a0ab6 drm/amdgpu/vcn: finish delay work before release resources
flush/cancel delayed works before doing finalization
to avoid concurrently requests.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:42 -05:00
Nicholas Kazlauskas
976e51a7c0 drm/amdgpu: Add DMCUB to firmware query interface
The DMCUB firmware version can be read using the AMDGPU_INFO ioctl
or the amdgpu_firmware_info debugfs entry.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:42 -05:00
Nicholas Kazlauskas
2bd2a27ffc drm/amdgpu: Add PSP loading support for DMCUB ucode
DMCUB ucode requires secure loading through PSP. This is already
supported in PSP as GFX_FW_TYPE_DMUB, it just needs to be mapped from
AMDGPU_UCODE_ID_DMCUB to GFX_FW_TYPE_DMUB.

DMUB is a shorthand name for DMCUB and can be used interchangeably.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:42 -05:00
Nicholas Kazlauskas
02350f0bdf drm/amdgpu: Add ucode support for DMCUB
The DMCUB is a secondary DMCU (Display MicroController Unit) that has
its own separate firmware. It's required for DMCU support on Renoir.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13 15:29:42 -05:00
Dave Airlie
77e0723bd2 Linux 5.4-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3IqJQeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOiUH+gOEDwid5OODaFAd
 CggXugdFIlBZefKqGVNW5sjgX8pxFWHXuEMC8iNb6QXtQZdFrI6LFf9hhUDmzQtm
 6y1LPxxEiTZjObMEsBNylb7tyzgujFHcAlp0Zro3w/HLCqmYTSP3FF46i2u6KZfL
 XhkpM4X7R7qxlfpdhlfESv/ElRGocZe6SwXfC7pcPo5flFcmkdu9ijqhNd/6CZ/h
 Nf9rTsD/wEDVUelFbgVN+LJzlaB0tsyc4Zbof07n8OsFZjhdEOop8gfM/kTBLcyY
 6bh66SfDScdsNnC/l8csbPjSZRx+i+nQs67DyhGNnsSAFgHBZdC4Tb/2mDCwhCLR
 dUvuYZc=
 =1N6F
 -----END PGP SIGNATURE-----

Merge v5.4-rc7 into drm-next

We have the i915 security fixes to backmerge, but first
let's clear the decks for other drivers to avoid a bigger
mess.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-11-14 05:53:10 +10:00
Jesse Zhang
9f87516764 drm/amd/amdgpu: finish delay works before release resources
flush/cancel delayed works before doing finalization
to avoid concurrently requests.

Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-11 17:38:13 -05:00
Hawking Zhang
51bd363857 drm/amdgpu: avoid upload corrupted ta ucode to psp
xgmi, ras, hdcp and dtm ta are actually separated ucode and
need to handled case by case to upload to psp.

We support the case that ta binary have one or multiple of
them built-in. As a result, the driver should check each ta
binariy's availablity before decide to upload them to psp.

In the terminate (unload) case, the driver will check the
context readiness before perform unload activity. It's fine
to keep it as is.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-11 17:38:13 -05:00
changzhu
eebc7f4d7f drm/amdgpu: allow direct upload save restore list for raven2
It will cause modprobe atombios stuck problem in raven2 if it doesn't
allow direct upload save restore list from gfx driver.
So it needs to allow direct upload save restore list for raven2
temporarily.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-08 12:29:52 -05:00
Andrey Grodzovsky
a28fda312a drm/amdgpu: Avoid accidental thread reactivation.
Problem:
During GPU reset we call the GPU scheduler to suspend it's
thread, those two functions in amdgpu also suspend and resume
the sceduler for their needs but this can collide with GPU
reset in progress and accidently restart a suspended thread
before time.

Fix:
Serialize with GPU reset.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Andrey Grodzovsky
7c55adb0a9 Revert "drm/amdgpu: dont schedule jobs while in reset"
This reverts commit 89b3d86403.

We will do a proper fix in next patch.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Jonathan Kim
cb5932f866 drm/amdgpu: fix vega20 pstate status change
vega20 only requires all devices be set to same pstate level for low
pstate and not high.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Evan Quan <Evan.Quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Kevin Wang
2af8153126 drm/amdgpu: fix sysfs interface pcie_replay_count error on navi asic
the asic callback function of get_pcie_replay_count is not implement on navi asic,
it will cause null pinter error when read this interface.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Emily Deng
e31dcdcfab drm/amdgpu: Need to disable msix when unloading driver
For driver reload test, it will report "can't enable
MSI (MSI-X already enabled)".

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Oak Zeng
f6baa07497 drm/amdgpu: Add comments to gmc structure
Explain fields like aper_base, agp_start etc. The definition
of those fields are confusing as they are from different view
(CPU or GPU). Add comments for easier understand.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <Alex.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Alex Deucher
77a3160221 drm/amdgpu/renoir: move gfxoff handling into gfx9 module
To properly handle the option parsing ordering.

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
changzhu
440a7a54e7 drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9
It needs to add warning to update firmware in gfx9
in case that firmware is too old to have function to
realize dummy read in cp firmware.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
changzhu
589b64a7e3 drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface.  This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.

For cp ucode, it has realized dummy read in cp firmware.It covers
the use of WAIT_REG_MEM operation 1 case only.So it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to
update firmware in case firmware is too old to have function to realize
dummy read in cp firmware.

For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is
moved to gfxhub in gfx10. So it needs to add dummy read in driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
Evan Quan
6a299d7aaa drm/amdgpu: register gpu instance before fan boost feature enablment
Otherwise, the feature enablement will be skipped due to wrong count.

Fixes: beff74bc6e ("drm/amdgpu: fix a race in GPU reset with IB test (v2)")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
Alex Deucher
ef177d11d6 drm/amdgpu: Improve RAS documentation (v2)
Clarify some areas, clean up formatting, add section for
unrecoverable error handling.

v2: fix grammatical errors

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Alex Deucher
ad4d81dc57 drm/amdgpu/renoir: move gfxoff handling into gfx9 module
To properly handle the option parsing ordering.

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Pan Bian
365f7f8db8 drm/amdgpu: fix double reference dropping
The reference to object fence is dropped at the end of the loop.
However, it is dropped again outside the loop. The reference can be
dropped immediately after calling dma_fence_wait() in the loop and
thus the dropping operation outside the loop can be removed.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Pan Bian
946ab8db69 drm/amdgpu: fix potential double drop fence reference
The object fence is not set to NULL after its reference is dropped. As a
result, its reference may be dropped again if error occurs after that,
which may lead to a use after free bug. To avoid the issue, fence is
explicitly set to NULL after dropping its reference.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Eric Huang
f88e2d1f8e drm/amdgpu: change read of GPU clock counter on Vega10 VF
Using unified VBIOS has performance drop in sriov environment.
The fix is switching to another register instead.

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
changzhu
11c6108934 drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9
It needs to add warning to update firmware in gfx9
in case that firmware is too old to have function to
realize dummy read in cp firmware.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
changzhu
a6522a5c63 drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface.  This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.

For cp ucode, it has realized dummy read in cp firmware.It covers
the use of WAIT_REG_MEM operation 1 case only.So it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to
update firmware in case firmware is too old to have function to realize
dummy read in cp firmware.

For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is
moved to gfxhub in gfx10. So it needs to add dummy read in driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Evan Quan
60599a0363 drm/amdgpu: perform p-state switch after the whole hive initialized
P-state switch should be performed after all devices from the hive
get initialized.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by:  Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Evan Quan
5c5b2ba006 drm/amdgpu: fix possible pstate switch race condition
Added lock protection so that the p-state switch will
be guarded to be sequential. Also update the hive
pstate only all device from the hive are in the same
state.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Evan Quan
b0adca4d50 drm/amdgpu: register gpu instance before fan boost feature enablment
Otherwise, the feature enablement will be skipped due to wrong count.

Fixes: beff74bc6e ("drm/amdgpu: fix a race in GPU reset with IB test (v2)")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Hawking Zhang
58f46d4b65 drm/amdgpu: disallow direct upload save restore list from gfx driver
Direct uploading save/restore list via mmio register writes breaks the security
policy. Instead, the driver should pass s&r list to psp.

For all the ASICs that use rlc v2_1 headers, the driver actually upload s&r list
twice, in non-psp ucode front door loading phase and gfx pg initialization phase.
The latter is not allowed.

VG12 is the only exception where the driver still keeps legacy approach for S&R
list uploading. In theory, this can be elimnated if we have valid srcntl ucode
for VG12.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Candice Li <Candice.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Emily Deng
224f82e5b7 drm/amdgpu/discovery: Need to free discovery memory
When unloading driver, need to free discovery memory.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Alex Deucher
8863baefaf drm/amdgpu/gpuvm: add some additional comments in amdgpu_vm_update_ptes
To better clarify what is happening in this function.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Alex Deucher
a4840d91c9 drm/amdgpu: enable VCN DPG on Raven and Raven2
It's safe to enable dynamic VCN powergating on raven and
raven2 for increased power savings.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:46 -05:00
Tianci.Yin
84e4e8205e drm/amdgpu: add navi14 PCI ID
Add the navi14 PCI device id.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:46 -05:00
Evan Quan
3e454860f2 drm/amd/powerplay: support xgmi pstate setting on powerplay routine V2
Add xgmi pstate setting on powerplay routine.

V2: split the change of is_support_sw_smu_xgmi into a separate patch

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:46 -05:00
Evan Quan
39ea6e5f9e drm/amdgpu: change pstate only after all XGMI device initialized
Pstate settings should be performed after all device of the
XGMI setup get initialized.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:46 -05:00
Tianci.Yin
5e200fb97a drm/amdgpu: add navi14 PCI ID
Add the navi14 PCI device id.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 15:35:08 -05:00
Shirish S
f2efc6e600 drm/amdgpu: dont schedule jobs while in reset
[Why]

doing kthread_park()/unpark() from drm_sched_entity_fini
while GPU reset is in progress defeats all the purpose of
drm_sched_stop->kthread_park.
If drm_sched_entity_fini->kthread_unpark() happens AFTER
drm_sched_stop->kthread_park nothing prevents from another
(third) thread to keep submitting job to HW which will be
picked up by the unparked scheduler thread and try to submit
to HW but fail because the HW ring is deactivated.

[How]
grab the reset lock before calling drm_sched_entity_fini()

Signed-off-by: Shirish S <shirish.s@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 15:26:53 -05:00
Alex Deucher
576daab3cd drm/amdgpu/arcturus: properly set BANK_SELECT and FRAGMENT_SIZE
These were not aligned for optimal performance for GPUVM.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 15:26:20 -05:00
Shirish S
89b3d86403 drm/amdgpu: dont schedule jobs while in reset
[Why]

doing kthread_park()/unpark() from drm_sched_entity_fini
while GPU reset is in progress defeats all the purpose of
drm_sched_stop->kthread_park.
If drm_sched_entity_fini->kthread_unpark() happens AFTER
drm_sched_stop->kthread_park nothing prevents from another
(third) thread to keep submitting job to HW which will be
picked up by the unparked scheduler thread and try to submit
to HW but fail because the HW ring is deactivated.

[How]
grab the reset lock before calling drm_sched_entity_fini()

Signed-off-by: Shirish S <shirish.s@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 14:20:24 -05:00
Alex Deucher
e2f619aa14 drm/amdgpu/arcturus: properly set BANK_SELECT and FRAGMENT_SIZE
These were not aligned for optimal performance for GPUVM.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 14:20:08 -05:00
Dave Airlie
8a86b00a43 Merge tag 'drm-next-5.5-2019-11-01' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-01:

amdgpu:
- Add EEPROM support for Arcturus
- Enable VCN encode support for Arcturus
- Misc PSP fixes
- Misc DC fixes
- swSMU cleanup

amdkfd:
- Misc cleanups
- Fix typo in cu bitmap parsing

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101190607.3763-1-alexander.deucher@amd.com
2019-11-04 10:22:53 +10:00
Dave Airlie
633aa7e53a drm-misc-next for 5.5:
UAPI Changes:
 -dma-buf: Introduce and revert dma-buf heap (Andrew/John/Sean)
 
 Cross-subsystem Changes:
 - None
 
 Core Changes:
 -dma-buf: add dynamic mapping to allow exporters to choose dma_resv lock
 	  state on mmap/munmap (Christian)
 -vram: add prepare/cleanup fb helpers to vram helpers (Thomas)
 -ttm: always keep bo's on the lru + ttm cleanups (Christian)
 -sched: allow a free_job routine to sleep (Steven)
 -fb_helper: remove unused drm_fb_helper_defio_init() (Thomas)
 
 Driver Changes:
 -bochs/hibmc/vboxvideo: Use new vram helpers for prepare/cleanup fb (Thomas)
 -amdgpu: Implement dma-buf import/export without drm helpers (Christian)
 -panfrost: Simplify devfreq integration in driver (Steven)
 
 Cc: Christian König <christian.koenig@amd.com>
 Cc: Thomas Zimmermann <tzimmermann@suse.de>
 Cc: Steven Price <steven.price@arm.com>
 Cc: Andrew F. Davis <afd@ti.com>
 Cc: John Stultz <john.stultz@linaro.org>
 Cc: Sean Paul <seanpaul@chromium.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl27NYYACgkQcywAJXLc
 r3k7SQgAy7MbT7MHg1NkObnWFKwbNxA0FuRV7HVuQ5AiVA5fbehGECNVI1hFZE3D
 kCyL5zRgzfrbBjI9zqclonXPmjPbD//f+ufUZJ2EaWHfE5k7LUYIEunpgWlCEgUR
 qqeuqjvx2+NY3gLZW6D8BIrUW79cKbggBvDa9QeE4aoMiI7/F3lpNog5LNo7TWCQ
 4SGgpDqtcJ2eSBneX80zLLVnI1CKfCSiWheZVoqCTRN5bfUftGwhbXb3N/JipG37
 cCblVR7D8FR7z0MQUtq2ql76tCn5SJJqloujkmUU06quYBHR5K1deB5BP+8HI1Fw
 /XMqWHFo0uX7h4hFIt1MVIJ92U+b+w==
 =ViNo
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-10-31' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.5:

UAPI Changes:
-dma-buf: Introduce and revert dma-buf heap (Andrew/John/Sean)

Cross-subsystem Changes:
- None

Core Changes:
-dma-buf: add dynamic mapping to allow exporters to choose dma_resv lock
	  state on mmap/munmap (Christian)
-vram: add prepare/cleanup fb helpers to vram helpers (Thomas)
-ttm: always keep bo's on the lru + ttm cleanups (Christian)
-sched: allow a free_job routine to sleep (Steven)
-fb_helper: remove unused drm_fb_helper_defio_init() (Thomas)

Driver Changes:
-bochs/hibmc/vboxvideo: Use new vram helpers for prepare/cleanup fb (Thomas)
-amdgpu: Implement dma-buf import/export without drm helpers (Christian)
-panfrost: Simplify devfreq integration in driver (Steven)

Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Steven Price <steven.price@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031193015.GA243509@art_vandelay
2019-11-04 09:28:51 +10:00
Alex Deucher
30ef5c7eab drm/amdgpu/gmc10: properly set BANK_SELECT and FRAGMENT_SIZE
These were not aligned for optimal performance for GPUVM.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-30 11:56:20 -04:00
Andrey Grodzovsky
57c0f58e9f drm/amdgpu: If amdgpu_ib_schedule fails return back the error.
Use ERR_PTR to return back the error happened during amdgpu_ib_schedule.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:56:15 -04:00
Tianci.Yin
47661f6dad drm/amdgpu/gfx10: update gfx golden settings for navi12
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:56:15 -04:00
Tianci.Yin
3dde767f14 drm/amdgpu/gfx10: update gfx golden settings for navi14
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:56:15 -04:00
Tianci.Yin
f52ebe1f88 drm/amdgpu/gfx10: update gfx golden settings
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-30 11:55:54 -04:00
Pierre-Eric Pelloux-Prayer
9bdf63d357 drm/amdgpu/sdma5: do not execute 0-sized IBs (v2)
This seems to help with https://bugs.freedesktop.org/show_bug.cgi?id=111481.

v2: insert a NOP instead of skipping all 0-sized IBs to avoid breaking older hw

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:55:20 -04:00
chen gong
e5574f61e9 drm/amdgpu: Fix SDMA hang when performing VKexample test
VKexample test hang during Occlusion/SDMA/Varia runs.
Clear XNACK_WATERMK in reg SDMA0_UTCL1_WATERMK to fix this issue.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-30 11:54:33 -04:00
Alex Deucher
46203a508f drm/amdgpu/gmc10: properly set BANK_SELECT and FRAGMENT_SIZE
These were not aligned for optimal performance for GPUVM.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:07:13 -04:00
Le Ma
361d66edc5 drm/amdgpu: fix no ACK from LDS read during stress test for Arcturus
Set mmSQ_CONFIG.DISABLE_SMEM_SOFT_CLAUSE as W/R.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:07:13 -04:00
HaiJun Chang
897110eed5 drm/amdgpu: fix gfx VF FLR test fail on navi
Cp wptr in wb buffer is outdated after VF FLR.
The outdated wptr may cause cp to execute unexpected packets.
Reset cp wptr in wb buffer.

Signed-off-by: HaiJun Chang <HaiJun.Chang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:07:13 -04:00
Le Ma
bff77e86a3 drm/amdgpu: bypass some cleanup work after err_event_athub (v2)
PSP lost connection when err_event_athub occurs. These cleanup work can be
skipped in BACO reset.

v2: squash in missing include (Alex)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:52 -04:00
Le Ma
8baaadba73 drm/amdgpu: clear UVD VCPU buffer when err_event_athub generated
The err_event_athub error will mess up the buffer and cause UVD resume hang.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Jiange Zhao
b4def3744b drm/amdgpu/SRIOV: SRIOV VF doesn't support BACO
SRIOV VF doesn't support BACO.

Only PF with BACO capability can do it.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Geert Uytterhoeven
44b582b32a drm/amdgpu: Remove superfluous void * cast in debugfs_create_file() call
There is no need to cast a typed pointer to a void pointer when calling
a function that accepts the latter.  Remove it, as the cast prevents
further compiler checks.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
YueHaibing
e4b116a2c0 drm/amdgpu: remove set but not used variable 'adev'
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:1221:24: warning: variable adev set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:488:24: warning: variable adev set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:547:24: warning: variable adev set but not used [-Wunused-but-set-variable]

It is never used, so can removed it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Yong Zhao
533bfcaea1 drm/amdkfd: Delete duplicated queue bit map reservation
The KIQ is on the second MEC and its reservation is covered in the
latter logic, so no need to reserve its bit twice.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Yong Zhao
55695b36c1 drm/amdkfd: Delete unnecessary pr_fmt switch
Given amdkfd.ko has been merged into amdgpu.ko, this switch is no
longer useful.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Dave Airlie
a24e4b09dc drm-misc-next for 5.5:
UAPI Changes:
 -syncobj: allow querying the last submitted timeline value (David)
 -fourcc: explicitly defineDRM_FORMAT_BIG_ENDIAN as unsigned (Adam)
 -omap: revert the OMAP_BO_* flags that were added -- no userspace (Sean)
 
 Cross-subsystem Changes:
 -MAINTAINERS: add Mihail as komeda co-maintainer (Mihail)
 
 Core Changes:
 -edid: a few cleanups, add AVI infoframe bar info (Ville)
 -todo: remove i915 device_link item and add difficulty levels (Daniel)
 -dp_helpers: add a few new helpers to parse dpcd (Thierry)
 
 Driver Changes:
 -gma500: fix a few memory disclosure leaks (Kangjie)
 -qxl: convert to use the new drm_gem_object_funcs.mmap (Gerd)
 -various: open code dp_link helpers in preparation for helper removal (Thierry)
 
 Cc: Chunming Zhou <david1.zhou@amd.com>
 Cc: Adam Jackson <ajax@redhat.com>
 Cc: Sean Paul <seanpaul@chromium.org>
 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
 Cc: Kangjie Lu <kjlu@umn.edu>
 Cc: Mihail Atanassov <mihail.atanassov@arm.com>
 Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
 Cc: Thierry Reding <treding@nvidia.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl2xxiYACgkQcywAJXLc
 r3mJowf/U9ANPuvF+lahI6IVKpE4KD8Pz+73uNhjmxq5TnmcA7eNJ2pb8HO38tU2
 lkchhaQEGZR96nVL962DkegEDu8p08RpYYwKv8r+sDV+zw/aviN/ANLSmTVtZ//m
 wzgUI7zIqF1WxKdFEzNmQVuY0hYd4fWBn6kGvw1jS/6xL2/3KR6hKVigBZwkICSt
 /ZiCZyxA7HhAlqzasn+PqGkuLVYv6NvFu4Ug6YG4nBOh57IrKmGt1a6cEUjGsHFf
 6Ets5wTPu2ydMRvY+v6rUDDRj6JJQph7Lv4hVKtg13FerJ1+OQ7xjhu4gIk1oNl/
 7zIgRWMVZj79ksMXyk6zgFD2rZAF3w==
 =Fm3U
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-10-24-2' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.5:

UAPI Changes:
-syncobj: allow querying the last submitted timeline value (David)
-fourcc: explicitly defineDRM_FORMAT_BIG_ENDIAN as unsigned (Adam)
-omap: revert the OMAP_BO_* flags that were added -- no userspace (Sean)

Cross-subsystem Changes:
-MAINTAINERS: add Mihail as komeda co-maintainer (Mihail)

Core Changes:
-edid: a few cleanups, add AVI infoframe bar info (Ville)
-todo: remove i915 device_link item and add difficulty levels (Daniel)
-dp_helpers: add a few new helpers to parse dpcd (Thierry)

Driver Changes:
-gma500: fix a few memory disclosure leaks (Kangjie)
-qxl: convert to use the new drm_gem_object_funcs.mmap (Gerd)
-various: open code dp_link helpers in preparation for helper removal (Thierry)

Cc: Chunming Zhou <david1.zhou@amd.com>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kangjie Lu <kjlu@umn.edu>
Cc: Mihail Atanassov <mihail.atanassov@arm.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024155535.GA10294@art_vandelay
2019-10-30 06:11:47 +10:00
Dave Airlie
60845e34f0 Merge tag 'drm-next-5.5-2019-10-25' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-10-25:

amdgpu:
- BACO support for CI and VI asics
- Quick memory training support for navi
- MSI-X support
- RAS fixes
- Display AVI infoframe fixes
- Display ref clock fixes for renoir
- Fix number of audio endpoints in renoir
- Fix for discovery tables
- Powerplay fixes
- Documentation fixes
- Misc cleanups

radeon:
- revert a PPC fix which broke x86

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025221020.203546-1-alexander.deucher@amd.com
2019-10-30 05:46:09 +10:00
Christian König
a39414716c drm/amdgpu: add independent DMA-buf import v9
Instead of relying on the DRM functions just implement our own import
functions. This prepares support for taking care of unpinned DMA-buf.

v2: enable for all exporters, not just amdgpu, fix invalidation
    handling, lock reservation object while setting callback
v3: change to new dma_buf attach interface
v4: split out from unpinned DMA-buf work
v5: rebased and cleanup on new DMA-buf interface
v6: squash with invalidation callback change,
    stop using _(map|unmap)_locked
v7: drop invalidations when the BO is already in system domain
v8: rebase on new DMA-buf patch and drop move notification
v9: cleanup comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/337948/
2019-10-28 16:59:43 +01:00
Christian König
6e6db2722c drm/amdgpu: add independent DMA-buf export v8
Add an DMA-buf export implementation independent of the DRM helpers.

This not only avoids the caching of DMA-buf mappings, but also
allows us to use the new dynamic locking approach.

This is also a prerequisite of unpinned DMA-buf handling.

v2: fix unintended recursion, remove debugging leftovers
v3: split out from unpinned DMA-buf work
v4: rebase on top of new no_sgt_cache flag
v5: fix some warnings by including amdgpu_dma_buf.h
v6: fix locking for non amdgpu exports
v7: rebased on new DMA-buf locking patch
v8: drop extra include

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/337949/
2019-10-28 16:59:35 +01:00
Wambui Karuga
f440ff44b1 drm/amd: correct "_LENTH" mispelling in constant
Correct the "_LENTH" mispelling in the AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
constant.

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-28 11:19:17 -04:00
Wambui Karuga
7e0ff20c7a drm/amd: declare amdgpu_exp_hw_support in amdgpu.h
Declare `amdgpu_exp_hw_support` as extern in amdgpu.h to address the
following sparse warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:118:5: warning: symbol 'amdgpu_exp_hw_support' was not declared. Should it be static?

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Suggested-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-28 11:19:17 -04:00
Andrey Grodzovsky
db5e65fcb3 drm/amdgpu: If amdgpu_ib_schedule fails return back the error.
Use ERR_PTR to return back the error happened during amdgpu_ib_schedule.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Tianci.Yin
dcc0fcff14 drm/amdgpu/gfx10: update gfx golden settings for navi12
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Tianci.Yin
21c943f35a drm/amdgpu/gfx10: update gfx golden settings for navi14
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Tianci.Yin
d753dc6ab2 drm/amdgpu/gfx10: update gfx golden settings
update registers: mmCGTT_SPI_CLK_CTRL

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Evan Quan
0525f29713 drm/amd/powerplay: skip unsupported clock limit settings on Arcturus V2
For Arcturus, clock limit settings on uclk/socclk/fclk domains
are not supported.

V2: simplify the code to support both SGPU and MGPU cases

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Marek Olšák
664fe85a2d drm/amdgpu: Allow reading more status registers on si/cik
Allow userspace to read the same status registers for every family.
Based on commit c7890fea, added any of these registers if defined in
the include files of each architecture.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Andrey Grodzovsky
121a2bc6ae drm/amdgpu: Move amdgpu_ras_recovery_init to after SMU ready.
For Arcturus the I2C traffic is done through SMU tables and so
we must postpone RAS recovery init to after they are ready
which is in amdgpu_device_ip_hw_init_phase2.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Andrey Grodzovsky
cf52ecc8b6 drm/amdgpu: Use ARCTURUS in RAS EEPROM.
Add Arcturus EEPROM/I2C support in generic EEPROM code.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Nirmoy Das
9f0256da6b drm/amdgpu: remove unused parameter in amdgpu_gfx_kiq_free_ring
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
James Zhu
8047266443 drm/amdgpu/vcn: Enable VCN2.5 encoding
After VCN2.5 firmware (Version ENC: 1.1  Revision: 11),
VCN2.5 encoding can work properly.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Pelloux-prayer, Pierre-eric
3f378758b8 drm/amdgpu/sdma5: do not execute 0-sized IBs (v2)
This seems to help with https://bugs.freedesktop.org/show_bug.cgi?id=111481.

v2: insert a NOP instead of skipping all 0-sized IBs to avoid breaking older hw

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
chen gong
5aed95bbdd drm/amdgpu: Fix SDMA hang when performing VKexample test
VKexample test hang during Occlusion/SDMA/Varia runs.
Clear XNACK_WATERMK in reg SDMA0_UTCL1_WATERMK to fix this issue.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Guchun Chen
52dd95f2b6 drm/amdgpu: define macros for retire page reservation
Easy for maintainance.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Guchun Chen
c688a06bc6 drm/amdgpu: refine reboot debugfs operation in ras case (v3)
Ras reboot debugfs node allows user one easy control to avoid
gpu recovery hang problem and directly reboot system per card
basis, after ras uncorrectable error happens. However, it is
one common entry, which should get rid of ras_ctrl node and
remove ip dependence when inputting by user. So add one new
auto_reboot node in ras debugfs dir to achieve this.

v2: in commit mssage, add justification why ras reboot debugfs
node is needed.
v3: use debugfs_create_bool to create debugfs file for boolean value

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:10 -04:00
Evan Quan
3697b339c6 drm/amd/powerplay: add lock protection for swSMU APIs V2
This is a quick and low risk fix. Those APIs which
are exposed to other IPs or to support sysfs/hwmon
interfaces or DAL will have lock protection. Meanwhile
no lock protection is enforced for swSMU internal used
APIs. Future optimization is needed.

V2: strip the lock protection for all swSMU internal APIs

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Feifei Xu <Feifei.Xu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:09 -04:00
Colin Ian King
d5e5c1bce1 drm/amdgpu/psp: fix spelling mistake "initliaze" -> "initialize"
There is a spelling mistake in a DRM_ERROR error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:07 -04:00
Xiaojie Yuan
73469970a9 drm/amdgpu/psp11: fix typo in comment
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:07 -04:00
Xiaojie Yuan
d7e7f1ea25 drm/amdgpu/psp11: wait for sOS ready for ring creation
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:07 -04:00
Pelloux-prayer, Pierre-eric
ee8bcc2333 drm/amdgpu: call amdgpu_vm_prt_fini before deleting the root PD
amdgpu_vm_prt_fini uses "vm->root.base.bo" so it must still be valid when
we call it.

Fixes: b65709a921 ("drm/amdgpu: reserve the root PD while freeing PASIDs")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:47:07 -04:00
Alex Deucher
17523bd00c drm/amdgpu/vce: make some functions static
They are not used outside of the file they are defined in.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:15:00 -04:00
Alex Deucher
569557e524 drm/amdgpu/vce: fix allocation size in enc ring test
We need to allocate a large enough buffer for the
feedback buffer, otherwise the IB test can overwrite
other memory.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:15:00 -04:00
chen gong
3a8b7d2761 drm/amdgpu/psp: declare PSP TA firmware
Add PSP TA firmware declaration for raven raven2 picasso

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:15:00 -04:00
Dave Airlie
3275a71e76 Merge tag 'drm-next-5.5-2019-10-09' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-10-09:

amdgpu:
- Additional RAS enablement for vega20
- RAS page retirement and bad page storage in EEPROM
- No GPU reset with unrecoverable RAS errors
- Reserve vram for page tables rather than trying to evict
- Fix issues with GPU reset and xgmi hives
- DC i2c over aux fixes
- Direct submission for clears, PTE/PDE updates
- Improvements to help support recoverable GPU page faults
- Silence harmless SAD block messages
- Clean up code for creating a bo at a fixed location
- Initial DC HDCP support
- Lots of documentation fixes
- GPU reset for renoir
- Add IH clockgating support for soc15 asics
- Powerplay improvements
- DC MST cleanups
- Add support for MSI-X
- Misc cleanups and bug fixes

amdkfd:
- Query KFD device info by asic type rather than pci ids
- Add navi14 support
- Add renoir support
- Add navi12 support
- gfx10 trap handler improvements
- pasid cleanups
- Check against device cgroup

ttm:
- Return -EBUSY with pipelining with no_gpu_wait

radeon:
- Silence harmless SAD block messages

device_cgroup:
- Export devcgroup_check_permission

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010041713.3412-1-alexander.deucher@amd.com
2019-10-26 05:56:57 +10:00
Christian König
97588b5b9a drm/ttm: remove pointers to globals
As the name says global memory and bo accounting is global. So it doesn't
make to much sense having pointers to global structures all around the code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/332879/
2019-10-25 11:40:51 +02:00
Christian König
9165fb879f drm/ttm: always keep BOs on the LRU
This allows blocking for BOs to become available
in the memory management.

Amdgpu is doing this for quite a while now during CS. Now
apply the new behavior to all drivers using TTM.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/332878/
2019-10-25 11:40:50 +02:00
Sean Paul
44bf67f32a Merge drm/drm-next into drm-misc-next
Parroting Daniel's backmerge justification from
2e79e22e09:

Thierry needs fd70c7755b ("drm/bridge: tc358767: fix max_tu_symbol
value") to be able to merge his dp_link patch series.

Signed-off-by: Sean Paul <seanpaul@chromium.org>
2019-10-23 11:14:11 -04:00
Daniel Vetter
2e79e22e09 Linux 5.4-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl2su/AeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvm4H/1jkheCrvB/GJS69
 wd18vizAg+eFmNCzxlGVhpQTKGymNRy+g6clnoli3cNJ3pSVKcYgVyB3oXaONIhp
 g/ANudnBjTdjqYgJzfLij5AGecrGwDpF3YL0kuKrCB63s2I/HwQGYy/aPrYY8emy
 gAYdaf1DGRu5/DIIB6soTo/TnpKoAyTE+XY5MaPSug++t/Flov19tlU40IZxXW94
 bjTXbm0yklrsIx+LL5mYYGGnygSTCF66JjFg1qhDCBQaS2MZ21h1ZgaOtGZTwZcc
 WgEiqLC5S1Iyj96zir1t78RcVQ4RzgvDbhUOgIqUFsYAO2wOicvxyFE3Hj8rPOKd
 uGgVPRM=
 =xgZa
 -----END PGP SIGNATURE-----

Merge v5.4-rc4 into drm-next

Thierry needs fd70c7755b ("drm/bridge: tc358767: fix max_tu_symbol
value") to be able to merge his dp_link patch series.

Some adjacent changes conflicts, plus some clashes in i915 due to
cherry-picking and git trying to be helpful and leaving both versions
in.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-10-23 12:10:05 +02:00
Alex Deucher
ee027828c4 drm/amdgpu/vce: fix allocation size in enc ring test
We need to allocate a large enough buffer for the
feedback buffer, otherwise the IB test can overwrite
other memory.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-17 17:12:34 -04:00
Christian König
de51a5019f drm/amdgpu: fix error handling in amdgpu_bo_list_create
We need to drop normal and userptr BOs separately.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 17:12:34 -04:00
Christian König
3122051edc drm/amdgpu: fix potential VM faults
When we allocate new page tables under memory
pressure we should not evict old ones.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 17:12:34 -04:00
Philip Yang
209620b422 drm/amdgpu: user pages array memory leak fix
user_pages array should always be freed after validation regardless if
user pages are changed after bo is created because with HMM change parse
bo always allocate user pages array to get user pages for userptr bo.

v2: remove unused local variable and amend commit

v3: add back get user pages in gem_userptr_ioctl, to detect application
bug where an userptr VMA is not ananymous memory and reject it.

Bugzilla: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1844962

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Joe Barnett <thejoe@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.3
2019-10-17 17:12:34 -04:00
Alex Deucher
c81fffc2c9 drm/amdgpu/vcn: fix allocation size in enc ring test
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

- Session info is 128K according to mesa
- Use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-17 17:12:34 -04:00
Alex Deucher
5d230bc91f drm/amdgpu/uvd7: fix allocation size in enc ring test (v2)
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
    - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-17 17:12:34 -04:00
Alex Deucher
ce584a8e28 drm/amdgpu/uvd6: fix allocation size in enc ring test (v2)
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
    - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-17 17:12:34 -04:00
Kevin Wang
2c2fdb8bca drm/amdgpu: fix amdgpu trace event print string format error
the trace event print string format error.
(use integer type to handle string)

before:
amdgpu_test_kev-1556  [002]   138.508781: amdgpu_cs_ioctl:
sched_job=8, timeline=gfx_0.0.0, context=177, seqno=1,
ring_name=ffff94d01c207bf0, num_ibs=2

after:
amdgpu_test_kev-1506  [004]   370.703783: amdgpu_cs_ioctl:
sched_job=12, timeline=gfx_0.0.0, context=234, seqno=2,
ring_name=gfx_0.0.0, num_ibs=1

change trace event list:
1.amdgpu_cs_ioctl
2.amdgpu_sched_run_job
3.amdgpu_ib_pipe_sync

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:32:00 -04:00
Tianci.Yin
367039bfb6 drm/amdgpu/psp: add psp memory training implementation(v3)
add memory training implementation code to save resume time.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:54 -04:00
Tianci.Yin
778e8c428f drm/amdgpu: reserve vram for memory training(v4)
memory training using specific fixed vram segment, reserve these
segments before anyone may allocate it.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:44 -04:00
Tianci.Yin
0586a0596a drm/amdgpu: add psp memory training callbacks and macro
add interface for memory training.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:38 -04:00
Tianci.Yin
efe4f00077 drm/amdgpu/atomfirmware: add memory training related helper functions(v3)
parse firmware to get memory training capability and fb location.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:31 -04:00
Tianci.Yin
a7d4c920f8 drm/amdgpu: introduce psp_v11_0_is_sos_alive interface(v2)
introduce psp_v11_0_is_sos_alive func for common use.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:19 -04:00
Tianci.Yin
e35e2b117f drm/amdgpu: add a generic fb accessing helper function(v3)
add a generic helper function for accessing framebuffer via MMIO

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:13 -04:00
Tianci.Yin
45cf454e4c drm/amdgpu: update amdgpu_discovery to handle revision
update amdgpu_discovery to get IP revision.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:06 -04:00
Alex Deucher
8c32d0438f drm/amdgpu/vcn: fix allocation size in enc ring test
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

- Session info is 128K according to mesa
- Use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:30:57 -04:00
Alex Deucher
b24c459f9f drm/amdgpu/uvd7: fix allocation size in enc ring test (v2)
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
    - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:30:55 -04:00
Alex Deucher
481bf82c97 drm/amdgpu/uvd6: fix allocation size in enc ring test (v2)
We need to allocate a large enough buffer for the
session info, otherwise the IB test can overwrite
other memory.

v2: - session info is 128K according to mesa
    - use the same session info for create and destroy

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:30:52 -04:00
Prike Liang
f839110157 drm/amdgpu: fix S3 failed as RLC safe mode entry stucked in polloing gfx acq
Fix gfx cgpg setting sequence for RLC deadlock at safe mode entry in polling gfx response.
The patch can fix VCN IB test failed and DAL get dispaly count failed issue.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:24:09 -04:00
Prike Liang
c8486eef2c drm/amdgpu: add GFX_PIPELINE capacity check for updating gfx cgpg
Before disable gfx pipeline power gating need check the flag AMD_PG_SUPPORT_GFX_PIPELINE.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:23:54 -04:00
Gerd Hoffmann
12067e0e89 drm/ttm: rename ttm_fbdev_mmap
Rename ttm_fbdev_mmap to ttm_bo_mmap_obj.  Move the vm_pgoff sanity
check to amdgpu_bo_fbdev_mmap (only ttm_fbdev_mmap user in tree).

The ttm_bo_mmap_obj function can now be used to map any buffer object.
This allows to implement &drm_gem_object_funcs.mmap in gem ttm helpers.

v3: patch added to series

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20191016115203.20095-8-kraxel@redhat.com
2019-10-17 13:59:16 +02:00
Alex Deucher
97c002be41 drm/amdgpu: enable BACO reset for SMU7 based dGPUs (v2)
Use BACO to reset the GPU if supported on SMU7 based
dGPUs.

v2: don't use baco on CI parts

Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:55:32 -04:00
Alex Deucher
5337aae9b5 drm/amdgpu/soc15: add support for baco reset with swSMU
Add support for vega20 when the swSMU path is used.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:55:31 -04:00
Alex Deucher
31fa2991f4 drm/amdgpu: remove in_baco_reset hack
It was a vega20 specific hack.  Check if we are in reset
and what reset method we are using.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Alex Deucher
f5fda6d89a drm/amdgpu: simplify ATPX detection
Use the base class rather than the specific class and drop
the second loop.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Alex Deucher
897483d8a0 drm/amdgpu: move gpu reset out of amdgpu_device_suspend
Move it into the caller.  There are cases were we don't
want it.  We need it for hibernation, but we don't need
it for runtime pm, so drop it for runtime pm.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Alex Deucher
803cc26d5c drm/amdgpu: move pci_save_state into suspend path
for amdgpu_device_suspend.  This follows the logic
in the resume path.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Andrey Grodzovsky
ed606f8a34 dmr/amdgpu: Fix crash on SRIOV for ERREVENT_ATHUB_INTERRUPT interrupt.
Ignre the ERREVENT_ATHUB_INTERRUPT for systems without RAS.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-and-tested-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:11 -04:00
Philip Yang
06f7f57e87 drm/amdgpu: user pages array memory leak fix
user_pages array should always be freed after validation regardless if
user pages are changed after bo is created because with HMM change parse
bo always allocate user pages array to get user pages for userptr bo.

v2: remove unused local variable and amend commit

v3: add back get user pages in gem_userptr_ioctl, to detect application
bug where an userptr VMA is not ananymous memory and reject it.

Bugzilla: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1844962

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Joe Barnett <thejoe@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:01 -04:00
Evan Quan
5bcc92407c drm/amd/powerplay: enable Arcturus runtime VCN dpm on/off
Enable runtime VCN DPM on/off on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:50:43 -04:00
Emily Deng
bcccee89f4 drm/amdgpu: Fix tdr3 could hang with slow compute issue
When index is 1, need to set compute ring timeout for sriov and passthrough.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:50:23 -04:00
Christian König
b2c18f0a9c drm/amdgpu: fix potential VM faults
When we allocate new page tables under memory
pressure we should not evict old ones.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:50:18 -04:00
Christian König
b146570010 drm/amdgpu: fix error handling in amdgpu_bo_list_create
We need to drop normal and userptr BOs separately.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:50:07 -04:00
Dennis Li
820924745b drm/amdgpu: add RAS support for VML2 and ATCL2
v1: Add codes to query the EDC count of VML2 & ATCL2
v2: Rename VML2/ATCL2 registers and drop their mask define
v3: Add back the ECC mask for VML2 registers

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:49:57 -04:00
Dennis Li
13ba03442a drm/amdgpu: change to query the actual EDC counter
For the potential request in the future, change to
query the actual EDC counter.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:49:44 -04:00
Le Ma
956f670509 drm/amdgpu/soc15: disable doorbell interrupt as part of BACO entry sequence
Workaround to make RAS recovery work in BACO reset.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:49:14 -04:00
Hans de Goede
402c60d7b0 drm/amdgpu: Bail earlier when amdgpu.cik_/si_support is not set to 1
Bail from the pci_driver probe function instead of from the drm_driver
load function.

This avoid /dev/dri/card0 temporarily getting registered and then
unregistered again, sending unwanted add / remove udev events to
userspace.

Specifically this avoids triggering the (userspace) bug fixed by this
plymouth merge-request:
https://gitlab.freedesktop.org/plymouth/plymouth/merge_requests/59

Note that despite that being a userspace bug, not sending unnecessary
udev events is a good idea in general.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1490490
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:49:07 -04:00
Xiaojie Yuan
5f6a556f98 drm/amdgpu/discovery: reserve discovery data at the top of VRAM
IP Discovery data is TMR fenced by the latest PSP BL,
so we need to reserve this region.

Tested on navi10/12/14 with VBIOS integrated with latest PSP BL.

v2: use DISCOVERY_TMR_SIZE macro as bo size
    use amdgpu_bo_create_kernel_at() to allocate bo

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:48:46 -04:00
Xiaojie Yuan
d12c50857c drm/amdgpu/sdma5: fix mask value of POLL_REGMEM packet for pipe sync
sdma will hang once sequence number to be polled reaches 0x1000_0000

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-11 21:32:06 -05:00
Hans de Goede
984d7a929a drm/amdgpu: Bail earlier when amdgpu.cik_/si_support is not set to 1
Bail from the pci_driver probe function instead of from the drm_driver
load function.

This avoid /dev/dri/card0 temporarily getting registered and then
unregistered again, sending unwanted add / remove udev events to
userspace.

Specifically this avoids triggering the (userspace) bug fixed by this
plymouth merge-request:
https://gitlab.freedesktop.org/plymouth/plymouth/merge_requests/59

Note that despite that being a userspace bug, not sending unnecessary
udev events is a good idea in general.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1490490
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-10-11 21:31:14 -05:00
chen gong
6696b8adb8 drm/amdgpu: Do not implement power-on for SDMA after do mode2 reset on Renoir
Find that ring sdma0 test failed if turn on SDMA powergating after do
mode2 reset.

Perhaps the mode2 reset does not reset the SDMA PG state, SDMA is
already powered up so there is no need to ask the SMU to power it up
again. So I skip this function for a moment.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:39:06 -05:00
Xiaojie Yuan
e8939b4a0d drm/amdgpu/sdma5: fix mask value of POLL_REGMEM packet for pipe sync
sdma will hang once sequence number to be polled reaches 0x1000_0000

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:39:06 -05:00
Nirmoy Das
b9ed69e6fd drm/amdgpu: fix memory leak
cleanup error handling code and make sure temporary info array
with the handles are freed by amdgpu_bo_list_put() on
idr_replace()'s failure.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:35 -05:00
Tao Zhou
6e4be98767 drm/amdgpu: avoid ras error injection for retired page
check whether a page is bad page before umc error injection, bad page
should not be accessed again

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:27 -05:00
Luben Tuikov
4e930d96c9 drm/amdgpu: Use the ALIGN() macro
Use the ALIGN() macro to set "num_dw" to a
multiple of 8, i.e. lower 3 bits cleared.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:20 -05:00
Alex Deucher
54e9ab2edb drm/amdgpu/ras: document the reboot ras option
We recently added it, but never documented it.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:18 -05:00
Alex Deucher
a20bfd0fd4 drm/amdgpu/ras: fix typos in documentation
Fix a couple of spelling typos.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:35:11 -05:00
Oak Zeng
f81b86a043 drm/amdgpu: Enable gfx cache probing on HDP write for arcturus
This allows gfx cache to be probed and invalidated (for none-dirty cache lines)
on a HDP write (from either another GPU or CPU). This should work only for the
memory mapped as RW memory type newly added for arcturus, to achieve some cache
coherence b/t multiple memory clients.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:24:19 -05:00
Oak Zeng
cb1545f710 drm/amdgpu: Clean up gmc_v9_0_gart_enable
Many logic in this function are HDP set up,
not gart set up. Moved those logic to gmc_v9_0_hw_init.
No functional change.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:24:12 -05:00
Marek Olšák
6f3bf46a7e drm/amdgpu: simplify gds_compute_max_wave_id computation
Use asic constants.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10 19:23:35 -05:00
Dave Airlie
7ed093602e drm-misc-next for 5.5:
UAPI Changes:
 -Colorspace: Expose different prop values for DP vs. HDMI (Gwan-gyeong Mun)
 -fourcc: Add DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED (Raymond)
 -not_actually: s/ENOTSUPP/EOPNOTSUPP/ in drm_edid and drm_mipi_dbi. This should
     not reach userspace, but adding here to specifically call that out (Daniel)
 -i810: Prevent underflow in dispatch ioctls (Dan)
 -komeda: Add ACLK sysfs attribute (Mihail)
 -v3d: Allow userspace to clean up after render jobs (Iago)
 
 Cross-subsystem Changes:
 -MAINTAINERS:
  -Add Alyssa & Steven as panfrost reviewers (Rob)
  -Add Jernej as DE2 reviewer (Maxime)
  -Add Chen-Yu as Allwinner maintainer (Maxime)
 -staging: Make some stack arrays static const (Colin)
 
 Core Changes:
 -ttm: Allow drivers to specify their vma manager (to use gem mgr) (Gerd)
 -docs: Various fixes in connector/encoder/bridge docs (Daniel, Lyude, Laurent)
 -connector: Allow more than 3 possible encoders for a connector (José)
 -dp_cec: Allow a connector to be associated with a cec device (Dariusz)
 -various: Fix some compile/sparse warnings (Ville)
 -mm: Ensure mm node removals are properly serialised (Chris)
 -panel: Specify the type of panel for drm_panels for later use (Laurent)
 -panel: Use drm_panel_init to init device and funcs (Laurent)
 -mst: Refactors and cleanups in anticipation of suspend/resume support (Lyude)
 -vram:
  -Add lazy unmapping for gem bo's (Thomas)
  -Unify and rationalize vram mm and gem vram (Thomas)
  -Expose vmap and vunmap for gem vram objects (Thomas)
  -Allow objects to be pinned at the top of vram to avoid fragmentation (Thomas)
 
 Driver Changes:
 -various: Include drm_bridge.h instead of relying on drm_crtc.h (Boris)
 -ast/mgag200: Refactor show_cursor(), move cursor to top of video mem (Thomas)
 -komeda:
  -Add error event printing (behind CONFIG) and reg dump support (Lowry)
  -Add suspend/resume support (Lowry)
  -Workaround D71 shadow registers not flushing on disable (Lowry)
 -meson: Add suspend/resume support (Neil)
 -omap: Miscellaneous refactors and improvements (Tomi/Jyri)
 -panfrost/shmem: Silence lockdep by using mutex_trylock (Rob)
 -panfrost: Miscellaneous small fixes (Rob/Steven)
 -sti: Fix warnings (Benjamin/Linus)
 -sun4i:
  -Add vcc-dsi regulator to sun6i_mipi_dsi (Jagan)
  -A few patches to figure out the DRQ/start delay calc on dsi (Jagan/Icenowy)
 -virtio:
  -Add module param to switch resource reuse workaround on/off (Gerd)
  -Avoid calling vmexit while holding spinlock (Gerd)
  -Use gem shmem helpers instead of ttm (Gerd)
  -Accommodate command buffer allocations too big for cma (David)
 
 Cc: Rob Herring <robh@kernel.org>
 Cc: Maxime Ripard <mripard@kernel.org>
 Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
 Cc: Gerd Hoffmann <kraxel@redhat.com>
 Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
 Cc: Lyude Paul <lyude@redhat.com>
 Cc: José Roberto de Souza <jose.souza@intel.com>
 Cc: Dariusz Marcinkiewicz <darekm@google.com>
 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
 Cc: Raymond Smith <raymond.smith@arm.com>
 Cc: Chris Wilson <chris@chris-wilson.co.uk>
 Cc: Colin Ian King <colin.king@canonical.com>
 Cc: Thomas Zimmermann <tzimmermann@suse.de>
 Cc: Dan Carpenter <dan.carpenter@oracle.com>
 Cc: Mihail Atanassov <Mihail.Atanassov@arm.com>
 Cc: Lowry Li <Lowry.Li@arm.com>
 Cc: Neil Armstrong <narmstrong@baylibre.com>
 Cc: Jyri Sarha <jsarha@ti.com>
 Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
 Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
 Cc: Steven Price <steven.price@arm.com>
 Cc: Benjamin Gaignard <benjamin.gaignard@st.com>
 Cc: Linus Walleij <linus.walleij@linaro.org>
 Cc: Jagan Teki <jagan@amarulasolutions.com>
 Cc: Icenowy Zheng <icenowy@aosc.io>
 Cc: Iago Toral Quiroga <itoral@igalia.com>
 Cc: David Riley <davidriley@chromium.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl2d9h8ACgkQcywAJXLc
 r3ms5gf9HIFpqwJ16CqaRukSnpcBcDoYUM8DGrOic+vw2bw14BQwFqvEOqrCkKL4
 V6h/OCJlNFPtOcc1LvU/jeXxYf4AQWh/2qZeg+oee7HAGX5x8Y3f08GsEjO8+55t
 QvSVxCKVti04M1ErPRfKrM7KPVE+IC+KdY26nO8Bf5zDGeCAkiPIDrdh2aZGMRdC
 Eer0DJ96cgWW9LrhseCdj5nKwcR78DlbWa79zuPAss4LaBBbXqThNXYYzg/mZMKB
 +VYgzs48tGYKK1NXXJ6biVI3brHrM52bqv5JpIncD5HepF1oIartWOMnbAO7MAqh
 h/tgJWxL+4bnl9aqY87by1BtyVgl3w==
 =kaOE
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-10-09-2' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.5:

UAPI Changes:
-Colorspace: Expose different prop values for DP vs. HDMI (Gwan-gyeong Mun)
-fourcc: Add DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED (Raymond)
-not_actually: s/ENOTSUPP/EOPNOTSUPP/ in drm_edid and drm_mipi_dbi. This should
    not reach userspace, but adding here to specifically call that out (Daniel)
-i810: Prevent underflow in dispatch ioctls (Dan)
-komeda: Add ACLK sysfs attribute (Mihail)
-v3d: Allow userspace to clean up after render jobs (Iago)

Cross-subsystem Changes:
-MAINTAINERS:
 -Add Alyssa & Steven as panfrost reviewers (Rob)
 -Add Jernej as DE2 reviewer (Maxime)
 -Add Chen-Yu as Allwinner maintainer (Maxime)
-staging: Make some stack arrays static const (Colin)

Core Changes:
-ttm: Allow drivers to specify their vma manager (to use gem mgr) (Gerd)
-docs: Various fixes in connector/encoder/bridge docs (Daniel, Lyude, Laurent)
-connector: Allow more than 3 possible encoders for a connector (José)
-dp_cec: Allow a connector to be associated with a cec device (Dariusz)
-various: Fix some compile/sparse warnings (Ville)
-mm: Ensure mm node removals are properly serialised (Chris)
-panel: Specify the type of panel for drm_panels for later use (Laurent)
-panel: Use drm_panel_init to init device and funcs (Laurent)
-mst: Refactors and cleanups in anticipation of suspend/resume support (Lyude)
-vram:
 -Add lazy unmapping for gem bo's (Thomas)
 -Unify and rationalize vram mm and gem vram (Thomas)
 -Expose vmap and vunmap for gem vram objects (Thomas)
 -Allow objects to be pinned at the top of vram to avoid fragmentation (Thomas)

Driver Changes:
-various: Include drm_bridge.h instead of relying on drm_crtc.h (Boris)
-ast/mgag200: Refactor show_cursor(), move cursor to top of video mem (Thomas)
-komeda:
 -Add error event printing (behind CONFIG) and reg dump support (Lowry)
 -Add suspend/resume support (Lowry)
 -Workaround D71 shadow registers not flushing on disable (Lowry)
-meson: Add suspend/resume support (Neil)
-omap: Miscellaneous refactors and improvements (Tomi/Jyri)
-panfrost/shmem: Silence lockdep by using mutex_trylock (Rob)
-panfrost: Miscellaneous small fixes (Rob/Steven)
-sti: Fix warnings (Benjamin/Linus)
-sun4i:
 -Add vcc-dsi regulator to sun6i_mipi_dsi (Jagan)
 -A few patches to figure out the DRQ/start delay calc on dsi (Jagan/Icenowy)
-virtio:
 -Add module param to switch resource reuse workaround on/off (Gerd)
 -Avoid calling vmexit while holding spinlock (Gerd)
 -Use gem shmem helpers instead of ttm (Gerd)
 -Accommodate command buffer allocations too big for cma (David)

Cc: Rob Herring <robh@kernel.org>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Dariusz Marcinkiewicz <darekm@google.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Raymond Smith <raymond.smith@arm.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mihail Atanassov <Mihail.Atanassov@arm.com>
Cc: Lowry Li <Lowry.Li@arm.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Jyri Sarha <jsarha@ti.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Benjamin Gaignard <benjamin.gaignard@st.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Jagan Teki <jagan@amarulasolutions.com>
Cc: Icenowy Zheng <icenowy@aosc.io>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: David Riley <davidriley@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Thu 10 Oct 2019 01:00:47 AM AEST
# gpg:                using RSA key 732C002572DCAF79
# gpg: Can't check signature: public key not found

# Conflicts:
#	drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
#	drivers/gpu/drm/i915/i915_drv.c
#	drivers/gpu/drm/i915/i915_gem.c
#	drivers/gpu/drm/i915/i915_gem_gtt.c
#	drivers/gpu/drm/i915/i915_vma.c
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009150825.GA227673@art_vandelay
2019-10-11 09:30:53 +10:00
Nirmoy Das
083164dbdb drm/amdgpu: fix memory leak
cleanup error handling code and make sure temporary info array
with the handles are freed by amdgpu_bo_list_put() on
idr_replace()'s failure.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-09 11:45:59 -05:00
Ori Messinger
ad02e08e05 drm/amdgpu: Report vram vendor with sysfs (v3)
The vram vendor can be found as a separate sysfs file at:
/sys/class/drm/card[X]/device/mem_info_vram_vendor
The vram vendor is displayed as a string value.

v2: Use correct bit masking, and cache vram_vendor in gmc
v3: Drop unused functions for vram width, type, and vendor

Signed-off-by: Ori Messinger <ori.messinger@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:11:07 -05:00
YueHaibing
8f49c8220b drm/amdgpu: remove duplicated include from mmhub_v1_0.c
Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:10:23 -05:00
Alex Deucher
71f98027f2 drm/amdgpu: move amdgpu_device_get_job_timeout_settings
It's only used in amdgpu_device.c and the naming also
reflects that.  Move it there.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:10:20 -05:00
Felix Kuehling
1995b3a35f drm/amdgpu: Fix error handling in amdgpu_ras_recovery_init
Don't set a struct pointer to NULL before freeing its members. It's
hard to see what's happening due to a local pointer-to-pointer data
aliasing con->eh_data.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Philip Cox <Philip.Cox@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:09:33 -05:00
Colin Ian King
317a8d9eb6 drm/amdgpu: remove redundant variable r and redundant return statement
There is a return statement that is not reachable and a variable that
is not used.  Remove them.

Addresses-Coverity: ("Structurally dead code")
Fixes: de7b45babd ("drm/amdgpu: cleanup creating BOs at fixed location (v2)")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:14 -05:00
Colin Ian King
17cf678a33 drm/amdgpu: fix uninitialized variable pasid_mapping_needed
The boolean variable pasid_mapping_needed is not initialized and
there are code paths that do not assign it any value before it is
is read later.  Fix this by initializing pasid_mapping_needed to
false.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 6817bf283b ("drm/amdgpu: grab the id mgr lock while accessing passid_mapping")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:14 -05:00
Leo Liu
d0312d0dca drm/amdgpu: add code comment in vcn_v2_5_hw_init
Add a comment to VCN 2.5 encode ring

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:53:08 -05:00
Leo Liu
fd287c8cd2 drm/amdgpu/vcn: use amdgpu_ring_test_helper
Instead of amdgpu_ring_test_ring, so the helper function determines
whether the ring is ready

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-04 08:52:53 -05:00
Alex Deucher
8a745c7ff2 drm/amdgpu: improve MSI-X handling (v3)
Check the number of supported vectors and fall back to MSI if
we return or error or 0 MSI-X vectors.

v2: only allocate one vector.  We can't currently use more than
one anyway.

v3: install the irq on vector 0.

Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Shaoyun liu  <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 13:51:19 -05:00
Maxime Ripard
4092de1ba3
Merge drm/drm-next into drm-misc-next
We haven't done any backmerge for a while due to the merge window, and it
starts to become an issue for komeda. Let's bring 5.4-rc1 in.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2019-10-03 16:38:50 +02:00
Arnd Bergmann
324fb7adf6 drm/amdgpu: hide another #warning
An earlier patch of mine disabled some #warning statements
that get in the way of build testing, but then another
instance was added around the same time.

Remove that as well.

Fixes: b5203d16ae ("drm/amd/amdgpu: hide #warning for missing DC config")
Fixes: e1c14c4339 ("drm/amdgpu: Enable DC on Renoir")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
Arnd Bergmann
128a01f472 drm/amdgpu: make pmu support optional, again
When CONFIG_PERF_EVENTS is disabled, we cannot compile the pmu
portion of the amdgpu driver:

drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:48:38: error: no member named 'hw' in 'struct perf_event'
        struct hw_perf_event *hwc = &event->hw;
                                     ~~~~~  ^
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:51:13: error: no member named 'attr' in 'struct perf_event'
        if (event->attr.type != event->pmu->type)
            ~~~~~  ^
...

The same bug was already fixed by commit d155bef063 ("amdgpu: make pmu
support optional") but broken again by what looks like an incorrectly
rebased patch.

Fixes: 64f55e6292 ("drm/amdgpu: Add RAS EEPROM table.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
yu kuai
2e0db9dec2 drm/amdgpu: remove set but not used variable 'pipe'
Fixes gcc '-Wunused-but-set-variable' warning:

rivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c: In function
‘amdgpu_gfx_graphics_queue_acquire’:
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:234:16: warning:
variable ‘pipe’ set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:06 -05:00
Navid Emamdoost
1104057562 drm/amdgpu: fix multiple memory leaks in acp_hw_init
In acp_hw_init there are some allocations that needs to be released in
case of failure:

1- adev->acp.acp_genpd should be released if any allocation attemp for
adev->acp.acp_cell, adev->acp.acp_res or i2s_pdata fails.
2- all of those allocations should be released if
mfd_add_hotplug_devices or pm_genpd_add_device fail.
3- Release is needed in case of time out values expire.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Alex Deucher
2c9a0c66d5 drm/amdgpu: don't increment vram lost if we are in hibernation
We reset the GPU as part of our hibernation sequence so we need
to make sure we don't mark vram as lost in that case.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111879
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
shaoyunl
bd660f4f11 drm/amdgpu : enable msix for amdgpu driver
We might used out of the msi resources in some cloud project
which have a lot gpu devices(including PF and VF), msix can
provide enough resources from system level view

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
e7956997b1 drm/amdgpu: Export setup_vm_pt_regs() logic for mmhub 2.0
The KFD code will call this function later.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
923c087a1f drm/amdgpu: Add the HDP flush support for Navi
The HDP flush support code was missing in the nbio and nv files.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
e392c887df drm/amdkfd: Use array to probe kfd2kgd_calls
This is the same idea as the kfd device info probe and move all the
probe control together for easy maintenance.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
47c5ab6ca0 drm/amdkfd: Delete unnecessary function declarations
Ajust the function sequences so that those function delcarations are not
needed any more.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
1456482bf8 drm/amdgpu: Delete useless header file reference
Those header file includes are not needed.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Jack Zhang
21889cec0a drm/amd/amdgpu/sriov ip block setting of Arcturus
Add ip block setting for Arcturus SRIOV

1.PSP need to be initialized before IH.
2.SMU doesn't need to be initialized at kmd driver.
3.Arcturus doesn't support DCE hardware,it needs to skip
  register access to DCE.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Marek Olšák
cf21e76a60 drm/amdgpu: return tcc_disabled_mask to userspace
UMDs need this for correct programming of harvested chips.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Lyude Paul
f8d2d39eb4 drm/amdgpu: Iterate through DRM connectors correctly
Currently, every single piece of code in amdgpu that loops through
connectors does it incorrectly and doesn't use the proper list iteration
helpers, drm_connector_list_iter_begin() and
drm_connector_list_iter_end(). Yeesh.

So, do that.

Cc: Juston Li <juston.li@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Prike Liang
88d802500a drm/amdkfd: fix kgd2kfd_device_init() definition conflict error
The patch c670707 drm/amd: Pass drm_device to kfd introduced this issue and
fix the following compiler error.

  CC [M]  drivers/gpu/drm/amd/amdgpu//../powerplay/smumgr/fiji_smumgr.o
drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.c:746:6: error: conflicting types for ‘kgd2kfd_device_init’
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
      ^
In file included from drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.c:23:0:
drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.h:253:6: note: previous declaration of ‘kgd2kfd_device_init’ was here
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
      ^
scripts/Makefile.build:273: recipe for target 'drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.o' failed
make[1]: *** [drivers/gpu/drm/amd/amdgpu//amdgpu_amdkfd.o] Error 1

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Kenneth Feng
227f7d58d7 drm/amd/amdgpu: add IH cg support on soc15 project
enable/disable IH clock gating on soc15 projects.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
b2100ce1db drm/amdkfd: Use setup_vm_pt_regs function from base driver in KFD
This was done on GFX9 previously, now do it for GFX10.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
286b789e1e drm/amdgpu: Export setup_vm_pt_regs() logic for gfxhub 2.0
The KFD code will call this function later.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
56fc40aba4 drm/amdkfd: Eliminate get_atc_vmid_pasid_mapping_valid
get_atc_vmid_pasid_mapping_valid() is very similar to
get_atc_vmid_pasid_mapping_pasid(), so they can be merged into a new
function get_atc_vmid_pasid_mapping_info() to reduce register access
times. More importantly, getting the PASID and the valid bit atomically
with a single read fixes some potential race conditions where the
mapping changes between the two reads.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
d19eb6aca7 drm/amdkfd: Delete unused defines
They are not used anywhere.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Harish Kasiviswanathan
3a0c342392 drm/amd: Pass drm_device to kfd
kfd needs drm_device to call into drm_cgroup functions

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
b55a8b8b41 drm/amdkfd: Use better name for sdma queue non HWS path
The old name is prone to confusion. The register offset is for a RLC queue
rather than a SDMA engine. The value is not a base address, but a
register offset.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
9941a6bfbd drm/amdkfd: Delete useless SDMA register setting on non HWS path
HW folks have confirm that we should not touch RESUME_CTX of
SDMA*_GFX_CONTEXT_CNTL when manipulating RLC queues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Christian König
56f074d815 drm/amdgpu: restrict hotplug error message
We should print the error only when we are hotplugged and crash
basically all userspace applications.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Christian König
4a24652867 drm/amdgpu: once more fix amdgpu_bo_create_kernel_at
When CPU access is needed we should tell that to
amdgpu_bo_create_reserved() or otherwise the access is denied later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
3d8361b11c drm/amdgpu: add comments in ras interrupt callback
add comments to clarify why checking GFX IP BLOCK for each ras interrupt callback

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
ba08349214 drm/amdgpu: implement common gmc_ras_late_init
common gmc_ecc_late_init can be shared among all generations of gmc

v2: rename gmc_ecc_late_init to gmc_ras_late_init

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Tao Zhou
be5b39d87a drm/amdgpu: move xgmi ras fini to xgmi block
it's more suitable to put xgmi ras fini in xgmi block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
196041205c drm/amdgpu: move mmhub ras fini to mmhub block
it's more suitable to put mmhub ras fini in mmhub block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
181c93e5ec drm/amdgpu: move umc ras fini to umc block
it's more suitable to put umc ras fini in umc block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
f2575941e6 drm/amdgpu: add ras fini for xgmi
add ras fini for xgmi to cleanup xgmi ras framework

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
de9bbd5273 drm/amdgpu: add ras fini for nbio
add a common nbio ras fini implementation to cleanup nbio ras framework

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
0771b0bf07 drm/amdgpu: simplify the access to eeprom_control struct
simplify the code of accessing to eeprom_control struct

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
41190cd733 drm/amdgpu: remove ih_info parameter of gfx_ras_late_init
gfx_ras_late_init can get the info by itself

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
56c54b25c3 drm/amdgpu: remove ih_info parameter of umc_ras_late_init
umc_ras_late_init can get the info by itself

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
e536c81850 drm/amdgpu: add common sdma_ras_fini function
sdma_ras_fini can be shared among all generations of sdma

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
3b7b7647be drm/amdgpu: add common gfx_ras_fini function
gfx_ras_fini can be shared among all generations of gfx

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
2adf13440a drm/amdgpu: add common gmc_ras_fini function
gmc_ras_fini can be shared among all generations of gmc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
65bc47a659 drm/amdgpu: move mmhub_ras_if from gmc to mmhub block
mmhub_ras_if is relevant to mmhub

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
d65bf1f8a7 drm/amdgpu: replace mmhub_funcs with mmhub.funcs
remove mmhub_funcs in adev

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
d3a5a121b8 drm/amdgpu: add common mmhub member for adev
put mmhub_funcs and ras_if pointer into mmhub struct

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
03740baab3 drm/amdgpu: move umc_ras_if from gmc to umc block
umc_ras_if is relevant to umc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
fc04e6b484 drm/amdgpu: refine sdma4 ras_data_cb
simplify code logic and refine return value

v2: remove unused error source code

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
4c65dd1041 drm/amdgpu: move sdma ecc functions to generic sdma file
sdma ras ecc functions can be reused among all sdma generations

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
725253ab9b drm/amdgpu: move gfx ecc functions to generic gfx file
gfx ras ecc common functions could be reused among all gfx generations

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
34cc4fd9ff drm/amdgpu: move umc ras irq functions to umc block
move umc ras irq functions from gmc v9 to generic umc block, these
functions are relevant to umc and they can be shared among all
generations of umc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Tao Zhou
f5f06e21e9 drm/amdgpu: update parameter of ras_ih_cb
change struct ras_err_data *err_data to void *err_data, align with
umc code and the callback's declaration in each ras block could
pay no attention to the structure type

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Monk Liu
e7da754b00 drm/amdgpu: fix an UMC hw arbitrator bug(v3)
issue:
the UMC6 h/w bug is that when MCLK is doing the switch
in the middle of a page access being preempted by high
priority client (e.g. DISPLAY) then UMC and the mclk switch
would stuck there due to deadlock

how:
fixed by disabling auto PreChg for UMC to avoid high
priority client preempting other client's access on
the same page, thus the deadlock could be avoided

v2:
put the patch in callback of UMC6
v3:
rename the callback to "init_registers"

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Marek Olšák
6de088a08d drm/amdgpu: remove gfx9 NGG
Never used.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Alex Deucher
631cdbd27e drm/amdgpu/atomfirmware: simplify the interface to get vram info
fetch both the vram type and width in one function call.  This
avoids having to parse the same data table twice to get the two
pieces of data.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Alex Deucher
bd5520273c drm/amdgpu/atomfirmware: use proper index for querying vram type (v3)
The index is stored in scratch register 4 after asic init.  Use
that index.  No functional change since all asics in a family
use the same type of vram (G5, G6, HBM) and that is all we use
at the monent, but if we ever need to query other info, we will
now have the proper index.

v2: module array is variable sized, handle that.
v3: fix off by one in array handling

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Shirish S
52510a4035 drm/amdgpu/psp: silence response status warning
log the response status related error to the driver's
debug log since  psp response status is not 0 even though
there was no problem while the command was submitted.

This warning misleads, hence this change.

Signed-off-by: Shirish S <shirish.s@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Jesse Zhang
df99ac0fcc drm/amd/amdgpu:Fix compute ring unable to detect hang.
When compute fence did not signal, compute ring cannot detect hardware hang
because its timeout value is set to be infinite by default.

In SR-IOV and passthrough mode, if user does not declare custome timeout
value for compute ring, then use gfx ring timeout value as default. So
that when there is a ture hardware hang, compute ring can detect it.

Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
chen gong
90a08351f7 drm/amdgpu: Use mode2 mode to perform GPU RESET for Renoir
Renoir need to use mode2 mode to implement GPU RESET

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
40463bdc22 drm/amdkfd: Sync gfx10 kfd2kgd_calls function pointers
get_hive_id was not set. Also, adjust the function setting sequence.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
c637b36aea drm/amdkfd: Fix NULL pointer dereference for set_scratch_backing_va()
Currently this function pointer is missing for GFX10. Considering it is
a void function since GFX9, fix it by checking the function pointer
before dereferencing it.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
812330eb69 drm/amdkfd: Add an error print if SDMA RLC is not idle
The message will be useful when troubleshooting the issues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
05ba0095fb drm/amdgpu: correct condition check for psp rlc autoload
Otherwise non-autoload case will go into the wrong routine and fail.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Hawking Zhang
1f01cd9905 drm/amdgpu: add command id in psp response failure message
For better clarification of issue.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
90c88dab8e drm/amdgpu: enable psp front door loading by default on Arcturus
Front door firmware loading is done via the psp rather than the
driver.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Le Ma
9a018e5a85 drm/amdgpu: disable vcn ip block for front door loading on Arcturus
Needs more work to enable via front door loading.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Tianci.Yin
4db37544ce drm/amdgpu/gfx10: add support for wks firmware loading
load different cp firmware according to the DID and RID

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
f77c7109c0 drm/amdgpu/ras: fix and update the documentation for RAS
Add new sections to amdgpu.rst, fix up formatting issues,
add additional documentation to each section.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
a667b75c1e drm/amdgpu: fix documentation for amdgpu_pm.c
Fix DOC link name, clean up formatting in pp_dpm_* section.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
fc9c7f8470 drm/amdgpu/ih: fix documentation in amdgpu_irq_dispatch
Fix parameters.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
1d614ded87 drm/amdgpu/vm: fix up documentation in amdgpu_vm.c
Missing parameters, wrong comment type, etc.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
4d8e54d2b9 drm/amdgpu/mn: fix documentation for amdgpu_mn_read_lock
Document the new parameter.

Fixes: 93065ac753 ("mm, oom: distinguish blockable mode for mmu notifiers")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
ebc52c1692 drm/amdgpu: fix documentation for amdgpu_gem_prime_export
Drop extra function parameter.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
yu kuai
d0580c09c6 drm/amdgpu: remove excess function parameter description
Fixes gcc warning:

drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:431: warning: Excess function
parameter 'sw' description in 'vcn_v2_5_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:550: warning: Excess function
parameter 'sw' description in 'vcn_v2_5_enable_clock_gating'

Fixes: cbead2bdfc ("drm/amdgpu: add VCN2.5 VCPU start and stop")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Guchun Chen
e53aec7e41 drm/amdgpu: enable full ras by default
Enable full ras by default, user does not need to enable it by
boot parameter.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Jiange Zhao
57d4f3b7fd drm/amdgpu/SRIOV: add navi12 pci id for SRIOV (v2)
Add Navi12 PCI id support.

v2: flag as experimental for now (Alex)

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
7677b0dbce drm/amdgpu/gfx10: update gfx golden settings for navi14
update registers: mmUTCL1_CTRL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
aa4604b6e4 drm/amdgpu/gfx10: update gfx golden settings
update registers: mmUTCL1_CTRL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
ade9a34e7d drm/amdgpu: flag navi12 and 14 as experimental for 5.4
We can remove this later as things get closer to launch.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
01b40c98ed drm/amdgpu/psp: invalidate the hdp read cache before reading the psp response
Otherwise we may get stale data.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
e8186eeccb drm/amdgpu/psp: flush HDP write fifo after submitting cmds to the psp
We need to make sure the fifo is flushed before we ask the psp to
process the commands.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Guchun Chen
5222d26146 drm/amdgpu: remove redundant variable definition
No need to define the same variables in each loop of the function.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Guchun Chen
8a3e801f19 drm/amdgpu: avoid null pointer dereference
null ptr should be checked first to avoid null ptr access

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Hawking Zhang
fec6a08aae drm/amdgpu: do not init mec2 jt for renoir
For ASICs like renoir/arct, driver doesn't need to load mec2 jt.
when mec1 jt is loaded, mec2 jt will be loaded automatically
since the write is actaully broadcasted to both.

We need to more time to test other gfx9 asic. but for now we should
be able to draw conclusion that mec2 jt is not needed for renoir and
arct.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Hawking Zhang
2011eaea21 drm/amdgpu: add psp ip block for arct
enable psp block for firmware loading and other security
feature setup.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Alex Deucher
a142ba8800 drm/amdgpu/ras: use GPU PAGE_SIZE/SHIFT for reserving pages
We are reserving vram pages so they should be aligned to the
GPU page size.

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Xiaojie Yuan
ec51d3facd drm/amdgpu/discovery: get gpu info from ip discovery table
except soc_bounding_box which is not integrated in discovery table yet

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tao Zhou
afa44809a4 drm/amdgpu: use GPU PAGE SHIFT for umc retired page
umc retired page belongs to vram and it should be aligned to gpu page
size

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tianci.Yin
57516cdd74 drm/amdgpu: add navi12 pci id
Add Navi12 PCI id support.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Tao Zhou
ae115c81ec drm/amdgpu: replace DRM_ERROR with DRM_WARN in ras_reserve_bad_pages
There are two cases of reserve error should be ignored:
1) a ras bad page has been allocated (used by someone);
2) a ras bad page has been reserved (duplicate error injection for one page);

DRM_ERROR is unnecessary for the failure of bad page reserve

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:59 -05:00
Adam Zerella
879e723df3 docs: drm/amdgpu: Resolve build warnings
Some of the documentation formatting could be improved
which will resolve some Sphinx amdgpu build warnings e.g

WARNING: Unexpected indentation.
WARNING: Block quote ends without a blank line; unexpected unindent.
WARNING: Inline emphasis start-string without end-string.

Signed-off-by: Adam Zerella <adam.zerella@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Alex Deucher
63b2b5e91b drm/amdgpu/vm: fix documentation for amdgpu_vm_bo_param
Add new parameters.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Bhawanpreet Lakha
143f230533 drm/amdgpu: psp DTM init
DTM is the display topology manager. This is needed to communicate with
psp about the display configurations.

This patch adds
    -Loading the firmware
    -The functions and definitions for communication with the firmware

v2: Fix formatting

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Bhawanpreet Lakha
ed19a9a2bb drm/amdgpu: psp HDCP init
This patch adds
-Loading the firmware
-The functions and definitions for communication with the firmware

v2: Fix formatting

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:10:58 -05:00
Arnd Bergmann
29174a4310 drm/amdgpu: hide another #warning
An earlier patch of mine disabled some #warning statements
that get in the way of build testing, but then another
instance was added around the same time.

Remove that as well.

Fixes: b5203d16ae ("drm/amd/amdgpu: hide #warning for missing DC config")
Fixes: e1c14c4339 ("drm/amdgpu: Enable DC on Renoir")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:33 -05:00
Arnd Bergmann
ec3e5c0f0c drm/amdgpu: make pmu support optional, again
When CONFIG_PERF_EVENTS is disabled, we cannot compile the pmu
portion of the amdgpu driver:

drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:48:38: error: no member named 'hw' in 'struct perf_event'
        struct hw_perf_event *hwc = &event->hw;
                                     ~~~~~  ^
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:51:13: error: no member named 'attr' in 'struct perf_event'
        if (event->attr.type != event->pmu->type)
            ~~~~~  ^
...

The same bug was already fixed by commit d155bef063 ("amdgpu: make pmu
support optional") but broken again by what looks like an incorrectly
rebased patch.

Fixes: 64f55e6292 ("drm/amdgpu: Add RAS EEPROM table.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:33 -05:00
Navid Emamdoost
57be09c6e8 drm/amdgpu: fix multiple memory leaks in acp_hw_init
In acp_hw_init there are some allocations that needs to be released in
case of failure:

1- adev->acp.acp_genpd should be released if any allocation attemp for
adev->acp.acp_cell, adev->acp.acp_res or i2s_pdata fails.
2- all of those allocations should be released if
mfd_add_hotplug_devices or pm_genpd_add_device fail.
3- Release is needed in case of time out values expire.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:33 -05:00
Marek Olšák
815fb4c9d7 drm/amdgpu: return tcc_disabled_mask to userspace
UMDs need this for correct programming of harvested chips.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:33 -05:00
Alex Deucher
49379032aa drm/amdgpu: don't increment vram lost if we are in hibernation
We reset the GPU as part of our hibernation sequence so we need
to make sure we don't mark vram as lost in that case.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111879
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:58:19 -05:00
Christian König
69f08e68af drm/amdgpu: revert "disable bulk moves for now"
This reverts commit a213c2c7e2.

The changes to fix this should have landed in 5.1.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zhou, David(ChunMing) <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02 12:23:03 -05:00
Linus Torvalds
289991ce1c drm fixes for 5.4-rc1
core:
 - Some cleanups and fixes in the self-refresh helpers
 - Some cleanups and fixes in the atomic helpers
 
 amdgpu:
 - Fix a 64 bit divide
 - Prevent a memory leak in a failure case in dc
 - Load proper gfx firmware on navi14 variants
 - Add more navi12 and navi14 PCI ids
 - Misc fixes for renoir
 - Fix bandwidth issues with multiple displays on vega20
 - Support for Dali
 - Fix a possible oops with KFD on hawaii
 - Fix for backlight level after resume on some APUs
 - Other misc fixes
 
 panfrost:
 - Multiple panfrost fixes for regulator support and page fault handling
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdjZr6AAoJEAx081l5xIa+nboP+gPWzx45Q3IsbnaZmcdFTFEf
 +/XgScoFcv5Uhd3aXtrSYDvPnSyNXpGsV5ccE/FtxNd4G75n20tPFxGNhjzyXfdc
 B2x1IRgc82W1KxYwwDlmd+f/86h6uthFkh1ToKN3hsHWNm8Wu8AgoJnoWvqwluf9
 natSFnQPQIvcADpbpyk8FiNcXvMg0qrKQ8aj3uPxqUs4/ftigzez+5vYJOkktoEJ
 NFtlouVvIZejVo6l4Q5ebXXsol7On02iHUszpdJtb5FxMcBQwAafewCGln2622cA
 8ooWmekZNtoHUH3WmUlrs7TqPKtoOqIEkMO8UvCJDwBB4/ft8sJRPDKFgk547E/8
 Htv6MZXCSOT+/XxebM/wHijOg3MQVjPzO9s73YSmkytzGZVQ/Fgohl/6W+bN/xAm
 j/huS5ZozengAldFJHG4wruSk820Vzx736x2pk+9sbpf7PdFDIpuZus8U8wHc411
 hu3S2397IxyX4XswLg8BTaIOhCXwT7CluQ9mYD1THPgRzG5YPha8JelTcwwlVsD9
 2Cw6mCUAqydHHMboWQnEhRXhuhVfGlPAdJTsdyoI6zdXYqU/ThihJPBgG0wSq0y0
 fAsj/9NRqSzg6hk9vm1QdCeOthKOuAZ0PgLcVHI1RNSwEyrN8yOupVwe7+Mn+q2z
 UNbfr2qXGqKxn6rqUy2W
 =yCyF
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Fixes built up over the past 1.5 weeks or so, it's two weeks of
  amdgpu, some core cleanups and some panfrost fixes. I also finally
  figured out why my desktop was slow to do a bunch of stuff (someone
  gave it an IPv6 address which can't reach anything!).

  core:
   - Some cleanups and fixes in the self-refresh helpers
   - Some cleanups and fixes in the atomic helpers

  amdgpu:
   - Fix a 64 bit divide
   - Prevent a memory leak in a failure case in dc
   - Load proper gfx firmware on navi14 variants
   - Add more navi12 and navi14 PCI ids
   - Misc fixes for renoir
   - Fix bandwidth issues with multiple displays on vega20
   - Support for Dali
   - Fix a possible oops with KFD on hawaii
   - Fix for backlight level after resume on some APUs
   - Other misc fixes

  panfrost:
   - Multiple panfrost fixes for regulator support and page fault
     handling"

* tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm: (34 commits)
  drm/amd/display: prevent memory leak
  drm/amdgpu/gfx10: add support for wks firmware loading
  drm/amdgpu/display: include slab.h in dcn21_resource.c
  drm/amdgpu/display: fix 64 bit divide
  drm/panfrost: Prevent race when handling page fault
  drm/panfrost: Remove NULL checks for regulator
  drm/panfrost: Fix regulator_get_optional() misuse
  drm: Measure Self Refresh Entry/Exit times to avoid thrashing
  drm: Fix kerneldoc and remove unused struct member in self_refresh helper
  drm/atomic: Rename crtc_state->pageflip_flags to async_flip
  drm/atomic: Reject FLIP_ASYNC unconditionally
  drm/atomic: Take the atomic toys away from X
  drm/amdgpu: flag navi12 and 14 as experimental for 5.4
  drm/kms: Duct-tape for mode object lifetime checks
  drm/amdgpu: add navi12 pci id
  drm/amdgpu: add navi14 PCI ID for work station SKU
  drm/amdkfd: Swap trap temporary registers in gfx10 trap handler
  drm/amd/powerplay: implement sysfs for getting dpm clock
  drm/amd/display: Restore backlight brightness after system resume
  drm/amd/display: Implement voltage limitation for dali
  ...
2019-09-27 11:13:35 -07:00
Andrey Konovalov
35f3fc87be drm/amdgpu: untag user pointers
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.

In amdgpu_gem_userptr_ioctl() and amdgpu_amdkfd_gpuvm.c/init_user_pages()
an MMU notifier is set up with a (tagged) userspace pointer.  The untagged
address should be used so that MMU notifiers for the untagged address get
correctly matched up with the right BO.  This patch untag user pointers in
amdgpu_gem_userptr_ioctl() for the GEM case and in amdgpu_amdkfd_gpuvm_
alloc_memory_of_gpu() for the KFD case.  This also makes sure that an
untagged pointer is passed to amdgpu_ttm_tt_get_user_pages(), which uses
it for vma lookups.

Link: http://lkml.kernel.org/r/d684e1df08f2ecb6bc292e222b64fa9efbc26e69.1563904656.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Jens Wiklander <jens.wiklander@linaro.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:41 -07:00
Tianci.Yin
1e94b43813 drm/amdgpu/gfx10: add support for wks firmware loading
load different cp firmware according to the DID and RID

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-24 13:25:15 -05:00
Linus Torvalds
84da111de0 hmm related patches for 5.4
This is more cleanup and consolidation of the hmm APIs and the very
 strongly related mmu_notifier interfaces. Many places across the tree
 using these interfaces are touched in the process. Beyond that a cleanup
 to the page walker API and a few memremap related changes round out the
 series:
 
 - General improvement of hmm_range_fault() and related APIs, more
   documentation, bug fixes from testing, API simplification &
   consolidation, and unused API removal
 
 - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE, and
   make them internal kconfig selects
 
 - Hoist a lot of code related to mmu notifier attachment out of drivers by
   using a refcount get/put attachment idiom and remove the convoluted
   mmu_notifier_unregister_no_release() and related APIs.
 
 - General API improvement for the migrate_vma API and revision of its only
   user in nouveau
 
 - Annotate mmu_notifiers with lockdep and sleeping region debugging
 
 Two series unrelated to HMM or mmu_notifiers came along due to
 dependencies:
 
 - Allow pagemap's memremap_pages family of APIs to work without providing
   a struct device
 
 - Make walk_page_range() and related use a constant structure for function
   pointers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl1/nnkACgkQOG33FX4g
 mxqaRg//c6FqowV1pQlLutvAOAgMdpzfZ9eaaDKngy9RVQxz+k/MmJrdRH/p/mMA
 Pq93A1XfwtraGKErHegFXGEDk4XhOustVAVFwvjyXO41dTUdoFVUkti6ftbrl/rS
 6CT+X90jlvrwdRY7QBeuo7lxx7z8Qkqbk1O1kc1IOracjKfNJS+y6LTamy6weM3g
 tIMHI65PkxpRzN36DV9uCN5dMwFzJ73DWHp1b0acnDIigkl6u5zp6orAJVWRjyQX
 nmEd3/IOvdxaubAoAvboNS5CyVb4yS9xshWWMbH6AulKJv3Glca1Aa7QuSpBoN8v
 wy4c9+umzqRgzgUJUe1xwN9P49oBNhJpgBSu8MUlgBA4IOc3rDl/Tw0b5KCFVfkH
 yHkp8n6MP8VsRrzXTC6Kx0vdjIkAO8SUeylVJczAcVSyHIo6/JUJCVDeFLSTVymh
 EGWJ7zX2iRhUbssJ6/izQTTQyCH3YIyZ5QtqByWuX2U7ZrfkqS3/EnBW1Q+j+gPF
 Z2yW8iT6k0iENw6s8psE9czexuywa/Lttz94IyNlOQ8rJTiQqB9wLaAvg9hvUk7a
 kuspL+JGIZkrL3ouCeO/VA6xnaP+Q7nR8geWBRb8zKGHmtWrb5Gwmt6t+vTnCC2l
 olIDebrnnxwfBQhEJ5219W+M1pBpjiTpqK/UdBd92A4+sOOhOD0=
 =FRGg
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull hmm updates from Jason Gunthorpe:
 "This is more cleanup and consolidation of the hmm APIs and the very
  strongly related mmu_notifier interfaces. Many places across the tree
  using these interfaces are touched in the process. Beyond that a
  cleanup to the page walker API and a few memremap related changes
  round out the series:

   - General improvement of hmm_range_fault() and related APIs, more
     documentation, bug fixes from testing, API simplification &
     consolidation, and unused API removal

   - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE,
     and make them internal kconfig selects

   - Hoist a lot of code related to mmu notifier attachment out of
     drivers by using a refcount get/put attachment idiom and remove the
     convoluted mmu_notifier_unregister_no_release() and related APIs.

   - General API improvement for the migrate_vma API and revision of its
     only user in nouveau

   - Annotate mmu_notifiers with lockdep and sleeping region debugging

  Two series unrelated to HMM or mmu_notifiers came along due to
  dependencies:

   - Allow pagemap's memremap_pages family of APIs to work without
     providing a struct device

   - Make walk_page_range() and related use a constant structure for
     function pointers"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (75 commits)
  libnvdimm: Enable unit test infrastructure compile checks
  mm, notifier: Catch sleeping/blocking for !blockable
  kernel.h: Add non_block_start/end()
  drm/radeon: guard against calling an unpaired radeon_mn_unregister()
  csky: add missing brackets in a macro for tlb.h
  pagewalk: use lockdep_assert_held for locking validation
  pagewalk: separate function pointers from iterator data
  mm: split out a new pagewalk.h header from mm.h
  mm/mmu_notifiers: annotate with might_sleep()
  mm/mmu_notifiers: prime lockdep
  mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
  mm/mmu_notifiers: remove the __mmu_notifier_invalidate_range_start/end exports
  mm/hmm: hmm_range_fault() infinite loop
  mm/hmm: hmm_range_fault() NULL pointer bug
  mm/hmm: fix hmm_range_fault()'s handling of swapped out pages
  mm/mmu_notifiers: remove unregister_no_release
  RDMA/odp: remove ib_ucontext from ib_umem
  RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
  RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
  RDMA/mlx5: Use ib_umem_start instead of umem.address
  ...
2019-09-21 10:07:42 -07:00
Alex Deucher
e16a7cbced drm/amdgpu: flag navi12 and 14 as experimental for 5.4
We can remove this later as things get closer to launch.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-18 08:29:30 -05:00
Tianci.Yin
10e85054f9 drm/amdgpu: add navi12 pci id
Add Navi12 PCI id support.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:49:19 -05:00
Tianci.Yin
a82e163bca drm/amdgpu: add navi14 PCI ID for work station SKU
Add the navi14 PCI device id.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:45:17 -05:00
Felix Kuehling
dcafbd50f2 drm/amdgpu: Fix KFD-related kernel oops on Hawaii
Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a574 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:38:35 -05:00
Prike Liang
4f3a2c1077 drm/amd/amdgpu: power up sdma engine when S3 resume back
The sdma_v4 should be ungated when the IP resume back,
otherwise it will hang up and resume time out error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:37:58 -05:00
Trek
73d8e6c7b8 drm/amdgpu: Check for valid number of registers to read
Do not try to allocate any amount of memory requested by the user.
Instead limit it to 128 registers. Actually the longest series of
consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and
mmGB_MACROTILE_MODE0-15 (0x2644-0x2673).

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273
Signed-off-by: Trek <trek00@inbox.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:37:45 -05:00
Aaron Liu
df794f679b drm/amdgpu: remove program of lbpw for renoir
These is no LBPW on Renoir. So removing program of lbpw for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:26:24 -05:00
Andrey Grodzovsky
103efdc1ea drm/amdgpu: Remove clock gating restore.
Restoring clock gating break SMU opeartion afterwards, avoid
this until this further invistigated with SMU.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 14:25:01 -05:00
Christian König
de7b45babd drm/amdgpu: cleanup creating BOs at fixed location (v2)
The placement is something TTM/BO internal and the RAS code should
avoid touching that directly.

Add a helper to create a BO at a fixed location and use that instead.

v2: squash in fixes (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-17 08:06:54 -05:00
José Roberto de Souza
62afb4ad42 drm/connector: Allow max possible encoders to attach to a connector
Currently we restrict the number of encoders that can be linked to
a connector to 3, increase it to match the maximum number of encoders
that can be initialized(32).

To more effiently do that lets switch from an array of encoder ids to
bitmask.

v2: Fixing missed return on amdgpu_dm_connector_to_encoder()

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913232857.389834-2-jose.souza@intel.com
2019-09-16 15:13:53 -07:00
Andrey Grodzovsky
db338e1663 drm/amdgpu:Fix EEPROM checksum calculation.
Fix checksum calculation after manually resetting the table.
Unify reset and empty EEPROM init flow.
Protect the table reset with lock.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:44 -05:00
Guchun Chen
012dd14d1d drm/amdgpu: fix ras ctrl debugfs node leak
Use debugfs_remove_recursive to remove the whole debugfs
directory instead of removing the node one by one.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:38 -05:00
Christian König
1313dacfad drm/amdgpu: trace if a PD/PT update is done directly
This is usfull for debugging.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:32 -05:00
Christian König
bc51c1e56f drm/amdgpu: drop double HDP flush in the VM code
Already done in the CPU based backend code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:27 -05:00
Christian König
fc39d903eb drm/amdgpu: cleanup coding style in the VM code a bit
No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:30:13 -05:00
Christian König
03fb560f2e drm/amdgpu: revert "disable bulk moves for now"
This reverts commit a213c2c7e2.

The changes to fix this should have landed in 5.1.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zhou, David(ChunMing) <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:29:37 -05:00
Jiange Zhao
393993ac0c drm/amdgpu/SRIOV: Navi12 SRIOV VF gets GTT base
With changes in PSP and HV, SRIOV VF will handle

vram gtt location just like bare metal. There is

no need to differentiate it anymore.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:29:04 -05:00
Aaron Liu
28faa17ee8 drm/amdgpu: remove program of lbpw for renoir
These is no LBPW on Renoir. So removing program of lbpw for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 15:28:57 -05:00
zhong jiang
2032324682 drm/amdgpu: remove the redundant null checks
debugfs_remove and kfree has taken the null check in account.
hence it is unnecessary to check it. Just remove the condition.
No functional change.

This issue was detected by using the Coccinelle software.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:56 -05:00
Jean Delvare
ae2a349597 drm/amd: be quiet when no SAD block is found
It is fine for displays without audio functionality to not provide
any SAD block in their EDID. Do not log an error in that case,
just return quietly.

This fixes half of bug fdo#107825:
https://bugs.freedesktop.org/show_bug.cgi?id=107825

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:56 -05:00
Trek
13238d4fa6 drm/amdgpu: Check for valid number of registers to read
Do not try to allocate any amount of memory requested by the user.
Instead limit it to 128 registers. Actually the longest series of
consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and
mmGB_MACROTILE_MODE0-15 (0x2644-0x2673).

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273
Signed-off-by: Trek <trek00@inbox.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:56 -05:00
Kent Russell
3e103fc301 Revert "drm/amdgpu/nbio7.4: add hw bug workaround for vega20"
This reverts commit e01f2d4189.

VG20 did not require this workaround, as the fix is in the VBIOS.
Leave VG10/12 workaround as some older shipped cards do not have the
VBIOS fix in place, and the kernel workaround is required in those
situations

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
ec671737f8 drm/amdgpu: add graceful VM fault handling v3
Next step towards HMM support. For now just silence the retry fault and
optionally redirect the request to the dummy page.

v2: make sure the VM is not destroyed while we handle the fault.
v3: fix VM destroy check, cleanup comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
b65709a921 drm/amdgpu: reserve the root PD while freeing PASIDs
Free the pasid only while the root PD is reserved. This prevents use after
free in the page fault handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
061468c405 drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault
While handling a page fault we can't wait for other ongoing GPU
operations or we can potentially run into deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
0f6064d6af drm/amdgpu: allow direct submission of clears
For handling PD/PT clears directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
acb476f541 drm/amdgpu: allow direct submission of PTE updates
For handling PTE updates directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
807e299409 drm/amdgpu: allow direct submission of PDE updates v2
For handling PDE updates directly in the fault handler.

v2: fix typo in comment

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
47ca7efa4c drm/amdgpu: allow direct submission in the VM backends v2
This allows us to update page tables directly while in a page fault.

v2: use direct/delayed entities and still wait for moves

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
a2cf324785 drm/amdgpu: split the VM entity into direct and delayed
For page fault handling we need to use a direct update which can't be
blocked by ongoing user CS.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:42:55 -05:00
Christian König
6817bf283b drm/amdgpu: grab the id mgr lock while accessing passid_mapping
Need to make sure that we actually dropping the right fence.
Could be done with RCU as well, but to complicated for a fix.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:16:28 -05:00
Jiange Zhao
1b65782468 drm/amdgpu/SRIOV: Navi12 SRIOV VF doesn't load TOC
In SRIOV case, the autoload sequence is the same

as bare metal, except VF won't load TOC.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:16:21 -05:00
Jiange Zhao
a4ac7693f8 drm/amdgpu/SRIOV: Navi10/12 VF doesn't support SMU
In SRIOV case, SMU and powerplay are handled in HV.

VF shouldn't have control over SMU and powerplay.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:16:14 -05:00
Prike Liang
a90a24d581 drm/amd/amdgpu: power up sdma engine when S3 resume back
The sdma_v4 should be ungated when the IP resume back,
otherwise it will hang up and resume time out error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:16:07 -05:00
Jiange Zhao
b05b69036f drm/amdgpu: For Navi12 SRIOV VF, register mailbox functions
Mailbox functions and interrupts are only for Navi12 VF.

Register functions and irqs during initialization.

Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:15:20 -05:00
Jack Zhang
51c0f58e9f drm/amdgpu/sriov: add ring_stop before ring_create in psp v11 code
psp  v11 code missed ring stop in ring create function(VMR)
while psp v3.1 code had the code. This will cause VM destroy1
fail and psp ring create fail.

For SIOV-VF, ring_stop should not be deleted in ring_create
function.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:11:28 -05:00
Evan Quan
0e0b89c0d7 drm/amd/powerplay: properly set mp1 state for SW SMU suspend/reset routine
Set mp1 state properly for SW SMU suspend/reset routine.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:10:58 -05:00
Felix Kuehling
d950800e79 drm/amdgpu: Fix KFD-related kernel oops on Hawaii
Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a574 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:10:06 -05:00
Andrey Grodzovsky
708901a666 drm/amdgpu: Fix mutex lock from atomic context.
Problem:
amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
because writing to EEPROM during ASIC reset was unstable.
But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
directly from ISR context and so locking is not allowed. Also it's
irrelevant for this partilcular interrupt as this is generic RAS
interrupt and not memory errors specific.

Fix:
Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:59 -05:00
Jiange Zhao
3636169cc0 drm/amdgpu: Add SRIOV mailbox backend for Navi1x
Mimic the ones for Vega10, add mailbox backend for Navi1x

Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:52 -05:00
Guchun Chen
1a3f2e8c3c drm/amdgpu: implement ras query function for pcie bif
ras error query funtionality implementation

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:44 -05:00
Guchun Chen
d7bd680d40 drm/amdgpu: support pcie bif ras query and inject
Call pcie bif ras query/inject in amdgpu ras.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:30 -05:00
Guchun Chen
52652ef286 drm/amdgpu: add ras error query count interface for nbio
Add the interface query_ras_error_count for nbio.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:21 -05:00
Tianci.Yin
ff9d097193 drm/amdgpu: fix CPDMA hang in PRT mode for VEGA10
add and_mask since the programming logic of golden setting changed

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:14 -05:00
Hawking Zhang
f317035288 drm/amdgpu: enable error injection to XGMI block via debugfs
allow inject error to XGMI block via debugfs node ras_ctrl

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:09:08 -05:00
Hawking Zhang
029fbd437e drm/amdgpu: initialize ras structures for xgmi block (v2)
init ras common interface and fs node for xgmi block

v2: remove unnecessary physical node number check before
invoking amdgpu_xgmi_ras_late_init

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:08:51 -05:00
Andrey Grodzovsky
084fe13b2c drm/amdgpu: Allow to reset to EERPOM table.
The table grows quickly during debug/development effort when
multiple RAS errors are injected. Allow to avoid this by setting
table header back to empty if needed.

v2: Switch to debugfs entry instead of load time parameter.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:34 -05:00
Andrey Grodzovsky
d01b400b1a drm/amdgpu: Add amdgpu_ras_eeprom_reset_table
This will allow to reset the table on the fly.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:27 -05:00
Tao Zhou
d99659a062 drm/amdgpu: rename umc ras_init to err_cnt_init
this interface is related to specific version of umc, distinguish it
from ras_late_init

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:20 -05:00
Tao Zhou
4930aabe7c drm/amdgpu: move umc ras init to umc block
move umc ras init from ras module to umc block, generic ras module
should pay less attention to specific ras block.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:12 -05:00
Tao Zhou
86edcc7dba drm/amdgpu: move umc late init from gmc to umc block
umc late init is umc specific, it's more suitable to be put in umc block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:06:03 -05:00
Guchun Chen
1bd252c57b drm/amdgpu: remove duplicated header file include
amdgpu_ras.h is already included.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:05:29 -05:00
Shirish S
a35ad98bf9 drm/amdgpu: remove needless usage of #ifdef
define sched_policy in case CONFIG_HSA_AMD is not
enabled, with this there is no need to check for CONFIG_HSA_AMD
else where in driver code.

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:05:20 -05:00
Shirish S
8c9f69bc5c drm/amdgpu: fix build error without CONFIG_HSA_AMD
If CONFIG_HSA_AMD is not set, build fails:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function `amdgpu_device_ip_early_init':
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to `sched_policy'

Use CONFIG_HSA_AMD to guard this.

Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling policy")

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:05:07 -05:00
Andrey Grodzovsky
4d1337d2e9 drm/amdgpu: Avoid RAS recovery init when no RAS support.
Fixes driver load regression on APUs.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:59:35 -05:00
Christian König
cbfae36cea drm/amdgpu: cleanup PTE flag generation v3
Move the ASIC specific code into a new callback function.

v2: mask the flags for SI and CIK instead of a BUG_ON().
v3: remove last missed BUG_ON().

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:59:29 -05:00
Christian König
71776b6dae drm/amdgpu: cleanup mtype mapping
Unify how we map the UAPI flags to the PTE hardware flags for a mapping.

Only the MTYPE is actually ASIC dependent, all other flags should be
copied over 1 to 1 and ASIC differences are handled later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:59:21 -05:00
Tianci.Yin
1dd077bbba drm/amdgpu: add navi14 PCI ID for work station SKU
Add the navi14 PCI device id.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:59:14 -05:00
Philip Yang
cde85ac247 drm/amdgpu: check if nbio->ras_if exist
To avoid NULL function pointer access. This happens on VG10, reboot
command hangs and have to power off/on to reboot the machine. This is
serial console log:

[  OK  ] Reached target Unmount All Filesystems.
[  OK  ] Reached target Final Step.
         Starting Reboot...
[  305.696271] systemd-shutdown[1]: Syncing filesystems and block
devices.
[  306.947328] systemd-shutdown[1]: Sending SIGTERM to remaining
processes...
[  306.963920] systemd-journald[1722]: Received SIGTERM from PID 1
(systemd-shutdow).
[  307.322717] systemd-shutdown[1]: Sending SIGKILL to remaining
processes...
[  307.336472] systemd-shutdown[1]: Unmounting file systems.
[  307.454202] EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro
[  307.480523] systemd-shutdown[1]: All filesystems unmounted.
[  307.486537] systemd-shutdown[1]: Deactivating swaps.
[  307.491962] systemd-shutdown[1]: All swaps deactivated.
[  307.497624] systemd-shutdown[1]: Detaching loop devices.
[  307.504418] systemd-shutdown[1]: All loop devices detached.
[  307.510418] systemd-shutdown[1]: Detaching DM devices.
[  307.565907] sd 2:0:0:0: [sda] Synchronizing SCSI cache
[  307.731313] BUG: kernel NULL pointer dereference, address:
0000000000000000
[  307.738802] #PF: supervisor read access in kernel mode
[  307.744326] #PF: error_code(0x0000) - not-present page
[  307.749850] PGD 0 P4D 0
[  307.752568] Oops: 0000 [#1] SMP PTI
[  307.756314] CPU: 3 PID: 1 Comm: systemd-shutdow Not tainted
5.2.0-rc1-kfd-yangp #453
[  307.764644] Hardware name: ASUS All Series/Z97-PRO(Wi-Fi ac)/USB 3.1,
BIOS 9001 03/07/2016
[  307.773580] RIP: 0010:soc15_common_hw_fini+0x33/0xc0 [amdgpu]
[  307.779760] Code: 89 fb e8 60 f5 ff ff f6 83 50 df 01 00 04 75 3d 48
8b b3 90 7d 00 00 48 c7 c7 17 b8 530
[  307.799967] RSP: 0018:ffffac9483153d40 EFLAGS: 00010286
[  307.805585] RAX: 0000000000000000 RBX: ffff9eb299da0000 RCX:
0000000000000006
[  307.813261] RDX: 0000000000000000 RSI: ffff9eb29e3508a0 RDI:
ffff9eb29e350000
[  307.820935] RBP: ffff9eb299da0000 R08: 0000000000000000 R09:
0000000000000000
[  307.828609] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9eb299dbd1f8
[  307.836284] R13: ffffffffc04f8368 R14: ffff9eb29cebd130 R15:
0000000000000000
[  307.843959] FS:  00007f06721c9940(0000) GS:ffff9eb2a18c0000(0000)
knlGS:0000000000000000
[  307.852663] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  307.858842] CR2: 0000000000000000 CR3: 000000081d798005 CR4:
00000000001606e0
[  307.866516] Call Trace:
[  307.869169]  amdgpu_device_ip_suspend_phase2+0x80/0x110 [amdgpu]
[  307.875654]  ? amdgpu_device_ip_suspend_phase1+0x4d/0xd0 [amdgpu]
[  307.882230]  amdgpu_device_ip_suspend+0x2e/0x60 [amdgpu]
[  307.887966]  amdgpu_pci_shutdown+0x2f/0x40 [amdgpu]
[  307.893211]  pci_device_shutdown+0x31/0x60
[  307.897613]  device_shutdown+0x14c/0x1f0
[  307.901829]  kernel_restart+0xe/0x50
[  307.905669]  __do_sys_reboot+0x1df/0x210
[  307.909884]  ? task_work_run+0x73/0xb0
[  307.913914]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[  307.918970]  do_syscall_64+0x4a/0x1c0
[  307.922904]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  307.928336] RIP: 0033:0x7f0671cf8373
[  307.932176] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00
00 0f 1f 44 00 00 89 fa be 69 19 128
[  307.952384] RSP: 002b:00007ffdd1723d68 EFLAGS: 00000202 ORIG_RAX:
00000000000000a9
[  307.960527] RAX: ffffffffffffffda RBX: 0000000001234567 RCX:
00007f0671cf8373
[  307.968201] RDX: 0000000001234567 RSI: 0000000028121969 RDI:
00000000fee1dead
[  307.975875] RBP: 00007ffdd1723dd0 R08: 0000000000000000 R09:
0000000000000000
[  307.983550] R10: 0000000000000002 R11: 0000000000000202 R12:
00007ffdd1723dd8
[  307.991224] R13: 0000000000000000 R14: 0000001b00000004 R15:
00007ffdd17240c8
[  307.998901] Modules linked in: xt_MASQUERADE nfnetlink iptable_nat
xt_addrtype xt_conntrack nf_nat nf_cos
[  308.026505] CR2: 0000000000000000
[  308.039998] RIP: 0010:soc15_common_hw_fini+0x33/0xc0 [amdgpu]
[  308.046180] Code: 89 fb e8 60 f5 ff ff f6 83 50 df 01 00 04 75 3d 48
8b b3 90 7d 00 00 48 c7 c7 17 b8 530
[  308.066392] RSP: 0018:ffffac9483153d40 EFLAGS: 00010286
[  308.072013] RAX: 0000000000000000 RBX: ffff9eb299da0000 RCX:
0000000000000006
[  308.079689] RDX: 0000000000000000 RSI: ffff9eb29e3508a0 RDI:
ffff9eb29e350000
[  308.087366] RBP: ffff9eb299da0000 R08: 0000000000000000 R09:
0000000000000000
[  308.095042] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9eb299dbd1f8
[  308.102717] R13: ffffffffc04f8368 R14: ffff9eb29cebd130 R15:
0000000000000000
[  308.110394] FS:  00007f06721c9940(0000) GS:ffff9eb2a18c0000(0000)
knlGS:0000000000000000
[  308.119099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  308.125280] CR2: 0000000000000000 CR3: 000000081d798005 CR4:
00000000001606e0
[  308.135304] printk: systemd-shutdow: 3 output lines suppressed due to
ratelimiting
[  308.143518] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x00000009
[  308.151798] Kernel Offset: 0x15000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xff)
[  308.171775] ---[ end Kernel panic - not syncing: Attempted to kill
init! exitcode=0x00000009 ]---

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:58:52 -05:00
Xiaojie Yuan
bfa603aa5e drm/amdgpu: fix null pointer deref in firmware header printing
v2: declare as (struct common_firmware_header *) type because
    struct xxx_firmware_header inherits from it

When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an
unallocated amdgpu_firmware_info instance.

This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have
optimized out such 'defined but not referenced' variable.

[ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a
[ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0
[ 1120.818931] Oops: 0000 [#1] SMP
[ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid
[ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G           OE  ------------   3.10.0-1062.el7.x86_64 #1
[ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015
[ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000
[ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>]  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1120.990483] RSP: 0018:ffff991ee625f950  EFLAGS: 00010202
[ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000
[ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3
[ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000
[ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220
[ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30
[ 1121.032666] FS:  00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000
[ 1121.041000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0
[ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1121.068938] Call Trace:
[ 1121.071494]  [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu]
[ 1121.077886]  [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu]
[ 1121.085296]  [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu]
[ 1121.092534]  [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu]
[ 1121.099675]  [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu]
[ 1121.106888]  [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm]
[ 1121.113419]  [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140
[ 1121.120183]  [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu]
[ 1121.126919]  [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0
[ 1121.132622]  [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160
[ 1121.138607]  [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0
[ 1121.144766]  [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0
[ 1121.150507]  [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50
[ 1121.156422]  [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0
[ 1121.162213]  [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20
[ 1121.167771]  [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0
[ 1121.173590]  [<ffffffffa4eb4c94>] driver_register+0x64/0xf0
[ 1121.179345]  [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0
[ 1121.185593]  [<ffffffffc099f000>] ? 0xffffffffc099efff
[ 1121.190914]  [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu]
[ 1121.197101]  [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240
[ 1121.202901]  [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0
[ 1121.208598]  [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100
[ 1121.214894]  [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140
[ 1121.220698]  [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a
[ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7
[ 1121.247422] RIP  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1121.254432]  RSP <ffff991ee625f950>
[ 1121.258017] CR2: 000000000000000a
[ 1121.261427] ---[ end trace e98b35387ede75bd ]---

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Fixes: c5fb912653 ("drm/amdgpu: add firmware header printing for psp fw loading (v2)")
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:56:01 -05:00
Huang Rui
4042a18872 drm/amdkfd: enable renoir while device probes
This patch is to add asic flag to enable device probe during kfd init.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:50 -05:00
Huang Rui
aa978594cf drm/amdgpu: disable gfxoff while use no H/W scheduling policy
While gfxoff is enabled, the mmVM_XXX registers will be 0xfffffff while the GFX
is in "off" state. KFD queue creattion doesn't use ring based method, so it will
trigger a VM fault.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:41 -05:00
Felix Kuehling
7cae706193 drm/amdgpu: Disable retry faults in VMID0
There is no point retrying page faults in VMID0. Those faults are
always fatal.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-and-Tested-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:34 -05:00
Yong Zhao
4e66d7d215 drm/amdgpu: Add a kernel parameter for specifying the asic type
As more and more new asics start to reuse the old device IDs before
launch, there is a need to quickly override the existing asic type
corresponding to the reused device ID through a kernel parameter. With
this, engineers no longer need to rely on local hack patches,
facilitating cooperation across teams.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:25 -05:00
Alex Deucher
bb42eda284 drm/amdgpu/irq: check if nbio funcs exist
We need to check if the nbios funcs exist before
checking the individual pointers.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
2019-09-16 09:54:14 -05:00
Tao Zhou
1a6fc071e1 drm/amdgpu: move the call of ras recovery_init and bad page reserve to proper place
ras recovery_init should be called after ttm init,
bad page reserve should be put in front of gpu reset since i2c
may be unstable during gpu reset.
add cleanup for recovery_init and recovery_fini

v2: add more comment and print.
    remove cancel_work_sync in recovery_init.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:47 -05:00
Tao Zhou
87d2b92f1e drm/amdgpu: save umc error records
save umc error records to ras bad page array

v2: add bad pages before gpu reset
v3: add NULL check for adev->umc.funcs

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:40 -05:00
Tao Zhou
78ad00c903 drm/amdgpu: Hook EEPROM table to RAS
support eeprom records load and save for ras,
move EEPROM records storing to bad page reserving

v2: remove redundant check for con->eh_data

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:32 -05:00
Tao Zhou
9dc23a6325 drm/amdgpu: change ras bps type to eeprom table record structure
change bps type from retired page to eeprom table record, prepare for
saving umc error records to eeprom

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:26 -05:00
Andrey Grodzovsky
4bc2234077 drm/madgpu: Fix EEPROM Checksum calculation.
Fix typo which messed up the calculation.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:19 -05:00
Andrey Grodzovsky
4d25fba4e3 drm/amdgpu: Remove clock gating restore.
Restoring clock gating break SMU opeartion afterwards, avoid
this until this further invistigated with SMU.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:50:10 -05:00
Yong Zhao
050091ab6e drm/amdkfd: Query kfd device info by CHIP id instead of pci device id
This optimizes out the pci device id usage in KFD and makes the code
more maintainable.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:45 -05:00
Felix Kuehling
cd05c86510 drm/amdgpu: Disable page faults while reading user wptrs
These wptrs must be pinned and GPU accessible when this is called
from hqd_load functions. So they should never fault. This resolves
a circular lock dependency issue involving four locks including the
DQM lock and mmap_sem.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:38 -05:00
John Clements
337c200756 drm/amdgpu: clean up load TMR sequence
Removed redundant goto statement

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:11 -05:00
John Clements
4fb60b02fb drm/amdgpu: enable TA load support in Arcturus
Add support for loading XGMI/RAS FW

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:49:02 -05:00
Tao Zhou
c5b6e585b2 drm/amdgpu: change r type to int in gmc_v9_0_late_init
change r type from bool to int, suitable for both bool and int return
value

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:48:51 -05:00
Jack Zhang
f1d59e00ff drm/amd/amdgpu: add sw_fini interface for df_funcs
add sw_fini interface of df_funcs.
This interface will remove sysfs file of df_cntr_avail
function.

The old behavior only create sysfs of df_cntr_avail
in sw_init, but never remove it for lack of sw_fini
interface. With this,driver will report create
sysfs fail when it's loaded for the second time.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:42:15 -05:00
Hawking Zhang
9dc9134258 drm/amdgpu: init UMC & RSMU register base address
UMC RAS feature requires access to UMC & RSMU registers

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:42:08 -05:00
Hawking Zhang
1c70d3d9c4 drm/amdgpu/nbio: switch to amdgpu_nbio_ras_late_init helper function
amdgpu_nbio_ras_late_init is used to init nbio specfic
ras debugfs/sysfs node and nbio specific interrupt handler.
It can be shared among nbio generations

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:42:02 -05:00
Hawking Zhang
47930de4aa drm/amdgpu/mmhub: switch to amdgpu_mmhub_ras_late_init helper function
amdgpu_mmhub_ras_late_init is used to init mmhub specfic
ras debugfs/sysfs node and mmhub specific interrupt handler.
It can be shared among mmhub generations

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:55 -05:00
Hawking Zhang
bfcf62c2a5 drm/amdgpu/sdma: switch to amdgpu_sdma_ras_late_init helper function
amdgpu_sdma_ras_late_init is used to init sdma specfic
ras debugfs/sysfs node and sdma specific interrupt handler.
It can be shared among sdma generations

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:49 -05:00
Hawking Zhang
6caeee7a70 drm/amdgpu/gfx: switch to amdgpu_gfx_ras_late_init helper function
amdgpu_gfx_ras_late_init is used to init gfx specfic
ras debugfs/sysfs node and gfx specific interrupt handler.
It can be shared among gfx generations

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:42 -05:00
Hawking Zhang
a85eff14da drm/amdgpu/gmc: switch to amdgpu_gmc_ras_late_init helper function
amdgpu_gmc_ras_late_init is used to init gmc specfic
ras debugfs/sysfs node and gmc specific interrupt handler.
It can be shared among gmc generations.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:36 -05:00
Hawking Zhang
d094aea312 drm/amdgpu: set ip specific ras interface pointer to NULL after free it
to prevent access to dangling pointers

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:29 -05:00
Andrey Grodzovsky
d5ea093eeb dmr/amdgpu: Add system auto reboot to RAS.
In case of RAS error allow user configure auto system
reboot through ras_ctrl.
This is also part of the temproray work around for the RAS
hang problem.

v4: Use latest kernel API for disk sync.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:17 -05:00
Andrey Grodzovsky
7c6e68c777 drm/amdgpu: Avoid HW GPU reset for RAS.
Problem:
Under certain conditions, when some IP bocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until proper solution in PSP/SMU is ready:
When uncorrectable error happens the DF will unconditionally
broadcast error event packets to all its clients/slave upon
receiving fatal error event and freeze all its outbound queues,
err_event_athub interrupt  will be triggered.
In such case and we use this interrupt
to issue GPU reset. THe GPU reset code is modified for such case to avoid HW
reset, only stops schedulers, deatches all in progress and not yet scheduled
job's fences, set error code on them and signals.
Also reject any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on prevoius bug fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive memebers.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:41:05 -05:00
Andrey Grodzovsky
12ffa55da6 drm/amdgpu: Fix bugs in amdgpu_device_gpu_recover in XGMI case.
Issue 1:
In  XGMI case amdgpu_device_lock_adev for other devices in hive
was called to late, after access to their repsective schedulers.
So relocate the lock to the begining of accessing the other devs.

Issue 2:
Using amdgpu_device_ip_need_full_reset to switch the device list from
all devices in hive to the single 'master' device who owns this reset
call is wrong because when stopping schedulers we iterate all the devices
in hive but when restarting we will only reactivate the 'master' device.
Also, in case amdgpu_device_pre_asic_reset conlcudes that full reset IS
needed we then have to stop schedulers for all devices in hive and not
only the 'master' but with amdgpu_device_ip_need_full_reset  we
already missed the opprotunity do to so. So just remove this logic and
always stop and start all schedulers for all devices in hive.

Also minor cleanup and print fix.

v4: Minor coding style fix.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:39:05 -05:00
Christian König
43ce6bab7b drm/amdgpu: remove amdgpu_cs_try_evict
Trying to evict things from the current working set doesn't work that
well anymore because of per VM BOs.

Rely on reserving VRAM for page tables to avoid contention.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:38:56 -05:00
Christian König
9d1b3c7805 drm/amdgpu: reserve at least 4MB of VRAM for page tables v2
This hopefully helps reduce the contention for page tables.

v2: adjust maximum reported VRAM size as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:38:47 -05:00
Christian König
629be20395 drm/amdgpu: use moving fence instead of exclusive for VM updates
Make VM updates depend on the moving fence instead of the exclusive one.

Makes it less likely to actually have a dependency.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:38:38 -05:00
Hawking Zhang
39857252e5 drm/amdgpu: only apply gds clearing workaround when ras is supported
gds clearing workaround should only be applied on asics that support gfx ras

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:29 -05:00
Hawking Zhang
8bf2485aec drm/amdgpu: fix memory leak when ras is not supported on specific ip block
free ras_if if ras is not supported

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:22 -05:00
Hawking Zhang
4ce71be67b drm/amdgpu: check mmhub_funcs pointer before refering to it
mmhub callback functions are not initialized for all the ASICs

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:16 -05:00
Felix Kuehling
17da41bf00 drm/amdgpu: Remove unnecessary TLB workaround (v2)
This workaround is better handled in user mode in a way that doesn't
require allocating extra memory and breaking userptr BOs.

The TLB bug is a performance bug, not a functional or security bug.
Hence it is safe to remove this kernel part of the workaround to
allow a better workaround using only virtual address alignments in
user mode.

v2: Removed VI_BO_SIZE_ALIGN definition

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:08 -05:00
Felix Kuehling
e0253d083c drm/amdgpu: Use optimal mtypes and PTE bits for Arcturus
For compute VRAM allocations on Arturus use the new RW mtype
for non-coherent local memory, CC mtype for coherent local
memory and PTE_SNOOPED bit for invalidating non-dirty cache
lines on remote XGMI mappings.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:36:01 -05:00
Felix Kuehling
d0ba51b1ca drm/amdgpu: Determing PTE flags separately for each mapping (v3)
The same BO can be mapped with different PTE flags by different GPUs.
Therefore determine the PTE flags separately for each mapping instead
of storing them in the KFD buffer object.

Add a helper function to determine the PTE flags to be extended with
ASIC and memory-type-specific logic in subsequent commits.

v2: Split Arcturus-specific MTYPE changes into separate commit
v3: Fix return type of get_pte_flags to uint64_t

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:35:55 -05:00
Oak Zeng
093e48c04d drm/amdgpu: Support new arcturus mtype
Arcturus repurposed mtype WC to RW. Modify gmc functions
to support the new mtype

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:35:48 -05:00
Hawking Zhang
22e1d14fef drm/amdgpu: switch to amdgpu_ras_late_init for nbio v7_4 (v2)
call helper function in late init phase to handle ras init
for nbio ip block

v2: init local var r to 0 in case the function return failure
on asics that don't have ras_late_init implementation

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:05 -05:00
Hawking Zhang
9ad1dc295b drm/amdgpu: add ras_late_init callback function for nbio v7_4 (v3)
ras_late_init callback function will be used to do common ras
init in late init phase.

v2: call ras_late_fini to do cleanup when fails to enable interrupt

v3: rename sysfs/debugfs node name to pcie_bif_xxx

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:05 -05:00
Hawking Zhang
dda79907a7 drm/amdgpu: add mmhub ras_late_init callback function (v2)
The function will be called in late init phase to do mmhub
ras init

v2: check ras_late_init function pointer before invoking the
function

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:05 -05:00
Hawking Zhang
2452e7783c drm/amdgpu: switch to amdgpu_ras_late_init for gmc v9 block (v2)
call helper function in late init phase to handle ras init
for gmc ip block

v2: call ras_late_fini to do clean up when fail to enable interrupt

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:05 -05:00
Hawking Zhang
7d0a31e8cc drm/amdgpu: switch to amdgpu_ras_late_init for sdma v4 block (v2)
call helper function in late init phase to handle ras init
for sdma ip block

v2: call ras_late_fini to do clean up when fail to enable interrupt

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
63fa48db49 drm/amdgpu: switch to amdgpu_ras_late_init for gfx v9 block (v2)
call helper function in late init phase to handle ras init
for gfx ip block

v2: call ras_late_fini to do clean up when fail to enable interrupt

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
b293e891b0 drm/amdgpu: add helper function to do common ras_late_init/fini (v3)
In late_init for ras, the helper function will be used to
1). disable ras feature if the IP block is masked as disabled
2). send enable feature command if the ip block was masked as enabled
3). create debugfs/sysfs node per IP block
4). register interrupt handler

v2: check ih_info.cb to decide add interrupt handler or not

v3: add ras_late_fini for cleanup all the ras fs node and remove
interrupt handler

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
a344db8e5e drm/amdgpu: poll ras_controller_irq and err_event_athub_irq status
For the hardware that can not enable BIF ring for IH cookies for both
ras_controller_irq and err_event_athub_irq, the driver has to poll the
status register in irq handling and ack the hardware properly when there
is interrupt triggered

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
4e644fffb5 drm/amdgpu: add ras_controller and err_event_athub interrupt support
Ras controller interrupt and Ras err event athub interrupt are two dedicated
interrupts for RAS support.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:04 -05:00
Hawking Zhang
4241863afc drm/amdgpu/nbio: add functions to query ras specific interrupt status
ras_controller_interrupt and err_event_interrupt are ras specific interrupts.
add functions to check their status and ack them if they are generated. both
funcitons should only be invoked in ISR when BIF ring is disabled or even not
initialized.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:03 -05:00
Hawking Zhang
bebc076285 drm/amdgpu: switch to new amdgpu_nbio structure
no functional change, just switch to new structures

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:03 -05:00
Hawking Zhang
078ef4e932 drm/amdgpu: add new amdgpu nbio header file
More nbio funcitonalities will be added and nbio could
be treated as an ip block like gfx/sdma.etc

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:03 -05:00
Gerd Hoffmann
e7bf74d0aa drm/amdgpu: switch to gem vma offset manager
Pass gem vma_offset_manager to ttm_bo_device_init(), so ttm uses it
instead of its own embedded struct.  This makes some gem functions
(specifically drm_gem_object_lookup) work on ttm objects.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190905070509.22407-6-kraxel@redhat.com
2019-09-11 08:03:24 +02:00
Gerd Hoffmann
9d6f4484e8 drm/ttm: turn ttm_bo_device.vma_manager into a pointer
Rename the embedded struct vma_offset_manager, new name is _vma_manager.
ttm_bo_device.vma_manager changed to a pointer.

The ttm_bo_device_init() function gets an additional vma_manager
argument which allows to initialize ttm with a different vma manager.
When passing NULL the embedded _vma_manager is used.

All callers are updated to pass NULL, so the behavior doesn't change.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190905070509.22407-2-kraxel@redhat.com
2019-09-11 08:03:23 +02:00
Dave Airlie
9ed45a209a Merge tag 'drm-next-5.4-2019-08-30' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.4-2019-08-30:

amdgpu:
- Add DC support for Renoir
- Add some GPUVM hw bug workarounds
- add support for the smu11 i2c controller
- GPU reset vram lost bug fixes
- Navi1x powergating fixes
- Navi12 power fixes
- Renoir power fixes
- Misc bug fixes and cleanups

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830212650.5055-1-alexander.deucher@amd.com
2019-09-06 16:40:28 +10:00
Petr Cvek
20c14ee135 drm/amdgpu: Fix undefined dm_ip_block for navi12
There is missing "if defined" CONFIG_DRM_AMD_DC block for non DC
configurations. This will cause link error. The patch is fixing that.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=110979
Signed-off-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-30 15:37:17 -05:00
Aaron Liu
537e3bbfee drm/amdgpu: fix no interrupt issue for renoir emu (v2)
In renoir's vega10_ih model, there's a security change in mmIH_CHICKEN
register, that limits IH to use physical address (FBPA, GPA) directly.
Those chicken bits need to be programmed first.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-30 15:37:17 -05:00
Andrey Grodzovsky
0b2d2c2eec drm/amdgpu: Handle job is NULL use case in amdgpu_device_gpu_recover
This should be checked at all places job is accessed.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-30 15:02:39 -05:00
Roman Li
e1c14c4339 drm/amdgpu: Enable DC on Renoir
Enable DC support for renoir.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:34 -05:00
Prike Liang
334ffd0daa drm/amdgpu: Initialize and update SDMA power gating
Init SDMA HW base configuration and enable idle INT for rn.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Tianci.Yin
12842d02c7 drm/amdgpu/psp: keep TMR in visible vram region for SRIOV
Fix compute ring test failure in sriov scenario.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Tianci.Yin
994dcfaa7e drm/amdgpu: keep the stolen memory in visible vram region
stolen memory should be fixed in visible region.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Colin Ian King
92ead9fa6f drm/amdgpu: fix spelling mistake "jumpimng" -> "jumping"
There is a spelling mistake in a DRM_DEBUG_DRIVER debug message.
Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Alex Deucher
53fd9b5ae8 drm/amdgpu/virtual_dce: drop error message in hw_init
No need to add new asic cases.  This is a sw display
implementation, so just drop the error message so when
we add new asics, all we have to do is add the virtual
dce IP module.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Jean Delvare
77efe48a72 drm/amdgpu/si: fix ASIC tests
Comparing adev->family with CHIP constants is not correct.
adev->family can only be compared with AMDGPU_FAMILY constants and
adev->asic_type is the struct member to compare with CHIP constants.
They are separate identification spaces.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 62a3755341 ("drm/amdgpu: add si implementation v10")
Cc: Ken Wang <Qingqing.Wang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Jean Delvare
1cdd229bec drm/amd/amdgpu: hide voltage and power sensors on SI and KV parts
The driver does not support these sensors yet and there is no point in
creating sysfs attributes which will always return an error.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Monk Liu
e352625796 drm/amdgpu: introduce vram lost for reset (v2)
for SOC15/vega10 the BACO reset & mode1 would introduce vram lost
in high end address range, current kmd's vram lost checking cannot
catch it since it only check very ahead visible frame buffer

v2:
cover NV as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Xiaojie Yuan
5ef3b8acdc drm/amdgpu: enable athub powergating for navi12
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Xiaojie Yuan
c1653ea05b drm/amdgpu: enable vcn powergating for navi12
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:31 -05:00
Hawking Zhang
317f9cc97b drm/amdgpu: correct in_suspend setting for navi series
in_suspend flag should be set in amdgpu_device_suspend/resume in pairs,
instead of gfx10 ip suspend/resume function.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:26 -05:00
Aaron Liu
c072b0c24e drm/amdgpu: fix GFXOFF on Picasso and Raven2
For picasso(adev->pdev->device == 0x15d8)&raven2(adev->rev_id >= 0x8),
firmware is sufficient to support gfxoff.
In commit 98f58ada2d, for picasso&raven2,
return directly and cause gfxoff disabled.

Fixes: 98f58ada2d ("drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-27 10:38:40 -05:00
Kai-Heng Feng
c7b33cfb3c drm/amdgpu: Add APTX quirk for Dell Latitude 5495
Needs ATPX rather than _PR3 to really turn off the dGPU. This can save
~5W when dGPU is runtime-suspended.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-27 10:09:12 -05:00
Andrey Grodzovsky
691bac9d09 drm/amdgpu: Vega20 SMU I2C HW engine controller.
Implement HW I2C enigne controller to be used by the RAS EEPROM
table manager. This is based on code from ATITOOLs.

v2:
Rename the file and all function prefixes to smu_v11_0_i2c

By Luben's observation always fill the TX fifo to full so
we don't have garbadge interpreted by the slave as valid data.

v3:
Remove preemption disable as the HW I2C controller will not
stop the clock on empty TX fifo and so it's not critical to
keep not empty queue.
Switch to fast mode 400 khz SCL clock for faster read and write.

v5:
Restore clock gating before releasing I2C bus and fix some
style comments.

v6:
squash in warning fix, fix includes (Alex)

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Luben Tuikov <Luben.Tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-27 09:17:35 -05:00
Andrey Grodzovsky
64f55e6292 drm/amdgpu: Add RAS EEPROM table.
Add RAS EEPROM table manager to eanble RAS errors to be stored
upon appearance and retrived on driver load.

v2: Fix some prints.

v3:
Fix checksum calculation.
Make table record and header structs packed to do correct byte value sum.
Fix record crossing EEPROM page boundry.

v4:
Fix byte sum val calculation for record - look at sizeof(record).
Fix some style comments.

v5: Add description to EEPROM_TABLE_RECORD_SIZE and syntax fixes.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Luben Tuikov <Luben.Tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-27 08:17:14 -05:00
Gang Ba
250af743c0 Revert "drm/amdgpu: free up the first paging queue v2"
This reverts commit 4f8bc72fbf.

It turned out that a single reserved queue wouldn't be
sufficient for page fault handling.

Signed-off-by: Gang Ba <gaba@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-27 08:16:26 -05:00
Xiaojie Yuan
534991731c drm/amdgpu: add dummy read for some GCVM status registers
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface.  This has caused a problem where status
registers requiring HW to update have a 1 cycle delay, due to the
register update having to go through GRBM.

SW may operate on an incorrect value if they write a register and
immediately check the corresponding status register.

Registers requiring HW to clear or set fields may be delayed by 1 cycle.
For example,

1. write VM_INVALIDATE_ENG0_REQ mask = 5a
2. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a
    a. HW will reset VM_INVALIDATE_ENG0_ACK = 0 until invalidation is complete
3. write VM_INVALIDATE_ENG0_REQ mask = 5a
4. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a
    a. First read of VM_INVALIDATE_ENG0_ACK = 5a instead of 0
    b. Second read of VM_INVALIDATE_ENG0_ACK = 0 because
       the remote GRBM h/w register takes one extra cycle to be cleared
    c. In this case, SW will see a false ACK if they exit on first read

Affected registers (only GC variant)  |  Recommended Dummy Read
--------------------------------------+----------------------------
VM_INVALIDATE_ENG*_ACK                |  VM_INVALIDATE_ENG*_REQ
VM_L2_STATUS                          |  VM_L2_STATUS
VM_L2_PROTECTION_FAULT_STATUS         |  VM_L2_PROTECTION_FAULT_STATUS
VM_L2_PROTECTION_FAULT_ADDR_HI/LO32   |  VM_L2_PROTECTION_FAULT_ADDR_HI/LO32
VM_L2_IH_LOG_BUSY                     |  VM_L2_IH_LOG_BUSY
MC_VM_L2_PERFCOUNTER_HI/LO            |  MC_VM_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER_HI/LO              |  ATC_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER2_HI/LO             |  ATC_L2_PERFCOUNTER2_HI/LO

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-27 08:15:32 -05:00
Dave Airlie
578d2342ec Merge tag 'drm-next-5.4-2019-08-23' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.4-2019-08-23:

amdgpu:
- Enable power features on Navi12
- Enable power features on Arcturus
- RAS updates
- Initial Renoir APU support
- Enable power featyres on Renoir
- DC gamma fixes
- DCN2 fixes
- GPU reset support for Picasso
- Misc cleanups and fixes

scheduler:
- Possible race fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823202620.3870-1-alexander.deucher@amd.com
2019-08-27 17:22:15 +10:00
Alex Deucher
bad4c3e665 drm/amdgpu: set adev->num_vmhubs for gmc6,7,8
So that we properly handle them on older asics.

Fixes: 3ff985485b ("drm/amdgpu: Export function to flush TLB of specific vm hub")
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-23 11:35:25 -05:00
Guchun Chen
64cc5414fb drm/amdgpu: correct ras error count type
Use unsigned long type for the same ras count variable.
This will avoid overflow on 64 bit system.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-23 11:30:32 -05:00
Gerd Hoffmann
35616a4aa9 drm: drop resource_id parameter from drm_fb_helper_remove_conflicting_pci_framebuffers
Not needed any more for remove_conflicting_pci_framebuffers calls.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20190822090645.25410-3-kraxel@redhat.com
2019-08-23 10:48:31 +02:00
Thong Thai
8540098492 drm/amdgpu: enable VCN DPG for Renoir
This will enable indirect SRAM loading for VCN DPG mode initialization.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:48:46 -05:00
Thong Thai
134b1461ea Revert "drm/amdgpu: use direct loading on renoir vcn for the moment"
This reverts commit 444a0fea51.

We are ready to enable it now.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:48:46 -05:00
Aaron Liu
f13580a947 drm/amdgpu: update gc/sdma goldensetting for rn
This patch updates gc/sdma goldensetting for renoir

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:48:46 -05:00
Prike Liang
9a868d8bbb drm/amdgpu: enable SDMA power gating for rn
Enable SDMA PG flag during device ip early init.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
91c5b6b326 drm/amdgpu/sdma4: set sdma clock gating for rn
Add support for SDMA clockgating on RN.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
2f47d6492b drm/amdgpu/mmhub1: set mmhub clock gating for rn
setup mmhub clockgating.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
8db63b7c38 drm/amdgpu: enable DF clock gating for rn
Enable DF clock gating during DF IP early init.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
e2ef3b70e8 drm/amdgpu: enable athub clock gating for rn
Enable athub MG and LS clock gating.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
91ec8bbb88 drm/amdgpu: enable IH clock gating for rn
Enable IH clock gating during IH block initialized.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
753c929cc7 drm/amdgpu: enable vcn clock gating for rn
Enable VCN middle grain clock gating.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
de273070c5 drm/amdgpu: enable rom clock gating for rn
Enable rom light sleep clock gating.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
9deac0a415 drm/amdgpu: enable HDP clock gating for rn
Enable HDP light sleep clock gating.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
d98930f52e drm/amdgpu: enable BIF clock gating for rn
Enable BIF light sleep clock gating.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
ef0e7d08a5 drm/amdgpu: enable sdma clock gating for rn
Enable sdma middle grain and light sleep clock gating.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
a2d15255ea drm/amdgpu: enable mmhub clock gating for rn
Enable mmhub midle grain and light sleep clock gating.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Prike Liang
ec3636a53a drm/amdgpu: enable gfx clock gating for rn
Enable gfx cg/mg/cp etc clock gating.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:40:58 -05:00
Aaron Liu
9f21e9ee7f drm/amdgpu: add and enable gfxoff feature
This patch updates gfxoff feature.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:37:39 -05:00
Aaron Liu
1268795511 drm/amdgpu: add set_gfx_cgpg implement (v2)
add set_gfx_cgpg implement

v2: check if using sw_smu (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:37:36 -05:00
Aaron Liu
97222cfac7 drm/amdgpu/powerplay: add power up/down SDMA interfaces for renoir
1.Implement PowerUpSDMA/PowerDownSDMA interfaces in the swSMU for renoir
2.adjust smu ip block ahead of gfx&sdma ip block

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:37:05 -05:00
Aaron Liu
5dbbe6a77d drm/amdgpu/powerplay: add smu ip block for renoir (v2)
add swSMU [smu_v12_0] for renoir

v2: whitespace fixes (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:36:58 -05:00
Xiaojie Yuan
9e48495017 drm/amdgpu/sdma5: fix number of sdma5 trap irq types for navi1x
v2: set num_types based on num_instances

navi1x has 2 sdma engines but commit
"e7b58d03b678 drm/amdgpu: reorganize sdma v4 code to support more instances"
changes the max number of sdma irq types (AMDGPU_SDMA_IRQ_LAST) from 2 to 8
which causes amdgpu_irq_gpu_reset_resume_helper() to recover irq of sdma
engines with following logic:

(enable irq for sdma0) * 1 time
(enable irq for sdma1) * 1 time
(disable irq for sdma1) * 6 times

as a result, after gpu reset, interrupt for sdma1 is lost.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:25:01 -05:00
Christian König
75e1cafde1 drm/amdgpu: fix dma_fence_wait without reference
We need to grab a reference to the fence we wait for.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:17:53 -05:00
Frank.Min
ea207b29ae amd/amdgpu: add Arcturus vf DID support
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Frank.Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:17:12 -05:00
Frank.Min
9d4f837aa0 drm/amdgpu: unity mc base address for arcturus
arcturus for sriov would use the unified mc base address

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Frank.Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:15:21 -05:00
Frank.Min
81c274c473 drm/amdgpu: disable agp for sriov
Since agp is not used for sriov, just disable it

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Frank.Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-22 17:15:06 -05:00
YueHaibing
252d2a5246 drm/amdgpu: remove duplicated include from gfx_v9_0.c
Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:18:54 -05:00
YueHaibing
6892c1f866 drm/amdgpu: remove set but not used variable 'psp_enabled'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdgpu/nv.c: In function 'nv_common_early_init':
drivers/gpu/drm/amd/amdgpu/nv.c:471:7: warning:
 variable 'psp_enabled' set but not used [-Wunused-but-set-variable]

It's not used since inroduction in
commit c6b6a42175 ("drm/amdgpu: add navi10 common ip block (v3)")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:18:51 -05:00
Nicolai Hähnle
5a6a4c9d1b drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl
Error out if the AMDGPU_CS ioctl is called with multiple SYNCOBJ_OUT and/or
TIMELINE_SIGNAL chunks, since otherwise the last chunk wins while the
allocated array as well as the reference counts of sync objects are leaked.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:18:38 -05:00
Kenneth Feng
6da6c27928 drm/amd/amdgpu: disable MMHUB PG for navi10
Disable MMHUB PG for navi10 according to the production requirement.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:18:05 -05:00
Evan Quan
9aef809b5c drm/amd/powerplay: expose supported clock domains only through sysfs
Do not expose those unsupported clock domains through sysfs on
Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:17:28 -05:00
Tianci.Yin
828d6fde7f drm/amdgpu/psp: move TMR to cpu invisible vram region
so that more visible vram can be available for umd.

Reviewed-by: Christian König <christian.koenig@amd.com>.
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:16:45 -05:00
Xiaojie Yuan
50e275e880 drm/amdgpu: remove redundant argument for psp_funcs::cmd_submit callback
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:16:37 -05:00
Feifei Xu
51bfac71ca drm/amdgpu: Set no-retry as default.
This is to improve performance.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Tested-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:16:18 -05:00
Xiaojie Yuan
c5fb912653 drm/amdgpu: add firmware header printing for psp fw loading (v2)
firmware header information is printed for direct fw loading but not
added for psp fw loading yet

v2: squash in warning fix (Alex)

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:16:18 -05:00
Xiaojie Yuan
6c2243efa0 drm/amdgpu: fix debug level for ppt offset/size
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:15:28 -05:00
Xiaojie Yuan
cc216214ac drm/amdgpu: remove special autoload handling for navi12
s/r list in rlc firmware is ready, so remove the special autoload handling

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21 22:15:14 -05:00
Alex Deucher
b05f65d772 drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible
We need to set certain power gating flags after we determine
if the firmware version is sufficient to support gfxoff.
Previously we set the pg flags in early init, but we later
we might have disabled gfxoff if the firmware versions didn't
support it.  Move adding the additional pg flags after we
determine whether or not to support gfxoff.

Fixes: 005440066f ("drm/amdgpu: enable gfxoff again on raven series (v2)")
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
2019-08-21 22:15:13 -05:00
Jason Gunthorpe
daa138a58c Merge branch 'odp_fixes' into hmm.git
From rdma.git

Jason Gunthorpe says:

====================
This is a collection of general cleanups for ODP to clarify some of the
flows around umem creation and use of the interval tree.
====================

The branch is based on v5.3-rc5 due to dependencies, and is being taken
into hmm.git due to dependencies in the next patches.

* odp_fixes:
  RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
  RDMA/mlx5: Use ib_umem_start instead of umem.address
  RDMA/core: Make invalidate_range a device operation
  RDMA/odp: Use kvcalloc for the dma_list and page_list
  RDMA/odp: Check for overflow when computing the umem_odp end
  RDMA/odp: Provide ib_umem_odp_release() to undo the allocs
  RDMA/odp: Split creating a umem_odp from ib_umem_get
  RDMA/odp: Make the three ways to create a umem_odp clear
  RMDA/odp: Consolidate umem_odp initialization
  RDMA/odp: Make it clearer when a umem is an implicit ODP umem
  RDMA/odp: Iterate over the whole rbtree directly
  RDMA/odp: Use the common interval tree library instead of generic
  RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-21 20:58:18 -03:00
Dave Airlie
5f680625d9 drm-misc-next for 5.4:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - dma-buf: add reservation_object_fences helper, relax
              reservation_object_add_shared_fence, remove
              reservation_object seq number (and then
              restored)
   - dma-fence: Shrinkage of the dma_fence structure,
                Merge dma_fence_signal and dma_fence_signal_locked,
                Store the timestamp in struct dma_fence in a union with
                cb_list
 
 Driver Changes:
   - More dt-bindings YAML conversions
   - More removal of drmP.h includes
   - dw-hdmi: Support get_eld and various i2s improvements
   - gm12u320: Few fixes
   - meson: Global cleanup
   - panfrost: Few refactors, Support for GPU heap allocations
   - sun4i: Support for DDC enable GPIO
   - New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02,
                 Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1
                 Toppoly TD043MTEA1
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXVqvpwAKCRDj7w1vZxhR
 xa3RAQDzAnt5zeesAxX4XhRJzHoCEwj2PJj9Re6xMJ9PlcfcvwD+OS+bcB6jfiXV
 Ug9IBd/DqjlmD9G9MxFxfSV946rksAw=
 =8uv4
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-08-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.4:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - dma-buf: add reservation_object_fences helper, relax
             reservation_object_add_shared_fence, remove
             reservation_object seq number (and then
             restored)
  - dma-fence: Shrinkage of the dma_fence structure,
               Merge dma_fence_signal and dma_fence_signal_locked,
               Store the timestamp in struct dma_fence in a union with
               cb_list

Driver Changes:
  - More dt-bindings YAML conversions
  - More removal of drmP.h includes
  - dw-hdmi: Support get_eld and various i2s improvements
  - gm12u320: Few fixes
  - meson: Global cleanup
  - panfrost: Few refactors, Support for GPU heap allocations
  - sun4i: Support for DDC enable GPIO
  - New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02,
                Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1
                Toppoly TD043MTEA1

Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: fixup dma_resv rename fallout]

From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190819141923.7l2adietcr2pioct@flea
2019-08-21 16:44:41 +10:00
Jason Gunthorpe
c7d8b7824f hmm: use mmu_notifier_get/put for 'struct hmm'
This is a significant simplification, it eliminates all the remaining
'hmm' stuff in mm_struct, eliminates krefing along the critical notifier
paths, and takes away all the ugly locking and abuse of page_table_lock.

mmu_notifier_get() provides the single struct hmm per struct mm which
eliminates mm->hmm.

It also directly guarantees that no mmu_notifier op callback is callable
while concurrent free is possible, this eliminates all the krefs inside
the mmu_notifier callbacks.

The remaining krefs in the range code were overly cautious, drivers are
already not permitted to free the mirror while a range exists.

Link: https://lore.kernel.org/r/20190806231548.25242-6-jgg@ziepe.ca
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 09:35:02 -03:00
Chris Wilson
b016cd6ed4 dma-buf: Restore seqlock around dma_resv updates
This reverts
67c97fb79a ("dma-buf: add reservation_object_fences helper")
dd7a7d1ff2 ("drm/i915: use new reservation_object_fences helper")
0e1d8083bd ("dma-buf: further relax reservation_object_add_shared_fence")
5d344f58da ("dma-buf: nuke reservation_object seq number")

The scenario that defeats simply grabbing a set of shared/exclusive
fences and using them blissfully under RCU is that any of those fences
may be reallocated by a SLAB_TYPESAFE_BY_RCU fence slab cache. In this
scenario, while keeping the rcu_read_lock we need to establish that no
fence was changed in the dma_resv after a read (or full) memory barrier.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190814182401.25009-1-chris@chris-wilson.co.uk
2019-08-16 12:40:58 +01:00
Andrey Grodzovsky
c43b849f89 drm/amdgpu: Use new mode2 reset interface for RV.
Integrate the mode2 reset into rest sequence.

v2:
Check ppfuncs pointer for NULL

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 11:00:44 -05:00
Andrey Grodzovsky
b1f5b4538e dmr/amdgpu: Fix compile error with CONFIG_DRM_AMDGPU_GART_DEBUGFS
Double defintion of 'i'

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:59:17 -05:00
Gang Ba
108b4d928c drm/amd/amdgpu: Update VM function pointer
When VM state changed and system in large bar mode,
make sure to use CPU update function, otherwise use
SDMA function.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Gang Ba <gaba@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:58:28 -05:00
Yong Zhao
8b7d6157f2 drm/amdgpu: Set VM_L2_CNTL.PDE_FAULT_CLASSIFICATION to 0 for GFX10
We have done this for pre-GFX10 asics, but GFX10 did not pick up the
new change. The below is the commit message for that change.

This is recommended by HW designers. Previously when it was set to 1,
the PDE walk error in VM fault will be treated as
PERMISSION_OR_INVALID_PAGE_FAULT rather than usually expected OTHER_FAULT.
As a result, the retry control in VM_CONTEXT*_CNTL will change accordingly.

The above behavior is kind of abnormal. Furthermore, the
PDE_FAULT_CLASSIFICATION == 1 feature was targeted for very old ASICs
and it never made it way to production. Therefore, we should set it to 0.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:58:14 -05:00
Yong Zhao
5d36d4c976 drm/amdgpu: Add more page fault info printing for GFX10
The printing we did for GFX9 was not propogated to GFX10 somehow, so fix
it now.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:58:08 -05:00
Yong Zhao
4e0ae5e214 drm/amdgpu: Add printing for RW extracted from VM_L2_PROTECTION_FAULT_STATUS
RW is also useful in most cases.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:58:02 -05:00
Oak Zeng
5413fce4b2 drm/amdkfd/gfx10: Calling amdgpu functions to invalidate TLB
Calling amdgpu function to invalidate TLB, instead of using a
kfd implementation. Delete the kfd local TLB invalidation
implementation.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:57:55 -05:00
Oak Zeng
3ff985485b drm/amdgpu: Export function to flush TLB of specific vm hub
This is for kfd to reuse amdgpu TLB invalidation function.
On gfx10, kfd only needs to flush TLB on gfx hub but not
on mm hub. So export a function for KFD flush TLB only on
specific hub.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:57:48 -05:00
Stephen Rothwell
2568cedc13 drm/amdgpu: MODULE_FIRMWARE requires linux/module.h
Fixes: 6a7a0bdbfa ("drm/amdgpu: add psp_v12_0 for renoir (v2)")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:52:01 -05:00
Tao Zhou
d6e0cbb152 drm/amdgpu: implement querying ras error count for mmhub
get mmhub ea ras error count by accessing EDC_CNT register

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:51:50 -05:00
Kevin Wang
f0f50dcfd4 drm/amdgpu: use exiting amdgpu_ctx_total_num_entities function
simplify driver code.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:51:45 -05:00
Kevin Wang
b81e57fbf9 drm/amdgpu: fix typo error amdgput -> amdgpu
fix typo error:
change function name from "amdgput_ctx_total_num_entities" to
"amdgpu_ctx_total_num_entities".

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:51:37 -05:00
Christoph Hellwig
244511f386 drm/amdgpu: simplify and cleanup setting the dma mask
Use dma_set_mask_and_coherent to set both masks in one go, and remove
the no longer required fallback, as the kernel now always accepts
larger than required DMA masks.  Fail the driver probe if we can't
set the DMA mask, as that means the system can only support a larger
mask.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:51:01 -05:00
Christoph Hellwig
90489ce18c drm/amdgpu: handle PCIe root ports with addressing limitations
amdgpu uses a need_dma32 flag to indicate to the drm core that some
allocations need to be done using GFP_DMA32, but it only checks the
device addressing capabilities to make that decision.  Unfortunately
PCIe root ports that have limited addressing exist as well.  Use the
dma_addressing_limited instead to also take those into account.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:51:00 -05:00
Christian König
52791eeec1 dma-buf: rename reservation_object to dma_resv
Be more consistent with the naming of the other DMA-buf objects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/
2019-08-13 09:09:30 +02:00
Pierre-Eric Pelloux-Prayer
17b6d2d528 drm/amdgpu: fix gfx9 soft recovery
The SOC15_REG_OFFSET() macro wasn't used, making the soft recovery fail.

v2: use WREG32_SOC15 instead of WREG32 + SOC15_REG_OFFSET

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-08-12 22:01:17 -05:00
Alex Deucher
b8cf3219cc drm/amdgpu: flag renoir as experimental for now
The current code won't likely work on production hw when
it ships so leave it as experimental until it's ready.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:51 -05:00
Huang Rui
c9d0ca8528 drm/amdgpu: skip mec2 jump table loading for renoir
Renoir need not load mec2 jump table with psp.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:51 -05:00
Huang Rui
444a0fea51 drm/amdgpu: use direct loading on renoir vcn for the moment
PSP has issue for renoir, that will cause VCN fw failed to be loaded. So use
direct loading for the moment till the issue is addressed.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
8deac23636 drm/amdgpu: set fw default loading by psp for renoir
By default, set amdgpu ucode type to AMDGPU_FW_LOAD_PSP.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
40c8a3293b drm/amdgpu: update lbpw for renoir
enable gfx_v9_0_init_lbpw for renoir

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
95f9e74c3a drm/amdgpu: enable power gating for renoir
enable gfx power gating for renoir

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
f78e007f76 drm/amdgpu: enable clock gating for renoir
enable gfx&common clock gating for renoir

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Leo Liu
279ba48e1f drm/amdgpu: add VCN2.0 to Renoir IP blocks
Thus enable VCN2.0 for Renoir

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Leo Liu
0c6b391d68 drm/amdgpu: enable Doorbell support for Renoir (v2)
Add VCN range aperture to NBIO 7.0

v2: rebase (Alex)

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Leo Liu
dc9b6e934b drm/amdgpu: enable Renoir VCN firmware loading
By adding new Renoir VCN firmware

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Huang Rui
a46e1716f3 drm/amdgpu: add sdma golden settings for renoir
This patch adds sdma golden settings for renoir asic.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Huang Rui
33294eb8cb drm/amdgpu: add gfx golden settings for renoir (v2)
This patch adds gfx golden settings for renoir real asic.

v2: update settings (Alex)

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
6a7a0bdbfa drm/amdgpu: add psp_v12_0 for renoir (v2)
1. Add psp ip block
2. Use direct loading type by default and it can also config psp
   loading type.
3. Bypass sos fw loading and xgmi&ras interface

v2: drop TA loading

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
6b3ad3b2da drm/amdgpu: set rlc funcs for renoir
add gfx_v9_0_rlc_funcs for renoir

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
e09ce48182 drm/amdgpu: add asic funcs for renoir
add asic funcs for renoir, init soc15_asic_funcs

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
b1326bbc63 drm/amdgpu: enable dce virtual ip module for Renoir
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Aaron Liu
0126abd4d1 drm/amdgpu: fix no interrupt issue for renoir emu
In renoir's ih model, there's a change in mmIH_CHICKEN
register, that limits IH to use physical address directly.
Those chicken bits need to be programmed first.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Huang Rui
61bdb39c91 drm/amdgpu: add renoir pci id
Add Renoir PCI id support.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Huang Rui
05e1f0e0ab drm/amdgpu: set ip blocks for renoir
Enable ip blocks for renoir.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:50 -05:00
Huang Rui
2d49738ae1 drm/amdgpu: add sdma support for renoir
Add renoir checks to appropriate places.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Huang Rui
1aafd447bc drm/amdgpu: add gfx support for renoir
Add Renoir checks to gfx9 code.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Huang Rui
378d53898a drm/amdgpu: set fw load type for renoir
This patch sets fw load type as direct for renoir for the moment.
Will switch to psp when psp is ready.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Huang Rui
8787ee0145 drm/amdgpu: add gmc v9 supports for renoir
Add gfx memory controller support for renoir.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Huang Rui
080deab66d drm/amdgpu: add soc15 common ip block support for renoir
This patch adds common ip support for renoir.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Huang Rui
b51a26a02a drm/amdgpu: add renoir support for gpu_info and ip block setting
This patch adds renoir support for gpu_info firmware and ip block setting.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Huang Rui
1eee4228a5 drm/amdgpu: add renoir asic_type enum
This patch adds renoir to amd_asic_type enum and amdgpu_asic_name[].

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Pierre-Eric Pelloux-Prayer
62cfcb9e23 drm/amdgpu: fix gfx9 soft recovery
The SOC15_REG_OFFSET() macro wasn't used, making the soft recovery fail.

v2: use WREG32_SOC15 instead of WREG32 + SOC15_REG_OFFSET

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Le Ma
a840159c82 drm/amdgpu: enable mmhub clock gating for Arcturus
Init MC_MGCG/LS flag. Also apply to athub CG.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Le Ma
cb15e8046d drm/amdgpu: add mmhub clock gating for Arcturus
Add 2 mmhub instances CG

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Le Ma
15e2f43a72 drm/amdgpu: increase CGCG gfx idle threshold for Arcturus
Follow the hw spec, and no need to consider gfxoff on Arcturus

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:49 -05:00
Le Ma
f9da7c4384 drm/amdgpu: add GFX_CP_LS flag to Arcturus
Missed AMD_CG_SUPPORT_GFX_CP_LS accidently when commit patch before
    drm/amdgpu: enable gfx clock gating for Arcturus

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Tao Zhou
5212a3bdf0 drm/amdgpu: remove ras block's feature status info in sysfs
feature mask info is enough for rocm tool,
"cat /sys/class/drm/card0/device/ras/features" will get the
info like this:

feature mask: 0x3ffb

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Le Ma
bee7b51ac9 drm/amdgpu: split athub clock gating from mmhub
Untie the bind of get/set athub CG state from mmhub, for cosmetic fix and Asic
not using mmhub 1.0. Besides, also fix wrong athub CG state in amdgpu_pm_info.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Le Ma
f7ee199528 drm/amdgpu: enable sdma clock gating for Arcturus
Init sdma MGCG/LS flag

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Le Ma
8dc7e07cff drm/amdgpu: add sdma clock gating for Arcturus
Add ARCTURUS case in sdma set clockgating function

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Le Ma
78864760c2 drm/amdgpu: support sdma clock gating for more instances
Shorten the code with RREG32_SDMA/WREG32_SDMA macro in CG part.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Le Ma
5d111f5b3a drm/amdgpu: enable hdp clock gating for Arcturus
Init hdp MGCG/LS flag as Vega20

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Le Ma
6acb87acef drm/amdgpu: add hdp clock gating for Arcturus
Add hdp CGLS for Arcturus in set common clockgating function

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Le Ma
6b76ce62bf drm/amdgpu: enable gfx clock gating for Arcturus
Init gfx MGCG/LS and CGCG/LS flag.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Le Ma
f60481a945 drm/amdgpu: add gfx clock gating for Arcturus
Add ARCTURUS case in gfx set clockgating function. No 3d clock on Arcturus.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Tao Zhou
145b03eb73 drm/amdgpu: create mmhub ras framework
enable mmhub ras feature and create sysfs/debugfs node for mmhub

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Tao Zhou
9fb2d8de4a drm/amdgpu: support mmhub ras in amdgpu ras
call mmhub ras query/inject in amdgpu ras

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Tao Zhou
3d093da098 drm/amdgpu: add amdgpu_mmhub_funcs definition
add amdgpu_mmhub_funcs definition and initialize it,
prepare for mmhub ras enablement

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Tao Zhou
44494f96ba drm/amdgpu: add sub block parameter in ras inject command
ras sub block index could be passed from shell command

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Guchun Chen
a2b459947b drm/amdgpu: add check to avoid array bound issue
Sub_block_index can be passed from user level, so
add one check before accessing the array first to
prevent array index out of bound problem.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Alex Deucher
2605172032 drm/amdgpu: add navi14 PCI ID
Add the navi14 PCI device id.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Michel Dänzer
965ebe3d5d drm/amdgpu: Update pitch on page flips without DC as well
DC already handles this correctly since amdgpu minor version 31. Bump
the minor version again so that xf86-video-amdgpu can take advantage of
this working without DC as well now.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
65872e59d6 drm/amdgpu: enable vcn clock gating for navi12
enables vcn medium grained clock gating

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
ca51678db4 drm/amdgpu: enable athub clock gating for navi12
enables athub medium grained clock gating and memory light sleep

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
89b8d6da24 drm/amdgpu/athub2: set clock gating for navi12
add navi12 define

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
fbe0bc5794 drm/amdgpu: enable ih clock gating for navi12
enables ih clock gating

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
8b797b3d30 drm/amdgpu: enable mmhub clock gating for navi12
enables mmhub medium grained clock gating and memory light sleep

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
cf5a95e5b8 drm/amdgpu/mmhub2: set clock gating for navi12
add navi12 define

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
358ab97f53 drm/amdgpu: enable sdma clock gating for navi12
enables sdma medium grained clock gating and memory light sleep

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
e2f9726ee9 drm/amdgpu/sdma5: set sdma clock gating for navi12
add navi12 define

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
5211c37a34 drm/amdgpu: enable hdp clock gating for navi12
enables hdp medium grained clock gating and memory light sleep

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Xiaojie Yuan
dca009e71c drm/amdgpu: enable gfx clock gatings for navi12
enables following gfx clock gating features:

- medium grained clock gating
- medium grained light sleep
- coarse grained clock gating
- cp memory light sleep
- rlc memory light sleep

CGLS (Coarse Grained Light Sleep) will break s3, so don't enable it.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Marek Olšák
05677c958a Revert "drm/amdgpu: fix transform feedback GDS hang on gfx10 (v2)"
This reverts commit 9ed2c993d7.

SET_CONFIG_REG writes to memory if register shadowing is enabled,
causing a VM fault.

NGG streamout is unstable anyway, so all UMDs should use legacy
streamout. I think Mesa is the only driver using NGG streamout.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:47 -05:00
Dave Airlie
e7f7287bf5 Merge tag 'drm-next-5.4-2019-08-09' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.4-2019-08-09:

Same as drm-next-5.4-2019-08-06, but with the
readq/writeq stuff fixed and 5.3-rc3 backmerged.

amdgpu:
- Add navi14 support
- Add navi12 support
- Add Arcturus support
- Enable mclk DPM for Navi
- Misc DC display fixes
- Add perfmon support for DF
- Add scatter/gather display support for Raven
- Improve SMU handling for GPU reset
- RAS support for GFX
- Drop last of drmP.h
- Add support for wiping memory on buffer release
- Allow cursor async updates for fb swaps
- Misc fixes and cleanups

amdkfd:
- Add navi14 support
- Add navi12 support
- Add Arcturus support
- CWSR trap handlers updates for gfx9, 10
- Drop last of drmP.h
- Update MAINTAINERS

radeon:
- Misc fixes and cleanups
- Make kexec more reliable by tearing down the GPU

ttm:
- Add release_notify callback

uapi:
- Add wipe memory on release flag for buffer creation

Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: resolved conflicts with ttm resv moving]
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809184807.3381-1-alexander.deucher@amd.com
2019-08-12 14:20:21 +10:00
Christian König
0e1d8083bd dma-buf: further relax reservation_object_add_shared_fence
Other cores don't busy wait any more and we removed the last user of checking
the seqno for changes. Drop updating the number for shared fences altogether.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/322379/?series=64837&rev=1
2019-08-10 12:49:28 +02:00
Alex Deucher
3f61fd41f3 Linux 5.3-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl1HiQMeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGFaIIAIM7UI5LXf7FMsVl
 zVemD9uDuCqNijycIfFoXvVvDt8y1PnyFJd5C/hRtXjsHyCPB49CRULE05q9ZOh6
 68jDa9VYOrnZoDlhMT4kuLf74x78RP19gVgQOLok8n0V3VKt7Yqrow5FKNOYVEfq
 0Rd2DqZMU5yGxo6iwG4y1PjCwvwDQ/tcaAGjc9RtOlmYl9KX9MoVHuwn4EEqO8pC
 3BN5GL0c/ebiCyNKG2n+y6vJGj5Y9rekyRYrtmtvhHsfs4iBirbnssMatyGm3gNz
 klysGhbQO98+DoVq3qqclVP5eK0XPdIBCAkF624tBhUN8gczRoQqVRBFuKCUCrD2
 h9wT8dE=
 =k65Y
 -----END PGP SIGNATURE-----

Merge tag 'v5.3-rc3' into drm-next-5.4

Linux 5.3-rc3

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-09 13:07:28 -05:00
Tao Zhou
6ca523d7eb drm/amdgpu: remove RREG64/WREG64
atomic 64 bits REG operations are useless currently

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-09 11:17:30 -05:00
Tao Zhou
dd21a572c9 drm/amdgpu: implement UMC 64 bits REG operations
implement 64 bits operations via 32 bits interface

v2: make use of lower_32_bits() and upper_32_bits() macros

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-09 11:17:10 -05:00
Tao Zhou
c6dddf4540 drm/amdgpu: replace readq/writeq with atomic64 operations
what we really want is a read or write that is guaranteed to be 64 bits
at a time, atomic64 operations are supported on all architectures

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-09 11:14:11 -05:00
Dave Airlie
b0383c0653 drm-misc-next for 5.4:
UAPI Changes:
  - HDCP: Add a Content protection type property
 
 Cross-subsystem Changes:
 
 Core Changes:
  - Continue to rework the include dependencies
  - fb: Remove the unused drm_gem_fbdev_fb_create function
  - drm-dp-helper: Make the link rate calculation more tolerant to
                   non-explicitly defined, yet supported, rates
  - fb-helper: Map DRM client buffer only when required, and instanciate a
               shadow buffer when the device has a dirty function or says so
  - connector: Add a helper to link the DDC adapter used by that connector to
               the userspace
  - vblank: Switch from DRM_WAIT_ON to wait_event_interruptible_timeout
  - dma-buf: Fix a stack corruption
  - ttm: Embed a drm_gem_object struct to make ttm_buffer_object a
         superclass of GEM, and convert drivers to use it.
  - hdcp: Improvements to report the content protection type to the
          userspace
 
 Driver Changes:
  - Remove drm_gem_prime_import/export from being defined in the drivers
  - Drop DRM_AUTH usage from drivers
  - Continue to drop drmP.h
  - Convert drivers to the connector ddc helper
 
  - ingenic: Add support for more panel-related cases
  - komeda: Support for dual-link
  - lima: Reduce logging
  - mpag200: Fix the cursor support
  - panfrost: Export GPU features register to userspace through an ioctl
  - pl111: Remove the CLD pads wiring support from the DT
  - rockchip: Rework to use DRM PSR helpers, fix a bug in the VOP_WIN_GET
              macro
  - sun4i: Improve support for color encoding and range
  - tinydrm: Rework SPI support, improve MIPI-DBI support, move to drm/tiny
  - vkms: Rework of the CRC tracking
 
  - bridges:
    - sii902x: Add support for audio graph card
    - tc358767: Rework AUX data handling code
    - ti-sn65dsi86: Add Debugfs and proper DSI mode flags support
 
  - panels
    - Support for GiantPlus GPM940B0, Sharp LQ070Y3DG3B, Ortustech
      COM37H3M, Novatek NT39016, Sharp LS020B1DD01D, Raydium RM67191,
      Boe Himax8279d, Sharp LD-D5116Z01B
    - Conversion of the device tree bindings to the YAML description
    - jh057n00900: Rework the enable / disable path
 
  - fbdev:
    - ssd1307fb: Support more devices based on that controller
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXUwPUAAKCRDj7w1vZxhR
 xQ4lAQDK2ijx29YHeZspbOwP4Nwq95DFs1uQcSm5GvbRt1JSowD9EwkLeNfkPkel
 Xv1Ts/Frgq7ckH2e2zkLPyCOFCHd0wA=
 =rIUl
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-08-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.4:

UAPI Changes:
 - HDCP: Add a Content protection type property

Cross-subsystem Changes:

Core Changes:
 - Continue to rework the include dependencies
 - fb: Remove the unused drm_gem_fbdev_fb_create function
 - drm-dp-helper: Make the link rate calculation more tolerant to
                  non-explicitly defined, yet supported, rates
 - fb-helper: Map DRM client buffer only when required, and instanciate a
              shadow buffer when the device has a dirty function or says so
 - connector: Add a helper to link the DDC adapter used by that connector to
              the userspace
 - vblank: Switch from DRM_WAIT_ON to wait_event_interruptible_timeout
 - dma-buf: Fix a stack corruption
 - ttm: Embed a drm_gem_object struct to make ttm_buffer_object a
        superclass of GEM, and convert drivers to use it.
 - hdcp: Improvements to report the content protection type to the
         userspace

Driver Changes:
 - Remove drm_gem_prime_import/export from being defined in the drivers
 - Drop DRM_AUTH usage from drivers
 - Continue to drop drmP.h
 - Convert drivers to the connector ddc helper

 - ingenic: Add support for more panel-related cases
 - komeda: Support for dual-link
 - lima: Reduce logging
 - mpag200: Fix the cursor support
 - panfrost: Export GPU features register to userspace through an ioctl
 - pl111: Remove the CLD pads wiring support from the DT
 - rockchip: Rework to use DRM PSR helpers, fix a bug in the VOP_WIN_GET
             macro
 - sun4i: Improve support for color encoding and range
 - tinydrm: Rework SPI support, improve MIPI-DBI support, move to drm/tiny
 - vkms: Rework of the CRC tracking

 - bridges:
   - sii902x: Add support for audio graph card
   - tc358767: Rework AUX data handling code
   - ti-sn65dsi86: Add Debugfs and proper DSI mode flags support

 - panels
   - Support for GiantPlus GPM940B0, Sharp LQ070Y3DG3B, Ortustech
     COM37H3M, Novatek NT39016, Sharp LS020B1DD01D, Raydium RM67191,
     Boe Himax8279d, Sharp LD-D5116Z01B
   - Conversion of the device tree bindings to the YAML description
   - jh057n00900: Rework the enable / disable path

 - fbdev:
   - ssd1307fb: Support more devices based on that controller

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808121423.xzpedzkpyecvsiy4@flea
2019-08-09 16:04:31 +10:00
Christoph Hellwig
9c240a7bb3 mm/hmm: make HMM_MIRROR an implicit option
Make HMM_MIRROR an option that is selected by drivers wanting to use it
instead of a user visible option as it is just a low-level implementation
detail.

Link: https://lore.kernel.org/r/20190806160554.14046-15-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-07 14:58:06 -03:00
Christoph Hellwig
7f08263d9b mm/hmm: remove the page_shift member from struct hmm_range
All users pass PAGE_SIZE here, and if we wanted to support single entries
for huge pages we should really just add a HMM_FAULT_HUGEPAGE flag instead
that uses the huge page size instead of having the caller calculate that
size once, just for the hmm code to verify it.

Link: https://lore.kernel.org/r/20190806160554.14046-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-07 14:58:06 -03:00
Christoph Hellwig
fac555ac93 mm/hmm: remove superfluous arguments from hmm_range_register
The start, end and page_shift values are all saved in the range structure,
so we might as well use that for argument passing.

Link: https://lore.kernel.org/r/20190806160554.14046-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-07 14:58:05 -03:00
Christoph Hellwig
07d82211b8 amdgpu: don't initialize range->list in amdgpu_hmm_init_range
The list is used to add the range to another list as an entry in the core
hmm code, and intended as a private member not exposed to drivers.  There
is no need to initialize it in a driver.

Link: https://lore.kernel.org/r/20190806160554.14046-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-07 14:51:46 -03:00
Christoph Hellwig
9d0a16658f amdgpu: remove -EAGAIN handling for hmm_range_fault
hmm_range_fault can only return -EAGAIN if called with the
HMM_FAULT_ALLOW_RETRY flag, which amdgpu never does.  Remove the handling
for the -EAGAIN case with its non-standard locking scheme.

Link: https://lore.kernel.org/r/20190806160554.14046-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-07 14:51:46 -03:00
Marek Olšák
d9dfe768b3 Revert "drm/amdgpu: fix transform feedback GDS hang on gfx10 (v2)"
This reverts commit 9ed2c993d7.

SET_CONFIG_REG writes to memory if register shadowing is enabled,
causing a VM fault.

NGG streamout is unstable anyway, so all UMDs should use legacy
streamout. I think Mesa is the only driver using NGG streamout.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06 23:28:41 -05:00
Likun Gao
72cda9bb5e drm/amdgpu: pin the csb buffer on hw init for gfx v8
Without this pin, the csb buffer will be filled with inconsistent
data after S3 resume. And that will causes gfx hang on gfxoff
exit since this csb will be executed then.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Tested-by: Paul Gover <pmw.gover@yahoo.co.uk>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06 15:04:40 -05:00
Andrey Grodzovsky
b5507c7e06 drm/amdgpu: Fix GPU reset crash regression.
amdgpu_ip_block.status.hw for GMC wasn't set to
false on suspend during GPU reset and so on resume gmc_v9_0_resume
wasn't called.
Caused by 'drm/amdgpu: fix double ucode load by PSP(v3)'

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06 13:53:19 -05:00
Xiaojie Yuan
b5c7385640 drm/amdgpu/discovery: move common discovery code out of navi1*_reg_base_init()
move amdgpu_discovery_reg_base_init() from navi1*_reg_base_init() to a
common function nv_reg_base_init().

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06 13:53:05 -05:00
tiancyin
35ef88fa11 drm/amdgpu/soc15: fix external_rev_id for navi14
fix the hard code external_rev_id.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06 13:52:58 -05:00
Tao Zhou
2a3c7ff6e3 drm/amdgpu: update ras sysfs feature info
remove confused ras error type info

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06 13:52:52 -05:00
xinhui pan
876923fb92 drm/amdgpu: Fix panic during gpu reset
Clear the flag after hw suspend, otherwise it skips the corresponding hw
resume.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06 13:52:37 -05:00
Likun Gao
1f288afc2c drm/amdgpu: pin the csb buffer on hw init for gfx v8
Without this pin, the csb buffer will be filled with inconsistent
data after S3 resume. And that will causes gfx hang on gfxoff
exit since this csb will be executed then.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Tested-by: Paul Gover <pmw.gover@yahoo.co.uk>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06 13:52:31 -05:00
Gerd Hoffmann
5a5011a724 drm/amdgpu: switch driver from bo->resv to bo->base.resv
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190805140119.7337-14-kraxel@redhat.com
2019-08-06 08:21:54 +02:00
Gerd Hoffmann
b96f3e7c80 drm/ttm: use gem vma_node
Drop vma_node from ttm_buffer_object, use the gem struct
(base.vma_node) instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190805140119.7337-9-kraxel@redhat.com
2019-08-06 08:21:54 +02:00
Gerd Hoffmann
c105de2828 drm/amdgpu: use embedded gem object
Drop drm_gem_object from amdgpu_bo, use the
ttm_buffer_object.base instead.

Build tested only.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190805140119.7337-6-kraxel@redhat.com
2019-08-06 08:21:54 +02:00
Christian König
0dbd555a01 dma-buf: add more reservation object locking wrappers
Complete the abstraction of the ww_mutex inside the reservation object.

This allows us to add more handling and debugging to the reservation
object in the future.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/320761/
2019-08-05 09:28:43 +02:00
Thong Thai
d1836f3813 drm/amd/amdgpu/vcn_v2_0: Move VCN 2.0 specific dec ring test to vcn_v2_0
VCN 2.0 firmware now requires a packet start command to be sent before
any other decode ring buffer command.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:36:35 -05:00
Thong Thai
af655cc5aa drm/amd/amdgpu/vcn_v2_0: Mark RB commands as KMD commands
Sets the CMD_SOURCE bit for VCN 2.0 Decoder Ring Buffer commands. This
bit was previously set by the RBC HW on older firmware. Newer firmware
uses a SW RBC and this bit has to be set by the driver.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:36:34 -05:00
shaoyunl
3cf7bf2e48 drm/amdgpu: enable Navi12 kfd support for amdgpu
Navi12 has the same interface as Navi10

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Leo Li
078655d982 drm/amdgpu: Add nv12 DC ip block
Load DC and amdgpu display manager

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Boyuan Zhang
400e9c5ea6 drm/amdgpu: enable DPG mode for Navi12
Enable Dynamic Power Gating VCN for Navi12.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Boyuan Zhang
1fbed280a2 drm/amdgpu: add VCN ip block for Navi12
Add VCN2 ip block for Navi12

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Boyuan Zhang
a3219816c4 drm/amdgpu: add Navi12 VCN firmware support
Add Navi12 to VCN family

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Xiaojie Yuan
6b66ae2e55 drm/amdgpu: add psp ip block for navi12
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Xiaojie Yuan
7f47efeb9e drm/amdgpu: add smu ip block for navi12
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Xiaojie Yuan
e60cc94b26 drm/amdgpu: start autoload till RLCG fw for navi12
rlc save restore list is not ready yet for navi12

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Xiaojie Yuan
739cdbd6a2 drm/amdgpu/psp11: add psp support for navi12
Same as other navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Jack Xiao
02938eed74 drm/amdgpu: correct smu rlc handshake enablement bit
Correct the enablement bit of SMU RLC handshake.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:41 -05:00
Xiaojie Yuan
c726fbf0fb drm/amdgpu/sdma5: add golden settings for navi12 (v2)
common golden settings are put in golden_settings_sdma_5 array

v2: update settings (Alex)

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
f8984cb9e3 drm/amdgpu/gfx10: add golden settings for navi12 (v2)
Add initial golden settings for navi12 gfx.

v2: update settings

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
7990202903 drm/amdgpu: enable virtual display for navi12
Virtual display is a sw display interface for
bring up and virtualization or for cards without
display hardware.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
71745cf474 drm/amdgpu/gfx10: set tcp harvest for navi12
Same as navi10.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
44e9e7c96c drm/amdgpu: add ip blocks for navi12
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
4a0e815fb3 drm/amdgpu/gmc10: set gart size and vm size for navi12
Same as other navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
f2d6731d77 drm/amdgpu/sdma5: add placeholder for navi12 golden settings
None yet.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
6f523fd7b3 drm/amdgpu/sdma5: declare sdma firmwares for navi12
Declare the firmwares and load the proper ones for navi12.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
4cdfc4a2be drm/amdgpu/gfx10: set rlc funcs for navi12
Same as other navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
9ff3dba6d6 drm/amdgpu/gfx10: set number of me(c)/pipe/queue for navi12
Same as other navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
716e9bb099 drm/amdgpu/gfx10: add placeholder for navi12 golden settings
Not used yet.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
92c123aec1 drm/amdgpu/gfx10: declare cp/rlc firmwares for navi12
Set the name properly to load the right ucode.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
6983469c1a drm/amdgpu/gfx10: add gfx config for navi12
got from mmCP_MAX_CONTEXT and mmPA_SC_FIFO_SIZE

v2: squash all navi asics together because the
settings are the same.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
59ab8c292b drm/amdgpu/gfx10: set gfx cg for navi12
Same as other navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
7e17e58bdd drm/amdgpu: set nbio/hdp cg for navi12
Same as navi10.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
74b5e509a0 drm/amdgpu: initialize cg/pg flags and external rev id for navi12
don't enable any cg/pg features yet.

v2: calculate external revision id from revision id so that we can
    differentiate navi12 A0 from A1 directly.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
d4d838ba4e drm/amdgpu: use front door firmware loading for navi12
Same as other navi asics.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:40 -05:00
Xiaojie Yuan
4808cf9c2a drm/amdgpu: set asic family and ip blocks for navi12
same with navi10

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Xiaojie Yuan
42b325e5ec drm/amdgpu: add gpu_info firmware for navi12
gpu_info firmare store asic configuration details.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Xiaojie Yuan
9802f5d78b drm/amdgpu: add navi12 asic type
Add asic type.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Xiaojie Yuan
03d0a073cf drm/amdgpu: initialize reg base for navi12
Set up the register offset map for navi12.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
John Clements
8c2ef8ca0e drm/amdgpu: update SDMA V4 microcode init
Removed loading duplicate instances of SDMA FW for Arcturus.
We use a single image for all instances.

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
John Clements
b86f8d8b2b drm/amdgpu: extend PSP FW loading support to 8 SDMA instances
Arcturus has 8 instances of SDMA.  Update host to PSP interface
to handle it.

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
John Clements
8fda90e821 drm/amdgpu: disable MEC2 JT context init for Arcturus
We don't need to handle it like other asics.

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
John Clements
6c37bde9c6 drm/amdgpu: update PSP CMD fail response status print
Print the response in hex with the apprpriate mask.

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
John Clements
dc0d962297 drm/amdgpu: add PSP KDB loading support for Arcturus
Add support for the arcturus specific psp metadata to the
amdgpu firmware and properly parse it when loading it.

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
John Clements
f36d9ab95f drm/amdgpu: add PSP SW init support for Arcturus
Add arcturus cases to psp init sewquence.

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
John Clements
c0dac3c9f5 drm/amdgpu: removed duplicate line
Remove duplicate break.

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Evan Quan
4abc1765d2 drm/amd/powerplay: enable SW SMU power profile switch support in KFD
Hook up the SW SMU power profile switch in KFD routine.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Tao Zhou
bd2280da46 drm/amdgpu: replace AMDGPU_RAS_UE with AMDGPU_RAS_SUCCESS
ce can also trigger interrupt, and even both ce and ue error can be
found in one ras query, distinguishing between ce and ue in interrupt
handler is uncessary.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Suggested-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Tao Zhou
91ba68f8d5 drm/amdgpu: only uncorrectable error needs gpu reset
we only read error information for correctable error in interrupt
handler, gpu reset is unnecessary since there is no data lost
in correctable error

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
b1a5895352 drm/amdgpu: update the calc algorithm of umc ecc error count
the initial value of ecc error count can be adjusted

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
b7f92097f5 drm/amdgpu: implement umc ras init function
enable umc ce interrupt and initialize ecc error count

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
51437623a0 drm/amdgpu: support ce interrupt in ras module
correctable error can also trigger interrupt in some ras blocks

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
13b7c46c18 drm/amdgpu: add error address query for umc ras
umc error address query can get ce/ue error address and clear error
status

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
2b671b6049 drm/amdgpu: apply umc_for_each_channel macro to umc_6_1
use umc_for_each_channel to make code simpler

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
fee858ba5f drm/amdgpu: add macro of umc for each channel
common function for all umc versions, loop for each umc channel is
a frequent used operation in umc block, define it as a macro to
simplify code

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
3aacf4ea11 drm/amdgpu: initialize new parameters and functions for amdgpu_umc structure
add initialization for new members of amdgpu_umc structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
33b97cf896 drm/amdgpu: add more parameters and functions to amdgpu_umc structure
expose more parameters and functions of specific umc version to common
umc layer, so amdgpu_umc layer and other blocks could access them

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
a55c8d7bda drm/amdgpu: remove the clear of MCA_ADDR
clearing MCA_STATUS is enough to reset the whole MCA, writing zero to
MCA_ADDR is unnecessary

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Colin Ian King
ac4bf4a1eb drm/amdgpu: fix unsigned variable instance compared to less than zero
Currenly the error check on variable instance is always false because
it is a uint32_t type and this is never less than zero. Fix this by
making it an int type.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 7d0e6329df ("drm/amdgpu: update more sdma instances irq support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Jay Cornwall
5145d57ec5 drm/amdkfd: Extend CU mask to 8 SEs (v3)
Following bitmap layout logic introduced by:
"drm/amdgpu: support get_cu_info for Arcturus".

v2: squash in fixup for gfx_v9_0.c (Alex)
v3: squash in debug print output fix

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:19:11 -05:00
Le Ma
857b82d0df drm/amdgpu: support get_cu_info for Arcturus
This change is because SE/SH layout on Arcturus is 8*1, different from
4*2(or 4*1) on Vega ASICs.

Currently the cu bitmap array is 4x4 size, and besides the bitmap is used widely
across SW stack. To mostly reduce the scale of impact, we make the cu bitmap
array compatible with SE/SH layout on Arcturus. Then the store of cu bits of
each shader array for Arcturus will be like below:
    SE0,SH0 --> bitmap[0][0]
    SE1,SH0 --> bitmap[1][0]
    SE2,SH0 --> bitmap[2][0]
    SE3,SH0 --> bitmap[3][0]
    SE4,SH0 --> bitmap[0][1]
    SE5,SH0 --> bitmap[1][1]
    SE6,SH0 --> bitmap[2][1]
    SE7,SH0 --> bitmap[3][1]

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:19:05 -05:00
Kent Russell
612e4ed99b drm/amdgpu: Fix pcie_bw on Vega20
The registers used for VG20 are different in that certain performance
counters were split off to TXCLK3/4. Vega10/12 doesn't have this, so add
a new vg20_get_pcie_usage to reflect this change.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:58 -05:00
Andrey Grodzovsky
19ed70ff5d drm/amdgpu: Add amdgpu_asic_funcs.reset_method for Vega20
Fixes GPU reset crash.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:44 -05:00
Felix Kuehling
6856e4b65f drm/amdgpu: Mark KFD VRAM allocations for wipe on release
Memory used by KFD applications can contain sensitive information that
should not be leaked to other processes. The current approach to prevent
leaks is to clear VRAM at allocation time. This is not effective because
memory can be reused in other ways without being cleared. Synchronously
clearing memory on the allocation path also carries a significant
performance penalty.

Stop clearing memory at allocation time. Instead mark the memory for
wipe on release.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:38 -05:00
Felix Kuehling
ab2f7a5c18 drm/amdgpu: Implement VRAM wipe on release
Wipe VRAM memory containing sensitive data when moving or releasing
BOs. Clearing the memory is pipelined to minimize any impact on
subsequent memory allocation latency. Use of a poison value should
help debug future use-after-free bugs.

When moving BOs, the existing ttm_bo_pipelined_move ensures that the
memory won't be reused before being wiped.

When releasing BOs, the BO is fenced with the memory fill operation,
which results in queuing the BO for a delayed delete.

v2: Move amdgpu_amdkfd_unreserve_memory_limit into
amdgpu_bo_release_notify so that KFD can use memory that's still
being cleared in the background

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:32 -05:00
Monk Liu
482f0e5385 drm/amdgpu: fix double ucode load by PSP(v3)
previously the ucode loading of PSP was repreated, one executed in
phase_1 init/re-init/resume and the other in fw_loading routine

Avoid this double loading by clearing ip_blocks.status.hw in suspend or reset
prior to the FW loading and any block's hw_init/resume

v2:
still do the smu fw loading since it is needed by bare-metal

v3:
drop the change in reinit_early_sriov, just clear all block's status.hw
in the head place and set the status.hw after hw_init done is enough

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:34 -05:00
Monk Liu
9244d3a6eb drm/amdgpu: fix incorrect judge on sos fw version
for SRIOV the SOS fw of PSP is loaded in hypervisor thus
guest won't tell the version of it, and judging feature by
reading the sos fw version in guest side is completely wrong

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:28 -05:00
Monk Liu
4cd4c5c064 drm/amdgpu: cleanup vega10 SRIOV code path
we can simplify all those unnecessary function under
SRIOV for vega10 since:
1) PSP L1 policy is by force enabled in SRIOV
2) original logic always set all flags which make itself
   a dummy step

besides,
1) the ih_doorbell_range set should also be skipped
for VEGA10 SRIOV.
2) the gfx_common registers should also be skipped
for VEGA10 SRIOV.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:21 -05:00
Dave Airlie
4b381ee25d Merge tag 'drm-fixes-5.3-2019-07-31' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
drm-fixes-5.3-2019-07-31:

amdgpu:
- Fix temperature granularity for navi
- Fix stable pstate setting for navi
- Fix VCN DPM enablement on navi
- Fix error handling on CS ioctl when processing dependencies
- Fix possible information leak in debugfs

amdkfd:
- fix memory alignment for VegaM

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190731191648.25729-1-alexander.deucher@amd.com
2019-08-02 09:35:40 +10:00
Alex Deucher
9475a77b57 drm/amdkfd: enable KFD support for navi14
Same as navi10.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:26 -05:00
Dennis Li
dc4d716d4c drm/amdgpu: disable inject for failed subblocks of gfx
some subblocks of gfx fail in inject test, disable them

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:20 -05:00
Dennis Li
83b0582c90 drm/amdgpu: support gfx ras error injection and err_cnt query
check gfx error count in both ras querry function and
ras interrupt handler.

gfx ras is still disabled by default due to known stability
issue found in gpu reset.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:14 -05:00
Dennis Li
2c960ea02f drm/amdgpu: add RAS callback for gfx
Add functions for RAS error inject and query error counter

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:08 -05:00
Dennis Li
dc23a08f03 drm/amdgpu: add define for gfx ras subblock
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:01 -05:00
Tao Zhou
7cdc2ee300 drm/amdgpu: remove ras_reserve_vram in ras injection
error injection address is not in gpu address space

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:41 -05:00
Tao Zhou
e10634938b drm/amdgpu: add check for ras error type
only ue and ce errors are supported

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:35 -05:00
Tao Zhou
81e02619e9 drm/amdgpu: update interrupt callback for all ras clients
add err_data parameter in interrupt cb for ras clients

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:29 -05:00
Tao Zhou
cf04dfd0e9 drm/amdgpu: allow ras interrupt callback to return error data
add error data as parameter for ras interrupt cb and process it

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:23 -05:00
Tao Zhou
8c94810357 drm/amdgpu: query umc ras error address
query umc ras error address, translate it to gpu 4k page view
and save it.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:17 -05:00
Tao Zhou
c2742aef4d drm/amdgpu: add structures for umc error address translation
add related registers, callback function and channel index table

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:11 -05:00
Tao Zhou
6f102dba80 drm/amdgpu: add support for recording ras error address
more than one error address may be recorded in one query

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:05 -05:00
Tao Zhou
f1ed4afa13 drm/amdgpu: update algorithm of umc uncorrectable error counting
remove the check of ErrorCodeExt

v2: refine the if condition for ue counting

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:58 -05:00
Tao Zhou
045c021653 drm/amdgpu: switch to amdgpu_umc structure
create new amdgpu_umc structure to for more umc
settings in future and switch to the new structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:52 -05:00
Tao Zhou
5bbfb64a17 drm/amdgpu: use 64bit operation macros for umc
replace some 32bit macros with 64bit operations to simplify code

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:46 -05:00
Tao Zhou
4fa1c6a679 drm/amdgpu: add RREG64/WREG64(_PCIE) operations
add 64 bits register access functions

v2: implement 64 bit functions in low level

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:40 -05:00
Tao Zhou
05a58345db drm/amdgpu: add ras error count after each query (v2)
v1: increase ras ce/ue error count
v2: log the number of correctable and uncorrectable errors

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:33 -05:00
Hawking Zhang
939e2258ce drm/amdgpu: querry umc error count
check umc error count in both ras querry function and
ras interrupt handler

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:28 -05:00
Hawking Zhang
5b6b35aaac drm/amdgpu: init umc v6_1 functions for vega20
init umc callback function for vega20 in sw early init phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:22 -05:00
Hawking Zhang
9884c2b1c3 drm/amdgpu: add umc v6_1 query error count support
Implement umc query_ras_error_count function to support querry
both correctable and uncorrectable error

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:16 -05:00
Hawking Zhang
9e585a523b drm/amdgpu: add amdgpu_umc_functions structure
This is common structure as UMC callback function

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:48:57 -05:00
Hawking Zhang
6501a77170 drm/amdgpu: init RSMU and UMC ip base address for vega20
the driver needs to program RSMU and UMC registers to
support vega20 RAS feature

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:48:51 -05:00
Hawking Zhang
7af25d5b7e drm/amdgpu: move some ras data structure to amdgpu_ras.h
These are common structures that can be included by IP specific
source files

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:48:32 -05:00
Alex Deucher
fa1884f9d8 drm/amdgpu: drop drmP.h from vcn_v2_5.c
Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:41 -05:00
Alex Deucher
9a2ffeb525 drm/amdgpu: drop drmP.h from vcn_v2_0.c
And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:36 -05:00
Alex Deucher
75589f496d drm/amdgpu: drop drmP.h from sdma_v5_0.c
And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:31 -05:00
Alex Deucher
e9eea90247 drm/amdgpu: drop drmP.h from nv.c
And fix up the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:26 -05:00
Alex Deucher
b23b2e9e49 drm/amdgpu: drop drmP.h from navi10_ih.c
And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:21 -05:00
Alex Deucher
0a069bbe13 drm/amdgpu: drop drmP.h in gfx_v10_0.c
And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:16 -05:00
Alex Deucher
3b90f6ecdf drm/amdgpu: drop drmP.h from amdgpu_amdkfd_gfx_v10.c
Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:10 -05:00
Alex Deucher
32978d8cfd drm/amdgpu: drop drmP.h in amdgpu_amdkfd_arcturus.c
Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:32:56 -05:00
Andrzej Pietrasiewicz
5b50fa2b35 drm/amdgpu: Provide ddc symlink in connector sysfs directory
Use the ddc pointer provided by the generic connector.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7fee0fa0d0f77af6595d283d5f3ae5d551475821.1564161140.git.andrzej.p@collabora.com
2019-07-31 16:35:37 +02:00
Daniel Vetter
b2ad978fd0 drm/amdgpu: Fill out gem_object->resv
That way we can ditch our gem_prime_res_obj implementation. Since ttm
absolutely needs the right reservation object all the boilerplate is
already there and we just have to wire it up correctly.

Note that gem/prime doesn't care when we do this, as long as we do it
before the bo is registered and someone can call the handle2fd ioctl
on it.

Aside: ttm_buffer_object.ttm_resv could probably be ditched in favour
of always passing a non-NULL resv to ttm_bo_init(). At least for gem
drivers that would avoid having two of these, on in ttm_buffer_object
and the other in drm_gem_object, one just there for confusion.

Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Michel Dänzer" <michel.daenzer@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Amber Lin <Amber.Lin@amd.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Junwei Zhang <Jerry.Zhang@amd.com>
Cc: Thomas Zimmermann <contact@tzimmermann.org>
Cc: Samuel Li <Samuel.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725132655.11951-4-daniel.vetter@ffwll.ch
2019-07-31 10:19:23 +02:00
Evan Quan
6dee4829cf drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval
VCN should be used for Vega20 later ASICs while UVD and VCE
are for previous ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 02:02:22 -05:00
Wang Xiayang
929e571c04 drm/amdgpu: fix a potential information leaking bug
Coccinelle reports a path that the array "data" is never initialized.
The path skips the checks in the conditional branches when either
of callback functions, read_wave_vgprs and read_wave_sgprs, is not
registered. Later, the uninitialized "data" array is read
in the while-loop below and passed to put_user().

Fix the path by allocating the array with kcalloc().

The patch is simplier than adding a fall-back branch that explicitly
calls memset(data, 0, ...). Also it does not need the multiplication
1024*sizeof(*data) as the size parameter for memset() though there is
no risk of integer overflow.

Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 01:26:09 -05:00
Christian König
67d0859e27 drm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep
We always need to drop the ctx reference and should check
for errors first and then dereference the fence pointer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 01:26:09 -05:00
Evan Quan
479156f2e5 drm/amd/powerplay: fix null pointer dereference around dpm state relates
DPM state relates are not supported on the new SW SMU ASICs. But still
it's not OK to trigger null pointer dereference on accessing them.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 00:06:58 -05:00
Kent Russell
d65848657c drm/amdkfd: Fix byte align on VegaM
This was missed during the addition of VegaM support

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:59:00 -05:00
Hawking Zhang
861324983d drm/amdgpu: correct irq type used for sdma ecc
we should pass irq type, instead of irq client id,
to irq_get/put interface

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:35 -05:00
Evan Quan
1f96ecef6f drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval
VCN should be used for Vega20 later ASICs while UVD and VCE
are for previous ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Le Ma
7d0e6329df drm/amdgpu: update more sdma instances irq support
Update for sdma ras ecc_irq and other minors.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
7c16d24abe drm/amdgpu: correct VCN powergate routine for acturus
Arcturus VCN should powergate in the way as Navi.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
fe089e1dd7 drm/amd/powerplay: enable arcturus powerplay
Arcturus powerplay is ready to use.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
cca4fafc09 drm/amd/powerplay: initialize arcturus MP1 and THM base address
Initialize base address for those IPs which are used in powerplay.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Wang Xiayang
1a2c29bce0 drm/amdgpu: fix a potential information leaking bug
Coccinelle reports a path that the array "data" is never initialized.
The path skips the checks in the conditional branches when either
of callback functions, read_wave_vgprs and read_wave_sgprs, is not
registered. Later, the uninitialized "data" array is read
in the while-loop below and passed to put_user().

Fix the path by allocating the array with kcalloc().

The patch is simplier than adding a fall-back branch that explicitly
calls memset(data, 0, ...). Also it does not need the multiplication
1024*sizeof(*data) as the size parameter for memset() though there is
no risk of integer overflow.

Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Christian König
6494120695 drm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep
We always need to drop the ctx reference and should check
for errors first and then dereference the fence pointer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Thong Thai
c74dbe44ea drm/amd/amdgpu/vcn_v2_0: Move VCN 2.0 specific dec ring test to vcn_v2_0
VCN 2.0 firmware now requires a packet start command to be sent before
any other decode ring buffer command.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Alex Deucher
3207dcf3af drm/amdgpu/gfx10: update golden settings for navi14
Updated settings for hw team.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Kevin Wang
98eb03bbf0 drm/amd/powerplay: implment sysfs feature status function in smu
1. Unified feature enable status format in sysfs
2. Rename ppfeature to pp_features to adapt other pp sysfs node name
3. this function support all asic, not asic related function.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Rui Huang <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Joseph Greathouse
2c89731803 drm/amdgpu: Default disable GDS for compute+gfx
Units in the GDS block default to allowing all VMIDs access to all
entries. Disable shader access to the GDS, GWS, and OA blocks from all
compute and gfx VMIDs by default. For compute, HWS firmware will set
up the access bits for the appropriate VMID when a compute queue
requires access to these blocks.
The driver will handle enabling access on-demand for graphics VMIDs.

Leaving VMID0 with full access because otherwise HWS cannot save or
restore values during task switch.

v2: Fixed code and comment styling.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Thong Thai
333fe325fe drm/amd/amdgpu/vcn_v2_0: Mark RB commands as KMD commands
Sets the CMD_SOURCE bit for VCN 2.0 Decoder Ring Buffer commands. This
bit was previously set by the RBC HW on older firmware. Newer firmware
uses a SW RBC and this bit has to be set by the driver.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Andrey Grodzovsky
f2bd8a0ed7 drm/amdgpu: Fix amdgpu_display_supported_domains logic.
Add restriction to dissallow GTT domain if the relevant BO
doesn't have USWC flag set to avoid the APU hang scenario.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Alex Deucher
a3a09142f4 drm/amdgpu: put the SMC into the proper state on reset/unload
When doing a GPU reset or unloading the driver, we need to
put the SMU into the apprpriate state for the re-init after
the reset or unload to reliably work.

I don't think this is necessary for BACO because the SMU actually
controls the BACO state to it needs to be active.

For suspend (S3), the asic is put into D3 so the SMU would be
powered down so I don't think we need to put the SMU into
any special state.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:44 -05:00
Alex Deucher
2ddc6c3ef9 drm/amdgpu: add reset_method asic callback for navi
Navi uses either mode1 or baco depending on various
conditions.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:17 -05:00
Alex Deucher
ee360c0b7c drm/amdgpu: add reset_method asic callback for soc15
APUs only support mode2 reset.  dGPUs use either mode1 or
baco depending on various conditions.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:13 -05:00
Alex Deucher
9bc1932f5c drm/amdgpu: add reset_method asic callback for vi
VI always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:10 -05:00
Alex Deucher
6d0f50dafe drm/amdgpu: add reset_method asic callback for cik
CIK always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:06 -05:00
Alex Deucher
dd81eede77 drm/amdgpu: add reset_method asic callback for si
SI always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:56 -05:00
Alex Deucher
0cf3c64f29 drm/amdgpu: add an asic callback to determine the reset method
Sometimes the driver may have to behave differently depending
on the method we are using to reset the GPU.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:44 -05:00
Evan Quan
f0d2a7dc11 drm/amd/powerplay: fix null pointer dereference around dpm state relates
DPM state relates are not supported on the new SW SMU ASICs. But still
it's not OK to trigger null pointer dereference on accessing them.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:29 -05:00
Evan Quan
fcd90fee8a drm/amd/powerplay: minor fixes around SW SMU power and fan setting
Add checking for possible invalid input and null pointer. And
drop redundant code.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:21 -05:00
Shirish S
1c42591591 drm/amd/display: enable S/G for RAVEN chip
enables gpu_vm_support in dm and adds
AMDGPU_GEM_DOMAIN_GTT as supported domain

v2:
Move BO placement logic into amdgpu_display_supported_domains

v3:
Use amdgpu_bo_validate_uswc in amdgpu_display_supported_domains.

v4:
amdgpu_bo_validate_uswc moved to sepperate patch.

Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:12 -05:00
Andrey Grodzovsky
ddcb7fc62f drm/amdgpu: Add check for USWC support for amdgpu_display_supported_domains
This verifies we don't add GTT as allowed domain for APUs when USWC
is disabled.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:55 -05:00
Andrey Grodzovsky
3d1b8ec76b drm/amdgpu: Create helper to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC
Move the logic to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC in
amdgpu_bo_do_create into standalone helper so it can be reused
in other functions.

v4:
Switch to return bool.

v5: Fix typos.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:48 -05:00
Andrey Grodzovsky
e4c4073b01 drm/amdgpu: Fix hard hang for S/G display BOs.
HW requires for caching to be unset for scanout BO
mappings when the BO placement is in GTT memory.
Usually the flag to unset is passed from user mode
but for FB mode this was missing.

v2:
Keep all BO placement logic in amdgpu_display_supported_domains

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:41 -05:00
Jonathan Kim
24f9aacfb0 drm/amdgpu: adding xgmi error monitoring
monitor xgmi errors via mc pie status through fica registers.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Kent Russell <Kent.Russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:34 -05:00
Jonathan Kim
64671c0fdc drm/amdgpu: add perfmon and fica atomics for df
adding perfmon and fica atomic operations to adhere to data fabrics finite
state machine requirements for indirect register access.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Kent Russell <Kent.Russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:26 -05:00
tiancyin
5f4814deab drm/amdgpu/gmc10: fix pte mytpe field error for navi14
navi14 share same PTE format with navi10.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:35 -05:00
James Zhu
6913848087 drm/amdgpu: use VCN firmware offset for cache window
Since we are using the signed FW now, and also using PSP firmware loading,
but it's still potential to break driver when loading FW directly
instead of PSP, so we should add offset.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:28 -05:00
Evan Quan
780f3a9c5b drm/amd/powerplay: some cosmetic fixes
Drop redundant check, duplicate check, duplicate setting
and fix the return value.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:13 -05:00
Chuhong Yuan
911d8b3069 drm/amdgpu: Use dev_get_drvdata where possible
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:18:41 -05:00
Kent Russell
4cab85afe9 drm/amdkfd: Fix byte align on VegaM
This was missed during the addition of VegaM support

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:17:35 -05:00
Linus Torvalds
88c5083442 Wimplicit-fallthrough patches for 5.3-rc2
Hi Linus,
 
 Please, pull the following patches that mark switch cases where we are
 expecting to fall through. These patches are part of the ongoing efforts
 to enable -Wimplicit-fallthrough. Most of them have been baking in linux-next
 for a whole development cycle.
 
 Also, pull the Makefile patch that globally enables the
 -Wimplicit-fallthrough option.
 
 Finally, some missing-break fixes that have been tagged for -stable:
 
  - drm/amdkfd: Fix missing break in switch statement
  - drm/amdgpu/gfx10: Fix missing break in switch statement
 
 Notice that with these changes, we completely get rid of all the
 fall-through warnings in the kernel.
 
 Thanks
 
 Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAl06XP0ACgkQRwW0y0cG
 2zHPXhAAsatJGNIg7vuSicVipJBIlwgRLwtbE2rV+MCneUAZj37O9NtBgbxHNoJ+
 5TBB8sLpNCw3rEvPKwm6vicRgntMclEY1vaplPOHt3RiY4lqVSIkpvFSCw1hGur3
 +34+O++n/6HbtY96T+DN5WGMXrU3JSe46xnBLIt0BOoUwyKMaNdntiyd79GrrnyB
 BwPkHrmB+9FEVq+dHmPnRIhc9WUfIdily3+j1CIncaM4eXKWLjoUlOIw1VcSEc4X
 SSdFh2co+3nm/6O54ZxUEC1PyuQZWedsgFmeiTrOanG3AEWQ/jX7GwXvPlgraa2E
 F5MzGUOQwXYL/IzXl1vGjQc4+FV4nhIjEBTDdZseOp7FP5xkHyyOwzGDEmaMXpXT
 XThb+k7Q6EbiSPLFcz8zkCwNB8ngNJMNOsGP3JPFD06MfquqdzP7T1BsHcNtiMVo
 FNwQg9KtvbK9F7a9lXDqfcxfuEScbqSvnKWWwNLImCqIBATYirJzeTeLvTbVWpA2
 iyXF3ylIclsSMOvCtaz41iuoqNh12eqw2UIFPBmkU2wpVkTt+ZIn0Apgom90xe1K
 tSUqMBbBMoHVtsBsILdkdz/m6FJpK+YmmwpRrJSDvPxA5c9caD7cj+62amW/gP4z
 Hi6mC4hw1H5bKMsC2PP/QgJcmS/pgo1zwA9J2a2NNJbYPATHfKI=
 =J2sO
 -----END PGP SIGNATURE-----

Merge tag 'Wimplicit-fallthrough-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull Wimplicit-fallthrough enablement from Gustavo A. R. Silva:
 "This marks switch cases where we are expecting to fall through, and
  globally enables the -Wimplicit-fallthrough option in the main
  Makefile.

  Finally, some missing-break fixes that have been tagged for -stable:

   - drm/amdkfd: Fix missing break in switch statement

   - drm/amdgpu/gfx10: Fix missing break in switch statement

  With these changes, we completely get rid of all the fall-through
  warnings in the kernel"

* tag 'Wimplicit-fallthrough-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  Makefile: Globally enable fall-through warning
  drm/i915: Mark expected switch fall-throughs
  drm/amd/display: Mark expected switch fall-throughs
  drm/amdkfd/kfd_mqd_manager_v10: Avoid fall-through warning
  drm/amdgpu/gfx10: Fix missing break in switch statement
  drm/amdkfd: Fix missing break in switch statement
  perf/x86/intel: Mark expected switch fall-throughs
  mtd: onenand_base: Mark expected switch fall-through
  afs: fsclient: Mark expected switch fall-throughs
  afs: yfsclient: Mark expected switch fall-throughs
  can: mark expected switch fall-throughs
  firewire: mark expected switch fall-throughs
2019-07-27 11:04:18 -07:00
Christoph Hellwig
9a4903e49e mm/hmm: replace the block argument to hmm_range_fault with a flags value
This allows easier expansion to other flags, and also makes the callers a
little easier to read.

Link: https://lore.kernel.org/r/20190726005650.2566-4-rcampbell@nvidia.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-26 11:10:53 -03:00
Ralph Campbell
1f96180792 mm/hmm: replace hmm_update with mmu_notifier_range
The hmm_mirror_ops callback function sync_cpu_device_pagetables() passes a
struct hmm_update which is a simplified version of struct
mmu_notifier_range. This is unnecessary so replace hmm_update with
mmu_notifier_range directly.

Link: https://lore.kernel.org/r/20190726005650.2566-2-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
[jgg: white space tuning]
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-26 11:10:53 -03:00
Dave Airlie
4d5308e785 Merge tag 'drm-fixes-5.3-2019-07-24' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
drm-fixes-5.3-2019-07-24:

amdgpu:
- RAS fixes for vega20
- Navi VCN fix
- DC audio fixes
- DC DSC fixes
- DC dongle fixes
- DC clk mgr fixes
- Fix DDC lines on some RV2 boards
- GDS fixes for compute
- Navi SMU fixes

ttm:
- Use the same attributes when freeing d_page->vaddr

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190724210527.3415-1-alexander.deucher@amd.com
2019-07-26 14:10:26 +10:00
Gustavo A. R. Silva
d64062b57e drm/amdgpu/gfx10: Fix missing break in switch statement
Add missing break statement in order to prevent the code from falling
through to case AMDGPU_IRQ_STATE_ENABLE.

This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.

Fixes: a644d85a5c ("drm/amdgpu: add gfx v10 implementation (v10)")
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
2019-07-25 20:12:50 -05:00
Alex Deucher
1bcff32679 drm/amdgpu/smu: move fan rpm query into the asic specific code
On vega20, there is an SMU message to query it.  On navi, it's fetched
from the metrics table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-22 15:25:32 -05:00
Alex Deucher
95ccc15508 drm/amdgpu/smu: move fan rpm query into the asic specific code
On vega20, there is an SMU message to query it.  On navi, it's fetched
from the metrics table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-22 14:57:39 -05:00
Hawking Zhang
d52d6de280 drm/amdgpu: set sdma irq src num according to sdma instances
Otherwise, it will cause driver access non-existing sdma registers
in gpu reset code path

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-22 14:57:31 -05:00
Maxime Ripard
03b0f2ce73 Linus 5.3-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0006weHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGaDUIAJ4oTyVWpMRZkfG6
 vVY8qVMU3zlzEqRiyLYjkXoe/mGpuU/UVTyyStllxZ+Gg9da0mGwlugScKriPJof
 4KRUDDTGX5DrfEOo+0brKvM+PYh9uGViPgKXzyv7i6BrnX2z3JdBR4bKNuEYlAJ9
 N93Qg7v05SBHIq2Gfp3klrdWbsTTW2EaDTLbcgifXLnfKyFr47kwsmXAHPlTFP0p
 dYsZHHmf14Y9n1+ToZeVINgjQFr6mFn6ygY/PqTnd6vCgEEfP9eENJ4BZCtN1ZL/
 V0BO9MyJ5iZV0AfwSEKydk+kDEvO16TG/nyDrECVuur7AXsBx18ZplVc787f6GK+
 dyCQJ3U=
 =XLAF
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXTYRHQAKCRDj7w1vZxhR
 xY5IAQC0H/r62rlFq+JpbmksutMqvIferowP7HUk6yOaAKdVawD/c1qsTk/xxI0x
 StrxRCDqeGA7D2R/ZNb/4sobnn7+oAM=
 =k9CF
 -----END PGP SIGNATURE-----

Merge v5.3-rc1 into drm-misc-next

Noralf needs some SPI patches in 5.3 to merge some work on tinydrm.

Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
2019-07-22 21:24:10 +02:00
Linus Torvalds
31cc088a4f drm fixes for -rc1:
nouveau:
 - bugfixes + TU116 enabling (minor iteration):w
 
 amdgpu:
 - large pile of fixes for new hw support this release (navi, vega20)
 - audio hotplug fix
 - bunch of corner cases and small fixes all over for amdgpu/kfd
 
 komeda:
 - back out some new properties (from this merge window) that needs
   more pondering.
 
 bochs: fb pitch setup
 
 ... plus a new panel quirk
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEb4nG6jLu8Y5XI+PfTA9ye/CYqnEFAl0x4sAACgkQTA9ye/CY
 qnFAAw/+JJy7fo95tIVM81p8yDxugpS3+fAJNTnKIndE2behYHPnKCrRk8BhDr0O
 x5xPy4yZHOTndmpDlLUCpV6b8xOvEX+orCNWsqbI2/Kff4yqtBRXhxBhM/3byMth
 nvfjwKVHDLo6SbL0SIIhZTTYBdBDa9zbilJjY86Xn2GdSiiyF/mC3Fhx21tXVTwq
 guoaRDcHAlAwvprKube1dC5y5IXoljJg+w6ydqwma/qUP08As/g0FiI9XvUuzLmY
 ffezdDrsHZPlNIVjGKr2QMhPl6DFSzQRV5UbqXGw7f9s6vW71qtt8a9F+rFk7Ers
 Uq0mqT9VgX6qQ9aBCyXax5UyFj+xr3Owan/D1QEyrUMPpkZHdubz5cliqw20dtYy
 1KNpZtMXR29swGn7J0o/VmtFsRr86+yX9/gL2dY8QDhGCAo/7tYRdDFXBApB+Fgb
 G3Z3Q6YYib6Rom7x3oiZpraf+KY9a+N5RTTrUgvSSxvC7SxxHw/PJbnX7Cjb13fU
 luFw1qs53qv0ytg++UQWivEf5pm/FonhBFq/KikMwtD+LhdtoIm186gPexpV6eaY
 hJZnr9BDafUCwxGZQZ4y01VUwPI5neXTUur8KVOCPqBgtFSR2m6ipgEnZUk9ltLm
 l73MfVbjbvpthds/2+8XDhzB3hnwmTzJlcXN1cQ2RJOEYoBwpe4=
 =s190
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-07-19' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Daniel Vetter:
 "Dave is back in shape, but now family got it so I'm doing the pull.
  Two things worthy of note:

   - nouveau feature pull was way too late, Dave&me decided to not take
     that, so Ben spun up a pull with just the fixes.

   - after some chatting with the arm display maintainers we decided to
     change a bit how that's maintained, for more oversight/review and
     cross vendor collab.

  More details below:

  nouveau:
   - bugfixes
   - TU116 enabling (minor iteration) :w

  amdgpu:
   - large pile of fixes for new hw support this release (navi, vega20)
   - audio hotplug fix
   - bunch of corner cases and small fixes all over for amdgpu/kfd

  komeda:
   - back out some new properties (from this merge window) that needs
     more pondering.

  bochs:
   - fb pitch setup

  core:
   - a new panel quirk
   - misc fixes"

* tag 'drm-next-2019-07-19' of git://anongit.freedesktop.org/drm/drm: (73 commits)
  drm/nouveau/secboot/gp102-: remove WAR for SEC2 RTOS start bug
  drm/nouveau/flcn/gp102-: improve implementation of bind_context() on SEC2/GSP
  drm/nouveau: fix memory leak in nouveau_conn_reset()
  drm/nouveau/dmem: missing mutex_lock in error path
  drm/nouveau/hwmon: return EINVAL if the GPU is powered down for sensors reads
  drm/nouveau: fix bogus GPL-2 license header
  drm/nouveau: fix bogus GPL-2 license header
  drm/nouveau/i2c: Enable i2c pads & busses during preinit
  drm/nouveau/disp/tu102-: wire up scdc parameter setter
  drm/nouveau/core: recognise TU116 chipset
  drm/nouveau/kms: disallow dual-link harder if hdmi connection detected
  drm/nouveau/disp/nv50-: fix center/aspect-corrected scaling
  drm/nouveau/disp/nv50-: force scaler for any non-default LVDS/eDP modes
  drm/nouveau/mcp89/mmu: Use mcp77_mmu_new instead of g84_mmu_new on MCP89.
  drm/amd/display: init res_pool dccg_ref, dchub_ref with xtalin_freq
  drm/amdgpu/pm: remove check for pp funcs in freq sysfs handlers
  drm/amd/display: Force uclk to max for every state
  drm/amdkfd: Remove GWS from process during uninit
  drm/amd/amdgpu: Fix offset for vmid selection in debugfs interface
  drm/amd/powerplay: update vega20 driver if to fit latest SMU firmware
  ...
2019-07-19 12:29:43 -07:00
Leo Liu
53ef3969dd drm/amdgpu: use VCN firmware offset for cache window
Since we are using the signed FW now, and also using PSP firmware loading,
but it's still potential to break driver when loading FW directly
instead of PSP, so we should add offset.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
33c976c961 drm/amdgpu: drop ras self test
this function is not needed any more. error injection is
the only way to validate ras but it can't be executed in
amdgpu_ras_init, where gpu is even not initialized

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
a5dd40ca81 drm/amdgpu: only allow error injection to UMC IP block
error injection to other IP blocks (except UMC) will be enabled
until RAS feature stablize on those IP blocks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
4d249d3abd drm/amdgpu: disable GFX RAS by default
GFX RAS has not been stablized yet. disable GFX ras until
it is fully funcitonal.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
fb2a36075a drm/amdgpu: do not create ras debugfs/sysfs node for ASICs that don't have ras ability
driver shouldn't init any ras debugfs/sysfs node for ASICs that don't have ras
hardware ability

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Joseph Greathouse
fbdc5d8d84 drm/amdgpu: Default disable GDS for compute VMIDs
The GDS and GWS blocks default to allowing all VMIDs to
access all entries. Graphics VMIDs can handle setting
these limits when the driver launches work. However,
compute workloads under HWS control don't go through the
kernel driver. Instead, HWS firmware should set these
limits when a process is put into a VMID slot.

Disable access to these devices by default by turning off
all mask bits (for OA) and setting BASE=SIZE=0 (for GDS
and GWS) for all compute VMIDs. If a process wants to use
these resources, they can request this from the HWS
firmware (when such capabilities are enabled). HWS will
then handle setting the base and limit for the process when
it is assigned to a VMID.

This will also prevent user kernels from getting 'stuck' in
GWS by accident if they write GWS-using code but HWS
firmware is not set up to handle GWS reset. Until HWS is
enabled to handle GWS properly, all GWS accesses will
MEM_VIOL fault the kernel.

v2: Move initialization outside of SRBM mutex

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Alex Deucher
a08a4dae7a drm/amdgpu: flag arcturus as experimental for now
Current support will only work in internal engineering
boards.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Alex Deucher
ad91b134a2 drm/amdgpu: drop unused function definitions
These were dropped and the headers never got cleaned up.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
James Zhu
1da418ba65 drm/amdgpu:add all VCN rings into schedule request queue
Add all VCN instances' decode/encode/jpeg decode rings into
drm_sched_rq list.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Le Ma
69d4de94f8 drm/amdgpu: enable all 8 sdma instances for Arcturus silicon
The more 6 sdma instances work fine now with DF fix in vbios:
  * mmDF_PIE_AON_MiscClientsEnable(0x1c728)=0x3fe(DF_ALL_INSTANCE)
       [9:4]MmhubsEnable=3f (change from 0)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Yong Zhao
5ddd4a9a7c drm/amdgpu: Add more detail to the VM fault printing
With the printing, we don't need to parse the value on our own any more.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Le Ma
fc1e272e8d drm/amdgpu: limit sdma instances to 2 for Arcturus in BU phase
Another 6 sdma instances do not work at present. Disable them to unblock KFD
for silicon bringup as a workaround

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Hawking Zhang
f9cf36fcaf drm/amdgpu: skip gfx 9 common golden settings for arct
They are not needed by arct

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00