If xnack is on, VM retry fault interrupt send to IH ring1, and ring1
will be full quickly. IH cannot receive other interrupts, this causes
deadlock if migrating buffer using sdma and waiting for sdma done
while handling retry fault.
Remove VMC from IH storm client, enable ring1 write pointer
overflow, then IH will drop retry fault interrupts and be able to receive
other interrupts while driver is handling retry fault.
IH ring1 write pointer doesn't writeback to memory by IH, and ring1
write pointer recorded by self-irq is not updated, so always read
the latest ring1 write pointer from register.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
By default this timestamp is 32 bit counter. It gets overflowed in
around 10 minutes.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
unmap range always increase atomic svms->drain_pagefaults to simplify
both parent range and child range unmap, page fault handle ignores the
retry fault if svms->drain_pagefaults is set to speed up interrupt
handling. svm_range_drain_retry_fault restart draining if another
range unmap from cpu.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VMA may be removed before unmap notifier callback, and deferred list
work remove range, return success for this special case as we are
handling stale retry fault.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kfd_process_wq_release drain retry fault to ensure no retry fault comes
after removing kfd process from the hash table, otherwise svm page fault
handler will fail to recover the fault and dump GPU vm fault log.
Refactor deferred list work to get_task_mm and take mmap write lock
to handle all ranges, and avoid mm is gone while inserting mmu notifier.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise when IH process restart, count is zero, the loop will
not exit to wake_up_all after processing AMDGPU_IH_MAX_NUM_IVS
interrupts.
Cc: stable@vger.kernel.org
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Renoir and newer gfx9 APUs have new TSC register that is
not part of the gfxoff tile, so it can be read without
needing to disable gfx off.
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Apply the same check we do for dGPUs for APUs as well.
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: 9f4f2c1a35 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov")
For sriov XGMI configuration, the host driver will handle the hive reset,
so in guest side, the reset_sriov only be called once on one device. This will
make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already
been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov
function to make them balance .
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
Due to pass the wrong parameter down to the enable_stream_gating(),
it would cause the DSC of the removing stream would not be PG.
[HOW]
To pass the correct parameter down th the enable_stream_gating().
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Yi-Ling Chen <Yi-Ling.Chen2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
A warning appears in the log on GPU reset for
link_enc_cfg_link_encs_assign for the following condition:
ASSERT(state->res_ctx.link_enc_cfg_ctx.link_enc_assignments[i].valid == false);
This is not expected behavior and may result in link encoders being
incorrectly assigned.
[How]
The dc->current_state is backed up into dm->cached_dc_state before
we commit 0 streams.
DC will clear link encoder assignments on the real state but the
changes won't propagate over to the copy we made before the
0 streams commit.
DC expects that link encoder assignments are *not* valid
when committing a state, so as a workaround it needs to be cleared
before passing it back into DC.
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
We're only setting the flags on stream[0]'s planes so this logic fails
if we have more than one stream in the state.
This can cause a page flip timeout with multiple displays in the
configuration.
[How]
Index into the stream_status array using the stream index - it's a 1:1
mapping.
Fixes: cdaae8371a ("drm/amd/display: Handle GPU reset for DC block")
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
The HW interrupt gets disabled after GPU reset so we don't receive
notifications for HPD or AUX from DMUB - leading to timeout and
black screen with (or without) DPIA links connected.
[How]
Re-enable the interrupt after GPU reset like we do for the other
DC interrupts.
Fixes: 81927e2808 ("drm/amd/display: Support for DMUB AUX")
Reviewed-by: Jude Shih <Jude.Shih@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_amdkfd_gpuvm_free_memory_of_gpu drop dmabuf reference increased in
amdgpu_gem_prime_export.
amdgpu_bo_destroy drop dmabuf reference increased in
amdgpu_gem_prime_import.
So remove this extra dma_buf_put to avoid double free.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Tested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Disable HDP register remapping on SRIOV and set rmmio_remap.reg_offset
to the fixed address of the VF register for hdp_v*_flush_hdp.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Bokun Zhang <bokun.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to the virtual frame buffer (vfb) the pv display driver is not
essential for booting the system. Set the respective flag.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211022064800.14978-3-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The Hyper-V DRM driver tries to free MMIO region on removing
the device regardless of VM type, while Gen1 VMs don't use MMIO
and hence causing the kernel to crash on a NULL pointer dereference.
Fix this by making deallocating MMIO only on Gen2 machines and implement
removal for Gen1
Fixes: 76c56a5aff ("drm/hyperv: Add DRM driver for hyperv synthetic video device")
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Reviewed-by: Deepak Rawat <drawat.floss@gmail.com>
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211119112900.300537-1-mgamal@redhat.com
In particular, we need to ensure all the necessary blocks are switched
to 64b mode (a5xx+) otherwise the high bits of the address of the BO to
snapshot state into will be ignored, resulting in:
*** gpu fault: ttbr0=0000000000000000 iova=0000000000012000 dir=READ type=TRANSLATION source=CP (0,0,0,0)
platform 506a000.gmu: [drm:a6xx_gmu_set_oob] *ERROR* Timeout waiting for GMU OOB set BOOT_SLUMBER: 0x0
Fixes: 4f776f4511 ("drm/msm/gpu: Convert the GPU show function to use the GPU state")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211108180122.487859-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
If you happened to try to access `/dev/drm_dp_aux` devices provided by
the MSM DP AUX driver too early at bootup you could go boom. Let's
avoid that by only allowing AUX transfers when the controller is
powered up.
Specifically the crash that was seen (on Chrome OS 5.4 tree with
relevant backports):
Kernel panic - not syncing: Asynchronous SError Interrupt
CPU: 0 PID: 3131 Comm: fwupd Not tainted 5.4.144-16620-g28af11b73efb #1
Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
Call trace:
dump_backtrace+0x0/0x14c
show_stack+0x20/0x2c
dump_stack+0xac/0x124
panic+0x150/0x390
nmi_panic+0x80/0x94
arm64_serror_panic+0x78/0x84
do_serror+0x0/0x118
do_serror+0xa4/0x118
el1_error+0xbc/0x160
dp_catalog_aux_write_data+0x1c/0x3c
dp_aux_cmd_fifo_tx+0xf0/0x1b0
dp_aux_transfer+0x1b0/0x2bc
drm_dp_dpcd_access+0x8c/0x11c
drm_dp_dpcd_read+0x64/0x10c
auxdev_read_iter+0xd4/0x1c4
I did a little bit of tracing and found that:
* We register the AUX device very early at bootup.
* Power isn't actually turned on for my system until
hpd_event_thread() -> dp_display_host_init() -> dp_power_init()
* You can see that dp_power_init() calls dp_aux_init() which is where
we start allowing AUX channel requests to go through.
In general this patch is a bit of a bandaid but at least it gets us
out of the current state where userspace acting at the wrong time can
fully crash the system.
* I think the more proper fix (which requires quite a bit more
changes) is to power stuff on while an AUX transfer is
happening. This is like the solution we did for ti-sn65dsi86. This
might be required for us to move to populating the panel via the
DP-AUX bus.
* Another fix considered was to dynamically register / unregister. I
tried that at <https://crrev.com/c/3169431/3> but it got
ugly. Currently there's a bug where the pm_runtime() state isn't
tracked properly and that causes us to just keep registering more
and more.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://lore.kernel.org/r/20211109100403.1.I4e23470d681f7efe37e2e7f1a6466e15e9bb1d72@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
If "data_lanes" property of the dsi output endpoint is missing in
the DT, num_data_lanes would be 0 by default, which could cause
dsi_host_attach() to fail if dsi->lanes is set to a non-zero value
by the bridge driver.
According to the binding document of msm dsi controller, the
input/output endpoint of the controller is expected to have 4 lanes.
So let's set num_data_lanes to 4 by default.
Signed-off-by: Philip Chen <philipchen@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211030100812.1.I6cd9af36b723fed277d34539d3b2ba4ca233ad2d@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
Looks like 658f4c8296 ("drm/msm/devfreq: Add 1ms delay before
clamping freq") was badly rebased on top of efb8a170a3 ("drm/msm:
Fix devfreq NULL pointer dereference on a3xx") and ended up with
the NULL check in the wrong place.
Fixes: 658f4c8296 ("drm/msm/devfreq: Add 1ms delay before clamping freq")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211120200103.1051459-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
This was supposed to be a relative timer, not absolute.
Fixes: 658f4c8296 ("drm/msm/devfreq: Add 1ms delay before clamping freq")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20211120200103.1051459-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Avoid a possible uninitialized use of gpu_scid variable to fix the
below smatch warning:
drivers/gpu/drm/msm/adreno/a6xx_gpu.c:1480 a6xx_llc_activate()
error: uninitialized symbol 'gpu_scid'.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211118154903.3.Ie4ac321feb10168af569d9c2b4cf6828bed8122c@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
Mesa attempts to allocate a cached-coherent buffer in order to determine
if cached-coherent is supported. Resulting in seeing this error message
once per process with newer mesa. But no reason for this to be more
than a debug msg.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211111230214.765476-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
When converting to use an idr to map userspace fence seqno values back
to a dma_fence, we lost the error return when userspace passes seqno
that is larger than the last submitted fence. Restore this check.
Reported-by: Akhil P Oommen <akhilpo@codeaurora.org>
Fixes: a61acbbe9c ("drm/msm: Track "seqno" fences by idr")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211111192457.747899-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
We weren't dropping the submitqueue reference in all paths. In
particular, when the fence has already been signalled. Split out
a helper to simplify handling this in the various different return
paths.
Fixes: a61acbbe9c ("drm/msm: Track "seqno" fences by idr")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211111192457.747899-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
In commit 510410bfc0 ("drm/msm: Implement mmap as GEM object
function") we switched to a new/cleaner method of doing things. That's
good, but we missed a little bit.
Before that commit, we used to _first_ run through the
drm_gem_mmap_obj() case where `obj->funcs->mmap()` was NULL. That meant
that we ran:
vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
...and _then_ we modified those mappings with our own. Now that
`obj->funcs->mmap()` is no longer NULL we don't run the default
code. It looks like the fact that the vm_flags got VM_IO / VM_DONTDUMP
was important because we're now getting crashes on Chromebooks that
use ARC++ while logging out. Specifically a crash that looks like this
(this is on a 5.10 kernel w/ relevant backports but also seen on a
5.15 kernel):
Unable to handle kernel paging request at virtual address ffffffc008000000
Mem abort info:
ESR = 0x96000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=000000008293d000
[ffffffc008000000] pgd=00000001002b3003, p4d=00000001002b3003,
pud=00000001002b3003, pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
[...]
CPU: 7 PID: 15734 Comm: crash_dump64 Tainted: G W 5.10.67 #1 [...]
Hardware name: Qualcomm Technologies, Inc. sc7280 IDP SKU2 platform (DT)
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
pc : __arch_copy_to_user+0xc0/0x30c
lr : copyout+0xac/0x14c
[...]
Call trace:
__arch_copy_to_user+0xc0/0x30c
copy_page_to_iter+0x1a0/0x294
process_vm_rw_core+0x240/0x408
process_vm_rw+0x110/0x16c
__arm64_sys_process_vm_readv+0x30/0x3c
el0_svc_common+0xf8/0x250
do_el0_svc+0x30/0x80
el0_svc+0x10/0x1c
el0_sync_handler+0x78/0x108
el0_sync+0x184/0x1c0
Code: f8408423 f80008c3 910020c6 36100082 (b8404423)
Let's add the two flags back in.
While we're at it, the fact that we aren't running the default means
that we _don't_ need to clear out VM_PFNMAP, so remove that and save
an instruction.
NOTE: it was confirmed that VM_IO was the important flag to fix the
problem I was seeing, but adding back VM_DONTDUMP seems like a sane
thing to do so I'm doing that too.
Fixes: 510410bfc0 ("drm/msm: Implement mmap as GEM object function")
Reported-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211110113334.1.I1687e716adb2df746da58b508db3f25423c40b27@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reported-by: Douglas Anderson <dianders@chromium.org>
Fixes: 9bc9557017 ("drm/msm: Devfreq tuning")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Tested-By: Steev Klimaszewski <steev@kali.org>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211105202021.181092-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
In commit 142639a52a ("drm/msm/a6xx: fix crashstate capture for
A650") we changed a6xx_get_gmu_registers() to read 3 sets of
registers. Unfortunately, we didn't change the memory allocation for
the array. That leads to a KASAN warning (this was on the chromeos-5.4
kernel, which has the problematic commit backported to it):
BUG: KASAN: slab-out-of-bounds in _a6xx_get_gmu_registers+0x144/0x430
Write of size 8 at addr ffffff80c89432b0 by task A618-worker/209
CPU: 5 PID: 209 Comm: A618-worker Tainted: G W 5.4.156-lockdep #22
Hardware name: Google Lazor Limozeen without Touchscreen (rev5 - rev8) (DT)
Call trace:
dump_backtrace+0x0/0x248
show_stack+0x20/0x2c
dump_stack+0x128/0x1ec
print_address_description+0x88/0x4a0
__kasan_report+0xfc/0x120
kasan_report+0x10/0x18
__asan_report_store8_noabort+0x1c/0x24
_a6xx_get_gmu_registers+0x144/0x430
a6xx_gpu_state_get+0x330/0x25d4
msm_gpu_crashstate_capture+0xa0/0x84c
recover_worker+0x328/0x838
kthread_worker_fn+0x32c/0x574
kthread+0x2dc/0x39c
ret_from_fork+0x10/0x18
Allocated by task 209:
__kasan_kmalloc+0xfc/0x1c4
kasan_kmalloc+0xc/0x14
kmem_cache_alloc_trace+0x1f0/0x2a0
a6xx_gpu_state_get+0x164/0x25d4
msm_gpu_crashstate_capture+0xa0/0x84c
recover_worker+0x328/0x838
kthread_worker_fn+0x32c/0x574
kthread+0x2dc/0x39c
ret_from_fork+0x10/0x18
Fixes: 142639a52a ("drm/msm/a6xx: fix crashstate capture for A650")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20211103153049.1.Idfa574ccb529d17b69db3a1852e49b580132035c@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
Before the drm driver had support for this file there was a driver that
exposed the contents of the vga password register to userspace. It would
present the entire register instead of interpreting it.
The drm implementation chose to mask of the lower bit, without explaining
why. This breaks the existing userspace, which is looking for 0xa8 in
the lower byte.
Change our implementation to expose the entire register.
Fixes: 696029eb36 ("drm/aspeed: Add sysfs for output settings")
Reported-by: Oskar Senft <osk@google.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Jeremy Kerr <jk@codeconstruct.com.au>
Tested-by: Oskar Senft <osk@google.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211117010145.297253-1-joel@jms.id.au
The ->gem_create_object() functions are supposed to return NULL if there
is an error. None of the callers expect error pointers so returing one
will lead to an Oops. See drm_gem_vram_create(), for example.
Fixes: c826a6e106 ("drm/vc4: Add a BO cache.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211118111416.GC1147@kili
and one revert targeting stable 5.4, for TGL's DSI display clocks
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmGW1DAACgkQ+mJfZA7r
E8oMlwf+N6COiFS92Wv1Lt50RN9jWRFv8UPSr3R9KoqZd1gU/yo0iTMFMm7mZcOc
7uu/ykcuYSBAEB6m/oSMKsklhHTvKC3CUyCQ+dpn5keGe+YLtKlRF1Vkjfrw/+sE
n3P/pxeY4s/QxHpvPeTnuOWocRzFWxglRac4Vlb+UXnwBSt0SFUGkvaigSUefRi9
3VjmrTgR8xWRnadPMae7C3P0ZoHjcmOhCiwQSpBeDXCQD+VFlmlHEFYHRzNRmrJw
MPSkdc8NUjXwb+lHZHWWz/XORrRkP45XII/FYcpq62kN9ak6TzQBc9VLrrcBTURy
Wq8x2LXFtjaDtALoeq0+m84atS3LWQ==
=HwsI
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-fixes-2021-11-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
One quick fix for return error handling, one fix for ADL-P display
and one revert targeting stable 5.4, for TGL's DSI display clocks
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YZbUPIHpR1S3JZ2b@intel.com
The nvkm_acr_lsfw_add() function never returns NULL. It returns error
pointers on error.
Fixes: 22dcda45a3 ("drm/nouveau/acr: implement new subdev to replace "secure boot"")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211118111314.GB1147@kili
In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed
There is a potential memleak if not call kobject_put.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In SRIOV configuration, the reset may failed to bring asic back to normal but stop cpsch
already been called, the start_cpsch will not be called since there is no resume in this
case. When reset been triggered again, driver should avoid to do uninitialization again.
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add support that allow the userspace tool like RGP to get the GFX clock
value at runtime, the fix follow the old way to show the min/current/max
clocks level for compatible consideration.
=== Test ===
$ cat /sys/class/drm/card0/device/pp_dpm_sclk
0: 200Mhz *
1: 1100Mhz
2: 1600Mhz
then run stress test on one APU system.
$ cat /sys/class/drm/card0/device/pp_dpm_sclk
0: 200Mhz
1: 1040Mhz *
2: 1600Mhz
The current GFXCLK value is updated at runtime.
BugLink: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5260
Reviewed-by: Huang Ray <Ray.Huang@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode
which assign amdgpu_encoder->native_mode with *preferred_mode result in
amdgpu_encoder->native_mode.clock always be 0. That will cause
amdgpu_connector_set_property returned early on:
if ((rmx_type != DRM_MODE_SCALE_NONE) &&
(amdgpu_encoder->native_mode.clock == 0))
when we try to set scaling mode Full/Full aspect/Center.
Add the missing function to amdgpu_connector_vga_get_mode can fix this.
It also works on dvi connectors because
amdgpu_connector_dvi_helper_funcs.get_mode use the same method.
Signed-off-by: hongao <hongao@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[Why]
After commit ("drm/amdgpu/display: add support for multiple backlights")
number of eDPs is defined while registering backlight device.
However the panel's extended caps get updated once before register call.
That leads to regression with extended caps like oled brightness control.
[How]
Update connector ext caps after register_backlight_device
Fixes: 7fd13baeb7 ("drm/amdgpu/display: add support for multiple backlights")
Link: https://www.reddit.com/r/AMDLaptops/comments/qst0fm/after_updating_to_linux_515_my_brightness/
Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Samuel Čavoj <samuel@cavoj.net>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jasdeep Dhillon <Jasdeep.Dhillon@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Print Navi1x fine grained clocks in a consistent manner with other SOCs.
Don't show aritificial DPM level when the current clock equals min or max.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Driver initialization is driven by IP version from IP
discovery table. So add error print when failing to add
ip block during driver initialization, this will be more
friendly to user to know which IP version is not correct.
[ 40.467361] [drm] host supports REQ_INIT_DATA handshake
[ 40.474076] [drm] add ip block number 0 <nv_common>
[ 40.474090] [drm] add ip block number 1 <gmc_v10_0>
[ 40.474101] [drm] add ip block number 2 <psp>
[ 40.474103] [drm] add ip block number 3 <navi10_ih>
[ 40.474114] [drm] add ip block number 4 <smu>
[ 40.474119] [drm] add ip block number 5 <amdgpu_vkms>
[ 40.474134] [drm] add ip block number 6 <gfx_v10_0>
[ 40.474143] [drm] add ip block number 7 <sdma_v5_2>
[ 40.474147] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init
[ 40.474545] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device.
v2: use dev_err to multi-GPU system
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Also print the message index and parameter of the stuck command.
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The intel_engine_create_virtual() function does not return NULL. It
returns error pointers.
Fixes: e5e32171a2 ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211116114916.GB11936@kili
(cherry picked from commit fc12b70d12)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This reverts commit 991d9557b0 ("drm/i915/tgl/dsi: Gate the ddi clocks
after pll mapping"). The Bspec was updated recently with the pll ungate
sequence similar to that of icl dsi enable sequence. Hence reverting.
Bspec: 49187
Fixes: 991d9557b0 ("drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping")
Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211109120428.15211-1-vandita.kulkarni@intel.com
(cherry picked from commit 4579509ef1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
drm_sched_job_add_dependency() could drop the last ref, so we need to do
the dma_fence_get() first.
Cc: Christian König <christian.koenig@amd.com>
Fixes: 9c2ba26535 ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211116155545.473311-1-robdclark@gmail.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Christian König <christian.koenig@amd.com>
Trivial fix since we now need to grab a reference to the fence we have
added. Previously the dma_resv function where doing that for us.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 9c2ba26535 ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
Link: https://patchwork.freedesktop.org/patch/msgid/20211019112706.27769-1-christian.koenig@amd.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com>
References: https://lore.kernel.org/dri-devel/2023306.UmlnhvANQh@archbook/
Tested-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com>
Tested-by: Yassine Oudjana <y.oudjana@protonmail.com>
When PHY_SUN6I_MIPI_DPHY is selected, and RESET_CONTROLLER
is not selected, Kbuild gives the following warning:
WARNING: unmet direct dependencies detected for PHY_SUN6I_MIPI_DPHY
Depends on [n]: (ARCH_SUNXI [=n] || COMPILE_TEST [=y]) && HAS_IOMEM [=y] && COMMON_CLK [=y] && RESET_CONTROLLER [=n]
Selected by [y]:
- DRM_SUN6I_DSI [=y] && HAS_IOMEM [=y] && DRM_SUN4I [=y]
This is because DRM_SUN6I_DSI selects PHY_SUN6I_MIPI_DPHY
without selecting or depending on RESET_CONTROLLER, despite
PHY_SUN6I_MIPI_DPHY depending on RESET_CONTROLLER.
These unmet dependency bugs were detected by Kismet,
a static analysis tool for Kconfig. Please advise if this
is not the appropriate solution.
v2:
Fixed indentation to match the rest of the file.
Signed-off-by: Julian Braha <julianbraha@gmail.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211109032351.43322-1-julianbraha@gmail.com
The GEM CMA helpers allocate non-coherent (i.e., cached) backing storage
with dma_alloc_noncoherent(), but release it with dma_free_wc(). Fix this
with a call to dma_free_noncoherent(). Writecombining storage is still
released with dma_free_wc().
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: cf8ccbc72d ("drm: Add support for GEM buffers backed by non-coherent memory")
Acked-by: Paul Cercueil <paul@crapouillou.net>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.14+
Link: https://patchwork.freedesktop.org/patch/msgid/20210708175146.10618-1-tzimmermann@suse.de
bridge:
- HPD improvments for lt9611uxc
- eDP aux-bus support for ps8640
- LVDS data-mapping selection support
ttm:
- remove huge page functionality (needs reworking)
- fix a race condition during BO eviction
panels:
- add some new panels
fbdev:
- fix double-free
- remove unused scrolling acceleration
- CONFIG_FB dep improvements
locking:
- improve contended locking logging
- naming collision fix
dma-buf:
- add dma_resv_for_each_fence iterator
- fix fence refcounting bug
- name locking fixesA
prime:
- fix object references during mmap
nouveau:
- various code style changes
- refcount fix
- device removal fixes
- protect client list with a mutex
- fix CE0 address calculation
i915:
- DP rates related fixes
- Revert disabling dual eDP that was causing state readout problems
- put the cdclk vtables in const data
- Fix DVO port type for older platforms
- Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
- CCS FBs related fixes
- Fix recursive lock in GuC submission
- Revert guc_id from i915_request tracepoint
- Build fix around dmabuf
amdgpu:
- GPU reset fix
- Aldebaran fix
- Yellow Carp fixes
- DCN2.1 DMCUB fix
- IOMMU regression fix for Picasso
- DSC display fixes
- BPC display calculation fixes
- Other misc display fixes
- Don't allow partial copy from user for DC debugfs
- SRIOV fixes
- GFX9 CSB pin count fix
- Various IP version check fixes
- DP 2.0 fixes
- Limit DCN1 MPO fix to DCN1
amdkfd:
- SVM fixes
- Fix gfx version for renoir
- Reset fixes
udl:
- timeout fix
imx:
- circular locking fix
virtio:
- NULL ptr deref fix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGN3YwACgkQDHTzWXnE
hr6aZQ/+Pobf1VE7V3wPUcopxccJYmgBvG/uY8EDyjA8qaxHs2pQqGN2IooOGxr6
F8G1N94Hem/PCDn3T8JI2Tqw5z4sy4UwLahEWISurFCen1IMAfA7hYfutp9X3O7X
8h7b+PgkvVruEAHF7z0kqnWGPHmcro29cIHNkXVRjnJuz+Gmn1XRfo6Jj65n6D7u
NfMeU4/lWRR3767oJQzTqyAYtGxsKaZT3/tBD5WggZBzEKC7hqhAl8EUoOLWwojo
fDqwiEpLXpraPRIQH8trkXVHhzPeLAmG916WwS8JG3CEk9mUQ+I7Jshhd8cw+bsQ
XPuk3OBfU9mtuiGgNzrLP3xXJZs/QN3EkpKZWLefTnJY+C4BgiP2RifTnghmwV31
6/7Pr83CX/cn3BRd7r0xaeBZYvVYBZmwoZcsZFJBM8SVjd/ofKUfAmCzZZKheio2
5qa6bj9DQoyjEoFAULh23plcX6hvATGP7wzfRTnJ9AlAJ0KyEjVJ3r0qE6jHMDc/
uzcTAnKIWCxt9kSgE5qwLQtxLBaBpr/iOniZbCqGkPjiZeMzqP/ug1AKVP7kk39x
FxZVT8ZOKk8Xt4iLZx8jmHi2KKheXYZi9LqieoTrJd44qMXDOmR9DCtQX9FZuWJS
EJAlMj6sCowAZdODPZMVpoMc3Gti9nZ2Fpu7mLrRcMk1gKfjKwo=
=qMNk
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm
Pull more drm updates from Dave Airlie:
"I missed a drm-misc-next pull for the main pull last week. It wasn't
that major and isn't the bulk of this at all. This has a bunch of
fixes all over, a lot for amdgpu and i915.
bridge:
- HPD improvments for lt9611uxc
- eDP aux-bus support for ps8640
- LVDS data-mapping selection support
ttm:
- remove huge page functionality (needs reworking)
- fix a race condition during BO eviction
panels:
- add some new panels
fbdev:
- fix double-free
- remove unused scrolling acceleration
- CONFIG_FB dep improvements
locking:
- improve contended locking logging
- naming collision fix
dma-buf:
- add dma_resv_for_each_fence iterator
- fix fence refcounting bug
- name locking fixesA
prime:
- fix object references during mmap
nouveau:
- various code style changes
- refcount fix
- device removal fixes
- protect client list with a mutex
- fix CE0 address calculation
i915:
- DP rates related fixes
- Revert disabling dual eDP that was causing state readout problems
- put the cdclk vtables in const data
- Fix DVO port type for older platforms
- Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
- CCS FBs related fixes
- Fix recursive lock in GuC submission
- Revert guc_id from i915_request tracepoint
- Build fix around dmabuf
amdgpu:
- GPU reset fix
- Aldebaran fix
- Yellow Carp fixes
- DCN2.1 DMCUB fix
- IOMMU regression fix for Picasso
- DSC display fixes
- BPC display calculation fixes
- Other misc display fixes
- Don't allow partial copy from user for DC debugfs
- SRIOV fixes
- GFX9 CSB pin count fix
- Various IP version check fixes
- DP 2.0 fixes
- Limit DCN1 MPO fix to DCN1
amdkfd:
- SVM fixes
- Fix gfx version for renoir
- Reset fixes
udl:
- timeout fix
imx:
- circular locking fix
virtio:
- NULL ptr deref fix"
* tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm: (126 commits)
drm/ttm: Double check mem_type of BO while eviction
drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64)
drm/amdgpu: drop jpeg IP initialization in SRIOV case
drm/amd/display: reject both non-zero src_x and src_y only for DCN1x
drm/amd/display: Add callbacks for DMUB HPD IRQ notifications
drm/amd/display: Don't lock connection_mutex for DMUB HPD
drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends
drm/amdkfd: Fix retry fault drain race conditions
drm/amdkfd: lower the VAs base offset to 8KB
drm/amd/display: fix exit from amdgpu_dm_atomic_check() abruptly
drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
drm/i915/adlp/fb: Prevent the mapping of redundant trailing padding NULL pages
drm/i915/fb: Fix rounding error in subsampled plane size calculation
drm/i915/hdmi: Turn DP++ TMDS output buffers back on in encoder->shutdown()
drm/locking: fix __stack_depot_* name conflict
drm/virtio: Fix NULL dereference error in virtio_gpu_poll
drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()
drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs
drm/amd/amdkfd: Don't sent command to HWS on kfd reset
...
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmF/AjYeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1hkIAJ6sFDbvb4M4LMwf
Slh2NVL9o5sLMBDzVwnVlyMSKDbMn1WBKreGssaLgZjGDc74lxsdSmw5l9MZm0JN
xlq95Q6XFiuu+0qDHPWwfDz3JFO4TqW2ZLLPWk9NnkNbRXqccSrlVRi1RpgE1t3/
NUtS8CQLu6A2BYMc6mkk3aV6IwSNKOkWbM5eBHSvU4j8B6lLbNQop0AfO/wyY1xB
U6LiVE1RpN/b7Yv+75ITtNzuHzVIBx6305FvSnOlKbMKKvIClt96Vd2OeuoEkK+6
wGU8JraB1+fc0GckAhynNrjWQWdvi0MAhFWWEJxjS20OGcV1rXDduNfkVNauO1Zn
+dNyJ3s=
=g9fz
-----END PGP SIGNATURE-----
BackMerge tag 'v5.15' into drm-next
I got a drm-fixes which had some 5.15 stuff in it, so to avoid
the mess just backmerge here.
Linux 5.15
Signed-off-by: Dave Airlie <airlied@redhat.com>
MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a
source page was already locked during migrate_vma_collect(). If it
wasn't then the a second attempt is made to lock the page. However if
the first attempt failed it's unlikely a second attempt will succeed,
and the retry adds complexity. So clean this up by removing the retry
and MIGRATE_PFN_LOCKED flag.
Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag
set, but nothing actually checks that.
Link: https://lkml.kernel.org/r/20211025041608.289017-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gv100_hdmi_ctrl() writes vendor_infoframe.subpack0_high to 0x6f0110, and
then overwrites it with 0. Just drop the overwrite with 0, that's clearly
a mistake.
Because of this issue the HDMI VIC is 0 instead of 1 in the HDMI Vendor
InfoFrame when transmitting 4kp30.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 290ffeafcc ("drm/nouveau/disp/gv100: initial support")
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/3d3bd0f7-c150-2479-9350-35d394ee772d@xs4all.nl
BO might sit in a wrong lru list as there is a small period of memory
moving and lru list updating.
Lets skip eviction if we hit such mismatch.
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110043149.57554-2-xinhui.pan@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
fix for udl, CONFIG_FB dependency improvements, a fix for a circular
locking depency in imx, a NULL pointer dereference fix for virtio, and a
naming collision fix for drm/locking.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYYuA2wAKCRDj7w1vZxhR
xQdZAP4oqiWpCO1JfnZdvEJ/lOULqvdzYkbUZexshGLdbb4ECwEA83TzIbQvXP8p
jsC1hPNAIsOBkQ+nGZwJkTOtDcpEAQ4=
=N2pK
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-fixes-2021-11-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
Removed the TTM Huge Page functionnality to address a crash, a timeout
fix for udl, CONFIG_FB dependency improvements, a fix for a circular
locking depency in imx, a NULL pointer dereference fix for virtio, and a
naming collision fix for drm/locking.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110082114.vfpkpnecwdfg27lk@gilmour
Fixes: 96b8dd4423 ("drm/amdgpu/amdgpu_vcn: convert to IP version checking")
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: b05b9c591f ("drm/amdgpu: clean up set IP function")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Video plane gets rejected for non-zero src_y and src_x on DCN2.x.
[How]
Limit the rejection till DCN1.x and verified MPO, by dragging video
playback beyond display's left (0, 0) co-ordinates.
Fixes: d89f6048bd ("drm/amd/display: Reject non-zero src_y and src_x for video planes")
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
blank screen and other display rates fixes, and more.
Four patches targeting stable in here.
Display Fixes:
- DP rates related fixes (Imre, Jani)
- A Revert on disaling dual eDP that was causing state readout problems (Jani)
- put the cdclk vtables in const data (Jani)
- Fix DVO port type for moder platforms (Ville)
- Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown (Ville)
- CCS FBs related fixes (Imre)
GT fixes:
- Fix recursive lock in GuC submission (Matt Brost)
- Revert guc_id from i915_request tracepoint (Joonas)
- Build fix around dmabuf (Matt Auld)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmGLAXwACgkQ+mJfZA7r
E8rCbwf9EDwnn35GYmATgf0wDiXmZRJHlFqRgLcPs5o9P+4MXDvPAKINCTXJBxr6
+TpA+ZYsTHh58P9FmoGLu1btBasEACjeoiMFL5u4RVUPOQUd0qwBmXHjSHzJ+y6E
eA8hrtYVVfc9/Z3QFLybxyZNzw8s7a9BUZoBCjr0AiUfp0BdqzplIU2LpnsPyw9S
Q7UbMRyiMNd7iOndVCamfDnTbVBFRvv6WEbxveCjLL3ud02fpWywBN+CryQ3ZQs0
5gco7H/gpCrLHysJbBa60DPM6NL0JKxXvRU0aSXP2WY6ONFzc6jULYYtkRLMHjc3
NLuQlx5uBCf863h6k3shIzlNHCvF+Q==
=aRFO
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-fixes-2021-11-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Couple Reverts, build fix, couple virtualization fixes,
blank screen and other display rates fixes, and more.
Four patches targeting stable in here.
Display Fixes:
- DP rates related fixes (Imre, Jani)
- A Revert on disaling dual eDP that was causing state readout problems (Jani)
- put the cdclk vtables in const data (Jani)
- Fix DVO port type for moder platforms (Ville)
- Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown (Ville)
- CCS FBs related fixes (Imre)
GT fixes:
- Fix recursive lock in GuC submission (Matt Brost)
- Revert guc_id from i915_request tracepoint (Joonas)
- Build fix around dmabuf (Matt Auld)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YYsBif3HMi8GjLoU@intel.com
[Why]
We need HPD IRQ notifications (RX, short pulse) to properly handle
DP MST for DPIA connections.
[How]
A null pointer exception currently occurs when these are received
so add a check to validate that we have a handler installed for
the notification.
Extend the HPD handler to also handle HPD IRQ (RX) since the logic is
the same.
Fixes: e27c41d5b0 ("drm/amd/display: Support for DMUB HPD interrupt handling")
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Jude Shih <shenshih@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Per DRM spec we only need to hold that lock when touching
connector->state - which we do not do in that handler.
Taking this locking introduces unnecessary dependencies with other
threads which is bad for performance and opens up the potential for
a deadlock since there are multiple locks being held at once.
[How]
Remove the connection_mutex lock/unlock routine and just iterate over
the drm connectors normally. The iter helpers implicitly lock the
connection list so this is safe to do.
DC link access also does not need to be guarded since the link
table is static at creation - we don't dynamically add or remove links,
just streams.
Fixes: e27c41d5b0 ("drm/amd/display: Support for DMUB HPD interrupt handling")
Reviewed-by: Jude Shih <shenshih@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Trivial patch which adds a comment for macro
endif's in amdgpu_dm.c
Reviewed-by: Ariel Bernstein <Eric.Bernstein@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The check for whether to drain retry faults must be under the mmap write
lock to serialize with munmap notifier callbacks.
We were also missing checks on child ranges. To fix that, simplify the
logic by using a flag rather than checking on each prange. That also
allows draining less freqeuntly when many ranges are unmapped at once.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Alex Sierra <Alex.Sierra@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The low 16MB of virtual address space are currently reserved for kernel
mode allocations mapped into user virtual address space. This causes
conflicts with HMM/SVM mappings at low virtual addresses. We tried to
move those kernel mode allocations to the upper half of the 64-bit
virtual address space for GFX9, which is naturally reserved for kernel
use. However, TBA (trap handler code) has problems to access addresses
in the high virtual space. We have decided to set this to 8KB of the
lower address space as a temporary fix, while investigate TBA address
problem. It is very unlikely for user space to map memory at this low
region.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
make action upon failure in "drm_atomic_add_affected_connectors()"
consistent with the rest of failures in amdgpu_dm_atomic_check().
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The KFD pre_reset should be called before reset been executed, it will
hold the lock to prevent other rocm process to sent the packlage to hiq
during host execute the real reset on the HW
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There was a change(below) target for such issue:
d82e2c249c ("drm/amdgpu: Fix crash on device remove/driver unload")
But the fix for VI ASICs was missing there. This is a supplement for
that.
Fixes: d82e2c249c ("drm/amdgpu: Fix crash on device remove/driver unload")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Merge more updates from Andrew Morton:
"87 patches.
Subsystems affected by this patch series: mm (pagecache and hugetlb),
procfs, misc, MAINTAINERS, lib, checkpatch, binfmt, kallsyms, ramfs,
init, codafs, nilfs2, hfs, crash_dump, signals, seq_file, fork,
sysvfs, kcov, gdb, resource, selftests, and ipc"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (87 commits)
ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL
ipc: check checkpoint_restore_ns_capable() to modify C/R proc files
selftests/kselftest/runner/run_one(): allow running non-executable files
virtio-mem: disallow mapping virtio-mem memory via /dev/mem
kernel/resource: disallow access to exclusive system RAM regions
kernel/resource: clean up and optimize iomem_is_exclusive()
scripts/gdb: handle split debug for vmlinux
kcov: replace local_irq_save() with a local_lock_t
kcov: avoid enable+disable interrupts if !in_task()
kcov: allocate per-CPU memory on the relevant node
Documentation/kcov: define `ip' in the example
Documentation/kcov: include types.h in the example
sysv: use BUILD_BUG_ON instead of runtime check
kernel/fork.c: unshare(): use swap() to make code cleaner
seq_file: fix passing wrong private data
seq_file: move seq_escape() to a header
signal: remove duplicate include in signal.h
crash_dump: remove duplicate include in crash_dump.h
crash_dump: fix boolreturn.cocci warning
hfs/hfsplus: use WARN_ON for sanity check
...
To print stack entries into a buffer, users of stackdepot, first get a
list of stack entries using stack_depot_fetch and then print this list
into a buffer using stack_trace_snprint. Provide a helper in stackdepot
for this purpose. Also change above mentioned users to use this helper.
[imran.f.khan@oracle.com: fix build error]
Link: https://lkml.kernel.org/r/20210915175321.3472770-4-imran.f.khan@oracle.com
[imran.f.khan@oracle.com: export stack_depot_snprint() to modules]
Link: https://lkml.kernel.org/r/20210916133535.3592491-4-imran.f.khan@oracle.com
Link: https://lkml.kernel.org/r/20210915014806.3206938-4-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Jani Nikula <jani.nikula@intel.com> [i915]
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
So far the remapped view size in GTT/DPT was padded to the next aligned
offset unnecessarily after the last color plane with an unaligned size.
Remove the unnecessary padding.
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Fixes: 3d1adc3d64 ("drm/i915/adlp: Add support for remapping CCS FBs")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-3-imre.deak@intel.com
(cherry picked from commit 6b6636e176)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
For NV12 FBs with odd main surface tile-row height the CCS surface
height was incorrectly calculated 1 less than the actual value. Fix this
by rounding up the result of divison. For consistency do the same for
the CCS surface width calculation.
Fixes: b3e57bccd6 ("drm/i915/tgl: Gen-12 render decompression")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-2-imre.deak@intel.com
(cherry picked from commit 2ee5ef9c93)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Looks like our VBIOS/GOP generally fail to turn the DP dual mode adater
TMDS output buffers back on after a reboot. This leads to a black screen
after reboot if we turned the TMDS output buffers off prior to reboot.
And if i915 decides to do a fastboot the black screen will persist even
after i915 takes over.
Apparently this has been a problem ever since commit b2ccb822d3 ("drm/i915:
Enable/disable TMDS output buffers in DP++ adaptor as needed") if one
rebooted while the display was turned off. And things became worse with
commit fe0f1e3bfd ("drm/i915: Shut down displays gracefully on reboot")
since now we always turn the display off before a reboot.
This was reported on a RKL, but I confirmed the same behaviour on my
SNB as well. So looks pretty universal.
Let's fix this by explicitly turning the TMDS output buffers back on
in the encoder->shutdown() hook. Note that this gets called after irqs
have been disabled, so the i2c communication with the DP dual mode
adapter has to be performed via polling (which the gmbus code is
perfectly happy to do for us).
We also need a bit of care in handling DDI encoders which may or may
not be set up for HDMI output. Specifically ddc_pin will not be
populated for a DP only DDI encoder, in which case we don't want to
call intel_gmbus_get_adapter(). We can handle that by simply doing
the dual mode adapter type check before calling
intel_gmbus_get_adapter().
Cc: <stable@vger.kernel.org> # v5.11+
Fixes: fe0f1e3bfd ("drm/i915: Shut down displays gracefully on reboot")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4371
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211029191802.18448-2-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
(cherry picked from commit 49c55f7b03)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
When virgl is not enabled, vfpriv pointer would not be allocated.
Therefore, check for a valid value before dereferencing.
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Cc: Gurchetan Singh <gurchetansingh@chromium.org>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20211104214249.1802789-1-vivek.kasireddy@intel.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Properly handle SI DC support when CONFIG_DRM_AMD_DC_SI is not
set.
Fixes: f7f12b2582 ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be
freed when the amdgpu_bo still lives on. Free the kfd_bo struct in the
release_notify callback then the amdgpu_bo is freed.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-By: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When kfd need to be reset, sent command to HWS might cause hang and get unnecessary timeout.
This change try not to touch HW in pre_reset and keep queues to be in the evicted state
when the reset is done, so they are not put back on the runlist. These queues will be destroied
on process termination.
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As part of the ib padding process, accessing the RLC_SPM_* register may
trigger gfx hang. Since gfxoff may be already kicked during the whole period.
To address that, we manually toggle gfx on/off around the RLC_SPM_*
register access.
This can resolve the gfx hang issue observed on running Talos with RDP launched
in parallel.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The error count reset for xgmi3x16 pcs is missed.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Previously there was a check based on chip # for chips that aligned to
>=CHIP_NAVI10 to have RLC stopped as part of DPMS check. This was because
of gfxclk being controlled by RLC in the newer designs.
As part of IP version checking though, this got changed to match IP
version for SMU. Because Renoir designs also include smu11 that meant
that even GFX9 started to stop RLC earlier.
Adjust to match GFX IP version instead of SMU IP version to restore the
previous behavior.
Fixes: a8967967f6 ("drm/amdgpu/amdgpu_smu: convert to IP version checking")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
csb bo is not unpinned in gfx 9. It will lead to pin_count leak on
driver unload.
[How]
Call bo_free_kernel corresponding to bo_create_kernel in
gfx_rlc_init_csb. This will also unify the code path with other gfx
versions.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
For Vega10, disabling gart of gfxhub could mess up KIQ and PSP
under sriov mode, and lead to DMAR on host side.
[How]
Skip writing GMC registers under sriov.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sysfs_emit and sysfs_emit_at requrie a page boundary
aligned buf address. Make them happy!
v2: fix sysfs_emit -> sysfs_emit_at missed conversions
Cc: Lang Yu <lang.yu@amd.com>
Cc: Darren Powell <darren.powell@amd.com>
Fixes: 6db0c87a0a ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774
Reviewed-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
BOs need to be reserved before they are added or removed, so ensure that
they are reserved during kfd_mem_attach and kfd_mem_detach
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]:
When we call hmm_range_fault to map memory after a migration, we don't
expect memory to be migrated again as a result of hmm_range_fault. The
driver ensures that all memory is in GPU-accessible locations so that
no migration should be needed. However, there is one corner case where
hmm_range_fault can unexpectedly cause a migration from DEVICE_PRIVATE
back to system memory due to a write-fault when a system memory page in
the same range was mapped read-only (e.g. COW). Ranges with individual
pages in different locations are usually the result of failed page
migrations (e.g. page lock contention). The unexpected migration back
to system memory causes a deadlock from recursive locking in our
driver.
[How]:
Creating a task reference new member under svm_range_list struct.
Setting this with "current" reference, right before the hmm_range_fault
is called. This member is checked against "current" reference at
svm_migrate_to_ram callback function. If equal, the migration will be
ignored.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is no reason to allow for partial buffers from userspace in our
debugfs. In this particular case callers will zero out the wr_buf but if
callers in the future don't do that we might be looking at corrupt data.
Linus puts it better than I can in
https://lkml.org/lkml/2021/10/26/993
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit f4b34faa08.
Since commit f4b34faa08 ("drm/imx: Annotate dma-fence critical section in
commit path") the following possible circular dependency is detected:
[ 5.001811] ======================================================
[ 5.001817] WARNING: possible circular locking dependency detected
[ 5.001824] 5.14.9-01225-g45da36cc6fcc-dirty #1 Tainted: G W
[ 5.001833] ------------------------------------------------------
[ 5.001838] kworker/u8:0/7 is trying to acquire lock:
[ 5.001848] c1752080 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x40/0x294
[ 5.001903]
[ 5.001903] but task is already holding lock:
[ 5.001909] c176df78 (dma_fence_map){++++}-{0:0}, at: imx_drm_atomic_commit_tail+0x10/0x160
[ 5.001957]
[ 5.001957] which lock already depends on the new lock.
...
Revert it for now.
Tested on a imx6q-sabresd.
Fixes: f4b34faa08 ("drm/imx: Annotate dma-fence critical section in commit path")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104001112.4035691-1-festevam@gmail.com
My previous patch correctly addressed the possible link failure, but as
Jani points out, the dependency is now stricter than it needs to be.
Change it again, to allow DRM_FBDEV_EMULATION to be used when
DRM_KMS_HELPER and FB are both loadable modules and DRM is linked into
the kernel.
As a side-effect, the option is now only visible when at least one DRM
driver makes use of DRM_KMS_HELPER. This is better, because the option
has no effect otherwise.
Fixes: 606b102876 ("drm: fb_helper: fix CONFIG_FB dependency")
Suggested-by: Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211029120307.1407047-1-arnd@kernel.org
The huge page functionality in TTM does not work safely because PUD and
PMD entries do not have a special bit.
get_user_pages_fast() considers any page that passed pmd_huge() as
usable:
if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
pmd_devmap(pmd))) {
And vmf_insert_pfn_pmd_prot() unconditionally sets
entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE.
As such gup_huge_pmd() will try to deref a struct page:
head = try_grab_compound_head(pmd_page(orig), refs, flags);
and thus crash.
Thomas further notices that the drivers are not expecting the struct page
to be used by anything - in particular the refcount incr above will cause
them to malfunction.
Thus everything about this is not able to fully work correctly considering
GUP_fast. Delete it entirely. It can return someday along with a proper
PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast.
Fixes: 314b6580ad ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
[danvet: Update subject per Thomas' &Christian's review]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com