In particular, we need to ensure all the necessary blocks are switched
to 64b mode (a5xx+) otherwise the high bits of the address of the BO to
snapshot state into will be ignored, resulting in:
*** gpu fault: ttbr0=0000000000000000 iova=0000000000012000 dir=READ type=TRANSLATION source=CP (0,0,0,0)
platform 506a000.gmu: [drm:a6xx_gmu_set_oob] *ERROR* Timeout waiting for GMU OOB set BOOT_SLUMBER: 0x0
Fixes: 4f776f4511 ("drm/msm/gpu: Convert the GPU show function to use the GPU state")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211108180122.487859-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
If you happened to try to access `/dev/drm_dp_aux` devices provided by
the MSM DP AUX driver too early at bootup you could go boom. Let's
avoid that by only allowing AUX transfers when the controller is
powered up.
Specifically the crash that was seen (on Chrome OS 5.4 tree with
relevant backports):
Kernel panic - not syncing: Asynchronous SError Interrupt
CPU: 0 PID: 3131 Comm: fwupd Not tainted 5.4.144-16620-g28af11b73efb #1
Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
Call trace:
dump_backtrace+0x0/0x14c
show_stack+0x20/0x2c
dump_stack+0xac/0x124
panic+0x150/0x390
nmi_panic+0x80/0x94
arm64_serror_panic+0x78/0x84
do_serror+0x0/0x118
do_serror+0xa4/0x118
el1_error+0xbc/0x160
dp_catalog_aux_write_data+0x1c/0x3c
dp_aux_cmd_fifo_tx+0xf0/0x1b0
dp_aux_transfer+0x1b0/0x2bc
drm_dp_dpcd_access+0x8c/0x11c
drm_dp_dpcd_read+0x64/0x10c
auxdev_read_iter+0xd4/0x1c4
I did a little bit of tracing and found that:
* We register the AUX device very early at bootup.
* Power isn't actually turned on for my system until
hpd_event_thread() -> dp_display_host_init() -> dp_power_init()
* You can see that dp_power_init() calls dp_aux_init() which is where
we start allowing AUX channel requests to go through.
In general this patch is a bit of a bandaid but at least it gets us
out of the current state where userspace acting at the wrong time can
fully crash the system.
* I think the more proper fix (which requires quite a bit more
changes) is to power stuff on while an AUX transfer is
happening. This is like the solution we did for ti-sn65dsi86. This
might be required for us to move to populating the panel via the
DP-AUX bus.
* Another fix considered was to dynamically register / unregister. I
tried that at <https://crrev.com/c/3169431/3> but it got
ugly. Currently there's a bug where the pm_runtime() state isn't
tracked properly and that causes us to just keep registering more
and more.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://lore.kernel.org/r/20211109100403.1.I4e23470d681f7efe37e2e7f1a6466e15e9bb1d72@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
In commit 510410bfc0 ("drm/msm: Implement mmap as GEM object
function") we switched to a new/cleaner method of doing things. That's
good, but we missed a little bit.
Before that commit, we used to _first_ run through the
drm_gem_mmap_obj() case where `obj->funcs->mmap()` was NULL. That meant
that we ran:
vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
...and _then_ we modified those mappings with our own. Now that
`obj->funcs->mmap()` is no longer NULL we don't run the default
code. It looks like the fact that the vm_flags got VM_IO / VM_DONTDUMP
was important because we're now getting crashes on Chromebooks that
use ARC++ while logging out. Specifically a crash that looks like this
(this is on a 5.10 kernel w/ relevant backports but also seen on a
5.15 kernel):
Unable to handle kernel paging request at virtual address ffffffc008000000
Mem abort info:
ESR = 0x96000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=000000008293d000
[ffffffc008000000] pgd=00000001002b3003, p4d=00000001002b3003,
pud=00000001002b3003, pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
[...]
CPU: 7 PID: 15734 Comm: crash_dump64 Tainted: G W 5.10.67 #1 [...]
Hardware name: Qualcomm Technologies, Inc. sc7280 IDP SKU2 platform (DT)
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
pc : __arch_copy_to_user+0xc0/0x30c
lr : copyout+0xac/0x14c
[...]
Call trace:
__arch_copy_to_user+0xc0/0x30c
copy_page_to_iter+0x1a0/0x294
process_vm_rw_core+0x240/0x408
process_vm_rw+0x110/0x16c
__arm64_sys_process_vm_readv+0x30/0x3c
el0_svc_common+0xf8/0x250
do_el0_svc+0x30/0x80
el0_svc+0x10/0x1c
el0_sync_handler+0x78/0x108
el0_sync+0x184/0x1c0
Code: f8408423 f80008c3 910020c6 36100082 (b8404423)
Let's add the two flags back in.
While we're at it, the fact that we aren't running the default means
that we _don't_ need to clear out VM_PFNMAP, so remove that and save
an instruction.
NOTE: it was confirmed that VM_IO was the important flag to fix the
problem I was seeing, but adding back VM_DONTDUMP seems like a sane
thing to do so I'm doing that too.
Fixes: 510410bfc0 ("drm/msm: Implement mmap as GEM object function")
Reported-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211110113334.1.I1687e716adb2df746da58b508db3f25423c40b27@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
In commit 142639a52a ("drm/msm/a6xx: fix crashstate capture for
A650") we changed a6xx_get_gmu_registers() to read 3 sets of
registers. Unfortunately, we didn't change the memory allocation for
the array. That leads to a KASAN warning (this was on the chromeos-5.4
kernel, which has the problematic commit backported to it):
BUG: KASAN: slab-out-of-bounds in _a6xx_get_gmu_registers+0x144/0x430
Write of size 8 at addr ffffff80c89432b0 by task A618-worker/209
CPU: 5 PID: 209 Comm: A618-worker Tainted: G W 5.4.156-lockdep #22
Hardware name: Google Lazor Limozeen without Touchscreen (rev5 - rev8) (DT)
Call trace:
dump_backtrace+0x0/0x248
show_stack+0x20/0x2c
dump_stack+0x128/0x1ec
print_address_description+0x88/0x4a0
__kasan_report+0xfc/0x120
kasan_report+0x10/0x18
__asan_report_store8_noabort+0x1c/0x24
_a6xx_get_gmu_registers+0x144/0x430
a6xx_gpu_state_get+0x330/0x25d4
msm_gpu_crashstate_capture+0xa0/0x84c
recover_worker+0x328/0x838
kthread_worker_fn+0x32c/0x574
kthread+0x2dc/0x39c
ret_from_fork+0x10/0x18
Allocated by task 209:
__kasan_kmalloc+0xfc/0x1c4
kasan_kmalloc+0xc/0x14
kmem_cache_alloc_trace+0x1f0/0x2a0
a6xx_gpu_state_get+0x164/0x25d4
msm_gpu_crashstate_capture+0xa0/0x84c
recover_worker+0x328/0x838
kthread_worker_fn+0x32c/0x574
kthread+0x2dc/0x39c
ret_from_fork+0x10/0x18
Fixes: 142639a52a ("drm/msm/a6xx: fix crashstate capture for A650")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20211103153049.1.Idfa574ccb529d17b69db3a1852e49b580132035c@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
Some randconfig builds fail when drm/drm_bridge.h is not included
implicitly in this file:
drivers/gpu/drm/msm/dp/dp_parser.c:279:25: error: implicit declaration of function 'devm_drm_panel_bridge_add' [-Werror,-Wimplicit-function-declaration]
parser->panel_bridge = devm_drm_panel_bridge_add(dev, panel);
Fixes: 4b296d15b3 ("drm/msm/dp: Allow attaching a drm_panel")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211026083254.3396322-1-arnd@kernel.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Clang warns:
drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c:162:6: error: variable 'commit' is uninitialized when used here [-Werror,-Wuninitialized]
if (commit)
^~~~~~
drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c:106:32: note: initialize the variable 'commit' to silence this warning
struct drm_crtc_commit *commit;
^
= NULL
1 error generated.
The assignment and use of commit in the main body of
dpu_crtc_set_crc_source() were removed from v1 to v2 but the call to
drm_crtc_commit_put() at the end was not. Do that now so there is no
more warning.
Fixes: 78d9b458cc ("drm/msm/dpu: Add CRC support for DPU")
Link: https://github.com/ClangBuiltLinux/linux/issues/1493
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211026142435.3606413-1-nathan@kernel.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
We know the upper bound on # of mixers (ie. two), so lets just allocate
this on the stack.
Fixes:
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:201
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/0
INFO: lockdep is turned off.
irq event stamp: 43642
hardirqs last enabled at (43641): [<ffffffe24dd276bc>] cpuidle_enter_state+0x158/0x25c
hardirqs last disabled at (43642): [<ffffffe24dfff450>] enter_el1_irq_or_nmi+0x10/0x1c
softirqs last enabled at (43620): [<ffffffe24d4103fc>] __do_softirq+0x1e4/0x464
softirqs last disabled at (43615): [<ffffffe24d48bd90>] __irq_exit_rcu+0x104/0x150
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.15.0-rc3-debug+ #105
Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
Call trace:
dump_backtrace+0x0/0x18c
show_stack+0x24/0x30
dump_stack_lvl+0xa0/0xd4
dump_stack+0x18/0x34
___might_sleep+0x1e0/0x1f0
__might_sleep+0x78/0x8c
slab_pre_alloc_hook.constprop.0+0x48/0x6c
__kmalloc+0xc8/0x21c
dpu_crtc_vblank_callback+0x158/0x1f8
dpu_encoder_vblank_callback+0x70/0xc4
dpu_encoder_phys_vid_vblank_irq+0x50/0x12c
dpu_core_irq+0x1bc/0x1d0
dpu_irq+0x1c/0x28
msm_irq+0x34/0x40
__handle_irq_event_percpu+0x15c/0x308
handle_irq_event_percpu+0x3c/0x90
handle_irq_event+0x54/0x98
handle_level_irq+0xa0/0xd0
handle_irq_desc+0x2c/0x44
generic_handle_domain_irq+0x28/0x34
dpu_mdss_irq+0x90/0xe8
handle_irq_desc+0x2c/0x44
handle_domain_irq+0x54/0x80
gic_handle_irq+0xd4/0x148
call_on_irq_stack+0x2c/0x54
do_interrupt_handler+0x4c/0x64
el1_interrupt+0x30/0xd0
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x78/0x7c
arch_local_irq_enable+0xc/0x14
cpuidle_enter+0x44/0x5c
do_idle+0x248/0x268
cpu_startup_entry+0x30/0x48
rest_init+0x188/0x19c
arch_call_rest_init+0x1c/0x28
start_kernel+0x704/0x744
__primary_switched+0xc0/0xc8
Fixes: 78d9b458cc ("drm/msm/dpu: Add CRC support for DPU")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211023160016.3322052-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Add CRC support to DPU, which is currently not supported by
this driver. Only supports CRC for CRTC for now, but will extend support
to other blocks later on.
Changes in v2:
- Added kfree() calls for return paths in dpu_crtc_get_crc()
- Propogated error code for dpu_crtc_get_crc()
- Renamed skip_count
- Removed dpu_crtc_is_valid_crc_source()
- Removed wait for commit in dpu_crtc_set_crc_source()
- Moved crc_source from struct dpu_crtc to struct dpu_crtc_state
- Moved CRC register constants from dpu_hw_util.h to dpu_hw_lm.c
Validated with IGT kms_pipe_crc_basic, and kms_cursor_crc
Test: kms_pipe_crc_basic
Subtests Passed:
- bad-source
- read-crc-pipe-A
- read-crc-pipe-A-frame-sequence
- nonblocking-crc-pipe-A
- nonblocking-crc-pipe-A-frame-sequence
- disable-crc-after-crtc-pipe-A[1]
- compare-crc-sanitycheck-pipe-A[1]
Rest skipped
Test: kms_cursor_crc
Subtests Passed:
- pipe-A-cursor-size-change
- pipe-A-cursor-alpha-opaque
- pipe-A-cursor-alpha-transparent
Subtests Failed:
- pipe-A-cursor-dpms
- pipe-A-cursor-*-onscreen
- pipe-A-cursor-*-offscreen
Rest skipped
Tested on Qualcomm RB3 (debian, sdm845), Qualcomm RB5 (debian, qrb5165)
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jessica Zhang <jesszhan@codeaurora.org>
[1] Skipped on RB5 due to issue related to DPMS. Planning to upload a
fix for this in the future.
Link: https://lore.kernel.org/r/20211019224822.25940-1-jesszhan@codeaurora.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Based on the removal of the g_dp_display and the movement of the
priv->dp lookup into the DP code it's now possible to have multiple
DP instances.
In line with the other controllers in the MSM driver, introduce a
per-compatible list of base addresses which is used to resolve the
"instance id" for the given DP controller. This instance id is used as
index in the priv->dp[] array.
Then extend the initialization code to initialize struct drm_encoder for
each of the registered priv->dp[] and update the logic for associating
each struct msm_dp with the struct dpu_encoder_virt.
A new enum is introduced to document the connection between the
instances referenced in the dpu_intf_cfg array and the controllers in
the DP driver and sc7180 is updated.
Lastly, bump the number of struct msm_dp instances carries by priv->dp
to 3, the currently known maximum number of controllers found in a
Qualcomm SoC.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-6-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
eDP panels might need some power sequencing and backlight management,
so make it possible to associate a drm_panel with an eDP instance and
prepare and enable the panel accordingly.
Now that we know which hardware instance is DP and which is eDP,
parser->parse() is passed the connector_type and the parser is limited
to only search for a panel in the eDP case.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-5-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
As the following patches introduced support for multiple DP blocks in a
platform and some of those block might be eDP it becomes useful to be
able to specify the connector type per block.
Although there's only a single block at this point, the array of descs
and the search in dp_display_get_desc() are introduced here to simplify
the next patch, that does introduce support for multiple DP blocks.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-4-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Functions in the DisplayPort code that relates to individual instances
(encoders) are passed both the struct msm_dp and the struct drm_encoder.
But in a situation where multiple DP instances would exist this means
that the caller need to resolve which struct msm_dp relates to the
struct drm_encoder at hand.
Store a reference to the struct msm_dp associated with each
dpu_encoder_virt to allow the particular instance to be associate with
the encoder in the following patch.
Reviewed-by: Abhinav Kumar <abhinavk@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-3-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
The msm_gem_new_impl() function cleans up after itself so there is no
need to call drm_gem_object_put(). Conceptually, it does not make sense
to call a kref_put() function until after the reference counting has
been initialized which happens immediately after this call in the
drm_gem_(private_)object_init() functions.
In the msm_gem_import() function the "obj" pointer is uninitialized, so
it will lead to a crash.
Fixes: 05b849111c ("drm/msm: prime support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211013081315.GG6010@kili
Signed-off-by: Rob Clark <robdclark@chromium.org>
The debugfs code is provided an array of a single drm_connector. Then to
access the connector, the list of all connectors of the DRM device is
traversed and all non-DisplayPort connectors are skipped, to find the
one and only DisplayPort connector.
But as we move to support multiple DisplayPort controllers this will now
find multiple connectors and has no way to distinguish them.
Pass the single connector to dp_debug_get() and use this in the debugfs
functions instead, both to simplify the code and the support the
multiple instances.
Reviewed-by: Abhinav Kumar <abhinavk@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211015232213.1839472-1-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>