linux/drivers/gpu/drm
Lyude Paul ebcc0e6b50 drm/dp_mst: Introduce new refcounting scheme for mstbs and ports
The current way of handling refcounting in the DP MST helpers is really
confusing and probably just plain wrong because it's been hacked up many
times over the years without anyone actually going over the code and
seeing if things could be simplified.

To the best of my understanding, the current scheme works like this:
drm_dp_mst_port and drm_dp_mst_branch both have a single refcount. When
this refcount hits 0 for either of the two, they're removed from the
topology state, but not immediately freed. Both ports and branch devices
will reinitialize their kref once it's hit 0 before actually destroying
themselves. The intended purpose behind this is so that we can avoid
problems like not being able to free a remote payload that might still
be active, due to us having removed all of the port/branch device
structures in memory, as per:

commit 91a25e4631 ("drm/dp/mst: deallocate payload on port destruction")

Which may have worked, but then it caused use-after-free errors. Being
new to MST at the time, I tried fixing it;

commit 263efde31f ("drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()")

But, that was broken: both drm_dp_mst_port and drm_dp_mst_branch structs
are validated in almost every DP MST helper function. Simply put, this
means we go through the topology and try to see if the given
drm_dp_mst_branch or drm_dp_mst_port is still attached to something
before trying to use it in order to avoid dereferencing freed memory
(something that has happened a LOT in the past with this library).
Because of this it doesn't actually matter whether or not we keep keep
the ports and branches around in memory as that's not enough, because
any function that validates the branches and ports passed to it will
still reject them anyway since they're no longer in the topology
structure. So, use-after-free errors were fixed but payload deallocation
was completely broken.

Two years later, AMD informed me about this issue and I attempted to
come up with a temporary fix, pending a long-overdue cleanup of this
library:

commit c54c7374ff ("drm/dp_mst: Skip validating ports during destruction, just ref")

But then that introduced use-after-free errors, so I quickly reverted
it:

commit 9765635b30 ("Revert "drm/dp_mst: Skip validating ports during destruction, just ref"")

And in the process, learned that there is just no simple fix for this:
the design is just broken. Unfortunately, the usage of these helpers are
quite broken as well. Some drivers like i915 have been smart enough to
avoid accessing any kind of information from MST port structures, but
others like nouveau have assumed, understandably so, that
drm_dp_mst_port structures are normal and can just be accessed at any
time without worrying about use-after-free errors.

After a lot of discussion, me and Daniel Vetter came up with a better
idea to replace all of this.

To summarize, since this is documented far more indepth in the
documentation this patch introduces, we make it so that drm_dp_mst_port
and drm_dp_mst_branch structures have two different classes of
refcounts: topology_kref, and malloc_kref. topology_kref corresponds to
the lifetime of the given drm_dp_mst_port or drm_dp_mst_branch in it's
given topology. Once it hits zero, any associated connectors are removed
and the branch or port can no longer be validated. malloc_kref
corresponds to the lifetime of the memory allocation for the actual
structure, and will always be non-zero so long as the topology_kref is
non-zero. This gives us a way to allow callers to hold onto port and
branch device structures past their topology lifetime, and dramatically
simplifies the lifetimes of both structures. This also finally fixes the
port deallocation problem, properly.

Additionally: since this now means that we can keep ports and branch
devices allocated in memory for however long we need, we no longer need
a significant amount of the port validation that we currently do.

Additionally, there is one last scenario that this fixes, which couldn't
have been fixed properly beforehand:

- CPU1 unrefs port from topology (refcount 1->0)
- CPU2 refs port in topology(refcount 0->1)

Since we now can guarantee memory safety for ports and branches
as-needed, we also can make our main reference counting functions fix
this problem by using kref_get_unless_zero() internally so that topology
refcounts can only ever reach 0 once.

Changes since v4:
* Change the kernel-figure summary for dp-mst/topology-figure-1.dot a
  bit - danvet
* Remove figure numbers - danvet

Changes since v3:
* Remove rebase detritus - danvet
* Split out purely style changes into separate patches - hwentlan

Changes since v2:
* Fix commit message - checkpatch
* s/)-1/) - 1/g - checkpatch

Changes since v1:
* Remove forward declarations - danvet
* Move "Branch device and port refcounting" section from documentation
  into kernel-doc comments - danvet
* Export internal topology lifetime functions into their own section in
  the kernel-docs - danvet
* s/@/&/g for struct references in kernel-docs - danvet
* Drop the "when they are no longer being used" bits from the kernel
  docs - danvet
* Modify diagrams to show how the DRM driver interacts with the topology
  and payloads - danvet
* Make suggested documentation changes for
  drm_dp_mst_topology_get_mstb() and drm_dp_mst_topology_get_port() -
  danvet
* Better explain the relationship between malloc refs and topology krefs
  in the documentation for drm_dp_mst_topology_get_port() and
  drm_dp_mst_topology_get_mstb() - danvet
* Fix "See also" in drm_dp_mst_topology_get_mstb() - danvet
* Rename drm_dp_mst_topology_get_(port|mstb)() ->
  drm_dp_mst_topology_try_get_(port|mstb)() and
  drm_dp_mst_topology_ref_(port|mstb)() ->
  drm_dp_mst_topology_get_(port|mstb)() - danvet
* s/should/must in docs - danvet
* WARN_ON(refcount == 0) in topology_get_(mstb|port) - danvet
* Move kdocs for mstb/port structs inline - danvet
* Split drm_dp_get_last_connected_port_and_mstb() changes into their own
  commit - danvet

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190111005343.17443-7-lyude@redhat.com
2019-01-10 20:12:19 -05:00
..
amd drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
arc drm/arc: do not rely on drmP.h from drm_gem_cma_helper.h 2019-01-09 22:48:51 +01:00
arm drm: mali-dp: Enable Mali-DP tiled buffer formats 2018-11-02 09:57:27 +00:00
armada
ast drm/ast: Remove set but not used variable 'bo' 2018-12-10 11:26:41 +01:00
atmel-hlcdc drm/atmel-hlcdc: Use drm_fbdev_generic_setup() 2018-11-01 15:24:22 +01:00
bochs drm/bochs: add edid present check 2018-12-20 13:25:28 +01:00
bridge drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
cirrus drm-misc-next for v4.21, part 2: 2018-11-22 12:54:38 +10:00
etnaviv drm/etnaviv: fix for 64bit seqno change 2018-12-11 17:31:52 +01:00
exynos drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
fsl-dcu drm/fsl-dcu: Use drm_fbdev_generic_setup() 2018-11-01 15:23:58 +01:00
gma500
hisilicon drm/ttm: initialize globals during device init (v2) 2018-11-05 14:21:21 -05:00
i2c drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
i810
i915 drm/edid: Add display_info.rgb_quant_range_selectable 2019-01-10 19:01:06 +02:00
imx drm/imx: fix build failure without CONFIG_DRM_FBDEV_EMULATION 2018-10-05 12:09:20 +02:00
lib
mediatek drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
meson drm: meson: Cleanup on drm_display_mode print str 2019-01-09 22:07:15 +01:00
mga
mgag200 drm/ttm: initialize globals during device init (v2) 2018-11-05 14:21:21 -05:00
msm drm: msm: Cleanup drm_display_mode print str 2019-01-10 21:05:39 +01:00
mxsfb drm: replace "drm_dev_unref" function with "drm_dev_put" 2018-11-24 22:12:54 +01:00
nouveau drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
omapdrm drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
panel drm/panel: simple: Add AUO G101EVN010 panel support 2018-12-03 17:04:48 +01:00
pl111 drm/pl111: add of_node_put() 2018-11-28 09:31:07 -08:00
qxl qxl: Use struct_size() in kzalloc() 2019-01-09 09:38:49 +01:00
r128 drm/ati_pcigart: Fix error code in drm_ati_pcigart_init() 2018-12-17 10:47:17 +01:00
radeon drm/edid: Add display_info.rgb_quant_range_selectable 2019-01-10 19:01:06 +02:00
rcar-du drm: remove include of drmP.h from bridge/dw_hdmi.h 2019-01-09 22:27:44 +01:00
rockchip drm/rockchip: Add reflection properties 2019-01-11 00:44:10 +01:00
savage
scheduler drm/scheduler: Add drm_sched_job_cleanup 2018-11-05 14:21:27 -05:00
selftests drm/selftests: Fix build warning -Wframe-larger-than 2018-11-02 14:25:32 +01:00
shmobile drm: replace "drm_dev_unref" function with "drm_dev_put" 2018-11-24 22:12:54 +01:00
sis
sti drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
stm drm/stm: Use drm_fbdev_generic_setup() 2018-10-25 17:00:28 +02:00
sun4i drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
tdfx
tegra drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
tilcdc drm/tilcdc: Use drm_fbdev_generic_setup() 2018-11-01 15:25:41 +01:00
tinydrm drm/tinydrm: do not reply on drmP.h from drm_gem_cma_helper.h 2019-01-09 22:48:56 +01:00
ttm Merge branch 'drm-next-4.21' of git://people.freedesktop.org/~agd5f/linux into drm-next 2018-11-19 11:07:52 +10:00
tve200 drm: replace "drm_dev_unref" function with "drm_dev_put" 2018-11-24 22:12:54 +01:00
udl DRM: UDL: get rid of useless vblank initialization 2018-10-23 15:59:01 +02:00
v3d drm/v3d: Invalidate the caches from the outside in. 2018-12-07 10:56:51 -08:00
vc4 drm/edid: Add display_info.rgb_quant_range_selectable 2019-01-10 19:01:06 +02:00
vgem dma-buf: make fence sequence numbers 64 bit v2 2018-12-07 12:44:16 +01:00
via
virtio drm/virtio: Drop deprecated load/unload initialization 2019-01-09 09:38:49 +01:00
vkms drm/vkms: set preferred depth to 24 2018-12-17 10:51:20 +01:00
vmwgfx Merge branch 'drm-next-4.21' of git://people.freedesktop.org/~agd5f/linux into drm-next 2018-11-19 11:07:52 +10:00
xen drm/xen: Don't set the dpms hook 2018-12-12 11:56:45 +01:00
zte drm/edid: Pass connector to AVI infoframe functions 2019-01-10 19:01:06 +02:00
ati_pcigart.c drm/ati_pcigart: Fix error code in drm_ati_pcigart_init() 2018-12-17 10:47:17 +01:00
drm_agpsupport.c
drm_atomic_helper.c drm: Fix up drm_atomic_state_helper.[hc] extraction 2018-11-30 16:37:52 +01:00
drm_atomic_state_helper.c drm: Fix up drm_atomic_state_helper.[hc] extraction 2018-11-30 16:37:52 +01:00
drm_atomic_uapi.c drm: Add connector property to limit max bpc 2018-11-02 09:15:58 -07:00
drm_atomic.c drm/atomic: integrate modeset lock with private objects 2018-12-11 15:24:30 +01:00
drm_auth.c
drm_blend.c drm: Clarify DRM_MODE_REFLECT_X/Y documentation 2018-09-11 11:21:30 +01:00
drm_bridge.c drm: bridge: document bridge attach/detach imbalance 2018-09-13 11:28:12 +02:00
drm_bufs.c drm: un-inline drm_legacy_findmap() 2019-01-02 11:37:11 +02:00
drm_cache.c
drm_client.c drm/gem: Add drm_gem_object_funcs 2018-11-20 14:56:18 +01:00
drm_color_mgmt.c drm: Add DRM_MODESET_LOCK_BEGIN/END helpers 2018-11-29 10:48:31 -05:00
drm_connector.c drm/connector: Allow creation of margin props alone 2018-12-19 14:47:58 +01:00
drm_context.c drm: Fix error handling in drm_legacy_addctx 2019-01-07 11:26:31 +01:00
drm_crtc_helper_internal.h
drm_crtc_helper.c drm/crtc-helpers: WARN when used with atomic drivers 2019-01-10 13:00:21 +01:00
drm_crtc_internal.h
drm_crtc.c drm: Add DRM_MODESET_LOCK_BEGIN/END helpers 2018-11-29 10:48:31 -05:00
drm_debugfs_crc.c
drm_debugfs.c drm: Merge drm_info.c into drm_debugfs.c 2018-11-22 09:52:27 +01:00
drm_dma.c
drm_dp_aux_dev.c
drm_dp_cec.c drm: Do not call drm_dp_cec_set_edid() while registering DP connectors 2018-10-11 10:52:35 +02:00
drm_dp_dual_mode_helper.c
drm_dp_helper.c drm/dp: DRM DP helper/macros to get DP sink DSC parameters 2018-10-31 14:08:32 -07:00
drm_dp_mst_topology.c drm/dp_mst: Introduce new refcounting scheme for mstbs and ports 2019-01-10 20:12:19 -05:00
drm_drv.c drm/drm_drv.c: Remove duplicate header 2018-12-27 12:46:49 +01:00
drm_dumb_buffers.c
drm_edid_load.c
drm_edid.c drm/edid: Add display_info.rgb_quant_range_selectable 2019-01-10 19:01:06 +02:00
drm_encoder_slave.c
drm_encoder.c drm: Differentiate the lack of an interface from invalid parameter 2018-09-14 17:29:47 +01:00
drm_fb_cma_helper.c drm-misc-next for v4.21, part 1: 2018-11-19 10:40:33 +10:00
drm_fb_helper.c drm/fb-helper: Scale back depth to supported maximum 2019-01-10 20:54:42 +01:00
drm_file.c
drm_flip_work.c
drm_fourcc.c drm: Introduce new DRM_FORMAT_XYUV 2018-11-20 16:20:13 +02:00
drm_framebuffer.c drm: Add macro to export functions only when CONFIG_DRM_DEBUG_SELFTEST is enabled 2018-11-02 09:58:10 +00:00
drm_gem_cma_helper.c drm/cma-helper: Add DRM_GEM_CMA_VMAP_DRIVER_OPS 2018-11-20 14:57:25 +01:00
drm_gem_framebuffer_helper.c drm/fourcc: Add char_per_block, block_w and block_h in drm_format_info 2018-11-02 09:55:27 +00:00
drm_gem.c drm/gem: Mark pinned pages as unevictable 2019-01-09 21:24:50 +00:00
drm_hashtab.c
drm_internal.h drm: move DRM_IF_VERSION to drm_internal.h 2018-12-27 13:08:58 +01:00
drm_ioc32.c
drm_ioctl.c drm: Return -EOPNOTSUPP in drm_setclientcap() when driver do not support KMS 2018-09-21 11:19:40 +02:00
drm_irq.c drm: Differentiate the lack of an interface from invalid parameter 2018-09-14 17:29:47 +01:00
drm_kms_helper_common.c
drm_lease.c drm: Rename crtc_idr as object_idr to KMS cleanups 2018-12-13 22:44:57 +01:00
drm_legacy.h
drm_lock.c drm: Differentiate the lack of an interface from invalid parameter 2018-09-14 17:29:47 +01:00
drm_memory.c drm: Shift * to be adjacent to pointer name 2018-10-16 14:39:25 +02:00
drm_mipi_dsi.c
drm_mm.c
drm_mode_config.c drm: Rename crtc_idr as object_idr to KMS cleanups 2018-12-13 22:44:57 +01:00
drm_mode_object.c drm: Reorder set_property_atomic to avoid returning with an active ww_ctx 2019-01-03 09:54:26 +00:00
drm_modes.c drm: Convert to using %pOFn instead of device_node.name 2018-10-01 10:16:39 +02:00
drm_modeset_helper.c drm: Unexport primary plane helpers 2018-10-05 18:06:49 +02:00
drm_modeset_lock.c drm/atomic: integrate modeset lock with private objects 2018-12-11 15:24:30 +01:00
drm_of.c
drm_panel_orientation_quirks.c drm: panel-orientation-quirks: Do rotation quirk for new GPD Win2 FW 2018-11-15 10:55:30 +01:00
drm_panel.c This is the 4.19-rc6 release 2018-10-04 11:03:34 +10:00
drm_pci.c drm/drm_pci.c: Use dma_zalloc_coherent 2018-10-23 15:59:01 +02:00
drm_plane_helper.c drm: Unexport drm_plane_helper_check_update 2018-10-05 22:45:19 +02:00
drm_plane.c drm: Add DRM_MODESET_LOCK_BEGIN/END helpers 2018-11-29 10:48:31 -05:00
drm_prime.c drm/prime: Fix drm_gem_prime_mmap() stack use 2018-11-22 15:44:05 +01:00
drm_print.c
drm_probe_helper.c
drm_property.c drm: Differentiate the lack of an interface from invalid parameter 2018-09-14 17:29:47 +01:00
drm_rect.c
drm_scatter.c drm: Differentiate the lack of an interface from invalid parameter 2018-09-14 17:29:47 +01:00
drm_scdc_helper.c
drm_simple_kms_helper.c drm/tinydrm: Advertise that we can do only DRM_FORMAT_MOD_LINEAR. 2018-10-30 13:01:50 -07:00
drm_syncobj.c drm/syncobj: remove drm_syncobj_cb and cleanup 2018-12-11 17:38:38 +01:00
drm_sysfs.c
drm_trace_points.c
drm_trace.h
drm_vblank.c drm: Differentiate the lack of an interface from invalid parameter 2018-09-14 17:29:47 +01:00
drm_vm.c
drm_vma_manager.c
drm_writeback.c
Kconfig drm/fb_helper: Allow leaking fbdev smem_start 2018-10-03 21:08:21 +02:00
Makefile drm-misc-next for v4.21: 2018-11-29 10:28:49 +10:00