linux/drivers/acpi
Rafael J. Wysocki 21a31013f7 ACPI / dock / PCI: Synchronous handling of dock events for PCI devices
The interactions between the ACPI dock driver and the ACPI-based PCI
hotplug (acpiphp) are currently problematic because of ordering
issues during hot-remove operations.

First of all, the current ACPI glue code expects that physical
devices will always be deleted before deleting the companion ACPI
device objects.  Otherwise, acpi_unbind_one() will fail with a
warning message printed to the kernel log, for example:

[  185.026073] usb usb5: Oops, 'acpi_handle' corrupt
[  185.035150] pci 0000:1b:00.0: Oops, 'acpi_handle' corrupt
[  185.035515] pci 0000:18:02.0: Oops, 'acpi_handle' corrupt
[  180.013656]  port1: Oops, 'acpi_handle' corrupt

This means, in particular, that struct pci_dev objects have to
be deleted before the struct acpi_device objects they are "glued"
with.

Now, the following happens during the undocking of an ACPI-based
dock station:
 1) hotplug_dock_devices() invokes registered hotplug callbacks to
    destroy physical devices associated with the ACPI device objects
    depending on the dock station.  It calls dd->ops->handler() for
    each of those device objects.
 2) For PCI devices dd->ops->handler() points to
    handle_hotplug_event_func() that queues up a separate work item
    to execute _handle_hotplug_event_func() for the given device and
    returns immediately.  That work item will be executed later.
 3) hotplug_dock_devices() calls dock_remove_acpi_device() for each
    device depending on the dock station.  This runs acpi_bus_trim()
    for each of them, which causes the underlying ACPI device object
    to be destroyed, but the work items queued up by
    handle_hotplug_event_func() haven't been started yet.
 4) The _handle_hotplug_event_func() work items queued up in step 2)
    are executed and cause the above failure to happen, because the
    PCI devices they handle do not have the companion ACPI device
    objects any more (those objects have been deleted in step 3).
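
The ordering problem in steps 1) to 4) can be reproduced in a toy
model.  The sketch below is not kernel code: every toy_* name is
hypothetical, and the only rule it borrows from the text above is that
a physical device must be removed while its ACPI companion still
exists.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_dev {
	bool acpi_companion_present;	/* stands in for struct acpi_device */
	bool physical_present;		/* stands in for struct pci_dev */
	int unbind_failures;		/* counts "'acpi_handle' corrupt" cases */
};

/* Remove the physical device; the ACPI companion should still exist. */
static void toy_remove_physical(struct toy_dev *d)
{
	assert(d->physical_present);
	if (!d->acpi_companion_present)
		d->unbind_failures++;
	d->physical_present = false;
}

/* Stand-in for acpi_bus_trim(): destroy the ACPI companion object. */
static void toy_trim(struct toy_dev *d)
{
	d->acpi_companion_present = false;
}

/* Old flow: the handler only queues work, which runs after the trim. */
int toy_undock_deferred(void)
{
	struct toy_dev d = { true, true, 0 };
	struct toy_dev *queued = &d;	/* step 2: work item queued */

	toy_trim(&d);			/* step 3: companion destroyed */
	toy_remove_physical(queued);	/* step 4: work item runs too late */
	return d.unbind_failures;
}

/* Synchronous flow: the handler finishes before the trim. */
int toy_undock_sync(void)
{
	struct toy_dev d = { true, true, 0 };

	toy_remove_physical(&d);
	toy_trim(&d);
	return d.unbind_failures;
}
```

In this model toy_undock_deferred() records one failed unbind and
toy_undock_sync() records none.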

The possible breakage doesn't end here, though.
hotplug_dock_devices() may return before some of the
_handle_hotplug_event_func() work items spawned by it have had a
chance to complete.  undock() will then cause _DCK to be evaluated,
which makes the devices handled by those work items go away,
possibly while they are still being accessed.

This means that dd->ops->handler() for PCI devices should not point
to handle_hotplug_event_func().  Instead, it should point to a
function that will do the work of _handle_hotplug_event_func()
synchronously.  For this reason, introduce such a function,
hotplug_event_func(), and modify acpiphp_dock_ops to point to
it as the handler.

Unfortunately, however, this is not sufficient, because if the dock
code were not changed further, hotplug_event_func() would now
deadlock with hotplug_dock_devices() that called it, since it would
run unregister_hotplug_dock_device() which in turn would attempt to
acquire the dock station's hp_lock mutex already acquired by
hotplug_dock_devices().

To resolve that deadlock, use the observation that
unregister_hotplug_dock_device() won't need to acquire hp_lock
if the PCI bridges that the devices on the dock station depend on
are prevented from being removed prematurely while the first loop
in hotplug_dock_devices() is in progress.
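
The self-deadlock described above can be illustrated with a toy
non-recursive lock.  This is a sketch under stated assumptions, not
the dock driver: the toy_* names are hypothetical, and a failed
toy_lock() marks the point where a real mutex_lock() would block
forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; hp_lock_held stands in for the hp_lock mutex. */
struct toy_station {
	bool hp_lock_held;
};

/* Returns false where a real non-recursive mutex would deadlock. */
static bool toy_lock(struct toy_station *ds)
{
	if (ds->hp_lock_held)
		return false;
	ds->hp_lock_held = true;
	return true;
}

static void toy_unlock(struct toy_station *ds)
{
	assert(ds->hp_lock_held);
	ds->hp_lock_held = false;
}

/* Stand-in for unregister_hotplug_dock_device(), with or without locking. */
static bool toy_unregister_device(struct toy_station *ds, bool take_lock)
{
	if (!take_lock)
		return true;
	if (!toy_lock(ds))
		return false;	/* caller already holds the lock */
	toy_unlock(ds);
	return true;
}

/*
 * Stand-in for hotplug_dock_devices() invoking a synchronous handler
 * that ends up unregistering the device while the lock is held.
 */
bool toy_hotplug_completes(bool unregister_takes_lock)
{
	struct toy_station ds = { false };
	bool ok;

	if (!toy_lock(&ds))
		return false;
	ok = toy_unregister_device(&ds, unregister_takes_lock);
	toy_unlock(&ds);
	return ok;
}
```

toy_hotplug_completes(true) models the deadlock, and
toy_hotplug_completes(false) models an unregister path that no longer
needs the lock.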

To make that possible, introduce a mechanism by which the callers of
register_hotplug_dock_device() can provide "init" and "release"
routines that will be executed, respectively, during the addition
and removal of the physical device object associated with the
given ACPI device handle.  Make acpiphp use two new functions,
acpiphp_dock_init() and acpiphp_dock_release(), that call
get_bridge() and put_bridge(), respectively, on the acpiphp bridge
holding the given device, for this purpose.
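
As a rough illustration of the "init"/"release" pairing, the sketch
below keeps a bridge alive with a plain reference count.  All toy_*
names, the struct layout, and the exact moments at which the callbacks
fire are assumptions made for the example; only the idea that "init"
takes a bridge reference and "release" drops it comes from the text
above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; not the acpiphp bridge structure. */
struct toy_bridge {
	int refcount;
};

static void toy_get_bridge(struct toy_bridge *b)
{
	b->refcount++;
}

static void toy_put_bridge(struct toy_bridge *b)
{
	assert(b->refcount > 0);
	b->refcount--;
}

typedef void (*toy_cb)(void *context);

/* Hypothetical per-device record kept by the dock code. */
struct toy_dependent_device {
	toy_cb init;
	toy_cb release;
	void *context;
};

/* Counterparts of acpiphp_dock_init() and acpiphp_dock_release(). */
static void toy_dock_init(void *context)
{
	toy_get_bridge(context);
}

static void toy_dock_release(void *context)
{
	toy_put_bridge(context);
}

/* Registration stores the callbacks and runs "init" right away. */
void toy_register_dock_device(struct toy_dependent_device *dd,
			      toy_cb init, toy_cb release, void *context)
{
	dd->init = init;
	dd->release = release;
	dd->context = context;
	if (dd->init)
		dd->init(dd->context);
}

/* Unregistration runs "release" and clears the record. */
void toy_unregister_dock_device(struct toy_dependent_device *dd)
{
	if (dd->release)
		dd->release(dd->context);
	dd->init = NULL;
	dd->release = NULL;
	dd->context = NULL;
}
```

A caller would pass toy_dock_init and toy_dock_release together with
a pointer to the bridge, so the bridge's count stays raised for as
long as the device is registered.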

In addition to that, remove the dock station's list of
"hotplug devices" and make the dock code always walk the whole list
of "dependent devices" instead in such a way that the loops in
hotplug_dock_devices() and dock_event() (replacing the loops over
"hotplug devices") will take references to the list entries that
register_hotplug_dock_device() has been called for.  That prevents
the "release" routines associated with those entries from being
called while the given entry is being processed and for PCI
devices this means that their bridges won't be removed (by a
concurrent thread) while hotplug_event_func() handling them is
being executed.
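
The effect of taking a reference to a list entry while it is being
processed can be sketched as follows.  The toy_* names and the single
counter are assumptions for illustration; the real dock code may track
this differently.

```c
#include <assert.h>

/* Toy model only; users counts registration plus any active walkers. */
struct toy_entry {
	int users;
	int released;	/* set once the "release" routine would have run */
};

static void toy_entry_get(struct toy_entry *e)
{
	e->users++;
}

/* The "release" routine runs only when the last user goes away. */
static void toy_entry_put(struct toy_entry *e)
{
	assert(e->users > 0);
	if (--e->users == 0)
		e->released = 1;
}

/* A synchronous handler that unregisters the entry it is handling. */
static void toy_handler(struct toy_entry *e)
{
	toy_entry_put(e);	/* drops the registration reference */
}

/*
 * One iteration of the walk: hold a reference across the handler call.
 * Returns whether "release" already ran while the handler was active.
 */
int toy_process_entry(struct toy_entry *e)
{
	int released_during_handler;

	toy_entry_get(e);
	toy_handler(e);
	released_during_handler = e->released;
	toy_entry_put(e);
	return released_during_handler;
}
```

Starting from an entry with one registration reference,
toy_process_entry() reports that "release" did not run while the
handler was active, and the entry is released only once the walker
drops its own reference.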

This change is based on two earlier patches from Jiang Liu.

References: https://bugzilla.kernel.org/show_bug.cgi?id=59501
Reported-and-tested-by: Alexander E. Patrakov <patrakov@gmail.com>
Tracked-down-by: Jiang Liu <jiang.liu@huawei.com>
Tested-by: Illya Klymov <xanf@xanf.me>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: 3.9+ <stable@vger.kernel.org>
2013-06-24 11:22:53 +02:00
acpica ACPICA: ACPICA: Fix for _INI regression 2013-05-08 15:31:53 +02:00
apei Merge branch 'acpi-fixes' 2013-06-07 12:35:23 +02:00
ac.c ACPI / AC: Add sleep quirk for Thinkpad e530 2013-05-12 14:03:15 +02:00
acpi_i2c.c ACPI / I2C: Use parent's ACPI_HANDLE() in acpi_i2c_register_devices() 2013-04-02 15:30:41 +02:00
acpi_ipmi.c
acpi_lpss.c ACPI / LPSS: Power up LPSS devices during enumeration 2013-06-20 00:49:06 +02:00
acpi_memhotplug.c ACPI / memhotplug: Remove info->failed bit 2013-03-25 00:36:25 +01:00
acpi_pad.c ACPI / acpi_pad: Used PTR_RET 2013-03-25 00:13:15 +01:00
acpi_platform.c ACPI / scan: Add special handler for Intel Lynxpoint LPSS devices 2013-03-21 22:44:38 +01:00
battery.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
bgrt.c efi: Fix the ACPI BGRT driver for images located in EFI boot services memory 2012-09-29 12:21:03 -07:00
blacklist.c acpi: delete module.h include from files explicitly not needing it 2011-10-31 19:30:33 -04:00
bus.c ACPI: replace kmalloc+memcpy with kmemdup 2013-03-24 23:31:33 +01:00
button.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
cm_sbs.c
container.c Merge branch 'acpi-assorted' 2013-04-28 01:54:08 +02:00
custom_method.c The sweeping change is to make add_taint() explicitly indicate whether to disable 2013-02-25 15:41:43 -08:00
debugfs.c acpi: add export.h to files using THIS_MODULE/EXPORT_SYMBOL 2011-10-31 19:30:34 -04:00
device_pm.c ACPI / LPSS: Power up LPSS devices during enumeration 2013-06-20 00:49:06 +02:00
dock.c ACPI / dock / PCI: Synchronous handling of dock events for PCI devices 2013-06-24 11:22:53 +02:00
ec_sys.c simple_open: automatically convert to simple_open() 2012-04-05 15:25:50 -07:00
ec.c ACPI / EC: Restart transaction even when the IBF flag set 2013-05-12 14:03:15 +02:00
event.c acpi: add export.h to files using THIS_MODULE/EXPORT_SYMBOL 2011-10-31 19:30:34 -04:00
fan.c ACPI / fan: avoid null pointer deference error 2013-03-25 23:01:00 +01:00
glue.c ACPI / glue: Drop .find_bridge() callback from struct acpi_bus_type 2013-03-04 14:23:40 +01:00
hed.c ACPI: Remove useless type argument of driver .remove() operation 2013-01-26 00:37:24 +01:00
internal.h ACPI / dock: Initialize ACPI dock subsystem upfront 2013-06-23 00:59:55 +02:00
Kconfig Merge branch 'acpi-assorted' 2013-04-28 01:54:08 +02:00
Makefile Power management and ACPI fixes for 3.10-rc3 2013-05-25 20:32:00 -07:00
numa.c x86, ACPI, mm: Revert movablemem_map support 2013-03-02 09:34:39 -08:00
nvs.c ACPI / PM: print physical addresses consistently with other parts of kernel 2012-03-30 02:46:57 -04:00
osl.c ACPI: Fix wrong parameter passed to memblock_reserve 2013-04-24 13:50:17 +02:00
pci_irq.c PCI/ACPI: Don't cache _PRT, and don't associate them with bus numbers 2013-02-16 11:58:34 -07:00
pci_link.c ACPI: Set length even for TYPE_END_TAG acpi resource 2013-03-24 01:00:38 +01:00
pci_root.c PCI: acpiphp: Re-enumerate devices when host bridge receives Bus Check 2013-05-17 14:12:06 -06:00
pci_slot.c PCI/ACPI: Handle PCI slot devices when creating/destroying PCI buses 2013-04-12 15:38:25 -06:00
power.c ACPI / PM: Fix error code path for power resources initialization 2013-06-20 00:47:55 +02:00
proc.c procfs: new helper - PDE_DATA(inode) 2013-04-09 14:13:32 -04:00
processor_core.c ACPI / processor: Remove redundant NULL check before kfree 2013-03-04 14:23:39 +01:00
processor_driver.c ACPI / PM: Move processor suspend/resume to syscore_ops 2013-05-12 14:03:14 +02:00
processor_idle.c ACPI / PM: Move processor suspend/resume to syscore_ops 2013-05-12 14:03:14 +02:00
processor_perflib.c acpi: Export the acpi_processor_get_performance_info 2013-03-06 10:00:34 -05:00
processor_thermal.c ACPI / processor_thermal: avoid null pointer deference error 2013-03-25 23:01:01 +01:00
processor_throttling.c ACPI: suppress compiler warnings in processor_throttling.c 2013-03-25 00:05:48 +01:00
reboot.c Revert "ACPI: ignore FADT reset-reg-sup flag" 2012-04-20 11:19:35 -07:00
resource.c ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources 2013-06-19 23:55:59 +02:00
sbs.c proc: Supply a function to remove a proc entry by PDE 2013-05-01 17:29:46 -04:00
sbshc.c ACPI: Remove useless type argument of driver .remove() operation 2013-01-26 00:37:24 +01:00
sbshc.h
scan.c ACPI / dock: Initialize ACPI dock subsystem upfront 2013-06-23 00:59:55 +02:00
sleep.c ACPI / PM: fix suspend and resume on Sony Vaio VGN-FW21M 2013-03-27 00:04:53 +01:00
sleep.h ACPI: Drop power resources driver 2013-01-17 14:11:06 +01:00
sysfs.c ACPI / hotplug: Make acpi_hotplug_profile_ktype static 2013-03-19 00:16:15 +01:00
tables.c ACPICA: Cleanup table handler naming conflicts. 2013-01-11 13:10:16 +01:00
thermal.c ACPI / thermal: do not always return THERMAL_TREND_RAISING for active trip points 2013-04-26 13:34:40 +02:00
utils.c ACPI: Add acpi_handle_<level>() interfaces 2012-11-21 23:20:22 +01:00
video_detect.c ACPI / video: Add "Asus UL30A" to ACPI video detect blacklist 2013-05-23 01:41:45 +02:00
video.c ACPI / video: Do not bind to device objects with a scan handler 2013-06-10 13:00:29 +02:00
wakeup.c