PCI: Fix active state requirement in PME polling

The commit noted in the Fixes tag added a bogus requirement that runtime
PM managed devices need to be in the RPM_ACTIVE state for PME polling.  In
fact, only devices in low power states should be polled.

However, there's still a requirement that the device config space must be
accessible, which has implications for both the current state of the polled
device and the parent bridge, when present.  It's not sufficient to assume
the bridge remains in D0: cases have been observed where the bridge passes
the D0 test, but its runtime PM status is RPM_SUSPENDING and config space
of the polled device becomes inaccessible during pci_pme_wakeup().
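
As an illustration only, the eligibility rule described above can be
modeled in standalone userspace C.  Every name below (model_dev,
model_bridge_usable(), model_should_poll() and the MODEL_* enums) is a
hypothetical stand-in rather than a kernel API; the sketch assumes the
bridge must be both runtime-active and in D0, and that the polled device
must be runtime-suspended but not in D3cold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PCI power states. */
enum model_pci_state { MODEL_D0, MODEL_D3HOT, MODEL_D3COLD };

/* Hypothetical stand-ins for runtime PM status values. */
enum model_rpm { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDING, MODEL_RPM_SUSPENDED };

struct model_dev {
	enum model_pci_state pci_state;
	enum model_rpm rpm;
};

/*
 * A bridge keeps the config space of subordinate devices reachable only
 * while it is runtime-active and in D0.  Checking D0 alone misses the
 * window where the bridge is already RPM_SUSPENDING.
 */
static bool model_bridge_usable(const struct model_dev *bridge)
{
	if (!bridge)		/* no parent bridge: nothing to check */
		return true;
	return bridge->rpm == MODEL_RPM_ACTIVE &&
	       bridge->pci_state == MODEL_D0;
}

/* Poll only suspended devices whose config space is still reachable. */
static bool model_should_poll(const struct model_dev *bridge,
			      const struct model_dev *dev)
{
	return model_bridge_usable(bridge) &&
	       dev->rpm == MODEL_RPM_SUSPENDED &&
	       dev->pci_state != MODEL_D3COLD;
}
```

In this model a D3hot endpoint behind an active D0 bridge is polled, while
the same endpoint behind a bridge that reports D0 but is RPM_SUSPENDING is
skipped.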

Therefore, since the bridge is already effectively required to be in the
RPM_ACTIVE state, formalize this in the code and elevate the PM usage count
to maintain the state while polling the subordinate device.
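
A minimal userspace sketch of the reference handling, for illustration
only.  The model_* names are hypothetical; model_get_if_active() merely
imitates the return convention of pm_runtime_get_if_active() (positive
when a reference was taken, zero when the device is not active, negative
when runtime PM is disabled), and the D0 check on the bridge is left out
for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum model_rpm_status { MODEL_ACTIVE, MODEL_SUSPENDING, MODEL_SUSPENDED };

/* Hypothetical model of a device's runtime PM bookkeeping. */
struct model_pm {
	enum model_rpm_status status;
	int usage_count;
	bool rpm_enabled;
};

/*
 * Imitates the return convention of pm_runtime_get_if_active():
 *   > 0  device was active and a usage reference was taken
 *   0    device was not active, no reference taken
 *   < 0  runtime PM is disabled, no reference taken
 */
static int model_get_if_active(struct model_pm *pm)
{
	if (!pm->rpm_enabled)
		return -EINVAL;
	if (pm->status != MODEL_ACTIVE)
		return 0;
	pm->usage_count++;
	return 1;
}

static void model_put(struct model_pm *pm)
{
	assert(pm->usage_count > 0);
	pm->usage_count--;
}

/*
 * Returns true when the subordinate device would be polled.  The bridge
 * usage count observed at the point of polling is stored in
 * *count_during_poll so the pinning can be inspected.
 */
static bool model_poll_behind_bridge(struct model_pm *bridge,
				     int *count_during_poll)
{
	int bref = model_get_if_active(bridge);

	if (!bref)
		return false;	/* bridge not active: skip this device */

	/* A held reference keeps the bridge active while polling. */
	*count_during_poll = bridge->usage_count;

	if (bref > 0)
		model_put(bridge);	/* drop only a reference we took */
	return true;
}
```

The `bref > 0` test before the put mirrors the reason the patch keeps the
return value around: when runtime PM is disabled no reference was taken,
so none may be dropped.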

This resolves a regression reported in the bugzilla below where a
Thunderbolt/USB4 hierarchy fails to scan for an attached NVMe endpoint
downstream of a bridge in a D3hot power state.

Link: https://lore.kernel.org/r/20240123185548.1040096-1-alex.williamson@redhat.com
Fixes: d3fcd73603 ("PCI: Fix runtime PM race with PME polling")
Reported-by: Sanath S <sanath.s@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Sanath S <sanath.s@amd.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Alex Williamson 2024-01-23 11:55:31 -07:00 committed by Bjorn Helgaas
parent 6613476e22
commit 41044d5360

@@ -2496,29 +2496,36 @@ static void pci_pme_list_scan(struct work_struct *work)
 			if (pdev->pme_poll) {
 				struct pci_dev *bridge = pdev->bus->self;
 				struct device *dev = &pdev->dev;
-				int pm_status;
+				struct device *bdev = bridge ? &bridge->dev : NULL;
+				int bref = 0;
 
 				/*
-				 * If bridge is in low power state, the
-				 * configuration space of subordinate devices
-				 * may be not accessible
+				 * If we have a bridge, it should be in an active/D0
+				 * state or the configuration space of subordinate
+				 * devices may not be accessible or stable over the
+				 * course of the call.
 				 */
-				if (bridge && bridge->current_state != PCI_D0)
-					continue;
+				if (bdev) {
+					bref = pm_runtime_get_if_active(bdev, true);
+					if (!bref)
+						continue;
+
+					if (bridge->current_state != PCI_D0)
+						goto put_bridge;
+				}
 
 				/*
-				 * If the device is in a low power state it
-				 * should not be polled either.
+				 * The device itself should be suspended but config
+				 * space must be accessible, therefore it cannot be in
+				 * D3cold.
 				 */
-				pm_status = pm_runtime_get_if_active(dev, true);
-				if (!pm_status)
-					continue;
-
-				if (pdev->current_state != PCI_D3cold)
+				if (pm_runtime_suspended(dev) &&
+				    pdev->current_state != PCI_D3cold)
 					pci_pme_wakeup(pdev, NULL);
 
-				if (pm_status > 0)
-					pm_runtime_put(dev);
+put_bridge:
+				if (bref > 0)
+					pm_runtime_put(bdev);
 			} else {
 				list_del(&pme_dev->list);
 				kfree(pme_dev);