Merge branch 'pci/pm'

  - Always return devices to D0 when thawing to fix hibernation with
    drivers like mlx4 that used legacy power management (previously we only
    did it for drivers with new power management ops) (Dexuan Cui)

  - Clear PCIe PME Status even for legacy power management (Bjorn Helgaas)

  - Fix PCI PM documentation errors (Bjorn Helgaas)

  - Use dev_printk() for more power management messages (Bjorn Helgaas)

  - Apply D2 delay as milliseconds, not microseconds (Bjorn Helgaas)

  - Convert xen-platform from legacy to generic power management (Bjorn
    Helgaas)

  - Remove unused .resume_early() and .suspend_late() legacy power
    management hooks (Bjorn Helgaas)

  - Rearrange power management code for clarity (Rafael J. Wysocki)

  - Decode power states more clearly ("4" or "D4" really refers to
    "D3cold") (Bjorn Helgaas)

  - Notice when reading PM Control register returns an error (~0) instead
    of interpreting it as being in D3hot (Bjorn Helgaas)

  - Add missing link delays required by the PCIe spec (Mika Westerberg)

* pci/pm:
  PCI/PM: Move pci_dev_wait() definition earlier
  PCI/PM: Add missing link delays required by the PCIe spec
  PCI/PM: Add pcie_wait_for_link_delay()
  PCI/PM: Return error when changing power state from D3cold
  PCI/PM: Decode D3cold power state correctly
  PCI/PM: Fold __pci_complete_power_transition() into its caller
  PCI/PM: Avoid exporting __pci_complete_power_transition()
  PCI/PM: Fold __pci_start_power_transition() into its caller
  PCI/PM: Use pci_power_up() in pci_set_power_state()
  PCI/PM: Move power state update away from pci_power_up()
  PCI/PM: Remove unused pci_driver.suspend_late() hook
  PCI/PM: Remove unused pci_driver.resume_early() hook
  xen-platform: Convert to generic power management
  PCI/PM: Simplify pci_set_power_state()
  PCI/PM: Expand PM reset messages to mention D3hot (not just D3)
  PCI/PM: Apply D2 delay as milliseconds, not microseconds
  PCI/PM: Use pci_WARN() to include device information
  PCI/PM: Use PCI dev_printk() wrappers for consistency
  PCI/PM: Wrap long lines in documentation
  PCI/PM: Note that PME can be generated from D0
  PCI/PM: Make power management op coding style consistent
  PCI/PM: Run resume fixups before disabling wakeup events
  PCI/PM: Clear PCIe PME Status even for legacy power management
  PCI/PM: Correct pci_pm_thaw_noirq() documentation
  PCI/PM: Always return devices to D0 when thawing
Bjorn Helgaas 2019-11-28 08:54:35 -06:00
commit 7cfe16393c
7 changed files with 362 additions and 263 deletions
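
As background for the xen-platform conversion and the removal of the legacy .suspend_late()/.resume_early() hooks listed above, the sketch below shows the general shape of such a conversion. It is illustrative only: the "foo" driver, its callback, and the debug message are hypothetical and not taken from this series.

#include <linux/pci.h>

/*
 * Legacy model (being phased out): struct pci_driver carries its own
 * .suspend_late()/.resume_early() hooks, which take a struct pci_dev *.
 * Generic model: the driver provides a dev_pm_ops object whose callbacks
 * take a struct device *, and the PCI core saves/restores config space
 * and changes the power state on the driver's behalf.
 */
static int foo_resume_noirq(struct device *dev)
{
        struct pci_dev *pdev = to_pci_dev(dev);

        pci_dbg(pdev, "device-specific re-initialization goes here\n");
        return 0;
}

static const struct dev_pm_ops foo_pm_ops = {
        .resume_noirq = foo_resume_noirq,
};

static struct pci_driver foo_driver = {
        .name   = "foo",
        /* .probe, .remove and .id_table elided for brevity */
        .driver = {
                .pm = &foo_pm_ops,
        },
};
builtin_pci_driver(foo_driver);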

@ -130,8 +130,8 @@ a full power-on reset sequence and the power-on defaults are restored to the
device by hardware just as at initial power up.
PCI devices supporting the PCI PM Spec can be programmed to generate PMEs
while in a low-power state (D1-D3), but they are not required to be capable
of generating PMEs from all supported low-power states. In particular, the
while in any power state (D0-D3), but they are not required to be capable
of generating PMEs from all supported power states. In particular, the
capability of generating PMEs from D3cold is optional and depends on the
presence of additional voltage (3.3Vaux) allowing the device to remain
sufficiently active to generate a wakeup signal.
@ -426,12 +426,12 @@ pm->runtime_idle() callback.
2.4. System-Wide Power Transitions
----------------------------------
There are a few different types of system-wide power transitions, described in
Documentation/driver-api/pm/devices.rst. Each of them requires devices to be handled
in a specific way and the PM core executes subsystem-level power management
callbacks for this purpose. They are executed in phases such that each phase
involves executing the same subsystem-level callback for every device belonging
to the given subsystem before the next phase begins. These phases always run
after tasks have been frozen.
Documentation/driver-api/pm/devices.rst. Each of them requires devices to be
handled in a specific way and the PM core executes subsystem-level power
management callbacks for this purpose. They are executed in phases such that
each phase involves executing the same subsystem-level callback for every device
belonging to the given subsystem before the next phase begins. These phases
always run after tasks have been frozen.
2.4.1. System Suspend
^^^^^^^^^^^^^^^^^^^^^
@ -600,17 +600,17 @@ using the following PCI bus type's callbacks::
respectively.
The first of them, pci_pm_thaw_noirq(), is analogous to pci_pm_resume_noirq(),
but it doesn't put the device into the full power state and doesn't attempt to
restore its standard configuration registers. It also executes the device
driver's pm->thaw_noirq() callback, if defined, instead of pm->resume_noirq().
The first of them, pci_pm_thaw_noirq(), is analogous to pci_pm_resume_noirq().
It puts the device into the full power state and restores its standard
configuration registers. It also executes the device driver's pm->thaw_noirq()
callback, if defined, instead of pm->resume_noirq().
The pci_pm_thaw() routine is similar to pci_pm_resume(), but it runs the device
driver's pm->thaw() callback instead of pm->resume(). It is executed
asynchronously for different PCI devices that don't depend on each other in a
known way.
The complete phase it the same as for system resume.
The complete phase is the same as for system resume.
After saving the image, devices need to be powered down before the system can
enter the target sleep state (ACPI S4 for ACPI-based systems). This is done in
@ -636,12 +636,12 @@ System restore requires a hibernation image to be loaded into memory and the
pre-hibernation memory contents to be restored before the pre-hibernation system
activity can be resumed.
As described in Documentation/driver-api/pm/devices.rst, the hibernation image is loaded
into memory by a fresh instance of the kernel, called the boot kernel, which in
turn is loaded and run by a boot loader in the usual way. After the boot kernel
has loaded the image, it needs to replace its own code and data with the code
and data of the "hibernated" kernel stored within the image, called the image
kernel. For this purpose all devices are frozen just like before creating
As described in Documentation/driver-api/pm/devices.rst, the hibernation image
is loaded into memory by a fresh instance of the kernel, called the boot kernel,
which in turn is loaded and run by a boot loader in the usual way. After the
boot kernel has loaded the image, it needs to replace its own code and data with
the code and data of the "hibernated" kernel stored within the image, called the
image kernel. For this purpose all devices are frozen just like before creating
the image during hibernation, in the
prepare, freeze, freeze_noirq
@ -691,12 +691,12 @@ controlling the runtime power management of their devices.
At the time of this writing there are two ways to define power management
callbacks for a PCI device driver, the recommended one, based on using a
dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and the
"legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and
.resume() callbacks from struct pci_driver are used. The legacy approach,
however, doesn't allow one to define runtime power management callbacks and is
not really suitable for any new drivers. Therefore it is not covered by this
document (refer to the source code to learn more about it).
dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and
the "legacy" one, in which the .suspend() and .resume() callbacks from struct
pci_driver are used. The legacy approach, however, doesn't allow one to define
runtime power management callbacks and is not really suitable for any new
drivers. Therefore it is not covered by this document (refer to the source code
to learn more about it).
It is recommended that all PCI device drivers define a struct dev_pm_ops object
containing pointers to power management (PM) callbacks that will be executed by

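A minimal skeleton of the dev_pm_ops approach recommended by the documentation above might look like the following. This is a hedged sketch only: the vendor/device IDs, the "mydrv" names, and the callback bodies are placeholders, not code from this commit.

#include <linux/pci.h>
#include <linux/pm.h>

static const struct pci_device_id mydrv_ids[] = {
        { PCI_DEVICE(0x1234, 0xabcd) },         /* placeholder IDs */
        { }
};

static int mydrv_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
        return pcim_enable_device(pdev);        /* managed enable */
}

static int __maybe_unused mydrv_suspend(struct device *dev)
{
        /* Quiesce the device; the PCI core then saves config space and
         * chooses an appropriate low-power state. */
        return 0;
}

static int __maybe_unused mydrv_resume(struct device *dev)
{
        /* Config space has already been restored; re-initialize here. */
        return 0;
}

static SIMPLE_DEV_PM_OPS(mydrv_pm_ops, mydrv_suspend, mydrv_resume);

static struct pci_driver mydrv_driver = {
        .name     = "mydrv",
        .id_table = mydrv_ids,
        .probe    = mydrv_probe,
        .driver   = {
                .pm = &mydrv_pm_ops,
        },
};
builtin_pci_driver(mydrv_driver);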

@ -315,7 +315,8 @@ static long local_pci_probe(void *_ddi)
* Probe function should return < 0 for failure, 0 for success
* Treat values > 0 as success, but warn.
*/
dev_warn(dev, "Driver probe function unexpectedly returned %d\n", rc);
pci_warn(pci_dev, "Driver probe function unexpectedly returned %d\n",
rc);
return 0;
}
@ -517,6 +518,12 @@ static int pci_restore_standard_config(struct pci_dev *pci_dev)
return 0;
}
static void pci_pm_default_resume(struct pci_dev *pci_dev)
{
pci_fixup_device(pci_fixup_resume, pci_dev);
pci_enable_wake(pci_dev, PCI_D0, false);
}
#endif
#ifdef CONFIG_PM_SLEEP
@ -524,6 +531,7 @@ static int pci_restore_standard_config(struct pci_dev *pci_dev)
static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
{
pci_power_up(pci_dev);
pci_update_current_state(pci_dev, PCI_D0);
pci_restore_state(pci_dev);
pci_pme_restore(pci_dev);
}
@ -578,9 +586,9 @@ static int pci_legacy_suspend(struct device *dev, pm_message_t state)
if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
&& pci_dev->current_state != PCI_UNKNOWN) {
WARN_ONCE(pci_dev->current_state != prev,
"PCI PM: Device state not saved by %pS\n",
drv->suspend);
pci_WARN_ONCE(pci_dev, pci_dev->current_state != prev,
"PCI PM: Device state not saved by %pS\n",
drv->suspend);
}
}
@ -592,46 +600,17 @@ static int pci_legacy_suspend(struct device *dev, pm_message_t state)
static int pci_legacy_suspend_late(struct device *dev, pm_message_t state)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
struct pci_driver *drv = pci_dev->driver;
if (drv && drv->suspend_late) {
pci_power_t prev = pci_dev->current_state;
int error;
error = drv->suspend_late(pci_dev, state);
suspend_report_result(drv->suspend_late, error);
if (error)
return error;
if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
&& pci_dev->current_state != PCI_UNKNOWN) {
WARN_ONCE(pci_dev->current_state != prev,
"PCI PM: Device state not saved by %pS\n",
drv->suspend_late);
goto Fixup;
}
}
if (!pci_dev->state_saved)
pci_save_state(pci_dev);
pci_pm_set_unknown_state(pci_dev);
Fixup:
pci_fixup_device(pci_fixup_suspend_late, pci_dev);
return 0;
}
static int pci_legacy_resume_early(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
struct pci_driver *drv = pci_dev->driver;
return drv && drv->resume_early ?
drv->resume_early(pci_dev) : 0;
}
static int pci_legacy_resume(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
@ -645,12 +624,6 @@ static int pci_legacy_resume(struct device *dev)
/* Auxiliary functions used by the new power management framework */
static void pci_pm_default_resume(struct pci_dev *pci_dev)
{
pci_fixup_device(pci_fixup_resume, pci_dev);
pci_enable_wake(pci_dev, PCI_D0, false);
}
static void pci_pm_default_suspend(struct pci_dev *pci_dev)
{
/* Disable non-bridge devices without PM support */
@ -661,16 +634,15 @@ static void pci_pm_default_suspend(struct pci_dev *pci_dev)
static bool pci_has_legacy_pm_support(struct pci_dev *pci_dev)
{
struct pci_driver *drv = pci_dev->driver;
bool ret = drv && (drv->suspend || drv->suspend_late || drv->resume
|| drv->resume_early);
bool ret = drv && (drv->suspend || drv->resume);
/*
* Legacy PM support is used by default, so warn if the new framework is
* supported as well. Drivers are supposed to support either the
* former, or the latter, but not both at the same time.
*/
WARN(ret && drv->driver.pm, "driver %s device %04x:%04x\n",
drv->name, pci_dev->vendor, pci_dev->device);
pci_WARN(pci_dev, ret && drv->driver.pm, "device %04x:%04x\n",
pci_dev->vendor, pci_dev->device);
return ret;
}
@ -679,11 +651,11 @@ static bool pci_has_legacy_pm_support(struct pci_dev *pci_dev)
static int pci_pm_prepare(struct device *dev)
{
struct device_driver *drv = dev->driver;
struct pci_dev *pci_dev = to_pci_dev(dev);
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
if (drv && drv->pm && drv->pm->prepare) {
int error = drv->pm->prepare(dev);
if (pm && pm->prepare) {
int error = pm->prepare(dev);
if (error < 0)
return error;
@ -793,9 +765,9 @@ static int pci_pm_suspend(struct device *dev)
if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
&& pci_dev->current_state != PCI_UNKNOWN) {
WARN_ONCE(pci_dev->current_state != prev,
"PCI PM: State of device not saved by %pS\n",
pm->suspend);
pci_WARN_ONCE(pci_dev, pci_dev->current_state != prev,
"PCI PM: State of device not saved by %pS\n",
pm->suspend);
}
}
@ -841,9 +813,9 @@ static int pci_pm_suspend_noirq(struct device *dev)
if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
&& pci_dev->current_state != PCI_UNKNOWN) {
WARN_ONCE(pci_dev->current_state != prev,
"PCI PM: State of device not saved by %pS\n",
pm->suspend_noirq);
pci_WARN_ONCE(pci_dev, pci_dev->current_state != prev,
"PCI PM: State of device not saved by %pS\n",
pm->suspend_noirq);
goto Fixup;
}
}
@ -865,7 +837,7 @@ static int pci_pm_suspend_noirq(struct device *dev)
pci_prepare_to_sleep(pci_dev);
}
dev_dbg(dev, "PCI PM: Suspend power state: %s\n",
pci_dbg(pci_dev, "PCI PM: Suspend power state: %s\n",
pci_power_name(pci_dev->current_state));
if (pci_dev->current_state == PCI_D0) {
@ -880,7 +852,7 @@ static int pci_pm_suspend_noirq(struct device *dev)
}
if (pci_dev->skip_bus_pm && pm_suspend_no_platform()) {
dev_dbg(dev, "PCI PM: Skipped\n");
pci_dbg(pci_dev, "PCI PM: Skipped\n");
goto Fixup;
}
@ -917,8 +889,9 @@ Fixup:
static int pci_pm_resume_noirq(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
struct device_driver *drv = dev->driver;
int error = 0;
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
pci_power_t prev_state = pci_dev->current_state;
bool skip_bus_pm = pci_dev->skip_bus_pm;
if (dev_pm_may_skip_resume(dev))
return 0;
@ -937,27 +910,28 @@ static int pci_pm_resume_noirq(struct device *dev)
* configuration here and attempting to put them into D0 again is
* pointless, so avoid doing that.
*/
if (!(pci_dev->skip_bus_pm && pm_suspend_no_platform()))
if (!(skip_bus_pm && pm_suspend_no_platform()))
pci_pm_default_resume_early(pci_dev);
pci_fixup_device(pci_fixup_resume_early, pci_dev);
if (pci_has_legacy_pm_support(pci_dev))
return pci_legacy_resume_early(dev);
pcie_pme_root_status_cleanup(pci_dev);
if (drv && drv->pm && drv->pm->resume_noirq)
error = drv->pm->resume_noirq(dev);
if (!skip_bus_pm && prev_state == PCI_D3cold)
pci_bridge_wait_for_secondary_bus(pci_dev);
return error;
if (pci_has_legacy_pm_support(pci_dev))
return 0;
if (pm && pm->resume_noirq)
return pm->resume_noirq(dev);
return 0;
}
static int pci_pm_resume(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
int error = 0;
/*
* This is necessary for the suspend error path in which resume is
@ -973,12 +947,12 @@ static int pci_pm_resume(struct device *dev)
if (pm) {
if (pm->resume)
error = pm->resume(dev);
return pm->resume(dev);
} else {
pci_pm_reenable_device(pci_dev);
}
return error;
return 0;
}
#else /* !CONFIG_SUSPEND */
@ -993,7 +967,6 @@ static int pci_pm_resume(struct device *dev)
#ifdef CONFIG_HIBERNATE_CALLBACKS
/*
* pcibios_pm_ops - provide arch-specific hooks when a PCI device is doing
* a hibernate transition
@ -1039,16 +1012,16 @@ static int pci_pm_freeze(struct device *dev)
static int pci_pm_freeze_noirq(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
struct device_driver *drv = dev->driver;
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
if (pci_has_legacy_pm_support(pci_dev))
return pci_legacy_suspend_late(dev, PMSG_FREEZE);
if (drv && drv->pm && drv->pm->freeze_noirq) {
if (pm && pm->freeze_noirq) {
int error;
error = drv->pm->freeze_noirq(dev);
suspend_report_result(drv->pm->freeze_noirq, error);
error = pm->freeze_noirq(dev);
suspend_report_result(pm->freeze_noirq, error);
if (error)
return error;
}
@ -1067,8 +1040,8 @@ static int pci_pm_freeze_noirq(struct device *dev)
static int pci_pm_thaw_noirq(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
struct device_driver *drv = dev->driver;
int error = 0;
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
int error;
if (pcibios_pm_ops.thaw_noirq) {
error = pcibios_pm_ops.thaw_noirq(dev);
@ -1076,21 +1049,25 @@ static int pci_pm_thaw_noirq(struct device *dev)
return error;
}
if (pci_has_legacy_pm_support(pci_dev))
return pci_legacy_resume_early(dev);
/*
* pci_restore_state() requires the device to be in D0 (because of MSI
* restoration among other things), so force it into D0 in case the
* driver's "freeze" callbacks put it into a low-power state directly.
* The pm->thaw_noirq() callback assumes the device has been
* returned to D0 and its config state has been restored.
*
* In addition, pci_restore_state() restores MSI-X state in MMIO
* space, which requires the device to be in D0, so return it to D0
* in case the driver's "freeze" callbacks put it into a low-power
* state.
*/
pci_set_power_state(pci_dev, PCI_D0);
pci_restore_state(pci_dev);
if (drv && drv->pm && drv->pm->thaw_noirq)
error = drv->pm->thaw_noirq(dev);
if (pci_has_legacy_pm_support(pci_dev))
return 0;
return error;
if (pm && pm->thaw_noirq)
return pm->thaw_noirq(dev);
return 0;
}
static int pci_pm_thaw(struct device *dev)
@ -1161,24 +1138,24 @@ static int pci_pm_poweroff_late(struct device *dev)
static int pci_pm_poweroff_noirq(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
struct device_driver *drv = dev->driver;
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
if (dev_pm_smart_suspend_and_suspended(dev))
return 0;
if (pci_has_legacy_pm_support(to_pci_dev(dev)))
if (pci_has_legacy_pm_support(pci_dev))
return pci_legacy_suspend_late(dev, PMSG_HIBERNATE);
if (!drv || !drv->pm) {
if (!pm) {
pci_fixup_device(pci_fixup_suspend_late, pci_dev);
return 0;
}
if (drv->pm->poweroff_noirq) {
if (pm->poweroff_noirq) {
int error;
error = drv->pm->poweroff_noirq(dev);
suspend_report_result(drv->pm->poweroff_noirq, error);
error = pm->poweroff_noirq(dev);
suspend_report_result(pm->poweroff_noirq, error);
if (error)
return error;
}
@ -1204,8 +1181,8 @@ static int pci_pm_poweroff_noirq(struct device *dev)
static int pci_pm_restore_noirq(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
struct device_driver *drv = dev->driver;
int error = 0;
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
int error;
if (pcibios_pm_ops.restore_noirq) {
error = pcibios_pm_ops.restore_noirq(dev);
@ -1217,19 +1194,18 @@ static int pci_pm_restore_noirq(struct device *dev)
pci_fixup_device(pci_fixup_resume_early, pci_dev);
if (pci_has_legacy_pm_support(pci_dev))
return pci_legacy_resume_early(dev);
return 0;
if (drv && drv->pm && drv->pm->restore_noirq)
error = drv->pm->restore_noirq(dev);
if (pm && pm->restore_noirq)
return pm->restore_noirq(dev);
return error;
return 0;
}
static int pci_pm_restore(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
int error = 0;
/*
* This is necessary for the hibernation error path in which restore is
@ -1245,12 +1221,12 @@ static int pci_pm_restore(struct device *dev)
if (pm) {
if (pm->restore)
error = pm->restore(dev);
return pm->restore(dev);
} else {
pci_pm_reenable_device(pci_dev);
}
return error;
return 0;
}
#else /* !CONFIG_HIBERNATE_CALLBACKS */
@ -1295,11 +1271,11 @@ static int pci_pm_runtime_suspend(struct device *dev)
* log level.
*/
if (error == -EBUSY || error == -EAGAIN) {
dev_dbg(dev, "can't suspend now (%ps returned %d)\n",
pci_dbg(pci_dev, "can't suspend now (%ps returned %d)\n",
pm->runtime_suspend, error);
return error;
} else if (error) {
dev_err(dev, "can't suspend (%ps returned %d)\n",
pci_err(pci_dev, "can't suspend (%ps returned %d)\n",
pm->runtime_suspend, error);
return error;
}
@ -1310,9 +1286,9 @@ static int pci_pm_runtime_suspend(struct device *dev)
if (pm && pm->runtime_suspend
&& !pci_dev->state_saved && pci_dev->current_state != PCI_D0
&& pci_dev->current_state != PCI_UNKNOWN) {
WARN_ONCE(pci_dev->current_state != prev,
"PCI PM: State of device not saved by %pS\n",
pm->runtime_suspend);
pci_WARN_ONCE(pci_dev, pci_dev->current_state != prev,
"PCI PM: State of device not saved by %pS\n",
pm->runtime_suspend);
return 0;
}
@ -1326,9 +1302,10 @@ static int pci_pm_runtime_suspend(struct device *dev)
static int pci_pm_runtime_resume(struct device *dev)
{
int rc = 0;
struct pci_dev *pci_dev = to_pci_dev(dev);
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
pci_power_t prev_state = pci_dev->current_state;
int error = 0;
/*
* Restoring config space is necessary even if the device is not bound
@ -1341,22 +1318,23 @@ static int pci_pm_runtime_resume(struct device *dev)
return 0;
pci_fixup_device(pci_fixup_resume_early, pci_dev);
pci_enable_wake(pci_dev, PCI_D0, false);
pci_fixup_device(pci_fixup_resume, pci_dev);
pci_pm_default_resume(pci_dev);
if (prev_state == PCI_D3cold)
pci_bridge_wait_for_secondary_bus(pci_dev);
if (pm && pm->runtime_resume)
rc = pm->runtime_resume(dev);
error = pm->runtime_resume(dev);
pci_dev->runtime_d3cold = false;
return rc;
return error;
}
static int pci_pm_runtime_idle(struct device *dev)
{
struct pci_dev *pci_dev = to_pci_dev(dev);
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
int ret = 0;
/*
* If pci_dev->driver is not set (unbound), the device should
@ -1369,9 +1347,9 @@ static int pci_pm_runtime_idle(struct device *dev)
return -ENOSYS;
if (pm->runtime_idle)
ret = pm->runtime_idle(dev);
return pm->runtime_idle(dev);
return ret;
return 0;
}
static const struct dev_pm_ops pci_dev_pm_ops = {

@ -835,14 +835,16 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
return -EINVAL;
/*
* Validate current state:
* Can enter D0 from any state, but if we can only go deeper
* to sleep if we're already in a low power state
* Validate transition: We can enter D0 from any state, but if
* we're already in a low-power state, we can only go deeper. E.g.,
* we can go from D1 to D3, but we can't go directly from D3 to D1;
* we'd have to go from D3 to D0, then to D1.
*/
if (state != PCI_D0 && dev->current_state <= PCI_D3cold
&& dev->current_state > state) {
pci_err(dev, "invalid power transition (from state %d to %d)\n",
dev->current_state, state);
pci_err(dev, "invalid power transition (from %s to %s)\n",
pci_power_name(dev->current_state),
pci_power_name(state));
return -EINVAL;
}
@ -852,6 +854,12 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
return -EIO;
pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
if (pmcsr == (u16) ~0) {
pci_err(dev, "can't change power state from %s to %s (config space inaccessible)\n",
pci_power_name(dev->current_state),
pci_power_name(state));
return -EIO;
}
/*
* If we're (effectively) in D3, force entire word to 0.
@ -887,13 +895,14 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
if (state == PCI_D3hot || dev->current_state == PCI_D3hot)
pci_dev_d3_sleep(dev);
else if (state == PCI_D2 || dev->current_state == PCI_D2)
udelay(PCI_PM_D2_DELAY);
msleep(PCI_PM_D2_DELAY);
pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
if (dev->current_state != state)
pci_info_ratelimited(dev, "Refused to change power state, currently in D%d\n",
dev->current_state);
pci_info_ratelimited(dev, "refused to change power state from %s to %s\n",
pci_power_name(dev->current_state),
pci_power_name(state));
/*
* According to section 5.4.1 of the "PCI BUS POWER MANAGEMENT
@ -964,7 +973,7 @@ void pci_refresh_power_state(struct pci_dev *dev)
* @dev: PCI device to handle.
* @state: State to put the device into.
*/
static int pci_platform_power_transition(struct pci_dev *dev, pci_power_t state)
int pci_platform_power_transition(struct pci_dev *dev, pci_power_t state)
{
int error;
@ -980,6 +989,7 @@ static int pci_platform_power_transition(struct pci_dev *dev, pci_power_t state)
return error;
}
EXPORT_SYMBOL_GPL(pci_platform_power_transition);
/**
* pci_wakeup - Wake up a PCI device
@ -1003,34 +1013,70 @@ void pci_wakeup_bus(struct pci_bus *bus)
pci_walk_bus(bus, pci_wakeup, NULL);
}
/**
* __pci_start_power_transition - Start power transition of a PCI device
* @dev: PCI device to handle.
* @state: State to put the device into.
*/
static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state)
static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
{
if (state == PCI_D0) {
pci_platform_power_transition(dev, PCI_D0);
/*
* Mandatory power management transition delays, see
* PCI Express Base Specification Revision 2.0 Section
* 6.6.1: Conventional Reset. Do not delay for
* devices powered on/off by corresponding bridge,
* because have already delayed for the bridge.
*/
if (dev->runtime_d3cold) {
if (dev->d3cold_delay && !dev->imm_ready)
msleep(dev->d3cold_delay);
/*
* When powering on a bridge from D3cold, the
* whole hierarchy may be powered on into
* D0uninitialized state, resume them to give
* them a chance to suspend again
*/
pci_wakeup_bus(dev->subordinate);
int delay = 1;
u32 id;
/*
* After reset, the device should not silently discard config
* requests, but it may still indicate that it needs more time by
* responding to them with CRS completions. The Root Port will
* generally synthesize ~0 data to complete the read (except when
* CRS SV is enabled and the read was for the Vendor ID; in that
* case it synthesizes 0x0001 data).
*
* Wait for the device to return a non-CRS completion. Read the
* Command register instead of Vendor ID so we don't have to
* contend with the CRS SV value.
*/
pci_read_config_dword(dev, PCI_COMMAND, &id);
while (id == ~0) {
if (delay > timeout) {
pci_warn(dev, "not ready %dms after %s; giving up\n",
delay - 1, reset_type);
return -ENOTTY;
}
if (delay > 1000)
pci_info(dev, "not ready %dms after %s; waiting\n",
delay - 1, reset_type);
msleep(delay);
delay *= 2;
pci_read_config_dword(dev, PCI_COMMAND, &id);
}
if (delay > 1000)
pci_info(dev, "ready %dms after %s\n", delay - 1,
reset_type);
return 0;
}
/**
* pci_power_up - Put the given device into D0
* @dev: PCI device to power up
*/
int pci_power_up(struct pci_dev *dev)
{
pci_platform_power_transition(dev, PCI_D0);
/*
* Mandatory power management transition delays are handled in
* pci_pm_resume_noirq() and pci_pm_runtime_resume() of the
* corresponding bridge.
*/
if (dev->runtime_d3cold) {
/*
* When powering on a bridge from D3cold, the whole hierarchy
* may be powered on into D0uninitialized state, resume them to
* give them a chance to suspend again
*/
pci_wakeup_bus(dev->subordinate);
}
return pci_raw_set_power_state(dev, PCI_D0);
}
/**
@ -1057,27 +1103,6 @@ void pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state)
pci_walk_bus(bus, __pci_dev_set_current_state, &state);
}
/**
* __pci_complete_power_transition - Complete power transition of a PCI device
* @dev: PCI device to handle.
* @state: State to put the device into.
*
* This function should not be called directly by device drivers.
*/
int __pci_complete_power_transition(struct pci_dev *dev, pci_power_t state)
{
int ret;
if (state <= PCI_D0)
return -EINVAL;
ret = pci_platform_power_transition(dev, state);
/* Power off the bridge may power off the whole hierarchy */
if (!ret && state == PCI_D3cold)
pci_bus_set_current_state(dev->subordinate, PCI_D3cold);
return ret;
}
EXPORT_SYMBOL_GPL(__pci_complete_power_transition);
/**
* pci_set_power_state - Set the power state of a PCI device
* @dev: PCI device to handle.
@ -1118,7 +1143,8 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
if (dev->current_state == state)
return 0;
__pci_start_power_transition(dev, state);
if (state == PCI_D0)
return pci_power_up(dev);
/*
* This device is quirked not to be put into D3, so don't put it in
@ -1134,24 +1160,17 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
PCI_D3hot : state);
if (!__pci_complete_power_transition(dev, state))
error = 0;
if (pci_platform_power_transition(dev, state))
return error;
return error;
/* Powering off a bridge may power off the whole hierarchy */
if (state == PCI_D3cold)
pci_bus_set_current_state(dev->subordinate, PCI_D3cold);
return 0;
}
EXPORT_SYMBOL(pci_set_power_state);
/**
* pci_power_up - Put the given device into D0 forcibly
* @dev: PCI device to power up
*/
void pci_power_up(struct pci_dev *dev)
{
__pci_start_power_transition(dev, PCI_D0);
pci_raw_set_power_state(dev, PCI_D0);
pci_update_current_state(dev, PCI_D0);
}
/**
* pci_choose_state - Choose the power state of a PCI device
* @dev: PCI device to be suspended
@ -4431,47 +4450,6 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
}
EXPORT_SYMBOL(pci_wait_for_pending_transaction);
static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
{
int delay = 1;
u32 id;
/*
* After reset, the device should not silently discard config
* requests, but it may still indicate that it needs more time by
* responding to them with CRS completions. The Root Port will
* generally synthesize ~0 data to complete the read (except when
* CRS SV is enabled and the read was for the Vendor ID; in that
* case it synthesizes 0x0001 data).
*
* Wait for the device to return a non-CRS completion. Read the
* Command register instead of Vendor ID so we don't have to
* contend with the CRS SV value.
*/
pci_read_config_dword(dev, PCI_COMMAND, &id);
while (id == ~0) {
if (delay > timeout) {
pci_warn(dev, "not ready %dms after %s; giving up\n",
delay - 1, reset_type);
return -ENOTTY;
}
if (delay > 1000)
pci_info(dev, "not ready %dms after %s; waiting\n",
delay - 1, reset_type);
msleep(delay);
delay *= 2;
pci_read_config_dword(dev, PCI_COMMAND, &id);
}
if (delay > 1000)
pci_info(dev, "ready %dms after %s\n", delay - 1,
reset_type);
return 0;
}
/**
* pcie_has_flr - check if a device supports function level resets
* @dev: device to check
@ -4606,16 +4584,19 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)
pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
pci_dev_d3_sleep(dev);
return pci_dev_wait(dev, "PM D3->D0", PCIE_RESET_READY_POLL_MS);
return pci_dev_wait(dev, "PM D3hot->D0", PCIE_RESET_READY_POLL_MS);
}
/**
* pcie_wait_for_link - Wait until link is active or inactive
* pcie_wait_for_link_delay - Wait until link is active or inactive
* @pdev: Bridge device
* @active: waiting for active or inactive?
* @delay: Delay to wait after link has become active (in ms)
*
* Use this to wait till link becomes active or inactive.
*/
bool pcie_wait_for_link(struct pci_dev *pdev, bool active)
static bool pcie_wait_for_link_delay(struct pci_dev *pdev, bool active,
int delay)
{
int timeout = 1000;
bool ret;
@ -4652,13 +4633,144 @@ bool pcie_wait_for_link(struct pci_dev *pdev, bool active)
timeout -= 10;
}
if (active && ret)
msleep(100);
msleep(delay);
else if (ret != active)
pci_info(pdev, "Data Link Layer Link Active not %s in 1000 msec\n",
active ? "set" : "cleared");
return ret == active;
}
/**
* pcie_wait_for_link - Wait until link is active or inactive
* @pdev: Bridge device
* @active: waiting for active or inactive?
*
* Use this to wait till link becomes active or inactive.
*/
bool pcie_wait_for_link(struct pci_dev *pdev, bool active)
{
return pcie_wait_for_link_delay(pdev, active, 100);
}
/*
* Find maximum D3cold delay required by all the devices on the bus. The
* spec says 100 ms, but firmware can lower it and we allow drivers to
* increase it as well.
*
* Called with @pci_bus_sem locked for reading.
*/
static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
{
const struct pci_dev *pdev;
int min_delay = 100;
int max_delay = 0;
list_for_each_entry(pdev, &bus->devices, bus_list) {
if (pdev->d3cold_delay < min_delay)
min_delay = pdev->d3cold_delay;
if (pdev->d3cold_delay > max_delay)
max_delay = pdev->d3cold_delay;
}
return max(min_delay, max_delay);
}
/**
* pci_bridge_wait_for_secondary_bus - Wait for secondary bus to be accessible
* @dev: PCI bridge
*
* Handle necessary delays before access to the devices on the secondary
* side of the bridge are permitted after D3cold to D0 transition.
*
* For PCIe this means the delays in PCIe 5.0 section 6.6.1. For
* conventional PCI it means Tpvrh + Trhfa specified in PCI 3.0 section
* 4.3.2.
*/
void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
{
struct pci_dev *child;
int delay;
if (pci_dev_is_disconnected(dev))
return;
if (!pci_is_bridge(dev) || !dev->bridge_d3)
return;
down_read(&pci_bus_sem);
/*
* We only deal with devices that are present currently on the bus.
* For any hot-added devices the access delay is handled in pciehp
* board_added(). In case of ACPI hotplug the firmware is expected
* to configure the devices before OS is notified.
*/
if (!dev->subordinate || list_empty(&dev->subordinate->devices)) {
up_read(&pci_bus_sem);
return;
}
/* Take d3cold_delay requirements into account */
delay = pci_bus_max_d3cold_delay(dev->subordinate);
if (!delay) {
up_read(&pci_bus_sem);
return;
}
child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
bus_list);
up_read(&pci_bus_sem);
/*
* Conventional PCI and PCI-X we need to wait Tpvrh + Trhfa before
* accessing the device after reset (that is 1000 ms + 100 ms). In
* practice this should not be needed because we don't do power
* management for them (see pci_bridge_d3_possible()).
*/
if (!pci_is_pcie(dev)) {
pci_dbg(dev, "waiting %d ms for secondary bus\n", 1000 + delay);
msleep(1000 + delay);
return;
}
/*
* For PCIe downstream and root ports that do not support speeds
* greater than 5 GT/s need to wait minimum 100 ms. For higher
* speeds (gen3) we need to wait first for the data link layer to
* become active.
*
* However, 100 ms is the minimum and the PCIe spec says the
* software must allow at least 1s before it can determine that the
* device that did not respond is a broken device. There is
* evidence that 100 ms is not always enough, for example certain
* Titan Ridge xHCI controller does not always respond to
* configuration requests if we only wait for 100 ms (see
* https://bugzilla.kernel.org/show_bug.cgi?id=203885).
*
* Therefore we wait for 100 ms and check for the device presence.
* If it is still not present give it an additional 100 ms.
*/
if (!pcie_downstream_port(dev))
return;
if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
msleep(delay);
} else {
pci_dbg(dev, "waiting %d ms for downstream link, after activation\n",
delay);
if (!pcie_wait_for_link_delay(dev, true, delay)) {
/* Did not train, no need to wait any further */
return;
}
}
if (!pci_device_is_present(child)) {
pci_dbg(child, "waiting additional %d ms to become accessible\n", delay);
msleep(delay);
}
}
void pci_reset_secondary_bus(struct pci_dev *dev)
{
u16 ctrl;

@ -86,7 +86,7 @@ struct pci_platform_pm_ops {
int pci_set_platform_pm(const struct pci_platform_pm_ops *ops);
void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
void pci_refresh_power_state(struct pci_dev *dev);
void pci_power_up(struct pci_dev *dev);
int pci_power_up(struct pci_dev *dev);
void pci_disable_enabled_device(struct pci_dev *dev);
int pci_finish_runtime_suspend(struct pci_dev *dev);
void pcie_clear_root_pme_status(struct pci_dev *dev);
@ -105,6 +105,7 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev);
void pci_free_cap_save_buffers(struct pci_dev *dev);
bool pci_bridge_d3_possible(struct pci_dev *dev);
void pci_bridge_d3_update(struct pci_dev *dev);
void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev);
static inline void pci_wakeup_event(struct pci_dev *dev)
{

@ -2593,7 +2593,7 @@ static void radeon_set_suspend(struct radeonfb_info *rinfo, int suspend)
* calling pci_set_power_state()
*/
radeonfb_whack_power_state(rinfo, PCI_D2);
__pci_complete_power_transition(rinfo->pdev, PCI_D2);
pci_platform_power_transition(rinfo->pdev, PCI_D2);
} else {
printk(KERN_DEBUG "radeonfb (%s): switching to D0 state...\n",
pci_name(rinfo->pdev));

@ -74,7 +74,7 @@ static int xen_allocate_irq(struct pci_dev *pdev)
"xen-platform-pci", pdev);
}
static int platform_pci_resume(struct pci_dev *pdev)
static int platform_pci_resume(struct device *dev)
{
int err;
@ -83,7 +83,7 @@ static int platform_pci_resume(struct pci_dev *pdev)
err = xen_set_callback_via(callback_via);
if (err) {
dev_err(&pdev->dev, "platform_pci_resume failure!\n");
dev_err(dev, "platform_pci_resume failure!\n");
return err;
}
return 0;
@ -168,13 +168,17 @@ static const struct pci_device_id platform_pci_tbl[] = {
{0,}
};
static struct dev_pm_ops platform_pm_ops = {
.resume_noirq = platform_pci_resume,
};
static struct pci_driver platform_driver = {
.name = DRV_NAME,
.probe = platform_pci_probe,
.id_table = platform_pci_tbl,
#ifdef CONFIG_PM
.resume_early = platform_pci_resume,
#endif
.driver = {
.pm = &platform_pm_ops,
},
};
builtin_pci_driver(platform_driver);

@ -805,8 +805,6 @@ struct module;
* The remove function always gets called from process
* context, so it can sleep.
* @suspend: Put device into low power state.
* @suspend_late: Put device into low power state.
* @resume_early: Wake device from low power state.
* @resume: Wake device from low power state.
* (Please see Documentation/power/pci.rst for descriptions
* of PCI Power Management and the related functions.)
@ -829,8 +827,6 @@ struct pci_driver {
int (*probe)(struct pci_dev *dev, const struct pci_device_id *id); /* New device inserted */
void (*remove)(struct pci_dev *dev); /* Device removed (NULL if not a hot-plug capable driver) */
int (*suspend)(struct pci_dev *dev, pm_message_t state); /* Device suspended */
int (*suspend_late)(struct pci_dev *dev, pm_message_t state);
int (*resume_early)(struct pci_dev *dev);
int (*resume)(struct pci_dev *dev); /* Device woken up */
void (*shutdown)(struct pci_dev *dev);
int (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
@ -1232,7 +1228,7 @@ struct pci_cap_saved_state *pci_find_saved_ext_cap(struct pci_dev *dev,
int pci_add_cap_save_buffer(struct pci_dev *dev, char cap, unsigned int size);
int pci_add_ext_cap_save_buffer(struct pci_dev *dev,
u16 cap, unsigned int size);
int __pci_complete_power_transition(struct pci_dev *dev, pci_power_t state);
int pci_platform_power_transition(struct pci_dev *dev, pci_power_t state);
int pci_set_power_state(struct pci_dev *dev, pci_power_t state);
pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state);
bool pci_pme_capable(struct pci_dev *dev, pci_power_t state);
@ -2398,4 +2394,12 @@ void pci_uevent_ers(struct pci_dev *pdev, enum pci_ers_result err_type);
#define pci_info_ratelimited(pdev, fmt, arg...) \
dev_info_ratelimited(&(pdev)->dev, fmt, ##arg)
#define pci_WARN(pdev, condition, fmt, arg...) \
WARN(condition, "%s %s: " fmt, \
dev_driver_string(&(pdev)->dev), pci_name(pdev), ##arg)
#define pci_WARN_ONCE(pdev, condition, fmt, arg...) \
WARN_ONCE(condition, "%s %s: " fmt, \
dev_driver_string(&(pdev)->dev), pci_name(pdev), ##arg)
#endif /* LINUX_PCI_H */
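
For reference, the pci_WARN()/pci_WARN_ONCE() wrappers added just above behave like WARN()/WARN_ONCE() but prefix the message with driver and device identification. The sketch below is a hypothetical illustration of how a caller uses them and what they expand to; it is not code from this diff.

/* Hypothetical caller; "pdev" and "pm" are assumed to be valid. */
static void example_check_saved_state(struct pci_dev *pdev,
                                      const struct dev_pm_ops *pm)
{
        pci_WARN_ONCE(pdev, pdev->current_state != PCI_D0,
                      "PCI PM: State of device not saved by %pS\n",
                      pm->suspend);
        /*
         * Expands roughly to:
         *
         *   WARN_ONCE(pdev->current_state != PCI_D0,
         *             "%s %s: PCI PM: State of device not saved by %pS\n",
         *             dev_driver_string(&pdev->dev), pci_name(pdev),
         *             pm->suspend);
         *
         * so the warning carries the bound driver name and the device's
         * bus address, e.g. "mydrv 0000:00:1c.0: PCI PM: ...".
         */
}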