From 1f6620f87006dc02c608466cd990778aaadf386a Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Wed, 14 Oct 2020 09:56:28 +0530 Subject: [PATCH 01/10] opp: Don't always remove static OPPs in _of_add_opp_table_v1() The patch missed returning 0 early in case of success and hence the static OPPs got removed by mistake. Fix it. Fixes: 90d46d71cce2 ("opp: Handle multiple calls for same OPP table in _of_add_opp_table_v1()") Reported-by: Aisheng Dong Signed-off-by: Viresh Kumar Tested-by: Dong Aisheng --- drivers/opp/of.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/opp/of.c b/drivers/opp/of.c index 874b58756220..9faeb83e4b32 100644 --- a/drivers/opp/of.c +++ b/drivers/opp/of.c @@ -944,6 +944,8 @@ static int _of_add_opp_table_v1(struct device *dev, struct opp_table *opp_table) nr -= 2; } + return 0; + remove_static_opp: _opp_remove_all_static(opp_table); From 47efcbcb340cca5d3b3d515964f09e1fec599a29 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Tue, 20 Oct 2020 16:03:15 +0530 Subject: [PATCH 02/10] opp: Fix early exit from dev_pm_opp_register_set_opp_helper() We returned earlier by mistake even when there were no failures. Fix it. Fixes: dd461cd9183f ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER") Reported-by: Naresh Kamboju Signed-off-by: Viresh Kumar Tested-by: Naresh Kamboju --- drivers/opp/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/opp/core.c b/drivers/opp/core.c index 2483e765318a..4ac4e7ce6b8b 100644 --- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -1930,7 +1930,7 @@ struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev, return ERR_PTR(-EINVAL); opp_table = dev_pm_opp_get_opp_table(dev); - if (!IS_ERR(opp_table)) + if (IS_ERR(opp_table)) return opp_table; /* This should be called before OPPs are initialized */ From e0df59de670b48a923246fae1f972317b84b2764 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Thu, 22 Oct 2020 12:26:08 +0530 Subject: [PATCH 03/10] opp: Reduce the size of critical section in _opp_table_kref_release() There is a lot of stuff here which can be done outside of the big opp_table_lock, do that. This helps avoiding few circular dependency lockdeps around debugfs and interconnects. Reported-by: Rob Clark Reported-by: Dmitry Osipenko Signed-off-by: Viresh Kumar --- drivers/opp/core.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/opp/core.c b/drivers/opp/core.c index 4ac4e7ce6b8b..0e0a5269dc82 100644 --- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -1181,6 +1181,10 @@ static void _opp_table_kref_release(struct kref *kref) struct opp_device *opp_dev, *temp; int i; + /* Drop the lock as soon as we can */ + list_del(&opp_table->node); + mutex_unlock(&opp_table_lock); + _of_clear_opp_table(opp_table); /* Release clk */ @@ -1208,10 +1212,7 @@ static void _opp_table_kref_release(struct kref *kref) mutex_destroy(&opp_table->genpd_virt_dev_lock); mutex_destroy(&opp_table->lock); - list_del(&opp_table->node); kfree(opp_table); - - mutex_unlock(&opp_table_lock); } void dev_pm_opp_put_opp_table(struct opp_table *opp_table) From 23a881852f3eff6a7ba8d240b57de076763fdef9 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Fri, 30 Oct 2020 12:51:08 +0530 Subject: [PATCH 04/10] cpufreq: schedutil: Don't skip freq update if need_freq_update is set The cpufreq policy's frequency limits (min/max) can get changed at any point of time, while schedutil is trying to update the next frequency. Though the schedutil governor has necessary locking and support in place to make sure we don't miss any of those updates, there is a corner case where the governor will find that the CPU is already running at the desired frequency and so may skip an update. For example, consider that the CPU can run at 1 GHz, 1.2 GHz and 1.4 GHz and is running at 1 GHz currently. Schedutil tries to update the frequency to 1.2 GHz, during this time the policy limits get changed as policy->min = 1.4 GHz. As schedutil (and cpufreq core) does clamp the frequency at various instances, we will eventually set the frequency to 1.4 GHz, while we will save 1.2 GHz in sg_policy->next_freq. Now lets say the policy limits get changed back at this time with policy->min as 1 GHz. The next time schedutil is invoked by the scheduler, we will reevaluate the next frequency (because need_freq_update will get set due to limits change event) and lets say we want to set the frequency to 1.2 GHz again. At this point sugov_update_next_freq() will find the next_freq == current_freq and will abort the update, while the CPU actually runs at 1.4 GHz. Until now need_freq_update was used as a flag to indicate that the policy's frequency limits have changed, and that we should consider the new limits while reevaluating the next frequency. This patch fixes the above mentioned issue by extending the purpose of the need_freq_update flag. If this flag is set now, the schedutil governor will not try to abort a frequency change even if next_freq == current_freq. As similar behavior is required in the case of CPUFREQ_NEED_UPDATE_LIMITS flag as well, need_freq_update will never be set to false if that flag is set for the driver. We also don't need to consider the need_freq_update flag in sugov_update_single() anymore to handle the special case of busy CPU, as we won't abort a frequency update anymore. Reported-by: zhuguangqing Suggested-by: Rafael J. Wysocki Signed-off-by: Viresh Kumar [ rjw: Rearrange code to avoid a branch ] Signed-off-by: Rafael J. Wysocki --- kernel/sched/cpufreq_schedutil.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index c03a5775d019..d73bccde2720 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -102,9 +102,12 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time) static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time, unsigned int next_freq) { - if (sg_policy->next_freq == next_freq && - !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS)) - return false; + if (!sg_policy->need_freq_update) { + if (sg_policy->next_freq == next_freq) + return false; + } else { + sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS); + } sg_policy->next_freq = next_freq; sg_policy->last_freq_update_time = time; @@ -162,11 +165,9 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy, freq = map_util_freq(util, freq, max); - if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update && - !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS)) + if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update) return sg_policy->next_freq; - sg_policy->need_freq_update = false; sg_policy->cached_raw_freq = freq; return cpufreq_driver_resolve_freq(policy, freq); } @@ -442,7 +443,6 @@ static void sugov_update_single(struct update_util_data *hook, u64 time, struct sugov_policy *sg_policy = sg_cpu->sg_policy; unsigned long util, max; unsigned int next_f; - bool busy; unsigned int cached_freq = sg_policy->cached_raw_freq; sugov_iowait_boost(sg_cpu, time, flags); @@ -453,9 +453,6 @@ static void sugov_update_single(struct update_util_data *hook, u64 time, if (!sugov_should_update_freq(sg_policy, time)) return; - /* Limits may have changed, don't skip frequency update */ - busy = !sg_policy->need_freq_update && sugov_cpu_is_busy(sg_cpu); - util = sugov_get_util(sg_cpu); max = sg_cpu->max; util = sugov_iowait_apply(sg_cpu, time, util, max); @@ -464,7 +461,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time, * Do not reduce the frequency if the CPU has not been idle * recently, as the reduction is likely to be premature then. */ - if (busy && next_f < sg_policy->next_freq) { + if (sugov_cpu_is_busy(sg_cpu) && next_f < sg_policy->next_freq) { next_f = sg_policy->next_freq; /* Restore cached freq as next_freq has changed */ @@ -829,9 +826,10 @@ static int sugov_start(struct cpufreq_policy *policy) sg_policy->next_freq = 0; sg_policy->work_in_progress = false; sg_policy->limits_changed = false; - sg_policy->need_freq_update = false; sg_policy->cached_raw_freq = 0; + sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS); + for_each_cpu(cpu, policy->cpus) { struct sugov_cpu *sg_cpu = &per_cpu(sugov_cpu, cpu); From 6841ca15fe13038b9d27f8e7168700e1427b7a72 Mon Sep 17 00:00:00 2001 From: Julia Lawall Date: Sat, 31 Oct 2020 10:39:28 +0100 Subject: [PATCH 05/10] Documentation: PM: cpuidle: correct typo cerainly -> certainly Signed-off-by: Julia Lawall Signed-off-by: Rafael J. Wysocki --- Documentation/admin-guide/pm/cpuidle.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst index 37940a0584ec..a26a94bc0071 100644 --- a/Documentation/admin-guide/pm/cpuidle.rst +++ b/Documentation/admin-guide/pm/cpuidle.rst @@ -494,7 +494,7 @@ object corresponding to it, as follows: residency. ``below`` - Total number of times this idle state had been asked for, but cerainly + Total number of times this idle state had been asked for, but certainly a deeper idle state would have been a better match for the observed idle duration. From 23d18dcfc5275fbd53a515a4a1cf946b22fe7463 Mon Sep 17 00:00:00 2001 From: Julia Lawall Date: Sat, 31 Oct 2020 10:39:39 +0100 Subject: [PATCH 06/10] Documentation: PM: cpuidle: correct path name cpu/ is needed before cpu/ Signed-off-by: Julia Lawall Signed-off-by: Rafael J. Wysocki --- Documentation/admin-guide/pm/cpuidle.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst index a26a94bc0071..10fde58d0869 100644 --- a/Documentation/admin-guide/pm/cpuidle.rst +++ b/Documentation/admin-guide/pm/cpuidle.rst @@ -478,7 +478,7 @@ order to ask the hardware to enter that state. Also, for each statistics of the given idle state. That information is exposed by the kernel via ``sysfs``. -For each CPU in the system, there is a :file:`/sys/devices/system/cpu/cpuidle/` +For each CPU in the system, there is a :file:`/sys/devices/system/cpu/cpu/cpuidle/` directory in ``sysfs``, where the number ```` is assigned to the given CPU at the initialization time. That directory contains a set of subdirectories called :file:`state0`, :file:`state1` and so on, up to the number of idle state From a8193af7ec0db73c0c86d02a5d5ffc1dc2ab4e88 Mon Sep 17 00:00:00 2001 From: Tom Rix Date: Sun, 1 Nov 2020 06:11:29 -0800 Subject: [PATCH 07/10] powercap/intel_rapl: remove unneeded semicolon A semicolon is not needed after a switch statement. Signed-off-by: Tom Rix Signed-off-by: Rafael J. Wysocki --- drivers/powercap/intel_rapl_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c index 0b2830efc574..70d6d52bc1e2 100644 --- a/drivers/powercap/intel_rapl_common.c +++ b/drivers/powercap/intel_rapl_common.c @@ -620,7 +620,7 @@ static u64 rapl_unit_xlate(struct rapl_domain *rd, enum unit_type type, case ARBITRARY_UNIT: default: return value; - }; + } if (to_raw) return div64_u64(value, units) * scale; From e0e398e204634db8fb71bd89cf2f6e3e5bd09b51 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Wed, 21 Oct 2020 21:12:15 +0200 Subject: [PATCH 08/10] PM: runtime: Drop runtime PM references to supplier on link removal While removing a device link, drop the supplier device's runtime PM usage counter as many times as needed to drop all of the runtime PM references to it from the consumer in addition to dropping the consumer's link count. Fixes: baa8809f6097 ("PM / runtime: Optimize the use of device links") Signed-off-by: Rafael J. Wysocki Cc: 5.1+ # 5.1+ Tested-by: Xiang Chen Reviewed-by: Greg Kroah-Hartman --- drivers/base/core.c | 6 ++---- drivers/base/power/runtime.c | 21 ++++++++++++++++++++- include/linux/pm_runtime.h | 4 ++-- 3 files changed, 24 insertions(+), 7 deletions(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index 78114ddac755..d661ada1518f 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -773,8 +773,7 @@ static void __device_link_del(struct kref *kref) dev_dbg(link->consumer, "Dropping the link to %s\n", dev_name(link->supplier)); - if (link->flags & DL_FLAG_PM_RUNTIME) - pm_runtime_drop_link(link->consumer); + pm_runtime_drop_link(link); list_del_rcu(&link->s_node); list_del_rcu(&link->c_node); @@ -788,8 +787,7 @@ static void __device_link_del(struct kref *kref) dev_info(link->consumer, "Dropping the link to %s\n", dev_name(link->supplier)); - if (link->flags & DL_FLAG_PM_RUNTIME) - pm_runtime_drop_link(link->consumer); + pm_runtime_drop_link(link); list_del(&link->s_node); list_del(&link->c_node); diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index 6f605f7820bb..6919f7fc226b 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1729,7 +1729,7 @@ void pm_runtime_new_link(struct device *dev) spin_unlock_irq(&dev->power.lock); } -void pm_runtime_drop_link(struct device *dev) +static void pm_runtime_drop_link_count(struct device *dev) { spin_lock_irq(&dev->power.lock); WARN_ON(dev->power.links_count == 0); @@ -1737,6 +1737,25 @@ void pm_runtime_drop_link(struct device *dev) spin_unlock_irq(&dev->power.lock); } +/** + * pm_runtime_drop_link - Prepare for device link removal. + * @link: Device link going away. + * + * Drop the link count of the consumer end of @link and decrement the supplier + * device's runtime PM usage counter as many times as needed to drop all of the + * PM runtime reference to it from the consumer. + */ +void pm_runtime_drop_link(struct device_link *link) +{ + if (!(link->flags & DL_FLAG_PM_RUNTIME)) + return; + + pm_runtime_drop_link_count(link->consumer); + + while (refcount_dec_not_one(&link->rpm_active)) + pm_runtime_put(link->supplier); +} + static bool pm_runtime_need_not_resume(struct device *dev) { return atomic_read(&dev->power.usage_count) <= 1 && diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h index 18b02dcc168e..eadc1fdebce6 100644 --- a/include/linux/pm_runtime.h +++ b/include/linux/pm_runtime.h @@ -58,7 +58,7 @@ extern void pm_runtime_clean_up_links(struct device *dev); extern void pm_runtime_get_suppliers(struct device *dev); extern void pm_runtime_put_suppliers(struct device *dev); extern void pm_runtime_new_link(struct device *dev); -extern void pm_runtime_drop_link(struct device *dev); +extern void pm_runtime_drop_link(struct device_link *link); /** * pm_runtime_get_if_in_use - Conditionally bump up runtime PM usage counter. @@ -280,7 +280,7 @@ static inline void pm_runtime_clean_up_links(struct device *dev) {} static inline void pm_runtime_get_suppliers(struct device *dev) {} static inline void pm_runtime_put_suppliers(struct device *dev) {} static inline void pm_runtime_new_link(struct device *dev) {} -static inline void pm_runtime_drop_link(struct device *dev) {} +static inline void pm_runtime_drop_link(struct device_link *link) {} #endif /* !CONFIG_PM */ From d6e36668598154820177bfd78c1621d8e6c580a2 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Wed, 21 Oct 2020 21:13:10 +0200 Subject: [PATCH 09/10] PM: runtime: Drop pm_runtime_clean_up_links() After commit d12544fb2aa9 ("PM: runtime: Remove link state checks in rpm_get/put_supplier()") nothing prevents the consumer device's runtime PM from acquiring additional references to the supplier device after pm_runtime_clean_up_links() has run (or even while it is running), so calling this function from __device_release_driver() may be pointless (or even harmful). Moreover, it ignores stateless device links, so the runtime PM handling of managed and stateless device links is inconsistent because of it, so better get rid of it entirely. Fixes: d12544fb2aa9 ("PM: runtime: Remove link state checks in rpm_get/put_supplier()") Signed-off-by: Rafael J. Wysocki Cc: 5.1+ # 5.1+ Tested-by: Xiang Chen Reviewed-by: Greg Kroah-Hartman --- drivers/base/dd.c | 1 - drivers/base/power/runtime.c | 36 ------------------------------------ include/linux/pm_runtime.h | 2 -- 3 files changed, 39 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index b42229b74fd6..122b0372fdc9 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -1133,7 +1133,6 @@ static void __device_release_driver(struct device *dev, struct device *parent) } pm_runtime_get_sync(dev); - pm_runtime_clean_up_links(dev); driver_sysfs_remove(dev); diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index 6919f7fc226b..bfda153b1a41 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1642,42 +1642,6 @@ void pm_runtime_remove(struct device *dev) pm_runtime_reinit(dev); } -/** - * pm_runtime_clean_up_links - Prepare links to consumers for driver removal. - * @dev: Device whose driver is going to be removed. - * - * Check links from this device to any consumers and if any of them have active - * runtime PM references to the device, drop the usage counter of the device - * (as many times as needed). - * - * Links with the DL_FLAG_MANAGED flag unset are ignored. - * - * Since the device is guaranteed to be runtime-active at the point this is - * called, nothing else needs to be done here. - * - * Moreover, this is called after device_links_busy() has returned 'false', so - * the status of each link is guaranteed to be DL_STATE_SUPPLIER_UNBIND and - * therefore rpm_active can't be manipulated concurrently. - */ -void pm_runtime_clean_up_links(struct device *dev) -{ - struct device_link *link; - int idx; - - idx = device_links_read_lock(); - - list_for_each_entry_rcu(link, &dev->links.consumers, s_node, - device_links_read_lock_held()) { - if (!(link->flags & DL_FLAG_MANAGED)) - continue; - - while (refcount_dec_not_one(&link->rpm_active)) - pm_runtime_put_noidle(dev); - } - - device_links_read_unlock(idx); -} - /** * pm_runtime_get_suppliers - Resume and reference-count supplier devices. * @dev: Consumer device. diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h index eadc1fdebce6..4b708f4e8eed 100644 --- a/include/linux/pm_runtime.h +++ b/include/linux/pm_runtime.h @@ -54,7 +54,6 @@ extern u64 pm_runtime_autosuspend_expiration(struct device *dev); extern void pm_runtime_update_max_time_suspended(struct device *dev, s64 delta_ns); extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable); -extern void pm_runtime_clean_up_links(struct device *dev); extern void pm_runtime_get_suppliers(struct device *dev); extern void pm_runtime_put_suppliers(struct device *dev); extern void pm_runtime_new_link(struct device *dev); @@ -276,7 +275,6 @@ static inline u64 pm_runtime_autosuspend_expiration( struct device *dev) { return 0; } static inline void pm_runtime_set_memalloc_noio(struct device *dev, bool enable){} -static inline void pm_runtime_clean_up_links(struct device *dev) {} static inline void pm_runtime_get_suppliers(struct device *dev) {} static inline void pm_runtime_put_suppliers(struct device *dev) {} static inline void pm_runtime_new_link(struct device *dev) {} From 9226c504e364158a17a68ff1fe9d67d266922f50 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Thu, 22 Oct 2020 17:38:22 +0200 Subject: [PATCH 10/10] PM: runtime: Resume the device earlier in __device_release_driver() Since the device is resumed from runtime-suspend in __device_release_driver() anyway, it is better to do that before looking for busy managed device links from it to consumers, because if there are any, device_links_unbind_consumers() will be called and it will cause the consumer devices' drivers to unbind, so the consumer devices will be runtime-resumed. In turn, resuming each consumer device will cause the supplier to be resumed and when the runtime PM references from the given consumer to it are dropped, it may be suspended. Then, the runtime-resume of the next consumer will cause the supplier to resume again and so on. Update the code accordingly. Signed-off-by: Rafael J. Wysocki Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support") Cc: All applicable # All applicable Tested-by: Xiang Chen Reviewed-by: Greg Kroah-Hartman --- drivers/base/dd.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 122b0372fdc9..148e81969e04 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -1117,6 +1117,8 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv = dev->driver; if (drv) { + pm_runtime_get_sync(dev); + while (device_links_busy(dev)) { __device_driver_unlock(dev, parent); @@ -1128,12 +1130,12 @@ static void __device_release_driver(struct device *dev, struct device *parent) * have released the driver successfully while this one * was waiting, so check for that. */ - if (dev->driver != drv) + if (dev->driver != drv) { + pm_runtime_put(dev); return; + } } - pm_runtime_get_sync(dev); - driver_sysfs_remove(dev); if (dev->bus)