The "performance" column contains variable-width values. Hence when
their printed values contain more than one digit, all values in
successive columns become misaligned.
Fix this by formatting it as a fixed-width field. Adjust successive
spaces and field widths to retain the existing layout.
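As an illustration only (the field width and variable names below are
assumptions, not taken from the patch), the idea is along these lines:
    /* Print the performance value in a fixed-width field so that
     * multi-digit values no longer shift the columns that follow. */
    seq_printf(s, "%-12u  ", perf_state);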
Fixes: 0155aaf95a ("PM: domains: Add the domain HW-managed mode to the summary")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/e004f9d2a75e9a49c269507bb8a4514001751e85.1725459707.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The inter-column space in the debug summary is two spaces. However, in
one case, the extra space is handled implicitly in a field width
specifier. Make inter-column space explicit to ease future maintenance.
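Illustrative sketch only (the widths are assumptions): the format string
now carries the two inter-column spaces explicitly instead of hiding one
of them inside a field width:
    /* Two explicit spaces between columns, none folded into the width. */
    seq_printf(s, "%-25s  %-12s  ", genpd_name, status_str);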
Fixes: 45fbc464b0 ("PM: domains: Add "performance" column to debug summary")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/ae61eb363621b981edde878e1e74d701702a579f.1725459707.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The RK3576 SoC needs to ungate the power domains before their status can
be modified.
The values have been taken from the Rockchip downstream driver.
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Link: https://lore.kernel.org/r/20240829202732.75961-3-detlev.casanova@collabora.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Some Rockchip SoCs need to ungate power domains before their power
status can be changed.
Each power domain has a gate mask that is set to 1 to ungate it when
manipulating power status, then set back to 0 to gate it again.
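A rough sketch of the sequence (register offsets, masks and the helper
name are hypothetical):
    /* Ungate the domain before touching its power status... */
    regmap_update_bits(pmu->regmap, pd_info->gate_offset,
                       pd_info->gate_mask, pd_info->gate_mask);
    rockchip_do_pmu_set_power_domain(pd, power_on);  /* hypothetical helper */
    /* ...then gate it again afterwards. */
    regmap_update_bits(pmu->regmap, pd_info->gate_offset,
                       pd_info->gate_mask, 0);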
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Link: https://lore.kernel.org/r/20240829202732.75961-2-detlev.casanova@collabora.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Drop the OF node reference immediately after using it in
syscon_node_to_regmap(), which is both simpler and the typical/expected
code pattern.
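The resulting pattern looks roughly like this:
    regmap = syscon_node_to_regmap(node);
    of_node_put(node);  /* done with the node, drop the reference right away */
    if (IS_ERR(regmap))
        return PTR_ERR(regmap);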
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20240825183116.102953-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This variable is only used within the probe() function, so let's remove
it from the context and define it locally within the same function.
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Link: https://lore.kernel.org/r/20240825143428.556439-3-dario.binacchi@amarulasolutions.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The removed code was added to handle the case where the power domain is
already on during the driver's probing. In this use case, the "is_off"
parameter is passed as false to pm_genpd_init() to inform it not to call
the power_on() callback, as it's unnecessary to perform the hardware
power-on procedure since the power domain is already on. Therefore, with
the call to clk_bulk_prepare_enable() in probe(), the system ends up in
the same operational state as when "is_off" is passed as true and the
power_on() callback has been executed:
probe() -> is_off == true -> clk_bulk_prepare_enable() called by power_on()
probe() -> is_off == false -> clk_bulk_prepare_enable() called by probe()
Since the same logical and operational state is reached either way, there
is no need to perform different actions upon driver removal depending on
the power domain's on/off state during probing.
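A condensed sketch of the two probe() paths compared above (the helper
and field names are hypothetical):
    is_off = !domain_is_already_on(priv);  /* hypothetical helper */
    if (!is_off) {
        /* Domain already on: probe() enables the clocks itself. */
        ret = clk_bulk_prepare_enable(priv->num_clks, priv->clks);
        if (ret)
            return ret;
    }
    /* Otherwise power_on() enables them when genpd powers the domain on. */
    ret = pm_genpd_init(&priv->genpd, NULL, is_off);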
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Link: https://lore.kernel.org/r/20240825143428.556439-2-dario.binacchi@amarulasolutions.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This way, the code becomes more compact, and dev_err_probe() is used in
every error path of the probe() function.
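The error paths end up following the usual dev_err_probe() pattern; the
resource and message below are only an example:
    clk = devm_clk_get(dev, NULL);
    if (IS_ERR(clk))
        return dev_err_probe(dev, PTR_ERR(clk), "failed to get the clock\n");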
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20240825143428.556439-1-dario.binacchi@amarulasolutions.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Use scoped for_each_available_child_of_node_scoped() and
for_each_child_of_node_scoped() when iterating over
device nodes to make the code a bit simpler.
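For illustration, the scoped variant drops the of_node_put() bookkeeping
on every exit path (the per-child work is a placeholder):
    for_each_available_child_of_node_scoped(np, child) {
        ret = setup_child(child);  /* placeholder for the per-child work */
        if (ret)
            return ret;  /* no of_node_put() needed on early return */
    }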
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-1-8320722eaf39@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Use scope-based of_node_put() cleanup to simplify the code logic and
remove the need for explicit of_node_put() calls.
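Assuming the series relies on the cleanup.h based __free(device_node)
helper, the pattern looks roughly like this (the parsed property is only
an example):
    struct device_node *np __free(device_node) =
        of_parse_phandle(dev->of_node, "power-domains", 0);
    if (!np)
        return -ENODEV;
    /* of_node_put(np) runs automatically when np goes out of scope. */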
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Link: https://lore.kernel.org/r/20240821034022.27394-3-zhangzekun11@huawei.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
for_each_available_child_of_node() can be used to iterate through the
child device nodes, so the open-coded while loop is not needed. Besides,
the purpose of the while loop is to find a device_node which satisfies
the condition "child_req_np == ref_np"; the property of "child_np" can be
read directly inside for_each_available_child_of_node(). No functional
change is intended by this conversion.
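Roughly, the conversion turns the open-coded walk into something like
the following; the property name is only illustrative:
    for_each_available_child_of_node(parent, child_np) {
        child_req_np = of_parse_phandle(child_np, "required-opps", 0);
        of_node_put(child_req_np);
        if (child_req_np == ref_np)
            break;  /* child_np is the node we were looking for */
    }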
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Link: https://lore.kernel.org/r/20240821034022.27394-2-zhangzekun11@huawei.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Merge the pmdomain fixes for v6.11-rc[n] into the next branch, to allow them
to get tested together with the new changes that are targeted for v6.12.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
It has turned out that having _set_required_opps() recursively call
dev_pm_opp_set_opp() to set the required OPPs doesn't really work as well
as we expected.
More precisely, at each recursive call to dev_pm_opp_set_opp() we are
changing an OPP for a required_dev that belongs to a required-OPP table.
The problem with this is that we may have several devices sharing the same
required-OPP table, which leads to incorrect behaviour with regard to
aggregating the per-device votes.
To fix the problem for a required-OPP table belonging to a PM domain, which
is the only existing use case for now, let's simply replace the call to
dev_pm_opp_set_opp() in _set_required_opps() with a call to _set_opp_level().
Moving forward we may potentially need to add support for other types of
required-OPP tables. In that case, the aggregation needs to be thought
through.
Fixes: e37440e7e2 ("OPP: Call dev_pm_opp_set_opp() for required OPPs")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20240822224547.385095-2-ulf.hansson@linaro.org
Use scoped for_each_child_of_node_scoped() when iterating over device
nodes to make the code a bit simpler.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240816150931.142208-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Use scoped for_each_child_of_node_scoped() when iterating over device
nodes to make the code a bit simpler.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240816150931.142208-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The sparse tool complains as follows:
drivers/pmdomain/apple/pmgr-pwrstate.c:180:32: warning:
symbol 'apple_pmgr_reset_ops' was not declared. Should it be static?
This symbol is not used outside of pmgr-pwrstate.c, so mark it static.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20240819115956.3884847-1-ruanjinjie@huawei.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Add the devres-enabled version of dev_pm_domain_attach|detach_list.
If client drivers use devm_pm_domain_attach_list() to attach the PM domains,
devm_pm_domain_detach_list() will be invoked implicitly during the remove
phase.
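A minimal usage sketch (the domain names are made up, and the struct
layout is as assumed here):
    static const char * const pd_names[] = { "cx", "mx" };
    struct dev_pm_domain_attach_data pd_data = {
        .pd_names = pd_names,
        .num_pd_names = ARRAY_SIZE(pd_names),
    };
    struct dev_pm_domain_list *pd_list;
    int ret;
    ret = devm_pm_domain_attach_list(dev, &pd_data, &pd_list);
    if (ret < 0)
        return ret;
    /* The PM domains are detached automatically when the driver unbinds. */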
Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Link: https://lore.kernel.org/r/1724063350-11993-2-git-send-email-quic_dikshita@quicinc.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Merge the immutable branch dt into next, to allow the DT bindings to be
tested together with changes that are targeted for v6.12.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Merge the pmdomain fixes for v6.11-rc[n] into the next branch, to allow them
to get tested together with the new changes that are targeted for v6.12.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
With "quiet" set in bootargs, there is power domain failure:
"imx93_power_domain 44462400.power-domain: pd_off timeout: name:
44462400.power-domain, stat: 4"
The current power on opertation takes ISO state as power on finished
flag, but it is wrong. Before powering on operation really finishes,
powering off comes and powering off will never finish because the last
powering on still not finishes, so the following powering off actually
not trigger hardware state machine to run. SSAR is the last step when
powering on a domain, so need to wait SSAR done when powering on.
Since EdgeLock Enclave(ELE) handshake is involved in the flow, enlarge
the waiting time to 10ms for both on and off to avoid timeout.
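Conceptually, the power-on path now polls for SSAR completion instead of
the ISO state; the register and bit names below are placeholders, and the
10ms matches the enlarged timeout:
    ret = regmap_read_poll_timeout(regmap, SLICE_FUNC_STAT, val,
                                   !(val & SSAR_STAT_MASK), 1, 10000);
    if (ret)
        dev_err(dev, "pd_on timeout: stat: %x\n", val);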
Cc: stable@vger.kernel.org
Fixes: 0a0f7cc25d ("soc: imx: add i.MX93 SRC power domain driver")
Reviewed-by: Jacky Bai <ping.bai@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20240814124740.2778952-1-peng.fan@oss.nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Set the GENPD_FLAG_ACTIVE_WAKEUP flag for the rpi_power genpd, so that
when a device is set as a wakeup source using device_set_wakeup_enable(),
the power domain can be kept on to make sure the device can wake up the
system.
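Sketch of the provider and consumer sides (the struct member access is
simplified):
    /* Provider: keep the domain on while an attached device is wakeup-enabled. */
    rpi_domain->base.flags = GENPD_FLAG_ACTIVE_WAKEUP;
    /* Consumer: mark the device as a wakeup source. */
    device_set_wakeup_enable(dev, true);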
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20240728114200.75559-6-wahrenst@gmx.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The Raspberry Pi power driver heavily relies on the logging of the
underlying firmware driver. This results in disadvantages like unspecific
error messages or limited debugging options. So implement logging for
the most important function, rpi_firmware_set_power().
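A minimal sketch of what such logging can look like; the message text and
packet fields are illustrative, not the actual patch:
    ret = rpi_firmware_property(fw, RPI_FIRMWARE_SET_POWER_STATE,
                                &packet, sizeof(packet));
    if (ret)
        dev_err(dev, "failed to set power state %u for domain %u: %d\n",
                packet.state, packet.domain, ret);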
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20240728114200.75559-5-wahrenst@gmx.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
According to the official Mailbox property interface, the second part
of RPI_FIRMWARE_SET_POWER_STATE (and so on ...) is named "state" because
it represents u32 flags and just the lowest bit is for on/off. So rename
it to align with the documentation and prepare the driver for further
changes.
Link: https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240728114200.75559-4-wahrenst@gmx.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Merge the pmdomain fixes for v6.11-rc[n] into the next branch, to allow them
to get tested together with the new changes that are targeted for v6.12.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
To enable the domain-idle-states to be used during s2idle on a PREEMPT_RT
based configuration, let's allow the re-assignment of the ->enter_s2idle()
callback to psci_enter_s2idle_domain_idle_state().
Similar to s2ram, let's leave the support for CPU hotplug outside
PREEMPT_RT, as it depends on using runtime PM. For s2idle, this means
that an offline CPU's PM domain will remain powered-on. In practice this
may lead to a shallower idle-state than necessary being selected, which
shouldn't be an issue (besides wasting power).
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-8-ulf.hansson@linaro.org
The hierarchical PM domain topology is currently disabled on a PREEMPT_RT
based configuration. As a first step to enable it to be used, let's try to
attach the CPU devices to their PM domains on PREEMPT_RT. In this way the
syscore ops becomes available, allowing the PM domain topology to be
managed during s2ram.
For the moment let's leave the support for CPU hotplug outside PREEMPT_RT,
as it depends on using runtime PM. For s2ram, this isn't a problem as
all CPUs are managed via the syscore ops.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-7-ulf.hansson@linaro.org
When using the hierarchical topology and PSCI OSI-mode we may end up
overriding the deepest idle-state's ->enter|enter_s2idle() callbacks, but
there is no point in also re-assigning the CPUIDLE_FLAG_RCU_IDLE for the
idle-state in question, as that has already been set when parsing the
states from DT. See init_state_node().
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-6-ulf.hansson@linaro.org
The domain-idle-states are currently disabled on a PREEMPT_RT based
configuration for the cpuidle-psci-domain. To enable them to be used for
system-wide suspend and in particular during s2idle, let's set the
GENPD_FLAG_RPM_ALWAYS_ON instead of GENPD_FLAG_ALWAYS_ON for the
corresponding genpd provider.
In this way, the runtime PM path remains disabled in genpd for its attached
devices, while powering-on/off the PM domain during system-wide suspend
becomes allowed.
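A minimal sketch of the flag selection, assuming a PREEMPT_RT guard along
these lines:
    if (IS_ENABLED(CONFIG_PREEMPT_RT))
        pd->flags |= GENPD_FLAG_RPM_ALWAYS_ON;  /* was GENPD_FLAG_ALWAYS_ON */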
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-5-ulf.hansson@linaro.org
Using kobject_get_path() means a dynamic memory allocation gets done, which
doesn't work on a PREEMPT_RT based configuration while holding genpd's raw
spinlock.
To fix the problem, let's convert to using the simpler dev_name(). This
means the information about the path doesn't get presented in debugfs, but
hopefully this shouldn't be an issue.
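For illustration, the change boils down to the following; the field width
is an assumption:
    /* Before: allocates memory, not allowed under a raw spinlock on PREEMPT_RT. */
    kobj_path = kobject_get_path(&dev->kobj, GFP_KERNEL);
    /* After: no allocation needed. */
    seq_printf(s, "%-50s  ", dev_name(dev));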
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-4-ulf.hansson@linaro.org
There is no need to hold the genpd-lock while assigning the
dev->pm_domain. In fact, it becomes a problem on a PREEMPT_RT based
configuration as the genpd-lock may be a raw spinlock, while the lock
acquired through the call to dev_pm_domain_set() is a regular spinlock.
To fix the problem, let's simply move the calls to dev_pm_domain_set()
outside the genpd-lock.
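Condensed sketch of the resulting ordering (lock helpers named as in the
genpd core, the rest simplified):
    genpd_lock(genpd);
    /* ...add the device to the domain's data under the (possibly raw) lock... */
    genpd_unlock(genpd);
    /* Called outside the genpd lock: dev_pm_domain_set() takes a regular spinlock. */
    dev_pm_domain_set(dev, &genpd->domain);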
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-3-ulf.hansson@linaro.org
To allow a genpd provider for a CPU PM domain to enter a domain-idle-state
during s2idle on a PREEMPT_RT based configuration, we can't use a regular
spinlock, as those are turned into sleepable locks on PREEMPT_RT.
To address this problem, let's convert to using a raw spinlock, but
only for genpd providers that have the GENPD_FLAG_CPU_DOMAIN bit set. In
this way, the lock can still be acquired/released in atomic context, which
is needed in the idle-path for PREEMPT_RT.
Do note that the genpd power-on/off notifiers may also be fired during
s2idle, but these are already prepared for PREEMPT_RT as they are based on
the raw notifiers. However, consumers of them may need to adapt accordingly
to work properly on PREEMPT_RT.
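Illustrative sketch only (the field names are hypothetical); what matters
is the lock type chosen per provider:
    if (genpd->flags & GENPD_FLAG_CPU_DOMAIN)
        raw_spin_lock_init(&genpd->raw_slock);  /* usable in atomic context on RT */
    else
        spin_lock_init(&genpd->slock);          /* becomes a sleeping lock on RT */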
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-2-ulf.hansson@linaro.org
meson-gx-pwrc-vpu has been superseded by meson-ee-pwrc since
commit 53773f2dfd ("soc: amlogic: meson-ee-pwrc: add support for the Meson GX SoCs"),
i.e. since v5.8.
This driver is obsolete and no longer used or tested.
There is no reason to keep it around, so remove it.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20240730130227.712894-1-jbrunet@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The kernel sleep profile is no longer working due to a recursive locking
bug introduced by commit 42a20f86dc ("sched: Add wrapper for get_wchan()
to keep task blocked")
Booting with the 'profile=sleep' kernel command line option added or
executing
# echo -n sleep > /sys/kernel/profiling
after boot causes the system to lock up.
Lockdep reports
kthreadd/3 is trying to acquire lock:
ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70
but task is already holding lock:
ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370
with the call trace being
lock_acquire+0xc8/0x2f0
get_wchan+0x32/0x70
__update_stats_enqueue_sleeper+0x151/0x430
enqueue_entity+0x4b0/0x520
enqueue_task_fair+0x92/0x6b0
ttwu_do_activate+0x73/0x140
try_to_wake_up+0x213/0x370
swake_up_locked+0x20/0x50
complete+0x2f/0x40
kthread+0xfb/0x180
However, since nobody noticed this regression for more than two years,
let's remove 'profile=sleep' support based on the assumption that nobody
needs this functionality.
Fixes: 42a20f86dc ("sched: Add wrapper for get_wchan() to keep task blocked")
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
- Prevent a deadlock on cpu_hotplug_lock in the aperf/mperf driver.
A recent change in the ACPI code which consolidated code paths moved
the invocation of init_freq_invariance_cppc() into a CPU hotplug
handler. The first invocation on AMD CPUs ends up enabling a static
branch which deadlocks, because the static branch enable tries to
acquire cpu_hotplug_lock but that lock is already held for write by
the hotplug machinery.
Use static_branch_enable_cpuslocked() instead and take the hotplug
lock read for the Intel code path, which is invoked from the
architecture code outside of the CPU hotplug operations (see the
sketch after this list).
- Fix the number of reserved bits in the sev_config structure bit field
so that the bitfield does not exceed 64 bit.
- Add missing Zen5 model numbers
- Fix the alignment assumptions of pti_clone_pgtable() and
clone_entry_text() on 32-bit:
The code assumes PMD aligned code sections, but on 32-bit the kernel
entry text is not PMD aligned. So depending on the code size and
location, which is configuration and compiler dependent, entry text
can cross a PMD boundary. As the start is not PMD aligned adding PMD
size to the start address is larger than the end address which
results in partially mapped entry code for user space. That causes
endless recursion on the first entry from userspace (usually #PF).
Cure this by aligning the start address in the addition so it ends up
at the next PMD start address.
clone_entry_text() enforces PMD mapping, but on 32-bit the tail might
eventually be PTE mapped, which causes a map fail because the PMD for
the tail is not a large page mapping. Use PTI_LEVEL_KERNEL_IMAGE for
the clone() invocation which resolves to PTE on 32-bit and PMD on
64-bit.
- Zero the 8-byte case for get_user() on range check failure on 32-bit
The recent consolidation of the 8-byte get_user() case broke the
zeroing in the failure case again. Establish it by clearing ECX
before the range check and not afterwards, as that obviously can't be
reached when the range check fails.
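As an illustration of the first fix above, enabling a static key from
within a CPU hotplug callback has to use the cpuslocked variant, since
cpu_hotplug_lock is already held there. The key name below is an
assumption about the aperf/mperf code, not taken from the patch:
    /* Called from a CPU hotplug callback: cpu_hotplug_lock is already held,
     * so the regular static_branch_enable() would deadlock. */
    static_branch_enable_cpuslocked(&arch_scale_freq_key);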
* tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit
x86/mm: Fix pti_clone_entry_text() for i386
x86/mm: Fix pti_clone_pgtable() alignment assumption
x86/setup: Parse the builtin command line before merging
x86/CPU/AMD: Add models 0x60-0x6f to the Zen5 range
x86/sev: Fix __reserved field in sev_config
x86/aperfmperf: Fix deadlock on cpu_hotplug_lock
Merge tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"Two fixes for the timer/clocksource code:
- The recent fix to make the take over of the broadcast timer more
reliable retrieves a per CPU pointer in preemptible context.
This went unnoticed in testing as some compilers hoist the access
into the non-preemptible section where the pointer is actually
used, but obviously compilers can rightfully invoke it where the
code put it.
Move it into the non-preemptible section right to the actual usage
side to cure it.
- The clocksource watchdog is supposed to emit a warning when the
retry count is greater than one and the number of retries reaches
the limit.
The condition is backwards and warns always when the count is
greater than one. Fixup the condition to prevent spamming dmesg"
* tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()
tick/broadcast: Move per CPU pointer access into the atomic section
Merge tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
- When stime is larger than rtime due to accounting imprecision, then
utime = rtime - stime becomes negative. As this is unsigned math, the
result becomes a huge positive number.
Cure it by resetting stime to rtime in that case, so utime becomes 0
(see the sketch after this list).
- Restore consistent state when sched_cpu_deactivate() fails.
When offlining a CPU fails in sched_cpu_deactivate() after the SMT
present counter has been decremented, then the function aborts but
fails to increment the SMT present counter and leaves it imbalanced.
Consecutive operations cause it to underflow. Add the missing fixup
for the error path.
For SMT accounting, the runqueue needs to be marked online again in the
error exit path to restore consistent state.
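For illustration of the first fix, the clamp described above boils down
to something like this; the variable names follow the description rather
than the actual patch:
    /* If accounting imprecision makes stime exceed rtime, clamp it so that
     * utime = rtime - stime cannot underflow to a huge unsigned value. */
    if (stime > rtime)
        stime = rtime;
    utime = rtime - stime;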
* tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
sched/core: Introduce sched_set_rq_on/offline() helper
sched/smt: Fix unbalance sched_smt_present dec/inc
sched/smt: Introduce sched_smt_present_inc/dec() helper
sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
Merge tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fixes from Thomas Gleixner:
- Move the smp_processor_id() invocation back into the non-preemptible
region, so that the result is valid to use
- Add the missing package C2 residency counters for Sierra Forest CPUs
to make the newly added support actually useful
* tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix smp_processor_id()-in-preemptible warnings
perf/x86/intel/cstate: Add pkg C2 residency counter for Sierra Forest