linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-30 23:02:08 +00:00

Johannes Weiner 1b69ac6b40 psi: fix aggregation idle shut-off	2019-02-01 15:46:23 -08:00

psi has provisions to shut off the periodic aggregation worker when there is a period of no task activity - and thus no data that needs aggregating. However, while developing psi monitoring, Suren noticed that the aggregation clock currently won't stay shut off for good.

Debugging this revealed a flaw in the idle design: an aggregation run will see no task activity and decide to go to sleep; shortly thereafter, the kworker thread that executed the aggregation will go idle and cause a scheduling change, during which the psi callback will kick the !pending worker again. This will ping-pong forever, and is equivalent to having no shut-off logic at all (but with more code!)

Fix this by exempting aggregation workers from psi's clock waking logic when the state change is them going to sleep. To do this, tag workers with the last work function they executed, and if in psi we see a worker going to sleep after aggregating psi data, we will not reschedule the aggregation work item.

What if the worker is also executing other items before or after?

Any psi state times that were incurred by work items preceding the aggregation work will have been collected from the per-cpu buckets during the aggregation itself. If there are work items following the aggregation work, the worker's last_func tag will be overwritten and the aggregator will be kept alive to process this genuine new activity.

If the aggregation work is the last thing the worker does, and we decide to go idle, the brief period of non-idle time incurred between the aggregation run and the kworker's dequeue will be stranded in the per-cpu buckets until the clock is woken by later activity. But that should not be a problem. The buckets can hold 4s worth of time, and future activity will wake the clock with a 2s delay, giving us 2s worth of data we can leave behind when disabling aggregation. If it takes a worker more than two seconds to go idle after it finishes its last work item, we likely have bigger problems in the system, and won't notice one sample that was averaged with a bogus per-CPU weight.

Link: http://lkml.kernel.org/r/20190116193501.1910-1-hannes@cmpxchg.org
Fixes: eb414681d5 ("psi: pressure stall information for CPU, memory, and IO")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
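
The tagging rule described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation: `struct worker`, `run_work()`, `should_wake_clock()`, `psi_aggregate()`, `other_work()` and `check_idle_shutoff()` are hypothetical names chosen for illustration; only the `last_func` tag and the decision it drives come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*work_func_t)(void);

/* Hypothetical stand-in for a kworker; only the tag matters here. */
struct worker {
    work_func_t last_func; /* last work function this worker executed */
};

static int aggregation_runs; /* how often the aggregation item ran */
static int other_runs;       /* how often unrelated work ran */

/* Stand-in for the psi aggregation work item (assumed name). */
static void psi_aggregate(void) { aggregation_runs++; }

/* Stand-in for any unrelated work item (assumed name). */
static void other_work(void) { other_runs++; }

/* Execute a work item, then tag the worker with the function it ran. */
static void run_work(struct worker *w, work_func_t fn)
{
    fn();
    w->last_func = fn;
}

/*
 * Decide whether a state change of this worker should wake the
 * aggregation clock. A worker that goes to sleep right after
 * aggregating is exempt; every other change counts as activity.
 */
static bool should_wake_clock(const struct worker *w, bool going_to_sleep)
{
    if (going_to_sleep && w->last_func == psi_aggregate)
        return false;
    return true;
}

/* Walk through the cases discussed in the commit message. */
void check_idle_shutoff(void)
{
    struct worker w = { .last_func = NULL };

    /* Work before the aggregation does not matter: the tag is overwritten. */
    run_work(&w, other_work);
    run_work(&w, psi_aggregate);
    assert(!should_wake_clock(&w, true));

    /* Work after the aggregation overwrites the tag: genuine activity. */
    run_work(&w, other_work);
    assert(should_wake_clock(&w, true));

    /* A state change other than going to sleep still wakes the clock. */
    run_work(&w, psi_aggregate);
    assert(should_wake_clock(&w, false));
}
```

Calling `check_idle_shutoff()` from a test program exercises all three cases. The model compares function pointers because the commit message describes tagging each worker with the last work function it executed; it deliberately omits the per-cpu buckets and the 2s wake-up delay.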
autogroup.c	sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]	2018-05-05 08:34:42 +02:00
autogroup.h	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
clock.c	sched/clock: Disable interrupts when calling generic_sched_clock_init()	2018-07-30 19:33:35 +02:00
completion.c	sched/Documentation: Update wake_up() & co. memory-barrier guarantees	2018-07-17 09:30:34 +02:00
core.c	sched/wake_q: Fix wakeup ordering for wake_q	2019-01-21 11:15:37 +01:00
cpuacct.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpudeadline.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpudeadline.h	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpufreq_schedutil.c	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2018-12-26 14:56:10 -08:00
cpufreq.c	sched/cpufreq: Add the SPDX tags	2018-12-11 11:35:25 +01:00
cpupri.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpupri.h	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cputime.c	sched: Fix various typos in comments	2018-12-03 11:55:42 +01:00
deadline.c	sched/core: Remove unnecessary unlikely() in push_*_task()	2018-12-11 15:16:57 +01:00
debug.c	jump_label: move 'asm goto' support test to Kconfig	2019-01-06 09:46:51 +09:00
fair.c	jump_label: move 'asm goto' support test to Kconfig	2019-01-06 09:46:51 +09:00
features.h	sched/fair: Disable LB_BIAS by default	2018-10-02 09:45:01 +02:00
idle.c	x86/stackprotector: Remove the call to boot_init_stack_canary() from cpu_startup_entry()	2018-10-22 04:07:24 +02:00
isolation.c	sched: Fix various typos in comments	2018-12-03 11:55:42 +01:00
loadavg.c	sched: loadavg: make calc_load_n() public	2018-10-26 16:26:32 -07:00
Makefile	psi: pressure stall information for CPU, memory, and IO	2018-10-26 16:26:32 -07:00
membarrier.c	sched/membarrier: synchronize_sched() with synchronize_rcu()	2018-11-27 09:21:43 -08:00
pelt.c	sched/fair: Remove setting task's se->runnable_weight during PELT update	2018-10-02 09:45:03 +02:00
pelt.h	sched/pelt: Fix warning and clean up IRQ PELT config	2018-10-02 09:45:00 +02:00
psi.c	psi: fix aggregation idle shut-off	2019-02-01 15:46:23 -08:00
rt.c	sched/core: Remove unnecessary unlikely() in push_*_task()	2018-12-11 15:16:57 +01:00
sched-pelt.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
sched.h	jump_label: move 'asm goto' support test to Kconfig	2019-01-06 09:46:51 +09:00
stats.c	proc: introduce proc_create_seq{,_data}	2018-05-16 07:23:35 +02:00
stats.h	psi: make disabling/enabling easier for vendor kernels	2018-11-30 14:56:14 -08:00
stop_task.c	sched: Clean up and harmonize the coding style of the scheduler code base	2018-03-03 15:50:21 +01:00
swait.c	kernel/sched/: remove caller signal_pending branch predictions	2019-01-04 13:13:48 -08:00
topology.c	sched/toplogy: Introduce the 'sched_energy_present' static key	2018-12-11 15:17:01 +01:00
wait_bit.c	sched/wait: Improve __var_waitqueue() code generation	2018-03-20 08:23:25 +01:00
wait.c	kernel/sched/: remove caller signal_pending branch predictions	2019-01-04 13:13:48 -08:00