linux/kernel/cgroup
Chengming Zhou 34f26a1561 sched/psi: Per-cgroup PSI accounting disable/re-enable interface
PSI accounts stalls for each cgroup separately and aggregates it
at each level of the hierarchy. This may cause non-negligible overhead
for some workloads when under deep level of the hierarchy.

commit 3958e2d0c3 ("cgroup: make per-cgroup pressure stall tracking configurable")
make PSI to skip per-cgroup stall accounting, only account system-wide
to avoid this each level overhead.

But for our use case, we also want leaf cgroup PSI stats accounted for
userspace adjustment on that cgroup, apart from only system-wide adjustment.

So this patch introduce a per-cgroup PSI accounting disable/re-enable
interface "cgroup.pressure", which is a read-write single value file that
allowed values are "0" and "1", the defaults is "1" so per-cgroup
PSI stats is enabled by default.

Implementation details:

It should be relatively straight-forward to disable and re-enable
state aggregation, time tracking, averaging on a per-cgroup level,
if we can live with losing history from while it was disabled.
I.e. the avgs will restart from 0, total= will have gaps.

But it's hard or complex to stop/restart groupc->tasks[] updates,
which is not implemented in this patch. So we always update
groupc->tasks[] and PSI_ONCPU bit in psi_group_change() even when
the cgroup PSI stats is disabled.

Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/20220907090332.2078-1-zhouchengming@bytedance.com
2022-09-09 11:08:33 +02:00
..
cgroup-internal.h cgroup: Make !percpu threadgroup_rwsem operations optional 2022-07-23 04:29:02 -10:00
cgroup-v1.c cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all() 2022-08-25 07:36:30 -10:00
cgroup.c sched/psi: Per-cgroup PSI accounting disable/re-enable interface 2022-09-09 11:08:33 +02:00
cpuset.c cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock 2022-08-17 07:36:05 -10:00
debug.c kernel: cgroup: fix misuse of %x 2019-05-06 08:47:48 -07:00
freezer.c cgroup: cleanup comments 2022-03-13 19:19:27 -10:00
legacy_freezer.c cgroup: rename freezer.c into legacy_freezer.c 2019-04-19 11:26:48 -07:00
Makefile cgroup: Add misc cgroup controller 2021-04-04 13:34:46 -04:00
misc.c misc_cgroup: remove error log to avoid log flood 2021-09-20 07:35:38 -10:00
namespace.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
pids.c clone3: allow spawning processes into cgroups 2020-02-12 17:57:51 -05:00
rdma.c cgroup: fix spelling mistakes 2021-05-24 12:45:26 -04:00
rstat.c sched/core: add forced idle accounting for cgroups 2022-07-04 09:23:07 +02:00