rcu/nocb: Add option to opt rcuo kthreads out of RT priority

This commit introduces an RCU_NOCB_CPU_CB_BOOST Kconfig option that,
when left disabled, prevents rcuo kthreads from running at real-time
priority, even in kernels built with RCU_BOOST.  This capability is
important to devices
needing low-latency (as in a few milliseconds) response from expedited
RCU grace periods, but which are not running a classic real-time workload.
On such devices, permitting the rcuo kthreads to run at real-time priority
results in unacceptable latencies imposed on the application tasks,
which run as SCHED_OTHER.

See for example the following trace output:

<snip>
<...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
<snip>

If that rcuop kthread were permitted to run at real-time SCHED_FIFO
priority, it would monopolize its CPU for hundreds of milliseconds
while invoking those 34619 RCU callback functions, which would cause an
unacceptably long latency spike for many application stacks on Android
platforms.

However, some existing real-time workloads require that callback
invocation run at SCHED_FIFO priority, for example, those running on
systems with heavy SCHED_OTHER background loads.  (It is the real-time
system's administrator's responsibility to make sure that important
real-time tasks run at a higher priority than do RCU's kthreads.)

Therefore, this new RCU_NOCB_CPU_CB_BOOST Kconfig option defaults to
"y" on kernels built with PREEMPT_RT and defaults to "n" otherwise.
The effect is to preserve current behavior for real-time systems, but for
other systems to allow expedited RCU grace periods to run with real-time
priority while continuing to invoke RCU callbacks as SCHED_OTHER.

As you would expect, this RCU_NOCB_CPU_CB_BOOST Kconfig option has no
effect except on CPUs with offloaded RCU callbacks.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Uladzislau Rezki (Sony) 2022-05-11 10:57:03 +02:00 committed by Paul E. McKenney
parent 5103850654
commit 8f489b4da5
3 changed files with 23 additions and 2 deletions

@@ -275,6 +275,22 @@ config RCU_NOCB_CPU_DEFAULT_ALL
 	  Say Y here if you want offload all CPUs by default on boot.
 	  Say N here if you are unsure.

+config RCU_NOCB_CPU_CB_BOOST
+	bool "Offload RCU callback from real-time kthread"
+	depends on RCU_NOCB_CPU && RCU_BOOST
+	default y if PREEMPT_RT
+	help
+	  Use this option to invoke offloaded callbacks as SCHED_FIFO
+	  to avoid starvation by heavy SCHED_OTHER background load.
+	  Of course, running as SCHED_FIFO during callback floods will
+	  cause the rcuo[ps] kthreads to monopolize the CPU for hundreds
+	  of milliseconds or more. Therefore, when enabling this option,
+	  it is your responsibility to ensure that latency-sensitive
+	  tasks either run with higher priority or run on some other CPU.
+
+	  Say Y here if you want to set RT priority for offloading kthreads.
+	  Say N here if you are building a !PREEMPT_RT kernel and are unsure.
+
 config TASKS_TRACE_RCU_READ_MB
 	bool "Tasks Trace RCU readers use memory barriers in user and idle"
 	depends on RCU_EXPERT && TASKS_TRACE_RCU

@@ -154,7 +154,11 @@ static void sync_sched_exp_online_cleanup(int cpu);
 static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
 static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);

-/* rcuc/rcub/rcuop kthread realtime priority */
+/*
+ * rcuc/rcub/rcuop kthread realtime priority. The "rcuop"
+ * real-time priority(enabling/disabling) is controlled by
+ * the extra CONFIG_RCU_NOCB_CPU_CB_BOOST configuration.
+ */
 static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
 module_param(kthread_prio, int, 0444);

@@ -1315,8 +1315,9 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
 		goto end;

-	if (kthread_prio)
+	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) && kthread_prio)
 		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+
 	WRITE_ONCE(rdp->nocb_cb_kthread, t);
 	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
 	return;
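
The effect of the changed condition in rcu_spawn_cpu_nocb_kthread() can be
illustrated with a small userspace sketch. This is not kernel code: the
helper names rcuo_cb_gets_fifo_old(), rcuo_cb_gets_fifo_new(), and
policy_name() are hypothetical, and the build-time IS_ENABLED() test is
modeled as an ordinary boolean argument so that both settings can be
compared side by side. It assumes kthread_prio lies in the 0..99 range,
where 0 means "no real-time priority".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the pre-patch test: the rcuo CB kthread was
 * switched to SCHED_FIFO whenever kthread_prio was nonzero.
 */
static bool rcuo_cb_gets_fifo_old(int kthread_prio)
{
	assert(kthread_prio >= 0 && kthread_prio <= 99);
	return kthread_prio != 0;
}

/*
 * Hypothetical model of the post-patch test.  In the kernel the first
 * operand is IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST), a build-time
 * constant; here it is a runtime argument purely for illustration.
 */
static bool rcuo_cb_gets_fifo_new(bool cb_boost_enabled, int kthread_prio)
{
	assert(kthread_prio >= 0 && kthread_prio <= 99);
	return cb_boost_enabled && kthread_prio != 0;
}

/* Map the decision to the scheduling class the kthread ends up in. */
static const char *policy_name(bool fifo)
{
	return fifo ? "SCHED_FIFO" : "SCHED_OTHER";
}
```

Under these assumptions, calling rcuo_cb_gets_fifo_new(false, 1) models a
!PREEMPT_RT kernel built with RCU_BOOST and the new option left at "n": the
callback kthread stays SCHED_OTHER, whereas rcuo_cb_gets_fifo_old(1) would
have made it SCHED_FIFO. With the option set to "y" the two helpers agree,
which matches the commit's stated goal of preserving behavior on PREEMPT_RT.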