From e77cb32558a7bd0e996b7e203158a7fbf1a663fb Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 22 Jun 2018 06:22:20 -0700 Subject: [PATCH 001/140] doc: Add design documentation on interruption of NMI handlers Make Requirements.html talk about how NMI handlers can take what appear to RCU to be normal interrupts. Signed-off-by: Paul E. McKenney --- .../RCU/Design/Requirements/Requirements.html | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html index 49690228b1c6..089a8e8faac1 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.html +++ b/Documentation/RCU/Design/Requirements/Requirements.html @@ -2275,6 +2275,17 @@ he also kindly surprised me with an algorithm that meets this requirement. +

+Furthermore, NMI handlers can be interrupted by what appear to RCU +to be normal interrupts. +One way that this can happen is for code that directly invokes +rcu_irq_enter() and rcu_irq_exit() to be called +from an NMI handler. +This astonishing fact of life prompted the current code structure, +which has rcu_irq_enter() invoking rcu_nmi_enter() +and rcu_irq_exit() invoking rcu_nmi_exit(). +And yes, I also learned of this requirement the hard way. +

Loadable Modules

From a5a2889544997d3909480d6d27e2559875226532 Mon Sep 17 00:00:00 2001 From: "Joel Fernandes (Google)" Date: Fri, 22 Jun 2018 21:40:55 -0700 Subject: [PATCH 002/140] doc: Fix broken RCU-requirements link to LKML archive Two of the Requirements.html LKML links are broken. This patch changes them to use the archive from lore.kernel.org, which works fine. Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- Documentation/RCU/Design/Requirements/Requirements.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html index 089a8e8faac1..51f39f65002d 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.html +++ b/Documentation/RCU/Design/Requirements/Requirements.html @@ -2269,10 +2269,10 @@ Thankfully, RCU update-side primitives, including The name notwithstanding, some Linux-kernel architectures can have nested NMIs, which RCU must handle correctly. Andy Lutomirski -surprised me +surprised me with this requirement; he also kindly surprised me with -an algorithm +an algorithm that meets this requirement.

From ea24c125fe79fd966a4f59768d3d70142a9eb18d Mon Sep 17 00:00:00 2001 From: "Joel Fernandes (Google)" Date: Sun, 24 Jun 2018 12:34:51 -0700 Subject: [PATCH 003/140] doc: Improve rcu_dynticks::dynticks documentation The very useful RCU Data-Structures describes that the dynticks counter of the rcu_dynticks data structure is incremented when we transitions to or from dynticks-idle mode. However it doesn't mention that it is also incremented due to transitions to and from user mode which for dynticks purposes is an extended quiescent state. I found this with tracing calls to rcu_dynticks_eqs_enter which can also happen from rcu_user_enter. Lets add this information to the Data-Structures document. Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- .../RCU/Design/Data-Structures/Data-Structures.html | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html index f5120a00f511..50be87e59937 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html @@ -1227,9 +1227,11 @@ to overflow the counter, this approach corrects the CPU enters the idle loop from process context.

The ->dynticks field counts the corresponding -CPU's transitions to and from dyntick-idle mode, so that this counter -has an even value when the CPU is in dyntick-idle mode and an odd -value otherwise. +CPU's transitions to and from either dyntick-idle or user mode, so +that this counter has an even value when the CPU is in dyntick-idle +mode or user mode and an odd value otherwise. The transitions to/from +user mode need to be counted for user mode adaptive-ticks support +(see timers/NO_HZ.txt).

The ->rcu_need_heavy_qs field is used to record the fact that the RCU core code would really like to From 31e7490741566d6f72be3a68bf9259a3bc2bc21d Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sun, 1 Jul 2018 05:48:46 -0700 Subject: [PATCH 004/140] torture: Stop overwriting Make.out file with obsolete version The old approach placed all the build products into the b* directories, which meant that some of these build products needed to be copied to the proper directory in the res hierarchy. The new approach leaves things like .config and the .o files in the b1 directory, but directs build output and diagnostics directly to the proper directory in the res hierarchy. Unfortunately, one of the copies was still carried out, which could (and sometimes did) overwrite the build output and diagnostics with obsolete output remaining in the b1 directory. This commit therefore removes the offending "cp" command. Signed-off-by: Paul E. McKenney --- tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh index f7247ee00514..58ca758a5786 100755 --- a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh +++ b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh @@ -120,7 +120,6 @@ then parse-build.sh $resdir/Make.out $title else # Build failed. - cp $builddir/Make*.out $resdir cp $builddir/.config $resdir || : echo Build failed, not running KVM, see $resdir. if test -f $builddir.wait From 444da518fd554eb1b9875dc97fac6ec249cee330 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 4 Jul 2018 14:14:42 -0700 Subject: [PATCH 005/140] rcutorture: Force occasional reader waits Deferred quiescent states can interact with the scheduler, but rcu_torture_reader() does not force such interaction all that frequently. This commit therefore blocks for one jiffy after ten jiffies of read-side runtime. This has the beneficial effect of being most likely to block just after long-running readers, and it is exactly these readers that are most likely to have been preempted (in CONFIG_PREEMPT=y kernels). This in turn helps increase the probability that a deferred quiescent state will be seen by RCU's context-switch hooks. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index c596c6f1e457..50a4f0ed4ebf 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1387,6 +1387,7 @@ static void rcu_torture_timer(struct timer_list *unused) static int rcu_torture_reader(void *arg) { + unsigned long lastsleep = jiffies; DEFINE_TORTURE_RANDOM(rand); struct timer_list t; @@ -1402,6 +1403,10 @@ rcu_torture_reader(void *arg) } if (!rcu_torture_one_read(&rand)) schedule_timeout_interruptible(HZ); + if (time_after(jiffies, lastsleep)) { + schedule_timeout_interruptible(1); + lastsleep = jiffies + 10; + } stutter_wait("rcu_torture_reader"); } while (!torture_must_stop()); if (irqreader && cur_ops->irq_capable) { From e746b558572efbad250e35e582a32ecabc9e9316 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 17:35:22 -0700 Subject: [PATCH 006/140] rcutorture: Warn on bad torture type for built-in tests When running a built-in rcutorture test, specifying an invalid torture type results in what looks like a hard hang, with the error messages hidden by other boot-time output. This commit therefore executes a WARN_ON() in this case so that the splat appears just following the error messages. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 50a4f0ed4ebf..5df2411f7aee 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1968,6 +1968,7 @@ rcu_torture_init(void) for (i = 0; i < ARRAY_SIZE(torture_ops); i++) pr_cont(" %s", torture_ops[i]->name); pr_cont("\n"); + WARN_ON(!IS_MODULE(CONFIG_RCU_TORTURE_TEST)); firsterr = -EINVAL; goto unwind; } From f0288064425ff9a5e05c8c0fdba6ec7681dd3330 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:26:50 -0700 Subject: [PATCH 007/140] rcuperf: Warn on bad perf type for built-in tests When running a built-in rcuperf test, specifying an invalid perf type results in what looks like a hard hang, with the error messages hidden by other boot-time output. This commit therefore executes a WARN_ON() in this case so that the splat appears just following the error messages. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcuperf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c index 34244523550e..832ce68fd45f 100644 --- a/kernel/rcu/rcuperf.c +++ b/kernel/rcu/rcuperf.c @@ -680,6 +680,7 @@ rcu_perf_init(void) for (i = 0; i < ARRAY_SIZE(perf_ops); i++) pr_cont(" %s", perf_ops[i]->name); pr_cont("\n"); + WARN_ON(!IS_MODULE(CONFIG_RCU_PERF_TEST)); firsterr = -EINVAL; goto unwind; } From a52d14addf06c00cfca4f1698e254955942be754 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 10:30:28 -0700 Subject: [PATCH 008/140] rcutorture: Remove TREE06 and TREE08 from the default test list Now that there is only one RCU flavor to rule them all, the TREE06 and TREE08 test scenarios are redundant. This commit therefore removes them. Later changes will rebalance and renumber the tests. Signed-off-by: Paul E. McKenney --- tools/testing/selftests/rcutorture/configs/rcu/CFLIST | 2 -- 1 file changed, 2 deletions(-) diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST index 6a0b9f69faad..c3c1fb5a9e1f 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST +++ b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST @@ -3,9 +3,7 @@ TREE02 TREE03 TREE04 TREE05 -TREE06 TREE07 -TREE08 TREE09 SRCU-N SRCU-P From 1b27291b1ea4f1f2090fb07c3425db474cdb99ba Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 18 Jul 2018 14:32:31 -0700 Subject: [PATCH 009/140] rcutorture: Add forward-progress tests for RCU grace periods This commit adds a kthread that loops going into and out of RCU read-side critical sections, but also including a cond_resched(), optionally guarded by a check of need_resched(), in that same loop. This commit relies solely on rcu_torture_writer() progress to judge the forward progress of grace periods. Note that Tasks RCU and SRCU are exempted from forward-progress testing due their (intentionally) less-robust forward-progress guarantees. Signed-off-by: Paul E. McKenney --- include/linux/rcutiny.h | 1 + kernel/rcu/rcutorture.c | 73 ++++++++++++++++++++++++++++++++++++++++- kernel/rcu/update.c | 1 + 3 files changed, 74 insertions(+), 1 deletion(-) diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index 8d9a0ea8f0b5..a6353f3d6094 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -108,6 +108,7 @@ static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt) */ static inline void rcu_virt_note_context_switch(int cpu) { } static inline void rcu_cpu_stall_reset(void) { } +static inline int rcu_jiffies_till_stall_check(void) { return 21 * HZ; } static inline void rcu_idle_enter(void) { } static inline void rcu_idle_exit(void) { } static inline void rcu_irq_enter(void) { } diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 5df2411f7aee..fd3ce6cc8eea 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -89,6 +89,12 @@ torture_param(int, fqs_duration, 0, "Duration of fqs bursts (us), 0 to disable"); torture_param(int, fqs_holdoff, 0, "Holdoff time within fqs bursts (us)"); torture_param(int, fqs_stutter, 3, "Wait time between fqs bursts (s)"); +torture_param(bool, fwd_progress, 1, "Test grace-period forward progress"); +torture_param(int, fwd_progress_div, 4, "Fraction of CPU stall to wait"); +torture_param(int, fwd_progress_holdoff, 60, + "Time between forward-progress tests (s)"); +torture_param(bool, fwd_progress_need_resched, 1, + "Hide cond_resched() behind need_resched()"); torture_param(bool, gp_cond, false, "Use conditional/async GP wait primitives"); torture_param(bool, gp_exp, false, "Use expedited GP wait primitives"); torture_param(bool, gp_normal, false, @@ -137,6 +143,7 @@ static struct task_struct **cbflood_task; static struct task_struct *fqs_task; static struct task_struct *boost_tasks[NR_CPUS]; static struct task_struct *stall_task; +static struct task_struct *fwd_prog_task; static struct task_struct **barrier_cbs_tasks; static struct task_struct *barrier_task; @@ -291,6 +298,7 @@ struct rcu_torture_ops { void (*cb_barrier)(void); void (*fqs)(void); void (*stats)(void); + int (*stall_dur)(void); int irq_capable; int can_boost; int extendables; @@ -429,6 +437,7 @@ static struct rcu_torture_ops rcu_ops = { .cb_barrier = rcu_barrier, .fqs = rcu_force_quiescent_state, .stats = NULL, + .stall_dur = rcu_jiffies_till_stall_check, .irq_capable = 1, .can_boost = rcu_can_boost(), .name = "rcu" @@ -1116,7 +1125,8 @@ rcu_torture_writer(void *arg) break; } } - rcu_torture_current_version++; + WRITE_ONCE(rcu_torture_current_version, + rcu_torture_current_version + 1); /* Cycle through nesting levels of rcu_expedite_gp() calls. */ if (can_expedite && !(torture_random(&rand) & 0xff & (!!expediting - 1))) { @@ -1660,6 +1670,63 @@ static int __init rcu_torture_stall_init(void) return torture_create_kthread(rcu_torture_stall, NULL, stall_task); } +/* Carry out grace-period forward-progress testing. */ +static int rcu_torture_fwd_prog(void *args) +{ + unsigned long cvar; + int idx; + unsigned long stopat; + bool tested = false; + + VERBOSE_TOROUT_STRING("rcu_torture_fwd_progress task started"); + do { + schedule_timeout_interruptible(fwd_progress_holdoff * HZ); + cvar = READ_ONCE(rcu_torture_current_version); + stopat = jiffies + cur_ops->stall_dur() / fwd_progress_div; + while (time_before(jiffies, stopat) && !torture_must_stop()) { + idx = cur_ops->readlock(); + udelay(10); + cur_ops->readunlock(idx); + if (!fwd_progress_need_resched || need_resched()) + cond_resched(); + } + if (!time_before(jiffies, stopat) && !torture_must_stop()) { + tested = true; + WARN_ON_ONCE(cvar == + READ_ONCE(rcu_torture_current_version)); + } + /* Avoid slow periods, better to test when busy. */ + stutter_wait("rcu_torture_fwd_prog"); + } while (!torture_must_stop()); + WARN_ON(!tested); + torture_kthread_stopping("rcu_torture_fwd_prog"); + return 0; +} + +/* If forward-progress checking is requested and feasible, spawn the thread. */ +static int __init rcu_torture_fwd_prog_init(void) +{ + if (!fwd_progress) + return 0; /* Not requested, so don't do it. */ + if (!cur_ops->stall_dur || cur_ops->stall_dur() <= 0) { + VERBOSE_TOROUT_STRING("rcu_torture_fwd_prog_init: Disabled, unsupported by RCU flavor under test"); + return 0; + } + if (stall_cpu > 0) { + VERBOSE_TOROUT_STRING("rcu_torture_fwd_prog_init: Disabled, conflicts with CPU-stall testing"); + if (IS_MODULE(CONFIG_RCU_TORTURE_TESTS)) + return -EINVAL; /* In module, can fail back to user. */ + WARN_ON(1); /* Make sure rcutorture notices conflict. */ + return 0; + } + if (fwd_progress_holdoff <= 0) + fwd_progress_holdoff = 1; + if (fwd_progress_div <= 0) + fwd_progress_div = 4; + return torture_create_kthread(rcu_torture_fwd_prog, + NULL, fwd_prog_task); +} + /* Callback function for RCU barrier testing. */ static void rcu_torture_barrier_cbf(struct rcu_head *rcu) { @@ -1833,6 +1900,7 @@ rcu_torture_cleanup(void) } rcu_torture_barrier_cleanup(); + torture_stop_kthread(rcu_torture_fwd_prog, fwd_prog_task); torture_stop_kthread(rcu_torture_stall, stall_task); torture_stop_kthread(rcu_torture_writer, writer_task); @@ -2104,6 +2172,9 @@ rcu_torture_init(void) if (firsterr) goto unwind; firsterr = rcu_torture_stall_init(); + if (firsterr) + goto unwind; + firsterr = rcu_torture_fwd_prog_init(); if (firsterr) goto unwind; firsterr = rcu_torture_barrier_init(); diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index 39cb23d22109..a6b860422d18 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -472,6 +472,7 @@ int rcu_jiffies_till_stall_check(void) } return till_stall_check * HZ + RCU_STALL_DELAY_DELTA; } +EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check); void rcu_sysrq_start(void) { From 119248bec9d318ae41da8ab8f400f07e7a610cc3 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 18 Jul 2018 15:39:37 -0700 Subject: [PATCH 010/140] rcutorture: Also use GP sequence to judge forward progress Currently, rcutorture relies solely on the progress of rcu_torture_writer() to judge grace-period forward progress. In theory, this is the gold standard of forward progress, but in practice rcutorture separately detects and reports rcu_torture_writer() stalls. This commit therefore adds the grace-period sequence number (when provided) to the judgment of grace-period forward progress, which makes it easier to distinguish between failure of actual grace periods to progress on the one hand and downstream forward-progress failures on the other. For example, given this change, if rcu_torture_writer() stalls, but rcu_torture_fwd_prog() does not complain, then the grace-period computation is working, which is a hint that the failure lies in callback processing, wakeup of the rcu_torture_writer() kthread, or similar. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index fd3ce6cc8eea..dee7b45b2186 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1673,7 +1673,8 @@ static int __init rcu_torture_stall_init(void) /* Carry out grace-period forward-progress testing. */ static int rcu_torture_fwd_prog(void *args) { - unsigned long cvar; + unsigned long cver; + unsigned long gps; int idx; unsigned long stopat; bool tested = false; @@ -1681,7 +1682,8 @@ static int rcu_torture_fwd_prog(void *args) VERBOSE_TOROUT_STRING("rcu_torture_fwd_progress task started"); do { schedule_timeout_interruptible(fwd_progress_holdoff * HZ); - cvar = READ_ONCE(rcu_torture_current_version); + cver = READ_ONCE(rcu_torture_current_version); + gps = cur_ops->get_gp_seq(); stopat = jiffies + cur_ops->stall_dur() / fwd_progress_div; while (time_before(jiffies, stopat) && !torture_must_stop()) { idx = cur_ops->readlock(); @@ -1692,8 +1694,9 @@ static int rcu_torture_fwd_prog(void *args) } if (!time_before(jiffies, stopat) && !torture_must_stop()) { tested = true; - WARN_ON_ONCE(cvar == - READ_ONCE(rcu_torture_current_version)); + cver = cver == READ_ONCE(rcu_torture_current_version); + gps = rcutorture_seq_diff(cur_ops->get_gp_seq(), gps); + WARN_ON_ONCE(cver && gps < 2); } /* Avoid slow periods, better to test when busy. */ stutter_wait("rcu_torture_fwd_prog"); From 152f4afbfd58f8ada7591113129aa6ba7fe114c5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 19 Jul 2018 10:57:58 -0700 Subject: [PATCH 011/140] rcutorture: Avoid no-test complaint if too few forward-progress tries In a too-short test, random delays can cause each attempt to do forward-progress testing to fail to complete, thus resulting in spurious splats. This commit therefore requires at least five tries before complaining about rcutorture runs that failed to produce at least one valid forward-progress testing attempt. Note that actual forward-progress failures will splat regardless of the number of tries. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index dee7b45b2186..8ab23143c244 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1678,6 +1678,7 @@ static int rcu_torture_fwd_prog(void *args) int idx; unsigned long stopat; bool tested = false; + int tested_tries = 0; VERBOSE_TOROUT_STRING("rcu_torture_fwd_progress task started"); do { @@ -1692,6 +1693,7 @@ static int rcu_torture_fwd_prog(void *args) if (!fwd_progress_need_resched || need_resched()) cond_resched(); } + tested_tries++; if (!time_before(jiffies, stopat) && !torture_must_stop()) { tested = true; cver = cver == READ_ONCE(rcu_torture_current_version); @@ -1701,7 +1703,8 @@ static int rcu_torture_fwd_prog(void *args) /* Avoid slow periods, better to test when busy. */ stutter_wait("rcu_torture_fwd_prog"); } while (!torture_must_stop()); - WARN_ON(!tested); + /* Short runs might not contain a valid forward-progress attempt. */ + WARN_ON(!tested && tested_tries >= 5); torture_kthread_stopping("rcu_torture_fwd_prog"); return 0; } From 08a7a2ec68348ebc6d8bf5f20df23815fc0d332b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 19 Jul 2018 13:07:20 -0700 Subject: [PATCH 012/140] rcutorture: Vary forward-progress test interval Some of the Linux kernel's RCU implementations provide several mechanisms to promote forward progress that operate over different timeframes. This commit therefore causes rcu_torture_fwd_prog() to vary the duration of its forward-progress testing in order to test each such mechanism. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 8ab23143c244..89cc4d9c9a0c 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1676,16 +1676,21 @@ static int rcu_torture_fwd_prog(void *args) unsigned long cver; unsigned long gps; int idx; + int sd; + int sd4; unsigned long stopat; bool tested = false; int tested_tries = 0; + static DEFINE_TORTURE_RANDOM(trs); VERBOSE_TOROUT_STRING("rcu_torture_fwd_progress task started"); do { schedule_timeout_interruptible(fwd_progress_holdoff * HZ); cver = READ_ONCE(rcu_torture_current_version); gps = cur_ops->get_gp_seq(); - stopat = jiffies + cur_ops->stall_dur() / fwd_progress_div; + sd = cur_ops->stall_dur() + 1; + sd4 = (sd + fwd_progress_div - 1) / fwd_progress_div; + stopat = jiffies + sd4 + torture_random(&trs) % (sd - sd4); while (time_before(jiffies, stopat) && !torture_must_stop()) { idx = cur_ops->readlock(); udelay(10); From 9fdcb9afe082794c6dcf2b79b3070ef5dafc8a8f Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 19 Jul 2018 13:36:00 -0700 Subject: [PATCH 013/140] rcutorture: Add self-propagating callback to forward-progress testing If rcutorture is run on a quiet system with the rcutorture.stutter module parameter set high, then there can legitimately be an extended period during which no RCU forward progress takes place. This can result in false-positive no-forward-progress splats. This commit therefore makes rcu_torture_fwd_prog() create a self-propagating RCU callback to ensure that grace periods are in progress for the duration of the forward-progress test. Note that the RCU flavor under test must define ->call(), ->sync(), and ->cb_barrier() for this self-propagating callback to be created. If one or more of those rcu_torture_ops fields are NULL, then the rcu_torture_fwd_prog() function will silently proceed without creating the self-propagating callback. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 89cc4d9c9a0c..316083687fd7 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1670,20 +1670,49 @@ static int __init rcu_torture_stall_init(void) return torture_create_kthread(rcu_torture_stall, NULL, stall_task); } +/* State structure for forward-progress self-propagating RCU callback. */ +struct fwd_cb_state { + struct rcu_head rh; + int stop; +}; + +/* + * Forward-progress self-propagating RCU callback function. Because + * callbacks run from softirq, this function is an implicit RCU read-side + * critical section. + */ +static void rcu_torture_fwd_prog_cb(struct rcu_head *rhp) +{ + struct fwd_cb_state *fcsp = container_of(rhp, struct fwd_cb_state, rh); + + if (READ_ONCE(fcsp->stop)) { + WRITE_ONCE(fcsp->stop, 2); + return; + } + cur_ops->call(&fcsp->rh, rcu_torture_fwd_prog_cb); +} + /* Carry out grace-period forward-progress testing. */ static int rcu_torture_fwd_prog(void *args) { unsigned long cver; + struct fwd_cb_state fcs = { .stop = 0 }; unsigned long gps; int idx; int sd; int sd4; + bool selfpropcb = false; unsigned long stopat; bool tested = false; int tested_tries = 0; static DEFINE_TORTURE_RANDOM(trs); VERBOSE_TOROUT_STRING("rcu_torture_fwd_progress task started"); + if (cur_ops->call && cur_ops->sync && cur_ops->cb_barrier) { + init_rcu_head_on_stack(&fcs.rh); + cur_ops->call(&fcs.rh, rcu_torture_fwd_prog_cb); + selfpropcb = true; + } do { schedule_timeout_interruptible(fwd_progress_holdoff * HZ); cver = READ_ONCE(rcu_torture_current_version); @@ -1708,6 +1737,13 @@ static int rcu_torture_fwd_prog(void *args) /* Avoid slow periods, better to test when busy. */ stutter_wait("rcu_torture_fwd_prog"); } while (!torture_must_stop()); + if (selfpropcb) { + WRITE_ONCE(fcs.stop, 1); + cur_ops->sync(); /* Wait for running callback to complete. */ + cur_ops->cb_barrier(); /* Wait for queued callbacks. */ + WARN_ON(READ_ONCE(fcs.stop) != 2); + destroy_rcu_head_on_stack(&fcs.rh); + } /* Short runs might not contain a valid forward-progress attempt. */ WARN_ON(!tested && tested_tries >= 5); torture_kthread_stopping("rcu_torture_fwd_prog"); From 3cff54a830f760eafc9c20191ce1d4b8c356d002 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 19 Jul 2018 15:25:57 -0700 Subject: [PATCH 014/140] rcutorture: Increase rcu_read_delay() longdelay_ms RCU now takes certain actions 100 and 200 milliseconds into a grace period by default, but rcutorture only runs RCU read-side critical sections with durations up to 50 milliseconds. This commit therefore increases test coverage by increasing the maximum critical-section duration to 300 milliseconds. Note that the existing code automatically dials down the probability of long delays based on the maximum duration, which means that this change should not significantly change the rate of execution of RCU read-side critical sections. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 316083687fd7..b98bb11d47a2 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -323,7 +323,7 @@ static void rcu_read_delay(struct torture_random_state *rrsp) unsigned long started; unsigned long completed; const unsigned long shortdelay_us = 200; - const unsigned long longdelay_ms = 50; + const unsigned long longdelay_ms = 300; unsigned long long ts; /* We want a short delay sometimes to make a reader delay the grace From 1e69676592edaf81eed88ba53f5239d84fae4e67 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 20 Jul 2018 12:04:12 -0700 Subject: [PATCH 015/140] rcutorture: Limit reader duration if irq or bh disabled There are debug checks in some environments that will complain if the duration of a bh-disabled region of code exceeds about 50 milliseconds. Because rcu_read_delay() can produce a 50-millisecond delay and because there could be up to eight reader segments with such delays, this commit limits the maximum delay to 10 milliseconds if either interrupts or softirqs are disabled. Reported-by: Thomas Gleixner Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index b98bb11d47a2..9622192ec5c9 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -323,7 +323,7 @@ static void rcu_read_delay(struct torture_random_state *rrsp) unsigned long started; unsigned long completed; const unsigned long shortdelay_us = 200; - const unsigned long longdelay_ms = 300; + unsigned long longdelay_ms = 300; unsigned long long ts; /* We want a short delay sometimes to make a reader delay the grace @@ -333,6 +333,8 @@ static void rcu_read_delay(struct torture_random_state *rrsp) if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms))) { started = cur_ops->get_gp_seq(); ts = rcu_trace_clock_local(); + if (preempt_count() & (SOFTIRQ_MASK | HARDIRQ_MASK)) + longdelay_ms = 5; /* Avoid triggering BH limits. */ mdelay(longdelay_ms); completed = cur_ops->get_gp_seq(); do_trace_rcu_torture_read(cur_ops->name, NULL, ts, From fecad5091f35425246316ab25c8a9f2aa44a7051 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 20 Jul 2018 12:18:11 -0700 Subject: [PATCH 016/140] rcutorture: Reduce priority of forward-progress testing On !SMP tests, the forward-progress kthread might prevent RCU's grace-period kthread from running, which would defeat RCU's forward-progress measures. On PREEMPT tests without RCU priority boosting, the forward-progress kthread might preempt a reader for an extended time period, which would also defeat RCU's forward-progress measures. This commit therefore reduced rcutorture's forward-progress kthread's priority in those cases. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 9622192ec5c9..ac487ea8d245 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1710,6 +1710,8 @@ static int rcu_torture_fwd_prog(void *args) static DEFINE_TORTURE_RANDOM(trs); VERBOSE_TOROUT_STRING("rcu_torture_fwd_progress task started"); + if (!IS_ENABLED(CONFIG_SMP) || !IS_ENABLED(CONFIG_RCU_BOOST)) + set_user_nice(current, MAX_NICE); if (cur_ops->call && cur_ops->sync && cur_ops->cb_barrier) { init_rcu_head_on_stack(&fcs.rh); cur_ops->call(&fcs.rh, rcu_torture_fwd_prog_cb); From c04dd09bd38c0df1aa6318164a51eccbc3a9fa5e Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 23 Jul 2018 14:16:47 -0700 Subject: [PATCH 017/140] rcutorture: Adjust number of reader kthreads per CPU-hotplug operations Currently, rcutorture provisions rcu_torture_reader() kthreads based on the initial number of CPUs. This can be problematic when CPU hotplug is enabled, as a system with a very large number of CPUs will provision a very large number of rcu_torture_reader() kthreads. All of these kthreads will continue running even if the CPU-hotplug operations result in only one remaining online CPU. This can result in all sorts of strange artifacts due simply to massive overload. This commit therefore causes the rcu_torture_reader() kthreads to start blocking as the number of online CPUs decreases. This is accomplished by numbering these kthreads, and having each check to make sure that the number of online CPUs is at least as large as its assigned number. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index ac487ea8d245..50015b78a43f 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1400,6 +1400,8 @@ static int rcu_torture_reader(void *arg) { unsigned long lastsleep = jiffies; + long myid = (long)arg; + int mynumonline = myid; DEFINE_TORTURE_RANDOM(rand); struct timer_list t; @@ -1419,6 +1421,8 @@ rcu_torture_reader(void *arg) schedule_timeout_interruptible(1); lastsleep = jiffies + 10; } + while (num_online_cpus() < mynumonline && !torture_must_stop()) + schedule_timeout_interruptible(HZ / 5); stutter_wait("rcu_torture_reader"); } while (!torture_must_stop()); if (irqreader && cur_ops->irq_capable) { @@ -2063,7 +2067,7 @@ static void rcu_test_debug_objects(void) static int __init rcu_torture_init(void) { - int i; + long i; int cpu; int firsterr = 0; static struct rcu_torture_ops *torture_ops[] = { @@ -2169,7 +2173,7 @@ rcu_torture_init(void) goto unwind; } for (i = 0; i < nrealreaders; i++) { - firsterr = torture_create_kthread(rcu_torture_reader, NULL, + firsterr = torture_create_kthread(rcu_torture_reader, (void *)i, reader_tasks[i]); if (firsterr) goto unwind; From f4de46ed5bbc8ba9acebc8ac75809751b716e470 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 24 Jul 2018 20:50:40 -0700 Subject: [PATCH 018/140] rcutorture: Print forward-progress test interval on error This commit prints the duration of the forward-progress test interval in the case that no forward progress was observed as an aid to debugging. When forward progress does happen, it prints out the number of rcu_torture_writer() versions and grace periods that elapsed during the forward-progress test. At the end of the run, it also prints the number of attempted and actual forward-progress tests. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 50015b78a43f..7df8142a6a22 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1702,6 +1702,7 @@ static void rcu_torture_fwd_prog_cb(struct rcu_head *rhp) static int rcu_torture_fwd_prog(void *args) { unsigned long cver; + unsigned long dur; struct fwd_cb_state fcs = { .stop = 0 }; unsigned long gps; int idx; @@ -1709,7 +1710,7 @@ static int rcu_torture_fwd_prog(void *args) int sd4; bool selfpropcb = false; unsigned long stopat; - bool tested = false; + int tested = 0; int tested_tries = 0; static DEFINE_TORTURE_RANDOM(trs); @@ -1727,7 +1728,8 @@ static int rcu_torture_fwd_prog(void *args) gps = cur_ops->get_gp_seq(); sd = cur_ops->stall_dur() + 1; sd4 = (sd + fwd_progress_div - 1) / fwd_progress_div; - stopat = jiffies + sd4 + torture_random(&trs) % (sd - sd4); + dur = sd4 + torture_random(&trs) % (sd - sd4); + stopat = jiffies + dur; while (time_before(jiffies, stopat) && !torture_must_stop()) { idx = cur_ops->readlock(); udelay(10); @@ -1737,10 +1739,11 @@ static int rcu_torture_fwd_prog(void *args) } tested_tries++; if (!time_before(jiffies, stopat) && !torture_must_stop()) { - tested = true; - cver = cver == READ_ONCE(rcu_torture_current_version); + tested++; + cver = READ_ONCE(rcu_torture_current_version) - cver; gps = rcutorture_seq_diff(cur_ops->get_gp_seq(), gps); - WARN_ON_ONCE(cver && gps < 2); + WARN_ON(!cver && gps < 2); + pr_alert("%s: Duration %ld cver %ld gps %ld\n", __func__, dur, cver, gps); } /* Avoid slow periods, better to test when busy. */ stutter_wait("rcu_torture_fwd_prog"); @@ -1754,6 +1757,7 @@ static int rcu_torture_fwd_prog(void *args) } /* Short runs might not contain a valid forward-progress attempt. */ WARN_ON(!tested && tested_tries >= 5); + pr_alert("%s: tested %d tested_tries %d\n", __func__, tested, tested_tries); torture_kthread_stopping("rcu_torture_fwd_prog"); return 0; } From 474e59b476b3390ef9f730515439f21640b61623 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 7 Aug 2018 14:34:44 -0700 Subject: [PATCH 019/140] rcutorture: Check GP completion at stutter end The rcu_torture_writer() function invokes stutter_wait() at the end of each writer pass, which occasionally blocks for an extended time period in order to ensure that RCU can handle intermittent loads. But part of handling a busy period is invoking all the callbacks before the end of the idle period induced by stutter_wait(). This commit therefore adds a return value to stutter_wait() indicating whether stutter_wait() actually waited. In addition, this commit causes rcu_torture_writer() to test this value and if set, checks that all the elements of the rcu_tortures[] array have been freed up. Signed-off-by: Paul E. McKenney --- include/linux/torture.h | 2 +- kernel/rcu/rcutorture.c | 5 ++++- kernel/torture.c | 3 ++- 3 files changed, 7 insertions(+), 3 deletions(-) diff --git a/include/linux/torture.h b/include/linux/torture.h index 61dfd93b6ee4..48fad21109fc 100644 --- a/include/linux/torture.h +++ b/include/linux/torture.h @@ -77,7 +77,7 @@ void torture_shutdown_absorb(const char *title); int torture_shutdown_init(int ssecs, void (*cleanup)(void)); /* Task stuttering, which forces load/no-load transitions. */ -void stutter_wait(const char *title); +bool stutter_wait(const char *title); int torture_stutter_init(int s); /* Initialization and cleanup. */ diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 7df8142a6a22..ae10ad531993 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1144,7 +1144,10 @@ rcu_torture_writer(void *arg) !rcu_gp_is_normal(); } rcu_torture_writer_state = RTWS_STUTTER; - stutter_wait("rcu_torture_writer"); + if (stutter_wait("rcu_torture_writer")) + for (i = 0; i < ARRAY_SIZE(rcu_tortures); i++) + if (list_empty(&rcu_tortures[i].rtort_free)) + WARN_ON_ONCE(1); } while (!torture_must_stop()); /* Reset expediting back to unexpedited. */ if (expediting > 0) diff --git a/kernel/torture.c b/kernel/torture.c index 1ac24a826589..17d91f5fba2a 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -573,7 +573,7 @@ static int stutter; * Block until the stutter interval ends. This must be called periodically * by all running kthreads that need to be subject to stuttering. */ -void stutter_wait(const char *title) +bool stutter_wait(const char *title) { int spt; @@ -590,6 +590,7 @@ void stutter_wait(const char *title) } torture_shutdown_absorb(title); } + return !!spt; } EXPORT_SYMBOL_GPL(stutter_wait); From 7c590fcca66b58957f8e34acdb0587cd1eeed35b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 7 Aug 2018 16:42:42 -0700 Subject: [PATCH 020/140] rcutorture: Maintain self-propagating CB only during forward-progress test The current forward-progress testing maintains a self-propagating callback during the full test. This could result in false negatives for stutter-end checking, where it might appear that RCU was clearing out old callbacks only because it was being continually motivated by the self-propagating callback. This commit therefore shuts down the self-propagating callback at the end of each forward-progress test interval. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index ae10ad531993..a02a2f21386b 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1706,7 +1706,7 @@ static int rcu_torture_fwd_prog(void *args) { unsigned long cver; unsigned long dur; - struct fwd_cb_state fcs = { .stop = 0 }; + struct fwd_cb_state fcs; unsigned long gps; int idx; int sd; @@ -1722,11 +1722,14 @@ static int rcu_torture_fwd_prog(void *args) set_user_nice(current, MAX_NICE); if (cur_ops->call && cur_ops->sync && cur_ops->cb_barrier) { init_rcu_head_on_stack(&fcs.rh); - cur_ops->call(&fcs.rh, rcu_torture_fwd_prog_cb); selfpropcb = true; } do { schedule_timeout_interruptible(fwd_progress_holdoff * HZ); + if (selfpropcb) { + WRITE_ONCE(fcs.stop, 0); + cur_ops->call(&fcs.rh, rcu_torture_fwd_prog_cb); + } cver = READ_ONCE(rcu_torture_current_version); gps = cur_ops->get_gp_seq(); sd = cur_ops->stall_dur() + 1; @@ -1748,13 +1751,15 @@ static int rcu_torture_fwd_prog(void *args) WARN_ON(!cver && gps < 2); pr_alert("%s: Duration %ld cver %ld gps %ld\n", __func__, dur, cver, gps); } + if (selfpropcb) { + WRITE_ONCE(fcs.stop, 1); + cur_ops->sync(); /* Wait for running CB to complete. */ + cur_ops->cb_barrier(); /* Wait for queued callbacks. */ + } /* Avoid slow periods, better to test when busy. */ stutter_wait("rcu_torture_fwd_prog"); } while (!torture_must_stop()); if (selfpropcb) { - WRITE_ONCE(fcs.stop, 1); - cur_ops->sync(); /* Wait for running callback to complete. */ - cur_ops->cb_barrier(); /* Wait for queued callbacks. */ WARN_ON(READ_ONCE(fcs.stop) != 2); destroy_rcu_head_on_stack(&fcs.rh); } From 77095901b895a64b6d775879b54c73472ba21e68 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 2 Jul 2018 08:25:57 -0700 Subject: [PATCH 021/140] doc: Update removal of RCU-bh/sched update machinery The RCU-bh update API is now defined in terms of that of RCU-bh and RCU-sched, so this commit updates the documentation accordingly. In addition, although RCU-sched persists in !PREEMPT kernels, in the PREEMPT case its update API is now defined in terms of that of RCU-preempt, so this commit also updates the documentation accordingly. While in the area, this commit removes the documentation for the now-obsolete synchronize_rcu_mult() and clarifies the Tasks RCU documentation. Signed-off-by: Paul E. McKenney --- .../Data-Structures/Data-Structures.html | 23 +-- .../Expedited-Grace-Periods.html | 7 +- .../RCU/Design/Requirements/Requirements.html | 149 ++++-------------- Documentation/RCU/stallwarn.txt | 13 +- Documentation/RCU/whatisRCU.txt | 3 +- .../admin-guide/kernel-parameters.txt | 16 +- Documentation/kernel-per-CPU-kthreads.txt | 2 +- 7 files changed, 61 insertions(+), 152 deletions(-) diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html index 50be87e59937..1d2051c0c3fc 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html @@ -1374,8 +1374,7 @@ that is, if the CPU is currently idle. Accessor Functions

The following listing shows the -rcu_get_root(), rcu_for_each_node_breadth_first, -rcu_for_each_nonleaf_node_breadth_first(), and +rcu_get_root(), rcu_for_each_node_breadth_first and rcu_for_each_leaf_node() function and macros:

@@ -1388,13 +1387,9 @@ Accessor Functions
   7   for ((rnp) = &(rsp)->node[0]; \
   8        (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
   9
- 10 #define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \
- 11   for ((rnp) = &(rsp)->node[0]; \
- 12        (rnp) < (rsp)->level[NUM_RCU_LVLS - 1]; (rnp)++)
- 13
- 14 #define rcu_for_each_leaf_node(rsp, rnp) \
- 15   for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
- 16        (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
+ 10 #define rcu_for_each_leaf_node(rsp, rnp) \
+ 11   for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
+ 12        (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
 

The rcu_get_root() simply returns a pointer to the @@ -1407,10 +1402,7 @@ macro takes advantage of the layout of the rcu_node structures in the rcu_state structure's ->node[] array, performing a breadth-first traversal by simply traversing the array in order. -The rcu_for_each_nonleaf_node_breadth_first() macro operates -similarly, but traverses only the first part of the array, thus excluding -the leaf rcu_node structures. -Finally, the rcu_for_each_leaf_node() macro traverses only +Similarly, the rcu_for_each_leaf_node() macro traverses only the last part of the array, thus traversing only the leaf rcu_node structures. @@ -1418,15 +1410,14 @@ the last part of the array, thus traversing only the leaf   Quick Quiz: - What do rcu_for_each_nonleaf_node_breadth_first() and + What does rcu_for_each_leaf_node() do if the rcu_node tree contains only a single node? Answer: In the single-node case, - rcu_for_each_nonleaf_node_breadth_first() is a no-op - and rcu_for_each_leaf_node() traverses the single node. + rcu_for_each_leaf_node() traverses the single node.   diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html index 7394f034be65..ffd612bfa436 100644 --- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html +++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html @@ -12,10 +12,9 @@ high efficiency and minimal disturbance, expedited grace periods accept lower efficiency and significant disturbance to attain shorter latencies.

-There are three flavors of RCU (RCU-bh, RCU-preempt, and RCU-sched), -but only two flavors of expedited grace periods because the RCU-bh -expedited grace period maps onto the RCU-sched expedited grace period. -Each of the remaining two implementations is covered in its own section. +There are two flavors of RCU (RCU-preempt and RCU-sched), with an earlier +third RCU-bh flavor having been implemented in terms of the other two. +Each of the two implementations is covered in its own section.

  1. diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html index 51f39f65002d..701b5c53607f 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.html +++ b/Documentation/RCU/Design/Requirements/Requirements.html @@ -1306,8 +1306,6 @@ doing so would degrade real-time response.

    This non-requirement appeared with preemptible RCU. -If you need a grace period that waits on non-preemptible code regions, use -RCU-sched.

    Parallelism Facts of Life

    @@ -2165,14 +2163,9 @@ however, this is not a panacea because there would be severe restrictions on what operations those callbacks could invoke.

    -Perhaps surprisingly, synchronize_rcu(), -synchronize_rcu_bh() -(discussed below), -synchronize_sched(), +Perhaps surprisingly, synchronize_rcu() and synchronize_rcu_expedited(), -synchronize_rcu_bh_expedited(), and -synchronize_sched_expedited() -will all operate normally +will operate normally during very early boot, the reason being that there is only one CPU and preemption is disabled. This means that the call synchronize_rcu() (or friends) @@ -2861,15 +2854,22 @@ The other four flavors are listed below, with requirements for each described in a separate section.

      -
    1. Bottom-Half Flavor -
    2. Sched Flavor +
    3. Bottom-Half Flavor (Historical) +
    4. Sched Flavor (Historical)
    5. Sleepable RCU
    6. Tasks RCU -
    7. - Waiting for Multiple Grace Periods
    -

    Bottom-Half Flavor

    +

    Bottom-Half Flavor (Historical)

    + +

    +The RCU-bh flavor of RCU has since been expressed in terms of +the other RCU flavors as part of a consolidation of the three +flavors into a single flavor. +The read-side API remains, and continues to disable softirq and to +be accounted for by lockdep. +Much of the material in this section is therefore strictly historical +in nature.

    The softirq-disable (AKA “bottom-half”, @@ -2929,8 +2929,20 @@ includes call_rcu_bh(), rcu_barrier_bh(), and rcu_read_lock_bh_held(). +However, the update-side APIs are now simple wrappers for other RCU +flavors, namely RCU-sched in CONFIG_PREEMPT=n kernels and RCU-preempt +otherwise. -

    Sched Flavor

    +

    Sched Flavor (Historical)

    + +

    +The RCU-sched flavor of RCU has since been expressed in terms of +the other RCU flavors as part of a consolidation of the three +flavors into a single flavor. +The read-side API remains, and continues to disable preemption and to +be accounted for by lockdep. +Much of the material in this section is therefore strictly historical +in nature.

    Before preemptible RCU, waiting for an RCU grace period had the @@ -3150,94 +3162,14 @@ The tasks-RCU API is quite compact, consisting only of call_rcu_tasks(), synchronize_rcu_tasks(), and rcu_barrier_tasks(). - -

    -Waiting for Multiple Grace Periods

    - -

    -Perhaps you have an RCU protected data structure that is accessed from -RCU read-side critical sections, from softirq handlers, and from -hardware interrupt handlers. -That is three flavors of RCU, the normal flavor, the bottom-half flavor, -and the sched flavor. -How to wait for a compound grace period? - -

    -The best approach is usually to “just say no!” and -insert rcu_read_lock() and rcu_read_unlock() -around each RCU read-side critical section, regardless of what -environment it happens to be in. -But suppose that some of the RCU read-side critical sections are -on extremely hot code paths, and that use of CONFIG_PREEMPT=n -is not a viable option, so that rcu_read_lock() and -rcu_read_unlock() are not free. -What then? - -

    -You could wait on all three grace periods in succession, as follows: - -

    -
    - 1 synchronize_rcu();
    - 2 synchronize_rcu_bh();
    - 3 synchronize_sched();
    -
    -
    - -

    -This works, but triples the update-side latency penalty. -In cases where this is not acceptable, synchronize_rcu_mult() -may be used to wait on all three flavors of grace period concurrently: - -

    -
    - 1 synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched);
    -
    -
    - -

    -But what if it is necessary to also wait on SRCU? -This can be done as follows: - -

    -
    - 1 static void call_my_srcu(struct rcu_head *head,
    - 2        void (*func)(struct rcu_head *head))
    - 3 {
    - 4   call_srcu(&my_srcu, head, func);
    - 5 }
    - 6
    - 7 synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched, call_my_srcu);
    -
    -
    - -

    -If you needed to wait on multiple different flavors of SRCU -(but why???), you would need to create a wrapper function resembling -call_my_srcu() for each SRCU flavor. - - - - - - - - -
     
    Quick Quiz:
    - But what if I need to wait for multiple RCU flavors, but I also need - the grace periods to be expedited? -
    Answer:
    - If you are using expedited grace periods, there should be less penalty - for waiting on them in succession. - But if that is nevertheless a problem, you can use workqueues - or multiple kthreads to wait on the various expedited grace - periods concurrently. -
     
    - -

    -Again, it is usually better to adjust the RCU read-side critical sections -to use a single flavor of RCU, but when this is not feasible, you can use -synchronize_rcu_mult(). +In CONFIG_PREEMPT=n kernels, trampolines cannot be preempted, +so these APIs map to +call_rcu(), +synchronize_rcu(), and +rcu_barrier(), respectively. +In CONFIG_PREEMPT=y kernels, trampolines can be preempted, +and these three APIs are therefore implemented by separate functions +that check for voluntary context switches.

    Possible Future Changes

    @@ -3248,12 +3180,6 @@ If this becomes a serious problem, it will be necessary to rework the grace-period state machine so as to avoid the need for the additional latency. -

    -Expedited grace periods scan the CPUs, so their latency and overhead -increases with increasing numbers of CPUs. -If this becomes a serious problem on large systems, it will be necessary -to do some redesign to avoid this scalability problem. -

    RCU disables CPU hotplug in a few places, perhaps most notably in the rcu_barrier() operations. @@ -3298,11 +3224,6 @@ Please note that arrangements that require RCU to remap CPU numbers will require extremely good demonstration of need and full exploration of alternatives. -

    -There is an embarrassingly large number of flavors of RCU, and this -number has been increasing over time. -Perhaps it will be possible to combine some at some future date. -

    RCU's various kthreads are reasonably recent additions. It is quite likely that adjustments will be required to more gracefully diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index f99cf11b314b..491043fd976f 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -16,12 +16,9 @@ o A CPU looping in an RCU read-side critical section. o A CPU looping with interrupts disabled. -o A CPU looping with preemption disabled. This condition can - result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh - stalls. +o A CPU looping with preemption disabled. -o A CPU looping with bottom halves disabled. This condition can - result in RCU-sched and RCU-bh stalls. +o A CPU looping with bottom halves disabled. o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel without invoking schedule(). If the looping in the kernel is @@ -87,9 +84,9 @@ o A hardware failure. This is quite unlikely, but has occurred This resulted in a series of RCU CPU stall warnings, eventually leading the realization that the CPU had failed. -The RCU, RCU-sched, RCU-bh, and RCU-tasks implementations have CPU stall -warning. Note that SRCU does -not- have CPU stall warnings. Please note -that RCU only detects CPU stalls when there is a grace period in progress. +The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning. +Note that SRCU does -not- have CPU stall warnings. Please note that +RCU only detects CPU stalls when there is a grace period in progress. No grace period, no CPU stall warnings. To diagnose the cause of the stall, inspect the stack traces. diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index c2a7facf7ff9..86d82f7f3500 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -934,7 +934,8 @@ c. Do you need to treat NMI handlers, hardirq handlers, d. Do you need RCU grace periods to complete even in the face of softirq monopolization of one or more of the CPUs? For example, is your code subject to network-based denial-of-service - attacks? If so, you need RCU-bh. + attacks? If so, you should disable softirq across your readers, + for example, by using rcu_read_lock_bh(). e. Is your workload too update-intensive for normal use of RCU, but inappropriate for other synchronization mechanisms? diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 9871e649ffef..9e69ec54b7d3 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3534,14 +3534,14 @@ In kernels built with CONFIG_RCU_NOCB_CPU=y, set the specified list of CPUs to be no-callback CPUs. - Invocation of these CPUs' RCU callbacks will - be offloaded to "rcuox/N" kthreads created for - that purpose, where "x" is "b" for RCU-bh, "p" - for RCU-preempt, and "s" for RCU-sched, and "N" - is the CPU number. This reduces OS jitter on the - offloaded CPUs, which can be useful for HPC and - real-time workloads. It can also improve energy - efficiency for asymmetric multiprocessors. + Invocation of these CPUs' RCU callbacks will be + offloaded to "rcuox/N" kthreads created for that + purpose, where "x" is "p" for RCU-preempt, and + "s" for RCU-sched, and "N" is the CPU number. + This reduces OS jitter on the offloaded CPUs, + which can be useful for HPC and real-time + workloads. It can also improve energy efficiency + for asymmetric multiprocessors. rcu_nocb_poll [KNL] Rather than requiring that offloaded CPUs diff --git a/Documentation/kernel-per-CPU-kthreads.txt b/Documentation/kernel-per-CPU-kthreads.txt index 0f00f9c164ac..23b0c8b20cd1 100644 --- a/Documentation/kernel-per-CPU-kthreads.txt +++ b/Documentation/kernel-per-CPU-kthreads.txt @@ -321,7 +321,7 @@ To reduce its OS jitter, do at least one of the following: to do. Name: - rcuob/%d, rcuop/%d, and rcuos/%d + rcuop/%d and rcuos/%d Purpose: Offload RCU callbacks from the corresponding CPU. From 5c3f78ec285bf4902ff6b5c8e0ced520c750aeae Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 26 Jul 2018 13:19:22 -0700 Subject: [PATCH 022/140] doc: Fix broken HTML directive This commit adds the needed "<". Signed-off-by: Paul E. McKenney --- .../Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html index ffd612bfa436..e62c7c34a369 100644 --- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html +++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html @@ -157,7 +157,7 @@ whether or not the current CPU is in an RCU read-side critical section. The best that sync_sched_exp_handler() can do is to check for idle, on the off-chance that the CPU went idle while the IPI was in flight. -If the CPU is idle, then tt>sync_sched_exp_handler() reports +If the CPU is idle, then sync_sched_exp_handler() reports the quiescent state.

    From cf7614e13c8fcaf290c5ffaa04b2e1b4f704a52a Mon Sep 17 00:00:00 2001 From: Byungchul Park Date: Fri, 22 Jun 2018 15:12:06 +0900 Subject: [PATCH 023/140] rcu: Refactor rcu_{nmi,irq}_{enter,exit}() When entering or exiting irq or NMI handlers, the current code uses ->dynticks_nmi_nesting to detect if it is in the outermost handler, that is, the one interrupting or returning to an RCU-idle context (the idle loop or nohz_full usermode execution). When entering the outermost handler via an interrupt (as opposed to NMI), it is necessary to invoke rcu_dynticks_task_exit() just before the CPU is marked non-idle from an RCU perspective and to invoke rcu_cleanup_after_idle() just after the CPU is marked non-idle. Similarly, when exiting the outermost handler via an interrupt, it is necessary to invoke rcu_prepare_for_idle() just before marking the CPU idle and to invoke rcu_dynticks_task_enter() just after marking the CPU idle. The decision to execute these four functions is currently taken in rcu_irq_enter() and rcu_irq_exit() as follows: rcu_irq_enter() /* A conditional branch with ->dynticks_nmi_nesting */ rcu_nmi_enter() /* A conditional branch with ->dynticks */ /* A conditional branch with ->dynticks_nmi_nesting */ rcu_irq_exit() /* A conditional branch with ->dynticks_nmi_nesting */ rcu_nmi_exit() /* A conditional branch with ->dynticks_nmi_nesting */ /* A conditional branch with ->dynticks_nmi_nesting */ rcu_nmi_enter() /* A conditional branch with ->dynticks */ rcu_nmi_exit() /* A conditional branch with ->dynticks_nmi_nesting */ This works, but the conditional branches in rcu_irq_enter() and rcu_irq_exit() are redundant with those in rcu_nmi_enter() and rcu_nmi_exit(), respectively. Redundant branches are not something we want in the to/from-idle fastpaths, so this commit refactors rcu_{nmi,irq}_{enter,exit}() so they use a common inlined function passed a constant argument as follows: rcu_irq_enter() inlining rcu_nmi_enter_common(irq=true) /* A conditional branch with ->dynticks */ rcu_irq_exit() inlining rcu_nmi_exit_common(irq=true) /* A conditional branch with ->dynticks_nmi_nesting */ rcu_nmi_enter() inlining rcu_nmi_enter_common(irq=false) /* A conditional branch with ->dynticks */ rcu_nmi_exit() inlining rcu_nmi_exit_common(irq=false) /* A conditional branch with ->dynticks_nmi_nesting */ The combination of the constant function argument and the inlining allows the compiler to discard the conditionals that previously controlled execution of rcu_dynticks_task_exit(), rcu_cleanup_after_idle(), rcu_prepare_for_idle(), and rcu_dynticks_task_enter(). This reduces both the to-idle and from-idle path lengths by two conditional branches each, and improves readability as well. This commit also changes order of execution from this: rcu_dynticks_task_exit(); rcu_dynticks_eqs_exit(); trace_rcu_dyntick(); rcu_cleanup_after_idle(); To this: rcu_dynticks_task_exit(); rcu_dynticks_eqs_exit(); rcu_cleanup_after_idle(); trace_rcu_dyntick(); In other words, the calls to rcu_cleanup_after_idle() and trace_rcu_dyntick() are reversed. This has no functional effect because the real concern is whether a given call is before or after the call to rcu_dynticks_eqs_exit(), and this patch does not change that. Before the call to rcu_dynticks_eqs_exit(), RCU is not yet watching the current CPU and after that call RCU is watching. A similar switch in calling order happens on the idle-entry path, with similar lack of effect for the same reasons. Suggested-by: Paul E. McKenney Signed-off-by: Byungchul Park Signed-off-by: Paul E. McKenney [ paulmck: Applied Steven Rostedt feedback. ] Reviewed-by: Steven Rostedt (VMware) --- kernel/rcu/tree.c | 66 +++++++++++++++++++++++++++++++---------------- 1 file changed, 44 insertions(+), 22 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0b760c1369f7..36786789b625 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -770,18 +770,16 @@ void rcu_user_enter(void) } #endif /* CONFIG_NO_HZ_FULL */ -/** - * rcu_nmi_exit - inform RCU of exit from NMI context - * +/* * If we are returning from the outermost NMI handler that interrupted an * RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting * to let the RCU grace-period handling know that the CPU is back to * being RCU-idle. * - * If you add or remove a call to rcu_nmi_exit(), be sure to test + * If you add or remove a call to rcu_nmi_exit_common(), be sure to test * with CONFIG_RCU_EQS_DEBUG=y. */ -void rcu_nmi_exit(void) +static __always_inline void rcu_nmi_exit_common(bool irq) { struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); @@ -807,7 +805,26 @@ void rcu_nmi_exit(void) /* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */ trace_rcu_dyntick(TPS("Startirq"), rdtp->dynticks_nmi_nesting, 0, rdtp->dynticks); WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */ + + if (irq) + rcu_prepare_for_idle(); + rcu_dynticks_eqs_enter(); + + if (irq) + rcu_dynticks_task_enter(); +} + +/** + * rcu_nmi_exit - inform RCU of exit from NMI context + * @irq: Is this call from rcu_irq_exit? + * + * If you add or remove a call to rcu_nmi_exit(), be sure to test + * with CONFIG_RCU_EQS_DEBUG=y. + */ +void rcu_nmi_exit(void) +{ + rcu_nmi_exit_common(false); } /** @@ -831,14 +848,8 @@ void rcu_nmi_exit(void) */ void rcu_irq_exit(void) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); - lockdep_assert_irqs_disabled(); - if (rdtp->dynticks_nmi_nesting == 1) - rcu_prepare_for_idle(); - rcu_nmi_exit(); - if (rdtp->dynticks_nmi_nesting == 0) - rcu_dynticks_task_enter(); + rcu_nmi_exit_common(true); } /* @@ -921,7 +932,8 @@ void rcu_user_exit(void) #endif /* CONFIG_NO_HZ_FULL */ /** - * rcu_nmi_enter - inform RCU of entry to NMI context + * rcu_nmi_enter_common - inform RCU of entry to NMI context + * @irq: Is this call from rcu_irq_enter? * * If the CPU was idle from RCU's viewpoint, update rdtp->dynticks and * rdtp->dynticks_nmi_nesting to let the RCU grace-period handling know @@ -929,10 +941,10 @@ void rcu_user_exit(void) * long as the nesting level does not overflow an int. (You will probably * run out of stack space first.) * - * If you add or remove a call to rcu_nmi_enter(), be sure to test + * If you add or remove a call to rcu_nmi_enter_common(), be sure to test * with CONFIG_RCU_EQS_DEBUG=y. */ -void rcu_nmi_enter(void) +static __always_inline void rcu_nmi_enter_common(bool irq) { struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); long incby = 2; @@ -949,7 +961,15 @@ void rcu_nmi_enter(void) * period (observation due to Andy Lutomirski). */ if (rcu_dynticks_curr_cpu_in_eqs()) { + + if (irq) + rcu_dynticks_task_exit(); + rcu_dynticks_eqs_exit(); + + if (irq) + rcu_cleanup_after_idle(); + incby = 1; } trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="), @@ -960,6 +980,14 @@ void rcu_nmi_enter(void) barrier(); } +/** + * rcu_nmi_enter - inform RCU of entry to NMI context + */ +void rcu_nmi_enter(void) +{ + rcu_nmi_enter_common(false); +} + /** * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle * @@ -984,14 +1012,8 @@ void rcu_nmi_enter(void) */ void rcu_irq_enter(void) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); - lockdep_assert_irqs_disabled(); - if (rdtp->dynticks_nmi_nesting == 0) - rcu_dynticks_task_exit(); - rcu_nmi_enter(); - if (rdtp->dynticks_nmi_nesting == 1) - rcu_cleanup_after_idle(); + rcu_nmi_enter_common(true); } /* From 3e31009898699dfca823893054748d85048dc7b3 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 21 Jun 2018 12:50:01 -0700 Subject: [PATCH 024/140] rcu: Defer reporting RCU-preempt quiescent states when disabled This commit defers reporting of RCU-preempt quiescent states at rcu_read_unlock_special() time when any of interrupts, softirq, or preemption are disabled. These deferred quiescent states are reported at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug offline operation. Of course, if another RCU read-side critical section has started in the meantime, the reporting of the quiescent state will be further deferred. This also means that disabling preemption, interrupts, and/or softirqs will act as an RCU-preempt read-side critical section. This is enforced by checking preempt_count() as needed. Some special cases must be handled on an ad-hoc basis, for example, context switch is a quiescent state even though both the scheduler and do_exit() disable preemption. In these cases, additional calls to rcu_preempt_deferred_qs() override the preemption disabling. Similar logic overrides disabled interrupts in rcu_preempt_check_callbacks() because in this case the quiescent state happened just before the corresponding scheduling-clock interrupt. In theory, this change lifts a long-standing restriction that required that if interrupts were disabled across a call to rcu_read_unlock() that the matching rcu_read_lock() also be contained within that interrupts-disabled region of code. Because the reporting of the corresponding RCU-preempt quiescent state is now deferred until after interrupts have been enabled, it is no longer possible for this situation to result in deadlocks involving the scheduler's runqueue and priority-inheritance locks. This may allow some code simplification that might reduce interrupt latency a bit. Unfortunately, in practice this would also defer deboosting a low-priority task that had been subjected to RCU priority boosting, so real-time-response considerations might well force this restriction to remain in place. Because RCU-preempt grace periods are now blocked not only by RCU read-side critical sections, but also by disabling of interrupts, preemption, and softirqs, it will be possible to eliminate RCU-bh and RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels. This may require some additional plumbing to provide the network denial-of-service guarantees that have been traditionally provided by RCU-bh. Once these are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh into RCU-sched. This would mean that all kernels would have but one flavor of RCU, which would open the door to significant code cleanup. Moving to a single flavor of RCU would also have the beneficial effect of reducing the NOCB kthreads by at least a factor of two. Signed-off-by: Paul E. McKenney [ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback from Joel Fernandes. ] [ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in response to bug reports from kbuild test robot. ] [ paulmck: Fix bug located by kbuild test robot involving recursion via rcu_preempt_deferred_qs(). ] --- .../RCU/Design/Requirements/Requirements.html | 50 +++--- include/linux/rcutiny.h | 5 + kernel/rcu/tree.c | 9 ++ kernel/rcu/tree.h | 3 + kernel/rcu/tree_exp.h | 71 +++++++-- kernel/rcu/tree_plugin.h | 144 +++++++++++++----- 6 files changed, 205 insertions(+), 77 deletions(-) diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html index 49690228b1c6..038714475edb 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.html +++ b/Documentation/RCU/Design/Requirements/Requirements.html @@ -2394,30 +2394,9 @@ when invoked from a CPU-hotplug notifier.

    RCU depends on the scheduler, and the scheduler uses RCU to protect some of its data structures. -This means the scheduler is forbidden from acquiring -the runqueue locks and the priority-inheritance locks -in the middle of an outermost RCU read-side critical section unless either -(1) it releases them before exiting that same -RCU read-side critical section, or -(2) interrupts are disabled across -that entire RCU read-side critical section. -This same prohibition also applies (recursively!) to any lock that is acquired -while holding any lock to which this prohibition applies. -Adhering to this rule prevents preemptible RCU from invoking -rcu_read_unlock_special() while either runqueue or -priority-inheritance locks are held, thus avoiding deadlock. - -

    -Prior to v4.4, it was only necessary to disable preemption across -RCU read-side critical sections that acquired scheduler locks. -In v4.4, expedited grace periods started using IPIs, and these -IPIs could force a rcu_read_unlock() to take the slowpath. -Therefore, this expedited-grace-period change required disabling of -interrupts, not just preemption. - -

    -For RCU's part, the preemptible-RCU rcu_read_unlock() -implementation must be written carefully to avoid similar deadlocks. +The preemptible-RCU rcu_read_unlock() +implementation must therefore be written carefully to avoid deadlocks +involving the scheduler's runqueue and priority-inheritance locks. In particular, rcu_read_unlock() must tolerate an interrupt where the interrupt handler invokes both rcu_read_lock() and rcu_read_unlock(). @@ -2426,7 +2405,7 @@ negative nesting levels to avoid destructive recursion via interrupt handler's use of RCU.

    -This pair of mutual scheduler-RCU requirements came as a +This scheduler-RCU requirement came as a complete surprise.

    @@ -2437,9 +2416,28 @@ when running context-switch-heavy workloads when built with CONFIG_NO_HZ_FULL=y did come as a surprise [PDF]. RCU has made good progress towards meeting this requirement, even -for context-switch-have CONFIG_NO_HZ_FULL=y workloads, +for context-switch-heavy CONFIG_NO_HZ_FULL=y workloads, but there is room for further improvement. +

    +In the past, it was forbidden to disable interrupts across an +rcu_read_unlock() unless that interrupt-disabled region +of code also included the matching rcu_read_lock(). +Violating this restriction could result in deadlocks involving the +scheduler's runqueue and priority-inheritance spinlocks. +This restriction was lifted when interrupt-disabled calls to +rcu_read_unlock() started deferring the reporting of +the resulting RCU-preempt quiescent state until the end of that +interrupts-disabled region. +This deferred reporting means that the scheduler's runqueue and +priority-inheritance locks cannot be held while reporting an RCU-preempt +quiescent state, which lifts the earlier restriction, at least from +a deadlock perspective. +Unfortunately, real-time systems using RCU priority boosting may +need this restriction to remain in effect because deferred +quiescent-state reporting also defers deboosting, which in turn +degrades real-time latencies. +

    Tracing and RCU

    diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index 8d9a0ea8f0b5..f617ab19bb51 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -115,6 +115,11 @@ static inline void rcu_irq_exit_irqson(void) { } static inline void rcu_irq_enter_irqson(void) { } static inline void rcu_irq_exit(void) { } static inline void exit_rcu(void) { } +static inline bool rcu_preempt_need_deferred_qs(struct task_struct *t) +{ + return false; +} +static inline void rcu_preempt_deferred_qs(struct task_struct *t) { } #ifdef CONFIG_SRCU void rcu_scheduler_starting(void); #else /* #ifndef CONFIG_SRCU */ diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 36786789b625..346624716d6e 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -422,6 +422,7 @@ static void rcu_momentary_dyntick_idle(void) special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); /* It is illegal to call this from idle state. */ WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR)); + rcu_preempt_deferred_qs(current); } /* @@ -729,6 +730,7 @@ static void rcu_eqs_enter(bool user) do_nocb_deferred_wakeup(rdp); } rcu_prepare_for_idle(); + rcu_preempt_deferred_qs(current); WRITE_ONCE(rdtp->dynticks_nesting, 0); /* Avoid irq-access tearing. */ rcu_dynticks_eqs_enter(); rcu_dynticks_task_enter(); @@ -2850,6 +2852,12 @@ __rcu_process_callbacks(struct rcu_state *rsp) WARN_ON_ONCE(!rdp->beenonline); + /* Report any deferred quiescent states if preemption enabled. */ + if (!(preempt_count() & PREEMPT_MASK)) + rcu_preempt_deferred_qs(current); + else if (rcu_preempt_need_deferred_qs(current)) + resched_cpu(rdp->cpu); /* Provoke future context switch. */ + /* Update RCU state based on any recent quiescent states. */ rcu_check_quiescent_state(rsp, rdp); @@ -3823,6 +3831,7 @@ void rcu_report_dead(unsigned int cpu) rcu_report_exp_rdp(&rcu_sched_state, this_cpu_ptr(rcu_sched_state.rda), true); preempt_enable(); + rcu_preempt_deferred_qs(current); for_each_rcu_flavor(rsp) rcu_cleanup_dying_idle_cpu(cpu, rsp); diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 4e74df768c57..025bd2e5592b 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -195,6 +195,7 @@ struct rcu_data { bool core_needs_qs; /* Core waits for quiesc state. */ bool beenonline; /* CPU online at least once. */ bool gpwrap; /* Possible ->gp_seq wrap. */ + bool deferred_qs; /* This CPU awaiting a deferred QS? */ struct rcu_node *mynode; /* This CPU's leaf of hierarchy */ unsigned long grpmask; /* Mask to apply to leaf qsmask. */ unsigned long ticks_this_gp; /* The number of scheduling-clock */ @@ -461,6 +462,8 @@ static void rcu_cleanup_after_idle(void); static void rcu_prepare_for_idle(void); static void rcu_idle_count_callbacks_posted(void); static bool rcu_preempt_has_tasks(struct rcu_node *rnp); +static bool rcu_preempt_need_deferred_qs(struct task_struct *t); +static void rcu_preempt_deferred_qs(struct task_struct *t); static void print_cpu_stall_info_begin(void); static void print_cpu_stall_info(struct rcu_state *rsp, int cpu); static void print_cpu_stall_info_end(void); diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 0b2c2ad69629..f9d5bbd8adce 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -262,6 +262,7 @@ static void rcu_report_exp_cpu_mult(struct rcu_state *rsp, struct rcu_node *rnp, static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp, bool wake) { + WRITE_ONCE(rdp->deferred_qs, false); rcu_report_exp_cpu_mult(rsp, rdp->mynode, rdp->grpmask, wake); } @@ -735,32 +736,70 @@ EXPORT_SYMBOL_GPL(synchronize_sched_expedited); */ static void sync_rcu_exp_handler(void *info) { - struct rcu_data *rdp; + unsigned long flags; struct rcu_state *rsp = info; + struct rcu_data *rdp = this_cpu_ptr(rsp->rda); + struct rcu_node *rnp = rdp->mynode; struct task_struct *t = current; /* - * Within an RCU read-side critical section, request that the next - * rcu_read_unlock() report. Unless this RCU read-side critical - * section has already blocked, in which case it is already set - * up for the expedited grace period to wait on it. + * First, the common case of not being in an RCU read-side + * critical section. If also enabled or idle, immediately + * report the quiescent state, otherwise defer. */ - if (t->rcu_read_lock_nesting > 0 && - !t->rcu_read_unlock_special.b.blocked) { - t->rcu_read_unlock_special.b.exp_need_qs = true; + if (!t->rcu_read_lock_nesting) { + if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) || + rcu_dynticks_curr_cpu_in_eqs()) { + rcu_report_exp_rdp(rsp, rdp, true); + } else { + rdp->deferred_qs = true; + resched_cpu(rdp->cpu); + } return; } /* - * We are either exiting an RCU read-side critical section (negative - * values of t->rcu_read_lock_nesting) or are not in one at all - * (zero value of t->rcu_read_lock_nesting). Or we are in an RCU - * read-side critical section that blocked before this expedited - * grace period started. Either way, we can immediately report - * the quiescent state. + * Second, the less-common case of being in an RCU read-side + * critical section. In this case we can count on a future + * rcu_read_unlock(). However, this rcu_read_unlock() might + * execute on some other CPU, but in that case there will be + * a future context switch. Either way, if the expedited + * grace period is still waiting on this CPU, set ->deferred_qs + * so that the eventual quiescent state will be reported. + * Note that there is a large group of race conditions that + * can have caused this quiescent state to already have been + * reported, so we really do need to check ->expmask. */ - rdp = this_cpu_ptr(rsp->rda); - rcu_report_exp_rdp(rsp, rdp, true); + if (t->rcu_read_lock_nesting > 0) { + raw_spin_lock_irqsave_rcu_node(rnp, flags); + if (rnp->expmask & rdp->grpmask) + rdp->deferred_qs = true; + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + } + + /* + * The final and least likely case is where the interrupted + * code was just about to or just finished exiting the RCU-preempt + * read-side critical section, and no, we can't tell which. + * So either way, set ->deferred_qs to flag later code that + * a quiescent state is required. + * + * If the CPU is fully enabled (or if some buggy RCU-preempt + * read-side critical section is being used from idle), just + * invoke rcu_preempt_defer_qs() to immediately report the + * quiescent state. We cannot use rcu_read_unlock_special() + * because we are in an interrupt handler, which will cause that + * function to take an early exit without doing anything. + * + * Otherwise, use resched_cpu() to force a context switch after + * the CPU enables everything. + */ + rdp->deferred_qs = true; + if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) || + WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs())) + rcu_preempt_deferred_qs(t); + else + resched_cpu(rdp->cpu); } /** diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index a97c20ea9bce..542791361908 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -371,6 +371,9 @@ static void rcu_preempt_note_context_switch(bool preempt) * behalf of preempted instance of __rcu_read_unlock(). */ rcu_read_unlock_special(t); + rcu_preempt_deferred_qs(t); + } else { + rcu_preempt_deferred_qs(t); } /* @@ -464,54 +467,51 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp) } /* - * Handle special cases during rcu_read_unlock(), such as needing to - * notify RCU core processing or task having blocked during the RCU - * read-side critical section. + * Report deferred quiescent states. The deferral time can + * be quite short, for example, in the case of the call from + * rcu_read_unlock_special(). */ -static void rcu_read_unlock_special(struct task_struct *t) +static void +rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) { bool empty_exp; bool empty_norm; bool empty_exp_now; - unsigned long flags; struct list_head *np; bool drop_boost_mutex = false; struct rcu_data *rdp; struct rcu_node *rnp; union rcu_special special; - /* NMI handlers cannot block and cannot safely manipulate state. */ - if (in_nmi()) - return; - - local_irq_save(flags); - /* * If RCU core is waiting for this CPU to exit its critical section, * report the fact that it has exited. Because irqs are disabled, * t->rcu_read_unlock_special cannot change. */ special = t->rcu_read_unlock_special; + rdp = this_cpu_ptr(rcu_state_p->rda); + if (!special.s && !rdp->deferred_qs) { + local_irq_restore(flags); + return; + } if (special.b.need_qs) { rcu_preempt_qs(); t->rcu_read_unlock_special.b.need_qs = false; - if (!t->rcu_read_unlock_special.s) { + if (!t->rcu_read_unlock_special.s && !rdp->deferred_qs) { local_irq_restore(flags); return; } } /* - * Respond to a request for an expedited grace period, but only if - * we were not preempted, meaning that we were running on the same - * CPU throughout. If we were preempted, the exp_need_qs flag - * would have been cleared at the time of the first preemption, - * and the quiescent state would be reported when we were dequeued. + * Respond to a request by an expedited grace period for a + * quiescent state from this CPU. Note that requests from + * tasks are handled when removing the task from the + * blocked-tasks list below. */ - if (special.b.exp_need_qs) { - WARN_ON_ONCE(special.b.blocked); + if (special.b.exp_need_qs || rdp->deferred_qs) { t->rcu_read_unlock_special.b.exp_need_qs = false; - rdp = this_cpu_ptr(rcu_state_p->rda); + rdp->deferred_qs = false; rcu_report_exp_rdp(rcu_state_p, rdp, true); if (!t->rcu_read_unlock_special.s) { local_irq_restore(flags); @@ -519,19 +519,6 @@ static void rcu_read_unlock_special(struct task_struct *t) } } - /* Hardware IRQ handlers cannot block, complain if they get here. */ - if (in_irq() || in_serving_softirq()) { - lockdep_rcu_suspicious(__FILE__, __LINE__, - "rcu_read_unlock() from irq or softirq with blocking in critical section!!!\n"); - pr_alert("->rcu_read_unlock_special: %#x (b: %d, enq: %d nq: %d)\n", - t->rcu_read_unlock_special.s, - t->rcu_read_unlock_special.b.blocked, - t->rcu_read_unlock_special.b.exp_need_qs, - t->rcu_read_unlock_special.b.need_qs); - local_irq_restore(flags); - return; - } - /* Clean up if blocked during RCU read-side critical section. */ if (special.b.blocked) { t->rcu_read_unlock_special.b.blocked = false; @@ -602,6 +589,72 @@ static void rcu_read_unlock_special(struct task_struct *t) } } +/* + * Is a deferred quiescent-state pending, and are we also not in + * an RCU read-side critical section? It is the caller's responsibility + * to ensure it is otherwise safe to report any deferred quiescent + * states. The reason for this is that it is safe to report a + * quiescent state during context switch even though preemption + * is disabled. This function cannot be expected to understand these + * nuances, so the caller must handle them. + */ +static bool rcu_preempt_need_deferred_qs(struct task_struct *t) +{ + return (this_cpu_ptr(&rcu_preempt_data)->deferred_qs || + READ_ONCE(t->rcu_read_unlock_special.s)) && + !t->rcu_read_lock_nesting; +} + +/* + * Report a deferred quiescent state if needed and safe to do so. + * As with rcu_preempt_need_deferred_qs(), "safe" involves only + * not being in an RCU read-side critical section. The caller must + * evaluate safety in terms of interrupt, softirq, and preemption + * disabling. + */ +static void rcu_preempt_deferred_qs(struct task_struct *t) +{ + unsigned long flags; + bool couldrecurse = t->rcu_read_lock_nesting >= 0; + + if (!rcu_preempt_need_deferred_qs(t)) + return; + if (couldrecurse) + t->rcu_read_lock_nesting -= INT_MIN; + local_irq_save(flags); + rcu_preempt_deferred_qs_irqrestore(t, flags); + if (couldrecurse) + t->rcu_read_lock_nesting += INT_MIN; +} + +/* + * Handle special cases during rcu_read_unlock(), such as needing to + * notify RCU core processing or task having blocked during the RCU + * read-side critical section. + */ +static void rcu_read_unlock_special(struct task_struct *t) +{ + unsigned long flags; + bool preempt_bh_were_disabled = + !!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)); + bool irqs_were_disabled; + + /* NMI handlers cannot block and cannot safely manipulate state. */ + if (in_nmi()) + return; + + local_irq_save(flags); + irqs_were_disabled = irqs_disabled_flags(flags); + if ((preempt_bh_were_disabled || irqs_were_disabled) && + t->rcu_read_unlock_special.b.blocked) { + /* Need to defer quiescent state until everything is enabled. */ + raise_softirq_irqoff(RCU_SOFTIRQ); + local_irq_restore(flags); + return; + } + rcu_preempt_deferred_qs_irqrestore(t, flags); +} + /* * Dump detailed information for all tasks blocking the current RCU * grace period on the specified rcu_node structure. @@ -737,10 +790,20 @@ static void rcu_preempt_check_callbacks(void) struct rcu_state *rsp = &rcu_preempt_state; struct task_struct *t = current; - if (t->rcu_read_lock_nesting == 0) { - rcu_preempt_qs(); + if (t->rcu_read_lock_nesting > 0 || + (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) { + /* No QS, force context switch if deferred. */ + if (rcu_preempt_need_deferred_qs(t)) + resched_cpu(smp_processor_id()); + } else if (rcu_preempt_need_deferred_qs(t)) { + rcu_preempt_deferred_qs(t); /* Report deferred QS. */ + return; + } else if (!t->rcu_read_lock_nesting) { + rcu_preempt_qs(); /* Report immediate QS. */ return; } + + /* If GP is oldish, ask for help from rcu_read_unlock_special(). */ if (t->rcu_read_lock_nesting > 0 && __this_cpu_read(rcu_data_p->core_needs_qs) && __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm) && @@ -859,6 +922,7 @@ void exit_rcu(void) barrier(); t->rcu_read_unlock_special.b.blocked = true; __rcu_read_unlock(); + rcu_preempt_deferred_qs(current); } /* @@ -940,6 +1004,16 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp) return false; } +/* + * Because there is no preemptible RCU, there can be no deferred quiescent + * states. + */ +static bool rcu_preempt_need_deferred_qs(struct task_struct *t) +{ + return false; +} +static void rcu_preempt_deferred_qs(struct task_struct *t) { } + /* * Because preemptible RCU does not exist, we never have to check for * tasks blocked within RCU read-side critical sections. From c0335743c5d80233753d81a4c7d22b7437363a8f Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 21 Jun 2018 16:17:46 -0700 Subject: [PATCH 025/140] rcutorture: Test extended "rcu" read-side critical sections This commit makes the "rcu" torture type test extended read-side critical sections in order to test the deferral of RCU-preempt quiescent-state testing. In CONFIG_PREEMPT=n kernels, this simply duplicates the setup already in place for the "sched" torture type. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index c596c6f1e457..c55d1483886e 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -431,6 +431,7 @@ static struct rcu_torture_ops rcu_ops = { .stats = NULL, .irq_capable = 1, .can_boost = rcu_can_boost(), + .extendables = RCUTORTURE_MAX_EXTEND, .name = "rcu" }; From 27c744e32a9a4066daca0ee7496819bff78c1b37 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 27 Jun 2018 21:48:00 -0700 Subject: [PATCH 026/140] rcu: Allow processing deferred QSes for exiting RCU-preempt readers If an RCU-preempt read-side critical section is exiting, that is, ->rcu_read_lock_nesting is negative, then it is a good time to look at the possibility of reporting deferred quiescent states. This commit therefore updates the checks in rcu_preempt_need_deferred_qs() to allow exiting critical sections to report deferred quiescent states. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_plugin.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 542791361908..24c209676d20 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -602,7 +602,7 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t) { return (this_cpu_ptr(&rcu_preempt_data)->deferred_qs || READ_ONCE(t->rcu_read_unlock_special.s)) && - !t->rcu_read_lock_nesting; + t->rcu_read_lock_nesting <= 0; } /* From fcc878e4dfb70128a73857c609d70570629b0d9e Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 28 Jun 2018 07:39:59 -0700 Subject: [PATCH 027/140] rcu: Remove now-unused ->b.exp_need_qs field from the rcu_special union The ->b.exp_need_qs field is now set only to false, so this commit removes it. The job this field used to do is now done by the rcu_data structure's ->deferred_qs field, which is a consequence of a better split between task-based (the rcu_node structure's ->exp_tasks field) and CPU-based (the aforementioned rcu_data structure's ->deferred_qs field) tracking of quiescent states for RCU-preempt expedited grace periods. Signed-off-by: Paul E. McKenney --- include/linux/sched.h | 6 +----- kernel/rcu/tree_plugin.h | 13 ++++--------- 2 files changed, 5 insertions(+), 14 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 977cb57d7bc9..004ca21f7e80 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -571,12 +571,8 @@ union rcu_special { struct { u8 blocked; u8 need_qs; - u8 exp_need_qs; - - /* Otherwise the compiler can store garbage here: */ - u8 pad; } b; /* Bits. */ - u32 s; /* Set of bits. */ + u16 s; /* Set of bits. */ }; enum perf_event_task_context { diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 24c209676d20..527a52792dce 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -284,13 +284,10 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp) * no need to check for a subsequent expedited GP. (Though we are * still in a quiescent state in any case.) */ - if (blkd_state & RCU_EXP_BLKD && - t->rcu_read_unlock_special.b.exp_need_qs) { - t->rcu_read_unlock_special.b.exp_need_qs = false; + if (blkd_state & RCU_EXP_BLKD && rdp->deferred_qs) rcu_report_exp_rdp(rdp->rsp, rdp, true); - } else { - WARN_ON_ONCE(t->rcu_read_unlock_special.b.exp_need_qs); - } + else + WARN_ON_ONCE(rdp->deferred_qs); } /* @@ -509,9 +506,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) * tasks are handled when removing the task from the * blocked-tasks list below. */ - if (special.b.exp_need_qs || rdp->deferred_qs) { - t->rcu_read_unlock_special.b.exp_need_qs = false; - rdp->deferred_qs = false; + if (rdp->deferred_qs) { rcu_report_exp_rdp(rcu_state_p, rdp, true); if (!t->rcu_read_unlock_special.s) { local_irq_restore(flags); From e11ec65cc8d63c41fc468363b65826a5ae4b8c66 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 28 Jun 2018 12:45:23 -0700 Subject: [PATCH 028/140] rcu: Add warning to detect half-interrupts RCU's dyntick-idle code is written to tolerate half-interrupts, that it, either an interrupt that invokes rcu_irq_enter() but never invokes the corresponding rcu_irq_exit() on the one hand, or an interrupt that never invokes rcu_irq_enter() but does invoke the "corresponding" rcu_irq_exit() on the other. These things really did happen at one time, as evidenced by this ca-2011 LKML post: http://lkml.kernel.org/r/20111014170019.GE2428@linux.vnet.ibm.com The reason why RCU tolerates half-interrupts is that usermode helpers used exceptions to invoke a system call from within the kernel such that the system call did a normal return (not a return from exception) to the calling context. This caused rcu_irq_enter() to be invoked without a matching rcu_irq_exit(). However, usermode helpers have since been rewritten to make much more housebroken use of workqueues, kernel threads, and do_execve(), and therefore should no longer produce half-interrupts. No one knows of any other source of half-interrupts, but then again, no one seems insane enough to go audit the entire kernel to verify that half-interrupts really are a relic of the past. This commit therefore adds a pair of WARN_ON_ONCE() calls that will trigger in the presence of half interrupts, which the code will continue to handle correctly. If neither of these WARN_ON_ONCE() trigger by mid-2021, then perhaps RCU can stop handling half-interrupts, which would be a considerable simplification. Reported-by: Steven Rostedt Reported-by: Joel Fernandes Reported-by: Andy Lutomirski Signed-off-by: Paul E. McKenney Reviewed-by: Joel Fernandes (Google) --- kernel/rcu/tree.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 346624716d6e..0b42249e2e40 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -714,6 +714,7 @@ static void rcu_eqs_enter(bool user) struct rcu_dynticks *rdtp; rdtp = this_cpu_ptr(&rcu_dynticks); + WARN_ON_ONCE(rdtp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE); WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && rdtp->dynticks_nesting == 0); @@ -896,6 +897,7 @@ static void rcu_eqs_exit(bool user) trace_rcu_dyntick(TPS("End"), rdtp->dynticks_nesting, 1, rdtp->dynticks); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); WRITE_ONCE(rdtp->dynticks_nesting, 1); + WARN_ON_ONCE(rdtp->dynticks_nmi_nesting); WRITE_ONCE(rdtp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE); } From d28139c4e96713d52a300fb9036c5be2f45e0741 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 28 Jun 2018 14:45:25 -0700 Subject: [PATCH 029/140] rcu: Apply RCU-bh QSes to RCU-sched and RCU-preempt when safe One necessary step towards consolidating the three flavors of RCU is to make sure that the resulting consolidated "one flavor to rule them all" correctly handles networking denial-of-service attacks. One thing that allows RCU-bh to do so is that __do_softirq() invokes rcu_bh_qs() every so often, and so something similar has to happen for consolidated RCU. This must be done carefully. For example, if a preemption-disabled region of code takes an interrupt which does softirq processing before returning, consolidated RCU must ignore the resulting rcu_bh_qs() invocations -- preemption is still disabled, and that means an RCU reader for the consolidated flavor. This commit therefore creates a new rcu_softirq_qs() that is called only from the ksoftirqd task, thus avoiding the interrupted-a-preempted-region problem. This new rcu_softirq_qs() function invokes rcu_sched_qs(), rcu_preempt_qs(), and rcu_preempt_deferred_qs(). The latter call handles any deferred quiescent states. Note that __do_softirq() still invokes rcu_bh_qs(). It will continue to do so until a later stage of cleanup when the RCU-bh flavor is removed. Signed-off-by: Paul E. McKenney [ paulmck: Fix !SMP issue located by kbuild test robot. ] --- include/linux/rcutiny.h | 5 +++++ include/linux/rcutree.h | 1 + kernel/rcu/tree.c | 7 +++++++ kernel/rcu/tree.h | 1 + kernel/rcu/tree_plugin.h | 5 +++++ kernel/softirq.c | 2 ++ 6 files changed, 21 insertions(+) diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index f617ab19bb51..bcfbc40a7239 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -90,6 +90,11 @@ static inline void kfree_call_rcu(struct rcu_head *head, call_rcu(head, func); } +static inline void rcu_softirq_qs(void) +{ + rcu_sched_qs(); +} + #define rcu_note_context_switch(preempt) \ do { \ rcu_sched_qs(); \ diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 914655848ef6..664b580695d6 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -30,6 +30,7 @@ #ifndef __LINUX_RCUTREE_H #define __LINUX_RCUTREE_H +void rcu_softirq_qs(void); void rcu_note_context_switch(bool preempt); int rcu_needs_cpu(u64 basem, u64 *nextevt); void rcu_cpu_stall_reset(void); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0b42249e2e40..cb35a417d947 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -255,6 +255,13 @@ void rcu_bh_qs(void) } } +void rcu_softirq_qs(void) +{ + rcu_sched_qs(); + rcu_preempt_qs(); + rcu_preempt_deferred_qs(current); +} + /* * Steal a bit from the bottom of ->dynticks for idle entry/exit * control. Initially this is for TLB flushing. diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 025bd2e5592b..e02c882861eb 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -433,6 +433,7 @@ DECLARE_PER_CPU(char, rcu_cpu_has_work); /* Forward declarations for rcutree_plugin.h */ static void rcu_bootup_announce(void); +static void rcu_preempt_qs(void); static void rcu_preempt_note_context_switch(bool preempt); static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp); #ifdef CONFIG_HOTPLUG_CPU diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 527a52792dce..c686bf63bba5 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -974,6 +974,11 @@ static void __init rcu_bootup_announce(void) rcu_bootup_announce_oddness(); } +/* Because preemptible RCU does not exist, we can ignore its QSes. */ +static void rcu_preempt_qs(void) +{ +} + /* * Because preemptible RCU does not exist, we never have to check for * CPUs being in quiescent states. diff --git a/kernel/softirq.c b/kernel/softirq.c index 6f584861d329..ebd69694144a 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -302,6 +302,8 @@ restart: } rcu_bh_qs(); + if (__this_cpu_read(ksoftirqd) == current) + rcu_softirq_qs(); local_irq_disable(); pending = local_softirq_pending(); From ba1c64c27239373be1b3d88cf0a9ac1b10fa871f Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 30 Jun 2018 15:23:37 -0700 Subject: [PATCH 030/140] rcu: Report expedited grace periods at context-switch time This commit reduces the latency of expedited RCU grace periods by reporting a quiescent state for the CPU at context-switch time. In CONFIG_PREEMPT=y kernels, if the outgoing task is still within an RCU read-side critical section (and thus still blocking some grace period, perhaps including this expedited grace period), then that task will already have been placed on one of the leaf rcu_node structures' ->blkd_tasks list. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_plugin.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index c686bf63bba5..0d7107fb3dec 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -332,7 +332,7 @@ static void rcu_preempt_qs(void) static void rcu_preempt_note_context_switch(bool preempt) { struct task_struct *t = current; - struct rcu_data *rdp; + struct rcu_data *rdp = this_cpu_ptr(rcu_state_p->rda); struct rcu_node *rnp; lockdep_assert_irqs_disabled(); @@ -341,7 +341,6 @@ static void rcu_preempt_note_context_switch(bool preempt) !t->rcu_read_unlock_special.b.blocked) { /* Possibly blocking in an RCU read-side critical section. */ - rdp = this_cpu_ptr(rcu_state_p->rda); rnp = rdp->mynode; raw_spin_lock_rcu_node(rnp); t->rcu_read_unlock_special.b.blocked = true; @@ -383,6 +382,8 @@ static void rcu_preempt_note_context_switch(bool preempt) * means that we continue to block the current grace period. */ rcu_preempt_qs(); + if (rdp->deferred_qs) + rcu_report_exp_rdp(rcu_state_p, rdp, true); } /* From 65cfe3583b612a22e12fba9a7bbd2d37ca5ad941 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sun, 1 Jul 2018 07:40:52 -0700 Subject: [PATCH 031/140] rcu: Define RCU-bh update API in terms of RCU Now that the main RCU API knows about softirq disabling and softirq's quiescent states, the RCU-bh update code can be dispensed with. This commit therefore removes the RCU-bh update-side implementation and defines RCU-bh's update-side API in terms of that of either RCU-preempt or RCU-sched, depending on the setting of the CONFIG_PREEMPT Kconfig option. In kernels built with CONFIG_RCU_NOCB_CPU=y this has the knock-on effect of reducing by one the number of rcuo kthreads per CPU. Signed-off-by: Paul E. McKenney --- include/linux/rcupdate.h | 10 ++-- include/linux/rcutiny.h | 10 +++- include/linux/rcutree.h | 8 ++- kernel/rcu/tiny.c | 115 +++++++-------------------------------- kernel/rcu/tree.c | 97 +++------------------------------ kernel/rcu/tree_plugin.h | 1 - kernel/softirq.c | 1 - 7 files changed, 48 insertions(+), 194 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 75e5b393cf44..9ebfd436cec7 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -55,11 +55,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func); #define call_rcu call_rcu_sched #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ -void call_rcu_bh(struct rcu_head *head, rcu_callback_t func); void call_rcu_sched(struct rcu_head *head, rcu_callback_t func); void synchronize_sched(void); void rcu_barrier_tasks(void); +static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) +{ + call_rcu(head, func); +} + #ifdef CONFIG_PREEMPT_RCU void __rcu_read_lock(void); @@ -104,7 +108,6 @@ static inline int rcu_preempt_depth(void) void rcu_init(void); extern int rcu_scheduler_active __read_mostly; void rcu_sched_qs(void); -void rcu_bh_qs(void); void rcu_check_callbacks(int user); void rcu_report_dead(unsigned int cpu); void rcutree_migrate_callbacks(int cpu); @@ -326,8 +329,7 @@ static inline void rcu_preempt_sleep_check(void) { } * and rcu_assign_pointer(). Some of these could be folded into their * callers, but they are left separate in order to ease introduction of * multiple flavors of pointers to match the multiple flavors of RCU - * (e.g., __rcu_bh, * __rcu_sched, and __srcu), should this make sense in - * the future. + * (e.g., __rcu_sched, and __srcu), should this make sense in the future. */ #ifdef __CHECKER__ diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index bcfbc40a7239..ac26c27ccde8 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -56,19 +56,23 @@ static inline void cond_synchronize_sched(unsigned long oldstate) might_sleep(); } -extern void rcu_barrier_bh(void); -extern void rcu_barrier_sched(void); - static inline void synchronize_rcu_expedited(void) { synchronize_sched(); /* Only one CPU, so pretty fast anyway!!! */ } +extern void rcu_barrier_sched(void); + static inline void rcu_barrier(void) { rcu_barrier_sched(); /* Only one CPU, so only one list of callbacks! */ } +static inline void rcu_barrier_bh(void) +{ + rcu_barrier(); +} + static inline void synchronize_rcu_bh(void) { synchronize_sched(); diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 664b580695d6..c789c302a2c9 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -45,7 +45,11 @@ static inline void rcu_virt_note_context_switch(int cpu) rcu_note_context_switch(false); } -void synchronize_rcu_bh(void); +static inline void synchronize_rcu_bh(void) +{ + synchronize_rcu(); +} + void synchronize_sched_expedited(void); void synchronize_rcu_expedited(void); @@ -69,7 +73,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func); */ static inline void synchronize_rcu_bh_expedited(void) { - synchronize_sched_expedited(); + synchronize_rcu_expedited(); } void rcu_barrier(void); diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index befc9321a89c..cadcf63c4889 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -51,64 +51,22 @@ static struct rcu_ctrlblk rcu_sched_ctrlblk = { .curtail = &rcu_sched_ctrlblk.rcucblist, }; -static struct rcu_ctrlblk rcu_bh_ctrlblk = { - .donetail = &rcu_bh_ctrlblk.rcucblist, - .curtail = &rcu_bh_ctrlblk.rcucblist, -}; - -void rcu_barrier_bh(void) -{ - wait_rcu_gp(call_rcu_bh); -} -EXPORT_SYMBOL(rcu_barrier_bh); - void rcu_barrier_sched(void) { wait_rcu_gp(call_rcu_sched); } EXPORT_SYMBOL(rcu_barrier_sched); -/* - * Helper function for rcu_sched_qs() and rcu_bh_qs(). - * Also irqs are disabled to avoid confusion due to interrupt handlers - * invoking call_rcu(). - */ -static int rcu_qsctr_help(struct rcu_ctrlblk *rcp) -{ - if (rcp->donetail != rcp->curtail) { - rcp->donetail = rcp->curtail; - return 1; - } - - return 0; -} - -/* - * Record an rcu quiescent state. And an rcu_bh quiescent state while we - * are at it, given that any rcu quiescent state is also an rcu_bh - * quiescent state. Use "+" instead of "||" to defeat short circuiting. - */ +/* Record an rcu quiescent state. */ void rcu_sched_qs(void) { unsigned long flags; local_irq_save(flags); - if (rcu_qsctr_help(&rcu_sched_ctrlblk) + - rcu_qsctr_help(&rcu_bh_ctrlblk)) - raise_softirq(RCU_SOFTIRQ); - local_irq_restore(flags); -} - -/* - * Record an rcu_bh quiescent state. - */ -void rcu_bh_qs(void) -{ - unsigned long flags; - - local_irq_save(flags); - if (rcu_qsctr_help(&rcu_bh_ctrlblk)) + if (rcu_sched_ctrlblk.donetail != rcu_sched_ctrlblk.curtail) { + rcu_sched_ctrlblk.donetail = rcu_sched_ctrlblk.curtail; raise_softirq(RCU_SOFTIRQ); + } local_irq_restore(flags); } @@ -122,32 +80,27 @@ void rcu_check_callbacks(int user) { if (user) rcu_sched_qs(); - if (user || !in_softirq()) - rcu_bh_qs(); } -/* - * Invoke the RCU callbacks on the specified rcu_ctrlkblk structure - * whose grace period has elapsed. - */ -static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp) +/* Invoke the RCU callbacks whose grace period has elapsed. */ +static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) { struct rcu_head *next, *list; unsigned long flags; /* Move the ready-to-invoke callbacks to a local list. */ local_irq_save(flags); - if (rcp->donetail == &rcp->rcucblist) { + if (rcu_sched_ctrlblk.donetail == &rcu_sched_ctrlblk.rcucblist) { /* No callbacks ready, so just leave. */ local_irq_restore(flags); return; } - list = rcp->rcucblist; - rcp->rcucblist = *rcp->donetail; - *rcp->donetail = NULL; - if (rcp->curtail == rcp->donetail) - rcp->curtail = &rcp->rcucblist; - rcp->donetail = &rcp->rcucblist; + list = rcu_sched_ctrlblk.rcucblist; + rcu_sched_ctrlblk.rcucblist = *rcu_sched_ctrlblk.donetail; + *rcu_sched_ctrlblk.donetail = NULL; + if (rcu_sched_ctrlblk.curtail == rcu_sched_ctrlblk.donetail) + rcu_sched_ctrlblk.curtail = &rcu_sched_ctrlblk.rcucblist; + rcu_sched_ctrlblk.donetail = &rcu_sched_ctrlblk.rcucblist; local_irq_restore(flags); /* Invoke the callbacks on the local list. */ @@ -162,19 +115,13 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp) } } -static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) -{ - __rcu_process_callbacks(&rcu_sched_ctrlblk); - __rcu_process_callbacks(&rcu_bh_ctrlblk); -} - /* * Wait for a grace period to elapse. But it is illegal to invoke * synchronize_sched() from within an RCU read-side critical section. * Therefore, any legal call to synchronize_sched() is a quiescent * state, and so on a UP system, synchronize_sched() need do nothing. - * Ditto for synchronize_rcu_bh(). (But Lai Jiangshan points out the - * benefits of doing might_sleep() to reduce latency.) + * (But Lai Jiangshan points out the benefits of doing might_sleep() + * to reduce latency.) * * Cool, huh? (Due to Josh Triplett.) */ @@ -188,11 +135,11 @@ void synchronize_sched(void) EXPORT_SYMBOL_GPL(synchronize_sched); /* - * Helper function for call_rcu() and call_rcu_bh(). + * Post an RCU callback to be invoked after the end of an RCU-sched grace + * period. But since we have but one CPU, that would be after any + * quiescent state. */ -static void __call_rcu(struct rcu_head *head, - rcu_callback_t func, - struct rcu_ctrlblk *rcp) +void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) { unsigned long flags; @@ -201,8 +148,8 @@ static void __call_rcu(struct rcu_head *head, head->next = NULL; local_irq_save(flags); - *rcp->curtail = head; - rcp->curtail = &head->next; + *rcu_sched_ctrlblk.curtail = head; + rcu_sched_ctrlblk.curtail = &head->next; local_irq_restore(flags); if (unlikely(is_idle_task(current))) { @@ -210,28 +157,8 @@ static void __call_rcu(struct rcu_head *head, resched_cpu(0); } } - -/* - * Post an RCU callback to be invoked after the end of an RCU-sched grace - * period. But since we have but one CPU, that would be after any - * quiescent state. - */ -void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) -{ - __call_rcu(head, func, &rcu_sched_ctrlblk); -} EXPORT_SYMBOL_GPL(call_rcu_sched); -/* - * Post an RCU bottom-half callback to be invoked after any subsequent - * quiescent state. - */ -void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) -{ - __call_rcu(head, func, &rcu_bh_ctrlblk); -} -EXPORT_SYMBOL_GPL(call_rcu_bh); - void __init rcu_init(void) { open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index cb35a417d947..aedf81a0abd8 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -108,7 +108,6 @@ struct rcu_state sname##_state = { \ } RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched); -RCU_STATE_INITIALIZER(rcu_bh, 'b', call_rcu_bh); static struct rcu_state *const rcu_state_p; LIST_HEAD(rcu_struct_flavors); @@ -244,17 +243,6 @@ void rcu_sched_qs(void) this_cpu_ptr(&rcu_sched_data), true); } -void rcu_bh_qs(void) -{ - RCU_LOCKDEP_WARN(preemptible(), "rcu_bh_qs() invoked with preemption enabled!!!"); - if (__this_cpu_read(rcu_bh_data.cpu_no_qs.s)) { - trace_rcu_grace_period(TPS("rcu_bh"), - __this_cpu_read(rcu_bh_data.gp_seq), - TPS("cpuqs")); - __this_cpu_write(rcu_bh_data.cpu_no_qs.b.norm, false); - } -} - void rcu_softirq_qs(void) { rcu_sched_qs(); @@ -581,7 +569,7 @@ EXPORT_SYMBOL_GPL(rcu_sched_get_gp_seq); */ unsigned long rcu_bh_get_gp_seq(void) { - return READ_ONCE(rcu_bh_state.gp_seq); + return READ_ONCE(rcu_state_p->gp_seq); } EXPORT_SYMBOL_GPL(rcu_bh_get_gp_seq); @@ -621,7 +609,7 @@ EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); */ void rcu_bh_force_quiescent_state(void) { - force_quiescent_state(&rcu_bh_state); + force_quiescent_state(rcu_state_p); } EXPORT_SYMBOL_GPL(rcu_bh_force_quiescent_state); @@ -680,10 +668,8 @@ void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, switch (test_type) { case RCU_FLAVOR: - rsp = rcu_state_p; - break; case RCU_BH_FLAVOR: - rsp = &rcu_bh_state; + rsp = rcu_state_p; break; case RCU_SCHED_FLAVOR: rsp = &rcu_sched_state; @@ -2673,26 +2659,15 @@ void rcu_check_callbacks(int user) * nested interrupt. In this case, the CPU is in * a quiescent state, so note it. * - * No memory barrier is required here because both - * rcu_sched_qs() and rcu_bh_qs() reference only CPU-local - * variables that other CPUs neither access nor modify, - * at least not while the corresponding CPU is online. + * No memory barrier is required here because + * rcu_sched_qs() references only CPU-local variables + * that other CPUs neither access nor modify, at least + * not while the corresponding CPU is online. */ rcu_sched_qs(); - rcu_bh_qs(); rcu_note_voluntary_context_switch(current); - } else if (!in_softirq()) { - - /* - * Get here if this CPU did not take its interrupt from - * softirq, in other words, if it is not interrupting - * a rcu_bh read-side critical section. This is an _bh - * critical section, so note it. - */ - - rcu_bh_qs(); } rcu_preempt_check_callbacks(); if (rcu_pending()) @@ -3079,34 +3054,6 @@ void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) } EXPORT_SYMBOL_GPL(call_rcu_sched); -/** - * call_rcu_bh() - Queue an RCU for invocation after a quicker grace period. - * @head: structure to be used for queueing the RCU updates. - * @func: actual callback function to be invoked after the grace period - * - * The callback function will be invoked some time after a full grace - * period elapses, in other words after all currently executing RCU - * read-side critical sections have completed. call_rcu_bh() assumes - * that the read-side critical sections end on completion of a softirq - * handler. This means that read-side critical sections in process - * context must not be interrupted by softirqs. This interface is to be - * used when most of the read-side critical sections are in softirq context. - * RCU read-side critical sections are delimited by: - * - * - rcu_read_lock() and rcu_read_unlock(), if in interrupt context, OR - * - rcu_read_lock_bh() and rcu_read_unlock_bh(), if in process context. - * - * These may be nested. - * - * See the description of call_rcu() for more detailed information on - * memory ordering guarantees. - */ -void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) -{ - __call_rcu(head, func, &rcu_bh_state, -1, 0); -} -EXPORT_SYMBOL_GPL(call_rcu_bh); - /* * Queue an RCU callback for lazy invocation after a grace period. * This will likely be later named something like "call_rcu_lazy()", @@ -3191,33 +3138,6 @@ void synchronize_sched(void) } EXPORT_SYMBOL_GPL(synchronize_sched); -/** - * synchronize_rcu_bh - wait until an rcu_bh grace period has elapsed. - * - * Control will return to the caller some time after a full rcu_bh grace - * period has elapsed, in other words after all currently executing rcu_bh - * read-side critical sections have completed. RCU read-side critical - * sections are delimited by rcu_read_lock_bh() and rcu_read_unlock_bh(), - * and may be nested. - * - * See the description of synchronize_sched() for more detailed information - * on memory ordering guarantees. - */ -void synchronize_rcu_bh(void) -{ - RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || - lock_is_held(&rcu_lock_map) || - lock_is_held(&rcu_sched_lock_map), - "Illegal synchronize_rcu_bh() in RCU-bh read-side critical section"); - if (rcu_blocking_is_gp()) - return; - if (rcu_gp_is_expedited()) - synchronize_rcu_bh_expedited(); - else - wait_rcu_gp(call_rcu_bh); -} -EXPORT_SYMBOL_GPL(synchronize_rcu_bh); - /** * get_state_synchronize_rcu - Snapshot current RCU state * @@ -3529,7 +3449,7 @@ static void _rcu_barrier(struct rcu_state *rsp) */ void rcu_barrier_bh(void) { - _rcu_barrier(&rcu_bh_state); + _rcu_barrier(rcu_state_p); } EXPORT_SYMBOL_GPL(rcu_barrier_bh); @@ -4180,7 +4100,6 @@ void __init rcu_init(void) rcu_bootup_announce(); rcu_init_geometry(); - rcu_init_one(&rcu_bh_state); rcu_init_one(&rcu_sched_state); if (dump_tree) rcu_dump_rcu_node_tree(&rcu_sched_state); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 0d7107fb3dec..1ff742a3c8d1 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1320,7 +1320,6 @@ static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp, static void rcu_kthread_do_work(void) { rcu_do_batch(&rcu_sched_state, this_cpu_ptr(&rcu_sched_data)); - rcu_do_batch(&rcu_bh_state, this_cpu_ptr(&rcu_bh_data)); rcu_do_batch(&rcu_preempt_state, this_cpu_ptr(&rcu_preempt_data)); } diff --git a/kernel/softirq.c b/kernel/softirq.c index ebd69694144a..7a0720a20003 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -301,7 +301,6 @@ restart: pending >>= softirq_bit; } - rcu_bh_qs(); if (__this_cpu_read(ksoftirqd) == current) rcu_softirq_qs(); local_irq_disable(); From 82fcecfa81855924cc69f3078113cf63dd6c2964 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 2 Jul 2018 09:04:27 -0700 Subject: [PATCH 032/140] rcu: Update comments and help text for no more RCU-bh updaters This commit updates comments and help text to account for the fact that RCU-bh update-side functions are now simple wrappers for their RCU or RCU-sched counterparts. Signed-off-by: Paul E. McKenney --- include/linux/rcupdate.h | 12 ++++-------- include/linux/rcupdate_wait.h | 6 +++--- include/linux/rcutree.h | 14 ++------------ kernel/rcu/Kconfig | 10 +++++----- kernel/rcu/tree.c | 17 +++++++++-------- kernel/rcu/update.c | 2 +- 6 files changed, 24 insertions(+), 37 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 9ebfd436cec7..8d5740edd63c 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -688,14 +688,10 @@ static inline void rcu_read_unlock(void) /** * rcu_read_lock_bh() - mark the beginning of an RCU-bh critical section * - * This is equivalent of rcu_read_lock(), but to be used when updates - * are being done using call_rcu_bh() or synchronize_rcu_bh(). Since - * both call_rcu_bh() and synchronize_rcu_bh() consider completion of a - * softirq handler to be a quiescent state, a process in RCU read-side - * critical section must be protected by disabling softirqs. Read-side - * critical sections in interrupt context can use just rcu_read_lock(), - * though this should at least be commented to avoid confusing people - * reading the code. + * This is equivalent of rcu_read_lock(), but also disables softirqs. + * Note that synchronize_rcu() and friends may be used for the update + * side, although synchronize_rcu_bh() is available as a wrapper in the + * short term. Longer term, the _bh update-side API will be eliminated. * * Note that rcu_read_lock_bh() and the matching rcu_read_unlock_bh() * must occur in the same context, for example, it is illegal to invoke diff --git a/include/linux/rcupdate_wait.h b/include/linux/rcupdate_wait.h index 57f371344152..bc104699560e 100644 --- a/include/linux/rcupdate_wait.h +++ b/include/linux/rcupdate_wait.h @@ -36,13 +36,13 @@ do { \ * @...: List of call_rcu() functions for the flavors to wait on. * * This macro waits concurrently for multiple flavors of RCU grace periods. - * For example, synchronize_rcu_mult(call_rcu, call_rcu_bh) would wait - * on concurrent RCU and RCU-bh grace periods. Waiting on a give SRCU + * For example, synchronize_rcu_mult(call_rcu, call_rcu_sched) would wait + * on concurrent RCU and RCU-sched grace periods. Waiting on a give SRCU * domain requires you to write a wrapper function for that SRCU domain's * call_srcu() function, supplying the corresponding srcu_struct. * * If Tiny RCU, tell _wait_rcu_gp() not to bother waiting for RCU - * or RCU-bh, given that anywhere synchronize_rcu_mult() can be called + * or RCU-sched, given that anywhere synchronize_rcu_mult() can be called * is automatically a grace period. */ #define synchronize_rcu_mult(...) \ diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index c789c302a2c9..f7a41323aa54 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -58,18 +58,8 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func); /** * synchronize_rcu_bh_expedited - Brute-force RCU-bh grace period * - * Wait for an RCU-bh grace period to elapse, but use a "big hammer" - * approach to force the grace period to end quickly. This consumes - * significant time on all CPUs and is unfriendly to real-time workloads, - * so is thus not recommended for any sort of common-case code. In fact, - * if you are using synchronize_rcu_bh_expedited() in a loop, please - * restructure your code to batch your updates, and then use a single - * synchronize_rcu_bh() instead. - * - * Note that it is illegal to call this function while holding any lock - * that is acquired by a CPU-hotplug notifier. And yes, it is also illegal - * to call this function from a CPU-hotplug notifier. Failing to observe - * these restriction will result in deadlock. + * This is a transitional API and will soon be removed, with all + * callers converted to synchronize_rcu_expedited(). */ static inline void synchronize_rcu_bh_expedited(void) { diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig index 9210379c0353..a0b7f0103ca9 100644 --- a/kernel/rcu/Kconfig +++ b/kernel/rcu/Kconfig @@ -229,11 +229,11 @@ config RCU_NOCB_CPU CPUs specified at boot time by the rcu_nocbs parameter. For each such CPU, a kthread ("rcuox/N") will be created to invoke callbacks, where the "N" is the CPU being offloaded, - and where the "x" is "b" for RCU-bh, "p" for RCU-preempt, and - "s" for RCU-sched. Nothing prevents this kthread from running - on the specified CPUs, but (1) the kthreads may be preempted - between each callback, and (2) affinity or cgroups can be used - to force the kthreads to run on whatever set of CPUs is desired. + and where the "p" for RCU-preempt and "s" for RCU-sched. + Nothing prevents this kthread from running on the specified + CPUs, but (1) the kthreads may be preempted between each + callback, and (2) affinity or cgroups can be used to force + the kthreads to run on whatever set of CPUs is desired. Say Y here if you want to help to debug reduced OS jitter. Say N here if you are unsure. diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index aedf81a0abd8..158c58d47b07 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -565,7 +565,8 @@ unsigned long rcu_sched_get_gp_seq(void) EXPORT_SYMBOL_GPL(rcu_sched_get_gp_seq); /* - * Return the number of RCU-bh GPs completed thus far for debug & stats. + * Return the number of RCU GPs completed thus far for debug & stats. + * This is a transitional API and will soon be removed. */ unsigned long rcu_bh_get_gp_seq(void) { @@ -3069,13 +3070,13 @@ void kfree_call_rcu(struct rcu_head *head, EXPORT_SYMBOL_GPL(kfree_call_rcu); /* - * Because a context switch is a grace period for RCU-sched and RCU-bh, - * any blocking grace-period wait automatically implies a grace period - * if there is only one CPU online at any point time during execution - * of either synchronize_sched() or synchronize_rcu_bh(). It is OK to - * occasionally incorrectly indicate that there are multiple CPUs online - * when there was in fact only one the whole time, as this just adds - * some overhead: RCU still operates correctly. + * Because a context switch is a grace period for RCU-sched, any blocking + * grace-period wait automatically implies a grace period if there + * is only one CPU online at any point time during execution of either + * synchronize_sched() or synchronize_rcu_bh(). It is OK to occasionally + * incorrectly indicate that there are multiple CPUs online when there + * was in fact only one the whole time, as this just adds some overhead: + * RCU still operates correctly. */ static int rcu_blocking_is_gp(void) { diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index 39cb23d22109..9ea87d0aa386 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -298,7 +298,7 @@ EXPORT_SYMBOL_GPL(rcu_read_lock_held); * * Check debug_lockdep_rcu_enabled() to prevent false positives during boot. * - * Note that rcu_read_lock() is disallowed if the CPU is either idle or + * Note that rcu_read_lock_bh() is disallowed if the CPU is either idle or * offline from an RCU perspective, so check for those as well. */ int rcu_read_lock_bh_held(void) From 2bbfc25b09dff6335acf4103c6c7c4591e62988b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 2 Jul 2018 09:17:57 -0700 Subject: [PATCH 033/140] rcu: Drop "wake" parameter from rcu_report_exp_rdp() The rcu_report_exp_rdp() function is always invoked with its "wake" argument set to "true", so this commit drops this parameter. The only potential call site that would use "false" is in the code driving the expedited grace period, and that code uses rcu_report_exp_cpu_mult() instead, which therefore retains its "wake" parameter. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 9 +++------ kernel/rcu/tree_exp.h | 9 ++++----- kernel/rcu/tree_plugin.h | 6 +++--- 3 files changed, 10 insertions(+), 14 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 158c58d47b07..e1927147a4a5 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -165,8 +165,7 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf); static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu); static void invoke_rcu_core(void); static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp); -static void rcu_report_exp_rdp(struct rcu_state *rsp, - struct rcu_data *rdp, bool wake); +static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp); static void sync_sched_exp_online_cleanup(int cpu); /* rcuc/rcub kthread realtime priority */ @@ -239,8 +238,7 @@ void rcu_sched_qs(void) if (!__this_cpu_read(rcu_sched_data.cpu_no_qs.b.exp)) return; __this_cpu_write(rcu_sched_data.cpu_no_qs.b.exp, false); - rcu_report_exp_rdp(&rcu_sched_state, - this_cpu_ptr(&rcu_sched_data), true); + rcu_report_exp_rdp(&rcu_sched_state, this_cpu_ptr(&rcu_sched_data)); } void rcu_softirq_qs(void) @@ -3758,8 +3756,7 @@ void rcu_report_dead(unsigned int cpu) /* QS for any half-done expedited RCU-sched GP. */ preempt_disable(); - rcu_report_exp_rdp(&rcu_sched_state, - this_cpu_ptr(rcu_sched_state.rda), true); + rcu_report_exp_rdp(&rcu_sched_state, this_cpu_ptr(rcu_sched_state.rda)); preempt_enable(); rcu_preempt_deferred_qs(current); for_each_rcu_flavor(rsp) diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index f9d5bbd8adce..0f8f225c1b46 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -259,11 +259,10 @@ static void rcu_report_exp_cpu_mult(struct rcu_state *rsp, struct rcu_node *rnp, /* * Report expedited quiescent state for specified rcu_data (CPU). */ -static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp, - bool wake) +static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp) { WRITE_ONCE(rdp->deferred_qs, false); - rcu_report_exp_cpu_mult(rsp, rdp->mynode, rdp->grpmask, wake); + rcu_report_exp_cpu_mult(rsp, rdp->mynode, rdp->grpmask, true); } /* Common code for synchronize_{rcu,sched}_expedited() work-done checking. */ @@ -352,7 +351,7 @@ static void sync_sched_exp_handler(void *data) return; if (rcu_is_cpu_rrupt_from_idle()) { rcu_report_exp_rdp(&rcu_sched_state, - this_cpu_ptr(&rcu_sched_data), true); + this_cpu_ptr(&rcu_sched_data)); return; } __this_cpu_write(rcu_sched_data.cpu_no_qs.b.exp, true); @@ -750,7 +749,7 @@ static void sync_rcu_exp_handler(void *info) if (!t->rcu_read_lock_nesting) { if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) || rcu_dynticks_curr_cpu_in_eqs()) { - rcu_report_exp_rdp(rsp, rdp, true); + rcu_report_exp_rdp(rsp, rdp); } else { rdp->deferred_qs = true; resched_cpu(rdp->cpu); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 1ff742a3c8d1..9f0d054e6c20 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -285,7 +285,7 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp) * still in a quiescent state in any case.) */ if (blkd_state & RCU_EXP_BLKD && rdp->deferred_qs) - rcu_report_exp_rdp(rdp->rsp, rdp, true); + rcu_report_exp_rdp(rdp->rsp, rdp); else WARN_ON_ONCE(rdp->deferred_qs); } @@ -383,7 +383,7 @@ static void rcu_preempt_note_context_switch(bool preempt) */ rcu_preempt_qs(); if (rdp->deferred_qs) - rcu_report_exp_rdp(rcu_state_p, rdp, true); + rcu_report_exp_rdp(rcu_state_p, rdp); } /* @@ -508,7 +508,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) * blocked-tasks list below. */ if (rdp->deferred_qs) { - rcu_report_exp_rdp(rcu_state_p, rdp, true); + rcu_report_exp_rdp(rcu_state_p, rdp); if (!t->rcu_read_unlock_special.s) { local_irq_restore(flags); return; From 4cf439a200fd621f838270c36c853407a934bcb5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 2 Jul 2018 12:15:25 -0700 Subject: [PATCH 034/140] rcu: Fix typo in rcu_get_gp_kthreads_prio() header comment Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e1927147a4a5..61c15de884b0 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -181,7 +181,7 @@ module_param(gp_init_delay, int, 0444); static int gp_cleanup_delay; module_param(gp_cleanup_delay, int, 0444); -/* Retreive RCU kthreads priority for rcutorture */ +/* Retrieve RCU kthreads priority for rcutorture */ int rcu_get_gp_kthreads_prio(void) { return kthread_prio; From 45975c7d21a1c0aba97e3d8007e2a7c123145748 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 2 Jul 2018 14:30:37 -0700 Subject: [PATCH 035/140] rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds Now that RCU-preempt knows about preemption disabling, its implementation of synchronize_rcu() works for synchronize_sched(), and likewise for the other RCU-sched update-side API members. This commit therefore confines the RCU-sched update-side code to CONFIG_PREEMPT=n builds, and defines RCU-sched's update-side API members in terms of those of RCU-preempt. This means that any given build of the Linux kernel has only one update-side flavor of RCU, namely RCU-preempt for CONFIG_PREEMPT=y builds and RCU-sched for CONFIG_PREEMPT=n builds. This in turn means that kernels built with CONFIG_RCU_NOCB_CPU=y have only one rcuo kthread per CPU. Signed-off-by: Paul E. McKenney Cc: Andi Kleen --- include/linux/rcupdate.h | 14 +- include/linux/rcutiny.h | 7 + include/linux/rcutree.h | 7 +- kernel/rcu/tree.c | 297 +++++++++++++------------------------- kernel/rcu/tree.h | 9 +- kernel/rcu/tree_exp.h | 153 ++++++++++---------- kernel/rcu/tree_plugin.h | 299 +++++++++++++++------------------------ 7 files changed, 307 insertions(+), 479 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 8d5740edd63c..94474bb6b5c4 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -49,11 +49,11 @@ /* Exported common interfaces */ -#ifdef CONFIG_PREEMPT_RCU -void call_rcu(struct rcu_head *head, rcu_callback_t func); -#else /* #ifdef CONFIG_PREEMPT_RCU */ +#ifdef CONFIG_TINY_RCU #define call_rcu call_rcu_sched -#endif /* #else #ifdef CONFIG_PREEMPT_RCU */ +#else +void call_rcu(struct rcu_head *head, rcu_callback_t func); +#endif void call_rcu_sched(struct rcu_head *head, rcu_callback_t func); void synchronize_sched(void); @@ -92,11 +92,6 @@ static inline void __rcu_read_unlock(void) preempt_enable(); } -static inline void synchronize_rcu(void) -{ - synchronize_sched(); -} - static inline int rcu_preempt_depth(void) { return 0; @@ -107,7 +102,6 @@ static inline int rcu_preempt_depth(void) /* Internal to kernel */ void rcu_init(void); extern int rcu_scheduler_active __read_mostly; -void rcu_sched_qs(void); void rcu_check_callbacks(int user); void rcu_report_dead(unsigned int cpu); void rcutree_migrate_callbacks(int cpu); diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index ac26c27ccde8..df2c0895c5e7 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -36,6 +36,11 @@ static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp) /* Never flag non-existent other CPUs! */ static inline bool rcu_eqs_special_set(int cpu) { return false; } +static inline void synchronize_rcu(void) +{ + synchronize_sched(); +} + static inline unsigned long get_state_synchronize_rcu(void) { return 0; @@ -94,6 +99,8 @@ static inline void kfree_call_rcu(struct rcu_head *head, call_rcu(head, func); } +void rcu_sched_qs(void); + static inline void rcu_softirq_qs(void) { rcu_sched_qs(); diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index f7a41323aa54..0c44720f0e84 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -45,14 +45,19 @@ static inline void rcu_virt_note_context_switch(int cpu) rcu_note_context_switch(false); } +void synchronize_rcu(void); static inline void synchronize_rcu_bh(void) { synchronize_rcu(); } -void synchronize_sched_expedited(void); void synchronize_rcu_expedited(void); +static inline void synchronize_sched_expedited(void) +{ + synchronize_rcu_expedited(); +} + void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func); /** diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 61c15de884b0..5f79315f094e 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -92,24 +92,29 @@ static const char *tp_##sname##_varname __used __tracepoint_string = sname##_var #define RCU_STATE_INITIALIZER(sname, sabbr, cr) \ DEFINE_RCU_TPS(sname) \ -static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, sname##_data); \ -struct rcu_state sname##_state = { \ - .level = { &sname##_state.node[0] }, \ - .rda = &sname##_data, \ +static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data); \ +struct rcu_state rcu_state = { \ + .level = { &rcu_state.node[0] }, \ + .rda = &rcu_data, \ .call = cr, \ .gp_state = RCU_GP_IDLE, \ .gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT, \ - .barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \ + .barrier_mutex = __MUTEX_INITIALIZER(rcu_state.barrier_mutex), \ .name = RCU_STATE_NAME(sname), \ .abbr = sabbr, \ - .exp_mutex = __MUTEX_INITIALIZER(sname##_state.exp_mutex), \ - .exp_wake_mutex = __MUTEX_INITIALIZER(sname##_state.exp_wake_mutex), \ - .ofl_lock = __SPIN_LOCK_UNLOCKED(sname##_state.ofl_lock), \ + .exp_mutex = __MUTEX_INITIALIZER(rcu_state.exp_mutex), \ + .exp_wake_mutex = __MUTEX_INITIALIZER(rcu_state.exp_wake_mutex), \ + .ofl_lock = __SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), \ } -RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched); +#ifdef CONFIG_PREEMPT_RCU +RCU_STATE_INITIALIZER(rcu_preempt, 'p', call_rcu); +#else +RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu); +#endif -static struct rcu_state *const rcu_state_p; +static struct rcu_state *const rcu_state_p = &rcu_state; +static struct rcu_data __percpu *const rcu_data_p = &rcu_data; LIST_HEAD(rcu_struct_flavors); /* Dump rcu_node combining tree at boot to verify correct setup. */ @@ -220,31 +225,9 @@ static int rcu_gp_in_progress(struct rcu_state *rsp) return rcu_seq_state(rcu_seq_current(&rsp->gp_seq)); } -/* - * Note a quiescent state. Because we do not need to know - * how many quiescent states passed, just if there was at least - * one since the start of the grace period, this just sets a flag. - * The caller must have disabled preemption. - */ -void rcu_sched_qs(void) -{ - RCU_LOCKDEP_WARN(preemptible(), "rcu_sched_qs() invoked with preemption enabled!!!"); - if (!__this_cpu_read(rcu_sched_data.cpu_no_qs.s)) - return; - trace_rcu_grace_period(TPS("rcu_sched"), - __this_cpu_read(rcu_sched_data.gp_seq), - TPS("cpuqs")); - __this_cpu_write(rcu_sched_data.cpu_no_qs.b.norm, false); - if (!__this_cpu_read(rcu_sched_data.cpu_no_qs.b.exp)) - return; - __this_cpu_write(rcu_sched_data.cpu_no_qs.b.exp, false); - rcu_report_exp_rdp(&rcu_sched_state, this_cpu_ptr(&rcu_sched_data)); -} - void rcu_softirq_qs(void) { - rcu_sched_qs(); - rcu_preempt_qs(); + rcu_qs(); rcu_preempt_deferred_qs(current); } @@ -418,31 +401,18 @@ static void rcu_momentary_dyntick_idle(void) rcu_preempt_deferred_qs(current); } -/* - * Note a context switch. This is a quiescent state for RCU-sched, - * and requires special handling for preemptible RCU. - * The caller must have disabled interrupts. +/** + * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle + * + * If the current CPU is idle or running at a first-level (not nested) + * interrupt from idle, return true. The caller must have at least + * disabled preemption. */ -void rcu_note_context_switch(bool preempt) +static int rcu_is_cpu_rrupt_from_idle(void) { - barrier(); /* Avoid RCU read-side critical sections leaking down. */ - trace_rcu_utilization(TPS("Start context switch")); - rcu_sched_qs(); - rcu_preempt_note_context_switch(preempt); - /* Load rcu_urgent_qs before other flags. */ - if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) - goto out; - this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); - if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) - rcu_momentary_dyntick_idle(); - this_cpu_inc(rcu_dynticks.rcu_qs_ctr); - if (!preempt) - rcu_tasks_qs(current); -out: - trace_rcu_utilization(TPS("End context switch")); - barrier(); /* Avoid RCU read-side critical sections leaking up. */ + return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 0 && + __this_cpu_read(rcu_dynticks.dynticks_nmi_nesting) <= 1; } -EXPORT_SYMBOL_GPL(rcu_note_context_switch); /* * Register a quiescent state for all RCU flavors. If there is an @@ -476,8 +446,8 @@ void rcu_all_qs(void) rcu_momentary_dyntick_idle(); local_irq_restore(flags); } - if (unlikely(raw_cpu_read(rcu_sched_data.cpu_no_qs.b.exp))) - rcu_sched_qs(); + if (unlikely(raw_cpu_read(rcu_data.cpu_no_qs.b.exp))) + rcu_qs(); this_cpu_inc(rcu_dynticks.rcu_qs_ctr); barrier(); /* Avoid RCU read-side critical sections leaking up. */ preempt_enable(); @@ -558,7 +528,7 @@ EXPORT_SYMBOL_GPL(rcu_get_gp_seq); */ unsigned long rcu_sched_get_gp_seq(void) { - return READ_ONCE(rcu_sched_state.gp_seq); + return rcu_get_gp_seq(); } EXPORT_SYMBOL_GPL(rcu_sched_get_gp_seq); @@ -590,7 +560,7 @@ EXPORT_SYMBOL_GPL(rcu_exp_batches_completed); */ unsigned long rcu_exp_batches_completed_sched(void) { - return rcu_sched_state.expedited_sequence; + return rcu_state.expedited_sequence; } EXPORT_SYMBOL_GPL(rcu_exp_batches_completed_sched); @@ -617,7 +587,7 @@ EXPORT_SYMBOL_GPL(rcu_bh_force_quiescent_state); */ void rcu_sched_force_quiescent_state(void) { - force_quiescent_state(&rcu_sched_state); + rcu_force_quiescent_state(); } EXPORT_SYMBOL_GPL(rcu_sched_force_quiescent_state); @@ -668,10 +638,8 @@ void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, switch (test_type) { case RCU_FLAVOR: case RCU_BH_FLAVOR: - rsp = rcu_state_p; - break; case RCU_SCHED_FLAVOR: - rsp = &rcu_sched_state; + rsp = rcu_state_p; break; default: break; @@ -1107,19 +1075,6 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online); #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */ -/** - * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle - * - * If the current CPU is idle or running at a first-level (not nested) - * interrupt from idle, return true. The caller must have at least - * disabled preemption. - */ -static int rcu_is_cpu_rrupt_from_idle(void) -{ - return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 0 && - __this_cpu_read(rcu_dynticks.dynticks_nmi_nesting) <= 1; -} - /* * We are reporting a quiescent state on behalf of some other CPU, so * it is our responsibility to check for and handle potential overflow @@ -2364,7 +2319,7 @@ rcu_report_unblock_qs_rnp(struct rcu_state *rsp, struct rcu_node *rnp_p; raw_lockdep_assert_held_rcu_node(rnp); - if (WARN_ON_ONCE(rcu_state_p == &rcu_sched_state) || + if (WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT)) || WARN_ON_ONCE(rsp != rcu_state_p) || WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)) || rnp->qsmask != 0) { @@ -2650,25 +2605,7 @@ void rcu_check_callbacks(int user) { trace_rcu_utilization(TPS("Start scheduler-tick")); increment_cpu_stall_ticks(); - if (user || rcu_is_cpu_rrupt_from_idle()) { - - /* - * Get here if this CPU took its interrupt from user - * mode or from the idle loop, and if this is not a - * nested interrupt. In this case, the CPU is in - * a quiescent state, so note it. - * - * No memory barrier is required here because - * rcu_sched_qs() references only CPU-local variables - * that other CPUs neither access nor modify, at least - * not while the corresponding CPU is online. - */ - - rcu_sched_qs(); - rcu_note_voluntary_context_switch(current); - - } - rcu_preempt_check_callbacks(); + rcu_flavor_check_callbacks(user); if (rcu_pending()) invoke_rcu_core(); @@ -2694,7 +2631,7 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)) mask = 0; raw_spin_lock_irqsave_rcu_node(rnp, flags); if (rnp->qsmask == 0) { - if (rcu_state_p == &rcu_sched_state || + if (!IS_ENABLED(CONFIG_PREEMPT) || rsp != rcu_state_p || rcu_preempt_blocked_readers_cgp(rnp)) { /* @@ -3028,28 +2965,56 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func, } /** - * call_rcu_sched() - Queue an RCU for invocation after sched grace period. + * call_rcu() - Queue an RCU callback for invocation after a grace period. * @head: structure to be used for queueing the RCU updates. * @func: actual callback function to be invoked after the grace period * * The callback function will be invoked some time after a full grace - * period elapses, in other words after all currently executing RCU - * read-side critical sections have completed. call_rcu_sched() assumes - * that the read-side critical sections end on enabling of preemption - * or on voluntary preemption. - * RCU read-side critical sections are delimited by: + * period elapses, in other words after all pre-existing RCU read-side + * critical sections have completed. However, the callback function + * might well execute concurrently with RCU read-side critical sections + * that started after call_rcu() was invoked. RCU read-side critical + * sections are delimited by rcu_read_lock() and rcu_read_unlock(), and + * may be nested. In addition, regions of code across which interrupts, + * preemption, or softirqs have been disabled also serve as RCU read-side + * critical sections. This includes hardware interrupt handlers, softirq + * handlers, and NMI handlers. * - * - rcu_read_lock_sched() and rcu_read_unlock_sched(), OR - * - anything that disables preemption. + * Note that all CPUs must agree that the grace period extended beyond + * all pre-existing RCU read-side critical section. On systems with more + * than one CPU, this means that when "func()" is invoked, each CPU is + * guaranteed to have executed a full memory barrier since the end of its + * last RCU read-side critical section whose beginning preceded the call + * to call_rcu(). It also means that each CPU executing an RCU read-side + * critical section that continues beyond the start of "func()" must have + * executed a memory barrier after the call_rcu() but before the beginning + * of that RCU read-side critical section. Note that these guarantees + * include CPUs that are offline, idle, or executing in user mode, as + * well as CPUs that are executing in the kernel. * - * These may be nested. + * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the + * resulting RCU callback function "func()", then both CPU A and CPU B are + * guaranteed to execute a full memory barrier during the time interval + * between the call to call_rcu() and the invocation of "func()" -- even + * if CPU A and CPU B are the same CPU (but again only if the system has + * more than one CPU). + */ +void call_rcu(struct rcu_head *head, rcu_callback_t func) +{ + __call_rcu(head, func, rcu_state_p, -1, 0); +} +EXPORT_SYMBOL_GPL(call_rcu); + +/** + * call_rcu_sched() - Queue an RCU for invocation after sched grace period. + * @head: structure to be used for queueing the RCU updates. + * @func: actual callback function to be invoked after the grace period * - * See the description of call_rcu() for more detailed information on - * memory ordering guarantees. + * This is transitional. */ void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) { - __call_rcu(head, func, &rcu_sched_state, -1, 0); + call_rcu(head, func); } EXPORT_SYMBOL_GPL(call_rcu_sched); @@ -3067,73 +3032,14 @@ void kfree_call_rcu(struct rcu_head *head, } EXPORT_SYMBOL_GPL(kfree_call_rcu); -/* - * Because a context switch is a grace period for RCU-sched, any blocking - * grace-period wait automatically implies a grace period if there - * is only one CPU online at any point time during execution of either - * synchronize_sched() or synchronize_rcu_bh(). It is OK to occasionally - * incorrectly indicate that there are multiple CPUs online when there - * was in fact only one the whole time, as this just adds some overhead: - * RCU still operates correctly. - */ -static int rcu_blocking_is_gp(void) -{ - int ret; - - might_sleep(); /* Check for RCU read-side critical section. */ - preempt_disable(); - ret = num_online_cpus() <= 1; - preempt_enable(); - return ret; -} - /** * synchronize_sched - wait until an rcu-sched grace period has elapsed. * - * Control will return to the caller some time after a full rcu-sched - * grace period has elapsed, in other words after all currently executing - * rcu-sched read-side critical sections have completed. These read-side - * critical sections are delimited by rcu_read_lock_sched() and - * rcu_read_unlock_sched(), and may be nested. Note that preempt_disable(), - * local_irq_disable(), and so on may be used in place of - * rcu_read_lock_sched(). - * - * This means that all preempt_disable code sequences, including NMI and - * non-threaded hardware-interrupt handlers, in progress on entry will - * have completed before this primitive returns. However, this does not - * guarantee that softirq handlers will have completed, since in some - * kernels, these handlers can run in process context, and can block. - * - * Note that this guarantee implies further memory-ordering guarantees. - * On systems with more than one CPU, when synchronize_sched() returns, - * each CPU is guaranteed to have executed a full memory barrier since the - * end of its last RCU-sched read-side critical section whose beginning - * preceded the call to synchronize_sched(). In addition, each CPU having - * an RCU read-side critical section that extends beyond the return from - * synchronize_sched() is guaranteed to have executed a full memory barrier - * after the beginning of synchronize_sched() and before the beginning of - * that RCU read-side critical section. Note that these guarantees include - * CPUs that are offline, idle, or executing in user mode, as well as CPUs - * that are executing in the kernel. - * - * Furthermore, if CPU A invoked synchronize_sched(), which returned - * to its caller on CPU B, then both CPU A and CPU B are guaranteed - * to have executed a full memory barrier during the execution of - * synchronize_sched() -- even if CPU A and CPU B are the same CPU (but - * again only if the system has more than one CPU). + * This is transitional. */ void synchronize_sched(void) { - RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || - lock_is_held(&rcu_lock_map) || - lock_is_held(&rcu_sched_lock_map), - "Illegal synchronize_sched() in RCU-sched read-side critical section"); - if (rcu_blocking_is_gp()) - return; - if (rcu_gp_is_expedited()) - synchronize_sched_expedited(); - else - wait_rcu_gp(call_rcu_sched); + synchronize_rcu(); } EXPORT_SYMBOL_GPL(synchronize_sched); @@ -3181,41 +3087,23 @@ EXPORT_SYMBOL_GPL(cond_synchronize_rcu); /** * get_state_synchronize_sched - Snapshot current RCU-sched state * - * Returns a cookie that is used by a later call to cond_synchronize_sched() - * to determine whether or not a full grace period has elapsed in the - * meantime. + * This is transitional, and only used by rcutorture. */ unsigned long get_state_synchronize_sched(void) { - /* - * Any prior manipulation of RCU-protected data must happen - * before the load from ->gp_seq. - */ - smp_mb(); /* ^^^ */ - return rcu_seq_snap(&rcu_sched_state.gp_seq); + return get_state_synchronize_rcu(); } EXPORT_SYMBOL_GPL(get_state_synchronize_sched); /** * cond_synchronize_sched - Conditionally wait for an RCU-sched grace period - * * @oldstate: return value from earlier call to get_state_synchronize_sched() * - * If a full RCU-sched grace period has elapsed since the earlier call to - * get_state_synchronize_sched(), just return. Otherwise, invoke - * synchronize_sched() to wait for a full grace period. - * - * Yes, this function does not take counter wrap into account. But - * counter wrap is harmless. If the counter wraps, we have waited for - * more than 2 billion grace periods (and way more on a 64-bit system!), - * so waiting for one additional grace period should be just fine. + * This is transitional and only used by rcutorture. */ void cond_synchronize_sched(unsigned long oldstate) { - if (!rcu_seq_done(&rcu_sched_state.gp_seq, oldstate)) - synchronize_sched(); - else - smp_mb(); /* Ensure GP ends before subsequent accesses. */ + cond_synchronize_rcu(oldstate); } EXPORT_SYMBOL_GPL(cond_synchronize_sched); @@ -3452,12 +3340,28 @@ void rcu_barrier_bh(void) } EXPORT_SYMBOL_GPL(rcu_barrier_bh); +/** + * rcu_barrier - Wait until all in-flight call_rcu() callbacks complete. + * + * Note that this primitive does not necessarily wait for an RCU grace period + * to complete. For example, if there are no RCU callbacks queued anywhere + * in the system, then rcu_barrier() is within its rights to return + * immediately, without waiting for anything, much less an RCU grace period. + */ +void rcu_barrier(void) +{ + _rcu_barrier(rcu_state_p); +} +EXPORT_SYMBOL_GPL(rcu_barrier); + /** * rcu_barrier_sched - Wait for in-flight call_rcu_sched() callbacks. + * + * This is transitional. */ void rcu_barrier_sched(void) { - _rcu_barrier(&rcu_sched_state); + rcu_barrier(); } EXPORT_SYMBOL_GPL(rcu_barrier_sched); @@ -3756,7 +3660,7 @@ void rcu_report_dead(unsigned int cpu) /* QS for any half-done expedited RCU-sched GP. */ preempt_disable(); - rcu_report_exp_rdp(&rcu_sched_state, this_cpu_ptr(rcu_sched_state.rda)); + rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(rcu_state.rda)); preempt_enable(); rcu_preempt_deferred_qs(current); for_each_rcu_flavor(rsp) @@ -4098,10 +4002,9 @@ void __init rcu_init(void) rcu_bootup_announce(); rcu_init_geometry(); - rcu_init_one(&rcu_sched_state); + rcu_init_one(&rcu_state); if (dump_tree) - rcu_dump_rcu_node_tree(&rcu_sched_state); - __rcu_init_preempt(); + rcu_dump_rcu_node_tree(&rcu_state); open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); /* diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index e02c882861eb..38658ca87dcb 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -225,9 +225,6 @@ struct rcu_data { /* 5) _rcu_barrier(), OOM callbacks, and expediting. */ struct rcu_head barrier_head; -#ifdef CONFIG_RCU_FAST_NO_HZ - struct rcu_head oom_head; -#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ int exp_dynticks_snap; /* Double-check need for IPI. */ /* 6) Callback offloading. */ @@ -433,8 +430,7 @@ DECLARE_PER_CPU(char, rcu_cpu_has_work); /* Forward declarations for rcutree_plugin.h */ static void rcu_bootup_announce(void); -static void rcu_preempt_qs(void); -static void rcu_preempt_note_context_switch(bool preempt); +static void rcu_qs(void); static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp); #ifdef CONFIG_HOTPLUG_CPU static bool rcu_preempt_has_tasks(struct rcu_node *rnp); @@ -444,9 +440,8 @@ static int rcu_print_task_stall(struct rcu_node *rnp); static int rcu_print_task_exp_stall(struct rcu_node *rnp); static void rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp); -static void rcu_preempt_check_callbacks(void); +static void rcu_flavor_check_callbacks(int user); void call_rcu(struct rcu_head *head, rcu_callback_t func); -static void __init __rcu_init_preempt(void); static void dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck); static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags); diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 0f8f225c1b46..5619edfd414e 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -265,7 +265,7 @@ static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp) rcu_report_exp_cpu_mult(rsp, rdp->mynode, rdp->grpmask, true); } -/* Common code for synchronize_{rcu,sched}_expedited() work-done checking. */ +/* Common code for work-done checking. */ static bool sync_exp_work_done(struct rcu_state *rsp, unsigned long s) { if (rcu_exp_gp_seq_done(rsp, s)) { @@ -337,45 +337,6 @@ fastpath: return false; } -/* Invoked on each online non-idle CPU for expedited quiescent state. */ -static void sync_sched_exp_handler(void *data) -{ - struct rcu_data *rdp; - struct rcu_node *rnp; - struct rcu_state *rsp = data; - - rdp = this_cpu_ptr(rsp->rda); - rnp = rdp->mynode; - if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) || - __this_cpu_read(rcu_sched_data.cpu_no_qs.b.exp)) - return; - if (rcu_is_cpu_rrupt_from_idle()) { - rcu_report_exp_rdp(&rcu_sched_state, - this_cpu_ptr(&rcu_sched_data)); - return; - } - __this_cpu_write(rcu_sched_data.cpu_no_qs.b.exp, true); - /* Store .exp before .rcu_urgent_qs. */ - smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true); - resched_cpu(smp_processor_id()); -} - -/* Send IPI for expedited cleanup if needed at end of CPU-hotplug operation. */ -static void sync_sched_exp_online_cleanup(int cpu) -{ - struct rcu_data *rdp; - int ret; - struct rcu_node *rnp; - struct rcu_state *rsp = &rcu_sched_state; - - rdp = per_cpu_ptr(rsp->rda, cpu); - rnp = rdp->mynode; - if (!(READ_ONCE(rnp->expmask) & rdp->grpmask)) - return; - ret = smp_call_function_single(cpu, sync_sched_exp_handler, rsp, 0); - WARN_ON_ONCE(ret); -} - /* * Select the CPUs within the specified rcu_node that the upcoming * expedited grace period needs to wait for. @@ -691,39 +652,6 @@ static void _synchronize_rcu_expedited(struct rcu_state *rsp, mutex_unlock(&rsp->exp_mutex); } -/** - * synchronize_sched_expedited - Brute-force RCU-sched grace period - * - * Wait for an RCU-sched grace period to elapse, but use a "big hammer" - * approach to force the grace period to end quickly. This consumes - * significant time on all CPUs and is unfriendly to real-time workloads, - * so is thus not recommended for any sort of common-case code. In fact, - * if you are using synchronize_sched_expedited() in a loop, please - * restructure your code to batch your updates, and then use a single - * synchronize_sched() instead. - * - * This implementation can be thought of as an application of sequence - * locking to expedited grace periods, but using the sequence counter to - * determine when someone else has already done the work instead of for - * retrying readers. - */ -void synchronize_sched_expedited(void) -{ - struct rcu_state *rsp = &rcu_sched_state; - - RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || - lock_is_held(&rcu_lock_map) || - lock_is_held(&rcu_sched_lock_map), - "Illegal synchronize_sched_expedited() in RCU read-side critical section"); - - /* If only one CPU, this is automatically a grace period. */ - if (rcu_blocking_is_gp()) - return; - - _synchronize_rcu_expedited(rsp, sync_sched_exp_handler); -} -EXPORT_SYMBOL_GPL(synchronize_sched_expedited); - #ifdef CONFIG_PREEMPT_RCU /* @@ -801,6 +729,11 @@ static void sync_rcu_exp_handler(void *info) resched_cpu(rdp->cpu); } +/* PREEMPT=y, so no RCU-sched to clean up after. */ +static void sync_sched_exp_online_cleanup(int cpu) +{ +} + /** * synchronize_rcu_expedited - Brute-force RCU grace period * @@ -818,6 +751,8 @@ static void sync_rcu_exp_handler(void *info) * you are using synchronize_rcu_expedited() in a loop, please restructure * your code to batch your updates, and then Use a single synchronize_rcu() * instead. + * + * This has the same semantics as (but is more brutal than) synchronize_rcu(). */ void synchronize_rcu_expedited(void) { @@ -836,13 +771,79 @@ EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); #else /* #ifdef CONFIG_PREEMPT_RCU */ +/* Invoked on each online non-idle CPU for expedited quiescent state. */ +static void sync_sched_exp_handler(void *data) +{ + struct rcu_data *rdp; + struct rcu_node *rnp; + struct rcu_state *rsp = data; + + rdp = this_cpu_ptr(rsp->rda); + rnp = rdp->mynode; + if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) || + __this_cpu_read(rcu_data.cpu_no_qs.b.exp)) + return; + if (rcu_is_cpu_rrupt_from_idle()) { + rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(&rcu_data)); + return; + } + __this_cpu_write(rcu_data.cpu_no_qs.b.exp, true); + /* Store .exp before .rcu_urgent_qs. */ + smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true); + resched_cpu(smp_processor_id()); +} + +/* Send IPI for expedited cleanup if needed at end of CPU-hotplug operation. */ +static void sync_sched_exp_online_cleanup(int cpu) +{ + struct rcu_data *rdp; + int ret; + struct rcu_node *rnp; + struct rcu_state *rsp = &rcu_state; + + rdp = per_cpu_ptr(rsp->rda, cpu); + rnp = rdp->mynode; + if (!(READ_ONCE(rnp->expmask) & rdp->grpmask)) + return; + ret = smp_call_function_single(cpu, sync_sched_exp_handler, rsp, 0); + WARN_ON_ONCE(ret); +} + /* - * Wait for an rcu-preempt grace period, but make it happen quickly. - * But because preemptible RCU does not exist, map to rcu-sched. + * Because a context switch is a grace period for RCU-sched, any blocking + * grace-period wait automatically implies a grace period if there + * is only one CPU online at any point time during execution of either + * synchronize_sched() or synchronize_rcu_bh(). It is OK to occasionally + * incorrectly indicate that there are multiple CPUs online when there + * was in fact only one the whole time, as this just adds some overhead: + * RCU still operates correctly. */ +static int rcu_blocking_is_gp(void) +{ + int ret; + + might_sleep(); /* Check for RCU read-side critical section. */ + preempt_disable(); + ret = num_online_cpus() <= 1; + preempt_enable(); + return ret; +} + +/* PREEMPT=n implementation of synchronize_rcu_expedited(). */ void synchronize_rcu_expedited(void) { - synchronize_sched_expedited(); + struct rcu_state *rsp = &rcu_state; + + RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || + lock_is_held(&rcu_lock_map) || + lock_is_held(&rcu_sched_lock_map), + "Illegal synchronize_sched_expedited() in RCU read-side critical section"); + + /* If only one CPU, this is automatically a grace period. */ + if (rcu_blocking_is_gp()) + return; + + _synchronize_rcu_expedited(rsp, sync_sched_exp_handler); } EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 9f0d054e6c20..2c81f8dd63b4 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -123,10 +123,6 @@ static void __init rcu_bootup_announce_oddness(void) #ifdef CONFIG_PREEMPT_RCU -RCU_STATE_INITIALIZER(rcu_preempt, 'p', call_rcu); -static struct rcu_state *const rcu_state_p = &rcu_preempt_state; -static struct rcu_data __percpu *const rcu_data_p = &rcu_preempt_data; - static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, bool wake); static void rcu_read_unlock_special(struct task_struct *t); @@ -303,15 +299,15 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp) * * Callers to this function must disable preemption. */ -static void rcu_preempt_qs(void) +static void rcu_qs(void) { - RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_qs() invoked with preemption enabled!!!\n"); + RCU_LOCKDEP_WARN(preemptible(), "rcu_qs() invoked with preemption enabled!!!\n"); if (__this_cpu_read(rcu_data_p->cpu_no_qs.s)) { trace_rcu_grace_period(TPS("rcu_preempt"), __this_cpu_read(rcu_data_p->gp_seq), TPS("cpuqs")); __this_cpu_write(rcu_data_p->cpu_no_qs.b.norm, false); - barrier(); /* Coordinate with rcu_preempt_check_callbacks(). */ + barrier(); /* Coordinate with rcu_flavor_check_callbacks(). */ current->rcu_read_unlock_special.b.need_qs = false; } } @@ -329,12 +325,14 @@ static void rcu_preempt_qs(void) * * Caller must disable interrupts. */ -static void rcu_preempt_note_context_switch(bool preempt) +void rcu_note_context_switch(bool preempt) { struct task_struct *t = current; struct rcu_data *rdp = this_cpu_ptr(rcu_state_p->rda); struct rcu_node *rnp; + barrier(); /* Avoid RCU read-side critical sections leaking down. */ + trace_rcu_utilization(TPS("Start context switch")); lockdep_assert_irqs_disabled(); WARN_ON_ONCE(!preempt && t->rcu_read_lock_nesting > 0); if (t->rcu_read_lock_nesting > 0 && @@ -381,10 +379,13 @@ static void rcu_preempt_note_context_switch(bool preempt) * grace period, then the fact that the task has been enqueued * means that we continue to block the current grace period. */ - rcu_preempt_qs(); + rcu_qs(); if (rdp->deferred_qs) rcu_report_exp_rdp(rcu_state_p, rdp); + trace_rcu_utilization(TPS("End context switch")); + barrier(); /* Avoid RCU read-side critical sections leaking up. */ } +EXPORT_SYMBOL_GPL(rcu_note_context_switch); /* * Check for preempted RCU readers blocking the current grace period @@ -493,7 +494,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) return; } if (special.b.need_qs) { - rcu_preempt_qs(); + rcu_qs(); t->rcu_read_unlock_special.b.need_qs = false; if (!t->rcu_read_unlock_special.s && !rdp->deferred_qs) { local_irq_restore(flags); @@ -596,7 +597,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) */ static bool rcu_preempt_need_deferred_qs(struct task_struct *t) { - return (this_cpu_ptr(&rcu_preempt_data)->deferred_qs || + return (this_cpu_ptr(&rcu_data)->deferred_qs || READ_ONCE(t->rcu_read_unlock_special.s)) && t->rcu_read_lock_nesting <= 0; } @@ -781,11 +782,14 @@ rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp) * * Caller must disable hard irqs. */ -static void rcu_preempt_check_callbacks(void) +static void rcu_flavor_check_callbacks(int user) { - struct rcu_state *rsp = &rcu_preempt_state; + struct rcu_state *rsp = &rcu_state; struct task_struct *t = current; + if (user || rcu_is_cpu_rrupt_from_idle()) { + rcu_note_voluntary_context_switch(current); + } if (t->rcu_read_lock_nesting > 0 || (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) { /* No QS, force context switch if deferred. */ @@ -795,7 +799,7 @@ static void rcu_preempt_check_callbacks(void) rcu_preempt_deferred_qs(t); /* Report deferred QS. */ return; } else if (!t->rcu_read_lock_nesting) { - rcu_preempt_qs(); /* Report immediate QS. */ + rcu_qs(); /* Report immediate QS. */ return; } @@ -808,44 +812,6 @@ static void rcu_preempt_check_callbacks(void) t->rcu_read_unlock_special.b.need_qs = true; } -/** - * call_rcu() - Queue an RCU callback for invocation after a grace period. - * @head: structure to be used for queueing the RCU updates. - * @func: actual callback function to be invoked after the grace period - * - * The callback function will be invoked some time after a full grace - * period elapses, in other words after all pre-existing RCU read-side - * critical sections have completed. However, the callback function - * might well execute concurrently with RCU read-side critical sections - * that started after call_rcu() was invoked. RCU read-side critical - * sections are delimited by rcu_read_lock() and rcu_read_unlock(), - * and may be nested. - * - * Note that all CPUs must agree that the grace period extended beyond - * all pre-existing RCU read-side critical section. On systems with more - * than one CPU, this means that when "func()" is invoked, each CPU is - * guaranteed to have executed a full memory barrier since the end of its - * last RCU read-side critical section whose beginning preceded the call - * to call_rcu(). It also means that each CPU executing an RCU read-side - * critical section that continues beyond the start of "func()" must have - * executed a memory barrier after the call_rcu() but before the beginning - * of that RCU read-side critical section. Note that these guarantees - * include CPUs that are offline, idle, or executing in user mode, as - * well as CPUs that are executing in the kernel. - * - * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the - * resulting RCU callback function "func()", then both CPU A and CPU B are - * guaranteed to execute a full memory barrier during the time interval - * between the call to call_rcu() and the invocation of "func()" -- even - * if CPU A and CPU B are the same CPU (but again only if the system has - * more than one CPU). - */ -void call_rcu(struct rcu_head *head, rcu_callback_t func) -{ - __call_rcu(head, func, rcu_state_p, -1, 0); -} -EXPORT_SYMBOL_GPL(call_rcu); - /** * synchronize_rcu - wait until a grace period has elapsed. * @@ -856,14 +822,28 @@ EXPORT_SYMBOL_GPL(call_rcu); * concurrently with new RCU read-side critical sections that began while * synchronize_rcu() was waiting. RCU read-side critical sections are * delimited by rcu_read_lock() and rcu_read_unlock(), and may be nested. + * In addition, regions of code across which interrupts, preemption, or + * softirqs have been disabled also serve as RCU read-side critical + * sections. This includes hardware interrupt handlers, softirq handlers, + * and NMI handlers. * - * See the description of synchronize_sched() for more detailed - * information on memory-ordering guarantees. However, please note - * that -only- the memory-ordering guarantees apply. For example, - * synchronize_rcu() is -not- guaranteed to wait on things like code - * protected by preempt_disable(), instead, synchronize_rcu() is -only- - * guaranteed to wait on RCU read-side critical sections, that is, sections - * of code protected by rcu_read_lock(). + * Note that this guarantee implies further memory-ordering guarantees. + * On systems with more than one CPU, when synchronize_rcu() returns, + * each CPU is guaranteed to have executed a full memory barrier since the + * end of its last RCU-sched read-side critical section whose beginning + * preceded the call to synchronize_rcu(). In addition, each CPU having + * an RCU read-side critical section that extends beyond the return from + * synchronize_rcu() is guaranteed to have executed a full memory barrier + * after the beginning of synchronize_rcu() and before the beginning of + * that RCU read-side critical section. Note that these guarantees include + * CPUs that are offline, idle, or executing in user mode, as well as CPUs + * that are executing in the kernel. + * + * Furthermore, if CPU A invoked synchronize_rcu(), which returned + * to its caller on CPU B, then both CPU A and CPU B are guaranteed + * to have executed a full memory barrier during the execution of + * synchronize_rcu() -- even if CPU A and CPU B are the same CPU (but + * again only if the system has more than one CPU). */ void synchronize_rcu(void) { @@ -880,28 +860,6 @@ void synchronize_rcu(void) } EXPORT_SYMBOL_GPL(synchronize_rcu); -/** - * rcu_barrier - Wait until all in-flight call_rcu() callbacks complete. - * - * Note that this primitive does not necessarily wait for an RCU grace period - * to complete. For example, if there are no RCU callbacks queued anywhere - * in the system, then rcu_barrier() is within its rights to return - * immediately, without waiting for anything, much less an RCU grace period. - */ -void rcu_barrier(void) -{ - _rcu_barrier(rcu_state_p); -} -EXPORT_SYMBOL_GPL(rcu_barrier); - -/* - * Initialize preemptible RCU's state structures. - */ -static void __init __rcu_init_preempt(void) -{ - rcu_init_one(rcu_state_p); -} - /* * Check for a task exiting while in a preemptible-RCU read-side * critical section, clean up if so. No need to issue warnings, @@ -964,8 +922,6 @@ dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck) #else /* #ifdef CONFIG_PREEMPT_RCU */ -static struct rcu_state *const rcu_state_p = &rcu_sched_state; - /* * Tell them what RCU they are running. */ @@ -975,18 +931,48 @@ static void __init rcu_bootup_announce(void) rcu_bootup_announce_oddness(); } -/* Because preemptible RCU does not exist, we can ignore its QSes. */ -static void rcu_preempt_qs(void) +/* + * Note a quiescent state for PREEMPT=n. Because we do not need to know + * how many quiescent states passed, just if there was at least one since + * the start of the grace period, this just sets a flag. The caller must + * have disabled preemption. + */ +static void rcu_qs(void) { + RCU_LOCKDEP_WARN(preemptible(), "rcu_qs() invoked with preemption enabled!!!"); + if (!__this_cpu_read(rcu_data.cpu_no_qs.s)) + return; + trace_rcu_grace_period(TPS("rcu_sched"), + __this_cpu_read(rcu_data.gp_seq), TPS("cpuqs")); + __this_cpu_write(rcu_data.cpu_no_qs.b.norm, false); + if (!__this_cpu_read(rcu_data.cpu_no_qs.b.exp)) + return; + __this_cpu_write(rcu_data.cpu_no_qs.b.exp, false); + rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(&rcu_data)); } /* - * Because preemptible RCU does not exist, we never have to check for - * CPUs being in quiescent states. + * Note a PREEMPT=n context switch. The caller must have disabled interrupts. */ -static void rcu_preempt_note_context_switch(bool preempt) +void rcu_note_context_switch(bool preempt) { + barrier(); /* Avoid RCU read-side critical sections leaking down. */ + trace_rcu_utilization(TPS("Start context switch")); + rcu_qs(); + /* Load rcu_urgent_qs before other flags. */ + if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) + goto out; + this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); + if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) + rcu_momentary_dyntick_idle(); + this_cpu_inc(rcu_dynticks.rcu_qs_ctr); + if (!preempt) + rcu_tasks_qs(current); +out: + trace_rcu_utilization(TPS("End context switch")); + barrier(); /* Avoid RCU read-side critical sections leaking up. */ } +EXPORT_SYMBOL_GPL(rcu_note_context_switch); /* * Because preemptible RCU does not exist, there are never any preempted @@ -1054,29 +1040,48 @@ rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp) } /* - * Because preemptible RCU does not exist, it never has any callbacks - * to check. + * Check to see if this CPU is in a non-context-switch quiescent state + * (user mode or idle loop for rcu, non-softirq execution for rcu_bh). + * Also schedule RCU core processing. + * + * This function must be called from hardirq context. It is normally + * invoked from the scheduling-clock interrupt. */ -static void rcu_preempt_check_callbacks(void) +static void rcu_flavor_check_callbacks(int user) { + if (user || rcu_is_cpu_rrupt_from_idle()) { + + /* + * Get here if this CPU took its interrupt from user + * mode or from the idle loop, and if this is not a + * nested interrupt. In this case, the CPU is in + * a quiescent state, so note it. + * + * No memory barrier is required here because rcu_qs() + * references only CPU-local variables that other CPUs + * neither access nor modify, at least not while the + * corresponding CPU is online. + */ + + rcu_qs(); + } } -/* - * Because preemptible RCU does not exist, rcu_barrier() is just - * another name for rcu_barrier_sched(). - */ -void rcu_barrier(void) -{ - rcu_barrier_sched(); -} -EXPORT_SYMBOL_GPL(rcu_barrier); - -/* - * Because preemptible RCU does not exist, it need not be initialized. - */ -static void __init __rcu_init_preempt(void) +/* PREEMPT=n implementation of synchronize_rcu(). */ +void synchronize_rcu(void) { + RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || + lock_is_held(&rcu_lock_map) || + lock_is_held(&rcu_sched_lock_map), + "Illegal synchronize_rcu() in RCU-sched read-side critical section"); + if (rcu_blocking_is_gp()) + return; + if (rcu_gp_is_expedited()) + synchronize_rcu_expedited(); + else + wait_rcu_gp(call_rcu); } +EXPORT_SYMBOL_GPL(synchronize_rcu); /* * Because preemptible RCU does not exist, tasks cannot possibly exit @@ -1319,8 +1324,7 @@ static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp, static void rcu_kthread_do_work(void) { - rcu_do_batch(&rcu_sched_state, this_cpu_ptr(&rcu_sched_data)); - rcu_do_batch(&rcu_preempt_state, this_cpu_ptr(&rcu_preempt_data)); + rcu_do_batch(&rcu_state, this_cpu_ptr(&rcu_data)); } static void rcu_cpu_kthread_setup(unsigned int cpu) @@ -1727,87 +1731,6 @@ static void rcu_idle_count_callbacks_posted(void) __this_cpu_add(rcu_dynticks.nonlazy_posted, 1); } -/* - * Data for flushing lazy RCU callbacks at OOM time. - */ -static atomic_t oom_callback_count; -static DECLARE_WAIT_QUEUE_HEAD(oom_callback_wq); - -/* - * RCU OOM callback -- decrement the outstanding count and deliver the - * wake-up if we are the last one. - */ -static void rcu_oom_callback(struct rcu_head *rhp) -{ - if (atomic_dec_and_test(&oom_callback_count)) - wake_up(&oom_callback_wq); -} - -/* - * Post an rcu_oom_notify callback on the current CPU if it has at - * least one lazy callback. This will unnecessarily post callbacks - * to CPUs that already have a non-lazy callback at the end of their - * callback list, but this is an infrequent operation, so accept some - * extra overhead to keep things simple. - */ -static void rcu_oom_notify_cpu(void *unused) -{ - struct rcu_state *rsp; - struct rcu_data *rdp; - - for_each_rcu_flavor(rsp) { - rdp = raw_cpu_ptr(rsp->rda); - if (rcu_segcblist_n_lazy_cbs(&rdp->cblist)) { - atomic_inc(&oom_callback_count); - rsp->call(&rdp->oom_head, rcu_oom_callback); - } - } -} - -/* - * If low on memory, ensure that each CPU has a non-lazy callback. - * This will wake up CPUs that have only lazy callbacks, in turn - * ensuring that they free up the corresponding memory in a timely manner. - * Because an uncertain amount of memory will be freed in some uncertain - * timeframe, we do not claim to have freed anything. - */ -static int rcu_oom_notify(struct notifier_block *self, - unsigned long notused, void *nfreed) -{ - int cpu; - - /* Wait for callbacks from earlier instance to complete. */ - wait_event(oom_callback_wq, atomic_read(&oom_callback_count) == 0); - smp_mb(); /* Ensure callback reuse happens after callback invocation. */ - - /* - * Prevent premature wakeup: ensure that all increments happen - * before there is a chance of the counter reaching zero. - */ - atomic_set(&oom_callback_count, 1); - - for_each_online_cpu(cpu) { - smp_call_function_single(cpu, rcu_oom_notify_cpu, NULL, 1); - cond_resched_tasks_rcu_qs(); - } - - /* Unconditionally decrement: no need to wake ourselves up. */ - atomic_dec(&oom_callback_count); - - return NOTIFY_OK; -} - -static struct notifier_block rcu_oom_nb = { - .notifier_call = rcu_oom_notify -}; - -static int __init rcu_register_oom_notifier(void) -{ - register_oom_notifier(&rcu_oom_nb); - return 0; -} -early_initcall(rcu_register_oom_notifier); - #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ #ifdef CONFIG_RCU_FAST_NO_HZ From 709fdce7545c978e69f52eb19082ea3af44332f5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 10:44:44 -0700 Subject: [PATCH 036/140] rcu: Express Tiny RCU updates in terms of RCU rather than RCU-sched This commit renames Tiny RCU functions so that the lowest level of functionality is RCU (e.g., synchronize_rcu()) rather than RCU-sched (e.g., synchronize_sched()). This provides greater naming compatibility with Tree RCU, which will in turn permit more LoC removal once the RCU-sched and RCU-bh update-side API is removed. Signed-off-by: Paul E. McKenney [ paulmck: Fix Tiny call_rcu()'s EXPORT_SYMBOL() in response to a bug report from kbuild test robot. ] --- include/linux/rcupdate.h | 12 +++++----- include/linux/rcutiny.h | 34 +++++++++++++++------------- include/linux/rcutree.h | 1 - kernel/rcu/tiny.c | 48 ++++++++++++++++++++-------------------- 4 files changed, 48 insertions(+), 47 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 94474bb6b5c4..1207c6c9bd8b 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -49,15 +49,14 @@ /* Exported common interfaces */ -#ifdef CONFIG_TINY_RCU -#define call_rcu call_rcu_sched -#else -void call_rcu(struct rcu_head *head, rcu_callback_t func); +#ifndef CONFIG_TINY_RCU +void synchronize_sched(void); +void call_rcu_sched(struct rcu_head *head, rcu_callback_t func); #endif -void call_rcu_sched(struct rcu_head *head, rcu_callback_t func); -void synchronize_sched(void); +void call_rcu(struct rcu_head *head, rcu_callback_t func); void rcu_barrier_tasks(void); +void synchronize_rcu(void); static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) { @@ -68,7 +67,6 @@ static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) void __rcu_read_lock(void); void __rcu_read_unlock(void); -void synchronize_rcu(void); /* * Defined as a macro as it is a very low level header included from diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index df2c0895c5e7..e66fb8bc2127 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -36,9 +36,9 @@ static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp) /* Never flag non-existent other CPUs! */ static inline bool rcu_eqs_special_set(int cpu) { return false; } -static inline void synchronize_rcu(void) +static inline void synchronize_sched(void) { - synchronize_sched(); + synchronize_rcu(); } static inline unsigned long get_state_synchronize_rcu(void) @@ -61,16 +61,11 @@ static inline void cond_synchronize_sched(unsigned long oldstate) might_sleep(); } -static inline void synchronize_rcu_expedited(void) -{ - synchronize_sched(); /* Only one CPU, so pretty fast anyway!!! */ -} +extern void rcu_barrier(void); -extern void rcu_barrier_sched(void); - -static inline void rcu_barrier(void) +static inline void rcu_barrier_sched(void) { - rcu_barrier_sched(); /* Only one CPU, so only one list of callbacks! */ + rcu_barrier(); /* Only one CPU, so only one list of callbacks! */ } static inline void rcu_barrier_bh(void) @@ -88,27 +83,36 @@ static inline void synchronize_rcu_bh_expedited(void) synchronize_sched(); } +static inline void synchronize_rcu_expedited(void) +{ + synchronize_sched(); +} + static inline void synchronize_sched_expedited(void) { synchronize_sched(); } -static inline void kfree_call_rcu(struct rcu_head *head, - rcu_callback_t func) +static inline void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) { call_rcu(head, func); } -void rcu_sched_qs(void); +static inline void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func) +{ + call_rcu(head, func); +} + +void rcu_qs(void); static inline void rcu_softirq_qs(void) { - rcu_sched_qs(); + rcu_qs(); } #define rcu_note_context_switch(preempt) \ do { \ - rcu_sched_qs(); \ + rcu_qs(); \ rcu_tasks_qs(current); \ } while (0) diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 0c44720f0e84..6d30a0809300 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -45,7 +45,6 @@ static inline void rcu_virt_note_context_switch(int cpu) rcu_note_context_switch(false); } -void synchronize_rcu(void); static inline void synchronize_rcu_bh(void) { synchronize_rcu(); diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index cadcf63c4889..30826fb6e438 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -46,25 +46,25 @@ struct rcu_ctrlblk { }; /* Definition for rcupdate control block. */ -static struct rcu_ctrlblk rcu_sched_ctrlblk = { - .donetail = &rcu_sched_ctrlblk.rcucblist, - .curtail = &rcu_sched_ctrlblk.rcucblist, +static struct rcu_ctrlblk rcu_ctrlblk = { + .donetail = &rcu_ctrlblk.rcucblist, + .curtail = &rcu_ctrlblk.rcucblist, }; -void rcu_barrier_sched(void) +void rcu_barrier(void) { - wait_rcu_gp(call_rcu_sched); + wait_rcu_gp(call_rcu); } -EXPORT_SYMBOL(rcu_barrier_sched); +EXPORT_SYMBOL(rcu_barrier); /* Record an rcu quiescent state. */ -void rcu_sched_qs(void) +void rcu_qs(void) { unsigned long flags; local_irq_save(flags); - if (rcu_sched_ctrlblk.donetail != rcu_sched_ctrlblk.curtail) { - rcu_sched_ctrlblk.donetail = rcu_sched_ctrlblk.curtail; + if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) { + rcu_ctrlblk.donetail = rcu_ctrlblk.curtail; raise_softirq(RCU_SOFTIRQ); } local_irq_restore(flags); @@ -79,7 +79,7 @@ void rcu_sched_qs(void) void rcu_check_callbacks(int user) { if (user) - rcu_sched_qs(); + rcu_qs(); } /* Invoke the RCU callbacks whose grace period has elapsed. */ @@ -90,17 +90,17 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused /* Move the ready-to-invoke callbacks to a local list. */ local_irq_save(flags); - if (rcu_sched_ctrlblk.donetail == &rcu_sched_ctrlblk.rcucblist) { + if (rcu_ctrlblk.donetail == &rcu_ctrlblk.rcucblist) { /* No callbacks ready, so just leave. */ local_irq_restore(flags); return; } - list = rcu_sched_ctrlblk.rcucblist; - rcu_sched_ctrlblk.rcucblist = *rcu_sched_ctrlblk.donetail; - *rcu_sched_ctrlblk.donetail = NULL; - if (rcu_sched_ctrlblk.curtail == rcu_sched_ctrlblk.donetail) - rcu_sched_ctrlblk.curtail = &rcu_sched_ctrlblk.rcucblist; - rcu_sched_ctrlblk.donetail = &rcu_sched_ctrlblk.rcucblist; + list = rcu_ctrlblk.rcucblist; + rcu_ctrlblk.rcucblist = *rcu_ctrlblk.donetail; + *rcu_ctrlblk.donetail = NULL; + if (rcu_ctrlblk.curtail == rcu_ctrlblk.donetail) + rcu_ctrlblk.curtail = &rcu_ctrlblk.rcucblist; + rcu_ctrlblk.donetail = &rcu_ctrlblk.rcucblist; local_irq_restore(flags); /* Invoke the callbacks on the local list. */ @@ -125,21 +125,21 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused * * Cool, huh? (Due to Josh Triplett.) */ -void synchronize_sched(void) +void synchronize_rcu(void) { RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || lock_is_held(&rcu_lock_map) || lock_is_held(&rcu_sched_lock_map), "Illegal synchronize_sched() in RCU read-side critical section"); } -EXPORT_SYMBOL_GPL(synchronize_sched); +EXPORT_SYMBOL_GPL(synchronize_rcu); /* * Post an RCU callback to be invoked after the end of an RCU-sched grace * period. But since we have but one CPU, that would be after any * quiescent state. */ -void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) +void call_rcu(struct rcu_head *head, rcu_callback_t func) { unsigned long flags; @@ -148,16 +148,16 @@ void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) head->next = NULL; local_irq_save(flags); - *rcu_sched_ctrlblk.curtail = head; - rcu_sched_ctrlblk.curtail = &head->next; + *rcu_ctrlblk.curtail = head; + rcu_ctrlblk.curtail = &head->next; local_irq_restore(flags); if (unlikely(is_idle_task(current))) { - /* force scheduling for rcu_sched_qs() */ + /* force scheduling for rcu_qs() */ resched_cpu(0); } } -EXPORT_SYMBOL_GPL(call_rcu_sched); +EXPORT_SYMBOL_GPL(call_rcu); void __init rcu_init(void) { From 358be2d3685cb0cca49c914e89824467ee0b589c Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 14:15:31 -0700 Subject: [PATCH 037/140] rcu: Remove RCU_STATE_INITIALIZER() Now that a given build of the Linux kernel has only one set of rcu_state, rcu_node, and rcu_data structures, there is no point in creating a macro to declare and compile-time initialize them. This commit therefore just does normal declaration and compile-time initialization of these structures. While in the area, this commit also removes #ifndefs of the no-longer-ever-defined preprocessor macro RCU_TREE_NONCORE. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 54 ++++++++++++----------------------------------- kernel/rcu/tree.h | 29 +++++++++++++++++++------ 2 files changed, 37 insertions(+), 46 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 5f79315f094e..1d36cbcce1b4 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -72,46 +72,20 @@ /* Data structures. */ -/* - * In order to export the rcu_state name to the tracing tools, it - * needs to be added in the __tracepoint_string section. - * This requires defining a separate variable tp__varname - * that points to the string being used, and this will allow - * the tracing userspace tools to be able to decipher the string - * address to the matching string. - */ -#ifdef CONFIG_TRACING -# define DEFINE_RCU_TPS(sname) \ -static char sname##_varname[] = #sname; \ -static const char *tp_##sname##_varname __used __tracepoint_string = sname##_varname; -# define RCU_STATE_NAME(sname) sname##_varname -#else -# define DEFINE_RCU_TPS(sname) -# define RCU_STATE_NAME(sname) __stringify(sname) -#endif - -#define RCU_STATE_INITIALIZER(sname, sabbr, cr) \ -DEFINE_RCU_TPS(sname) \ -static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data); \ -struct rcu_state rcu_state = { \ - .level = { &rcu_state.node[0] }, \ - .rda = &rcu_data, \ - .call = cr, \ - .gp_state = RCU_GP_IDLE, \ - .gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT, \ - .barrier_mutex = __MUTEX_INITIALIZER(rcu_state.barrier_mutex), \ - .name = RCU_STATE_NAME(sname), \ - .abbr = sabbr, \ - .exp_mutex = __MUTEX_INITIALIZER(rcu_state.exp_mutex), \ - .exp_wake_mutex = __MUTEX_INITIALIZER(rcu_state.exp_wake_mutex), \ - .ofl_lock = __SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), \ -} - -#ifdef CONFIG_PREEMPT_RCU -RCU_STATE_INITIALIZER(rcu_preempt, 'p', call_rcu); -#else -RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu); -#endif +static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data); +struct rcu_state rcu_state = { + .level = { &rcu_state.node[0] }, + .rda = &rcu_data, + .call = call_rcu, + .gp_state = RCU_GP_IDLE, + .gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT, + .barrier_mutex = __MUTEX_INITIALIZER(rcu_state.barrier_mutex), + .name = RCU_NAME, + .abbr = RCU_ABBR, + .exp_mutex = __MUTEX_INITIALIZER(rcu_state.exp_mutex), + .exp_wake_mutex = __MUTEX_INITIALIZER(rcu_state.exp_wake_mutex), + .ofl_lock = __SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), +}; static struct rcu_state *const rcu_state_p = &rcu_state; static struct rcu_data __percpu *const rcu_data_p = &rcu_data; diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 38658ca87dcb..3f36562d3118 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -386,7 +386,6 @@ struct rcu_state { #define RCU_GP_CLEANUP 7 /* Grace-period cleanup started. */ #define RCU_GP_CLEANED 8 /* Grace-period cleanup complete. */ -#ifndef RCU_TREE_NONCORE static const char * const gp_state_names[] = { "RCU_GP_IDLE", "RCU_GP_WAIT_GPS", @@ -398,7 +397,29 @@ static const char * const gp_state_names[] = { "RCU_GP_CLEANUP", "RCU_GP_CLEANED", }; -#endif /* #ifndef RCU_TREE_NONCORE */ + +/* + * In order to export the rcu_state name to the tracing tools, it + * needs to be added in the __tracepoint_string section. + * This requires defining a separate variable tp__varname + * that points to the string being used, and this will allow + * the tracing userspace tools to be able to decipher the string + * address to the matching string. + */ +#ifdef CONFIG_PREEMPT_RCU +#define RCU_ABBR 'p' +#define RCU_NAME_RAW "rcu_preempt" +#else /* #ifdef CONFIG_PREEMPT_RCU */ +#define RCU_ABBR 's' +#define RCU_NAME_RAW "rcu_sched" +#endif /* #else #ifdef CONFIG_PREEMPT_RCU */ +#ifndef CONFIG_TRACING +#define RCU_NAME RCU_NAME_RAW +#else /* #ifdef CONFIG_TRACING */ +static char rcu_name[] = RCU_NAME_RAW; +static const char *tp_rcu_varname __used __tracepoint_string = rcu_name; +#define RCU_NAME rcu_name +#endif /* #else #ifdef CONFIG_TRACING */ extern struct list_head rcu_struct_flavors; @@ -426,8 +447,6 @@ DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_loops); DECLARE_PER_CPU(char, rcu_cpu_has_work); #endif /* #ifdef CONFIG_RCU_BOOST */ -#ifndef RCU_TREE_NONCORE - /* Forward declarations for rcutree_plugin.h */ static void rcu_bootup_announce(void); static void rcu_qs(void); @@ -495,5 +514,3 @@ void srcu_offline_cpu(unsigned int cpu); void srcu_online_cpu(unsigned int cpu) { } void srcu_offline_cpu(unsigned int cpu) { } #endif /* #else #ifdef CONFIG_SRCU */ - -#endif /* #ifndef RCU_TREE_NONCORE */ From ec5dd444b678b1305d9af34ebb4cca17e0ef88e6 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 15:02:28 -0700 Subject: [PATCH 038/140] rcu: Eliminate rcu_state structure's ->call field The rcu_state structure's ->call field references the corresponding RCU flavor's call_rcu() function. However, now that there is only ever one rcu_state structure in a given build of the Linux kernel, and that flavor uses plain old call_rcu(), there is not a lot of point in continuing to have the ->call field. This commit therefore removes it. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 1 - kernel/rcu/tree.h | 1 - kernel/rcu/tree_exp.h | 2 +- 3 files changed, 1 insertion(+), 3 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 1d36cbcce1b4..ea0dfd13fd27 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -76,7 +76,6 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data); struct rcu_state rcu_state = { .level = { &rcu_state.node[0] }, .rda = &rcu_data, - .call = call_rcu, .gp_state = RCU_GP_IDLE, .gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT, .barrier_mutex = __MUTEX_INITIALIZER(rcu_state.barrier_mutex), diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 3f36562d3118..c50060567146 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -313,7 +313,6 @@ struct rcu_state { /* Hierarchy levels (+1 to */ /* shut bogus gcc warning) */ struct rcu_data __percpu *rda; /* pointer of percu rcu_data. */ - call_rcu_func_t call; /* call_rcu() flavor. */ int ncpus; /* # CPUs seen so far. */ /* The following fields are guarded by the root rcu_node's lock. */ diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 5619edfd414e..224f05f0c0c9 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -619,7 +619,7 @@ static void _synchronize_rcu_expedited(struct rcu_state *rsp, /* If expedited grace periods are prohibited, fall back to normal. */ if (rcu_gp_is_normal()) { - wait_rcu_gp(rsp->call); + wait_rcu_gp(call_rcu); return; } From da1df50d16171f4c65da18093d5b5652423f5b99 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 15:37:16 -0700 Subject: [PATCH 039/140] rcu: Remove rcu_state structure's ->rda field The rcu_state structure's ->rda field was used to find the per-CPU rcu_data structures corresponding to that rcu_state structure. But now there is only one rcu_state structure (creatively named "rcu_state") and one set of per-CPU rcu_data structures (creatively named "rcu_data"). Therefore, uses of the ->rda field can always be replaced by "rcu_data, and this commit makes that change and removes the ->rda field. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 67 ++++++++++++++++++++-------------------- kernel/rcu/tree.h | 1 - kernel/rcu/tree_exp.h | 19 ++++++------ kernel/rcu/tree_plugin.h | 24 +++++++------- 4 files changed, 54 insertions(+), 57 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index ea0dfd13fd27..e6b0bb0d00b7 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -75,7 +75,6 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data); struct rcu_state rcu_state = { .level = { &rcu_state.node[0] }, - .rda = &rcu_data, .gp_state = RCU_GP_IDLE, .gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT, .barrier_mutex = __MUTEX_INITIALIZER(rcu_state.barrier_mutex), @@ -586,7 +585,7 @@ void show_rcu_gp_kthreads(void) if (!rcu_is_leaf_node(rnp)) continue; for_each_leaf_node_possible_cpu(rnp, cpu) { - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); if (rdp->gpwrap || ULONG_CMP_GE(rsp->gp_seq, rdp->gp_seq_needed)) @@ -660,7 +659,7 @@ static void rcu_eqs_enter(bool user) trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0, rdtp->dynticks); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); do_nocb_deferred_wakeup(rdp); } rcu_prepare_for_idle(); @@ -1034,7 +1033,7 @@ bool rcu_lockdep_current_cpu_online(void) return true; preempt_disable(); for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); rnp = rdp->mynode; if (rdp->grpmask & rcu_rnp_online_cpus(rnp)) { preempt_enable(); @@ -1352,7 +1351,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) print_cpu_stall_info_end(); for_each_possible_cpu(cpu) - totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(rsp->rda, + totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(&rcu_data, cpu)->cblist); pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n", smp_processor_id(), (long)(jiffies - rsp->gp_start), @@ -1392,7 +1391,7 @@ static void print_cpu_stall(struct rcu_state *rsp) { int cpu; unsigned long flags; - struct rcu_data *rdp = this_cpu_ptr(rsp->rda); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_node *rnp = rcu_get_root(rsp); long totqlen = 0; @@ -1413,7 +1412,7 @@ static void print_cpu_stall(struct rcu_state *rsp) raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags); print_cpu_stall_info_end(); for_each_possible_cpu(cpu) - totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(rsp->rda, + totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(&rcu_data, cpu)->cblist); pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n", jiffies - rsp->gp_start, @@ -1624,7 +1623,7 @@ unlock_out: static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) { bool needmore; - struct rcu_data *rdp = this_cpu_ptr(rsp->rda); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); needmore = ULONG_CMP_LT(rnp->gp_seq, rnp->gp_seq_needed); if (!needmore) @@ -1936,7 +1935,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) rcu_for_each_node_breadth_first(rsp, rnp) { rcu_gp_slow(rsp, gp_init_delay); raw_spin_lock_irqsave_rcu_node(rnp, flags); - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); rcu_preempt_check_blocked_tasks(rsp, rnp); rnp->qsmask = rnp->qsmaskinit; WRITE_ONCE(rnp->gp_seq, rsp->gp_seq); @@ -2050,7 +2049,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) dump_blkd_tasks(rsp, rnp, 10); WARN_ON_ONCE(rnp->qsmask); WRITE_ONCE(rnp->gp_seq, new_gp_seq); - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); if (rnp == rdp->mynode) needgp = __note_gp_changes(rsp, rnp, rdp) || needgp; /* smp_mb() provided by prior unlock-lock pair. */ @@ -2070,7 +2069,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) trace_rcu_grace_period(rsp->name, rsp->gp_seq, TPS("end")); rsp->gp_state = RCU_GP_IDLE; /* Check for GP requests since above loop. */ - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); if (!needgp && ULONG_CMP_LT(rnp->gp_seq, rnp->gp_seq_needed)) { trace_rcu_this_gp(rnp, rdp, rnp->gp_seq_needed, TPS("CleanupMore")); @@ -2405,7 +2404,7 @@ rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp) static void rcu_cleanup_dying_cpu(struct rcu_state *rsp) { RCU_TRACE(bool blkd;) - RCU_TRACE(struct rcu_data *rdp = this_cpu_ptr(rsp->rda);) + RCU_TRACE(struct rcu_data *rdp = this_cpu_ptr(&rcu_data);) RCU_TRACE(struct rcu_node *rnp = rdp->mynode;) if (!IS_ENABLED(CONFIG_HOTPLUG_CPU)) @@ -2469,7 +2468,7 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf) */ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp) { - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ if (!IS_ENABLED(CONFIG_HOTPLUG_CPU)) @@ -2622,7 +2621,7 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)) for_each_leaf_node_possible_cpu(rnp, cpu) { unsigned long bit = leaf_node_cpu_bit(rnp, cpu); if ((rnp->qsmask & bit) != 0) { - if (f(per_cpu_ptr(rsp->rda, cpu))) + if (f(per_cpu_ptr(&rcu_data, cpu))) mask |= bit; } } @@ -2648,7 +2647,7 @@ static void force_quiescent_state(struct rcu_state *rsp) struct rcu_node *rnp_old = NULL; /* Funnel through hierarchy to reduce memory contention. */ - rnp = __this_cpu_read(rsp->rda->mynode); + rnp = __this_cpu_read(rcu_data.mynode); for (; rnp != NULL; rnp = rnp->parent) { ret = (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) || !raw_spin_trylock(&rnp->fqslock); @@ -2740,7 +2739,7 @@ static void __rcu_process_callbacks(struct rcu_state *rsp) { unsigned long flags; - struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); + struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); struct rcu_node *rnp = rdp->mynode; WARN_ON_ONCE(!rdp->beenonline); @@ -2894,14 +2893,14 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func, head->func = func; head->next = NULL; local_irq_save(flags); - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); /* Add the callback to our list. */ if (unlikely(!rcu_segcblist_is_enabled(&rdp->cblist)) || cpu != -1) { int offline; if (cpu != -1) - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); if (likely(rdp->mynode)) { /* Post-boot, so this should be for a no-CBs CPU. */ offline = !__call_rcu_nocb(rdp, head, lazy, flags); @@ -3135,7 +3134,7 @@ static int rcu_pending(void) struct rcu_state *rsp; for_each_rcu_flavor(rsp) - if (__rcu_pending(rsp, this_cpu_ptr(rsp->rda))) + if (__rcu_pending(rsp, this_cpu_ptr(&rcu_data))) return 1; return 0; } @@ -3153,7 +3152,7 @@ static bool rcu_cpu_has_callbacks(bool *all_lazy) struct rcu_state *rsp; for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); if (rcu_segcblist_empty(&rdp->cblist)) continue; hc = true; @@ -3202,7 +3201,7 @@ static void rcu_barrier_callback(struct rcu_head *rhp) static void rcu_barrier_func(void *type) { struct rcu_state *rsp = type; - struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); + struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); _rcu_barrier_trace(rsp, TPS("IRQ"), -1, rsp->barrier_sequence); rdp->barrier_head.func = rcu_barrier_callback; @@ -3262,7 +3261,7 @@ static void _rcu_barrier(struct rcu_state *rsp) for_each_possible_cpu(cpu) { if (!cpu_online(cpu) && !rcu_is_nocb_cpu(cpu)) continue; - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); if (rcu_is_nocb_cpu(cpu)) { if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) { _rcu_barrier_trace(rsp, TPS("OfflineNoCB"), cpu, @@ -3372,7 +3371,7 @@ static void rcu_init_new_rnp(struct rcu_node *rnp_leaf) static void __init rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp) { - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); /* Set up local state, ensuring consistent view of global state. */ rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu); @@ -3398,7 +3397,7 @@ static void rcu_init_percpu_data(int cpu, struct rcu_state *rsp) { unsigned long flags; - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_node *rnp = rcu_get_root(rsp); /* Set up local state, ensuring consistent view of global state. */ @@ -3454,7 +3453,7 @@ int rcutree_prepare_cpu(unsigned int cpu) */ static void rcutree_affinity_setting(unsigned int cpu, int outgoing) { - struct rcu_data *rdp = per_cpu_ptr(rcu_state_p->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); rcu_boost_kthread_setaffinity(rdp->mynode, outgoing); } @@ -3471,7 +3470,7 @@ int rcutree_online_cpu(unsigned int cpu) struct rcu_state *rsp; for_each_rcu_flavor(rsp) { - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); rnp = rdp->mynode; raw_spin_lock_irqsave_rcu_node(rnp, flags); rnp->ffmask |= rdp->grpmask; @@ -3498,7 +3497,7 @@ int rcutree_offline_cpu(unsigned int cpu) struct rcu_state *rsp; for_each_rcu_flavor(rsp) { - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); rnp = rdp->mynode; raw_spin_lock_irqsave_rcu_node(rnp, flags); rnp->ffmask &= ~rdp->grpmask; @@ -3532,7 +3531,7 @@ int rcutree_dead_cpu(unsigned int cpu) for_each_rcu_flavor(rsp) { rcu_cleanup_dead_cpu(cpu, rsp); - do_nocb_deferred_wakeup(per_cpu_ptr(rsp->rda, cpu)); + do_nocb_deferred_wakeup(per_cpu_ptr(&rcu_data, cpu)); } return 0; } @@ -3566,7 +3565,7 @@ void rcu_cpu_starting(unsigned int cpu) per_cpu(rcu_cpu_started, cpu) = 1; for_each_rcu_flavor(rsp) { - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); rnp = rdp->mynode; mask = rdp->grpmask; raw_spin_lock_irqsave_rcu_node(rnp, flags); @@ -3600,7 +3599,7 @@ static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp) { unsigned long flags; unsigned long mask; - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ /* Remove outgoing CPU from mask in the leaf rcu_node structure. */ @@ -3633,7 +3632,7 @@ void rcu_report_dead(unsigned int cpu) /* QS for any half-done expedited RCU-sched GP. */ preempt_disable(); - rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(rcu_state.rda)); + rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(&rcu_data)); preempt_enable(); rcu_preempt_deferred_qs(current); for_each_rcu_flavor(rsp) @@ -3647,7 +3646,7 @@ static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp) { unsigned long flags; struct rcu_data *my_rdp; - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_node *rnp_root = rcu_get_root(rdp->rsp); bool needwake; @@ -3655,7 +3654,7 @@ static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp) return; /* No callbacks to migrate. */ local_irq_save(flags); - my_rdp = this_cpu_ptr(rsp->rda); + my_rdp = this_cpu_ptr(&rcu_data); if (rcu_nocb_adopt_orphan_cbs(my_rdp, rdp, flags)) { local_irq_restore(flags); return; @@ -3857,7 +3856,7 @@ static void __init rcu_init_one(struct rcu_state *rsp) for_each_possible_cpu(i) { while (i > rnp->grphi) rnp++; - per_cpu_ptr(rsp->rda, i)->mynode = rnp; + per_cpu_ptr(&rcu_data, i)->mynode = rnp; rcu_boot_init_percpu_data(i, rsp); } list_add(&rsp->flavors, &rcu_struct_flavors); diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index c50060567146..d60304f1ef56 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -312,7 +312,6 @@ struct rcu_state { struct rcu_node *level[RCU_NUM_LVLS + 1]; /* Hierarchy levels (+1 to */ /* shut bogus gcc warning) */ - struct rcu_data __percpu *rda; /* pointer of percu rcu_data. */ int ncpus; /* # CPUs seen so far. */ /* The following fields are guarded by the root rcu_node's lock. */ diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 224f05f0c0c9..3a8a582d9958 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -286,7 +286,7 @@ static bool sync_exp_work_done(struct rcu_state *rsp, unsigned long s) */ static bool exp_funnel_lock(struct rcu_state *rsp, unsigned long s) { - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, raw_smp_processor_id()); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, raw_smp_processor_id()); struct rcu_node *rnp = rdp->mynode; struct rcu_node *rnp_root = rcu_get_root(rsp); @@ -361,7 +361,7 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp) mask_ofl_test = 0; for_each_leaf_node_cpu_mask(rnp, cpu, rnp->expmask) { unsigned long mask = leaf_node_cpu_bit(rnp, cpu); - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_dynticks *rdtp = per_cpu_ptr(&rcu_dynticks, cpu); int snap; @@ -390,7 +390,7 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp) /* IPI the remaining CPUs for expedited quiescent state. */ for_each_leaf_node_cpu_mask(rnp, cpu, rnp->expmask) { unsigned long mask = leaf_node_cpu_bit(rnp, cpu); - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); if (!(mask_ofl_ipi & mask)) continue; @@ -509,7 +509,7 @@ static void synchronize_sched_expedited_wait(struct rcu_state *rsp) if (!(rnp->expmask & mask)) continue; ndetected++; - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); pr_cont(" %d-%c%c%c", cpu, "O."[!!cpu_online(cpu)], "o."[!!(rdp->grpmask & rnp->expmaskinit)], @@ -642,7 +642,7 @@ static void _synchronize_rcu_expedited(struct rcu_state *rsp, } /* Wait for expedited grace period to complete. */ - rdp = per_cpu_ptr(rsp->rda, raw_smp_processor_id()); + rdp = per_cpu_ptr(&rcu_data, raw_smp_processor_id()); rnp = rcu_get_root(rsp); wait_event(rnp->exp_wq[rcu_seq_ctr(s) & 0x3], sync_exp_work_done(rsp, s)); @@ -665,7 +665,7 @@ static void sync_rcu_exp_handler(void *info) { unsigned long flags; struct rcu_state *rsp = info; - struct rcu_data *rdp = this_cpu_ptr(rsp->rda); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_node *rnp = rdp->mynode; struct task_struct *t = current; @@ -772,13 +772,12 @@ EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); #else /* #ifdef CONFIG_PREEMPT_RCU */ /* Invoked on each online non-idle CPU for expedited quiescent state. */ -static void sync_sched_exp_handler(void *data) +static void sync_sched_exp_handler(void *unused) { struct rcu_data *rdp; struct rcu_node *rnp; - struct rcu_state *rsp = data; - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); rnp = rdp->mynode; if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) || __this_cpu_read(rcu_data.cpu_no_qs.b.exp)) @@ -801,7 +800,7 @@ static void sync_sched_exp_online_cleanup(int cpu) struct rcu_node *rnp; struct rcu_state *rsp = &rcu_state; - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); rnp = rdp->mynode; if (!(READ_ONCE(rnp->expmask) & rdp->grpmask)) return; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 2c81f8dd63b4..b7a99a6e64b6 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -328,7 +328,7 @@ static void rcu_qs(void) void rcu_note_context_switch(bool preempt) { struct task_struct *t = current; - struct rcu_data *rdp = this_cpu_ptr(rcu_state_p->rda); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_node *rnp; barrier(); /* Avoid RCU read-side critical sections leaking down. */ @@ -488,7 +488,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) * t->rcu_read_unlock_special cannot change. */ special = t->rcu_read_unlock_special; - rdp = this_cpu_ptr(rcu_state_p->rda); + rdp = this_cpu_ptr(&rcu_data); if (!special.s && !rdp->deferred_qs) { local_irq_restore(flags); return; @@ -911,7 +911,7 @@ dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck) } pr_cont("\n"); for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++) { - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); onl = !!(rdp->grpmask & rcu_rnp_online_cpus(rnp)); pr_info("\t%d: %c online: %ld(%d) offline: %ld(%d)\n", cpu, ".o"[onl], @@ -1437,7 +1437,7 @@ static void __init rcu_spawn_boost_kthreads(void) static void rcu_prepare_kthreads(int cpu) { - struct rcu_data *rdp = per_cpu_ptr(rcu_state_p->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_node *rnp = rdp->mynode; /* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */ @@ -1574,7 +1574,7 @@ static bool __maybe_unused rcu_try_advance_all_cbs(void) rdtp->last_advance_all = jiffies; for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); rnp = rdp->mynode; /* @@ -1692,7 +1692,7 @@ static void rcu_prepare_for_idle(void) return; rdtp->last_accelerate = jiffies; for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(rsp->rda); + rdp = this_cpu_ptr(&rcu_data); if (!rcu_segcblist_pend_cbs(&rdp->cblist)) continue; rnp = rdp->mynode; @@ -1778,7 +1778,7 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu) { unsigned long delta; char fast_no_hz[72]; - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_dynticks *rdtp = rdp->dynticks; char *ticks_title; unsigned long ticks_value; @@ -1833,7 +1833,7 @@ static void increment_cpu_stall_ticks(void) struct rcu_state *rsp; for_each_rcu_flavor(rsp) - raw_cpu_inc(rsp->rda->ticks_this_gp); + raw_cpu_inc(rcu_data.ticks_this_gp); } #ifdef CONFIG_RCU_NOCB_CPU @@ -1965,7 +1965,7 @@ static void wake_nocb_leader_defer(struct rcu_data *rdp, int waketype, */ static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu) { - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); unsigned long ret; #ifdef CONFIG_PROVE_RCU struct rcu_head *rhp; @@ -2426,7 +2426,7 @@ void __init rcu_init_nohz(void) for_each_rcu_flavor(rsp) { for_each_cpu(cpu, rcu_nocb_mask) - init_nocb_callback_list(per_cpu_ptr(rsp->rda, cpu)); + init_nocb_callback_list(per_cpu_ptr(&rcu_data, cpu)); rcu_organize_nocb_kthreads(rsp); } } @@ -2452,7 +2452,7 @@ static void rcu_spawn_one_nocb_kthread(struct rcu_state *rsp, int cpu) struct rcu_data *rdp; struct rcu_data *rdp_last; struct rcu_data *rdp_old_leader; - struct rcu_data *rdp_spawn = per_cpu_ptr(rsp->rda, cpu); + struct rcu_data *rdp_spawn = per_cpu_ptr(&rcu_data, cpu); struct task_struct *t; /* @@ -2545,7 +2545,7 @@ static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp) * we will spawn the needed set of rcu_nocb_kthread() kthreads. */ for_each_cpu(cpu, rcu_nocb_mask) { - rdp = per_cpu_ptr(rsp->rda, cpu); + rdp = per_cpu_ptr(&rcu_data, cpu); if (rdp->cpu >= nl) { /* New leader, set up for followers & next leader. */ nl = DIV_ROUND_UP(rdp->cpu + 1, ls) * ls; From 16fc9c600b3caf97f42cdd1e35309b7529a55cfb Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 15:54:39 -0700 Subject: [PATCH 040/140] rcu: Remove rcu_state_p pointer to default rcu_state structure The rcu_state_p pointer references the default rcu_state structure, that is, the one that call_rcu() uses, as opposed to call_rcu_bh() and sometimes call_rcu_sched(). But there is now only one rcu_state structure, so that one structure is by definition the default, which means that the rcu_state_p pointer no longer serves any useful purpose. This commit therefore removes it. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 27 ++++++++++++--------------- kernel/rcu/tree_exp.h | 2 +- kernel/rcu/tree_plugin.h | 16 ++++++++-------- 3 files changed, 21 insertions(+), 24 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e6b0bb0d00b7..e3cdec55ef3c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -85,7 +85,6 @@ struct rcu_state rcu_state = { .ofl_lock = __SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), }; -static struct rcu_state *const rcu_state_p = &rcu_state; static struct rcu_data __percpu *const rcu_data_p = &rcu_data; LIST_HEAD(rcu_struct_flavors); @@ -491,7 +490,7 @@ static int rcu_pending(void); */ unsigned long rcu_get_gp_seq(void) { - return READ_ONCE(rcu_state_p->gp_seq); + return READ_ONCE(rcu_state.gp_seq); } EXPORT_SYMBOL_GPL(rcu_get_gp_seq); @@ -510,7 +509,7 @@ EXPORT_SYMBOL_GPL(rcu_sched_get_gp_seq); */ unsigned long rcu_bh_get_gp_seq(void) { - return READ_ONCE(rcu_state_p->gp_seq); + return READ_ONCE(rcu_state.gp_seq); } EXPORT_SYMBOL_GPL(rcu_bh_get_gp_seq); @@ -522,7 +521,7 @@ EXPORT_SYMBOL_GPL(rcu_bh_get_gp_seq); */ unsigned long rcu_exp_batches_completed(void) { - return rcu_state_p->expedited_sequence; + return rcu_state.expedited_sequence; } EXPORT_SYMBOL_GPL(rcu_exp_batches_completed); @@ -541,7 +540,7 @@ EXPORT_SYMBOL_GPL(rcu_exp_batches_completed_sched); */ void rcu_force_quiescent_state(void) { - force_quiescent_state(rcu_state_p); + force_quiescent_state(&rcu_state); } EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); @@ -550,7 +549,7 @@ EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); */ void rcu_bh_force_quiescent_state(void) { - force_quiescent_state(rcu_state_p); + force_quiescent_state(&rcu_state); } EXPORT_SYMBOL_GPL(rcu_bh_force_quiescent_state); @@ -611,7 +610,7 @@ void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, case RCU_FLAVOR: case RCU_BH_FLAVOR: case RCU_SCHED_FLAVOR: - rsp = rcu_state_p; + rsp = &rcu_state; break; default: break; @@ -2292,7 +2291,6 @@ rcu_report_unblock_qs_rnp(struct rcu_state *rsp, raw_lockdep_assert_held_rcu_node(rnp); if (WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT)) || - WARN_ON_ONCE(rsp != rcu_state_p) || WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)) || rnp->qsmask != 0) { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); @@ -2604,7 +2602,6 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)) raw_spin_lock_irqsave_rcu_node(rnp, flags); if (rnp->qsmask == 0) { if (!IS_ENABLED(CONFIG_PREEMPT) || - rsp != rcu_state_p || rcu_preempt_blocked_readers_cgp(rnp)) { /* * No point in scanning bits because they @@ -2973,7 +2970,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func, */ void call_rcu(struct rcu_head *head, rcu_callback_t func) { - __call_rcu(head, func, rcu_state_p, -1, 0); + __call_rcu(head, func, &rcu_state, -1, 0); } EXPORT_SYMBOL_GPL(call_rcu); @@ -3000,7 +2997,7 @@ EXPORT_SYMBOL_GPL(call_rcu_sched); void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func) { - __call_rcu(head, func, rcu_state_p, -1, 1); + __call_rcu(head, func, &rcu_state, -1, 1); } EXPORT_SYMBOL_GPL(kfree_call_rcu); @@ -3029,7 +3026,7 @@ unsigned long get_state_synchronize_rcu(void) * before the load from ->gp_seq. */ smp_mb(); /* ^^^ */ - return rcu_seq_snap(&rcu_state_p->gp_seq); + return rcu_seq_snap(&rcu_state.gp_seq); } EXPORT_SYMBOL_GPL(get_state_synchronize_rcu); @@ -3049,7 +3046,7 @@ EXPORT_SYMBOL_GPL(get_state_synchronize_rcu); */ void cond_synchronize_rcu(unsigned long oldstate) { - if (!rcu_seq_done(&rcu_state_p->gp_seq, oldstate)) + if (!rcu_seq_done(&rcu_state.gp_seq, oldstate)) synchronize_rcu(); else smp_mb(); /* Ensure GP ends before subsequent accesses. */ @@ -3308,7 +3305,7 @@ static void _rcu_barrier(struct rcu_state *rsp) */ void rcu_barrier_bh(void) { - _rcu_barrier(rcu_state_p); + _rcu_barrier(&rcu_state); } EXPORT_SYMBOL_GPL(rcu_barrier_bh); @@ -3322,7 +3319,7 @@ EXPORT_SYMBOL_GPL(rcu_barrier_bh); */ void rcu_barrier(void) { - _rcu_barrier(rcu_state_p); + _rcu_barrier(&rcu_state); } EXPORT_SYMBOL_GPL(rcu_barrier); diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 3a8a582d9958..298a6904bbcd 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -756,7 +756,7 @@ static void sync_sched_exp_online_cleanup(int cpu) */ void synchronize_rcu_expedited(void) { - struct rcu_state *rsp = rcu_state_p; + struct rcu_state *rsp = &rcu_state; RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || lock_is_held(&rcu_lock_map) || diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index b7a99a6e64b6..329d5802d899 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -381,7 +381,7 @@ void rcu_note_context_switch(bool preempt) */ rcu_qs(); if (rdp->deferred_qs) - rcu_report_exp_rdp(rcu_state_p, rdp); + rcu_report_exp_rdp(&rcu_state, rdp); trace_rcu_utilization(TPS("End context switch")); barrier(); /* Avoid RCU read-side critical sections leaking up. */ } @@ -509,7 +509,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) * blocked-tasks list below. */ if (rdp->deferred_qs) { - rcu_report_exp_rdp(rcu_state_p, rdp); + rcu_report_exp_rdp(&rcu_state, rdp); if (!t->rcu_read_unlock_special.s) { local_irq_restore(flags); return; @@ -566,7 +566,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) rnp->grplo, rnp->grphi, !!rnp->gp_tasks); - rcu_report_unblock_qs_rnp(rcu_state_p, rnp, flags); + rcu_report_unblock_qs_rnp(&rcu_state, rnp, flags); } else { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } @@ -580,7 +580,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) * then we need to report up the rcu_node hierarchy. */ if (!empty_exp && empty_exp_now) - rcu_report_exp_rnp(rcu_state_p, rnp, true); + rcu_report_exp_rnp(&rcu_state, rnp, true); } else { local_irq_restore(flags); } @@ -1300,7 +1300,7 @@ static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp, struct sched_param sp; struct task_struct *t; - if (rcu_state_p != rsp) + if (&rcu_state != rsp) return 0; if (!rcu_scheduler_fully_active || rcu_rnp_online_cpus(rnp) == 0) @@ -1431,8 +1431,8 @@ static void __init rcu_spawn_boost_kthreads(void) for_each_possible_cpu(cpu) per_cpu(rcu_cpu_has_work, cpu) = 0; BUG_ON(smpboot_register_percpu_thread(&rcu_cpu_thread_spec)); - rcu_for_each_leaf_node(rcu_state_p, rnp) - (void)rcu_spawn_one_boost_kthread(rcu_state_p, rnp); + rcu_for_each_leaf_node(&rcu_state, rnp) + (void)rcu_spawn_one_boost_kthread(&rcu_state, rnp); } static void rcu_prepare_kthreads(int cpu) @@ -1442,7 +1442,7 @@ static void rcu_prepare_kthreads(int cpu) /* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */ if (rcu_scheduler_fully_active) - (void)rcu_spawn_one_boost_kthread(rcu_state_p, rnp); + (void)rcu_spawn_one_boost_kthread(&rcu_state, rnp); } #else /* #ifdef CONFIG_RCU_BOOST */ From 2280ee5a7d3efca0dbb2c241029b6c63bec50a6b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 15:54:39 -0700 Subject: [PATCH 041/140] rcu: Remove rcu_data_p pointer to default rcu_data structure The rcu_data_p pointer references the default set of per-CPU rcu_data structures, that is, those that call_rcu() uses, as opposed to call_rcu_bh() and sometimes call_rcu_sched(). But there is now only one set of per-CPU rcu_data structures, so that one set is by definition the default, which means that the rcu_data_p pointer no longer serves any useful purpose. This commit therefore removes it. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 1 - kernel/rcu/tree_plugin.h | 10 +++++----- 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e3cdec55ef3c..b650b0c9897e 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -85,7 +85,6 @@ struct rcu_state rcu_state = { .ofl_lock = __SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), }; -static struct rcu_data __percpu *const rcu_data_p = &rcu_data; LIST_HEAD(rcu_struct_flavors); /* Dump rcu_node combining tree at boot to verify correct setup. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 329d5802d899..18175ca19f34 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -302,11 +302,11 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp) static void rcu_qs(void) { RCU_LOCKDEP_WARN(preemptible(), "rcu_qs() invoked with preemption enabled!!!\n"); - if (__this_cpu_read(rcu_data_p->cpu_no_qs.s)) { + if (__this_cpu_read(rcu_data.cpu_no_qs.s)) { trace_rcu_grace_period(TPS("rcu_preempt"), - __this_cpu_read(rcu_data_p->gp_seq), + __this_cpu_read(rcu_data.gp_seq), TPS("cpuqs")); - __this_cpu_write(rcu_data_p->cpu_no_qs.b.norm, false); + __this_cpu_write(rcu_data.cpu_no_qs.b.norm, false); barrier(); /* Coordinate with rcu_flavor_check_callbacks(). */ current->rcu_read_unlock_special.b.need_qs = false; } @@ -805,8 +805,8 @@ static void rcu_flavor_check_callbacks(int user) /* If GP is oldish, ask for help from rcu_read_unlock_special(). */ if (t->rcu_read_lock_nesting > 0 && - __this_cpu_read(rcu_data_p->core_needs_qs) && - __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm) && + __this_cpu_read(rcu_data.core_needs_qs) && + __this_cpu_read(rcu_data.cpu_no_qs.b.norm) && !t->rcu_read_unlock_special.b.need_qs && time_after(jiffies, rsp->gp_start + HZ)) t->rcu_read_unlock_special.b.need_qs = true; From b50912d0b5e03f11004fec1e2b50244de9e2fa41 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 042/140] rcu: Remove rsp parameter from rcu_report_qs_rnp() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_report_qs_rnp(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index b650b0c9897e..919033d2c083 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -132,9 +132,8 @@ EXPORT_SYMBOL_GPL(rcu_scheduler_active); */ static int rcu_scheduler_fully_active __read_mostly; -static void -rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp, - struct rcu_node *rnp, unsigned long gps, unsigned long flags); +static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, + unsigned long gps, unsigned long flags); static void rcu_init_new_rnp(struct rcu_node *rnp_leaf); static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf); static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu); @@ -1947,7 +1946,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) mask = rnp->qsmask & ~rnp->qsmaskinitnext; rnp->rcu_gp_init_mask = mask; if ((mask || rnp->wait_blkd_tasks) && rcu_is_leaf_node(rnp)) - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); else raw_spin_unlock_irq_rcu_node(rnp); cond_resched_tasks_rcu_qs(); @@ -2214,13 +2213,13 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags) * disabled. This allows propagating quiescent state due to resumed tasks * during grace-period initialization. */ -static void -rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp, - struct rcu_node *rnp, unsigned long gps, unsigned long flags) +static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, + unsigned long gps, unsigned long flags) __releases(rnp->lock) { unsigned long oldmask = 0; struct rcu_node *rnp_c; + struct rcu_state __maybe_unused *rsp = &rcu_state; raw_lockdep_assert_held_rcu_node(rnp); @@ -2312,7 +2311,7 @@ rcu_report_unblock_qs_rnp(struct rcu_state *rsp, mask = rnp->grpmask; raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ raw_spin_lock_rcu_node(rnp_p); /* irqs already disabled. */ - rcu_report_qs_rnp(mask, rsp, rnp_p, gps, flags); + rcu_report_qs_rnp(mask, rnp_p, gps, flags); } /* @@ -2355,7 +2354,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp) */ needwake = rcu_accelerate_cbs(rsp, rnp, rdp); - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); /* ^^^ Released rnp->lock */ if (needwake) rcu_gp_kthread_wake(rsp); @@ -2623,7 +2622,7 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)) } if (mask != 0) { /* Idle/offline CPUs, report (releases rnp->lock). */ - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); } else { /* Nothing to do here, so just drop the lock. */ raw_spin_unlock_irqrestore_rcu_node(rnp, flags); @@ -3577,7 +3576,7 @@ void rcu_cpu_starting(unsigned int cpu) rdp->rcu_onl_gp_flags = READ_ONCE(rsp->gp_flags); if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */ /* Report QS -after- changing ->qsmaskinitnext! */ - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); } else { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } @@ -3606,7 +3605,7 @@ static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp) rdp->rcu_ofl_gp_flags = READ_ONCE(rsp->gp_flags); if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */ /* Report quiescent state -before- changing ->qsmaskinitnext! */ - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); raw_spin_lock_irqsave_rcu_node(rnp, flags); } rnp->qsmaskinitnext &= ~mask; From aff4e9ede52badf550745c3d30ed5fcf86ed4351 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 043/140] rcu: Remove rsp parameter from rcu_report_qs_rsp() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_report_qs_rsp(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 919033d2c083..2665a45ccb43 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -138,7 +138,7 @@ static void rcu_init_new_rnp(struct rcu_node *rnp_leaf); static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf); static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu); static void invoke_rcu_core(void); -static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp); +static void invoke_rcu_callbacks(struct rcu_data *rdp); static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp); static void sync_sched_exp_online_cleanup(int cpu); @@ -2189,9 +2189,11 @@ static int __noreturn rcu_gp_kthread(void *arg) * just-completed grace period. Note that the caller must hold rnp->lock, * which is released before return. */ -static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags) +static void rcu_report_qs_rsp(unsigned long flags) __releases(rcu_get_root(rsp)->lock) { + struct rcu_state *rsp = &rcu_state; + raw_lockdep_assert_held_rcu_node(rcu_get_root(rsp)); WARN_ON_ONCE(!rcu_gp_in_progress(rsp)); WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); @@ -2268,7 +2270,7 @@ static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, * state for this grace period. Invoke rcu_report_qs_rsp() * to clean up and start the next grace period if one is needed. */ - rcu_report_qs_rsp(rsp, flags); /* releases rnp->lock. */ + rcu_report_qs_rsp(flags); /* releases rnp->lock. */ } /* @@ -2302,7 +2304,7 @@ rcu_report_unblock_qs_rnp(struct rcu_state *rsp, * Only one rcu_node structure in the tree, so don't * try to report up to its nonexistent parent! */ - rcu_report_qs_rsp(rsp, flags); + rcu_report_qs_rsp(flags); return; } @@ -2761,7 +2763,7 @@ __rcu_process_callbacks(struct rcu_state *rsp) /* If there are callbacks ready, invoke them. */ if (rcu_segcblist_ready_cbs(&rdp->cblist)) - invoke_rcu_callbacks(rsp, rdp); + invoke_rcu_callbacks(rdp); /* Do any needed deferred wakeups of rcuo kthreads. */ do_nocb_deferred_wakeup(rdp); @@ -2789,8 +2791,10 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused * are running on the current CPU with softirqs disabled, the * rcu_cpu_kthread_task cannot disappear out from under us. */ -static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp) +static void invoke_rcu_callbacks(struct rcu_data *rdp) { + struct rcu_state *rsp = &rcu_state; + if (unlikely(!READ_ONCE(rcu_scheduler_fully_active))) return; if (likely(!rsp->boost)) { From 139ad4da5ab5d5600b46d930dbf4419577039d9c Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 044/140] rcu: Remove rsp parameter from rcu_report_unblock_qs_rnp() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_report_unblock_qs_rnp(), which is particularly appropriate in this case given that this parameter is no longer used. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 3 +-- kernel/rcu/tree_plugin.h | 2 +- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 2665a45ccb43..58aca700d67b 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2281,8 +2281,7 @@ static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, * disabled. */ static void __maybe_unused -rcu_report_unblock_qs_rnp(struct rcu_state *rsp, - struct rcu_node *rnp, unsigned long flags) +rcu_report_unblock_qs_rnp(struct rcu_node *rnp, unsigned long flags) __releases(rnp->lock) { unsigned long gps; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 18175ca19f34..566828ecaecb 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -566,7 +566,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) rnp->grplo, rnp->grphi, !!rnp->gp_tasks); - rcu_report_unblock_qs_rnp(&rcu_state, rnp, flags); + rcu_report_unblock_qs_rnp(rnp, flags); } else { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } From 33085c469aeaef3e1f8a203128cf886490419205 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 045/140] rcu: Remove rsp parameter from rcu_report_qs_rdp() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_report_qs_rdp(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 58aca700d67b..cdf53f8b31cd 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2320,12 +2320,13 @@ rcu_report_unblock_qs_rnp(struct rcu_node *rnp, unsigned long flags) * structure. This must be called from the specified CPU. */ static void -rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp) +rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) { unsigned long flags; unsigned long mask; bool needwake; struct rcu_node *rnp; + struct rcu_state *rsp = &rcu_state; rnp = rdp->mynode; raw_spin_lock_irqsave_rcu_node(rnp, flags); @@ -2392,7 +2393,7 @@ rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp) * Tell RCU we are done (but rcu_report_qs_rdp() will be the * judge of that). */ - rcu_report_qs_rdp(rdp->cpu, rsp, rdp); + rcu_report_qs_rdp(rdp->cpu, rdp); } /* From de8e87305a1ae878f7c518fd9cadcc9159cda493 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 046/140] rcu: Remove rsp parameter from rcu_gp_in_progress() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_gp_in_progress(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 30 +++++++++++++++--------------- kernel/rcu/tree_plugin.h | 2 +- 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index cdf53f8b31cd..1a2956d9e999 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -189,9 +189,9 @@ unsigned long rcu_rnp_online_cpus(struct rcu_node *rnp) * permit this function to be invoked without holding the root rcu_node * structure's ->lock, but of course results can be subject to change. */ -static int rcu_gp_in_progress(struct rcu_state *rsp) +static int rcu_gp_in_progress(void) { - return rcu_seq_state(rcu_seq_current(&rsp->gp_seq)); + return rcu_seq_state(rcu_seq_current(&rcu_state.gp_seq)); } void rcu_softirq_qs(void) @@ -1297,7 +1297,7 @@ static void rcu_stall_kick_kthreads(struct rcu_state *rsp) return; j = READ_ONCE(rsp->jiffies_kick_kthreads); if (time_after(jiffies, j) && rsp->gp_kthread && - (rcu_gp_in_progress(rsp) || READ_ONCE(rsp->gp_flags))) { + (rcu_gp_in_progress() || READ_ONCE(rsp->gp_flags))) { WARN_ONCE(1, "Kicking %s grace-period kthread\n", rsp->name); rcu_ftrace_dump(DUMP_ALL); wake_up_process(rsp->gp_kthread); @@ -1449,7 +1449,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) struct rcu_node *rnp; if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || - !rcu_gp_in_progress(rsp)) + !rcu_gp_in_progress()) return; rcu_stall_kick_kthreads(rsp); j = jiffies; @@ -1484,14 +1484,14 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) return; /* No stall or GP completed since entering function. */ rnp = rdp->mynode; jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; - if (rcu_gp_in_progress(rsp) && + if (rcu_gp_in_progress() && (READ_ONCE(rnp->qsmask) & rdp->grpmask) && cmpxchg(&rsp->jiffies_stall, js, jn) == js) { /* We haven't checked in, so go dump stack. */ print_cpu_stall(rsp); - } else if (rcu_gp_in_progress(rsp) && + } else if (rcu_gp_in_progress() && ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && cmpxchg(&rsp->jiffies_stall, js, jn) == js) { @@ -1589,7 +1589,7 @@ static bool rcu_start_this_gp(struct rcu_node *rnp_start, struct rcu_data *rdp, } /* If GP already in progress, just leave, otherwise start one. */ - if (rcu_gp_in_progress(rsp)) { + if (rcu_gp_in_progress()) { trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("Startedleafroot")); goto unlock_out; } @@ -1846,7 +1846,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) } WRITE_ONCE(rsp->gp_flags, 0); /* Clear all flags: New grace period. */ - if (WARN_ON_ONCE(rcu_gp_in_progress(rsp))) { + if (WARN_ON_ONCE(rcu_gp_in_progress())) { /* * Grace period already in progress, don't start another. * Not supposed to be able to happen. @@ -2195,7 +2195,7 @@ static void rcu_report_qs_rsp(unsigned long flags) struct rcu_state *rsp = &rcu_state; raw_lockdep_assert_held_rcu_node(rcu_get_root(rsp)); - WARN_ON_ONCE(!rcu_gp_in_progress(rsp)); + WARN_ON_ONCE(!rcu_gp_in_progress()); WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(rsp), flags); rcu_gp_kthread_wake(rsp); @@ -2682,7 +2682,7 @@ rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_node *rnp_root = rcu_get_root(rsp); static atomic_t warned = ATOMIC_INIT(0); - if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress(rsp) || + if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() || ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed)) return; j = jiffies; /* Expensive access, and in common case don't get here. */ @@ -2693,7 +2693,7 @@ rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp, raw_spin_lock_irqsave_rcu_node(rnp, flags); j = jiffies; - if (rcu_gp_in_progress(rsp) || + if (rcu_gp_in_progress() || ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) || time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) || @@ -2706,7 +2706,7 @@ rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp, if (rnp_root != rnp) raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */ j = jiffies; - if (rcu_gp_in_progress(rsp) || + if (rcu_gp_in_progress() || ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || time_before(j, rsp->gp_req_activity + gpssdelay) || time_before(j, rsp->gp_activity + gpssdelay) || @@ -2751,7 +2751,7 @@ __rcu_process_callbacks(struct rcu_state *rsp) rcu_check_quiescent_state(rsp, rdp); /* No grace period and unregistered callbacks? */ - if (!rcu_gp_in_progress(rsp) && + if (!rcu_gp_in_progress() && rcu_segcblist_is_enabled(&rdp->cblist)) { local_irq_save(flags); if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) @@ -2841,7 +2841,7 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp, note_gp_changes(rsp, rdp); /* Start a new grace period if one not already started. */ - if (!rcu_gp_in_progress(rsp)) { + if (!rcu_gp_in_progress()) { rcu_accelerate_cbs_unlocked(rsp, rdp->mynode, rdp); } else { /* Give the grace period a kick. */ @@ -3105,7 +3105,7 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp) return 1; /* Has RCU gone idle with this CPU needing another grace period? */ - if (!rcu_gp_in_progress(rsp) && + if (!rcu_gp_in_progress() && rcu_segcblist_is_enabled(&rdp->cblist) && !rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) return 1; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 566828ecaecb..99f517035a6e 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -2655,7 +2655,7 @@ static bool rcu_nohz_full_cpu(struct rcu_state *rsp) { #ifdef CONFIG_NO_HZ_FULL if (tick_nohz_full_cpu(smp_processor_id()) && - (!rcu_gp_in_progress(rsp) || + (!rcu_gp_in_progress() || ULONG_CMP_LT(jiffies, READ_ONCE(rsp->gp_start) + HZ))) return true; #endif /* #ifdef CONFIG_NO_HZ_FULL */ From 336a4f6c451e488b5388a2593fa20f7192706c7b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 047/140] rcu: Remove rsp parameter from rcu_get_root() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_get_root(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 38 +++++++++++++++++++------------------- kernel/rcu/tree_exp.h | 6 +++--- kernel/rcu/tree_plugin.h | 2 +- 3 files changed, 23 insertions(+), 23 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 1a2956d9e999..8d0e18faab3b 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -623,9 +623,9 @@ EXPORT_SYMBOL_GPL(rcutorture_get_gp_data); /* * Return the root node of the specified rcu_state structure. */ -static struct rcu_node *rcu_get_root(struct rcu_state *rsp) +static struct rcu_node *rcu_get_root(void) { - return &rsp->node[0]; + return &rcu_state.node[0]; } /* @@ -1318,7 +1318,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) unsigned long gpa; unsigned long j; int ndetected = 0; - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); long totqlen = 0; /* Kick and suppress, if so configured. */ @@ -1367,7 +1367,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n", rsp->name, j - gpa, j, gpa, jiffies_till_next_fqs, - rcu_get_root(rsp)->qsmask); + rcu_get_root()->qsmask); /* In this case, the current CPU might be at fault. */ sched_show_task(current); } @@ -1389,7 +1389,7 @@ static void print_cpu_stall(struct rcu_state *rsp) int cpu; unsigned long flags; struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); long totqlen = 0; /* Kick and suppress, if so configured. */ @@ -1835,7 +1835,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) unsigned long oldmask; unsigned long mask; struct rcu_data *rdp; - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); WRITE_ONCE(rsp->gp_activity, jiffies); raw_spin_lock_irq_rcu_node(rnp); @@ -1962,7 +1962,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) */ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp) { - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); /* Someone like call_rcu() requested a force-quiescent-state scan. */ *gfp = READ_ONCE(rsp->gp_flags); @@ -1981,7 +1981,7 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp) */ static void rcu_gp_fqs(struct rcu_state *rsp, bool first_time) { - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); WRITE_ONCE(rsp->gp_activity, jiffies); rsp->n_force_qs++; @@ -2010,7 +2010,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) bool needgp = false; unsigned long new_gp_seq; struct rcu_data *rdp; - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); struct swait_queue_head *sq; WRITE_ONCE(rsp->gp_activity, jiffies); @@ -2058,7 +2058,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) WRITE_ONCE(rsp->gp_activity, jiffies); rcu_gp_slow(rsp, gp_cleanup_delay); } - rnp = rcu_get_root(rsp); + rnp = rcu_get_root(); raw_spin_lock_irq_rcu_node(rnp); /* GP before rsp->gp_seq update. */ /* Declare grace period done. */ @@ -2094,7 +2094,7 @@ static int __noreturn rcu_gp_kthread(void *arg) unsigned long j; int ret; struct rcu_state *rsp = arg; - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); rcu_bind_gp_kthread(); for (;;) { @@ -2190,14 +2190,14 @@ static int __noreturn rcu_gp_kthread(void *arg) * which is released before return. */ static void rcu_report_qs_rsp(unsigned long flags) - __releases(rcu_get_root(rsp)->lock) + __releases(rcu_get_root()->lock) { struct rcu_state *rsp = &rcu_state; - raw_lockdep_assert_held_rcu_node(rcu_get_root(rsp)); + raw_lockdep_assert_held_rcu_node(rcu_get_root()); WARN_ON_ONCE(!rcu_gp_in_progress()); WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); - raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(rsp), flags); + raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(), flags); rcu_gp_kthread_wake(rsp); } @@ -2654,7 +2654,7 @@ static void force_quiescent_state(struct rcu_state *rsp) return; rnp_old = rnp; } - /* rnp_old == rcu_get_root(rsp), rnp == NULL. */ + /* rnp_old == rcu_get_root(), rnp == NULL. */ /* Reached the root of the rcu_node tree, acquire lock. */ raw_spin_lock_irqsave_rcu_node(rnp_old, flags); @@ -2679,7 +2679,7 @@ rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp, const unsigned long gpssdelay = rcu_jiffies_till_stall_check() * HZ; unsigned long flags; unsigned long j; - struct rcu_node *rnp_root = rcu_get_root(rsp); + struct rcu_node *rnp_root = rcu_get_root(); static atomic_t warned = ATOMIC_INIT(0); if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() || @@ -3397,7 +3397,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp) { unsigned long flags; struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); /* Set up local state, ensuring consistent view of global state. */ raw_spin_lock_irqsave_rcu_node(rnp, flags); @@ -3646,7 +3646,7 @@ static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp) unsigned long flags; struct rcu_data *my_rdp; struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); - struct rcu_node *rnp_root = rcu_get_root(rdp->rsp); + struct rcu_node *rnp_root = rcu_get_root(); bool needwake; if (rcu_is_nocb_cpu(cpu) || rcu_segcblist_empty(&rdp->cblist)) @@ -3744,7 +3744,7 @@ static int __init rcu_spawn_gp_kthread(void) for_each_rcu_flavor(rsp) { t = kthread_create(rcu_gp_kthread, rsp, "%s", rsp->name); BUG_ON(IS_ERR(t)); - rnp = rcu_get_root(rsp); + rnp = rcu_get_root(); raw_spin_lock_irqsave_rcu_node(rnp, flags); rsp->gp_kthread = t; if (kthread_prio) { diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 298a6904bbcd..0bcbb03c9702 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -288,7 +288,7 @@ static bool exp_funnel_lock(struct rcu_state *rsp, unsigned long s) { struct rcu_data *rdp = per_cpu_ptr(&rcu_data, raw_smp_processor_id()); struct rcu_node *rnp = rdp->mynode; - struct rcu_node *rnp_root = rcu_get_root(rsp); + struct rcu_node *rnp_root = rcu_get_root(); /* Low-contention fastpath. */ if (ULONG_CMP_LT(READ_ONCE(rnp->exp_seq_rq), s) && @@ -479,7 +479,7 @@ static void synchronize_sched_expedited_wait(struct rcu_state *rsp) unsigned long mask; int ndetected; struct rcu_node *rnp; - struct rcu_node *rnp_root = rcu_get_root(rsp); + struct rcu_node *rnp_root = rcu_get_root(); int ret; trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("startwait")); @@ -643,7 +643,7 @@ static void _synchronize_rcu_expedited(struct rcu_state *rsp, /* Wait for expedited grace period to complete. */ rdp = per_cpu_ptr(&rcu_data, raw_smp_processor_id()); - rnp = rcu_get_root(rsp); + rnp = rcu_get_root(); wait_event(rnp->exp_wq[rcu_seq_ctr(s) & 0x3], sync_exp_work_done(rsp, s)); smp_mb(); /* Workqueue actions happen before return. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 99f517035a6e..545e4ac9422a 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -685,7 +685,7 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp) */ static void rcu_print_detail_task_stall(struct rcu_state *rsp) { - struct rcu_node *rnp = rcu_get_root(rsp); + struct rcu_node *rnp = rcu_get_root(); rcu_print_detail_task_stall_rnp(rnp); rcu_for_each_leaf_node(rsp, rnp) From ad3832e974eba3b6d253d60a28eac2f2da7ea7ff Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 048/140] rcu: Remove rsp parameter from record_gp_stall_check_time() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from record_gp_stall_check_time(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 8d0e18faab3b..bcfdb92d5d10 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1214,17 +1214,17 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) return 0; } -static void record_gp_stall_check_time(struct rcu_state *rsp) +static void record_gp_stall_check_time(void) { unsigned long j = jiffies; unsigned long j1; - rsp->gp_start = j; + rcu_state.gp_start = j; j1 = rcu_jiffies_till_stall_check(); /* Record ->gp_start before ->jiffies_stall. */ - smp_store_release(&rsp->jiffies_stall, j + j1); /* ^^^ */ - rsp->jiffies_resched = j + j1 / 2; - rsp->n_force_qs_gpstart = READ_ONCE(rsp->n_force_qs); + smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */ + rcu_state.jiffies_resched = j + j1 / 2; + rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs); } /* @@ -1856,7 +1856,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) } /* Advance to a new grace period and initialize state. */ - record_gp_stall_check_time(rsp); + record_gp_stall_check_time(); /* Record GP times before starting GP, hence rcu_seq_start(). */ rcu_seq_start(&rsp->gp_seq); trace_rcu_grace_period(rsp->name, rsp->gp_seq, TPS("start")); From 8fd119b6522fea9ba5e68a3aa653f1490778fb25 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 049/140] rcu: Remove rsp parameter from rcu_check_gp_kthread_starvation() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_check_gp_kthread_starvation(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index bcfdb92d5d10..09f05083f01d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1240,10 +1240,11 @@ static const char *gp_state_getname(short gs) /* * Complain about starvation of grace-period kthread. */ -static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp) +static void rcu_check_gp_kthread_starvation(void) { unsigned long gpa; unsigned long j; + struct rcu_state *rsp = &rcu_state; j = jiffies; gpa = READ_ONCE(rsp->gp_activity); @@ -1377,7 +1378,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) WRITE_ONCE(rsp->jiffies_stall, jiffies + 3 * rcu_jiffies_till_stall_check() + 3); - rcu_check_gp_kthread_starvation(rsp); + rcu_check_gp_kthread_starvation(); panic_on_rcu_stall(); @@ -1415,7 +1416,7 @@ static void print_cpu_stall(struct rcu_state *rsp) jiffies - rsp->gp_start, (long)rcu_seq_current(&rsp->gp_seq), totqlen); - rcu_check_gp_kthread_starvation(rsp); + rcu_check_gp_kthread_starvation(); rcu_dump_cpu_stacks(rsp); From 33dbdbf02538e8f088f83a89de68436da590ce76 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 050/140] rcu: Remove rsp parameter from rcu_dump_cpu_stacks() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_dump_cpu_stacks(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 09f05083f01d..3e252b80e0bf 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1270,13 +1270,13 @@ static void rcu_check_gp_kthread_starvation(void) * that don't support NMI-based stack dumps. The NMI-triggered stack * traces are more accurate because they are printed by the target CPU. */ -static void rcu_dump_cpu_stacks(struct rcu_state *rsp) +static void rcu_dump_cpu_stacks(void) { int cpu; unsigned long flags; struct rcu_node *rnp; - rcu_for_each_leaf_node(rsp, rnp) { + rcu_for_each_leaf_node(&rcu_state, rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); for_each_leaf_node_possible_cpu(rnp, cpu) if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) @@ -1355,7 +1355,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) smp_processor_id(), (long)(jiffies - rsp->gp_start), (long)rcu_seq_current(&rsp->gp_seq), totqlen); if (ndetected) { - rcu_dump_cpu_stacks(rsp); + rcu_dump_cpu_stacks(); /* Complain about tasks blocking the grace period. */ rcu_print_detail_task_stall(rsp); @@ -1418,7 +1418,7 @@ static void print_cpu_stall(struct rcu_state *rsp) rcu_check_gp_kthread_starvation(); - rcu_dump_cpu_stacks(rsp); + rcu_dump_cpu_stacks(); raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Rewrite if needed in case of slow consoles. */ From e1741c69d427596c67639b25f1309836e001c224 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 051/140] rcu: Remove rsp parameter from rcu_stall_kick_kthreads() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_stall_kick_kthreads(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 3e252b80e0bf..20466fe22e82 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1290,9 +1290,10 @@ static void rcu_dump_cpu_stacks(void) * If too much time has passed in the current grace period, and if * so configured, go kick the relevant kthreads. */ -static void rcu_stall_kick_kthreads(struct rcu_state *rsp) +static void rcu_stall_kick_kthreads(void) { unsigned long j; + struct rcu_state *rsp = &rcu_state; if (!rcu_kick_kthreads) return; @@ -1323,7 +1324,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) long totqlen = 0; /* Kick and suppress, if so configured. */ - rcu_stall_kick_kthreads(rsp); + rcu_stall_kick_kthreads(); if (rcu_cpu_stall_suppress) return; @@ -1394,7 +1395,7 @@ static void print_cpu_stall(struct rcu_state *rsp) long totqlen = 0; /* Kick and suppress, if so configured. */ - rcu_stall_kick_kthreads(rsp); + rcu_stall_kick_kthreads(); if (rcu_cpu_stall_suppress) return; @@ -1452,7 +1453,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || !rcu_gp_in_progress()) return; - rcu_stall_kick_kthreads(rsp); + rcu_stall_kick_kthreads(); j = jiffies; /* From a91e7e58b1016cd3ce043ab3dd5cde7a1b098215 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 052/140] rcu: Remove rsp parameter from print_other_cpu_stall() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from print_other_cpu_stall(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 20466fe22e82..13f507789588 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1313,7 +1313,7 @@ static void panic_on_rcu_stall(void) panic("RCU Stall\n"); } -static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) +static void print_other_cpu_stall(unsigned long gp_seq) { int cpu; unsigned long flags; @@ -1321,6 +1321,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) unsigned long j; int ndetected = 0; struct rcu_node *rnp = rcu_get_root(); + struct rcu_state *rsp = &rcu_state; long totqlen = 0; /* Kick and suppress, if so configured. */ @@ -1498,7 +1499,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) cmpxchg(&rsp->jiffies_stall, js, jn) == js) { /* They had a few time units to dump stack, so complain. */ - print_other_cpu_stall(rsp, gs2); + print_other_cpu_stall(gs2); } } From 4e8b8e08f931c9378dec9f304f8a170bcf5e70bb Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 053/140] rcu: Remove rsp parameter from print_cpu_stall() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from print_cpu_stall(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 13f507789588..f139b8202d5d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1387,12 +1387,13 @@ static void print_other_cpu_stall(unsigned long gp_seq) force_quiescent_state(rsp); /* Kick them all. */ } -static void print_cpu_stall(struct rcu_state *rsp) +static void print_cpu_stall(void) { int cpu; unsigned long flags; struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_node *rnp = rcu_get_root(); + struct rcu_state *rsp = &rcu_state; long totqlen = 0; /* Kick and suppress, if so configured. */ @@ -1492,7 +1493,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) cmpxchg(&rsp->jiffies_stall, js, jn) == js) { /* We haven't checked in, so go dump stack. */ - print_cpu_stall(rsp); + print_cpu_stall(); } else if (rcu_gp_in_progress() && ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && From ea12ff2b7d97607bb69b50ccc30d3819b44ffb2b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 054/140] rcu: Remove rsp parameter from check_cpu_stall() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from check_cpu_stall(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index f139b8202d5d..a222afb6d74d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1442,7 +1442,7 @@ static void print_cpu_stall(void) resched_cpu(smp_processor_id()); } -static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) +static void check_cpu_stall(struct rcu_data *rdp) { unsigned long gs1; unsigned long gs2; @@ -1451,6 +1451,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) unsigned long jn; unsigned long js; struct rcu_node *rnp; + struct rcu_state *rsp = &rcu_state; if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || !rcu_gp_in_progress()) @@ -3094,7 +3095,7 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp) struct rcu_node *rnp = rdp->mynode; /* Check for CPU stalls, if enabled. */ - check_cpu_stall(rsp, rdp); + check_cpu_stall(rdp); /* Is this CPU a NO_HZ_FULL CPU that should ignore RCU? */ if (rcu_nohz_full_cpu(rsp)) From 3481f2eab09563456bbc7cb358ad5d151a509064 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 055/140] rcu: Remove rsp parameter from rcu_future_gp_cleanup() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_future_gp_cleanup(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index a222afb6d74d..87fc0727a9b8 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1622,7 +1622,7 @@ unlock_out: * Clean up any old requests for the just-ended grace period. Also return * whether any additional grace periods have been requested. */ -static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) +static bool rcu_future_gp_cleanup(struct rcu_node *rnp) { bool needmore; struct rcu_data *rdp = this_cpu_ptr(&rcu_data); @@ -2055,7 +2055,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) if (rnp == rdp->mynode) needgp = __note_gp_changes(rsp, rnp, rdp) || needgp; /* smp_mb() provided by prior unlock-lock pair. */ - needgp = rcu_future_gp_cleanup(rsp, rnp) || needgp; + needgp = rcu_future_gp_cleanup(rnp) || needgp; sq = rcu_nocb_gp_get(rnp); raw_spin_unlock_irq_rcu_node(rnp); rcu_nocb_gp_cleanup(sq); From 532c00c97f16a2a8576d453ae13ddc38162faed4 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 056/140] rcu: Remove rsp parameter from rcu_gp_kthread_wake() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_gp_kthread_wake(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 22 +++++++++++----------- kernel/rcu/tree_plugin.h | 4 ++-- 2 files changed, 13 insertions(+), 13 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 87fc0727a9b8..06f83fce416b 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1642,13 +1642,13 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp) * raced to awaken, and we lost), and finally don't try to awaken * a kthread that has not yet been created. */ -static void rcu_gp_kthread_wake(struct rcu_state *rsp) +static void rcu_gp_kthread_wake(void) { - if (current == rsp->gp_kthread || - !READ_ONCE(rsp->gp_flags) || - !rsp->gp_kthread) + if (current == rcu_state.gp_kthread || + !READ_ONCE(rcu_state.gp_flags) || + !rcu_state.gp_kthread) return; - swake_up_one(&rsp->gp_wq); + swake_up_one(&rcu_state.gp_wq); } /* @@ -1722,7 +1722,7 @@ static void rcu_accelerate_cbs_unlocked(struct rcu_state *rsp, needwake = rcu_accelerate_cbs(rsp, rnp, rdp); raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ if (needwake) - rcu_gp_kthread_wake(rsp); + rcu_gp_kthread_wake(); } /* @@ -1820,7 +1820,7 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp) needwake = __note_gp_changes(rsp, rnp, rdp); raw_spin_unlock_irqrestore_rcu_node(rnp, flags); if (needwake) - rcu_gp_kthread_wake(rsp); + rcu_gp_kthread_wake(); } static void rcu_gp_slow(struct rcu_state *rsp, int delay) @@ -2203,7 +2203,7 @@ static void rcu_report_qs_rsp(unsigned long flags) WARN_ON_ONCE(!rcu_gp_in_progress()); WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(), flags); - rcu_gp_kthread_wake(rsp); + rcu_gp_kthread_wake(); } /* @@ -2364,7 +2364,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); /* ^^^ Released rnp->lock */ if (needwake) - rcu_gp_kthread_wake(rsp); + rcu_gp_kthread_wake(); } } @@ -2670,7 +2670,7 @@ static void force_quiescent_state(struct rcu_state *rsp) } WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); raw_spin_unlock_irqrestore_rcu_node(rnp_old, flags); - rcu_gp_kthread_wake(rsp); + rcu_gp_kthread_wake(); } /* @@ -3672,7 +3672,7 @@ static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp) !rcu_segcblist_n_cbs(&my_rdp->cblist)); raw_spin_unlock_irqrestore_rcu_node(rnp_root, flags); if (needwake) - rcu_gp_kthread_wake(rsp); + rcu_gp_kthread_wake(); WARN_ONCE(rcu_segcblist_n_cbs(&rdp->cblist) != 0 || !rcu_segcblist_empty(&rdp->cblist), "rcu_cleanup_dead_cpu: Callbacks on offline CPU %d: qlen=%lu, 1stCB=%p\n", diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 545e4ac9422a..50ca000ad9f2 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1700,7 +1700,7 @@ static void rcu_prepare_for_idle(void) needwake = rcu_accelerate_cbs(rsp, rnp, rdp); raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ if (needwake) - rcu_gp_kthread_wake(rsp); + rcu_gp_kthread_wake(); } } @@ -2147,7 +2147,7 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp) needwake = rcu_start_this_gp(rnp, rdp, c); raw_spin_unlock_irqrestore_rcu_node(rnp, flags); if (needwake) - rcu_gp_kthread_wake(rdp->rsp); + rcu_gp_kthread_wake(); } /* From 02f501423d0dde7a4b0dd138e0de6175bcf1926c Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 057/140] rcu: Remove rsp parameter from rcu_accelerate_cbs() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_accelerate_cbs(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 15 +++++++-------- kernel/rcu/tree_plugin.h | 2 +- 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 06f83fce416b..984dbbf47265 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1663,11 +1663,11 @@ static void rcu_gp_kthread_wake(void) * * The caller must hold rnp->lock with interrupts disabled. */ -static bool rcu_accelerate_cbs(struct rcu_state *rsp, struct rcu_node *rnp, - struct rcu_data *rdp) +static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp) { unsigned long gp_seq_req; bool ret = false; + struct rcu_state *rsp = &rcu_state; raw_lockdep_assert_held_rcu_node(rnp); @@ -1719,7 +1719,7 @@ static void rcu_accelerate_cbs_unlocked(struct rcu_state *rsp, return; } raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */ - needwake = rcu_accelerate_cbs(rsp, rnp, rdp); + needwake = rcu_accelerate_cbs(rnp, rdp); raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ if (needwake) rcu_gp_kthread_wake(); @@ -1751,7 +1751,7 @@ static bool rcu_advance_cbs(struct rcu_state *rsp, struct rcu_node *rnp, rcu_segcblist_advance(&rdp->cblist, rnp->gp_seq); /* Classify any remaining callbacks. */ - return rcu_accelerate_cbs(rsp, rnp, rdp); + return rcu_accelerate_cbs(rnp, rdp); } /* @@ -1777,7 +1777,7 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, ret = rcu_advance_cbs(rsp, rnp, rdp); /* Advance callbacks. */ trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuend")); } else { - ret = rcu_accelerate_cbs(rsp, rnp, rdp); /* Recent callbacks. */ + ret = rcu_accelerate_cbs(rnp, rdp); /* Recent callbacks. */ } /* Now handle the beginnings of any new-to-this-CPU grace periods. */ @@ -2078,7 +2078,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) needgp = true; } /* Advance CBs to reduce false positives below. */ - if (!rcu_accelerate_cbs(rsp, rnp, rdp) && needgp) { + if (!rcu_accelerate_cbs(rnp, rdp) && needgp) { WRITE_ONCE(rsp->gp_flags, RCU_GP_FLAG_INIT); rsp->gp_req_activity = jiffies; trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq), @@ -2331,7 +2331,6 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) unsigned long mask; bool needwake; struct rcu_node *rnp; - struct rcu_state *rsp = &rcu_state; rnp = rdp->mynode; raw_spin_lock_irqsave_rcu_node(rnp, flags); @@ -2359,7 +2358,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) * This GP can't end until cpu checks in, so all of our * callbacks can be processed during the next GP. */ - needwake = rcu_accelerate_cbs(rsp, rnp, rdp); + needwake = rcu_accelerate_cbs(rnp, rdp); rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); /* ^^^ Released rnp->lock */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 50ca000ad9f2..0c59c3987c60 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1697,7 +1697,7 @@ static void rcu_prepare_for_idle(void) continue; rnp = rdp->mynode; raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */ - needwake = rcu_accelerate_cbs(rsp, rnp, rdp); + needwake = rcu_accelerate_cbs(rnp, rdp); raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ if (needwake) rcu_gp_kthread_wake(); From c6e09b97b9338de2b829a4005dc437e689bf903e Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 058/140] rcu: Remove rsp parameter from rcu_accelerate_cbs_unlocked() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_accelerate_cbs_unlocked(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 984dbbf47265..e66d9e446b1d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1704,15 +1704,14 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp) * that a new grace-period request be made, invokes rcu_accelerate_cbs() * while holding the leaf rcu_node structure's ->lock. */ -static void rcu_accelerate_cbs_unlocked(struct rcu_state *rsp, - struct rcu_node *rnp, +static void rcu_accelerate_cbs_unlocked(struct rcu_node *rnp, struct rcu_data *rdp) { unsigned long c; bool needwake; lockdep_assert_irqs_disabled(); - c = rcu_seq_snap(&rsp->gp_seq); + c = rcu_seq_snap(&rcu_state.gp_seq); if (!rdp->gpwrap && ULONG_CMP_GE(rdp->gp_seq_needed, c)) { /* Old request still live, so mark recent callbacks. */ (void)rcu_segcblist_accelerate(&rdp->cblist, c); @@ -2759,7 +2758,7 @@ __rcu_process_callbacks(struct rcu_state *rsp) rcu_segcblist_is_enabled(&rdp->cblist)) { local_irq_save(flags); if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) - rcu_accelerate_cbs_unlocked(rsp, rnp, rdp); + rcu_accelerate_cbs_unlocked(rnp, rdp); local_irq_restore(flags); } @@ -2846,7 +2845,7 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp, /* Start a new grace period if one not already started. */ if (!rcu_gp_in_progress()) { - rcu_accelerate_cbs_unlocked(rsp, rdp->mynode, rdp); + rcu_accelerate_cbs_unlocked(rdp->mynode, rdp); } else { /* Give the grace period a kick. */ rdp->blimit = LONG_MAX; From 834f56bf54e866e8db9d09b02fb1f3c0184ec927 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 059/140] rcu: Remove rsp parameter from rcu_advance_cbs() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_advance_cbs(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e66d9e446b1d..6964d04c0823 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1734,8 +1734,7 @@ static void rcu_accelerate_cbs_unlocked(struct rcu_node *rnp, * * The caller must hold rnp->lock with interrupts disabled. */ -static bool rcu_advance_cbs(struct rcu_state *rsp, struct rcu_node *rnp, - struct rcu_data *rdp) +static bool rcu_advance_cbs(struct rcu_node *rnp, struct rcu_data *rdp) { raw_lockdep_assert_held_rcu_node(rnp); @@ -1773,7 +1772,7 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, /* Handle the ends of any preceding grace periods first. */ if (rcu_seq_completed_gp(rdp->gp_seq, rnp->gp_seq) || unlikely(READ_ONCE(rdp->gpwrap))) { - ret = rcu_advance_cbs(rsp, rnp, rdp); /* Advance callbacks. */ + ret = rcu_advance_cbs(rnp, rdp); /* Advance callbacks. */ trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuend")); } else { ret = rcu_accelerate_cbs(rnp, rdp); /* Recent callbacks. */ @@ -3663,8 +3662,8 @@ static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp) } raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */ /* Leverage recent GPs and set GP for new callbacks. */ - needwake = rcu_advance_cbs(rsp, rnp_root, rdp) || - rcu_advance_cbs(rsp, rnp_root, my_rdp); + needwake = rcu_advance_cbs(rnp_root, rdp) || + rcu_advance_cbs(rnp_root, my_rdp); rcu_segcblist_merge(&my_rdp->cblist, &rdp->cblist); WARN_ON_ONCE(rcu_segcblist_empty(&my_rdp->cblist) != !rcu_segcblist_n_cbs(&my_rdp->cblist)); From c7e48f7ba3820145d08015108ea763bd03c888e9 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 060/140] rcu: Remove rsp parameter from __note_gp_changes() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from __note_gp_changes(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 6964d04c0823..3e1ec264a653 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1758,11 +1758,11 @@ static bool rcu_advance_cbs(struct rcu_node *rnp, struct rcu_data *rdp) * structure corresponding to the current CPU, and must have irqs disabled. * Returns true if the grace-period kthread needs to be awakened. */ -static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, - struct rcu_data *rdp) +static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) { bool ret; bool need_gp; + struct rcu_state __maybe_unused *rsp = &rcu_state; raw_lockdep_assert_held_rcu_node(rnp); @@ -1815,7 +1815,7 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp) local_irq_restore(flags); return; } - needwake = __note_gp_changes(rsp, rnp, rdp); + needwake = __note_gp_changes(rnp, rdp); raw_spin_unlock_irqrestore_rcu_node(rnp, flags); if (needwake) rcu_gp_kthread_wake(); @@ -1940,7 +1940,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) rnp->qsmask = rnp->qsmaskinit; WRITE_ONCE(rnp->gp_seq, rsp->gp_seq); if (rnp == rdp->mynode) - (void)__note_gp_changes(rsp, rnp, rdp); + (void)__note_gp_changes(rnp, rdp); rcu_preempt_boost_start_gp(rnp); trace_rcu_grace_period_init(rsp->name, rnp->gp_seq, rnp->level, rnp->grplo, @@ -2051,7 +2051,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) WRITE_ONCE(rnp->gp_seq, new_gp_seq); rdp = this_cpu_ptr(&rcu_data); if (rnp == rdp->mynode) - needgp = __note_gp_changes(rsp, rnp, rdp) || needgp; + needgp = __note_gp_changes(rnp, rdp) || needgp; /* smp_mb() provided by prior unlock-lock pair. */ needgp = rcu_future_gp_cleanup(rnp) || needgp; sq = rcu_nocb_gp_get(rnp); From 15cabdffbbf629f2588612f092bdb37dfa16cc79 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 061/140] rcu: Remove rsp parameter from note_gp_changes() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from note_gp_changes(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 6 +++--- kernel/rcu/tree_plugin.h | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 3e1ec264a653..9189f7c70df5 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1801,7 +1801,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) return ret; } -static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp) +static void note_gp_changes(struct rcu_data *rdp) { unsigned long flags; bool needwake; @@ -2375,7 +2375,7 @@ static void rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp) { /* Check for grace-period ends and beginnings. */ - note_gp_changes(rsp, rdp); + note_gp_changes(rdp); /* * Does this CPU still need to do its part for current grace period? @@ -2840,7 +2840,7 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp, rdp->qlen_last_fqs_check + qhimark)) { /* Are we ignoring a completed grace period? */ - note_gp_changes(rsp, rdp); + note_gp_changes(rdp); /* Start a new grace period if one not already started. */ if (!rcu_gp_in_progress()) { diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 0c59c3987c60..82f10a6bf266 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1586,7 +1586,7 @@ static bool __maybe_unused rcu_try_advance_all_cbs(void) rcu_seq_current(&rnp->gp_seq)) || unlikely(READ_ONCE(rdp->gpwrap))) && rcu_segcblist_pend_cbs(&rdp->cblist)) - note_gp_changes(rsp, rdp); + note_gp_changes(rdp); if (rcu_segcblist_ready_cbs(&rdp->cblist)) cbs_ready = true; From 22212332c1f37da35e0d841b1e06421a4956e1ea Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 062/140] rcu: Remove rsp parameter from rcu_gp_slow() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_gp_slow(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 9189f7c70df5..29121629c004 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1821,10 +1821,10 @@ static void note_gp_changes(struct rcu_data *rdp) rcu_gp_kthread_wake(); } -static void rcu_gp_slow(struct rcu_state *rsp, int delay) +static void rcu_gp_slow(int delay) { if (delay > 0 && - !(rcu_seq_ctr(rsp->gp_seq) % + !(rcu_seq_ctr(rcu_state.gp_seq) % (rcu_num_nodes * PER_RCU_NODE_PERIOD * delay))) schedule_timeout_uninterruptible(delay); } @@ -1917,7 +1917,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) raw_spin_unlock_irq_rcu_node(rnp); spin_unlock(&rsp->ofl_lock); } - rcu_gp_slow(rsp, gp_preinit_delay); /* Races with CPU hotplug. */ + rcu_gp_slow(gp_preinit_delay); /* Races with CPU hotplug. */ /* * Set the quiescent-state-needed bits in all the rcu_node @@ -1933,7 +1933,7 @@ static bool rcu_gp_init(struct rcu_state *rsp) */ rsp->gp_state = RCU_GP_INIT; rcu_for_each_node_breadth_first(rsp, rnp) { - rcu_gp_slow(rsp, gp_init_delay); + rcu_gp_slow(gp_init_delay); raw_spin_lock_irqsave_rcu_node(rnp, flags); rdp = this_cpu_ptr(&rcu_data); rcu_preempt_check_blocked_tasks(rsp, rnp); @@ -2059,7 +2059,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) rcu_nocb_gp_cleanup(sq); cond_resched_tasks_rcu_qs(); WRITE_ONCE(rsp->gp_activity, jiffies); - rcu_gp_slow(rsp, gp_cleanup_delay); + rcu_gp_slow(gp_cleanup_delay); } rnp = rcu_get_root(); raw_spin_lock_irq_rcu_node(rnp); /* GP before rsp->gp_seq update. */ From 0854a05c9fa554930174f0fa7453c18f99108a4a Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 063/140] rcu: Remove rsp parameter from rcu_gp_kthread() and friends There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_gp_init(), rcu_gp_fqs_check_wake(), rcu_gp_fqs(), rcu_gp_cleanup(), and rcu_gp_kthread(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 29121629c004..af4aeaaee046 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1832,13 +1832,14 @@ static void rcu_gp_slow(int delay) /* * Initialize a new grace period. Return false if no grace period required. */ -static bool rcu_gp_init(struct rcu_state *rsp) +static bool rcu_gp_init(void) { unsigned long flags; unsigned long oldmask; unsigned long mask; struct rcu_data *rdp; struct rcu_node *rnp = rcu_get_root(); + struct rcu_state *rsp = &rcu_state; WRITE_ONCE(rsp->gp_activity, jiffies); raw_spin_lock_irq_rcu_node(rnp); @@ -1963,12 +1964,12 @@ static bool rcu_gp_init(struct rcu_state *rsp) * Helper function for swait_event_idle_exclusive() wakeup at force-quiescent-state * time. */ -static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp) +static bool rcu_gp_fqs_check_wake(int *gfp) { struct rcu_node *rnp = rcu_get_root(); /* Someone like call_rcu() requested a force-quiescent-state scan. */ - *gfp = READ_ONCE(rsp->gp_flags); + *gfp = READ_ONCE(rcu_state.gp_flags); if (*gfp & RCU_GP_FLAG_FQS) return true; @@ -1982,9 +1983,10 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp) /* * Do one round of quiescent-state forcing. */ -static void rcu_gp_fqs(struct rcu_state *rsp, bool first_time) +static void rcu_gp_fqs(bool first_time) { struct rcu_node *rnp = rcu_get_root(); + struct rcu_state *rsp = &rcu_state; WRITE_ONCE(rsp->gp_activity, jiffies); rsp->n_force_qs++; @@ -2007,13 +2009,14 @@ static void rcu_gp_fqs(struct rcu_state *rsp, bool first_time) /* * Clean up after the old grace period. */ -static void rcu_gp_cleanup(struct rcu_state *rsp) +static void rcu_gp_cleanup(void) { unsigned long gp_duration; bool needgp = false; unsigned long new_gp_seq; struct rcu_data *rdp; struct rcu_node *rnp = rcu_get_root(); + struct rcu_state *rsp = &rcu_state; struct swait_queue_head *sq; WRITE_ONCE(rsp->gp_activity, jiffies); @@ -2090,13 +2093,13 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) /* * Body of kthread that handles grace periods. */ -static int __noreturn rcu_gp_kthread(void *arg) +static int __noreturn rcu_gp_kthread(void *unused) { bool first_gp_fqs; int gf; unsigned long j; int ret; - struct rcu_state *rsp = arg; + struct rcu_state *rsp = &rcu_state; struct rcu_node *rnp = rcu_get_root(); rcu_bind_gp_kthread(); @@ -2112,7 +2115,7 @@ static int __noreturn rcu_gp_kthread(void *arg) RCU_GP_FLAG_INIT); rsp->gp_state = RCU_GP_DONE_GPS; /* Locking provides needed memory barrier. */ - if (rcu_gp_init(rsp)) + if (rcu_gp_init()) break; cond_resched_tasks_rcu_qs(); WRITE_ONCE(rsp->gp_activity, jiffies); @@ -2137,7 +2140,7 @@ static int __noreturn rcu_gp_kthread(void *arg) TPS("fqswait")); rsp->gp_state = RCU_GP_WAIT_FQS; ret = swait_event_idle_timeout_exclusive(rsp->gp_wq, - rcu_gp_fqs_check_wake(rsp, &gf), j); + rcu_gp_fqs_check_wake(&gf), j); rsp->gp_state = RCU_GP_DOING_FQS; /* Locking provides needed memory barriers. */ /* If grace period done, leave loop. */ @@ -2150,7 +2153,7 @@ static int __noreturn rcu_gp_kthread(void *arg) trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq), TPS("fqsstart")); - rcu_gp_fqs(rsp, first_gp_fqs); + rcu_gp_fqs(first_gp_fqs); first_gp_fqs = false; trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq), @@ -2178,7 +2181,7 @@ static int __noreturn rcu_gp_kthread(void *arg) /* Handle grace-period end. */ rsp->gp_state = RCU_GP_CLEANUP; - rcu_gp_cleanup(rsp); + rcu_gp_cleanup(); rsp->gp_state = RCU_GP_CLEANED; } } @@ -3744,7 +3747,7 @@ static int __init rcu_spawn_gp_kthread(void) rcu_scheduler_fully_active = 1; for_each_rcu_flavor(rsp) { - t = kthread_create(rcu_gp_kthread, rsp, "%s", rsp->name); + t = kthread_create(rcu_gp_kthread, NULL, "%s", rsp->name); BUG_ON(IS_ERR(t)); rnp = rcu_get_root(); raw_spin_lock_irqsave_rcu_node(rnp, flags); From 8087d3e3c453a7caad389dbd78a32bf19a536928 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 064/140] rcu: Remove rsp parameter from rcu_check_quiescent_state() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_check_quiescent_state(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index af4aeaaee046..51d076495548 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2375,7 +2375,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) * quiescent state for this grace period, and record that fact if so. */ static void -rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp) +rcu_check_quiescent_state(struct rcu_data *rdp) { /* Check for grace-period ends and beginnings. */ note_gp_changes(rdp); @@ -2753,7 +2753,7 @@ __rcu_process_callbacks(struct rcu_state *rsp) resched_cpu(rdp->cpu); /* Provoke future context switch. */ /* Update RCU state based on any recent quiescent states. */ - rcu_check_quiescent_state(rsp, rdp); + rcu_check_quiescent_state(rdp); /* No grace period and unregistered callbacks? */ if (!rcu_gp_in_progress() && From 780cd590836fe24bc2a81b8cd7c2f9cbe495421e Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 065/140] rcu: Remove rsp parameter from CPU hotplug functions There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_cleanup_dying_cpu() and rcu_cleanup_dead_cpu(). And, as long as we are in the neighborhood, inlines them into rcutree_dying_cpu() and rcutree_dead_cpu(), respectively. This also eliminates a pair of for_each_rcu_flavor() loops. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 43 +++++++++++-------------------------------- 1 file changed, 11 insertions(+), 32 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 51d076495548..f06a4bf58b25 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2402,20 +2402,22 @@ rcu_check_quiescent_state(struct rcu_data *rdp) } /* - * Trace the fact that this CPU is going offline. + * Near the end of the offline process. Trace the fact that this CPU + * is going offline. */ -static void rcu_cleanup_dying_cpu(struct rcu_state *rsp) +int rcutree_dying_cpu(unsigned int cpu) { RCU_TRACE(bool blkd;) RCU_TRACE(struct rcu_data *rdp = this_cpu_ptr(&rcu_data);) RCU_TRACE(struct rcu_node *rnp = rdp->mynode;) if (!IS_ENABLED(CONFIG_HOTPLUG_CPU)) - return; + return 0; RCU_TRACE(blkd = !!(rnp->qsmask & rdp->grpmask);) - trace_rcu_grace_period(rsp->name, rnp->gp_seq, + trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, blkd ? TPS("cpuofl") : TPS("cpuofl-bgp")); + return 0; } /* @@ -2469,16 +2471,19 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf) * There can only be one CPU hotplug operation at a time, so no need for * explicit locking. */ -static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp) +int rcutree_dead_cpu(unsigned int cpu) { struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ if (!IS_ENABLED(CONFIG_HOTPLUG_CPU)) - return; + return 0; /* Adjust any no-longer-needed kthreads. */ rcu_boost_kthread_setaffinity(rnp, -1); + /* Do any needed no-CB deferred wakeups from this CPU. */ + do_nocb_deferred_wakeup(per_cpu_ptr(&rcu_data, cpu)); + return 0; } /* @@ -3514,32 +3519,6 @@ int rcutree_offline_cpu(unsigned int cpu) return 0; } -/* - * Near the end of the offline process. We do only tracing here. - */ -int rcutree_dying_cpu(unsigned int cpu) -{ - struct rcu_state *rsp; - - for_each_rcu_flavor(rsp) - rcu_cleanup_dying_cpu(rsp); - return 0; -} - -/* - * The outgoing CPU is gone and we are running elsewhere. - */ -int rcutree_dead_cpu(unsigned int cpu) -{ - struct rcu_state *rsp; - - for_each_rcu_flavor(rsp) { - rcu_cleanup_dead_cpu(cpu, rsp); - do_nocb_deferred_wakeup(per_cpu_ptr(&rcu_data, cpu)); - } - return 0; -} - static DEFINE_PER_CPU(int, rcu_cpu_started); /* From 5bb5d09cc4f868497dfec2f8101f580f2c571816 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 066/140] rcu: Remove rsp parameter from rcu_do_batch() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_do_batch(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 5 +++-- kernel/rcu/tree_plugin.h | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index f06a4bf58b25..174261a3c193 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2490,12 +2490,13 @@ int rcutree_dead_cpu(unsigned int cpu) * Invoke any RCU callbacks that have made it to the end of their grace * period. Thottle as specified by rdp->blimit. */ -static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp) +static void rcu_do_batch(struct rcu_data *rdp) { unsigned long flags; struct rcu_head *rhp; struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl); long bl, count; + struct rcu_state *rsp = &rcu_state; /* If no callbacks are ready, just return. */ if (!rcu_segcblist_ready_cbs(&rdp->cblist)) { @@ -2808,7 +2809,7 @@ static void invoke_rcu_callbacks(struct rcu_data *rdp) if (unlikely(!READ_ONCE(rcu_scheduler_fully_active))) return; if (likely(!rsp->boost)) { - rcu_do_batch(rsp, rdp); + rcu_do_batch(rdp); return; } invoke_rcu_callbacks_kthread(); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 82f10a6bf266..c678c76a754e 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1324,7 +1324,7 @@ static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp, static void rcu_kthread_do_work(void) { - rcu_do_batch(&rcu_state, this_cpu_ptr(&rcu_data)); + rcu_do_batch(this_cpu_ptr(&rcu_data)); } static void rcu_cpu_kthread_setup(unsigned int cpu) From e9ecb780fe7d881ebd290663d5cfb9dd7b5e58f4 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 067/140] rcu: Remove rsp parameter from force-quiescent-state functions There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from force_qs_rnp() and force_quiescent_state(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 174261a3c193..2644ed685024 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -479,8 +479,8 @@ module_param(rcu_kick_kthreads, bool, 0644); static ulong jiffies_till_sched_qs = HZ / 10; module_param(jiffies_till_sched_qs, ulong, 0444); -static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)); -static void force_quiescent_state(struct rcu_state *rsp); +static void force_qs_rnp(int (*f)(struct rcu_data *rsp)); +static void force_quiescent_state(void); static int rcu_pending(void); /* @@ -538,7 +538,7 @@ EXPORT_SYMBOL_GPL(rcu_exp_batches_completed_sched); */ void rcu_force_quiescent_state(void) { - force_quiescent_state(&rcu_state); + force_quiescent_state(); } EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); @@ -547,7 +547,7 @@ EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); */ void rcu_bh_force_quiescent_state(void) { - force_quiescent_state(&rcu_state); + force_quiescent_state(); } EXPORT_SYMBOL_GPL(rcu_bh_force_quiescent_state); @@ -1384,7 +1384,7 @@ static void print_other_cpu_stall(unsigned long gp_seq) panic_on_rcu_stall(); - force_quiescent_state(rsp); /* Kick them all. */ + force_quiescent_state(); /* Kick them all. */ } static void print_cpu_stall(void) @@ -1992,10 +1992,10 @@ static void rcu_gp_fqs(bool first_time) rsp->n_force_qs++; if (first_time) { /* Collect dyntick-idle snapshots. */ - force_qs_rnp(rsp, dyntick_save_progress_counter); + force_qs_rnp(dyntick_save_progress_counter); } else { /* Handle dyntick-idle and offline CPUs. */ - force_qs_rnp(rsp, rcu_implicit_dynticks_qs); + force_qs_rnp(rcu_implicit_dynticks_qs); } /* Clear flag to prevent immediate re-entry. */ if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) { @@ -2600,12 +2600,13 @@ void rcu_check_callbacks(int user) * * The caller must have suppressed start of new grace periods. */ -static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)) +static void force_qs_rnp(int (*f)(struct rcu_data *rsp)) { int cpu; unsigned long flags; unsigned long mask; struct rcu_node *rnp; + struct rcu_state *rsp = &rcu_state; rcu_for_each_leaf_node(rsp, rnp) { cond_resched_tasks_rcu_qs(); @@ -2647,12 +2648,13 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)) * Force quiescent states on reluctant CPUs, and also detect which * CPUs are in dyntick-idle mode. */ -static void force_quiescent_state(struct rcu_state *rsp) +static void force_quiescent_state(void) { unsigned long flags; bool ret; struct rcu_node *rnp; struct rcu_node *rnp_old = NULL; + struct rcu_state *rsp = &rcu_state; /* Funnel through hierarchy to reduce memory contention. */ rnp = __this_cpu_read(rcu_data.mynode); @@ -2859,7 +2861,7 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp, rdp->blimit = LONG_MAX; if (rsp->n_force_qs == rdp->n_force_qs_snap && rcu_segcblist_first_pend_cb(&rdp->cblist) != head) - force_quiescent_state(rsp); + force_quiescent_state(); rdp->n_force_qs_snap = rsp->n_force_qs; rdp->qlen_last_fqs_check = rcu_segcblist_n_cbs(&rdp->cblist); } From b96f9dc4fb642b2fa604bc0b64464356ef2b54f5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 068/140] rcu: Remove rsp parameter from rcu_check_gp_start_stall() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_check_gp_start_stall(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 2644ed685024..f0a9f809de4c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2686,13 +2686,13 @@ static void force_quiescent_state(void) * RCU to come out of its idle mode. */ static void -rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp, - struct rcu_data *rdp) +rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp) { const unsigned long gpssdelay = rcu_jiffies_till_stall_check() * HZ; unsigned long flags; unsigned long j; struct rcu_node *rnp_root = rcu_get_root(); + struct rcu_state *rsp = &rcu_state; static atomic_t warned = ATOMIC_INIT(0); if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() || @@ -2772,7 +2772,7 @@ __rcu_process_callbacks(struct rcu_state *rsp) local_irq_restore(flags); } - rcu_check_gp_start_stall(rsp, rnp, rdp); + rcu_check_gp_start_stall(rnp, rdp); /* If there are callbacks ready, invoke them. */ if (rcu_segcblist_ready_cbs(&rdp->cblist)) From b049fdf8e3b986c2695642fa2d2ceeec55245fb1 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 069/140] rcu: Remove rsp parameter from __rcu_process_callbacks() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from __rcu_process_callbacks(), and also inlines it into rcu_process_callbacks(), removing the for_each_rcu_flavor() while in the neighborhood. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 26 +++++++------------------- 1 file changed, 7 insertions(+), 19 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index f0a9f809de4c..6c860045eaf4 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2741,17 +2741,19 @@ rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp) } /* - * This does the RCU core processing work for the specified rcu_state - * and rcu_data structures. This may be called only from the CPU to - * whom the rdp belongs. + * This does the RCU core processing work for the specified rcu_data + * structures. This may be called only from the CPU to whom the rdp + * belongs. */ -static void -__rcu_process_callbacks(struct rcu_state *rsp) +static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) { unsigned long flags; struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); struct rcu_node *rnp = rdp->mynode; + if (cpu_is_offline(smp_processor_id())) + return; + trace_rcu_utilization(TPS("Start RCU core")); WARN_ON_ONCE(!rdp->beenonline); /* Report any deferred quiescent states if preemption enabled. */ @@ -2780,20 +2782,6 @@ __rcu_process_callbacks(struct rcu_state *rsp) /* Do any needed deferred wakeups of rcuo kthreads. */ do_nocb_deferred_wakeup(rdp); -} - -/* - * Do RCU core processing for the current CPU. - */ -static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) -{ - struct rcu_state *rsp; - - if (cpu_is_offline(smp_processor_id())) - return; - trace_rcu_utilization(TPS("Start RCU core")); - for_each_rcu_flavor(rsp) - __rcu_process_callbacks(rsp); trace_rcu_utilization(TPS("End RCU core")); } From 5c7d89676bc51966ea7882703d15795587e7108c Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 070/140] rcu: Remove rsp parameter from __call_rcu() and friend There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from __call_rcu_core() and __call_rcu(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 6c860045eaf4..9f5e67e303c0 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2814,8 +2814,8 @@ static void invoke_rcu_core(void) /* * Handle any core-RCU processing required by a call_rcu() invocation. */ -static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp, - struct rcu_head *head, unsigned long flags) +static void __call_rcu_core(struct rcu_data *rdp, struct rcu_head *head, + unsigned long flags) { /* * If called from an extended quiescent state, invoke the RCU @@ -2847,10 +2847,10 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp, } else { /* Give the grace period a kick. */ rdp->blimit = LONG_MAX; - if (rsp->n_force_qs == rdp->n_force_qs_snap && + if (rcu_state.n_force_qs == rdp->n_force_qs_snap && rcu_segcblist_first_pend_cb(&rdp->cblist) != head) force_quiescent_state(); - rdp->n_force_qs_snap = rsp->n_force_qs; + rdp->n_force_qs_snap = rcu_state.n_force_qs; rdp->qlen_last_fqs_check = rcu_segcblist_n_cbs(&rdp->cblist); } } @@ -2870,11 +2870,11 @@ static void rcu_leak_callback(struct rcu_head *rhp) * is expected to specify a CPU. */ static void -__call_rcu(struct rcu_head *head, rcu_callback_t func, - struct rcu_state *rsp, int cpu, bool lazy) +__call_rcu(struct rcu_head *head, rcu_callback_t func, int cpu, bool lazy) { unsigned long flags; struct rcu_data *rdp; + struct rcu_state __maybe_unused *rsp = &rcu_state; /* Misaligned rcu_head! */ WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1)); @@ -2932,7 +2932,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func, rcu_segcblist_n_cbs(&rdp->cblist)); /* Go handle any RCU core processing required. */ - __call_rcu_core(rsp, rdp, head, flags); + __call_rcu_core(rdp, head, flags); local_irq_restore(flags); } @@ -2973,7 +2973,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func, */ void call_rcu(struct rcu_head *head, rcu_callback_t func) { - __call_rcu(head, func, &rcu_state, -1, 0); + __call_rcu(head, func, -1, 0); } EXPORT_SYMBOL_GPL(call_rcu); @@ -3000,7 +3000,7 @@ EXPORT_SYMBOL_GPL(call_rcu_sched); void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func) { - __call_rcu(head, func, &rcu_state, -1, 1); + __call_rcu(head, func, -1, 1); } EXPORT_SYMBOL_GPL(kfree_call_rcu); @@ -3272,7 +3272,7 @@ static void _rcu_barrier(struct rcu_state *rsp) smp_mb__before_atomic(); atomic_inc(&rsp->barrier_cpu_count); __call_rcu(&rdp->barrier_head, - rcu_barrier_callback, rsp, cpu, 0); + rcu_barrier_callback, cpu, 0); } } else if (rcu_segcblist_n_cbs(&rdp->cblist)) { _rcu_barrier_trace(rsp, TPS("OnlineQ"), cpu, From 98ece508b545bdaa5575ab46c68f17981516f689 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 071/140] rcu: Remove rsp parameter from __rcu_pending() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from __rcu_pending(), and also inlines it into rcu_pending(), removing the for_each_rcu_flavor() while in the neighborhood.. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 34 ++++++++++------------------------ 1 file changed, 10 insertions(+), 24 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 9f5e67e303c0..7ce691348b51 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2997,8 +2997,7 @@ EXPORT_SYMBOL_GPL(call_rcu_sched); * callbacks in the list of pending callbacks. Until then, this * function may only be called from __kfree_rcu(). */ -void kfree_call_rcu(struct rcu_head *head, - rcu_callback_t func) +void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func) { __call_rcu(head, func, -1, 1); } @@ -3080,21 +3079,23 @@ void cond_synchronize_sched(unsigned long oldstate) EXPORT_SYMBOL_GPL(cond_synchronize_sched); /* - * Check to see if there is any immediate RCU-related work to be done - * by the current CPU, for the specified type of RCU, returning 1 if so. - * The checks are in order of increasing expense: checks that can be - * carried out against CPU-local state are performed first. However, - * we must check for CPU stalls first, else we might not get a chance. + * Check to see if there is any immediate RCU-related work to be done by + * the current CPU, for the specified type of RCU, returning 1 if so and + * zero otherwise. The checks are in order of increasing expense: checks + * that can be carried out against CPU-local state are performed first. + * However, we must check for CPU stalls first, else we might not get + * a chance. */ -static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp) +static int rcu_pending(void) { + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_node *rnp = rdp->mynode; /* Check for CPU stalls, if enabled. */ check_cpu_stall(rdp); /* Is this CPU a NO_HZ_FULL CPU that should ignore RCU? */ - if (rcu_nohz_full_cpu(rsp)) + if (rcu_nohz_full_cpu(&rcu_state)) return 0; /* Is the RCU core waiting for a quiescent state from this CPU? */ @@ -3124,21 +3125,6 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp) return 0; } -/* - * Check to see if there is any immediate RCU-related work to be done - * by the current CPU, returning 1 if so. This function is part of the - * RCU implementation; it is -not- an exported member of the RCU API. - */ -static int rcu_pending(void) -{ - struct rcu_state *rsp; - - for_each_rcu_flavor(rsp) - if (__rcu_pending(rsp, this_cpu_ptr(&rcu_data))) - return 1; - return 0; -} - /* * Return true if the specified CPU has any callback. If all_lazy is * non-NULL, store an indication of whether all callbacks are lazy. From 8344b871b1d575ba630ca57448ea4cbc84daba0f Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 072/140] rcu: Remove rsp parameter from _rcu_barrier() and friends There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from _rcu_barrier_trace() and _rcu_barrier(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 41 +++++++++++++++++++---------------------- 1 file changed, 19 insertions(+), 22 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 7ce691348b51..d3428d4a68dc 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3156,11 +3156,10 @@ static bool rcu_cpu_has_callbacks(bool *all_lazy) * Helper function for _rcu_barrier() tracing. If tracing is disabled, * the compiler is expected to optimize this away. */ -static void _rcu_barrier_trace(struct rcu_state *rsp, const char *s, - int cpu, unsigned long done) +static void _rcu_barrier_trace(const char *s, int cpu, unsigned long done) { - trace_rcu_barrier(rsp->name, s, cpu, - atomic_read(&rsp->barrier_cpu_count), done); + trace_rcu_barrier(rcu_state.name, s, cpu, + atomic_read(&rcu_state.barrier_cpu_count), done); } /* @@ -3173,11 +3172,10 @@ static void rcu_barrier_callback(struct rcu_head *rhp) struct rcu_state *rsp = rdp->rsp; if (atomic_dec_and_test(&rsp->barrier_cpu_count)) { - _rcu_barrier_trace(rsp, TPS("LastCB"), -1, - rsp->barrier_sequence); + _rcu_barrier_trace(TPS("LastCB"), -1, rsp->barrier_sequence); complete(&rsp->barrier_completion); } else { - _rcu_barrier_trace(rsp, TPS("CB"), -1, rsp->barrier_sequence); + _rcu_barrier_trace(TPS("CB"), -1, rsp->barrier_sequence); } } @@ -3189,15 +3187,14 @@ static void rcu_barrier_func(void *type) struct rcu_state *rsp = type; struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); - _rcu_barrier_trace(rsp, TPS("IRQ"), -1, rsp->barrier_sequence); + _rcu_barrier_trace(TPS("IRQ"), -1, rsp->barrier_sequence); rdp->barrier_head.func = rcu_barrier_callback; debug_rcu_head_queue(&rdp->barrier_head); if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head, 0)) { atomic_inc(&rsp->barrier_cpu_count); } else { debug_rcu_head_unqueue(&rdp->barrier_head); - _rcu_barrier_trace(rsp, TPS("IRQNQ"), -1, - rsp->barrier_sequence); + _rcu_barrier_trace(TPS("IRQNQ"), -1, rsp->barrier_sequence); } } @@ -3205,21 +3202,21 @@ static void rcu_barrier_func(void *type) * Orchestrate the specified type of RCU barrier, waiting for all * RCU callbacks of the specified type to complete. */ -static void _rcu_barrier(struct rcu_state *rsp) +static void _rcu_barrier(void) { int cpu; struct rcu_data *rdp; + struct rcu_state *rsp = &rcu_state; unsigned long s = rcu_seq_snap(&rsp->barrier_sequence); - _rcu_barrier_trace(rsp, TPS("Begin"), -1, s); + _rcu_barrier_trace(TPS("Begin"), -1, s); /* Take mutex to serialize concurrent rcu_barrier() requests. */ mutex_lock(&rsp->barrier_mutex); /* Did someone else do our work for us? */ if (rcu_seq_done(&rsp->barrier_sequence, s)) { - _rcu_barrier_trace(rsp, TPS("EarlyExit"), -1, - rsp->barrier_sequence); + _rcu_barrier_trace(TPS("EarlyExit"), -1, rsp->barrier_sequence); smp_mb(); /* caller's subsequent code after above check. */ mutex_unlock(&rsp->barrier_mutex); return; @@ -3227,7 +3224,7 @@ static void _rcu_barrier(struct rcu_state *rsp) /* Mark the start of the barrier operation. */ rcu_seq_start(&rsp->barrier_sequence); - _rcu_barrier_trace(rsp, TPS("Inc1"), -1, rsp->barrier_sequence); + _rcu_barrier_trace(TPS("Inc1"), -1, rsp->barrier_sequence); /* * Initialize the count to one rather than to zero in order to @@ -3250,10 +3247,10 @@ static void _rcu_barrier(struct rcu_state *rsp) rdp = per_cpu_ptr(&rcu_data, cpu); if (rcu_is_nocb_cpu(cpu)) { if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) { - _rcu_barrier_trace(rsp, TPS("OfflineNoCB"), cpu, + _rcu_barrier_trace(TPS("OfflineNoCB"), cpu, rsp->barrier_sequence); } else { - _rcu_barrier_trace(rsp, TPS("OnlineNoCB"), cpu, + _rcu_barrier_trace(TPS("OnlineNoCB"), cpu, rsp->barrier_sequence); smp_mb__before_atomic(); atomic_inc(&rsp->barrier_cpu_count); @@ -3261,11 +3258,11 @@ static void _rcu_barrier(struct rcu_state *rsp) rcu_barrier_callback, cpu, 0); } } else if (rcu_segcblist_n_cbs(&rdp->cblist)) { - _rcu_barrier_trace(rsp, TPS("OnlineQ"), cpu, + _rcu_barrier_trace(TPS("OnlineQ"), cpu, rsp->barrier_sequence); smp_call_function_single(cpu, rcu_barrier_func, rsp, 1); } else { - _rcu_barrier_trace(rsp, TPS("OnlineNQ"), cpu, + _rcu_barrier_trace(TPS("OnlineNQ"), cpu, rsp->barrier_sequence); } } @@ -3282,7 +3279,7 @@ static void _rcu_barrier(struct rcu_state *rsp) wait_for_completion(&rsp->barrier_completion); /* Mark the end of the barrier operation. */ - _rcu_barrier_trace(rsp, TPS("Inc2"), -1, rsp->barrier_sequence); + _rcu_barrier_trace(TPS("Inc2"), -1, rsp->barrier_sequence); rcu_seq_end(&rsp->barrier_sequence); /* Other rcu_barrier() invocations can now safely proceed. */ @@ -3294,7 +3291,7 @@ static void _rcu_barrier(struct rcu_state *rsp) */ void rcu_barrier_bh(void) { - _rcu_barrier(&rcu_state); + _rcu_barrier(); } EXPORT_SYMBOL_GPL(rcu_barrier_bh); @@ -3308,7 +3305,7 @@ EXPORT_SYMBOL_GPL(rcu_barrier_bh); */ void rcu_barrier(void) { - _rcu_barrier(&rcu_state); + _rcu_barrier(); } EXPORT_SYMBOL_GPL(rcu_barrier); From 53b46303da84d611cd281f74a6538d47709b06b5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 073/140] rcu: Remove rsp parameter from rcu_boot_init_percpu_data() and friends There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_boot_init_percpu_data(), rcu_init_percpu_data(), rcu_cleanup_dying_idle_cpu(), and rcu_migrate_callbacks(). While in the neighborhood, line the last three into rcutree_prepare_cpu(), rcu_report_dead() and rcutree_migrate_callbacks(), respectively. This also gets rid of the for_each_rcu_flavor() calls that were in those tree functions. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 108 ++++++++++++++++------------------------------ 1 file changed, 38 insertions(+), 70 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index d3428d4a68dc..2a49a04a1d98 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3352,7 +3352,7 @@ static void rcu_init_new_rnp(struct rcu_node *rnp_leaf) * Do boot-time initialization of a CPU's per-CPU RCU data. */ static void __init -rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp) +rcu_boot_init_percpu_data(int cpu) { struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); @@ -3361,23 +3361,25 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp) rdp->dynticks = &per_cpu(rcu_dynticks, cpu); WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != 1); WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp->dynticks))); - rdp->rcu_ofl_gp_seq = rsp->gp_seq; + rdp->rcu_ofl_gp_seq = rcu_state.gp_seq; rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED; - rdp->rcu_onl_gp_seq = rsp->gp_seq; + rdp->rcu_onl_gp_seq = rcu_state.gp_seq; rdp->rcu_onl_gp_flags = RCU_GP_CLEANED; rdp->cpu = cpu; - rdp->rsp = rsp; + rdp->rsp = &rcu_state; rcu_boot_init_nocb_percpu_data(rdp); } /* - * Initialize a CPU's per-CPU RCU data. Note that only one online or + * Invoked early in the CPU-online process, when pretty much all services + * are available. The incoming CPU is not present. + * + * Initializes a CPU's per-CPU RCU data. Note that only one online or * offline event can be happening at a given time. Note also that we can * accept some slop in the rsp->gp_seq access due to the fact that this * CPU cannot possibly have any RCU callbacks in flight yet. */ -static void -rcu_init_percpu_data(int cpu, struct rcu_state *rsp) +int rcutree_prepare_cpu(unsigned int cpu) { unsigned long flags; struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); @@ -3386,7 +3388,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp) /* Set up local state, ensuring consistent view of global state. */ raw_spin_lock_irqsave_rcu_node(rnp, flags); rdp->qlen_last_fqs_check = 0; - rdp->n_force_qs_snap = rsp->n_force_qs; + rdp->n_force_qs_snap = rcu_state.n_force_qs; rdp->blimit = blimit; if (rcu_segcblist_empty(&rdp->cblist) && /* No early-boot CBs? */ !init_nocb_callback_list(rdp)) @@ -3410,21 +3412,8 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp) rdp->core_needs_qs = false; rdp->rcu_iw_pending = false; rdp->rcu_iw_gp_seq = rnp->gp_seq - 1; - trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuonl")); + trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuonl")); raw_spin_unlock_irqrestore_rcu_node(rnp, flags); -} - -/* - * Invoked early in the CPU-online process, when pretty much all - * services are available. The incoming CPU is not present. - */ -int rcutree_prepare_cpu(unsigned int cpu) -{ - struct rcu_state *rsp; - - for_each_rcu_flavor(rsp) - rcu_init_percpu_data(cpu, rsp); - rcu_prepare_kthreads(cpu); rcu_spawn_all_nocb_kthreads(cpu); @@ -3547,37 +3536,9 @@ void rcu_cpu_starting(unsigned int cpu) } #ifdef CONFIG_HOTPLUG_CPU -/* - * The CPU is exiting the idle loop into the arch_cpu_idle_dead() - * function. We now remove it from the rcu_node tree's ->qsmaskinitnext - * bit masks. - */ -static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp) -{ - unsigned long flags; - unsigned long mask; - struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); - struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ - - /* Remove outgoing CPU from mask in the leaf rcu_node structure. */ - mask = rdp->grpmask; - spin_lock(&rsp->ofl_lock); - raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */ - rdp->rcu_ofl_gp_seq = READ_ONCE(rsp->gp_seq); - rdp->rcu_ofl_gp_flags = READ_ONCE(rsp->gp_flags); - if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */ - /* Report quiescent state -before- changing ->qsmaskinitnext! */ - rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); - raw_spin_lock_irqsave_rcu_node(rnp, flags); - } - rnp->qsmaskinitnext &= ~mask; - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - spin_unlock(&rsp->ofl_lock); -} - /* * The outgoing function has no further need of RCU, so remove it from - * the list of CPUs that RCU must track. + * the rcu_node tree's ->qsmaskinitnext bit masks. * * Note that this function is special in that it is invoked directly * from the outgoing CPU rather than from the cpuhp_step mechanism. @@ -3585,21 +3546,41 @@ static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp) */ void rcu_report_dead(unsigned int cpu) { - struct rcu_state *rsp; + unsigned long flags; + unsigned long mask; + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); + struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ /* QS for any half-done expedited RCU-sched GP. */ preempt_disable(); rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(&rcu_data)); preempt_enable(); rcu_preempt_deferred_qs(current); - for_each_rcu_flavor(rsp) - rcu_cleanup_dying_idle_cpu(cpu, rsp); + + /* Remove outgoing CPU from mask in the leaf rcu_node structure. */ + mask = rdp->grpmask; + spin_lock(&rcu_state.ofl_lock); + raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */ + rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq); + rdp->rcu_ofl_gp_flags = READ_ONCE(rcu_state.gp_flags); + if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */ + /* Report quiescent state -before- changing ->qsmaskinitnext! */ + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); + raw_spin_lock_irqsave_rcu_node(rnp, flags); + } + rnp->qsmaskinitnext &= ~mask; + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + spin_unlock(&rcu_state.ofl_lock); per_cpu(rcu_cpu_started, cpu) = 0; } -/* Migrate the dead CPU's callbacks to the current CPU. */ -static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp) +/* + * The outgoing CPU has just passed through the dying-idle state, and we + * are being invoked from the CPU that was IPIed to continue the offline + * operation. Migrate the outgoing CPU's callbacks to the current CPU. + */ +void rcutree_migrate_callbacks(int cpu) { unsigned long flags; struct rcu_data *my_rdp; @@ -3632,19 +3613,6 @@ static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp) cpu, rcu_segcblist_n_cbs(&rdp->cblist), rcu_segcblist_first_cb(&rdp->cblist)); } - -/* - * The outgoing CPU has just passed through the dying-idle state, - * and we are being invoked from the CPU that was IPIed to continue the - * offline operation. We need to migrate the outgoing CPU's callbacks. - */ -void rcutree_migrate_callbacks(int cpu) -{ - struct rcu_state *rsp; - - for_each_rcu_flavor(rsp) - rcu_migrate_callbacks(cpu, rsp); -} #endif /* @@ -3814,7 +3782,7 @@ static void __init rcu_init_one(struct rcu_state *rsp) while (i > rnp->grphi) rnp++; per_cpu_ptr(&rcu_data, i)->mynode = rnp; - rcu_boot_init_percpu_data(i, rsp); + rcu_boot_init_percpu_data(i); } list_add(&rsp->flavors, &rcu_struct_flavors); } From b8bb1f63cf9ac43fc3015449843fe1f81c1b31a6 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 074/140] rcu: Remove rsp parameter from rcu_init_one() and friends There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_init_one() and rcu_dump_rcu_node_tree(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 2a49a04a1d98..0b274530e8a8 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3708,7 +3708,7 @@ void rcu_scheduler_starting(void) /* * Helper function for rcu_init() that initializes one rcu_state structure. */ -static void __init rcu_init_one(struct rcu_state *rsp) +static void __init rcu_init_one(void) { static const char * const buf[] = RCU_NODE_NAME_INIT; static const char * const fqs[] = RCU_FQS_NAME_INIT; @@ -3720,6 +3720,7 @@ static void __init rcu_init_one(struct rcu_state *rsp) int i; int j; struct rcu_node *rnp; + struct rcu_state *rsp = &rcu_state; BUILD_BUG_ON(RCU_NUM_LVLS > ARRAY_SIZE(buf)); /* Fix buf[] init! */ @@ -3870,14 +3871,14 @@ static void __init rcu_init_geometry(void) * Dump out the structure of the rcu_node combining tree associated * with the rcu_state structure referenced by rsp. */ -static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp) +static void __init rcu_dump_rcu_node_tree(void) { int level = 0; struct rcu_node *rnp; pr_info("rcu_node tree layout dump\n"); pr_info(" "); - rcu_for_each_node_breadth_first(rsp, rnp) { + rcu_for_each_node_breadth_first(&rcu_state, rnp) { if (rnp->level != level) { pr_cont("\n"); pr_info(" "); @@ -3899,9 +3900,9 @@ void __init rcu_init(void) rcu_bootup_announce(); rcu_init_geometry(); - rcu_init_one(&rcu_state); + rcu_init_one(); if (dump_tree) - rcu_dump_rcu_node_tree(&rcu_state); + rcu_dump_rcu_node_tree(); open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); /* From a2887cd85f38cf2fdbf42bad97e5c412d99ff5ca Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 075/140] rcu: Remove rsp parameter from rcu_print_detail_task_stall() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_print_detail_task_stall(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 2 +- kernel/rcu/tree.h | 2 +- kernel/rcu/tree_plugin.h | 6 +++--- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0b274530e8a8..130ce5eebdfa 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1360,7 +1360,7 @@ static void print_other_cpu_stall(unsigned long gp_seq) rcu_dump_cpu_stacks(); /* Complain about tasks blocking the grace period. */ - rcu_print_detail_task_stall(rsp); + rcu_print_detail_task_stall(); } else { if (rcu_seq_current(&rsp->gp_seq) != gp_seq) { pr_err("INFO: Stall ended before state dump start\n"); diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index d60304f1ef56..00d268cb4d04 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -452,7 +452,7 @@ static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp); #ifdef CONFIG_HOTPLUG_CPU static bool rcu_preempt_has_tasks(struct rcu_node *rnp); #endif /* #ifdef CONFIG_HOTPLUG_CPU */ -static void rcu_print_detail_task_stall(struct rcu_state *rsp); +static void rcu_print_detail_task_stall(void); static int rcu_print_task_stall(struct rcu_node *rnp); static int rcu_print_task_exp_stall(struct rcu_node *rnp); static void rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index c678c76a754e..1d8148b0d4e5 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -683,12 +683,12 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp) * Dump detailed information for all tasks blocking the current RCU * grace period. */ -static void rcu_print_detail_task_stall(struct rcu_state *rsp) +static void rcu_print_detail_task_stall(void) { struct rcu_node *rnp = rcu_get_root(); rcu_print_detail_task_stall_rnp(rnp); - rcu_for_each_leaf_node(rsp, rnp) + rcu_for_each_leaf_node(&rcu_state, rnp) rcu_print_detail_task_stall_rnp(rnp); } @@ -1005,7 +1005,7 @@ static void rcu_preempt_deferred_qs(struct task_struct *t) { } * Because preemptible RCU does not exist, we never have to check for * tasks blocked within RCU read-side critical sections. */ -static void rcu_print_detail_task_stall(struct rcu_state *rsp) +static void rcu_print_detail_task_stall(void) { } From 81ab59a3ad8656620d7106e855085bc12dc13a4c Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 076/140] rcu: Remove rsp parameter from dump_blkd_tasks() and friend There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from dump_blkd_tasks() and rcu_preempt_blocked_readers_cgp(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 4 ++-- kernel/rcu/tree.h | 6 ++---- kernel/rcu/tree_plugin.h | 12 +++++------- 3 files changed, 9 insertions(+), 13 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 130ce5eebdfa..0d69f198390b 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1937,7 +1937,7 @@ static bool rcu_gp_init(void) rcu_gp_slow(gp_init_delay); raw_spin_lock_irqsave_rcu_node(rnp, flags); rdp = this_cpu_ptr(&rcu_data); - rcu_preempt_check_blocked_tasks(rsp, rnp); + rcu_preempt_check_blocked_tasks(rnp); rnp->qsmask = rnp->qsmaskinit; WRITE_ONCE(rnp->gp_seq, rsp->gp_seq); if (rnp == rdp->mynode) @@ -2049,7 +2049,7 @@ static void rcu_gp_cleanup(void) rcu_for_each_node_breadth_first(rsp, rnp) { raw_spin_lock_irq_rcu_node(rnp); if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) - dump_blkd_tasks(rsp, rnp, 10); + dump_blkd_tasks(rnp, 10); WARN_ON_ONCE(rnp->qsmask); WRITE_ONCE(rnp->gp_seq, new_gp_seq); rdp = this_cpu_ptr(&rcu_data); diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 00d268cb4d04..ccdee6bd3919 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -455,12 +455,10 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp); static void rcu_print_detail_task_stall(void); static int rcu_print_task_stall(struct rcu_node *rnp); static int rcu_print_task_exp_stall(struct rcu_node *rnp); -static void rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, - struct rcu_node *rnp); +static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp); static void rcu_flavor_check_callbacks(int user); void call_rcu(struct rcu_head *head, rcu_callback_t func); -static void dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, - int ncheck); +static void dump_blkd_tasks(struct rcu_node *rnp, int ncheck); static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags); static void rcu_preempt_boost_start_gp(struct rcu_node *rnp); static void invoke_rcu_callbacks_kthread(void); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 1d8148b0d4e5..9a3d30121815 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -756,14 +756,13 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp) * Also, if there are blocked tasks on the list, they automatically * block the newly created grace period, so set up ->gp_tasks accordingly. */ -static void -rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp) +static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) { struct task_struct *t; RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n"); if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) - dump_blkd_tasks(rsp, rnp, 10); + dump_blkd_tasks(rnp, 10); if (rcu_preempt_has_tasks(rnp) && (rnp->qsmaskinit || rnp->wait_blkd_tasks)) { rnp->gp_tasks = rnp->blkd_tasks.next; @@ -884,7 +883,7 @@ void exit_rcu(void) * specified number of elements. */ static void -dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck) +dump_blkd_tasks(struct rcu_node *rnp, int ncheck) { int cpu; int i; @@ -1033,8 +1032,7 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp) * so there is no need to check for blocked tasks. So check only for * bogus qsmask values. */ -static void -rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp) +static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) { WARN_ON_ONCE(rnp->qsmask); } @@ -1095,7 +1093,7 @@ void exit_rcu(void) * Dump the guaranteed-empty blocked-tasks state. Trust but verify. */ static void -dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck) +dump_blkd_tasks(struct rcu_node *rnp, int ncheck) { WARN_ON_ONCE(!list_empty(&rnp->blkd_tasks)); } From 6dbfdc1409cf07accf7c97475c3b58d46daa319b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 077/140] rcu: Remove rsp parameter from rcu_spawn_one_boost_kthread() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_spawn_one_boost_kthread(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 4 ---- kernel/rcu/tree_plugin.h | 13 ++++++------- 2 files changed, 6 insertions(+), 11 deletions(-) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index ccdee6bd3919..dc1c337f6da9 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -463,10 +463,6 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags); static void rcu_preempt_boost_start_gp(struct rcu_node *rnp); static void invoke_rcu_callbacks_kthread(void); static bool rcu_is_callbacks_kthread(void); -#ifdef CONFIG_RCU_BOOST -static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp, - struct rcu_node *rnp); -#endif /* #ifdef CONFIG_RCU_BOOST */ static void __init rcu_spawn_boost_kthreads(void); static void rcu_prepare_kthreads(int cpu); static void rcu_cleanup_after_idle(void); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 9a3d30121815..9a6dea5fab86 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1290,21 +1290,20 @@ static void rcu_preempt_boost_start_gp(struct rcu_node *rnp) * already exist. We only create this kthread for preemptible RCU. * Returns zero if all is well, a negated errno otherwise. */ -static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp, - struct rcu_node *rnp) +static int rcu_spawn_one_boost_kthread(struct rcu_node *rnp) { - int rnp_index = rnp - &rsp->node[0]; + int rnp_index = rnp - rcu_get_root(); unsigned long flags; struct sched_param sp; struct task_struct *t; - if (&rcu_state != rsp) + if (!IS_ENABLED(CONFIG_PREEMPT_RCU)) return 0; if (!rcu_scheduler_fully_active || rcu_rnp_online_cpus(rnp) == 0) return 0; - rsp->boost = 1; + rcu_state.boost = 1; if (rnp->boost_kthread_task != NULL) return 0; t = kthread_create(rcu_boost_kthread, (void *)rnp, @@ -1430,7 +1429,7 @@ static void __init rcu_spawn_boost_kthreads(void) per_cpu(rcu_cpu_has_work, cpu) = 0; BUG_ON(smpboot_register_percpu_thread(&rcu_cpu_thread_spec)); rcu_for_each_leaf_node(&rcu_state, rnp) - (void)rcu_spawn_one_boost_kthread(&rcu_state, rnp); + (void)rcu_spawn_one_boost_kthread(rnp); } static void rcu_prepare_kthreads(int cpu) @@ -1440,7 +1439,7 @@ static void rcu_prepare_kthreads(int cpu) /* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */ if (rcu_scheduler_fully_active) - (void)rcu_spawn_one_boost_kthread(&rcu_state, rnp); + (void)rcu_spawn_one_boost_kthread(rnp); } #else /* #ifdef CONFIG_RCU_BOOST */ From b21ebed951010acccbe9a55337d16cf4da4cce0a Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 078/140] rcu: Remove rsp parameter from print_cpu_stall_info() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from print_cpu_stall_info(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 4 ++-- kernel/rcu/tree.h | 2 +- kernel/rcu/tree_plugin.h | 6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0d69f198390b..1042863dab52 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1342,7 +1342,7 @@ static void print_other_cpu_stall(unsigned long gp_seq) if (rnp->qsmask != 0) { for_each_leaf_node_possible_cpu(rnp, cpu) if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) { - print_cpu_stall_info(rsp, cpu); + print_cpu_stall_info(cpu); ndetected++; } } @@ -1409,7 +1409,7 @@ static void print_cpu_stall(void) pr_err("INFO: %s self-detected stall on CPU", rsp->name); print_cpu_stall_info_begin(); raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags); - print_cpu_stall_info(rsp, smp_processor_id()); + print_cpu_stall_info(smp_processor_id()); raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags); print_cpu_stall_info_end(); for_each_possible_cpu(cpu) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index dc1c337f6da9..2bf57de9f78a 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -472,7 +472,7 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp); static bool rcu_preempt_need_deferred_qs(struct task_struct *t); static void rcu_preempt_deferred_qs(struct task_struct *t); static void print_cpu_stall_info_begin(void); -static void print_cpu_stall_info(struct rcu_state *rsp, int cpu); +static void print_cpu_stall_info(int cpu); static void print_cpu_stall_info_end(void); static void zero_cpu_stall_ticks(struct rcu_data *rdp); static void increment_cpu_stall_ticks(void); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 9a6dea5fab86..08ff162e02b3 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1771,7 +1771,7 @@ static void print_cpu_stall_info_begin(void) * * Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info. */ -static void print_cpu_stall_info(struct rcu_state *rsp, int cpu) +static void print_cpu_stall_info(int cpu) { unsigned long delta; char fast_no_hz[72]; @@ -1786,7 +1786,7 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu) */ touch_nmi_watchdog(); - ticks_value = rcu_seq_ctr(rsp->gp_seq - rdp->gp_seq); + ticks_value = rcu_seq_ctr(rcu_state.gp_seq - rdp->gp_seq); if (ticks_value) { ticks_title = "GPs behind"; } else { @@ -1807,7 +1807,7 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu) rcu_dynticks_snap(rdtp) & 0xfff, rdtp->dynticks_nesting, rdtp->dynticks_nmi_nesting, rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), - READ_ONCE(rsp->n_force_qs) - rsp->n_force_qs_gpstart, + READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart, fast_no_hz); } From 4580b0541beac895a9ba9a4b6f60aec94355bfdd Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 079/140] rcu: Remove rsp parameter from no-CBs CPU functions There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_nocb_cpu_needs_barrier(), rcu_spawn_one_nocb_kthread(), rcu_organize_nocb_kthreads(), rcu_nocb_cpu_needs_barrier(), and rcu_nohz_full_cpu(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 4 ++-- kernel/rcu/tree.h | 6 +++--- kernel/rcu/tree_plugin.h | 18 +++++++++--------- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 1042863dab52..1fbe6c60adc6 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3095,7 +3095,7 @@ static int rcu_pending(void) check_cpu_stall(rdp); /* Is this CPU a NO_HZ_FULL CPU that should ignore RCU? */ - if (rcu_nohz_full_cpu(&rcu_state)) + if (rcu_nohz_full_cpu()) return 0; /* Is the RCU core waiting for a quiescent state from this CPU? */ @@ -3246,7 +3246,7 @@ static void _rcu_barrier(void) continue; rdp = per_cpu_ptr(&rcu_data, cpu); if (rcu_is_nocb_cpu(cpu)) { - if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) { + if (!rcu_nocb_cpu_needs_barrier(cpu)) { _rcu_barrier_trace(TPS("OfflineNoCB"), cpu, rsp->barrier_sequence); } else { diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 2bf57de9f78a..7c6033d71e9d 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -476,7 +476,7 @@ static void print_cpu_stall_info(int cpu); static void print_cpu_stall_info_end(void); static void zero_cpu_stall_ticks(struct rcu_data *rdp); static void increment_cpu_stall_ticks(void); -static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu); +static bool rcu_nocb_cpu_needs_barrier(int cpu); static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq); static void rcu_init_one_nocb(struct rcu_node *rnp); @@ -491,11 +491,11 @@ static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp); static void rcu_spawn_all_nocb_kthreads(int cpu); static void __init rcu_spawn_nocb_kthreads(void); #ifdef CONFIG_RCU_NOCB_CPU -static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp); +static void __init rcu_organize_nocb_kthreads(void); #endif /* #ifdef CONFIG_RCU_NOCB_CPU */ static bool init_nocb_callback_list(struct rcu_data *rdp); static void rcu_bind_gp_kthread(void); -static bool rcu_nohz_full_cpu(struct rcu_state *rsp); +static bool rcu_nohz_full_cpu(void); static void rcu_dynticks_task_enter(void); static void rcu_dynticks_task_exit(void); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 08ff162e02b3..69705ec13527 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1960,7 +1960,7 @@ static void wake_nocb_leader_defer(struct rcu_data *rdp, int waketype, * Does the specified CPU need an RCU callback for the specified flavor * of rcu_barrier()? */ -static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu) +static bool rcu_nocb_cpu_needs_barrier(int cpu) { struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); unsigned long ret; @@ -2424,7 +2424,7 @@ void __init rcu_init_nohz(void) for_each_rcu_flavor(rsp) { for_each_cpu(cpu, rcu_nocb_mask) init_nocb_callback_list(per_cpu_ptr(&rcu_data, cpu)); - rcu_organize_nocb_kthreads(rsp); + rcu_organize_nocb_kthreads(); } } @@ -2444,7 +2444,7 @@ static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp) * brought online out of order, this can require re-organizing the * leader-follower relationships. */ -static void rcu_spawn_one_nocb_kthread(struct rcu_state *rsp, int cpu) +static void rcu_spawn_one_nocb_kthread(int cpu) { struct rcu_data *rdp; struct rcu_data *rdp_last; @@ -2481,7 +2481,7 @@ static void rcu_spawn_one_nocb_kthread(struct rcu_state *rsp, int cpu) /* Spawn the kthread for this CPU and RCU flavor. */ t = kthread_run(rcu_nocb_kthread, rdp_spawn, - "rcuo%c/%d", rsp->abbr, cpu); + "rcuo%c/%d", rcu_state.abbr, cpu); BUG_ON(IS_ERR(t)); WRITE_ONCE(rdp_spawn->nocb_kthread, t); } @@ -2496,7 +2496,7 @@ static void rcu_spawn_all_nocb_kthreads(int cpu) if (rcu_scheduler_fully_active) for_each_rcu_flavor(rsp) - rcu_spawn_one_nocb_kthread(rsp, cpu); + rcu_spawn_one_nocb_kthread(cpu); } /* @@ -2520,7 +2520,7 @@ module_param(rcu_nocb_leader_stride, int, 0444); /* * Initialize leader-follower relationships for all no-CBs CPU. */ -static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp) +static void __init rcu_organize_nocb_kthreads(void) { int cpu; int ls = rcu_nocb_leader_stride; @@ -2579,7 +2579,7 @@ static bool init_nocb_callback_list(struct rcu_data *rdp) #else /* #ifdef CONFIG_RCU_NOCB_CPU */ -static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu) +static bool rcu_nocb_cpu_needs_barrier(int cpu) { WARN_ON_ONCE(1); /* Should be dead code. */ return false; @@ -2648,12 +2648,12 @@ static bool init_nocb_callback_list(struct rcu_data *rdp) * This code relies on the fact that all NO_HZ_FULL CPUs are also * CONFIG_RCU_NOCB_CPU CPUs. */ -static bool rcu_nohz_full_cpu(struct rcu_state *rsp) +static bool rcu_nohz_full_cpu(void) { #ifdef CONFIG_NO_HZ_FULL if (tick_nohz_full_cpu(smp_processor_id()) && (!rcu_gp_in_progress() || - ULONG_CMP_LT(jiffies, READ_ONCE(rsp->gp_start) + HZ))) + ULONG_CMP_LT(jiffies, READ_ONCE(rcu_state.gp_start) + HZ))) return true; #endif /* #ifdef CONFIG_NO_HZ_FULL */ return false; From 63d4c8c97948b0be8cb7ef3b7b943c25864eae4b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 3 Jul 2018 17:22:34 -0700 Subject: [PATCH 080/140] rcu: Remove rsp parameter from expedited grace-period functions There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from the code in kernel/rcu/tree_exp.h, and removes all of the rsp local variables while in the area. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 4 +- kernel/rcu/tree.h | 1 - kernel/rcu/tree_exp.h | 185 ++++++++++++++++++--------------------- kernel/rcu/tree_plugin.h | 13 ++- 4 files changed, 94 insertions(+), 109 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 1fbe6c60adc6..e33bf2aeac50 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -139,7 +139,7 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf); static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu); static void invoke_rcu_core(void); static void invoke_rcu_callbacks(struct rcu_data *rdp); -static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp); +static void rcu_report_exp_rdp(struct rcu_data *rdp); static void sync_sched_exp_online_cleanup(int cpu); /* rcuc/rcub kthread realtime priority */ @@ -3553,7 +3553,7 @@ void rcu_report_dead(unsigned int cpu) /* QS for any half-done expedited RCU-sched GP. */ preempt_disable(); - rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(&rcu_data)); + rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); preempt_enable(); rcu_preempt_deferred_qs(current); diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 7c6033d71e9d..b21d79bdab23 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -61,7 +61,6 @@ struct rcu_dynticks { /* Communicate arguments to a workqueue handler. */ struct rcu_exp_work { smp_call_func_t rew_func; - struct rcu_state *rew_rsp; unsigned long rew_s; struct work_struct rew_work; }; diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 0bcbb03c9702..b6f7bc34ac49 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -25,39 +25,39 @@ /* * Record the start of an expedited grace period. */ -static void rcu_exp_gp_seq_start(struct rcu_state *rsp) +static void rcu_exp_gp_seq_start(void) { - rcu_seq_start(&rsp->expedited_sequence); + rcu_seq_start(&rcu_state.expedited_sequence); } /* * Return then value that expedited-grace-period counter will have * at the end of the current grace period. */ -static __maybe_unused unsigned long rcu_exp_gp_seq_endval(struct rcu_state *rsp) +static __maybe_unused unsigned long rcu_exp_gp_seq_endval(void) { - return rcu_seq_endval(&rsp->expedited_sequence); + return rcu_seq_endval(&rcu_state.expedited_sequence); } /* * Record the end of an expedited grace period. */ -static void rcu_exp_gp_seq_end(struct rcu_state *rsp) +static void rcu_exp_gp_seq_end(void) { - rcu_seq_end(&rsp->expedited_sequence); + rcu_seq_end(&rcu_state.expedited_sequence); smp_mb(); /* Ensure that consecutive grace periods serialize. */ } /* * Take a snapshot of the expedited-grace-period counter. */ -static unsigned long rcu_exp_gp_seq_snap(struct rcu_state *rsp) +static unsigned long rcu_exp_gp_seq_snap(void) { unsigned long s; smp_mb(); /* Caller's modifications seen first by other CPUs. */ - s = rcu_seq_snap(&rsp->expedited_sequence); - trace_rcu_exp_grace_period(rsp->name, s, TPS("snap")); + s = rcu_seq_snap(&rcu_state.expedited_sequence); + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("snap")); return s; } @@ -66,9 +66,9 @@ static unsigned long rcu_exp_gp_seq_snap(struct rcu_state *rsp) * if a full expedited grace period has elapsed since that snapshot * was taken. */ -static bool rcu_exp_gp_seq_done(struct rcu_state *rsp, unsigned long s) +static bool rcu_exp_gp_seq_done(unsigned long s) { - return rcu_seq_done(&rsp->expedited_sequence, s); + return rcu_seq_done(&rcu_state.expedited_sequence, s); } /* @@ -78,26 +78,26 @@ static bool rcu_exp_gp_seq_done(struct rcu_state *rsp, unsigned long s) * ever been online. This means that this function normally takes its * no-work-to-do fastpath. */ -static void sync_exp_reset_tree_hotplug(struct rcu_state *rsp) +static void sync_exp_reset_tree_hotplug(void) { bool done; unsigned long flags; unsigned long mask; unsigned long oldmask; - int ncpus = smp_load_acquire(&rsp->ncpus); /* Order against locking. */ + int ncpus = smp_load_acquire(&rcu_state.ncpus); /* Order vs. locking. */ struct rcu_node *rnp; struct rcu_node *rnp_up; /* If no new CPUs onlined since last time, nothing to do. */ - if (likely(ncpus == rsp->ncpus_snap)) + if (likely(ncpus == rcu_state.ncpus_snap)) return; - rsp->ncpus_snap = ncpus; + rcu_state.ncpus_snap = ncpus; /* * Each pass through the following loop propagates newly onlined * CPUs for the current rcu_node structure up the rcu_node tree. */ - rcu_for_each_leaf_node(rsp, rnp) { + rcu_for_each_leaf_node(&rcu_state, rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); if (rnp->expmaskinit == rnp->expmaskinitnext) { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); @@ -135,13 +135,13 @@ static void sync_exp_reset_tree_hotplug(struct rcu_state *rsp) * Reset the ->expmask values in the rcu_node tree in preparation for * a new expedited grace period. */ -static void __maybe_unused sync_exp_reset_tree(struct rcu_state *rsp) +static void __maybe_unused sync_exp_reset_tree(void) { unsigned long flags; struct rcu_node *rnp; - sync_exp_reset_tree_hotplug(rsp); - rcu_for_each_node_breadth_first(rsp, rnp) { + sync_exp_reset_tree_hotplug(); + rcu_for_each_node_breadth_first(&rcu_state, rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); WARN_ON_ONCE(rnp->expmask); rnp->expmask = rnp->expmaskinit; @@ -194,7 +194,7 @@ static bool sync_rcu_preempt_exp_done_unlocked(struct rcu_node *rnp) * * Caller must hold the specified rcu_node structure's ->lock. */ -static void __rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, +static void __rcu_report_exp_rnp(struct rcu_node *rnp, bool wake, unsigned long flags) __releases(rnp->lock) { @@ -212,7 +212,7 @@ static void __rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, raw_spin_unlock_irqrestore_rcu_node(rnp, flags); if (wake) { smp_mb(); /* EGP done before wake_up(). */ - swake_up_one(&rsp->expedited_wq); + swake_up_one(&rcu_state.expedited_wq); } break; } @@ -229,20 +229,19 @@ static void __rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, * Report expedited quiescent state for specified node. This is a * lock-acquisition wrapper function for __rcu_report_exp_rnp(). */ -static void __maybe_unused rcu_report_exp_rnp(struct rcu_state *rsp, - struct rcu_node *rnp, bool wake) +static void __maybe_unused rcu_report_exp_rnp(struct rcu_node *rnp, bool wake) { unsigned long flags; raw_spin_lock_irqsave_rcu_node(rnp, flags); - __rcu_report_exp_rnp(rsp, rnp, wake, flags); + __rcu_report_exp_rnp(rnp, wake, flags); } /* * Report expedited quiescent state for multiple CPUs, all covered by the * specified leaf rcu_node structure. */ -static void rcu_report_exp_cpu_mult(struct rcu_state *rsp, struct rcu_node *rnp, +static void rcu_report_exp_cpu_mult(struct rcu_node *rnp, unsigned long mask, bool wake) { unsigned long flags; @@ -253,23 +252,23 @@ static void rcu_report_exp_cpu_mult(struct rcu_state *rsp, struct rcu_node *rnp, return; } rnp->expmask &= ~mask; - __rcu_report_exp_rnp(rsp, rnp, wake, flags); /* Releases rnp->lock. */ + __rcu_report_exp_rnp(rnp, wake, flags); /* Releases rnp->lock. */ } /* * Report expedited quiescent state for specified rcu_data (CPU). */ -static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp) +static void rcu_report_exp_rdp(struct rcu_data *rdp) { WRITE_ONCE(rdp->deferred_qs, false); - rcu_report_exp_cpu_mult(rsp, rdp->mynode, rdp->grpmask, true); + rcu_report_exp_cpu_mult(rdp->mynode, rdp->grpmask, true); } /* Common code for work-done checking. */ -static bool sync_exp_work_done(struct rcu_state *rsp, unsigned long s) +static bool sync_exp_work_done(unsigned long s) { - if (rcu_exp_gp_seq_done(rsp, s)) { - trace_rcu_exp_grace_period(rsp->name, s, TPS("done")); + if (rcu_exp_gp_seq_done(s)) { + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("done")); /* Ensure test happens before caller kfree(). */ smp_mb__before_atomic(); /* ^^^ */ return true; @@ -284,7 +283,7 @@ static bool sync_exp_work_done(struct rcu_state *rsp, unsigned long s) * with the mutex held, indicating that the caller must actually do the * expedited grace period. */ -static bool exp_funnel_lock(struct rcu_state *rsp, unsigned long s) +static bool exp_funnel_lock(unsigned long s) { struct rcu_data *rdp = per_cpu_ptr(&rcu_data, raw_smp_processor_id()); struct rcu_node *rnp = rdp->mynode; @@ -294,18 +293,18 @@ static bool exp_funnel_lock(struct rcu_state *rsp, unsigned long s) if (ULONG_CMP_LT(READ_ONCE(rnp->exp_seq_rq), s) && (rnp == rnp_root || ULONG_CMP_LT(READ_ONCE(rnp_root->exp_seq_rq), s)) && - mutex_trylock(&rsp->exp_mutex)) + mutex_trylock(&rcu_state.exp_mutex)) goto fastpath; /* * Each pass through the following loop works its way up * the rcu_node tree, returning if others have done the work or - * otherwise falls through to acquire rsp->exp_mutex. The mapping + * otherwise falls through to acquire ->exp_mutex. The mapping * from CPU to rcu_node structure can be inexact, as it is just * promoting locality and is not strictly needed for correctness. */ for (; rnp != NULL; rnp = rnp->parent) { - if (sync_exp_work_done(rsp, s)) + if (sync_exp_work_done(s)) return true; /* Work not done, either wait here or go up. */ @@ -314,26 +313,26 @@ static bool exp_funnel_lock(struct rcu_state *rsp, unsigned long s) /* Someone else doing GP, so wait for them. */ spin_unlock(&rnp->exp_lock); - trace_rcu_exp_funnel_lock(rsp->name, rnp->level, + trace_rcu_exp_funnel_lock(rcu_state.name, rnp->level, rnp->grplo, rnp->grphi, TPS("wait")); wait_event(rnp->exp_wq[rcu_seq_ctr(s) & 0x3], - sync_exp_work_done(rsp, s)); + sync_exp_work_done(s)); return true; } rnp->exp_seq_rq = s; /* Followers can wait on us. */ spin_unlock(&rnp->exp_lock); - trace_rcu_exp_funnel_lock(rsp->name, rnp->level, rnp->grplo, - rnp->grphi, TPS("nxtlvl")); + trace_rcu_exp_funnel_lock(rcu_state.name, rnp->level, + rnp->grplo, rnp->grphi, TPS("nxtlvl")); } - mutex_lock(&rsp->exp_mutex); + mutex_lock(&rcu_state.exp_mutex); fastpath: - if (sync_exp_work_done(rsp, s)) { - mutex_unlock(&rsp->exp_mutex); + if (sync_exp_work_done(s)) { + mutex_unlock(&rcu_state.exp_mutex); return true; } - rcu_exp_gp_seq_start(rsp); - trace_rcu_exp_grace_period(rsp->name, s, TPS("start")); + rcu_exp_gp_seq_start(); + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("start")); return false; } @@ -352,7 +351,6 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp) struct rcu_exp_work *rewp = container_of(wp, struct rcu_exp_work, rew_work); struct rcu_node *rnp = container_of(rewp, struct rcu_node, rew); - struct rcu_state *rsp = rewp->rew_rsp; func = rewp->rew_func; raw_spin_lock_irqsave_rcu_node(rnp, flags); @@ -400,7 +398,7 @@ retry_ipi: mask_ofl_test |= mask; continue; } - ret = smp_call_function_single(cpu, func, rsp, 0); + ret = smp_call_function_single(cpu, func, NULL, 0); if (!ret) { mask_ofl_ipi &= ~mask; continue; @@ -411,7 +409,7 @@ retry_ipi: (rnp->expmask & mask)) { /* Online, so delay for a bit and try again. */ raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("selectofl")); + trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("selectofl")); schedule_timeout_uninterruptible(1); goto retry_ipi; } @@ -423,33 +421,31 @@ retry_ipi: /* Report quiescent states for those that went offline. */ mask_ofl_test |= mask_ofl_ipi; if (mask_ofl_test) - rcu_report_exp_cpu_mult(rsp, rnp, mask_ofl_test, false); + rcu_report_exp_cpu_mult(rnp, mask_ofl_test, false); } /* * Select the nodes that the upcoming expedited grace period needs * to wait for. */ -static void sync_rcu_exp_select_cpus(struct rcu_state *rsp, - smp_call_func_t func) +static void sync_rcu_exp_select_cpus(smp_call_func_t func) { int cpu; struct rcu_node *rnp; - trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("reset")); - sync_exp_reset_tree(rsp); - trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("select")); + trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("reset")); + sync_exp_reset_tree(); + trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("select")); /* Schedule work for each leaf rcu_node structure. */ - rcu_for_each_leaf_node(rsp, rnp) { + rcu_for_each_leaf_node(&rcu_state, rnp) { rnp->exp_need_flush = false; if (!READ_ONCE(rnp->expmask)) continue; /* Avoid early boot non-existent wq. */ rnp->rew.rew_func = func; - rnp->rew.rew_rsp = rsp; if (!READ_ONCE(rcu_par_gp_wq) || rcu_scheduler_active != RCU_SCHEDULER_RUNNING || - rcu_is_last_leaf_node(rsp, rnp)) { + rcu_is_last_leaf_node(&rcu_state, rnp)) { /* No workqueues yet or last leaf, do direct call. */ sync_rcu_exp_select_node_cpus(&rnp->rew.rew_work); continue; @@ -466,12 +462,12 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp, } /* Wait for workqueue jobs (if any) to complete. */ - rcu_for_each_leaf_node(rsp, rnp) + rcu_for_each_leaf_node(&rcu_state, rnp) if (rnp->exp_need_flush) flush_work(&rnp->rew.rew_work); } -static void synchronize_sched_expedited_wait(struct rcu_state *rsp) +static void synchronize_sched_expedited_wait(void) { int cpu; unsigned long jiffies_stall; @@ -482,13 +478,13 @@ static void synchronize_sched_expedited_wait(struct rcu_state *rsp) struct rcu_node *rnp_root = rcu_get_root(); int ret; - trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("startwait")); + trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait")); jiffies_stall = rcu_jiffies_till_stall_check(); jiffies_start = jiffies; for (;;) { ret = swait_event_timeout_exclusive( - rsp->expedited_wq, + rcu_state.expedited_wq, sync_rcu_preempt_exp_done_unlocked(rnp_root), jiffies_stall); if (ret > 0 || sync_rcu_preempt_exp_done_unlocked(rnp_root)) @@ -498,9 +494,9 @@ static void synchronize_sched_expedited_wait(struct rcu_state *rsp) continue; panic_on_rcu_stall(); pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", - rsp->name); + rcu_state.name); ndetected = 0; - rcu_for_each_leaf_node(rsp, rnp) { + rcu_for_each_leaf_node(&rcu_state, rnp) { ndetected += rcu_print_task_exp_stall(rnp); for_each_leaf_node_possible_cpu(rnp, cpu) { struct rcu_data *rdp; @@ -517,11 +513,11 @@ static void synchronize_sched_expedited_wait(struct rcu_state *rsp) } } pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n", - jiffies - jiffies_start, rsp->expedited_sequence, + jiffies - jiffies_start, rcu_state.expedited_sequence, rnp_root->expmask, ".T"[!!rnp_root->exp_tasks]); if (ndetected) { pr_err("blocking rcu_node structures:"); - rcu_for_each_node_breadth_first(rsp, rnp) { + rcu_for_each_node_breadth_first(&rcu_state, rnp) { if (rnp == rnp_root) continue; /* printed unconditionally */ if (sync_rcu_preempt_exp_done_unlocked(rnp)) @@ -533,7 +529,7 @@ static void synchronize_sched_expedited_wait(struct rcu_state *rsp) } pr_cont("\n"); } - rcu_for_each_leaf_node(rsp, rnp) { + rcu_for_each_leaf_node(&rcu_state, rnp) { for_each_leaf_node_possible_cpu(rnp, cpu) { mask = leaf_node_cpu_bit(rnp, cpu); if (!(rnp->expmask & mask)) @@ -551,21 +547,21 @@ static void synchronize_sched_expedited_wait(struct rcu_state *rsp) * grace period. Also update all the ->exp_seq_rq counters as needed * in order to avoid counter-wrap problems. */ -static void rcu_exp_wait_wake(struct rcu_state *rsp, unsigned long s) +static void rcu_exp_wait_wake(unsigned long s) { struct rcu_node *rnp; - synchronize_sched_expedited_wait(rsp); - rcu_exp_gp_seq_end(rsp); - trace_rcu_exp_grace_period(rsp->name, s, TPS("end")); + synchronize_sched_expedited_wait(); + rcu_exp_gp_seq_end(); + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("end")); /* * Switch over to wakeup mode, allowing the next GP, but -only- the * next GP, to proceed. */ - mutex_lock(&rsp->exp_wake_mutex); + mutex_lock(&rcu_state.exp_wake_mutex); - rcu_for_each_node_breadth_first(rsp, rnp) { + rcu_for_each_node_breadth_first(&rcu_state, rnp) { if (ULONG_CMP_LT(READ_ONCE(rnp->exp_seq_rq), s)) { spin_lock(&rnp->exp_lock); /* Recheck, avoid hang in case someone just arrived. */ @@ -574,24 +570,23 @@ static void rcu_exp_wait_wake(struct rcu_state *rsp, unsigned long s) spin_unlock(&rnp->exp_lock); } smp_mb(); /* All above changes before wakeup. */ - wake_up_all(&rnp->exp_wq[rcu_seq_ctr(rsp->expedited_sequence) & 0x3]); + wake_up_all(&rnp->exp_wq[rcu_seq_ctr(rcu_state.expedited_sequence) & 0x3]); } - trace_rcu_exp_grace_period(rsp->name, s, TPS("endwake")); - mutex_unlock(&rsp->exp_wake_mutex); + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("endwake")); + mutex_unlock(&rcu_state.exp_wake_mutex); } /* * Common code to drive an expedited grace period forward, used by * workqueues and mid-boot-time tasks. */ -static void rcu_exp_sel_wait_wake(struct rcu_state *rsp, - smp_call_func_t func, unsigned long s) +static void rcu_exp_sel_wait_wake(smp_call_func_t func, unsigned long s) { /* Initialize the rcu_node tree in preparation for the wait. */ - sync_rcu_exp_select_cpus(rsp, func); + sync_rcu_exp_select_cpus(func); /* Wait and clean up, including waking everyone. */ - rcu_exp_wait_wake(rsp, s); + rcu_exp_wait_wake(s); } /* @@ -602,15 +597,14 @@ static void wait_rcu_exp_gp(struct work_struct *wp) struct rcu_exp_work *rewp; rewp = container_of(wp, struct rcu_exp_work, rew_work); - rcu_exp_sel_wait_wake(rewp->rew_rsp, rewp->rew_func, rewp->rew_s); + rcu_exp_sel_wait_wake(rewp->rew_func, rewp->rew_s); } /* * Given an rcu_state pointer and a smp_call_function() handler, kick * off the specified flavor of expedited grace period. */ -static void _synchronize_rcu_expedited(struct rcu_state *rsp, - smp_call_func_t func) +static void _synchronize_rcu_expedited(smp_call_func_t func) { struct rcu_data *rdp; struct rcu_exp_work rew; @@ -624,18 +618,17 @@ static void _synchronize_rcu_expedited(struct rcu_state *rsp, } /* Take a snapshot of the sequence number. */ - s = rcu_exp_gp_seq_snap(rsp); - if (exp_funnel_lock(rsp, s)) + s = rcu_exp_gp_seq_snap(); + if (exp_funnel_lock(s)) return; /* Someone else did our work for us. */ /* Ensure that load happens before action based on it. */ if (unlikely(rcu_scheduler_active == RCU_SCHEDULER_INIT)) { /* Direct call during scheduler init and early_initcalls(). */ - rcu_exp_sel_wait_wake(rsp, func, s); + rcu_exp_sel_wait_wake(func, s); } else { /* Marshall arguments & schedule the expedited grace period. */ rew.rew_func = func; - rew.rew_rsp = rsp; rew.rew_s = s; INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp); queue_work(rcu_gp_wq, &rew.rew_work); @@ -645,11 +638,11 @@ static void _synchronize_rcu_expedited(struct rcu_state *rsp, rdp = per_cpu_ptr(&rcu_data, raw_smp_processor_id()); rnp = rcu_get_root(); wait_event(rnp->exp_wq[rcu_seq_ctr(s) & 0x3], - sync_exp_work_done(rsp, s)); + sync_exp_work_done(s)); smp_mb(); /* Workqueue actions happen before return. */ /* Let the next expedited grace period start. */ - mutex_unlock(&rsp->exp_mutex); + mutex_unlock(&rcu_state.exp_mutex); } #ifdef CONFIG_PREEMPT_RCU @@ -661,10 +654,9 @@ static void _synchronize_rcu_expedited(struct rcu_state *rsp, * ->expmask fields in the rcu_node tree. Otherwise, immediately * report the quiescent state. */ -static void sync_rcu_exp_handler(void *info) +static void sync_rcu_exp_handler(void *unused) { unsigned long flags; - struct rcu_state *rsp = info; struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_node *rnp = rdp->mynode; struct task_struct *t = current; @@ -677,7 +669,7 @@ static void sync_rcu_exp_handler(void *info) if (!t->rcu_read_lock_nesting) { if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) || rcu_dynticks_curr_cpu_in_eqs()) { - rcu_report_exp_rdp(rsp, rdp); + rcu_report_exp_rdp(rdp); } else { rdp->deferred_qs = true; resched_cpu(rdp->cpu); @@ -756,8 +748,6 @@ static void sync_sched_exp_online_cleanup(int cpu) */ void synchronize_rcu_expedited(void) { - struct rcu_state *rsp = &rcu_state; - RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || lock_is_held(&rcu_lock_map) || lock_is_held(&rcu_sched_lock_map), @@ -765,7 +755,7 @@ void synchronize_rcu_expedited(void) if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE) return; - _synchronize_rcu_expedited(rsp, sync_rcu_exp_handler); + _synchronize_rcu_expedited(sync_rcu_exp_handler); } EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); @@ -783,7 +773,7 @@ static void sync_sched_exp_handler(void *unused) __this_cpu_read(rcu_data.cpu_no_qs.b.exp)) return; if (rcu_is_cpu_rrupt_from_idle()) { - rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(&rcu_data)); + rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); return; } __this_cpu_write(rcu_data.cpu_no_qs.b.exp, true); @@ -798,13 +788,12 @@ static void sync_sched_exp_online_cleanup(int cpu) struct rcu_data *rdp; int ret; struct rcu_node *rnp; - struct rcu_state *rsp = &rcu_state; rdp = per_cpu_ptr(&rcu_data, cpu); rnp = rdp->mynode; if (!(READ_ONCE(rnp->expmask) & rdp->grpmask)) return; - ret = smp_call_function_single(cpu, sync_sched_exp_handler, rsp, 0); + ret = smp_call_function_single(cpu, sync_sched_exp_handler, NULL, 0); WARN_ON_ONCE(ret); } @@ -831,8 +820,6 @@ static int rcu_blocking_is_gp(void) /* PREEMPT=n implementation of synchronize_rcu_expedited(). */ void synchronize_rcu_expedited(void) { - struct rcu_state *rsp = &rcu_state; - RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || lock_is_held(&rcu_lock_map) || lock_is_held(&rcu_sched_lock_map), @@ -842,7 +829,7 @@ void synchronize_rcu_expedited(void) if (rcu_blocking_is_gp()) return; - _synchronize_rcu_expedited(rsp, sync_sched_exp_handler); + _synchronize_rcu_expedited(sync_sched_exp_handler); } EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 69705ec13527..e6ec25e47d00 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -123,8 +123,7 @@ static void __init rcu_bootup_announce_oddness(void) #ifdef CONFIG_PREEMPT_RCU -static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, - bool wake); +static void rcu_report_exp_rnp(struct rcu_node *rnp, bool wake); static void rcu_read_unlock_special(struct task_struct *t); /* @@ -281,7 +280,7 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp) * still in a quiescent state in any case.) */ if (blkd_state & RCU_EXP_BLKD && rdp->deferred_qs) - rcu_report_exp_rdp(rdp->rsp, rdp); + rcu_report_exp_rdp(rdp); else WARN_ON_ONCE(rdp->deferred_qs); } @@ -381,7 +380,7 @@ void rcu_note_context_switch(bool preempt) */ rcu_qs(); if (rdp->deferred_qs) - rcu_report_exp_rdp(&rcu_state, rdp); + rcu_report_exp_rdp(rdp); trace_rcu_utilization(TPS("End context switch")); barrier(); /* Avoid RCU read-side critical sections leaking up. */ } @@ -509,7 +508,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) * blocked-tasks list below. */ if (rdp->deferred_qs) { - rcu_report_exp_rdp(&rcu_state, rdp); + rcu_report_exp_rdp(rdp); if (!t->rcu_read_unlock_special.s) { local_irq_restore(flags); return; @@ -580,7 +579,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) * then we need to report up the rcu_node hierarchy. */ if (!empty_exp && empty_exp_now) - rcu_report_exp_rnp(&rcu_state, rnp, true); + rcu_report_exp_rnp(rnp, true); } else { local_irq_restore(flags); } @@ -947,7 +946,7 @@ static void rcu_qs(void) if (!__this_cpu_read(rcu_data.cpu_no_qs.b.exp)) return; __this_cpu_write(rcu_data.cpu_no_qs.b.exp, false); - rcu_report_exp_rdp(&rcu_state, this_cpu_ptr(&rcu_data)); + rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); } /* From aedf4ba984168ab5b96898a03bfdb51d07194776 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 4 Jul 2018 14:33:59 -0700 Subject: [PATCH 081/140] rcu: Remove rsp parameter from rcu_node tree accessor macros There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's rcu_node tree's accessor macros. This commit therefore removes the rsp parameter from those macros in kernel/rcu/rcu.h, and removes some now-unused rsp local variables while in the area. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcu.h | 28 +++++++++++----------------- kernel/rcu/srcutree.c | 4 ++-- kernel/rcu/tree.c | 19 +++++++++---------- kernel/rcu/tree_exp.h | 18 +++++++++--------- kernel/rcu/tree_plugin.h | 4 ++-- 5 files changed, 33 insertions(+), 40 deletions(-) diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index 4d04683c31b2..2bb77fddc11f 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -329,29 +329,23 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt) } /* Returns first leaf rcu_node of the specified RCU flavor. */ -#define rcu_first_leaf_node(rsp) ((rsp)->level[rcu_num_lvls - 1]) +#define rcu_first_leaf_node() (rcu_state.level[rcu_num_lvls - 1]) /* Is this rcu_node a leaf? */ #define rcu_is_leaf_node(rnp) ((rnp)->level == rcu_num_lvls - 1) /* Is this rcu_node the last leaf? */ -#define rcu_is_last_leaf_node(rsp, rnp) ((rnp) == &(rsp)->node[rcu_num_nodes - 1]) +#define rcu_is_last_leaf_node(rnp) ((rnp) == &rcu_state.node[rcu_num_nodes - 1]) /* - * Do a full breadth-first scan of the rcu_node structures for the + * Do a full breadth-first scan of the {s,}rcu_node structures for the * specified rcu_state structure. */ -#define rcu_for_each_node_breadth_first(rsp, rnp) \ - for ((rnp) = &(rsp)->node[0]; \ - (rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++) - -/* - * Do a breadth-first scan of the non-leaf rcu_node structures for the - * specified rcu_state structure. Note that if there is a singleton - * rcu_node tree with but one rcu_node structure, this loop is a no-op. - */ -#define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \ - for ((rnp) = &(rsp)->node[0]; !rcu_is_leaf_node(rsp, rnp); (rnp)++) +#define srcu_for_each_node_breadth_first(sp, rnp) \ + for ((rnp) = &(sp)->node[0]; \ + (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++) +#define rcu_for_each_node_breadth_first(rnp) \ + srcu_for_each_node_breadth_first(&rcu_state, rnp) /* * Scan the leaves of the rcu_node hierarchy for the specified rcu_state @@ -359,9 +353,9 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt) * one rcu_node structure, this loop -will- visit the rcu_node structure. * It is still a leaf node, even if it is also the root node. */ -#define rcu_for_each_leaf_node(rsp, rnp) \ - for ((rnp) = rcu_first_leaf_node(rsp); \ - (rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++) +#define rcu_for_each_leaf_node(rnp) \ + for ((rnp) = rcu_first_leaf_node(); \ + (rnp) < &rcu_state.node[rcu_num_nodes]; (rnp)++) /* * Iterate over all possible CPUs in a leaf RCU node. diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 6c9866a854b1..2042080cd38b 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -105,7 +105,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *sp, bool is_static) rcu_init_levelspread(levelspread, num_rcu_lvl); /* Each pass through this loop initializes one srcu_node structure. */ - rcu_for_each_node_breadth_first(sp, snp) { + srcu_for_each_node_breadth_first(sp, snp) { spin_lock_init(&ACCESS_PRIVATE(snp, lock)); WARN_ON_ONCE(ARRAY_SIZE(snp->srcu_have_cbs) != ARRAY_SIZE(snp->srcu_data_have_cbs)); @@ -561,7 +561,7 @@ static void srcu_gp_end(struct srcu_struct *sp) /* Initiate callback invocation as needed. */ idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs); - rcu_for_each_node_breadth_first(sp, snp) { + srcu_for_each_node_breadth_first(sp, snp) { spin_lock_irq_rcu_node(snp); cbs = false; last_lvl = snp >= sp->level[rcu_num_lvls - 1]; diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e33bf2aeac50..0465a85a40e1 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -573,7 +573,7 @@ void show_rcu_gp_kthreads(void) for_each_rcu_flavor(rsp) { pr_info("%s: wait state: %d ->state: %#lx\n", rsp->name, rsp->gp_state, rsp->gp_kthread->state); - rcu_for_each_node_breadth_first(rsp, rnp) { + rcu_for_each_node_breadth_first(rnp) { if (ULONG_CMP_GE(rsp->gp_seq, rnp->gp_seq_needed)) continue; pr_info("\trcu_node %d:%d ->gp_seq %lu ->gp_seq_needed %lu\n", @@ -1276,7 +1276,7 @@ static void rcu_dump_cpu_stacks(void) unsigned long flags; struct rcu_node *rnp; - rcu_for_each_leaf_node(&rcu_state, rnp) { + rcu_for_each_leaf_node(rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); for_each_leaf_node_possible_cpu(rnp, cpu) if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) @@ -1336,7 +1336,7 @@ static void print_other_cpu_stall(unsigned long gp_seq) */ pr_err("INFO: %s detected stalls on CPUs/tasks:", rsp->name); print_cpu_stall_info_begin(); - rcu_for_each_leaf_node(rsp, rnp) { + rcu_for_each_leaf_node(rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); ndetected += rcu_print_task_stall(rnp); if (rnp->qsmask != 0) { @@ -1873,7 +1873,7 @@ static bool rcu_gp_init(void) * will handle subsequent offline CPUs. */ rsp->gp_state = RCU_GP_ONOFF; - rcu_for_each_leaf_node(rsp, rnp) { + rcu_for_each_leaf_node(rnp) { spin_lock(&rsp->ofl_lock); raw_spin_lock_irq_rcu_node(rnp); if (rnp->qsmaskinit == rnp->qsmaskinitnext && @@ -1933,7 +1933,7 @@ static bool rcu_gp_init(void) * process finishes, because this kthread handles both. */ rsp->gp_state = RCU_GP_INIT; - rcu_for_each_node_breadth_first(rsp, rnp) { + rcu_for_each_node_breadth_first(rnp) { rcu_gp_slow(gp_init_delay); raw_spin_lock_irqsave_rcu_node(rnp, flags); rdp = this_cpu_ptr(&rcu_data); @@ -2046,7 +2046,7 @@ static void rcu_gp_cleanup(void) */ new_gp_seq = rsp->gp_seq; rcu_seq_end(&new_gp_seq); - rcu_for_each_node_breadth_first(rsp, rnp) { + rcu_for_each_node_breadth_first(rnp) { raw_spin_lock_irq_rcu_node(rnp); if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) dump_blkd_tasks(rnp, 10); @@ -2606,9 +2606,8 @@ static void force_qs_rnp(int (*f)(struct rcu_data *rsp)) unsigned long flags; unsigned long mask; struct rcu_node *rnp; - struct rcu_state *rsp = &rcu_state; - rcu_for_each_leaf_node(rsp, rnp) { + rcu_for_each_leaf_node(rnp) { cond_resched_tasks_rcu_qs(); mask = 0; raw_spin_lock_irqsave_rcu_node(rnp, flags); @@ -3778,7 +3777,7 @@ static void __init rcu_init_one(void) init_swait_queue_head(&rsp->gp_wq); init_swait_queue_head(&rsp->expedited_wq); - rnp = rcu_first_leaf_node(rsp); + rnp = rcu_first_leaf_node(); for_each_possible_cpu(i) { while (i > rnp->grphi) rnp++; @@ -3878,7 +3877,7 @@ static void __init rcu_dump_rcu_node_tree(void) pr_info("rcu_node tree layout dump\n"); pr_info(" "); - rcu_for_each_node_breadth_first(&rcu_state, rnp) { + rcu_for_each_node_breadth_first(rnp) { if (rnp->level != level) { pr_cont("\n"); pr_info(" "); diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index b6f7bc34ac49..060bdb45cd95 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -97,7 +97,7 @@ static void sync_exp_reset_tree_hotplug(void) * Each pass through the following loop propagates newly onlined * CPUs for the current rcu_node structure up the rcu_node tree. */ - rcu_for_each_leaf_node(&rcu_state, rnp) { + rcu_for_each_leaf_node(rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); if (rnp->expmaskinit == rnp->expmaskinitnext) { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); @@ -141,7 +141,7 @@ static void __maybe_unused sync_exp_reset_tree(void) struct rcu_node *rnp; sync_exp_reset_tree_hotplug(); - rcu_for_each_node_breadth_first(&rcu_state, rnp) { + rcu_for_each_node_breadth_first(rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); WARN_ON_ONCE(rnp->expmask); rnp->expmask = rnp->expmaskinit; @@ -438,14 +438,14 @@ static void sync_rcu_exp_select_cpus(smp_call_func_t func) trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("select")); /* Schedule work for each leaf rcu_node structure. */ - rcu_for_each_leaf_node(&rcu_state, rnp) { + rcu_for_each_leaf_node(rnp) { rnp->exp_need_flush = false; if (!READ_ONCE(rnp->expmask)) continue; /* Avoid early boot non-existent wq. */ rnp->rew.rew_func = func; if (!READ_ONCE(rcu_par_gp_wq) || rcu_scheduler_active != RCU_SCHEDULER_RUNNING || - rcu_is_last_leaf_node(&rcu_state, rnp)) { + rcu_is_last_leaf_node(rnp)) { /* No workqueues yet or last leaf, do direct call. */ sync_rcu_exp_select_node_cpus(&rnp->rew.rew_work); continue; @@ -462,7 +462,7 @@ static void sync_rcu_exp_select_cpus(smp_call_func_t func) } /* Wait for workqueue jobs (if any) to complete. */ - rcu_for_each_leaf_node(&rcu_state, rnp) + rcu_for_each_leaf_node(rnp) if (rnp->exp_need_flush) flush_work(&rnp->rew.rew_work); } @@ -496,7 +496,7 @@ static void synchronize_sched_expedited_wait(void) pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", rcu_state.name); ndetected = 0; - rcu_for_each_leaf_node(&rcu_state, rnp) { + rcu_for_each_leaf_node(rnp) { ndetected += rcu_print_task_exp_stall(rnp); for_each_leaf_node_possible_cpu(rnp, cpu) { struct rcu_data *rdp; @@ -517,7 +517,7 @@ static void synchronize_sched_expedited_wait(void) rnp_root->expmask, ".T"[!!rnp_root->exp_tasks]); if (ndetected) { pr_err("blocking rcu_node structures:"); - rcu_for_each_node_breadth_first(&rcu_state, rnp) { + rcu_for_each_node_breadth_first(rnp) { if (rnp == rnp_root) continue; /* printed unconditionally */ if (sync_rcu_preempt_exp_done_unlocked(rnp)) @@ -529,7 +529,7 @@ static void synchronize_sched_expedited_wait(void) } pr_cont("\n"); } - rcu_for_each_leaf_node(&rcu_state, rnp) { + rcu_for_each_leaf_node(rnp) { for_each_leaf_node_possible_cpu(rnp, cpu) { mask = leaf_node_cpu_bit(rnp, cpu); if (!(rnp->expmask & mask)) @@ -561,7 +561,7 @@ static void rcu_exp_wait_wake(unsigned long s) */ mutex_lock(&rcu_state.exp_wake_mutex); - rcu_for_each_node_breadth_first(&rcu_state, rnp) { + rcu_for_each_node_breadth_first(rnp) { if (ULONG_CMP_LT(READ_ONCE(rnp->exp_seq_rq), s)) { spin_lock(&rnp->exp_lock); /* Recheck, avoid hang in case someone just arrived. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index e6ec25e47d00..b60d3df92ff5 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -687,7 +687,7 @@ static void rcu_print_detail_task_stall(void) struct rcu_node *rnp = rcu_get_root(); rcu_print_detail_task_stall_rnp(rnp); - rcu_for_each_leaf_node(&rcu_state, rnp) + rcu_for_each_leaf_node(rnp) rcu_print_detail_task_stall_rnp(rnp); } @@ -1427,7 +1427,7 @@ static void __init rcu_spawn_boost_kthreads(void) for_each_possible_cpu(cpu) per_cpu(rcu_cpu_has_work, cpu) = 0; BUG_ON(smpboot_register_percpu_thread(&rcu_cpu_thread_spec)); - rcu_for_each_leaf_node(&rcu_state, rnp) + rcu_for_each_leaf_node(rnp) (void)rcu_spawn_one_boost_kthread(rnp); } From 88d1bead858d88cdda92ed8f3388eea8ee3a9675 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 4 Jul 2018 14:45:00 -0700 Subject: [PATCH 082/140] rcu: Remove rcu_data structure's ->rsp field Now that there is only one rcu_state structure, there is no need for the rcu_data structure to indicate which it corresponds to. This commit therefore removes the rcu_data structure's ->rsp field, replacing all remaining uses of it with &rcu_state. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 28 +++++++++++++-------------- kernel/rcu/tree.h | 1 - kernel/rcu/tree_plugin.h | 42 ++++++++++++++++++++-------------------- 3 files changed, 34 insertions(+), 37 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0465a85a40e1..aeff9024bb6c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1070,7 +1070,7 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp) { rdp->dynticks_snap = rcu_dynticks_snap(rdp->dynticks); if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) { - trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("dti")); + trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti")); rcu_gpnum_ovf(rdp->mynode, rdp); return 1; } @@ -1120,7 +1120,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) * of the current RCU grace period. */ if (rcu_dynticks_in_eqs_since(rdp->dynticks, rdp->dynticks_snap)) { - trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("dti")); + trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti")); rdp->dynticks_fqs++; rcu_gpnum_ovf(rnp, rdp); return 1; @@ -1134,20 +1134,20 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) */ jtsq = jiffies_till_sched_qs; ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu); - if (time_after(jiffies, rdp->rsp->gp_start + jtsq) && + if (time_after(jiffies, rcu_state.gp_start + jtsq) && READ_ONCE(rdp->rcu_qs_ctr_snap) != per_cpu(rcu_dynticks.rcu_qs_ctr, rdp->cpu) && rcu_seq_current(&rdp->gp_seq) == rnp->gp_seq && !rdp->gpwrap) { - trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("rqc")); + trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("rqc")); rcu_gpnum_ovf(rnp, rdp); return 1; - } else if (time_after(jiffies, rdp->rsp->gp_start + jtsq)) { + } else if (time_after(jiffies, rcu_state.gp_start + jtsq)) { /* Load rcu_qs_ctr before store to rcu_urgent_qs. */ smp_store_release(ruqp, true); } /* If waiting too long on an offline CPU, complain. */ if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp)) && - time_after(jiffies, rdp->rsp->gp_start + HZ)) { + time_after(jiffies, rcu_state.gp_start + HZ)) { bool onl; struct rcu_node *rnp1; @@ -1185,12 +1185,12 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) */ rnhqp = &per_cpu(rcu_dynticks.rcu_need_heavy_qs, rdp->cpu); if (!READ_ONCE(*rnhqp) && - (time_after(jiffies, rdp->rsp->gp_start + jtsq) || - time_after(jiffies, rdp->rsp->jiffies_resched))) { + (time_after(jiffies, rcu_state.gp_start + jtsq) || + time_after(jiffies, rcu_state.jiffies_resched))) { WRITE_ONCE(*rnhqp, true); /* Store rcu_need_heavy_qs before rcu_urgent_qs. */ smp_store_release(ruqp, true); - rdp->rsp->jiffies_resched += jtsq; /* Re-enable beating. */ + rcu_state.jiffies_resched += jtsq; /* Re-enable beating. */ } /* @@ -1199,7 +1199,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) * see if the CPU is getting hammered with interrupts, but only * once per grace period, just to keep the IPIs down to a dull roar. */ - if (jiffies - rdp->rsp->gp_start > rcu_jiffies_till_stall_check() / 2) { + if (jiffies - rcu_state.gp_start > rcu_jiffies_till_stall_check() / 2) { resched_cpu(rdp->cpu); if (IS_ENABLED(CONFIG_IRQ_WORK) && !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq && @@ -1526,7 +1526,7 @@ void rcu_cpu_stall_reset(void) static void trace_rcu_this_gp(struct rcu_node *rnp, struct rcu_data *rdp, unsigned long gp_seq_req, const char *s) { - trace_rcu_future_grace_period(rdp->rsp->name, rnp->gp_seq, gp_seq_req, + trace_rcu_future_grace_period(rcu_state.name, rnp->gp_seq, gp_seq_req, rnp->level, rnp->grplo, rnp->grphi, s); } @@ -1550,7 +1550,7 @@ static bool rcu_start_this_gp(struct rcu_node *rnp_start, struct rcu_data *rdp, unsigned long gp_seq_req) { bool ret = false; - struct rcu_state *rsp = rdp->rsp; + struct rcu_state *rsp = &rcu_state; struct rcu_node *rnp; /* @@ -3167,8 +3167,7 @@ static void _rcu_barrier_trace(const char *s, int cpu, unsigned long done) */ static void rcu_barrier_callback(struct rcu_head *rhp) { - struct rcu_data *rdp = container_of(rhp, struct rcu_data, barrier_head); - struct rcu_state *rsp = rdp->rsp; + struct rcu_state *rsp = &rcu_state; if (atomic_dec_and_test(&rsp->barrier_cpu_count)) { _rcu_barrier_trace(TPS("LastCB"), -1, rsp->barrier_sequence); @@ -3365,7 +3364,6 @@ rcu_boot_init_percpu_data(int cpu) rdp->rcu_onl_gp_seq = rcu_state.gp_seq; rdp->rcu_onl_gp_flags = RCU_GP_CLEANED; rdp->cpu = cpu; - rdp->rsp = &rcu_state; rcu_boot_init_nocb_percpu_data(rdp); } diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index b21d79bdab23..6f1b1a3fc23d 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -265,7 +265,6 @@ struct rcu_data { short rcu_onl_gp_flags; /* ->gp_flags at last online. */ int cpu; - struct rcu_state *rsp; }; /* Values for nocb_defer_wakeup field in struct rcu_data. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index b60d3df92ff5..5423f9e58494 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -350,7 +350,7 @@ void rcu_note_context_switch(bool preempt) */ WARN_ON_ONCE((rdp->grpmask & rcu_rnp_online_cpus(rnp)) == 0); WARN_ON_ONCE(!list_empty(&t->rcu_node_entry)); - trace_rcu_preempt_task(rdp->rsp->name, + trace_rcu_preempt_task(rcu_state.name, t->pid, (rnp->qsmask & rdp->grpmask) ? rnp->gp_seq @@ -1951,7 +1951,7 @@ static void wake_nocb_leader_defer(struct rcu_data *rdp, int waketype, if (rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT) mod_timer(&rdp->nocb_timer, jiffies + 1); WRITE_ONCE(rdp->nocb_defer_wakeup, waketype); - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, reason); + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, reason); raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); } @@ -2030,7 +2030,7 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp, /* If we are not being polled and there is a kthread, awaken it ... */ t = READ_ONCE(rdp->nocb_kthread); if (rcu_nocb_poll || !t) { - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNotPoll")); return; } @@ -2039,7 +2039,7 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp, if (!irqs_disabled_flags(flags)) { /* ... if queue was empty ... */ wake_nocb_leader(rdp, false); - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeEmpty")); } else { wake_nocb_leader_defer(rdp, RCU_NOCB_WAKE, @@ -2050,7 +2050,7 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp, /* ... or if many callbacks queued. */ if (!irqs_disabled_flags(flags)) { wake_nocb_leader(rdp, true); - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeOvf")); } else { wake_nocb_leader_defer(rdp, RCU_NOCB_WAKE_FORCE, @@ -2058,7 +2058,7 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp, } rdp->qlen_last_fqs_check = LONG_MAX / 2; } else { - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WakeNot")); + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot")); } return; } @@ -2080,12 +2080,12 @@ static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp, return false; __call_rcu_nocb_enqueue(rdp, rhp, &rhp->next, 1, lazy, flags); if (__is_kfree_rcu_offset((unsigned long)rhp->func)) - trace_rcu_kfree_callback(rdp->rsp->name, rhp, + trace_rcu_kfree_callback(rcu_state.name, rhp, (unsigned long)rhp->func, -atomic_long_read(&rdp->nocb_q_count_lazy), -atomic_long_read(&rdp->nocb_q_count)); else - trace_rcu_callback(rdp->rsp->name, rhp, + trace_rcu_callback(rcu_state.name, rhp, -atomic_long_read(&rdp->nocb_q_count_lazy), -atomic_long_read(&rdp->nocb_q_count)); @@ -2135,7 +2135,7 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp) struct rcu_node *rnp = rdp->mynode; local_irq_save(flags); - c = rcu_seq_snap(&rdp->rsp->gp_seq); + c = rcu_seq_snap(&rcu_state.gp_seq); if (!rdp->gpwrap && ULONG_CMP_GE(rdp->gp_seq_needed, c)) { local_irq_restore(flags); } else { @@ -2180,7 +2180,7 @@ wait_again: /* Wait for callbacks to appear. */ if (!rcu_nocb_poll) { - trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, TPS("Sleep")); + trace_rcu_nocb_wake(rcu_state.name, my_rdp->cpu, TPS("Sleep")); swait_event_interruptible_exclusive(my_rdp->nocb_wq, !READ_ONCE(my_rdp->nocb_leader_sleep)); raw_spin_lock_irqsave(&my_rdp->nocb_lock, flags); @@ -2190,7 +2190,7 @@ wait_again: raw_spin_unlock_irqrestore(&my_rdp->nocb_lock, flags); } else if (firsttime) { firsttime = false; /* Don't drown trace log with "Poll"! */ - trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, TPS("Poll")); + trace_rcu_nocb_wake(rcu_state.name, my_rdp->cpu, TPS("Poll")); } /* @@ -2217,7 +2217,7 @@ wait_again: if (rcu_nocb_poll) { schedule_timeout_interruptible(1); } else { - trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, + trace_rcu_nocb_wake(rcu_state.name, my_rdp->cpu, TPS("WokeEmpty")); } goto wait_again; @@ -2262,7 +2262,7 @@ wait_again: static void nocb_follower_wait(struct rcu_data *rdp) { for (;;) { - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("FollowerSleep")); + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FollowerSleep")); swait_event_interruptible_exclusive(rdp->nocb_wq, READ_ONCE(rdp->nocb_follower_head)); if (smp_load_acquire(&rdp->nocb_follower_head)) { @@ -2270,7 +2270,7 @@ static void nocb_follower_wait(struct rcu_data *rdp) return; } WARN_ON(signal_pending(current)); - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WokeEmpty")); + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty")); } } @@ -2305,10 +2305,10 @@ static int rcu_nocb_kthread(void *arg) rdp->nocb_follower_tail = &rdp->nocb_follower_head; raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); BUG_ON(!list); - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WokeNonEmpty")); + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeNonEmpty")); /* Each pass through the following loop invokes a callback. */ - trace_rcu_batch_start(rdp->rsp->name, + trace_rcu_batch_start(rcu_state.name, atomic_long_read(&rdp->nocb_q_count_lazy), atomic_long_read(&rdp->nocb_q_count), -1); c = cl = 0; @@ -2316,23 +2316,23 @@ static int rcu_nocb_kthread(void *arg) next = list->next; /* Wait for enqueuing to complete, if needed. */ while (next == NULL && &list->next != tail) { - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WaitQueue")); schedule_timeout_interruptible(1); - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeQueue")); next = list->next; } debug_rcu_head_unqueue(list); local_bh_disable(); - if (__rcu_reclaim(rdp->rsp->name, list)) + if (__rcu_reclaim(rcu_state.name, list)) cl++; c++; local_bh_enable(); cond_resched_tasks_rcu_qs(); list = next; } - trace_rcu_batch_end(rdp->rsp->name, c, !!list, 0, 0, 1); + trace_rcu_batch_end(rcu_state.name, c, !!list, 0, 0, 1); smp_mb__before_atomic(); /* _add after CB invocation. */ atomic_long_add(-c, &rdp->nocb_q_count); atomic_long_add(-cl, &rdp->nocb_q_count_lazy); @@ -2360,7 +2360,7 @@ static void do_nocb_deferred_wakeup_common(struct rcu_data *rdp) ndw = READ_ONCE(rdp->nocb_defer_wakeup); WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT); __wake_nocb_leader(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags); - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("DeferredWake")); + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake")); } /* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. */ From 564a9ae6046c64d03df0c1c1264094b1a00dccc9 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 4 Jul 2018 14:52:04 -0700 Subject: [PATCH 083/140] rcu: Remove last non-flavor-traversal rsp local variable from tree_plugin.h This commit removes the last non-flavor-traversal rsp local variable from kernel/rcu/tree_plugin.h in favor of &rcu_state. The flavor-traversal locals will be removed with the removal of flavor traversal. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_plugin.h | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 5423f9e58494..59d66ee26310 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -782,7 +782,6 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) */ static void rcu_flavor_check_callbacks(int user) { - struct rcu_state *rsp = &rcu_state; struct task_struct *t = current; if (user || rcu_is_cpu_rrupt_from_idle()) { @@ -806,7 +805,7 @@ static void rcu_flavor_check_callbacks(int user) __this_cpu_read(rcu_data.core_needs_qs) && __this_cpu_read(rcu_data.cpu_no_qs.b.norm) && !t->rcu_read_unlock_special.b.need_qs && - time_after(jiffies, rsp->gp_start + HZ)) + time_after(jiffies, rcu_state.gp_start + HZ)) t->rcu_read_unlock_special.b.need_qs = true; } @@ -1761,12 +1760,11 @@ static void print_cpu_stall_info_begin(void) /* * Print out diagnostic information for the specified stalled CPU. * - * If the specified CPU is aware of the current RCU grace period - * (flavor specified by rsp), then print the number of scheduling - * clock interrupts the CPU has taken during the time that it has - * been aware. Otherwise, print the number of RCU grace periods - * that this CPU is ignorant of, for example, "1" if the CPU was - * aware of the previous grace period. + * If the specified CPU is aware of the current RCU grace period, then + * print the number of scheduling clock interrupts the CPU has taken + * during the time that it has been aware. Otherwise, print the number + * of RCU grace periods that this CPU is ignorant of, for example, "1" + * if the CPU was aware of the previous grace period. * * Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info. */ From b97d23c51c9fee56b0c7598c323ab2846d873f2d Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 4 Jul 2018 15:35:00 -0700 Subject: [PATCH 084/140] rcu: Remove for_each_rcu_flavor() flavor-traversal macro Now that there is only ever a single flavor of RCU in a given kernel build, there isn't a whole lot of point in having a flavor-traversal macro. This commit therefore removes it and converts calls to it to straightline code, inlining trivial functions as appropriate. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 172 ++++++++++++++++----------------------- kernel/rcu/tree.h | 7 -- kernel/rcu/tree_plugin.h | 57 +++++-------- 3 files changed, 91 insertions(+), 145 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index aeff9024bb6c..46a32999020d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -85,8 +85,6 @@ struct rcu_state rcu_state = { .ofl_lock = __SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), }; -LIST_HEAD(rcu_struct_flavors); - /* Dump rcu_node combining tree at boot to verify correct setup. */ static bool dump_tree; module_param(dump_tree, bool, 0444); @@ -568,31 +566,28 @@ void show_rcu_gp_kthreads(void) int cpu; struct rcu_data *rdp; struct rcu_node *rnp; - struct rcu_state *rsp; - for_each_rcu_flavor(rsp) { - pr_info("%s: wait state: %d ->state: %#lx\n", - rsp->name, rsp->gp_state, rsp->gp_kthread->state); - rcu_for_each_node_breadth_first(rnp) { - if (ULONG_CMP_GE(rsp->gp_seq, rnp->gp_seq_needed)) + pr_info("%s: wait state: %d ->state: %#lx\n", rcu_state.name, + rcu_state.gp_state, rcu_state.gp_kthread->state); + rcu_for_each_node_breadth_first(rnp) { + if (ULONG_CMP_GE(rcu_state.gp_seq, rnp->gp_seq_needed)) + continue; + pr_info("\trcu_node %d:%d ->gp_seq %lu ->gp_seq_needed %lu\n", + rnp->grplo, rnp->grphi, rnp->gp_seq, + rnp->gp_seq_needed); + if (!rcu_is_leaf_node(rnp)) + continue; + for_each_leaf_node_possible_cpu(rnp, cpu) { + rdp = per_cpu_ptr(&rcu_data, cpu); + if (rdp->gpwrap || + ULONG_CMP_GE(rcu_state.gp_seq, + rdp->gp_seq_needed)) continue; - pr_info("\trcu_node %d:%d ->gp_seq %lu ->gp_seq_needed %lu\n", - rnp->grplo, rnp->grphi, rnp->gp_seq, - rnp->gp_seq_needed); - if (!rcu_is_leaf_node(rnp)) - continue; - for_each_leaf_node_possible_cpu(rnp, cpu) { - rdp = per_cpu_ptr(&rcu_data, cpu); - if (rdp->gpwrap || - ULONG_CMP_GE(rsp->gp_seq, - rdp->gp_seq_needed)) - continue; - pr_info("\tcpu %d ->gp_seq_needed %lu\n", - cpu, rdp->gp_seq_needed); - } + pr_info("\tcpu %d ->gp_seq_needed %lu\n", + cpu, rdp->gp_seq_needed); } - /* sched_show_task(rsp->gp_kthread); */ } + /* sched_show_task(rcu_state.gp_kthread); */ } EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads); @@ -638,7 +633,6 @@ static struct rcu_node *rcu_get_root(void) */ static void rcu_eqs_enter(bool user) { - struct rcu_state *rsp; struct rcu_data *rdp; struct rcu_dynticks *rdtp; @@ -655,10 +649,8 @@ static void rcu_eqs_enter(bool user) lockdep_assert_irqs_disabled(); trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0, rdtp->dynticks); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); - for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(&rcu_data); - do_nocb_deferred_wakeup(rdp); - } + rdp = this_cpu_ptr(&rcu_data); + do_nocb_deferred_wakeup(rdp); rcu_prepare_for_idle(); rcu_preempt_deferred_qs(current); WRITE_ONCE(rdtp->dynticks_nesting, 0); /* Avoid irq-access tearing. */ @@ -1024,21 +1016,17 @@ bool rcu_lockdep_current_cpu_online(void) { struct rcu_data *rdp; struct rcu_node *rnp; - struct rcu_state *rsp; + bool ret = false; if (in_nmi() || !rcu_scheduler_fully_active) return true; preempt_disable(); - for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(&rcu_data); - rnp = rdp->mynode; - if (rdp->grpmask & rcu_rnp_online_cpus(rnp)) { - preempt_enable(); - return true; - } - } + rdp = this_cpu_ptr(&rcu_data); + rnp = rdp->mynode; + if (rdp->grpmask & rcu_rnp_online_cpus(rnp)) + ret = true; preempt_enable(); - return false; + return ret; } EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online); @@ -1516,10 +1504,7 @@ static void check_cpu_stall(struct rcu_data *rdp) */ void rcu_cpu_stall_reset(void) { - struct rcu_state *rsp; - - for_each_rcu_flavor(rsp) - WRITE_ONCE(rsp->jiffies_stall, jiffies + ULONG_MAX / 2); + WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2); } /* Trace-event wrapper function for trace_rcu_future_grace_period. */ @@ -3134,17 +3119,12 @@ static bool rcu_cpu_has_callbacks(bool *all_lazy) bool al = true; bool hc = false; struct rcu_data *rdp; - struct rcu_state *rsp; - for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(&rcu_data); - if (rcu_segcblist_empty(&rdp->cblist)) - continue; + rdp = this_cpu_ptr(&rcu_data); + if (!rcu_segcblist_empty(&rdp->cblist)) { hc = true; - if (rcu_segcblist_n_nonlazy_cbs(&rdp->cblist) || !all_lazy) { + if (rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)) al = false; - break; - } } if (all_lazy) *all_lazy = al; @@ -3436,15 +3416,12 @@ int rcutree_online_cpu(unsigned int cpu) unsigned long flags; struct rcu_data *rdp; struct rcu_node *rnp; - struct rcu_state *rsp; - for_each_rcu_flavor(rsp) { - rdp = per_cpu_ptr(&rcu_data, cpu); - rnp = rdp->mynode; - raw_spin_lock_irqsave_rcu_node(rnp, flags); - rnp->ffmask |= rdp->grpmask; - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - } + rdp = per_cpu_ptr(&rcu_data, cpu); + rnp = rdp->mynode; + raw_spin_lock_irqsave_rcu_node(rnp, flags); + rnp->ffmask |= rdp->grpmask; + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); if (IS_ENABLED(CONFIG_TREE_SRCU)) srcu_online_cpu(cpu); if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE) @@ -3463,15 +3440,12 @@ int rcutree_offline_cpu(unsigned int cpu) unsigned long flags; struct rcu_data *rdp; struct rcu_node *rnp; - struct rcu_state *rsp; - for_each_rcu_flavor(rsp) { - rdp = per_cpu_ptr(&rcu_data, cpu); - rnp = rdp->mynode; - raw_spin_lock_irqsave_rcu_node(rnp, flags); - rnp->ffmask &= ~rdp->grpmask; - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - } + rdp = per_cpu_ptr(&rcu_data, cpu); + rnp = rdp->mynode; + raw_spin_lock_irqsave_rcu_node(rnp, flags); + rnp->ffmask &= ~rdp->grpmask; + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); rcutree_affinity_setting(cpu, cpu); if (IS_ENABLED(CONFIG_TREE_SRCU)) @@ -3500,34 +3474,32 @@ void rcu_cpu_starting(unsigned int cpu) unsigned long oldmask; struct rcu_data *rdp; struct rcu_node *rnp; - struct rcu_state *rsp; + struct rcu_state *rsp = &rcu_state; if (per_cpu(rcu_cpu_started, cpu)) return; per_cpu(rcu_cpu_started, cpu) = 1; - for_each_rcu_flavor(rsp) { - rdp = per_cpu_ptr(&rcu_data, cpu); - rnp = rdp->mynode; - mask = rdp->grpmask; - raw_spin_lock_irqsave_rcu_node(rnp, flags); - rnp->qsmaskinitnext |= mask; - oldmask = rnp->expmaskinitnext; - rnp->expmaskinitnext |= mask; - oldmask ^= rnp->expmaskinitnext; - nbits = bitmap_weight(&oldmask, BITS_PER_LONG); - /* Allow lockless access for expedited grace periods. */ - smp_store_release(&rsp->ncpus, rsp->ncpus + nbits); /* ^^^ */ - rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */ - rdp->rcu_onl_gp_seq = READ_ONCE(rsp->gp_seq); - rdp->rcu_onl_gp_flags = READ_ONCE(rsp->gp_flags); - if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */ - /* Report QS -after- changing ->qsmaskinitnext! */ - rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); - } else { - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - } + rdp = per_cpu_ptr(&rcu_data, cpu); + rnp = rdp->mynode; + mask = rdp->grpmask; + raw_spin_lock_irqsave_rcu_node(rnp, flags); + rnp->qsmaskinitnext |= mask; + oldmask = rnp->expmaskinitnext; + rnp->expmaskinitnext |= mask; + oldmask ^= rnp->expmaskinitnext; + nbits = bitmap_weight(&oldmask, BITS_PER_LONG); + /* Allow lockless access for expedited grace periods. */ + smp_store_release(&rsp->ncpus, rsp->ncpus + nbits); /* ^^^ */ + rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */ + rdp->rcu_onl_gp_seq = READ_ONCE(rsp->gp_seq); + rdp->rcu_onl_gp_flags = READ_ONCE(rsp->gp_flags); + if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */ + /* Report QS -after- changing ->qsmaskinitnext! */ + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); + } else { + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } smp_mb(); /* Ensure RCU read-side usage follows above initialization. */ } @@ -3644,7 +3616,6 @@ static int __init rcu_spawn_gp_kthread(void) unsigned long flags; int kthread_prio_in = kthread_prio; struct rcu_node *rnp; - struct rcu_state *rsp; struct sched_param sp; struct task_struct *t; @@ -3664,19 +3635,17 @@ static int __init rcu_spawn_gp_kthread(void) kthread_prio, kthread_prio_in); rcu_scheduler_fully_active = 1; - for_each_rcu_flavor(rsp) { - t = kthread_create(rcu_gp_kthread, NULL, "%s", rsp->name); - BUG_ON(IS_ERR(t)); - rnp = rcu_get_root(); - raw_spin_lock_irqsave_rcu_node(rnp, flags); - rsp->gp_kthread = t; - if (kthread_prio) { - sp.sched_priority = kthread_prio; - sched_setscheduler_nocheck(t, SCHED_FIFO, &sp); - } - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - wake_up_process(t); + t = kthread_create(rcu_gp_kthread, NULL, "%s", rcu_state.name); + BUG_ON(IS_ERR(t)); + rnp = rcu_get_root(); + raw_spin_lock_irqsave_rcu_node(rnp, flags); + rcu_state.gp_kthread = t; + if (kthread_prio) { + sp.sched_priority = kthread_prio; + sched_setscheduler_nocheck(t, SCHED_FIFO, &sp); } + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + wake_up_process(t); rcu_spawn_nocb_kthreads(); rcu_spawn_boost_kthreads(); return 0; @@ -3782,7 +3751,6 @@ static void __init rcu_init_one(void) per_cpu_ptr(&rcu_data, i)->mynode = rnp; rcu_boot_init_percpu_data(i); } - list_add(&rsp->flavors, &rcu_struct_flavors); } /* diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 6f1b1a3fc23d..8abc15c42d84 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -360,7 +360,6 @@ struct rcu_state { /* jiffies. */ const char *name; /* Name of structure. */ char abbr; /* Abbreviated name. */ - struct list_head flavors; /* List of RCU flavors. */ spinlock_t ofl_lock ____cacheline_internodealigned_in_smp; /* Synchronize offline with */ @@ -417,12 +416,6 @@ static const char *tp_rcu_varname __used __tracepoint_string = rcu_name; #define RCU_NAME rcu_name #endif /* #else #ifdef CONFIG_TRACING */ -extern struct list_head rcu_struct_flavors; - -/* Sequence through rcu_state structures for each RCU flavor. */ -#define for_each_rcu_flavor(rsp) \ - list_for_each_entry((rsp), &rcu_struct_flavors, flavors) - /* * RCU implementation internal declarations: */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 59d66ee26310..878a1d2cd465 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1561,31 +1561,28 @@ static bool __maybe_unused rcu_try_advance_all_cbs(void) struct rcu_data *rdp; struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); struct rcu_node *rnp; - struct rcu_state *rsp; /* Exit early if we advanced recently. */ if (jiffies == rdtp->last_advance_all) return false; rdtp->last_advance_all = jiffies; - for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(&rcu_data); - rnp = rdp->mynode; + rdp = this_cpu_ptr(&rcu_data); + rnp = rdp->mynode; - /* - * Don't bother checking unless a grace period has - * completed since we last checked and there are - * callbacks not yet ready to invoke. - */ - if ((rcu_seq_completed_gp(rdp->gp_seq, - rcu_seq_current(&rnp->gp_seq)) || - unlikely(READ_ONCE(rdp->gpwrap))) && - rcu_segcblist_pend_cbs(&rdp->cblist)) - note_gp_changes(rdp); + /* + * Don't bother checking unless a grace period has + * completed since we last checked and there are + * callbacks not yet ready to invoke. + */ + if ((rcu_seq_completed_gp(rdp->gp_seq, + rcu_seq_current(&rnp->gp_seq)) || + unlikely(READ_ONCE(rdp->gpwrap))) && + rcu_segcblist_pend_cbs(&rdp->cblist)) + note_gp_changes(rdp); - if (rcu_segcblist_ready_cbs(&rdp->cblist)) - cbs_ready = true; - } + if (rcu_segcblist_ready_cbs(&rdp->cblist)) + cbs_ready = true; return cbs_ready; } @@ -1648,7 +1645,6 @@ static void rcu_prepare_for_idle(void) struct rcu_data *rdp; struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); struct rcu_node *rnp; - struct rcu_state *rsp; int tne; lockdep_assert_irqs_disabled(); @@ -1686,10 +1682,8 @@ static void rcu_prepare_for_idle(void) if (rdtp->last_accelerate == jiffies) return; rdtp->last_accelerate = jiffies; - for_each_rcu_flavor(rsp) { - rdp = this_cpu_ptr(&rcu_data); - if (!rcu_segcblist_pend_cbs(&rdp->cblist)) - continue; + rdp = this_cpu_ptr(&rcu_data); + if (rcu_segcblist_pend_cbs(&rdp->cblist)) { rnp = rdp->mynode; raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */ needwake = rcu_accelerate_cbs(rnp, rdp); @@ -1824,10 +1818,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp) /* Increment ->ticks_this_gp for all flavors of RCU. */ static void increment_cpu_stall_ticks(void) { - struct rcu_state *rsp; - - for_each_rcu_flavor(rsp) - raw_cpu_inc(rcu_data.ticks_this_gp); + raw_cpu_inc(rcu_data.ticks_this_gp); } #ifdef CONFIG_RCU_NOCB_CPU @@ -2384,7 +2375,6 @@ void __init rcu_init_nohz(void) { int cpu; bool need_rcu_nocb_mask = false; - struct rcu_state *rsp; #if defined(CONFIG_NO_HZ_FULL) if (tick_nohz_full_running && cpumask_weight(tick_nohz_full_mask)) @@ -2418,11 +2408,9 @@ void __init rcu_init_nohz(void) if (rcu_nocb_poll) pr_info("\tPoll for callbacks from no-CBs CPUs.\n"); - for_each_rcu_flavor(rsp) { - for_each_cpu(cpu, rcu_nocb_mask) - init_nocb_callback_list(per_cpu_ptr(&rcu_data, cpu)); - rcu_organize_nocb_kthreads(); - } + for_each_cpu(cpu, rcu_nocb_mask) + init_nocb_callback_list(per_cpu_ptr(&rcu_data, cpu)); + rcu_organize_nocb_kthreads(); } /* Initialize per-rcu_data variables for no-CBs CPUs. */ @@ -2489,11 +2477,8 @@ static void rcu_spawn_one_nocb_kthread(int cpu) */ static void rcu_spawn_all_nocb_kthreads(int cpu) { - struct rcu_state *rsp; - if (rcu_scheduler_fully_active) - for_each_rcu_flavor(rsp) - rcu_spawn_one_nocb_kthread(cpu); + rcu_spawn_one_nocb_kthread(cpu); } /* From f7dd7d44fd2db80bfb2c5f81e67b5404b4735312 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 4 Jul 2018 15:39:40 -0700 Subject: [PATCH 085/140] rcu: Simplify rcutorture_get_gp_data() This commit restructures rcutorture_get_gp_data() to take advantage of the fact that there is only one flavor of RCU. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 46a32999020d..254c78377c22 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -597,21 +597,16 @@ EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads); void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, unsigned long *gp_seq) { - struct rcu_state *rsp = NULL; - switch (test_type) { case RCU_FLAVOR: case RCU_BH_FLAVOR: case RCU_SCHED_FLAVOR: - rsp = &rcu_state; + *flags = READ_ONCE(rcu_state.gp_flags); + *gp_seq = rcu_seq_current(&rcu_state.gp_seq); break; default: break; } - if (rsp == NULL) - return; - *flags = READ_ONCE(rsp->gp_flags); - *gp_seq = rcu_seq_current(&rsp->gp_seq); } EXPORT_SYMBOL_GPL(rcutorture_get_gp_data); From 7cba4775ba79d8da5775339f6a4769762626bcfd Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 4 Jul 2018 18:25:59 -0700 Subject: [PATCH 086/140] rcu: Restructure rcu_check_gp_kthread_starvation() This commit removes the rsp and gpa local variables, repurposes the j local variable and adds a gpk (GP kthread) local to improve readability. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 25 +++++++++++-------------- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 254c78377c22..4c920e2e729d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1225,24 +1225,21 @@ static const char *gp_state_getname(short gs) */ static void rcu_check_gp_kthread_starvation(void) { - unsigned long gpa; + struct task_struct *gpk = rcu_state.gp_kthread; unsigned long j; - struct rcu_state *rsp = &rcu_state; - j = jiffies; - gpa = READ_ONCE(rsp->gp_activity); - if (j - gpa > 2 * HZ) { + j = jiffies - READ_ONCE(rcu_state.gp_activity); + if (j > 2 * HZ) { pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n", - rsp->name, j - gpa, - (long)rcu_seq_current(&rsp->gp_seq), - rsp->gp_flags, - gp_state_getname(rsp->gp_state), rsp->gp_state, - rsp->gp_kthread ? rsp->gp_kthread->state : ~0, - rsp->gp_kthread ? task_cpu(rsp->gp_kthread) : -1); - if (rsp->gp_kthread) { + rcu_state.name, j, + (long)rcu_seq_current(&rcu_state.gp_seq), + rcu_state.gp_flags, + gp_state_getname(rcu_state.gp_state), rcu_state.gp_state, + gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1); + if (gpk) { pr_err("RCU grace-period kthread stack dump:\n"); - sched_show_task(rsp->gp_kthread); - wake_up_process(rsp->gp_kthread); + sched_show_task(gpk); + wake_up_process(gpk); } } } From 4c6ed43708bbd53112f3a455bf7fe0d224167943 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 00:02:29 -0700 Subject: [PATCH 087/140] rcu: Eliminate stall-warning use of rsp Now that there is only one rcu_state structure, there is less point in maintaining a pointer to it. This commit therefore replaces rsp with &rcu_state in print_other_cpu_stall(), print_cpu_stall(), and check_cpu_stall(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 80 +++++++++++++++++++++++------------------------ 1 file changed, 39 insertions(+), 41 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 4c920e2e729d..2f6fd076d8e6 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1273,17 +1273,17 @@ static void rcu_dump_cpu_stacks(void) static void rcu_stall_kick_kthreads(void) { unsigned long j; - struct rcu_state *rsp = &rcu_state; if (!rcu_kick_kthreads) return; - j = READ_ONCE(rsp->jiffies_kick_kthreads); - if (time_after(jiffies, j) && rsp->gp_kthread && - (rcu_gp_in_progress() || READ_ONCE(rsp->gp_flags))) { - WARN_ONCE(1, "Kicking %s grace-period kthread\n", rsp->name); + j = READ_ONCE(rcu_state.jiffies_kick_kthreads); + if (time_after(jiffies, j) && rcu_state.gp_kthread && + (rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) { + WARN_ONCE(1, "Kicking %s grace-period kthread\n", + rcu_state.name); rcu_ftrace_dump(DUMP_ALL); - wake_up_process(rsp->gp_kthread); - WRITE_ONCE(rsp->jiffies_kick_kthreads, j + HZ); + wake_up_process(rcu_state.gp_kthread); + WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ); } } @@ -1301,7 +1301,6 @@ static void print_other_cpu_stall(unsigned long gp_seq) unsigned long j; int ndetected = 0; struct rcu_node *rnp = rcu_get_root(); - struct rcu_state *rsp = &rcu_state; long totqlen = 0; /* Kick and suppress, if so configured. */ @@ -1314,7 +1313,7 @@ static void print_other_cpu_stall(unsigned long gp_seq) * See Documentation/RCU/stallwarn.txt for info on how to debug * RCU CPU stall warnings. */ - pr_err("INFO: %s detected stalls on CPUs/tasks:", rsp->name); + pr_err("INFO: %s detected stalls on CPUs/tasks:", rcu_state.name); print_cpu_stall_info_begin(); rcu_for_each_leaf_node(rnp) { raw_spin_lock_irqsave_rcu_node(rnp, flags); @@ -1334,21 +1333,21 @@ static void print_other_cpu_stall(unsigned long gp_seq) totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(&rcu_data, cpu)->cblist); pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n", - smp_processor_id(), (long)(jiffies - rsp->gp_start), - (long)rcu_seq_current(&rsp->gp_seq), totqlen); + smp_processor_id(), (long)(jiffies - rcu_state.gp_start), + (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); if (ndetected) { rcu_dump_cpu_stacks(); /* Complain about tasks blocking the grace period. */ rcu_print_detail_task_stall(); } else { - if (rcu_seq_current(&rsp->gp_seq) != gp_seq) { + if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) { pr_err("INFO: Stall ended before state dump start\n"); } else { j = jiffies; - gpa = READ_ONCE(rsp->gp_activity); + gpa = READ_ONCE(rcu_state.gp_activity); pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n", - rsp->name, j - gpa, j, gpa, + rcu_state.name, j - gpa, j, gpa, jiffies_till_next_fqs, rcu_get_root()->qsmask); /* In this case, the current CPU might be at fault. */ @@ -1356,8 +1355,8 @@ static void print_other_cpu_stall(unsigned long gp_seq) } } /* Rewrite if needed in case of slow consoles. */ - if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) - WRITE_ONCE(rsp->jiffies_stall, + if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall))) + WRITE_ONCE(rcu_state.jiffies_stall, jiffies + 3 * rcu_jiffies_till_stall_check() + 3); rcu_check_gp_kthread_starvation(); @@ -1373,7 +1372,6 @@ static void print_cpu_stall(void) unsigned long flags; struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_node *rnp = rcu_get_root(); - struct rcu_state *rsp = &rcu_state; long totqlen = 0; /* Kick and suppress, if so configured. */ @@ -1386,7 +1384,7 @@ static void print_cpu_stall(void) * See Documentation/RCU/stallwarn.txt for info on how to debug * RCU CPU stall warnings. */ - pr_err("INFO: %s self-detected stall on CPU", rsp->name); + pr_err("INFO: %s self-detected stall on CPU", rcu_state.name); print_cpu_stall_info_begin(); raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags); print_cpu_stall_info(smp_processor_id()); @@ -1396,8 +1394,8 @@ static void print_cpu_stall(void) totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(&rcu_data, cpu)->cblist); pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n", - jiffies - rsp->gp_start, - (long)rcu_seq_current(&rsp->gp_seq), totqlen); + jiffies - rcu_state.gp_start, + (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); rcu_check_gp_kthread_starvation(); @@ -1405,8 +1403,8 @@ static void print_cpu_stall(void) raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Rewrite if needed in case of slow consoles. */ - if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) - WRITE_ONCE(rsp->jiffies_stall, + if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall))) + WRITE_ONCE(rcu_state.jiffies_stall, jiffies + 3 * rcu_jiffies_till_stall_check() + 3); raw_spin_unlock_irqrestore_rcu_node(rnp, flags); @@ -1431,7 +1429,6 @@ static void check_cpu_stall(struct rcu_data *rdp) unsigned long jn; unsigned long js; struct rcu_node *rnp; - struct rcu_state *rsp = &rcu_state; if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || !rcu_gp_in_progress()) @@ -1442,27 +1439,28 @@ static void check_cpu_stall(struct rcu_data *rdp) /* * Lots of memory barriers to reject false positives. * - * The idea is to pick up rsp->gp_seq, then rsp->jiffies_stall, - * then rsp->gp_start, and finally another copy of rsp->gp_seq. - * These values are updated in the opposite order with memory - * barriers (or equivalent) during grace-period initialization - * and cleanup. Now, a false positive can occur if we get an new - * value of rsp->gp_start and a old value of rsp->jiffies_stall. - * But given the memory barriers, the only way that this can happen - * is if one grace period ends and another starts between these - * two fetches. This is detected by comparing the second fetch - * of rsp->gp_seq with the previous fetch from rsp->gp_seq. + * The idea is to pick up rcu_state.gp_seq, then + * rcu_state.jiffies_stall, then rcu_state.gp_start, and finally + * another copy of rcu_state.gp_seq. These values are updated in + * the opposite order with memory barriers (or equivalent) during + * grace-period initialization and cleanup. Now, a false positive + * can occur if we get an new value of rcu_state.gp_start and a old + * value of rcu_state.jiffies_stall. But given the memory barriers, + * the only way that this can happen is if one grace period ends + * and another starts between these two fetches. This is detected + * by comparing the second fetch of rcu_state.gp_seq with the + * previous fetch from rcu_state.gp_seq. * - * Given this check, comparisons of jiffies, rsp->jiffies_stall, - * and rsp->gp_start suffice to forestall false positives. + * Given this check, comparisons of jiffies, rcu_state.jiffies_stall, + * and rcu_state.gp_start suffice to forestall false positives. */ - gs1 = READ_ONCE(rsp->gp_seq); + gs1 = READ_ONCE(rcu_state.gp_seq); smp_rmb(); /* Pick up ->gp_seq first... */ - js = READ_ONCE(rsp->jiffies_stall); + js = READ_ONCE(rcu_state.jiffies_stall); smp_rmb(); /* ...then ->jiffies_stall before the rest... */ - gps = READ_ONCE(rsp->gp_start); + gps = READ_ONCE(rcu_state.gp_start); smp_rmb(); /* ...and finally ->gp_start before ->gp_seq again. */ - gs2 = READ_ONCE(rsp->gp_seq); + gs2 = READ_ONCE(rcu_state.gp_seq); if (gs1 != gs2 || ULONG_CMP_LT(j, js) || ULONG_CMP_GE(gps, js)) @@ -1471,14 +1469,14 @@ static void check_cpu_stall(struct rcu_data *rdp) jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; if (rcu_gp_in_progress() && (READ_ONCE(rnp->qsmask) & rdp->grpmask) && - cmpxchg(&rsp->jiffies_stall, js, jn) == js) { + cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) { /* We haven't checked in, so go dump stack. */ print_cpu_stall(); } else if (rcu_gp_in_progress() && ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && - cmpxchg(&rsp->jiffies_stall, js, jn) == js) { + cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) { /* They had a few time units to dump stack, so complain. */ print_other_cpu_stall(gs2); From 9cbc5b97029bff2db7fb413d6ce588d38373834c Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 15:47:01 -0700 Subject: [PATCH 088/140] rcu: Eliminate grace-period management code use of rsp Now that there is only one rcu_state structure, there is less point in maintaining a pointer to it. This commit therefore replaces rsp with &rcu_state in rcu_start_this_gp(), rcu_accelerate_cbs(), __note_gp_changes(), rcu_gp_init(), rcu_gp_fqs(), rcu_gp_cleanup(), rcu_gp_kthread(), and rcu_report_qs_rsp(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 168 ++++++++++++++++++++++------------------------ 1 file changed, 82 insertions(+), 86 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 2f6fd076d8e6..88915372ba38 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1525,7 +1525,6 @@ static bool rcu_start_this_gp(struct rcu_node *rnp_start, struct rcu_data *rdp, unsigned long gp_seq_req) { bool ret = false; - struct rcu_state *rsp = &rcu_state; struct rcu_node *rnp; /* @@ -1574,13 +1573,13 @@ static bool rcu_start_this_gp(struct rcu_node *rnp_start, struct rcu_data *rdp, goto unlock_out; } trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("Startedroot")); - WRITE_ONCE(rsp->gp_flags, rsp->gp_flags | RCU_GP_FLAG_INIT); - rsp->gp_req_activity = jiffies; - if (!rsp->gp_kthread) { + WRITE_ONCE(rcu_state.gp_flags, rcu_state.gp_flags | RCU_GP_FLAG_INIT); + rcu_state.gp_req_activity = jiffies; + if (!rcu_state.gp_kthread) { trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("NoGPkthread")); goto unlock_out; } - trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq), TPS("newreq")); + trace_rcu_grace_period(rcu_state.name, READ_ONCE(rcu_state.gp_seq), TPS("newreq")); ret = true; /* Caller must wake GP kthread. */ unlock_out: /* Push furthest requested GP to leaf node and rcu_data structure. */ @@ -1642,7 +1641,6 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp) { unsigned long gp_seq_req; bool ret = false; - struct rcu_state *rsp = &rcu_state; raw_lockdep_assert_held_rcu_node(rnp); @@ -1660,15 +1658,15 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp) * accelerating callback invocation to an earlier grace-period * number. */ - gp_seq_req = rcu_seq_snap(&rsp->gp_seq); + gp_seq_req = rcu_seq_snap(&rcu_state.gp_seq); if (rcu_segcblist_accelerate(&rdp->cblist, gp_seq_req)) ret = rcu_start_this_gp(rnp, rdp, gp_seq_req); /* Trace depending on how much we were able to accelerate. */ if (rcu_segcblist_restempty(&rdp->cblist, RCU_WAIT_TAIL)) - trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("AccWaitCB")); + trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("AccWaitCB")); else - trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("AccReadyCB")); + trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("AccReadyCB")); return ret; } @@ -1737,7 +1735,6 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) { bool ret; bool need_gp; - struct rcu_state __maybe_unused *rsp = &rcu_state; raw_lockdep_assert_held_rcu_node(rnp); @@ -1748,7 +1745,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) if (rcu_seq_completed_gp(rdp->gp_seq, rnp->gp_seq) || unlikely(READ_ONCE(rdp->gpwrap))) { ret = rcu_advance_cbs(rnp, rdp); /* Advance callbacks. */ - trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuend")); + trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuend")); } else { ret = rcu_accelerate_cbs(rnp, rdp); /* Recent callbacks. */ } @@ -1761,7 +1758,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) * set up to detect a quiescent state, otherwise don't * go looking for one. */ - trace_rcu_grace_period(rsp->name, rnp->gp_seq, TPS("cpustart")); + trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, TPS("cpustart")); need_gp = !!(rnp->qsmask & rdp->grpmask); rdp->cpu_no_qs.b.norm = need_gp; rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_dynticks.rcu_qs_ctr); @@ -1814,16 +1811,15 @@ static bool rcu_gp_init(void) unsigned long mask; struct rcu_data *rdp; struct rcu_node *rnp = rcu_get_root(); - struct rcu_state *rsp = &rcu_state; - WRITE_ONCE(rsp->gp_activity, jiffies); + WRITE_ONCE(rcu_state.gp_activity, jiffies); raw_spin_lock_irq_rcu_node(rnp); - if (!READ_ONCE(rsp->gp_flags)) { + if (!READ_ONCE(rcu_state.gp_flags)) { /* Spurious wakeup, tell caller to go back to sleep. */ raw_spin_unlock_irq_rcu_node(rnp); return false; } - WRITE_ONCE(rsp->gp_flags, 0); /* Clear all flags: New grace period. */ + WRITE_ONCE(rcu_state.gp_flags, 0); /* Clear all flags: New GP. */ if (WARN_ON_ONCE(rcu_gp_in_progress())) { /* @@ -1837,8 +1833,8 @@ static bool rcu_gp_init(void) /* Advance to a new grace period and initialize state. */ record_gp_stall_check_time(); /* Record GP times before starting GP, hence rcu_seq_start(). */ - rcu_seq_start(&rsp->gp_seq); - trace_rcu_grace_period(rsp->name, rsp->gp_seq, TPS("start")); + rcu_seq_start(&rcu_state.gp_seq); + trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start")); raw_spin_unlock_irq_rcu_node(rnp); /* @@ -1847,15 +1843,15 @@ static bool rcu_gp_init(void) * for subsequent online CPUs, and that quiescent-state forcing * will handle subsequent offline CPUs. */ - rsp->gp_state = RCU_GP_ONOFF; + rcu_state.gp_state = RCU_GP_ONOFF; rcu_for_each_leaf_node(rnp) { - spin_lock(&rsp->ofl_lock); + spin_lock(&rcu_state.ofl_lock); raw_spin_lock_irq_rcu_node(rnp); if (rnp->qsmaskinit == rnp->qsmaskinitnext && !rnp->wait_blkd_tasks) { /* Nothing to do on this leaf rcu_node structure. */ raw_spin_unlock_irq_rcu_node(rnp); - spin_unlock(&rsp->ofl_lock); + spin_unlock(&rcu_state.ofl_lock); continue; } @@ -1891,34 +1887,34 @@ static bool rcu_gp_init(void) } raw_spin_unlock_irq_rcu_node(rnp); - spin_unlock(&rsp->ofl_lock); + spin_unlock(&rcu_state.ofl_lock); } rcu_gp_slow(gp_preinit_delay); /* Races with CPU hotplug. */ /* * Set the quiescent-state-needed bits in all the rcu_node - * structures for all currently online CPUs in breadth-first order, - * starting from the root rcu_node structure, relying on the layout - * of the tree within the rsp->node[] array. Note that other CPUs - * will access only the leaves of the hierarchy, thus seeing that no - * grace period is in progress, at least until the corresponding - * leaf node has been initialized. + * structures for all currently online CPUs in breadth-first + * order, starting from the root rcu_node structure, relying on the + * layout of the tree within the rcu_state.node[] array. Note that + * other CPUs will access only the leaves of the hierarchy, thus + * seeing that no grace period is in progress, at least until the + * corresponding leaf node has been initialized. * * The grace period cannot complete until the initialization * process finishes, because this kthread handles both. */ - rsp->gp_state = RCU_GP_INIT; + rcu_state.gp_state = RCU_GP_INIT; rcu_for_each_node_breadth_first(rnp) { rcu_gp_slow(gp_init_delay); raw_spin_lock_irqsave_rcu_node(rnp, flags); rdp = this_cpu_ptr(&rcu_data); rcu_preempt_check_blocked_tasks(rnp); rnp->qsmask = rnp->qsmaskinit; - WRITE_ONCE(rnp->gp_seq, rsp->gp_seq); + WRITE_ONCE(rnp->gp_seq, rcu_state.gp_seq); if (rnp == rdp->mynode) (void)__note_gp_changes(rnp, rdp); rcu_preempt_boost_start_gp(rnp); - trace_rcu_grace_period_init(rsp->name, rnp->gp_seq, + trace_rcu_grace_period_init(rcu_state.name, rnp->gp_seq, rnp->level, rnp->grplo, rnp->grphi, rnp->qsmask); /* Quiescent states for tasks on any now-offline CPUs. */ @@ -1929,7 +1925,7 @@ static bool rcu_gp_init(void) else raw_spin_unlock_irq_rcu_node(rnp); cond_resched_tasks_rcu_qs(); - WRITE_ONCE(rsp->gp_activity, jiffies); + WRITE_ONCE(rcu_state.gp_activity, jiffies); } return true; @@ -1961,10 +1957,9 @@ static bool rcu_gp_fqs_check_wake(int *gfp) static void rcu_gp_fqs(bool first_time) { struct rcu_node *rnp = rcu_get_root(); - struct rcu_state *rsp = &rcu_state; - WRITE_ONCE(rsp->gp_activity, jiffies); - rsp->n_force_qs++; + WRITE_ONCE(rcu_state.gp_activity, jiffies); + rcu_state.n_force_qs++; if (first_time) { /* Collect dyntick-idle snapshots. */ force_qs_rnp(dyntick_save_progress_counter); @@ -1973,10 +1968,10 @@ static void rcu_gp_fqs(bool first_time) force_qs_rnp(rcu_implicit_dynticks_qs); } /* Clear flag to prevent immediate re-entry. */ - if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) { + if (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) { raw_spin_lock_irq_rcu_node(rnp); - WRITE_ONCE(rsp->gp_flags, - READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS); + WRITE_ONCE(rcu_state.gp_flags, + READ_ONCE(rcu_state.gp_flags) & ~RCU_GP_FLAG_FQS); raw_spin_unlock_irq_rcu_node(rnp); } } @@ -1991,14 +1986,13 @@ static void rcu_gp_cleanup(void) unsigned long new_gp_seq; struct rcu_data *rdp; struct rcu_node *rnp = rcu_get_root(); - struct rcu_state *rsp = &rcu_state; struct swait_queue_head *sq; - WRITE_ONCE(rsp->gp_activity, jiffies); + WRITE_ONCE(rcu_state.gp_activity, jiffies); raw_spin_lock_irq_rcu_node(rnp); - gp_duration = jiffies - rsp->gp_start; - if (gp_duration > rsp->gp_max) - rsp->gp_max = gp_duration; + gp_duration = jiffies - rcu_state.gp_start; + if (gp_duration > rcu_state.gp_max) + rcu_state.gp_max = gp_duration; /* * We know the grace period is complete, but to everyone else @@ -2019,7 +2013,7 @@ static void rcu_gp_cleanup(void) * the rcu_node structures before the beginning of the next grace * period is recorded in any of the rcu_node structures. */ - new_gp_seq = rsp->gp_seq; + new_gp_seq = rcu_state.gp_seq; rcu_seq_end(&new_gp_seq); rcu_for_each_node_breadth_first(rnp) { raw_spin_lock_irq_rcu_node(rnp); @@ -2036,16 +2030,16 @@ static void rcu_gp_cleanup(void) raw_spin_unlock_irq_rcu_node(rnp); rcu_nocb_gp_cleanup(sq); cond_resched_tasks_rcu_qs(); - WRITE_ONCE(rsp->gp_activity, jiffies); + WRITE_ONCE(rcu_state.gp_activity, jiffies); rcu_gp_slow(gp_cleanup_delay); } rnp = rcu_get_root(); - raw_spin_lock_irq_rcu_node(rnp); /* GP before rsp->gp_seq update. */ + raw_spin_lock_irq_rcu_node(rnp); /* GP before ->gp_seq update. */ /* Declare grace period done. */ - rcu_seq_end(&rsp->gp_seq); - trace_rcu_grace_period(rsp->name, rsp->gp_seq, TPS("end")); - rsp->gp_state = RCU_GP_IDLE; + rcu_seq_end(&rcu_state.gp_seq); + trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("end")); + rcu_state.gp_state = RCU_GP_IDLE; /* Check for GP requests since above loop. */ rdp = this_cpu_ptr(&rcu_data); if (!needgp && ULONG_CMP_LT(rnp->gp_seq, rnp->gp_seq_needed)) { @@ -2055,12 +2049,14 @@ static void rcu_gp_cleanup(void) } /* Advance CBs to reduce false positives below. */ if (!rcu_accelerate_cbs(rnp, rdp) && needgp) { - WRITE_ONCE(rsp->gp_flags, RCU_GP_FLAG_INIT); - rsp->gp_req_activity = jiffies; - trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq), + WRITE_ONCE(rcu_state.gp_flags, RCU_GP_FLAG_INIT); + rcu_state.gp_req_activity = jiffies; + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), TPS("newreq")); } else { - WRITE_ONCE(rsp->gp_flags, rsp->gp_flags & RCU_GP_FLAG_INIT); + WRITE_ONCE(rcu_state.gp_flags, + rcu_state.gp_flags & RCU_GP_FLAG_INIT); } raw_spin_unlock_irq_rcu_node(rnp); } @@ -2074,7 +2070,6 @@ static int __noreturn rcu_gp_kthread(void *unused) int gf; unsigned long j; int ret; - struct rcu_state *rsp = &rcu_state; struct rcu_node *rnp = rcu_get_root(); rcu_bind_gp_kthread(); @@ -2082,21 +2077,22 @@ static int __noreturn rcu_gp_kthread(void *unused) /* Handle grace-period start. */ for (;;) { - trace_rcu_grace_period(rsp->name, - READ_ONCE(rsp->gp_seq), + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), TPS("reqwait")); - rsp->gp_state = RCU_GP_WAIT_GPS; - swait_event_idle_exclusive(rsp->gp_wq, READ_ONCE(rsp->gp_flags) & - RCU_GP_FLAG_INIT); - rsp->gp_state = RCU_GP_DONE_GPS; + rcu_state.gp_state = RCU_GP_WAIT_GPS; + swait_event_idle_exclusive(rcu_state.gp_wq, + READ_ONCE(rcu_state.gp_flags) & + RCU_GP_FLAG_INIT); + rcu_state.gp_state = RCU_GP_DONE_GPS; /* Locking provides needed memory barrier. */ if (rcu_gp_init()) break; cond_resched_tasks_rcu_qs(); - WRITE_ONCE(rsp->gp_activity, jiffies); + WRITE_ONCE(rcu_state.gp_activity, jiffies); WARN_ON(signal_pending(current)); - trace_rcu_grace_period(rsp->name, - READ_ONCE(rsp->gp_seq), + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), TPS("reqwaitsig")); } @@ -2106,58 +2102,59 @@ static int __noreturn rcu_gp_kthread(void *unused) ret = 0; for (;;) { if (!ret) { - rsp->jiffies_force_qs = jiffies + j; - WRITE_ONCE(rsp->jiffies_kick_kthreads, + rcu_state.jiffies_force_qs = jiffies + j; + WRITE_ONCE(rcu_state.jiffies_kick_kthreads, jiffies + 3 * j); } - trace_rcu_grace_period(rsp->name, - READ_ONCE(rsp->gp_seq), + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), TPS("fqswait")); - rsp->gp_state = RCU_GP_WAIT_FQS; - ret = swait_event_idle_timeout_exclusive(rsp->gp_wq, + rcu_state.gp_state = RCU_GP_WAIT_FQS; + ret = swait_event_idle_timeout_exclusive(rcu_state.gp_wq, rcu_gp_fqs_check_wake(&gf), j); - rsp->gp_state = RCU_GP_DOING_FQS; + rcu_state.gp_state = RCU_GP_DOING_FQS; /* Locking provides needed memory barriers. */ /* If grace period done, leave loop. */ if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp)) break; /* If time for quiescent-state forcing, do it. */ - if (ULONG_CMP_GE(jiffies, rsp->jiffies_force_qs) || + if (ULONG_CMP_GE(jiffies, rcu_state.jiffies_force_qs) || (gf & RCU_GP_FLAG_FQS)) { - trace_rcu_grace_period(rsp->name, - READ_ONCE(rsp->gp_seq), + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), TPS("fqsstart")); rcu_gp_fqs(first_gp_fqs); first_gp_fqs = false; - trace_rcu_grace_period(rsp->name, - READ_ONCE(rsp->gp_seq), + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), TPS("fqsend")); cond_resched_tasks_rcu_qs(); - WRITE_ONCE(rsp->gp_activity, jiffies); + WRITE_ONCE(rcu_state.gp_activity, jiffies); ret = 0; /* Force full wait till next FQS. */ j = jiffies_till_next_fqs; } else { /* Deal with stray signal. */ cond_resched_tasks_rcu_qs(); - WRITE_ONCE(rsp->gp_activity, jiffies); + WRITE_ONCE(rcu_state.gp_activity, jiffies); WARN_ON(signal_pending(current)); - trace_rcu_grace_period(rsp->name, - READ_ONCE(rsp->gp_seq), + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), TPS("fqswaitsig")); ret = 1; /* Keep old FQS timing. */ j = jiffies; - if (time_after(jiffies, rsp->jiffies_force_qs)) + if (time_after(jiffies, + rcu_state.jiffies_force_qs)) j = 1; else - j = rsp->jiffies_force_qs - j; + j = rcu_state.jiffies_force_qs - j; } } /* Handle grace-period end. */ - rsp->gp_state = RCU_GP_CLEANUP; + rcu_state.gp_state = RCU_GP_CLEANUP; rcu_gp_cleanup(); - rsp->gp_state = RCU_GP_CLEANED; + rcu_state.gp_state = RCU_GP_CLEANED; } } @@ -2173,11 +2170,10 @@ static int __noreturn rcu_gp_kthread(void *unused) static void rcu_report_qs_rsp(unsigned long flags) __releases(rcu_get_root()->lock) { - struct rcu_state *rsp = &rcu_state; - raw_lockdep_assert_held_rcu_node(rcu_get_root()); WARN_ON_ONCE(!rcu_gp_in_progress()); - WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); + WRITE_ONCE(rcu_state.gp_flags, + READ_ONCE(rcu_state.gp_flags) | RCU_GP_FLAG_FQS); raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(), flags); rcu_gp_kthread_wake(); } From 3c779dfef2c45248c5916e5acb79570649374fd6 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 15:54:02 -0700 Subject: [PATCH 089/140] rcu: Eliminate callback-invocation/invocation use of rsp Now that there is only one rcu_state structure, there is less point in maintaining a pointer to it. This commit therefore replaces rsp with &rcu_state in rcu_do_batch(), invoke_rcu_callbacks(), and __call_rcu(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 24 +++++++++++------------- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 88915372ba38..46bdb52aded1 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2467,14 +2467,13 @@ static void rcu_do_batch(struct rcu_data *rdp) struct rcu_head *rhp; struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl); long bl, count; - struct rcu_state *rsp = &rcu_state; /* If no callbacks are ready, just return. */ if (!rcu_segcblist_ready_cbs(&rdp->cblist)) { - trace_rcu_batch_start(rsp->name, + trace_rcu_batch_start(rcu_state.name, rcu_segcblist_n_lazy_cbs(&rdp->cblist), rcu_segcblist_n_cbs(&rdp->cblist), 0); - trace_rcu_batch_end(rsp->name, 0, + trace_rcu_batch_end(rcu_state.name, 0, !rcu_segcblist_empty(&rdp->cblist), need_resched(), is_idle_task(current), rcu_is_callbacks_kthread()); @@ -2489,7 +2488,8 @@ static void rcu_do_batch(struct rcu_data *rdp) local_irq_save(flags); WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); bl = rdp->blimit; - trace_rcu_batch_start(rsp->name, rcu_segcblist_n_lazy_cbs(&rdp->cblist), + trace_rcu_batch_start(rcu_state.name, + rcu_segcblist_n_lazy_cbs(&rdp->cblist), rcu_segcblist_n_cbs(&rdp->cblist), bl); rcu_segcblist_extract_done_cbs(&rdp->cblist, &rcl); local_irq_restore(flags); @@ -2498,7 +2498,7 @@ static void rcu_do_batch(struct rcu_data *rdp) rhp = rcu_cblist_dequeue(&rcl); for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) { debug_rcu_head_unqueue(rhp); - if (__rcu_reclaim(rsp->name, rhp)) + if (__rcu_reclaim(rcu_state.name, rhp)) rcu_cblist_dequeued_lazy(&rcl); /* * Stop only if limit reached and CPU has something to do. @@ -2512,7 +2512,7 @@ static void rcu_do_batch(struct rcu_data *rdp) local_irq_save(flags); count = -rcl.len; - trace_rcu_batch_end(rsp->name, count, !!rcl.head, need_resched(), + trace_rcu_batch_end(rcu_state.name, count, !!rcl.head, need_resched(), is_idle_task(current), rcu_is_callbacks_kthread()); /* Update counts and requeue any remaining callbacks. */ @@ -2528,7 +2528,7 @@ static void rcu_do_batch(struct rcu_data *rdp) /* Reset ->qlen_last_fqs_check trigger if enough CBs have drained. */ if (count == 0 && rdp->qlen_last_fqs_check != 0) { rdp->qlen_last_fqs_check = 0; - rdp->n_force_qs_snap = rsp->n_force_qs; + rdp->n_force_qs_snap = rcu_state.n_force_qs; } else if (count < rdp->qlen_last_fqs_check - qhimark) rdp->qlen_last_fqs_check = count; @@ -2764,11 +2764,9 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused */ static void invoke_rcu_callbacks(struct rcu_data *rdp) { - struct rcu_state *rsp = &rcu_state; - if (unlikely(!READ_ONCE(rcu_scheduler_fully_active))) return; - if (likely(!rsp->boost)) { + if (likely(!rcu_state.boost)) { rcu_do_batch(rdp); return; } @@ -2844,7 +2842,6 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func, int cpu, bool lazy) { unsigned long flags; struct rcu_data *rdp; - struct rcu_state __maybe_unused *rsp = &rcu_state; /* Misaligned rcu_head! */ WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1)); @@ -2893,11 +2890,12 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func, int cpu, bool lazy) rcu_idle_count_callbacks_posted(); if (__is_kfree_rcu_offset((unsigned long)func)) - trace_rcu_kfree_callback(rsp->name, head, (unsigned long)func, + trace_rcu_kfree_callback(rcu_state.name, head, + (unsigned long)func, rcu_segcblist_n_lazy_cbs(&rdp->cblist), rcu_segcblist_n_cbs(&rdp->cblist)); else - trace_rcu_callback(rsp->name, head, + trace_rcu_callback(rcu_state.name, head, rcu_segcblist_n_lazy_cbs(&rdp->cblist), rcu_segcblist_n_cbs(&rdp->cblist)); From 67a0edbf3c4dfcf3d20dafaff0d8c1c0ed44c292 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 16:15:38 -0700 Subject: [PATCH 090/140] rcu: Eliminate quiescent-state and grace-period-nonstart use of rsp Now that there is only one rcu_state structure, there is less point in maintaining a pointer to it. This commit therefore replaces rsp with &rcu_state in rcu_report_qs_rnp(), force_quiescent_state(), and rcu_check_gp_start_stall(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 46bdb52aded1..f329282dd305 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2198,7 +2198,6 @@ static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, { unsigned long oldmask = 0; struct rcu_node *rnp_c; - struct rcu_state __maybe_unused *rsp = &rcu_state; raw_lockdep_assert_held_rcu_node(rnp); @@ -2217,7 +2216,7 @@ static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, WARN_ON_ONCE(!rcu_is_leaf_node(rnp) && rcu_preempt_blocked_readers_cgp(rnp)); rnp->qsmask &= ~mask; - trace_rcu_quiescent_state_report(rsp->name, rnp->gp_seq, + trace_rcu_quiescent_state_report(rcu_state.name, rnp->gp_seq, mask, rnp->qsmask, rnp->level, rnp->grplo, rnp->grphi, !!rnp->gp_tasks); @@ -2624,12 +2623,11 @@ static void force_quiescent_state(void) bool ret; struct rcu_node *rnp; struct rcu_node *rnp_old = NULL; - struct rcu_state *rsp = &rcu_state; /* Funnel through hierarchy to reduce memory contention. */ rnp = __this_cpu_read(rcu_data.mynode); for (; rnp != NULL; rnp = rnp->parent) { - ret = (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) || + ret = (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) || !raw_spin_trylock(&rnp->fqslock); if (rnp_old != NULL) raw_spin_unlock(&rnp_old->fqslock); @@ -2642,11 +2640,12 @@ static void force_quiescent_state(void) /* Reached the root of the rcu_node tree, acquire lock. */ raw_spin_lock_irqsave_rcu_node(rnp_old, flags); raw_spin_unlock(&rnp_old->fqslock); - if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) { + if (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) { raw_spin_unlock_irqrestore_rcu_node(rnp_old, flags); return; /* Someone beat us to it. */ } - WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); + WRITE_ONCE(rcu_state.gp_flags, + READ_ONCE(rcu_state.gp_flags) | RCU_GP_FLAG_FQS); raw_spin_unlock_irqrestore_rcu_node(rnp_old, flags); rcu_gp_kthread_wake(); } @@ -2662,15 +2661,14 @@ rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp) unsigned long flags; unsigned long j; struct rcu_node *rnp_root = rcu_get_root(); - struct rcu_state *rsp = &rcu_state; static atomic_t warned = ATOMIC_INIT(0); if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() || ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed)) return; j = jiffies; /* Expensive access, and in common case don't get here. */ - if (time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) || - time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) || + if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) || + time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) || atomic_read(&warned)) return; @@ -2678,8 +2676,8 @@ rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp) j = jiffies; if (rcu_gp_in_progress() || ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || - time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) || - time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) || + time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) || + time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) || atomic_read(&warned)) { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); return; @@ -2691,19 +2689,19 @@ rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp) j = jiffies; if (rcu_gp_in_progress() || ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || - time_before(j, rsp->gp_req_activity + gpssdelay) || - time_before(j, rsp->gp_activity + gpssdelay) || + time_before(j, rcu_state.gp_req_activity + gpssdelay) || + time_before(j, rcu_state.gp_activity + gpssdelay) || atomic_xchg(&warned, 1)) { raw_spin_unlock_rcu_node(rnp_root); /* irqs remain disabled. */ raw_spin_unlock_irqrestore_rcu_node(rnp, flags); return; } pr_alert("%s: g%ld->%ld gar:%lu ga:%lu f%#x gs:%d %s->state:%#lx\n", - __func__, (long)READ_ONCE(rsp->gp_seq), + __func__, (long)READ_ONCE(rcu_state.gp_seq), (long)READ_ONCE(rnp_root->gp_seq_needed), - j - rsp->gp_req_activity, j - rsp->gp_activity, - rsp->gp_flags, rsp->gp_state, rsp->name, - rsp->gp_kthread ? rsp->gp_kthread->state : 0x1ffffL); + j - rcu_state.gp_req_activity, j - rcu_state.gp_activity, + rcu_state.gp_flags, rcu_state.gp_state, rcu_state.name, + rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL); WARN_ON(1); if (rnp_root != rnp) raw_spin_unlock_rcu_node(rnp_root); From ec9f5835f74cba5cc2285d3032bb2b16afc312c3 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 16:26:12 -0700 Subject: [PATCH 091/140] rcu: Eliminate RCU-barrier use of rsp Now that there is only one rcu_state structure, there is less point in maintaining a pointer to it. This commit therefore replaces rsp with &rcu_state in rcu_barrier_callback(), rcu_barrier_func(), and _rcu_barrier(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 65 +++++++++++++++++++++++------------------------ 1 file changed, 32 insertions(+), 33 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index f329282dd305..ce5fb177a0f7 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3129,32 +3129,31 @@ static void _rcu_barrier_trace(const char *s, int cpu, unsigned long done) */ static void rcu_barrier_callback(struct rcu_head *rhp) { - struct rcu_state *rsp = &rcu_state; - - if (atomic_dec_and_test(&rsp->barrier_cpu_count)) { - _rcu_barrier_trace(TPS("LastCB"), -1, rsp->barrier_sequence); - complete(&rsp->barrier_completion); + if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) { + _rcu_barrier_trace(TPS("LastCB"), -1, + rcu_state.barrier_sequence); + complete(&rcu_state.barrier_completion); } else { - _rcu_barrier_trace(TPS("CB"), -1, rsp->barrier_sequence); + _rcu_barrier_trace(TPS("CB"), -1, rcu_state.barrier_sequence); } } /* * Called with preemption disabled, and from cross-cpu IRQ context. */ -static void rcu_barrier_func(void *type) +static void rcu_barrier_func(void *unused) { - struct rcu_state *rsp = type; struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); - _rcu_barrier_trace(TPS("IRQ"), -1, rsp->barrier_sequence); + _rcu_barrier_trace(TPS("IRQ"), -1, rcu_state.barrier_sequence); rdp->barrier_head.func = rcu_barrier_callback; debug_rcu_head_queue(&rdp->barrier_head); if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head, 0)) { - atomic_inc(&rsp->barrier_cpu_count); + atomic_inc(&rcu_state.barrier_cpu_count); } else { debug_rcu_head_unqueue(&rdp->barrier_head); - _rcu_barrier_trace(TPS("IRQNQ"), -1, rsp->barrier_sequence); + _rcu_barrier_trace(TPS("IRQNQ"), -1, + rcu_state.barrier_sequence); } } @@ -3166,25 +3165,25 @@ static void _rcu_barrier(void) { int cpu; struct rcu_data *rdp; - struct rcu_state *rsp = &rcu_state; - unsigned long s = rcu_seq_snap(&rsp->barrier_sequence); + unsigned long s = rcu_seq_snap(&rcu_state.barrier_sequence); _rcu_barrier_trace(TPS("Begin"), -1, s); /* Take mutex to serialize concurrent rcu_barrier() requests. */ - mutex_lock(&rsp->barrier_mutex); + mutex_lock(&rcu_state.barrier_mutex); /* Did someone else do our work for us? */ - if (rcu_seq_done(&rsp->barrier_sequence, s)) { - _rcu_barrier_trace(TPS("EarlyExit"), -1, rsp->barrier_sequence); + if (rcu_seq_done(&rcu_state.barrier_sequence, s)) { + _rcu_barrier_trace(TPS("EarlyExit"), -1, + rcu_state.barrier_sequence); smp_mb(); /* caller's subsequent code after above check. */ - mutex_unlock(&rsp->barrier_mutex); + mutex_unlock(&rcu_state.barrier_mutex); return; } /* Mark the start of the barrier operation. */ - rcu_seq_start(&rsp->barrier_sequence); - _rcu_barrier_trace(TPS("Inc1"), -1, rsp->barrier_sequence); + rcu_seq_start(&rcu_state.barrier_sequence); + _rcu_barrier_trace(TPS("Inc1"), -1, rcu_state.barrier_sequence); /* * Initialize the count to one rather than to zero in order to @@ -3192,8 +3191,8 @@ static void _rcu_barrier(void) * (or preemption of this task). Exclude CPU-hotplug operations * to ensure that no offline CPU has callbacks queued. */ - init_completion(&rsp->barrier_completion); - atomic_set(&rsp->barrier_cpu_count, 1); + init_completion(&rcu_state.barrier_completion); + atomic_set(&rcu_state.barrier_cpu_count, 1); get_online_cpus(); /* @@ -3208,22 +3207,22 @@ static void _rcu_barrier(void) if (rcu_is_nocb_cpu(cpu)) { if (!rcu_nocb_cpu_needs_barrier(cpu)) { _rcu_barrier_trace(TPS("OfflineNoCB"), cpu, - rsp->barrier_sequence); + rcu_state.barrier_sequence); } else { _rcu_barrier_trace(TPS("OnlineNoCB"), cpu, - rsp->barrier_sequence); + rcu_state.barrier_sequence); smp_mb__before_atomic(); - atomic_inc(&rsp->barrier_cpu_count); + atomic_inc(&rcu_state.barrier_cpu_count); __call_rcu(&rdp->barrier_head, rcu_barrier_callback, cpu, 0); } } else if (rcu_segcblist_n_cbs(&rdp->cblist)) { _rcu_barrier_trace(TPS("OnlineQ"), cpu, - rsp->barrier_sequence); - smp_call_function_single(cpu, rcu_barrier_func, rsp, 1); + rcu_state.barrier_sequence); + smp_call_function_single(cpu, rcu_barrier_func, NULL, 1); } else { _rcu_barrier_trace(TPS("OnlineNQ"), cpu, - rsp->barrier_sequence); + rcu_state.barrier_sequence); } } put_online_cpus(); @@ -3232,18 +3231,18 @@ static void _rcu_barrier(void) * Now that we have an rcu_barrier_callback() callback on each * CPU, and thus each counted, remove the initial count. */ - if (atomic_dec_and_test(&rsp->barrier_cpu_count)) - complete(&rsp->barrier_completion); + if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) + complete(&rcu_state.barrier_completion); /* Wait for all rcu_barrier_callback() callbacks to be invoked. */ - wait_for_completion(&rsp->barrier_completion); + wait_for_completion(&rcu_state.barrier_completion); /* Mark the end of the barrier operation. */ - _rcu_barrier_trace(TPS("Inc2"), -1, rsp->barrier_sequence); - rcu_seq_end(&rsp->barrier_sequence); + _rcu_barrier_trace(TPS("Inc2"), -1, rcu_state.barrier_sequence); + rcu_seq_end(&rcu_state.barrier_sequence); /* Other rcu_barrier() invocations can now safely proceed. */ - mutex_unlock(&rsp->barrier_mutex); + mutex_unlock(&rcu_state.barrier_mutex); } /** From eb7a6653887b540a81d1b91ee0fc68b604da9386 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 17:47:45 -0700 Subject: [PATCH 092/140] rcu: Eliminate initialization-time use of rsp Now that there is only one rcu_state structure, there is less point in maintaining a pointer to it. This commit therefore replaces rsp with &rcu_state in rcu_cpu_starting() and rcu_init_one(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index ce5fb177a0f7..5e3a3001a50d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3455,7 +3455,6 @@ void rcu_cpu_starting(unsigned int cpu) unsigned long oldmask; struct rcu_data *rdp; struct rcu_node *rnp; - struct rcu_state *rsp = &rcu_state; if (per_cpu(rcu_cpu_started, cpu)) return; @@ -3472,10 +3471,10 @@ void rcu_cpu_starting(unsigned int cpu) oldmask ^= rnp->expmaskinitnext; nbits = bitmap_weight(&oldmask, BITS_PER_LONG); /* Allow lockless access for expedited grace periods. */ - smp_store_release(&rsp->ncpus, rsp->ncpus + nbits); /* ^^^ */ + smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + nbits); /* ^^^ */ rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */ - rdp->rcu_onl_gp_seq = READ_ONCE(rsp->gp_seq); - rdp->rcu_onl_gp_flags = READ_ONCE(rsp->gp_flags); + rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq); + rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags); if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */ /* Report QS -after- changing ->qsmaskinitnext! */ rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); @@ -3667,7 +3666,6 @@ static void __init rcu_init_one(void) int i; int j; struct rcu_node *rnp; - struct rcu_state *rsp = &rcu_state; BUILD_BUG_ON(RCU_NUM_LVLS > ARRAY_SIZE(buf)); /* Fix buf[] init! */ @@ -3678,14 +3676,15 @@ static void __init rcu_init_one(void) /* Initialize the level-tracking arrays. */ for (i = 1; i < rcu_num_lvls; i++) - rsp->level[i] = rsp->level[i - 1] + num_rcu_lvl[i - 1]; + rcu_state.level[i] = + rcu_state.level[i - 1] + num_rcu_lvl[i - 1]; rcu_init_levelspread(levelspread, num_rcu_lvl); /* Initialize the elements themselves, starting from the leaves. */ for (i = rcu_num_lvls - 1; i >= 0; i--) { cpustride *= levelspread[i]; - rnp = rsp->level[i]; + rnp = rcu_state.level[i]; for (j = 0; j < num_rcu_lvl[i]; j++, rnp++) { raw_spin_lock_init(&ACCESS_PRIVATE(rnp, lock)); lockdep_set_class_and_name(&ACCESS_PRIVATE(rnp, lock), @@ -3693,9 +3692,9 @@ static void __init rcu_init_one(void) raw_spin_lock_init(&rnp->fqslock); lockdep_set_class_and_name(&rnp->fqslock, &rcu_fqs_class[i], fqs[i]); - rnp->gp_seq = rsp->gp_seq; - rnp->gp_seq_needed = rsp->gp_seq; - rnp->completedqs = rsp->gp_seq; + rnp->gp_seq = rcu_state.gp_seq; + rnp->gp_seq_needed = rcu_state.gp_seq; + rnp->completedqs = rcu_state.gp_seq; rnp->qsmask = 0; rnp->qsmaskinit = 0; rnp->grplo = j * cpustride; @@ -3709,7 +3708,7 @@ static void __init rcu_init_one(void) } else { rnp->grpnum = j % levelspread[i - 1]; rnp->grpmask = 1UL << rnp->grpnum; - rnp->parent = rsp->level[i - 1] + + rnp->parent = rcu_state.level[i - 1] + j / levelspread[i - 1]; } rnp->level = i; @@ -3723,8 +3722,8 @@ static void __init rcu_init_one(void) } } - init_swait_queue_head(&rsp->gp_wq); - init_swait_queue_head(&rsp->expedited_wq); + init_swait_queue_head(&rcu_state.gp_wq); + init_swait_queue_head(&rcu_state.expedited_wq); rnp = rcu_first_leaf_node(); for_each_possible_cpu(i) { while (i > rnp->grphi) From 8ff0b90780910821a53c70d5e68d28382f2a1a07 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 17:55:14 -0700 Subject: [PATCH 093/140] rcu: Fix typo in force_qs_rnp()'s parameter's parameter Pointers to rcu_data structures should be named rdp, not rsp. This commit therefore makes this change. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 5e3a3001a50d..c1ce4cf41068 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -477,7 +477,7 @@ module_param(rcu_kick_kthreads, bool, 0644); static ulong jiffies_till_sched_qs = HZ / 10; module_param(jiffies_till_sched_qs, ulong, 0444); -static void force_qs_rnp(int (*f)(struct rcu_data *rsp)); +static void force_qs_rnp(int (*f)(struct rcu_data *rdp)); static void force_quiescent_state(void); static int rcu_pending(void); @@ -2570,7 +2570,7 @@ void rcu_check_callbacks(int user) * * The caller must have suppressed start of new grace periods. */ -static void force_qs_rnp(int (*f)(struct rcu_data *rsp)) +static void force_qs_rnp(int (*f)(struct rcu_data *rdp)) { int cpu; unsigned long flags; From 4e95020cdd34bbfc86f9c705f4d46ed63fa2e231 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 17:59:36 -0700 Subject: [PATCH 094/140] rcu: Inline increment_cpu_stall_ticks() into its sole caller Consolidation of the RCU flavors into one makes increment_cpu_stall_ticks() a trivial one-line function with only one caller. This commit therefore inlines it. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 2 +- kernel/rcu/tree.h | 1 - kernel/rcu/tree_plugin.h | 6 ------ 3 files changed, 1 insertion(+), 8 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index c1ce4cf41068..ee130b0dc54a 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2555,7 +2555,7 @@ static void rcu_do_batch(struct rcu_data *rdp) void rcu_check_callbacks(int user) { trace_rcu_utilization(TPS("Start scheduler-tick")); - increment_cpu_stall_ticks(); + raw_cpu_inc(rcu_data.ticks_this_gp); rcu_flavor_check_callbacks(user); if (rcu_pending()) invoke_rcu_core(); diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 8abc15c42d84..46452d3d0fad 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -466,7 +466,6 @@ static void print_cpu_stall_info_begin(void); static void print_cpu_stall_info(int cpu); static void print_cpu_stall_info_end(void); static void zero_cpu_stall_ticks(struct rcu_data *rdp); -static void increment_cpu_stall_ticks(void); static bool rcu_nocb_cpu_needs_barrier(int cpu); static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 878a1d2cd465..cd276c46bc14 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1815,12 +1815,6 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp) rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id()); } -/* Increment ->ticks_this_gp for all flavors of RCU. */ -static void increment_cpu_stall_ticks(void) -{ - raw_cpu_inc(rcu_data.ticks_this_gp); -} - #ifdef CONFIG_RCU_NOCB_CPU /* From c3854a055bc834806b481b34f5f552ac415b2000 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 5 Jul 2018 18:23:23 -0700 Subject: [PATCH 095/140] rcu: Pull rcu_gp_kthread() FQS loop into separate function The rcu_gp_kthread() function is long and deeply indented, so this commit pulls the loop that repeatedly invokes rcu_gp_fqs() into a new rcu_gp_fqs_loop() function. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 125 ++++++++++++++++++++++++---------------------- 1 file changed, 66 insertions(+), 59 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index ee130b0dc54a..53ba7747878c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1976,6 +1976,71 @@ static void rcu_gp_fqs(bool first_time) } } +/* + * Loop doing repeated quiescent-state forcing until the grace period ends. + */ +static void rcu_gp_fqs_loop(void) +{ + bool first_gp_fqs; + int gf; + unsigned long j; + int ret; + struct rcu_node *rnp = rcu_get_root(); + + first_gp_fqs = true; + j = jiffies_till_first_fqs; + ret = 0; + for (;;) { + if (!ret) { + rcu_state.jiffies_force_qs = jiffies + j; + WRITE_ONCE(rcu_state.jiffies_kick_kthreads, + jiffies + 3 * j); + } + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), + TPS("fqswait")); + rcu_state.gp_state = RCU_GP_WAIT_FQS; + ret = swait_event_idle_timeout_exclusive( + rcu_state.gp_wq, rcu_gp_fqs_check_wake(&gf), j); + rcu_state.gp_state = RCU_GP_DOING_FQS; + /* Locking provides needed memory barriers. */ + /* If grace period done, leave loop. */ + if (!READ_ONCE(rnp->qsmask) && + !rcu_preempt_blocked_readers_cgp(rnp)) + break; + /* If time for quiescent-state forcing, do it. */ + if (ULONG_CMP_GE(jiffies, rcu_state.jiffies_force_qs) || + (gf & RCU_GP_FLAG_FQS)) { + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), + TPS("fqsstart")); + rcu_gp_fqs(first_gp_fqs); + first_gp_fqs = false; + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), + TPS("fqsend")); + cond_resched_tasks_rcu_qs(); + WRITE_ONCE(rcu_state.gp_activity, jiffies); + ret = 0; /* Force full wait till next FQS. */ + j = jiffies_till_next_fqs; + } else { + /* Deal with stray signal. */ + cond_resched_tasks_rcu_qs(); + WRITE_ONCE(rcu_state.gp_activity, jiffies); + WARN_ON(signal_pending(current)); + trace_rcu_grace_period(rcu_state.name, + READ_ONCE(rcu_state.gp_seq), + TPS("fqswaitsig")); + ret = 1; /* Keep old FQS timing. */ + j = jiffies; + if (time_after(jiffies, rcu_state.jiffies_force_qs)) + j = 1; + else + j = rcu_state.jiffies_force_qs - j; + } + } +} + /* * Clean up after the old grace period. */ @@ -2066,12 +2131,6 @@ static void rcu_gp_cleanup(void) */ static int __noreturn rcu_gp_kthread(void *unused) { - bool first_gp_fqs; - int gf; - unsigned long j; - int ret; - struct rcu_node *rnp = rcu_get_root(); - rcu_bind_gp_kthread(); for (;;) { @@ -2097,59 +2156,7 @@ static int __noreturn rcu_gp_kthread(void *unused) } /* Handle quiescent-state forcing. */ - first_gp_fqs = true; - j = jiffies_till_first_fqs; - ret = 0; - for (;;) { - if (!ret) { - rcu_state.jiffies_force_qs = jiffies + j; - WRITE_ONCE(rcu_state.jiffies_kick_kthreads, - jiffies + 3 * j); - } - trace_rcu_grace_period(rcu_state.name, - READ_ONCE(rcu_state.gp_seq), - TPS("fqswait")); - rcu_state.gp_state = RCU_GP_WAIT_FQS; - ret = swait_event_idle_timeout_exclusive(rcu_state.gp_wq, - rcu_gp_fqs_check_wake(&gf), j); - rcu_state.gp_state = RCU_GP_DOING_FQS; - /* Locking provides needed memory barriers. */ - /* If grace period done, leave loop. */ - if (!READ_ONCE(rnp->qsmask) && - !rcu_preempt_blocked_readers_cgp(rnp)) - break; - /* If time for quiescent-state forcing, do it. */ - if (ULONG_CMP_GE(jiffies, rcu_state.jiffies_force_qs) || - (gf & RCU_GP_FLAG_FQS)) { - trace_rcu_grace_period(rcu_state.name, - READ_ONCE(rcu_state.gp_seq), - TPS("fqsstart")); - rcu_gp_fqs(first_gp_fqs); - first_gp_fqs = false; - trace_rcu_grace_period(rcu_state.name, - READ_ONCE(rcu_state.gp_seq), - TPS("fqsend")); - cond_resched_tasks_rcu_qs(); - WRITE_ONCE(rcu_state.gp_activity, jiffies); - ret = 0; /* Force full wait till next FQS. */ - j = jiffies_till_next_fqs; - } else { - /* Deal with stray signal. */ - cond_resched_tasks_rcu_qs(); - WRITE_ONCE(rcu_state.gp_activity, jiffies); - WARN_ON(signal_pending(current)); - trace_rcu_grace_period(rcu_state.name, - READ_ONCE(rcu_state.gp_seq), - TPS("fqswaitsig")); - ret = 1; /* Keep old FQS timing. */ - j = jiffies; - if (time_after(jiffies, - rcu_state.jiffies_force_qs)) - j = 1; - else - j = rcu_state.jiffies_force_qs - j; - } - } + rcu_gp_fqs_loop(); /* Handle grace-period end. */ rcu_state.gp_state = RCU_GP_CLEANUP; From 4c7e9c1434c6fc960774a5475f2fbccbf557fdeb Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 6 Jul 2018 09:54:25 -0700 Subject: [PATCH 096/140] rcu: Consolidate RCU-bh update-side function definitions This commit saves a few lines by consolidating the RCU-bh function definitions at the end of include/linux/rcupdate.h. This consolidation also makes it easier to remove them all when the time comes. Signed-off-by: Paul E. McKenney --- include/linux/rcupdate.h | 27 ++++++++++++++++++++++----- include/linux/rcutiny.h | 15 --------------- include/linux/rcutree.h | 17 ----------------- kernel/rcu/tree.c | 9 --------- 4 files changed, 22 insertions(+), 46 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 1207c6c9bd8b..e530f5739033 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -58,11 +58,6 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func); void rcu_barrier_tasks(void); void synchronize_rcu(void); -static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) -{ - call_rcu(head, func); -} - #ifdef CONFIG_PREEMPT_RCU void __rcu_read_lock(void); @@ -875,4 +870,26 @@ static inline notrace void rcu_read_unlock_sched_notrace(void) #endif /* #else #ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE */ +/* Transitional pre-consolidation compatibility definitions. */ + +static inline void synchronize_rcu_bh(void) +{ + synchronize_rcu(); +} + +static inline void synchronize_rcu_bh_expedited(void) +{ + synchronize_rcu_expedited(); +} + +static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) +{ + call_rcu(head, func); +} + +static inline void rcu_barrier_bh(void) +{ + rcu_barrier(); +} + #endif /* __LINUX_RCUPDATE_H */ diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index e66fb8bc2127..df82bada9b19 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -68,21 +68,6 @@ static inline void rcu_barrier_sched(void) rcu_barrier(); /* Only one CPU, so only one list of callbacks! */ } -static inline void rcu_barrier_bh(void) -{ - rcu_barrier(); -} - -static inline void synchronize_rcu_bh(void) -{ - synchronize_sched(); -} - -static inline void synchronize_rcu_bh_expedited(void) -{ - synchronize_sched(); -} - static inline void synchronize_rcu_expedited(void) { synchronize_sched(); diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 6d30a0809300..94820156aa62 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -45,11 +45,6 @@ static inline void rcu_virt_note_context_switch(int cpu) rcu_note_context_switch(false); } -static inline void synchronize_rcu_bh(void) -{ - synchronize_rcu(); -} - void synchronize_rcu_expedited(void); static inline void synchronize_sched_expedited(void) @@ -59,19 +54,7 @@ static inline void synchronize_sched_expedited(void) void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func); -/** - * synchronize_rcu_bh_expedited - Brute-force RCU-bh grace period - * - * This is a transitional API and will soon be removed, with all - * callers converted to synchronize_rcu_expedited(). - */ -static inline void synchronize_rcu_bh_expedited(void) -{ - synchronize_rcu_expedited(); -} - void rcu_barrier(void); -void rcu_barrier_bh(void); void rcu_barrier_sched(void); bool rcu_eqs_special_set(int cpu); unsigned long get_state_synchronize_rcu(void); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 53ba7747878c..8d5dadaf3c53 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3252,15 +3252,6 @@ static void _rcu_barrier(void) mutex_unlock(&rcu_state.barrier_mutex); } -/** - * rcu_barrier_bh - Wait until all in-flight call_rcu_bh() callbacks complete. - */ -void rcu_barrier_bh(void) -{ - _rcu_barrier(); -} -EXPORT_SYMBOL_GPL(rcu_barrier_bh); - /** * rcu_barrier - Wait until all in-flight call_rcu() callbacks complete. * From a8bb74acd8efe2eb934d524ae20859980975b602 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 6 Jul 2018 11:46:47 -0700 Subject: [PATCH 097/140] rcu: Consolidate RCU-sched update-side function definitions This commit saves a few lines by consolidating the RCU-sched function definitions at the end of include/linux/rcupdate.h. This consolidation also makes it easier to remove them all when the time comes. Signed-off-by: Paul E. McKenney --- include/linux/rcupdate.h | 38 +++++++++++++++++++++----- include/linux/rcutiny.h | 32 +--------------------- include/linux/rcutree.h | 9 ------- kernel/rcu/tree.c | 58 ---------------------------------------- 4 files changed, 32 insertions(+), 105 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index e530f5739033..12103e1bbe67 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -48,12 +48,6 @@ #define ulong2long(a) (*(long *)(&(a))) /* Exported common interfaces */ - -#ifndef CONFIG_TINY_RCU -void synchronize_sched(void); -void call_rcu_sched(struct rcu_head *head, rcu_callback_t func); -#endif - void call_rcu(struct rcu_head *head, rcu_callback_t func); void rcu_barrier_tasks(void); void synchronize_rcu(void); @@ -170,7 +164,7 @@ void exit_tasks_rcu_finish(void); #define rcu_tasks_qs(t) do { } while (0) #define rcu_note_voluntary_context_switch(t) rcu_all_qs() #define call_rcu_tasks call_rcu_sched -#define synchronize_rcu_tasks synchronize_sched +#define synchronize_rcu_tasks synchronize_rcu static inline void exit_tasks_rcu_start(void) { } static inline void exit_tasks_rcu_finish(void) { } #endif /* #else #ifdef CONFIG_TASKS_RCU */ @@ -892,4 +886,34 @@ static inline void rcu_barrier_bh(void) rcu_barrier(); } +static inline void synchronize_sched(void) +{ + synchronize_rcu(); +} + +static inline void synchronize_sched_expedited(void) +{ + synchronize_rcu_expedited(); +} + +static inline void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) +{ + call_rcu(head, func); +} + +static inline void rcu_barrier_sched(void) +{ + rcu_barrier(); +} + +static inline unsigned long get_state_synchronize_sched(void) +{ + return get_state_synchronize_rcu(); +} + +static inline void cond_synchronize_sched(unsigned long oldstate) +{ + cond_synchronize_rcu(oldstate); +} + #endif /* __LINUX_RCUPDATE_H */ diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index df82bada9b19..7fa4fb9e899e 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -36,11 +36,6 @@ static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp) /* Never flag non-existent other CPUs! */ static inline bool rcu_eqs_special_set(int cpu) { return false; } -static inline void synchronize_sched(void) -{ - synchronize_rcu(); -} - static inline unsigned long get_state_synchronize_rcu(void) { return 0; @@ -51,36 +46,11 @@ static inline void cond_synchronize_rcu(unsigned long oldstate) might_sleep(); } -static inline unsigned long get_state_synchronize_sched(void) -{ - return 0; -} - -static inline void cond_synchronize_sched(unsigned long oldstate) -{ - might_sleep(); -} - extern void rcu_barrier(void); -static inline void rcu_barrier_sched(void) -{ - rcu_barrier(); /* Only one CPU, so only one list of callbacks! */ -} - static inline void synchronize_rcu_expedited(void) { - synchronize_sched(); -} - -static inline void synchronize_sched_expedited(void) -{ - synchronize_sched(); -} - -static inline void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) -{ - call_rcu(head, func); + synchronize_rcu(); } static inline void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func) diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 94820156aa62..d09a9abe9440 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -46,21 +46,12 @@ static inline void rcu_virt_note_context_switch(int cpu) } void synchronize_rcu_expedited(void); - -static inline void synchronize_sched_expedited(void) -{ - synchronize_rcu_expedited(); -} - void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func); void rcu_barrier(void); -void rcu_barrier_sched(void); bool rcu_eqs_special_set(int cpu); unsigned long get_state_synchronize_rcu(void); void cond_synchronize_rcu(unsigned long oldstate); -unsigned long get_state_synchronize_sched(void); -void cond_synchronize_sched(unsigned long oldstate); void rcu_idle_enter(void); void rcu_idle_exit(void); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 8d5dadaf3c53..1a2551a4d583 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2950,19 +2950,6 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func) } EXPORT_SYMBOL_GPL(call_rcu); -/** - * call_rcu_sched() - Queue an RCU for invocation after sched grace period. - * @head: structure to be used for queueing the RCU updates. - * @func: actual callback function to be invoked after the grace period - * - * This is transitional. - */ -void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) -{ - call_rcu(head, func); -} -EXPORT_SYMBOL_GPL(call_rcu_sched); - /* * Queue an RCU callback for lazy invocation after a grace period. * This will likely be later named something like "call_rcu_lazy()", @@ -2976,17 +2963,6 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func) } EXPORT_SYMBOL_GPL(kfree_call_rcu); -/** - * synchronize_sched - wait until an rcu-sched grace period has elapsed. - * - * This is transitional. - */ -void synchronize_sched(void) -{ - synchronize_rcu(); -} -EXPORT_SYMBOL_GPL(synchronize_sched); - /** * get_state_synchronize_rcu - Snapshot current RCU state * @@ -3028,29 +3004,6 @@ void cond_synchronize_rcu(unsigned long oldstate) } EXPORT_SYMBOL_GPL(cond_synchronize_rcu); -/** - * get_state_synchronize_sched - Snapshot current RCU-sched state - * - * This is transitional, and only used by rcutorture. - */ -unsigned long get_state_synchronize_sched(void) -{ - return get_state_synchronize_rcu(); -} -EXPORT_SYMBOL_GPL(get_state_synchronize_sched); - -/** - * cond_synchronize_sched - Conditionally wait for an RCU-sched grace period - * @oldstate: return value from earlier call to get_state_synchronize_sched() - * - * This is transitional and only used by rcutorture. - */ -void cond_synchronize_sched(unsigned long oldstate) -{ - cond_synchronize_rcu(oldstate); -} -EXPORT_SYMBOL_GPL(cond_synchronize_sched); - /* * Check to see if there is any immediate RCU-related work to be done by * the current CPU, for the specified type of RCU, returning 1 if so and @@ -3266,17 +3219,6 @@ void rcu_barrier(void) } EXPORT_SYMBOL_GPL(rcu_barrier); -/** - * rcu_barrier_sched - Wait for in-flight call_rcu_sched() callbacks. - * - * This is transitional. - */ -void rcu_barrier_sched(void) -{ - rcu_barrier(); -} -EXPORT_SYMBOL_GPL(rcu_barrier_sched); - /* * Propagate ->qsinitmask bits up the rcu_node tree to account for the * first CPU in a given leaf rcu_node structure coming online. The caller From 2ceebc035082a42f1416d4b47270c0acb5354949 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 6 Jul 2018 15:16:12 -0700 Subject: [PATCH 098/140] rcutorture: Add RCU-bh and RCU-sched support for extended readers Since there is now a single consolidated RCU flavor, rcutorture needs to test extending of RCU readers via rcu_read_lock_bh() and rcu_read_lock_sched(). This commit adds this support, with added checks (just like for local_bh_enable()) to ensure that rcu_read_unlock_bh() will not be invoked while interrupts are disabled. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 32 ++++++++++++++++++++++---------- 1 file changed, 22 insertions(+), 10 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index c55d1483886e..1bc0e37dffa8 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -66,13 +66,16 @@ MODULE_AUTHOR("Paul E. McKenney and Josh Triplett extendables field, extendables param, and related definitions. */ #define RCUTORTURE_RDR_SHIFT 8 /* Put SRCU index in upper bits. */ #define RCUTORTURE_RDR_MASK ((1 << RCUTORTURE_RDR_SHIFT) - 1) -#define RCUTORTURE_RDR_BH 0x1 /* Extend readers by disabling bh. */ -#define RCUTORTURE_RDR_IRQ 0x2 /* ... disabling interrupts. */ -#define RCUTORTURE_RDR_PREEMPT 0x4 /* ... disabling preemption. */ -#define RCUTORTURE_RDR_RCU 0x8 /* ... entering another RCU reader. */ -#define RCUTORTURE_RDR_NBITS 4 /* Number of bits defined above. */ -#define RCUTORTURE_MAX_EXTEND (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | \ - RCUTORTURE_RDR_PREEMPT) +#define RCUTORTURE_RDR_BH 0x01 /* Extend readers by disabling bh. */ +#define RCUTORTURE_RDR_IRQ 0x02 /* ... disabling interrupts. */ +#define RCUTORTURE_RDR_PREEMPT 0x04 /* ... disabling preemption. */ +#define RCUTORTURE_RDR_RBH 0x08 /* ... rcu_read_lock_bh(). */ +#define RCUTORTURE_RDR_SCHED 0x10 /* ... rcu_read_lock_sched(). */ +#define RCUTORTURE_RDR_RCU 0x20 /* ... entering another RCU reader. */ +#define RCUTORTURE_RDR_NBITS 6 /* Number of bits defined above. */ +#define RCUTORTURE_MAX_EXTEND \ + (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | RCUTORTURE_RDR_PREEMPT | \ + RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED) #define RCUTORTURE_RDR_MAX_LOOPS 0x7 /* Maximum reader extensions. */ /* Must be power of two minus one. */ @@ -1217,6 +1220,10 @@ static void rcutorture_one_extend(int *readstate, int newstate, local_irq_disable(); if (statesnew & RCUTORTURE_RDR_PREEMPT) preempt_disable(); + if (statesnew & RCUTORTURE_RDR_RBH) + rcu_read_lock_bh(); + if (statesnew & RCUTORTURE_RDR_SCHED) + rcu_read_lock_sched(); if (statesnew & RCUTORTURE_RDR_RCU) idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT; @@ -1227,6 +1234,10 @@ static void rcutorture_one_extend(int *readstate, int newstate, local_bh_enable(); if (statesold & RCUTORTURE_RDR_PREEMPT) preempt_enable(); + if (statesold & RCUTORTURE_RDR_RBH) + rcu_read_unlock_bh(); + if (statesold & RCUTORTURE_RDR_SCHED) + rcu_read_unlock_sched(); if (statesold & RCUTORTURE_RDR_RCU) cur_ops->readunlock(idxold >> RCUTORTURE_RDR_SHIFT); @@ -1269,10 +1280,11 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp) mask = mask & randmask2; else mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS)); + /* Can't enable bh w/irq disabled. */ if ((mask & RCUTORTURE_RDR_IRQ) && - !(mask & RCUTORTURE_RDR_BH) && - (oldmask & RCUTORTURE_RDR_BH)) - mask |= RCUTORTURE_RDR_BH; /* Can't enable bh w/irq disabled. */ + ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) || + (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH)))) + mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH; if ((mask & RCUTORTURE_RDR_IRQ) && !(mask & cur_ops->ext_irq_conflict) && (oldmask & cur_ops->ext_irq_conflict)) From 72ce30dd1f9bdbd6913ba868d0d2ca55c268eff3 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 10:24:23 -0700 Subject: [PATCH 099/140] rcu: Stop testing RCU-bh and RCU-sched Now that the RCU-bh and RCU-sched update-side functions are simple wrappers around their RCU counterparts, there isn't a whole lot of point in testing them. This commit therefore removes the self-test capability and removes the corresponding kernel-boot parameters. It also updates the various rcutorture .boot files to remove the kernel boot parameters that call for testing RCU-bh and RCU-sched. Signed-off-by: Paul E. McKenney --- .../admin-guide/kernel-parameters.txt | 6 --- kernel/rcu/update.c | 38 +------------------ .../rcutorture/configs/rcu/TINY02.boot | 2 - .../rcutorture/configs/rcu/TREE01.boot | 2 +- .../rcutorture/configs/rcu/TREE04.boot | 2 +- .../rcutorture/configs/rcu/TREE05.boot | 2 - .../rcutorture/configs/rcu/TREE06.boot | 2 - .../rcutorture/configs/rcu/TREE08.boot | 2 - 8 files changed, 3 insertions(+), 53 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 9871e649ffef..aa96e669bcb8 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3863,12 +3863,6 @@ rcupdate.rcu_self_test= [KNL] Run the RCU early boot self tests - rcupdate.rcu_self_test_bh= [KNL] - Run the RCU bh early boot self tests - - rcupdate.rcu_self_test_sched= [KNL] - Run the RCU sched early boot self tests - rdinit= [KNL] Format: Run specified binary instead of /init from the ramdisk, diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index 9ea87d0aa386..ee366faecea6 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -203,11 +203,7 @@ void rcu_test_sync_prims(void) if (!IS_ENABLED(CONFIG_PROVE_RCU)) return; synchronize_rcu(); - synchronize_rcu_bh(); - synchronize_sched(); synchronize_rcu_expedited(); - synchronize_rcu_bh_expedited(); - synchronize_sched_expedited(); } #if !defined(CONFIG_TINY_RCU) || defined(CONFIG_SRCU) @@ -870,15 +866,10 @@ static void __init rcu_tasks_bootup_oddness(void) #ifdef CONFIG_PROVE_RCU /* - * Early boot self test parameters, one for each flavor + * Early boot self test parameters. */ static bool rcu_self_test; -static bool rcu_self_test_bh; -static bool rcu_self_test_sched; - module_param(rcu_self_test, bool, 0444); -module_param(rcu_self_test_bh, bool, 0444); -module_param(rcu_self_test_sched, bool, 0444); static int rcu_self_test_counter; @@ -895,30 +886,12 @@ static void early_boot_test_call_rcu(void) call_rcu(&head, test_callback); } -static void early_boot_test_call_rcu_bh(void) -{ - static struct rcu_head head; - - call_rcu_bh(&head, test_callback); -} - -static void early_boot_test_call_rcu_sched(void) -{ - static struct rcu_head head; - - call_rcu_sched(&head, test_callback); -} - void rcu_early_boot_tests(void) { pr_info("Running RCU self tests\n"); if (rcu_self_test) early_boot_test_call_rcu(); - if (rcu_self_test_bh) - early_boot_test_call_rcu_bh(); - if (rcu_self_test_sched) - early_boot_test_call_rcu_sched(); rcu_test_sync_prims(); } @@ -931,15 +904,6 @@ static int rcu_verify_early_boot_tests(void) early_boot_test_counter++; rcu_barrier(); } - if (rcu_self_test_bh) { - early_boot_test_counter++; - rcu_barrier_bh(); - } - if (rcu_self_test_sched) { - early_boot_test_counter++; - rcu_barrier_sched(); - } - if (rcu_self_test_counter != early_boot_test_counter) { WARN_ON(1); ret = -1; diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TINY02.boot b/tools/testing/selftests/rcutorture/configs/rcu/TINY02.boot index 6c1a292a65fb..b39f1553a478 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TINY02.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/TINY02.boot @@ -1,3 +1 @@ rcupdate.rcu_self_test=1 -rcupdate.rcu_self_test_bh=1 -rcutorture.torture_type=rcu_bh diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot index 9f3a4d28e508..ea47da95374b 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot @@ -1,4 +1,4 @@ -rcutorture.torture_type=rcu_bh maxcpus=8 nr_cpus=43 +maxcpus=8 nr_cpus=43 rcutree.gp_preinit_delay=3 rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE04.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE04.boot index e6071bb96c7d..5adc6756792a 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE04.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE04.boot @@ -1 +1 @@ -rcutorture.torture_type=rcu_bh rcutree.rcu_fanout_leaf=4 nohz_full=1-7 +rcutree.rcu_fanout_leaf=4 nohz_full=1-7 diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot index c7fd050dfcd9..779f1aed4606 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot @@ -1,5 +1,3 @@ -rcutorture.torture_type=sched -rcupdate.rcu_self_test_sched=1 rcutree.gp_preinit_delay=3 rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE06.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE06.boot index ad18b52a2cad..055f4aa79077 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE06.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE06.boot @@ -1,6 +1,4 @@ rcupdate.rcu_self_test=1 -rcupdate.rcu_self_test_bh=1 -rcupdate.rcu_self_test_sched=1 rcutree.rcu_fanout_exact=1 rcutree.gp_preinit_delay=3 rcutree.gp_init_delay=3 diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE08.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE08.boot index 1bd8efc4141e..22478fd3a865 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE08.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE08.boot @@ -1,5 +1,3 @@ -rcutorture.torture_type=sched rcupdate.rcu_self_test=1 -rcupdate.rcu_self_test_sched=1 rcutree.rcu_fanout_exact=1 rcu_nocbs=0-7 From c770c82a238237d7e97b9101b9e44db14203de47 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 10:28:07 -0700 Subject: [PATCH 100/140] rcutorture: Remove the "rcu_bh" and "sched" torture types Now that the RCU-bh and RCU-sched update-side functions are simple wrappers around their RCU counterparts, there isn't a whole lot of point in testing them. This commit therefore removes the "rcu_bh" and "sched" torture types from rcutorture. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 89 ++--------------------------------------- 1 file changed, 3 insertions(+), 86 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 1bc0e37dffa8..a228ad762fba 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -128,7 +128,7 @@ torture_param(int, verbose, 1, static char *torture_type = "rcu"; module_param(torture_type, charp, 0444); -MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh, ...)"); +MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, srcu, ...)"); static int nrealreaders; static int ncbflooders; @@ -438,47 +438,6 @@ static struct rcu_torture_ops rcu_ops = { .name = "rcu" }; -/* - * Definitions for rcu_bh torture testing. - */ - -static int rcu_bh_torture_read_lock(void) __acquires(RCU_BH) -{ - rcu_read_lock_bh(); - return 0; -} - -static void rcu_bh_torture_read_unlock(int idx) __releases(RCU_BH) -{ - rcu_read_unlock_bh(); -} - -static void rcu_bh_torture_deferred_free(struct rcu_torture *p) -{ - call_rcu_bh(&p->rtort_rcu, rcu_torture_cb); -} - -static struct rcu_torture_ops rcu_bh_ops = { - .ttype = RCU_BH_FLAVOR, - .init = rcu_sync_torture_init, - .readlock = rcu_bh_torture_read_lock, - .read_delay = rcu_read_delay, /* just reuse rcu's version. */ - .readunlock = rcu_bh_torture_read_unlock, - .get_gp_seq = rcu_bh_get_gp_seq, - .gp_diff = rcu_seq_diff, - .deferred_free = rcu_bh_torture_deferred_free, - .sync = synchronize_rcu_bh, - .exp_sync = synchronize_rcu_bh_expedited, - .call = call_rcu_bh, - .cb_barrier = rcu_barrier_bh, - .fqs = rcu_bh_force_quiescent_state, - .stats = NULL, - .irq_capable = 1, - .extendables = (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ), - .ext_irq_conflict = RCUTORTURE_RDR_RCU, - .name = "rcu_bh" -}; - /* * Don't even think about trying any of these in real life!!! * The names includes "busted", and they really means it! @@ -666,48 +625,6 @@ static struct rcu_torture_ops busted_srcud_ops = { .name = "busted_srcud" }; -/* - * Definitions for sched torture testing. - */ - -static int sched_torture_read_lock(void) -{ - preempt_disable(); - return 0; -} - -static void sched_torture_read_unlock(int idx) -{ - preempt_enable(); -} - -static void rcu_sched_torture_deferred_free(struct rcu_torture *p) -{ - call_rcu_sched(&p->rtort_rcu, rcu_torture_cb); -} - -static struct rcu_torture_ops sched_ops = { - .ttype = RCU_SCHED_FLAVOR, - .init = rcu_sync_torture_init, - .readlock = sched_torture_read_lock, - .read_delay = rcu_read_delay, /* just reuse rcu's version. */ - .readunlock = sched_torture_read_unlock, - .get_gp_seq = rcu_sched_get_gp_seq, - .gp_diff = rcu_seq_diff, - .deferred_free = rcu_sched_torture_deferred_free, - .sync = synchronize_sched, - .exp_sync = synchronize_sched_expedited, - .get_state = get_state_synchronize_sched, - .cond_sync = cond_synchronize_sched, - .call = call_rcu_sched, - .cb_barrier = rcu_barrier_sched, - .fqs = rcu_sched_force_quiescent_state, - .stats = NULL, - .irq_capable = 1, - .extendables = RCUTORTURE_MAX_EXTEND, - .name = "sched" -}; - /* * Definitions for RCU-tasks torture testing. */ @@ -1956,8 +1873,8 @@ rcu_torture_init(void) int cpu; int firsterr = 0; static struct rcu_torture_ops *torture_ops[] = { - &rcu_ops, &rcu_bh_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops, - &busted_srcud_ops, &sched_ops, &tasks_ops, + &rcu_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops, + &busted_srcud_ops, &tasks_ops, }; if (!torture_init_begin(torture_type, verbose)) From 620d246065cdca4c4a8ad9ed28a191665cd3d457 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:25:10 -0700 Subject: [PATCH 101/140] rcuperf: Remove the "rcu_bh" and "sched" torture types Now that the RCU-bh and RCU-sched update-side functions are simple wrappers around their RCU counterparts, there isn't a whole lot of point in testing them. This commit therefore removes the "rcu_bh" and "sched" torture types from rcuperf. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcuperf.c | 65 ++------------------------------------------ 1 file changed, 2 insertions(+), 63 deletions(-) diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c index 34244523550e..8de53f3dc5b0 100644 --- a/kernel/rcu/rcuperf.c +++ b/kernel/rcu/rcuperf.c @@ -189,36 +189,6 @@ static struct rcu_perf_ops rcu_ops = { .name = "rcu" }; -/* - * Definitions for rcu_bh perf testing. - */ - -static int rcu_bh_perf_read_lock(void) __acquires(RCU_BH) -{ - rcu_read_lock_bh(); - return 0; -} - -static void rcu_bh_perf_read_unlock(int idx) __releases(RCU_BH) -{ - rcu_read_unlock_bh(); -} - -static struct rcu_perf_ops rcu_bh_ops = { - .ptype = RCU_BH_FLAVOR, - .init = rcu_sync_perf_init, - .readlock = rcu_bh_perf_read_lock, - .readunlock = rcu_bh_perf_read_unlock, - .get_gp_seq = rcu_bh_get_gp_seq, - .gp_diff = rcu_seq_diff, - .exp_completed = rcu_exp_batches_completed_sched, - .async = call_rcu_bh, - .gp_barrier = rcu_barrier_bh, - .sync = synchronize_rcu_bh, - .exp_sync = synchronize_rcu_bh_expedited, - .name = "rcu_bh" -}; - /* * Definitions for srcu perf testing. */ @@ -305,36 +275,6 @@ static struct rcu_perf_ops srcud_ops = { .name = "srcud" }; -/* - * Definitions for sched perf testing. - */ - -static int sched_perf_read_lock(void) -{ - preempt_disable(); - return 0; -} - -static void sched_perf_read_unlock(int idx) -{ - preempt_enable(); -} - -static struct rcu_perf_ops sched_ops = { - .ptype = RCU_SCHED_FLAVOR, - .init = rcu_sync_perf_init, - .readlock = sched_perf_read_lock, - .readunlock = sched_perf_read_unlock, - .get_gp_seq = rcu_sched_get_gp_seq, - .gp_diff = rcu_seq_diff, - .exp_completed = rcu_exp_batches_completed_sched, - .async = call_rcu_sched, - .gp_barrier = rcu_barrier_sched, - .sync = synchronize_sched, - .exp_sync = synchronize_sched_expedited, - .name = "sched" -}; - /* * Definitions for RCU-tasks perf testing. */ @@ -611,7 +551,7 @@ rcu_perf_cleanup(void) kfree(writer_n_durations); } - /* Do flavor-specific cleanup operations. */ + /* Do torture-type-specific cleanup operations. */ if (cur_ops->cleanup != NULL) cur_ops->cleanup(); @@ -661,8 +601,7 @@ rcu_perf_init(void) long i; int firsterr = 0; static struct rcu_perf_ops *perf_ops[] = { - &rcu_ops, &rcu_bh_ops, &srcu_ops, &srcud_ops, &sched_ops, - &tasks_ops, + &rcu_ops, &srcu_ops, &srcud_ops, &tasks_ops, }; if (!torture_init_begin(perf_type, verbose)) From de3875d3023310416d08eaab3c1a8527e9b452bf Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 21:43:17 -0700 Subject: [PATCH 102/140] rcu: Remove now-unused rcutorture APIs This commit removes rcu_sched_get_gp_seq(), rcu_bh_get_gp_seq(), rcu_exp_batches_completed_sched(), rcu_sched_force_quiescent_state(), and rcu_bh_force_quiescent_state(), which are no longer used because rcutorture no longer does "rcu_bh" and "rcu_sched" torture types. Signed-off-by: Paul E. McKenney --- kernel/rcu/rcu.h | 10 ---------- kernel/rcu/tree.c | 47 ----------------------------------------------- 2 files changed, 57 deletions(-) diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index 2bb77fddc11f..aa3dc08af4b3 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -509,29 +509,19 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type, #ifdef CONFIG_TINY_RCU static inline unsigned long rcu_get_gp_seq(void) { return 0; } -static inline unsigned long rcu_bh_get_gp_seq(void) { return 0; } -static inline unsigned long rcu_sched_get_gp_seq(void) { return 0; } static inline unsigned long rcu_exp_batches_completed(void) { return 0; } -static inline unsigned long rcu_exp_batches_completed_sched(void) { return 0; } static inline unsigned long srcu_batches_completed(struct srcu_struct *sp) { return 0; } static inline void rcu_force_quiescent_state(void) { } -static inline void rcu_bh_force_quiescent_state(void) { } -static inline void rcu_sched_force_quiescent_state(void) { } static inline void show_rcu_gp_kthreads(void) { } static inline int rcu_get_gp_kthreads_prio(void) { return 0; } #else /* #ifdef CONFIG_TINY_RCU */ unsigned long rcu_get_gp_seq(void); -unsigned long rcu_bh_get_gp_seq(void); -unsigned long rcu_sched_get_gp_seq(void); unsigned long rcu_exp_batches_completed(void); -unsigned long rcu_exp_batches_completed_sched(void); unsigned long srcu_batches_completed(struct srcu_struct *sp); void show_rcu_gp_kthreads(void); int rcu_get_gp_kthreads_prio(void); void rcu_force_quiescent_state(void); -void rcu_bh_force_quiescent_state(void); -void rcu_sched_force_quiescent_state(void); extern struct workqueue_struct *rcu_gp_wq; extern struct workqueue_struct *rcu_par_gp_wq; #endif /* #else #ifdef CONFIG_TINY_RCU */ diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 1a2551a4d583..5e14a19c066c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -490,25 +490,6 @@ unsigned long rcu_get_gp_seq(void) } EXPORT_SYMBOL_GPL(rcu_get_gp_seq); -/* - * Return the number of RCU-sched GPs completed thus far for debug & stats. - */ -unsigned long rcu_sched_get_gp_seq(void) -{ - return rcu_get_gp_seq(); -} -EXPORT_SYMBOL_GPL(rcu_sched_get_gp_seq); - -/* - * Return the number of RCU GPs completed thus far for debug & stats. - * This is a transitional API and will soon be removed. - */ -unsigned long rcu_bh_get_gp_seq(void) -{ - return READ_ONCE(rcu_state.gp_seq); -} -EXPORT_SYMBOL_GPL(rcu_bh_get_gp_seq); - /* * Return the number of RCU expedited batches completed thus far for * debug & stats. Odd numbers mean that a batch is in progress, even @@ -521,16 +502,6 @@ unsigned long rcu_exp_batches_completed(void) } EXPORT_SYMBOL_GPL(rcu_exp_batches_completed); -/* - * Return the number of RCU-sched expedited batches completed thus far - * for debug & stats. Similar to rcu_exp_batches_completed(). - */ -unsigned long rcu_exp_batches_completed_sched(void) -{ - return rcu_state.expedited_sequence; -} -EXPORT_SYMBOL_GPL(rcu_exp_batches_completed_sched); - /* * Force a quiescent state. */ @@ -540,24 +511,6 @@ void rcu_force_quiescent_state(void) } EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); -/* - * Force a quiescent state for RCU BH. - */ -void rcu_bh_force_quiescent_state(void) -{ - force_quiescent_state(); -} -EXPORT_SYMBOL_GPL(rcu_bh_force_quiescent_state); - -/* - * Force a quiescent state for RCU-sched. - */ -void rcu_sched_force_quiescent_state(void) -{ - rcu_force_quiescent_state(); -} -EXPORT_SYMBOL_GPL(rcu_sched_force_quiescent_state); - /* * Show the state of the grace-period kthreads. */ From 2bd8b1a2afc4463cc503665e98faa5909d1ac462 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 103/140] rcu: Clean up flavor-related definitions and comments in rcupdate.h Signed-off-by: Paul E. McKenney --- include/linux/rcupdate.h | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 12103e1bbe67..d6d543b60a9f 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -119,11 +119,10 @@ static inline void rcu_init_nohz(void) { } * RCU_NONIDLE - Indicate idle-loop code that needs RCU readers * @a: Code that RCU needs to pay attention to. * - * RCU, RCU-bh, and RCU-sched read-side critical sections are forbidden - * in the inner idle loop, that is, between the rcu_idle_enter() and - * the rcu_idle_exit() -- RCU will happily ignore any such read-side - * critical sections. However, things like powertop need tracepoints - * in the inner idle loop. + * RCU read-side critical sections are forbidden in the inner idle loop, + * that is, between the rcu_idle_enter() and the rcu_idle_exit() -- RCU + * will happily ignore any such read-side critical sections. However, + * things like powertop need tracepoints in the inner idle loop. * * This macro provides the way out: RCU_NONIDLE(do_something_with_RCU()) * will tell RCU that it needs to pay attention, invoke its argument @@ -163,7 +162,7 @@ void exit_tasks_rcu_finish(void); #else /* #ifdef CONFIG_TASKS_RCU */ #define rcu_tasks_qs(t) do { } while (0) #define rcu_note_voluntary_context_switch(t) rcu_all_qs() -#define call_rcu_tasks call_rcu_sched +#define call_rcu_tasks call_rcu #define synchronize_rcu_tasks synchronize_rcu static inline void exit_tasks_rcu_start(void) { } static inline void exit_tasks_rcu_finish(void) { } @@ -309,8 +308,8 @@ static inline void rcu_preempt_sleep_check(void) { } * Helper functions for rcu_dereference_check(), rcu_dereference_protected() * and rcu_assign_pointer(). Some of these could be folded into their * callers, but they are left separate in order to ease introduction of - * multiple flavors of pointers to match the multiple flavors of RCU - * (e.g., __rcu_sched, and __srcu), should this make sense in the future. + * multiple pointers markings to match different RCU implementations + * (e.g., __srcu), should this make sense in the future. */ #ifdef __CHECKER__ @@ -670,9 +669,8 @@ static inline void rcu_read_unlock(void) * rcu_read_lock_bh() - mark the beginning of an RCU-bh critical section * * This is equivalent of rcu_read_lock(), but also disables softirqs. - * Note that synchronize_rcu() and friends may be used for the update - * side, although synchronize_rcu_bh() is available as a wrapper in the - * short term. Longer term, the _bh update-side API will be eliminated. + * Note that anything else that disables softirqs can also serve as + * an RCU read-side critical section. * * Note that rcu_read_lock_bh() and the matching rcu_read_unlock_bh() * must occur in the same context, for example, it is illegal to invoke @@ -705,10 +703,9 @@ static inline void rcu_read_unlock_bh(void) /** * rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section * - * This is equivalent of rcu_read_lock(), but to be used when updates - * are being done using call_rcu_sched() or synchronize_rcu_sched(). - * Read-side critical sections can also be introduced by anything that - * disables preemption, including local_irq_disable() and friends. + * This is equivalent of rcu_read_lock(), but disables preemption. + * Read-side critical sections can also be introduced by anything else + * that disables preemption, including local_irq_disable() and friends. * * Note that rcu_read_lock_sched() and the matching rcu_read_unlock_sched() * must occur in the same context, for example, it is illegal to invoke From aff5f0369e312b0ab0ca7a2a12dd64b7e39c7091 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 104/140] rcu: Clean up flavor-related definitions and comments in rculist.h Signed-off-by: Paul E. McKenney --- include/linux/rculist.h | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/include/linux/rculist.h b/include/linux/rculist.h index 4786c2235b98..e91ec9ddcd30 100644 --- a/include/linux/rculist.h +++ b/include/linux/rculist.h @@ -182,7 +182,7 @@ static inline void list_replace_rcu(struct list_head *old, * @list: the RCU-protected list to splice * @prev: points to the last element of the existing list * @next: points to the first element of the existing list - * @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... + * @sync: synchronize_rcu, synchronize_rcu_expedited, ... * * The list pointed to by @prev and @next can be RCU-read traversed * concurrently with this function. @@ -240,7 +240,7 @@ static inline void __list_splice_init_rcu(struct list_head *list, * designed for stacks. * @list: the RCU-protected list to splice * @head: the place in the existing list to splice the first list into - * @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... + * @sync: synchronize_rcu, synchronize_rcu_expedited, ... */ static inline void list_splice_init_rcu(struct list_head *list, struct list_head *head, @@ -255,7 +255,7 @@ static inline void list_splice_init_rcu(struct list_head *list, * list, designed for queues. * @list: the RCU-protected list to splice * @head: the place in the existing list to splice the first list into - * @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... + * @sync: synchronize_rcu, synchronize_rcu_expedited, ... */ static inline void list_splice_tail_init_rcu(struct list_head *list, struct list_head *head, @@ -359,13 +359,12 @@ static inline void list_splice_tail_init_rcu(struct list_head *list, * @type: the type of the struct this is embedded in. * @member: the name of the list_head within the struct. * - * This primitive may safely run concurrently with the _rcu list-mutation - * primitives such as list_add_rcu(), but requires some implicit RCU - * read-side guarding. One example is running within a special - * exception-time environment where preemption is disabled and where - * lockdep cannot be invoked (in which case updaters must use RCU-sched, - * as in synchronize_sched(), call_rcu_sched(), and friends). Another - * example is when items are added to the list, but never deleted. + * This primitive may safely run concurrently with the _rcu + * list-mutation primitives such as list_add_rcu(), but requires some + * implicit RCU read-side guarding. One example is running within a special + * exception-time environment where preemption is disabled and where lockdep + * cannot be invoked. Another example is when items are added to the list, + * but never deleted. */ #define list_entry_lockless(ptr, type, member) \ container_of((typeof(ptr))READ_ONCE(ptr), type, member) @@ -376,13 +375,12 @@ static inline void list_splice_tail_init_rcu(struct list_head *list, * @head: the head for your list. * @member: the name of the list_struct within the struct. * - * This primitive may safely run concurrently with the _rcu list-mutation - * primitives such as list_add_rcu(), but requires some implicit RCU - * read-side guarding. One example is running within a special - * exception-time environment where preemption is disabled and where - * lockdep cannot be invoked (in which case updaters must use RCU-sched, - * as in synchronize_sched(), call_rcu_sched(), and friends). Another - * example is when items are added to the list, but never deleted. + * This primitive may safely run concurrently with the _rcu + * list-mutation primitives such as list_add_rcu(), but requires some + * implicit RCU read-side guarding. One example is running within a special + * exception-time environment where preemption is disabled and where lockdep + * cannot be invoked. Another example is when items are added to the list, + * but never deleted. */ #define list_for_each_entry_lockless(pos, head, member) \ for (pos = list_entry_lockless((head)->next, typeof(*pos), member); \ From df8561a0d7e4f5cb72d0aa6be43e154b2027bba6 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 105/140] rcu: Clean up flavor-related definitions and comments in rcupdate_wait.h Signed-off-by: Paul E. McKenney --- include/linux/rcupdate_wait.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/linux/rcupdate_wait.h b/include/linux/rcupdate_wait.h index bc104699560e..8a16c3eb3dd0 100644 --- a/include/linux/rcupdate_wait.h +++ b/include/linux/rcupdate_wait.h @@ -33,17 +33,17 @@ do { \ /** * synchronize_rcu_mult - Wait concurrently for multiple grace periods - * @...: List of call_rcu() functions for the flavors to wait on. + * @...: List of call_rcu() functions for different grace periods to wait on * - * This macro waits concurrently for multiple flavors of RCU grace periods. - * For example, synchronize_rcu_mult(call_rcu, call_rcu_sched) would wait - * on concurrent RCU and RCU-sched grace periods. Waiting on a give SRCU + * This macro waits concurrently for multiple types of RCU grace periods. + * For example, synchronize_rcu_mult(call_rcu, call_rcu_tasks) would wait + * on concurrent RCU and RCU-tasks grace periods. Waiting on a give SRCU * domain requires you to write a wrapper function for that SRCU domain's * call_srcu() function, supplying the corresponding srcu_struct. * - * If Tiny RCU, tell _wait_rcu_gp() not to bother waiting for RCU - * or RCU-sched, given that anywhere synchronize_rcu_mult() can be called - * is automatically a grace period. + * If Tiny RCU, tell _wait_rcu_gp() does not bother waiting for RCU, + * given that anywhere synchronize_rcu_mult() can be called is automatically + * a grace period. */ #define synchronize_rcu_mult(...) \ _wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__) From 8c1cf2da6f8af7f6b6e0e06d3a83115712cc04b8 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 106/140] rcu: Clean up flavor-related definitions and comments in Kconfig Signed-off-by: Paul E. McKenney --- kernel/rcu/Kconfig | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig index a0b7f0103ca9..939a2056c87a 100644 --- a/kernel/rcu/Kconfig +++ b/kernel/rcu/Kconfig @@ -196,7 +196,7 @@ config RCU_BOOST This option boosts the priority of preempted RCU readers that block the current preemptible RCU grace period for too long. This option also prevents heavy loads from blocking RCU - callback invocation for all flavors of RCU. + callback invocation. Say Y here if you are working with real-time apps or heavy loads Say N here if you are unsure. @@ -225,15 +225,15 @@ config RCU_NOCB_CPU callback invocation to energy-efficient CPUs in battery-powered asymmetric multiprocessors. - This option offloads callback invocation from the set of - CPUs specified at boot time by the rcu_nocbs parameter. - For each such CPU, a kthread ("rcuox/N") will be created to - invoke callbacks, where the "N" is the CPU being offloaded, - and where the "p" for RCU-preempt and "s" for RCU-sched. - Nothing prevents this kthread from running on the specified - CPUs, but (1) the kthreads may be preempted between each - callback, and (2) affinity or cgroups can be used to force - the kthreads to run on whatever set of CPUs is desired. + This option offloads callback invocation from the set of CPUs + specified at boot time by the rcu_nocbs parameter. For each + such CPU, a kthread ("rcuox/N") will be created to invoke + callbacks, where the "N" is the CPU being offloaded, and where + the "p" for RCU-preempt (PREEMPT kernels) and "s" for RCU-sched + (!PREEMPT kernels). Nothing prevents this kthread from running + on the specified CPUs, but (1) the kthreads may be preempted + between each callback, and (2) affinity or cgroups can be used + to force the kthreads to run on whatever set of CPUs is desired. Say Y here if you want to help to debug reduced OS jitter. Say N here if you are unsure. From 7f87c036fea3c17eb6a6e4f4164c67aeb98710ea Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 107/140] rcu: Clean up flavor-related definitions and comments in rcu.h Signed-off-by: Paul E. McKenney --- kernel/rcu/rcu.h | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index aa3dc08af4b3..5dec94509a7e 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -176,8 +176,9 @@ static inline unsigned long rcu_seq_diff(unsigned long new, unsigned long old) /* * debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally - * by call_rcu() and rcu callback execution, and are therefore not part of the - * RCU API. Leaving in rcupdate.h because they are used by all RCU flavors. + * by call_rcu() and rcu callback execution, and are therefore not part + * of the RCU API. These are in rcupdate.h because they are used by all + * RCU implementations. */ #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD @@ -328,7 +329,7 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt) } } -/* Returns first leaf rcu_node of the specified RCU flavor. */ +/* Returns a pointer to the first leaf rcu_node structure. */ #define rcu_first_leaf_node() (rcu_state.level[rcu_num_lvls - 1]) /* Is this rcu_node a leaf? */ @@ -339,7 +340,8 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt) /* * Do a full breadth-first scan of the {s,}rcu_node structures for the - * specified rcu_state structure. + * specified state structure (for SRCU) or the only rcu_state structure + * (for RCU). */ #define srcu_for_each_node_breadth_first(sp, rnp) \ for ((rnp) = &(sp)->node[0]; \ @@ -348,10 +350,10 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt) srcu_for_each_node_breadth_first(&rcu_state, rnp) /* - * Scan the leaves of the rcu_node hierarchy for the specified rcu_state - * structure. Note that if there is a singleton rcu_node tree with but - * one rcu_node structure, this loop -will- visit the rcu_node structure. - * It is still a leaf node, even if it is also the root node. + * Scan the leaves of the rcu_node hierarchy for the rcu_state structure. + * Note that if there is a singleton rcu_node tree with but one rcu_node + * structure, this loop -will- visit the rcu_node structure. It is still + * a leaf node, even if it is also the root node. */ #define rcu_for_each_leaf_node(rnp) \ for ((rnp) = rcu_first_leaf_node(); \ From 62a1a945368ff8b4011dfc791f89152ef3da0ecf Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 108/140] rcu: Clean up flavor-related definitions and comments in rcutorture.c Signed-off-by: Paul E. McKenney --- kernel/rcu/rcutorture.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index a228ad762fba..294b3f6b7eb6 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -1221,7 +1221,7 @@ static void rcutorture_loop_extend(int *readstate, WARN_ON_ONCE(!*readstate); /* -Existing- RCU read-side critsect! */ if (!((mask - 1) & mask)) - return; /* Current RCU flavor not extendable. */ + return; /* Current RCU reader not extendable. */ i = (torture_random(trsp) >> 3) & RCUTORTURE_RDR_MAX_LOOPS; while (i--) { mask = rcutorture_extend_mask(*readstate, trsp); @@ -1790,7 +1790,7 @@ rcu_torture_cleanup(void) cpuhp_remove_state(rcutor_hp); /* - * Wait for all RCU callbacks to fire, then do flavor-specific + * Wait for all RCU callbacks to fire, then do torture-type-specific * cleanup operations. */ if (cur_ops->cb_barrier != NULL) From 6eb95cc4507a765de06d30028390da1b4a9c8e5c Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 109/140] rcu: Clean up flavor-related definitions and comments in srcutree.h Signed-off-by: Paul E. McKenney --- kernel/rcu/srcutree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 2042080cd38b..7f266b0f9832 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -980,7 +980,7 @@ EXPORT_SYMBOL_GPL(synchronize_srcu_expedited); * There are memory-ordering constraints implied by synchronize_srcu(). * On systems with more than one CPU, when synchronize_srcu() returns, * each CPU is guaranteed to have executed a full memory barrier since - * the end of its last corresponding SRCU-sched read-side critical section + * the end of its last corresponding SRCU read-side critical section * whose beginning preceded the call to synchronize_srcu(). In addition, * each CPU having an SRCU read-side critical section that extends beyond * the return from synchronize_srcu() is guaranteed to have executed a From 679d3f30923eb687ce3bcd3dfaf108a2809d5a57 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 110/140] rcu: Clean up flavor-related definitions and comments in tiny.c Signed-off-by: Paul E. McKenney --- kernel/rcu/tiny.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index 30826fb6e438..a77853b73bfe 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -117,9 +117,9 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused /* * Wait for a grace period to elapse. But it is illegal to invoke - * synchronize_sched() from within an RCU read-side critical section. - * Therefore, any legal call to synchronize_sched() is a quiescent - * state, and so on a UP system, synchronize_sched() need do nothing. + * synchronize_rcu() from within an RCU read-side critical section. + * Therefore, any legal call to synchronize_rcu() is a quiescent + * state, and so on a UP system, synchronize_rcu() need do nothing. * (But Lai Jiangshan points out the benefits of doing might_sleep() * to reduce latency.) * @@ -130,12 +130,12 @@ void synchronize_rcu(void) RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || lock_is_held(&rcu_lock_map) || lock_is_held(&rcu_sched_lock_map), - "Illegal synchronize_sched() in RCU read-side critical section"); + "Illegal synchronize_rcu() in RCU read-side critical section"); } EXPORT_SYMBOL_GPL(synchronize_rcu); /* - * Post an RCU callback to be invoked after the end of an RCU-sched grace + * Post an RCU callback to be invoked after the end of an RCU grace * period. But since we have but one CPU, that would be after any * quiescent state. */ From 49918a54e63c99899aa3aa64d456c5bf14122e5a Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 111/140] rcu: Clean up flavor-related definitions and comments in tree.c Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 74 ++++++++++++++++++++--------------------------- 1 file changed, 32 insertions(+), 42 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 5e14a19c066c..c8761e7c7c00 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -382,12 +382,11 @@ static int rcu_is_cpu_rrupt_from_idle(void) } /* - * Register a quiescent state for all RCU flavors. If there is an + * Register an urgently needed quiescent state. If there is an * emergency, invoke rcu_momentary_dyntick_idle() to do a heavy-weight - * dyntick-idle quiescent state visible to other CPUs (but only for those - * RCU flavors in desperate need of a quiescent state, which will normally - * be none of them). Either way, do a lightweight quiescent state for - * all RCU flavors. + * dyntick-idle quiescent state visible to other CPUs, which will in + * some cases serve for expedited as well as normal grace periods. + * Either way, register a lightweight quiescent state. * * The barrier() calls are redundant in the common case when this is * called externally, but just in case this is called from within this @@ -564,7 +563,7 @@ void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, EXPORT_SYMBOL_GPL(rcutorture_get_gp_data); /* - * Return the root node of the specified rcu_state structure. + * Return the root node of the rcu_state structure. */ static struct rcu_node *rcu_get_root(void) { @@ -949,11 +948,7 @@ void rcu_request_urgent_qs_task(struct task_struct *t) * Disable preemption to avoid false positives that could otherwise * happen due to the current CPU number being sampled, this task being * preempted, its old CPU being taken offline, resuming on some other CPU, - * then determining that its old CPU is now offline. Because there are - * multiple flavors of RCU, and because this function can be called in the - * midst of updating the flavors while a given CPU coming online or going - * offline, it is necessary to check all flavors. If any of the flavors - * believe that given CPU is online, it is considered to be online. + * then determining that its old CPU is now offline. * * Disable checking if in an NMI handler because we cannot safely * report errors from NMI handlers anyway. In addition, it is OK to use @@ -1563,11 +1558,10 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp) } /* - * Awaken the grace-period kthread for the specified flavor of RCU. - * Don't do a self-awaken, and don't bother awakening when there is - * nothing for the grace-period kthread to do (as in several CPUs - * raced to awaken, and we lost), and finally don't try to awaken - * a kthread that has not yet been created. + * Awaken the grace-period kthread. Don't do a self-awaken, and don't + * bother awakening when there is nothing for the grace-period kthread + * to do (as in several CPUs raced to awaken, and we lost), and finally + * don't try to awaken a kthread that has not yet been created. */ static void rcu_gp_kthread_wake(void) { @@ -2119,13 +2113,13 @@ static int __noreturn rcu_gp_kthread(void *unused) } /* - * Report a full set of quiescent states to the specified rcu_state data - * structure. Invoke rcu_gp_kthread_wake() to awaken the grace-period - * kthread if another grace period is required. Whether we wake - * the grace-period kthread or it awakens itself for the next round - * of quiescent-state forcing, that kthread will clean up after the - * just-completed grace period. Note that the caller must hold rnp->lock, - * which is released before return. + * Report a full set of quiescent states to the rcu_state data structure. + * Invoke rcu_gp_kthread_wake() to awaken the grace-period kthread if + * another grace period is required. Whether we wake the grace-period + * kthread or it awakens itself for the next round of quiescent-state + * forcing, that kthread will clean up after the just-completed grace + * period. Note that the caller must hold rnp->lock, which is released + * before return. */ static void rcu_report_qs_rsp(unsigned long flags) __releases(rcu_get_root()->lock) @@ -2212,7 +2206,7 @@ static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, /* * Record a quiescent state for all tasks that were previously queued * on the specified rcu_node structure and that were blocking the current - * RCU grace period. The caller must hold the specified rnp->lock with + * RCU grace period. The caller must hold the corresponding rnp->lock with * irqs disabled, and this lock is released upon return, but irqs remain * disabled. */ @@ -2714,11 +2708,11 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused } /* - * Schedule RCU callback invocation. If the specified type of RCU - * does not support RCU priority boosting, just do a direct call, - * otherwise wake up the per-CPU kernel kthread. Note that because we - * are running on the current CPU with softirqs disabled, the - * rcu_cpu_kthread_task cannot disappear out from under us. + * Schedule RCU callback invocation. If the running implementation of RCU + * does not support RCU priority boosting, just do a direct call, otherwise + * wake up the per-CPU kernel kthread. Note that because we are running + * on the current CPU with softirqs disabled, the rcu_cpu_kthread_task + * cannot disappear out from under us. */ static void invoke_rcu_callbacks(struct rcu_data *rdp) { @@ -2959,11 +2953,10 @@ EXPORT_SYMBOL_GPL(cond_synchronize_rcu); /* * Check to see if there is any immediate RCU-related work to be done by - * the current CPU, for the specified type of RCU, returning 1 if so and - * zero otherwise. The checks are in order of increasing expense: checks - * that can be carried out against CPU-local state are performed first. - * However, we must check for CPU stalls first, else we might not get - * a chance. + * the current CPU, returning 1 if so and zero otherwise. The checks are + * in order of increasing expense: checks that can be carried out against + * CPU-local state are performed first. However, we must check for CPU + * stalls first, else we might not get a chance. */ static int rcu_pending(void) { @@ -3070,10 +3063,7 @@ static void rcu_barrier_func(void *unused) } } -/* - * Orchestrate the specified type of RCU barrier, waiting for all - * RCU callbacks of the specified type to complete. - */ +/* Orchestrate an RCU barrier, waiting for all RCU callbacks to complete. */ static void _rcu_barrier(void) { int cpu; @@ -3393,7 +3383,7 @@ void rcu_report_dead(unsigned int cpu) struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ - /* QS for any half-done expedited RCU-sched GP. */ + /* QS for any half-done expedited grace period. */ preempt_disable(); rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); preempt_enable(); @@ -3482,7 +3472,7 @@ static int rcu_pm_notify(struct notifier_block *self, } /* - * Spawn the kthreads that handle each RCU flavor's grace periods. + * Spawn the kthreads that handle RCU's grace periods. */ static int __init rcu_spawn_gp_kthread(void) { @@ -3545,7 +3535,7 @@ void rcu_scheduler_starting(void) } /* - * Helper function for rcu_init() that initializes one rcu_state structure. + * Helper function for rcu_init() that initializes the rcu_state structure. */ static void __init rcu_init_one(void) { @@ -3707,7 +3697,7 @@ static void __init rcu_init_geometry(void) /* * Dump out the structure of the rcu_node combining tree associated - * with the rcu_state structure referenced by rsp. + * with the rcu_state structure. */ static void __init rcu_dump_rcu_node_tree(void) { From 8fa946d42855c2e3a481bf105aa2b25cefebe111 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 112/140] rcu: Clean up flavor-related definitions and comments in tree_exp.h Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_exp.h | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 060bdb45cd95..78553a8fa3c6 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -601,8 +601,8 @@ static void wait_rcu_exp_gp(struct work_struct *wp) } /* - * Given an rcu_state pointer and a smp_call_function() handler, kick - * off the specified flavor of expedited grace period. + * Given a smp_call_function() handler, kick off the specified + * implementation of expedited grace period. */ static void _synchronize_rcu_expedited(smp_call_func_t func) { @@ -721,7 +721,7 @@ static void sync_rcu_exp_handler(void *unused) resched_cpu(rdp->cpu); } -/* PREEMPT=y, so no RCU-sched to clean up after. */ +/* PREEMPT=y, so no PREEMPT=n expedited grace period to clean up after. */ static void sync_sched_exp_online_cleanup(int cpu) { } @@ -798,13 +798,13 @@ static void sync_sched_exp_online_cleanup(int cpu) } /* - * Because a context switch is a grace period for RCU-sched, any blocking - * grace-period wait automatically implies a grace period if there - * is only one CPU online at any point time during execution of either - * synchronize_sched() or synchronize_rcu_bh(). It is OK to occasionally - * incorrectly indicate that there are multiple CPUs online when there - * was in fact only one the whole time, as this just adds some overhead: - * RCU still operates correctly. + * Because a context switch is a grace period for !PREEMPT, any + * blocking grace-period wait automatically implies a grace period if + * there is only one CPU online at any point time during execution of + * either synchronize_rcu() or synchronize_rcu_expedited(). It is OK to + * occasionally incorrectly indicate that there are multiple CPUs online + * when there was in fact only one the whole time, as this just adds some + * overhead: RCU still operates correctly. */ static int rcu_blocking_is_gp(void) { @@ -823,7 +823,7 @@ void synchronize_rcu_expedited(void) RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || lock_is_held(&rcu_lock_map) || lock_is_held(&rcu_sched_lock_map), - "Illegal synchronize_sched_expedited() in RCU read-side critical section"); + "Illegal synchronize_rcu_expedited() in RCU read-side critical section"); /* If only one CPU, this is automatically a grace period. */ if (rcu_blocking_is_gp()) From 0ae86a272656b34edfe90a637363d10f470c65d8 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 7 Jul 2018 18:12:26 -0700 Subject: [PATCH 113/140] rcu: Clean up flavor-related definitions and comments in tree_plugin.h Signed-off-by: Paul E. McKenney --- kernel/rcu/tree_plugin.h | 36 +++++++++++++++++------------------- 1 file changed, 17 insertions(+), 19 deletions(-) diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index cd276c46bc14..cd4c1b979446 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -38,8 +38,7 @@ #include "../locking/rtmutex_common.h" /* - * Control variables for per-CPU and per-rcu_node kthreads. These - * handle all flavors of RCU. + * Control variables for per-CPU and per-rcu_node kthreads. */ static DEFINE_PER_CPU(struct task_struct *, rcu_cpu_kthread_task); DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_status); @@ -826,8 +825,8 @@ static void rcu_flavor_check_callbacks(int user) * * Note that this guarantee implies further memory-ordering guarantees. * On systems with more than one CPU, when synchronize_rcu() returns, - * each CPU is guaranteed to have executed a full memory barrier since the - * end of its last RCU-sched read-side critical section whose beginning + * each CPU is guaranteed to have executed a full memory barrier since + * the end of its last RCU read-side critical section whose beginning * preceded the call to synchronize_rcu(). In addition, each CPU having * an RCU read-side critical section that extends beyond the return from * synchronize_rcu() is guaranteed to have executed a full memory barrier @@ -1069,7 +1068,7 @@ void synchronize_rcu(void) RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || lock_is_held(&rcu_lock_map) || lock_is_held(&rcu_sched_lock_map), - "Illegal synchronize_rcu() in RCU-sched read-side critical section"); + "Illegal synchronize_rcu() in RCU read-side critical section"); if (rcu_blocking_is_gp()) return; if (rcu_gp_is_expedited()) @@ -1341,9 +1340,9 @@ static int rcu_cpu_kthread_should_run(unsigned int cpu) } /* - * Per-CPU kernel thread that invokes RCU callbacks. This replaces the - * RCU softirq used in flavors and configurations of RCU that do not - * support RCU priority boosting. + * Per-CPU kernel thread that invokes RCU callbacks. This replaces + * the RCU softirq used in configurations of RCU that do not support RCU + * priority boosting. */ static void rcu_cpu_kthread(unsigned int cpu) { @@ -1484,8 +1483,8 @@ static void rcu_prepare_kthreads(int cpu) * 1 if so. This function is part of the RCU implementation; it is -not- * an exported member of the RCU API. * - * Because we not have RCU_FAST_NO_HZ, just check whether this CPU needs - * any flavor of RCU. + * Because we not have RCU_FAST_NO_HZ, just check whether or not this + * CPU has RCU callbacks queued. */ int rcu_needs_cpu(u64 basemono, u64 *nextevt) { @@ -1551,9 +1550,9 @@ static int rcu_idle_lazy_gp_delay = RCU_IDLE_LAZY_GP_DELAY; module_param(rcu_idle_lazy_gp_delay, int, 0644); /* - * Try to advance callbacks for all flavors of RCU on the current CPU, but - * only if it has been awhile since the last time we did so. Afterwards, - * if there are any callbacks ready for immediate invocation, return true. + * Try to advance callbacks on the current CPU, but only if it has been + * awhile since the last time we did so. Afterwards, if there are any + * callbacks ready for immediate invocation, return true. */ static bool __maybe_unused rcu_try_advance_all_cbs(void) { @@ -1808,7 +1807,7 @@ static void print_cpu_stall_info_end(void) pr_err("\t"); } -/* Zero ->ticks_this_gp for all flavors of RCU. */ +/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */ static void zero_cpu_stall_ticks(struct rcu_data *rdp) { rdp->ticks_this_gp = 0; @@ -1939,7 +1938,7 @@ static void wake_nocb_leader_defer(struct rcu_data *rdp, int waketype, } /* - * Does the specified CPU need an RCU callback for the specified flavor + * Does the specified CPU need an RCU callback for this invocation * of rcu_barrier()? */ static bool rcu_nocb_cpu_needs_barrier(int cpu) @@ -2419,9 +2418,8 @@ static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp) /* * If the specified CPU is a no-CBs CPU that does not already have its - * rcuo kthread for the specified RCU flavor, spawn it. If the CPUs are - * brought online out of order, this can require re-organizing the - * leader-follower relationships. + * rcuo kthread, spawn it. If the CPUs are brought online out of order, + * this can require re-organizing the leader-follower relationships. */ static void rcu_spawn_one_nocb_kthread(int cpu) { @@ -2458,7 +2456,7 @@ static void rcu_spawn_one_nocb_kthread(int cpu) rdp_spawn->nocb_next_follower = rdp_old_leader; } - /* Spawn the kthread for this CPU and RCU flavor. */ + /* Spawn the kthread for this CPU. */ t = kthread_run(rcu_nocb_kthread, rdp_spawn, "rcuo%c/%d", rcu_state.abbr, cpu); BUG_ON(IS_ERR(t)); From 06462efc808c956f462ec5c3b5e10bbee0be2545 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sun, 8 Jul 2018 10:58:37 -0700 Subject: [PATCH 114/140] rcu: Clean up flavor-related definitions and comments in update.c Signed-off-by: Paul E. McKenney --- kernel/rcu/update.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index ee366faecea6..fa089ead4bd6 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -332,7 +332,7 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array, int i; int j; - /* Initialize and register callbacks for each flavor specified. */ + /* Initialize and register callbacks for each crcu_array element. */ for (i = 0; i < n; i++) { if (checktiny && (crcu_array[i] == call_rcu || @@ -697,19 +697,19 @@ static int __noreturn rcu_tasks_kthread(void *arg) /* * Wait for all pre-existing t->on_rq and t->nvcsw - * transitions to complete. Invoking synchronize_sched() + * transitions to complete. Invoking synchronize_rcu() * suffices because all these transitions occur with - * interrupts disabled. Without this synchronize_sched(), + * interrupts disabled. Without this synchronize_rcu(), * a read-side critical section that started before the * grace period might be incorrectly seen as having started * after the grace period. * - * This synchronize_sched() also dispenses with the + * This synchronize_rcu() also dispenses with the * need for a memory barrier on the first store to * ->rcu_tasks_holdout, as it forces the store to happen * after the beginning of the grace period. */ - synchronize_sched(); + synchronize_rcu(); /* * There were callbacks, so we need to wait for an @@ -736,7 +736,7 @@ static int __noreturn rcu_tasks_kthread(void *arg) * This does only part of the job, ensuring that all * tasks that were previously exiting reach the point * where they have disabled preemption, allowing the - * later synchronize_sched() to finish the job. + * later synchronize_rcu() to finish the job. */ synchronize_srcu(&tasks_rcu_exit_srcu); @@ -786,20 +786,20 @@ static int __noreturn rcu_tasks_kthread(void *arg) * cause their RCU-tasks read-side critical sections to * extend past the end of the grace period. However, * because these ->nvcsw updates are carried out with - * interrupts disabled, we can use synchronize_sched() + * interrupts disabled, we can use synchronize_rcu() * to force the needed ordering on all such CPUs. * - * This synchronize_sched() also confines all + * This synchronize_rcu() also confines all * ->rcu_tasks_holdout accesses to be within the grace * period, avoiding the need for memory barriers for * ->rcu_tasks_holdout accesses. * - * In addition, this synchronize_sched() waits for exiting + * In addition, this synchronize_rcu() waits for exiting * tasks to complete their final preempt_disable() region * of execution, cleaning up after the synchronize_srcu() * above. */ - synchronize_sched(); + synchronize_rcu(); /* Invoke the callbacks. */ while (list) { From 4d232dfe1df35254298e7986c4de8c9f63f58c79 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 10 Jul 2018 12:53:40 -0700 Subject: [PATCH 115/140] rcu: Remove !PREEMPT code from rcu_note_voluntary_context_switch() Because RCU-tasks exists only in PREEMPT kernels and because RCU-sched no longer exists in PREEMPT kernels, it is no longer necessary for the rcu_note_voluntary_context_switch() macro to do anything for !PREEMPT kernels. This commit therefore removes !PREEMPT-related code from this macro, namely, the rcu_all_qs(). Signed-off-by: Paul E. McKenney --- include/linux/rcupdate.h | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index d6d543b60a9f..e4f821165d0b 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -150,18 +150,14 @@ static inline void rcu_init_nohz(void) { } if (READ_ONCE((t)->rcu_tasks_holdout)) \ WRITE_ONCE((t)->rcu_tasks_holdout, false); \ } while (0) -#define rcu_note_voluntary_context_switch(t) \ - do { \ - rcu_all_qs(); \ - rcu_tasks_qs(t); \ - } while (0) +#define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t) void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func); void synchronize_rcu_tasks(void); void exit_tasks_rcu_start(void); void exit_tasks_rcu_finish(void); #else /* #ifdef CONFIG_TASKS_RCU */ #define rcu_tasks_qs(t) do { } while (0) -#define rcu_note_voluntary_context_switch(t) rcu_all_qs() +#define rcu_note_voluntary_context_switch(t) do { } while (0) #define call_rcu_tasks call_rcu #define synchronize_rcu_tasks synchronize_rcu static inline void exit_tasks_rcu_start(void) { } From 395a2f097ebdddf2bfa286b6119f1b231025c2f1 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 10 Jul 2018 14:00:14 -0700 Subject: [PATCH 116/140] rcu: Define rcu_all_qs() only in !PREEMPT builds Now that rcu_all_qs() is used only in !PREEMPT builds, move it to tree_plugin.h so that it is defined only in those builds. This in turn means that rcu_momentary_dyntick_idle() is only used in !PREEMPT builds, but it is simply marked __maybe_unused in order to keep it near the rest of the dyntick-idle code. Signed-off-by: Paul E. McKenney --- include/linux/rcutree.h | 2 ++ kernel/rcu/tree.c | 41 +--------------------------------------- kernel/rcu/tree_plugin.h | 39 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 42 insertions(+), 40 deletions(-) diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index d09a9abe9440..7f83179177d1 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -66,7 +66,9 @@ void rcu_scheduler_starting(void); extern int rcu_scheduler_active __read_mostly; void rcu_end_inkernel_boot(void); bool rcu_is_watching(void); +#ifndef CONFIG_PREEMPT void rcu_all_qs(void); +#endif /* RCUtree hotplug events */ int rcutree_prepare_cpu(unsigned int cpu); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index c8761e7c7c00..e140aaa78527 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -356,7 +356,7 @@ bool rcu_eqs_special_set(int cpu) * * The caller must have disabled interrupts and must not be idle. */ -static void rcu_momentary_dyntick_idle(void) +static void __maybe_unused rcu_momentary_dyntick_idle(void) { struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); int special; @@ -381,45 +381,6 @@ static int rcu_is_cpu_rrupt_from_idle(void) __this_cpu_read(rcu_dynticks.dynticks_nmi_nesting) <= 1; } -/* - * Register an urgently needed quiescent state. If there is an - * emergency, invoke rcu_momentary_dyntick_idle() to do a heavy-weight - * dyntick-idle quiescent state visible to other CPUs, which will in - * some cases serve for expedited as well as normal grace periods. - * Either way, register a lightweight quiescent state. - * - * The barrier() calls are redundant in the common case when this is - * called externally, but just in case this is called from within this - * file. - * - */ -void rcu_all_qs(void) -{ - unsigned long flags; - - if (!raw_cpu_read(rcu_dynticks.rcu_urgent_qs)) - return; - preempt_disable(); - /* Load rcu_urgent_qs before other flags. */ - if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) { - preempt_enable(); - return; - } - this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); - barrier(); /* Avoid RCU read-side critical sections leaking down. */ - if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) { - local_irq_save(flags); - rcu_momentary_dyntick_idle(); - local_irq_restore(flags); - } - if (unlikely(raw_cpu_read(rcu_data.cpu_no_qs.b.exp))) - rcu_qs(); - this_cpu_inc(rcu_dynticks.rcu_qs_ctr); - barrier(); /* Avoid RCU read-side critical sections leaking up. */ - preempt_enable(); -} -EXPORT_SYMBOL_GPL(rcu_all_qs); - #define DEFAULT_RCU_BLIMIT 10 /* Maximum callbacks per rcu_do_batch. */ static long blimit = DEFAULT_RCU_BLIMIT; #define DEFAULT_RCU_QHIMARK 10000 /* If this many pending, ignore blimit. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index cd4c1b979446..7add1c297500 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -947,6 +947,45 @@ static void rcu_qs(void) rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); } +/* + * Register an urgently needed quiescent state. If there is an + * emergency, invoke rcu_momentary_dyntick_idle() to do a heavy-weight + * dyntick-idle quiescent state visible to other CPUs, which will in + * some cases serve for expedited as well as normal grace periods. + * Either way, register a lightweight quiescent state. + * + * The barrier() calls are redundant in the common case when this is + * called externally, but just in case this is called from within this + * file. + * + */ +void rcu_all_qs(void) +{ + unsigned long flags; + + if (!raw_cpu_read(rcu_dynticks.rcu_urgent_qs)) + return; + preempt_disable(); + /* Load rcu_urgent_qs before other flags. */ + if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) { + preempt_enable(); + return; + } + this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); + barrier(); /* Avoid RCU read-side critical sections leaking down. */ + if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) { + local_irq_save(flags); + rcu_momentary_dyntick_idle(); + local_irq_restore(flags); + } + if (unlikely(raw_cpu_read(rcu_data.cpu_no_qs.b.exp))) + rcu_qs(); + this_cpu_inc(rcu_dynticks.rcu_qs_ctr); + barrier(); /* Avoid RCU read-side critical sections leaking up. */ + preempt_enable(); +} +EXPORT_SYMBOL_GPL(rcu_all_qs); + /* * Note a PREEMPT=n context switch. The caller must have disabled interrupts. */ From dd46a7882c2c2006201e053ebf5e9ad761860cc0 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 10 Jul 2018 18:37:30 -0700 Subject: [PATCH 117/140] rcu: Inline _rcu_barrier() into its sole remaining caller Because rcu_barrier() is a one-line wrapper function for _rcu_barrier() and because nothing else calls _rcu_barrier(), this commit inlines _rcu_barrier() into rcu_barrier(). Signed-off-by: Paul E. McKenney --- include/trace/events/rcu.h | 20 ++++++------- kernel/rcu/tree.c | 58 +++++++++++++++++--------------------- kernel/rcu/tree.h | 4 +-- kernel/rcu/tree_plugin.h | 2 +- 4 files changed, 39 insertions(+), 45 deletions(-) diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h index a8d07feff6a0..175e0bce22bd 100644 --- a/include/trace/events/rcu.h +++ b/include/trace/events/rcu.h @@ -705,20 +705,20 @@ TRACE_EVENT(rcu_torture_read, ); /* - * Tracepoint for _rcu_barrier() execution. The string "s" describes - * the _rcu_barrier phase: - * "Begin": _rcu_barrier() started. - * "EarlyExit": _rcu_barrier() piggybacked, thus early exit. - * "Inc1": _rcu_barrier() piggyback check counter incremented. - * "OfflineNoCB": _rcu_barrier() found callback on never-online CPU - * "OnlineNoCB": _rcu_barrier() found online no-CBs CPU. - * "OnlineQ": _rcu_barrier() found online CPU with callbacks. - * "OnlineNQ": _rcu_barrier() found online CPU, no callbacks. + * Tracepoint for rcu_barrier() execution. The string "s" describes + * the rcu_barrier phase: + * "Begin": rcu_barrier() started. + * "EarlyExit": rcu_barrier() piggybacked, thus early exit. + * "Inc1": rcu_barrier() piggyback check counter incremented. + * "OfflineNoCB": rcu_barrier() found callback on never-online CPU + * "OnlineNoCB": rcu_barrier() found online no-CBs CPU. + * "OnlineQ": rcu_barrier() found online CPU with callbacks. + * "OnlineNQ": rcu_barrier() found online CPU, no callbacks. * "IRQ": An rcu_barrier_callback() callback posted on remote CPU. * "IRQNQ": An rcu_barrier_callback() callback found no callbacks. * "CB": An rcu_barrier_callback() invoked a callback, not the last. * "LastCB": An rcu_barrier_callback() invoked the last callback. - * "Inc2": _rcu_barrier() piggyback check counter incremented. + * "Inc2": rcu_barrier() piggyback check counter incremented. * The "cpu" argument is the CPU or -1 if meaningless, the "cnt" argument * is the count of remaining callbacks, and "done" is the piggybacking count. */ diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e140aaa78527..ce16b8da2c6f 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2747,7 +2747,7 @@ static void rcu_leak_callback(struct rcu_head *rhp) /* * Helper function for call_rcu() and friends. The cpu argument will * normally be -1, indicating "currently running CPU". It may specify - * a CPU only if that CPU is a no-CBs CPU. Currently, only _rcu_barrier() + * a CPU only if that CPU is a no-CBs CPU. Currently, only rcu_barrier() * is expected to specify a CPU. */ static void @@ -2981,27 +2981,27 @@ static bool rcu_cpu_has_callbacks(bool *all_lazy) } /* - * Helper function for _rcu_barrier() tracing. If tracing is disabled, + * Helper function for rcu_barrier() tracing. If tracing is disabled, * the compiler is expected to optimize this away. */ -static void _rcu_barrier_trace(const char *s, int cpu, unsigned long done) +static void rcu_barrier_trace(const char *s, int cpu, unsigned long done) { trace_rcu_barrier(rcu_state.name, s, cpu, atomic_read(&rcu_state.barrier_cpu_count), done); } /* - * RCU callback function for _rcu_barrier(). If we are last, wake - * up the task executing _rcu_barrier(). + * RCU callback function for rcu_barrier(). If we are last, wake + * up the task executing rcu_barrier(). */ static void rcu_barrier_callback(struct rcu_head *rhp) { if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) { - _rcu_barrier_trace(TPS("LastCB"), -1, + rcu_barrier_trace(TPS("LastCB"), -1, rcu_state.barrier_sequence); complete(&rcu_state.barrier_completion); } else { - _rcu_barrier_trace(TPS("CB"), -1, rcu_state.barrier_sequence); + rcu_barrier_trace(TPS("CB"), -1, rcu_state.barrier_sequence); } } @@ -3012,33 +3012,40 @@ static void rcu_barrier_func(void *unused) { struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); - _rcu_barrier_trace(TPS("IRQ"), -1, rcu_state.barrier_sequence); + rcu_barrier_trace(TPS("IRQ"), -1, rcu_state.barrier_sequence); rdp->barrier_head.func = rcu_barrier_callback; debug_rcu_head_queue(&rdp->barrier_head); if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head, 0)) { atomic_inc(&rcu_state.barrier_cpu_count); } else { debug_rcu_head_unqueue(&rdp->barrier_head); - _rcu_barrier_trace(TPS("IRQNQ"), -1, + rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence); } } -/* Orchestrate an RCU barrier, waiting for all RCU callbacks to complete. */ -static void _rcu_barrier(void) +/** + * rcu_barrier - Wait until all in-flight call_rcu() callbacks complete. + * + * Note that this primitive does not necessarily wait for an RCU grace period + * to complete. For example, if there are no RCU callbacks queued anywhere + * in the system, then rcu_barrier() is within its rights to return + * immediately, without waiting for anything, much less an RCU grace period. + */ +void rcu_barrier(void) { int cpu; struct rcu_data *rdp; unsigned long s = rcu_seq_snap(&rcu_state.barrier_sequence); - _rcu_barrier_trace(TPS("Begin"), -1, s); + rcu_barrier_trace(TPS("Begin"), -1, s); /* Take mutex to serialize concurrent rcu_barrier() requests. */ mutex_lock(&rcu_state.barrier_mutex); /* Did someone else do our work for us? */ if (rcu_seq_done(&rcu_state.barrier_sequence, s)) { - _rcu_barrier_trace(TPS("EarlyExit"), -1, + rcu_barrier_trace(TPS("EarlyExit"), -1, rcu_state.barrier_sequence); smp_mb(); /* caller's subsequent code after above check. */ mutex_unlock(&rcu_state.barrier_mutex); @@ -3047,7 +3054,7 @@ static void _rcu_barrier(void) /* Mark the start of the barrier operation. */ rcu_seq_start(&rcu_state.barrier_sequence); - _rcu_barrier_trace(TPS("Inc1"), -1, rcu_state.barrier_sequence); + rcu_barrier_trace(TPS("Inc1"), -1, rcu_state.barrier_sequence); /* * Initialize the count to one rather than to zero in order to @@ -3070,10 +3077,10 @@ static void _rcu_barrier(void) rdp = per_cpu_ptr(&rcu_data, cpu); if (rcu_is_nocb_cpu(cpu)) { if (!rcu_nocb_cpu_needs_barrier(cpu)) { - _rcu_barrier_trace(TPS("OfflineNoCB"), cpu, + rcu_barrier_trace(TPS("OfflineNoCB"), cpu, rcu_state.barrier_sequence); } else { - _rcu_barrier_trace(TPS("OnlineNoCB"), cpu, + rcu_barrier_trace(TPS("OnlineNoCB"), cpu, rcu_state.barrier_sequence); smp_mb__before_atomic(); atomic_inc(&rcu_state.barrier_cpu_count); @@ -3081,11 +3088,11 @@ static void _rcu_barrier(void) rcu_barrier_callback, cpu, 0); } } else if (rcu_segcblist_n_cbs(&rdp->cblist)) { - _rcu_barrier_trace(TPS("OnlineQ"), cpu, + rcu_barrier_trace(TPS("OnlineQ"), cpu, rcu_state.barrier_sequence); smp_call_function_single(cpu, rcu_barrier_func, NULL, 1); } else { - _rcu_barrier_trace(TPS("OnlineNQ"), cpu, + rcu_barrier_trace(TPS("OnlineNQ"), cpu, rcu_state.barrier_sequence); } } @@ -3102,25 +3109,12 @@ static void _rcu_barrier(void) wait_for_completion(&rcu_state.barrier_completion); /* Mark the end of the barrier operation. */ - _rcu_barrier_trace(TPS("Inc2"), -1, rcu_state.barrier_sequence); + rcu_barrier_trace(TPS("Inc2"), -1, rcu_state.barrier_sequence); rcu_seq_end(&rcu_state.barrier_sequence); /* Other rcu_barrier() invocations can now safely proceed. */ mutex_unlock(&rcu_state.barrier_mutex); } - -/** - * rcu_barrier - Wait until all in-flight call_rcu() callbacks complete. - * - * Note that this primitive does not necessarily wait for an RCU grace period - * to complete. For example, if there are no RCU callbacks queued anywhere - * in the system, then rcu_barrier() is within its rights to return - * immediately, without waiting for anything, much less an RCU grace period. - */ -void rcu_barrier(void) -{ - _rcu_barrier(); -} EXPORT_SYMBOL_GPL(rcu_barrier); /* diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 46452d3d0fad..8cf93ac277ec 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -222,7 +222,7 @@ struct rcu_data { /* Grace period that needs help */ /* from cond_resched(). */ - /* 5) _rcu_barrier(), OOM callbacks, and expediting. */ + /* 5) rcu_barrier(), OOM callbacks, and expediting. */ struct rcu_head barrier_head; int exp_dynticks_snap; /* Double-check need for IPI. */ @@ -328,7 +328,7 @@ struct rcu_state { atomic_t barrier_cpu_count; /* # CPUs waiting on. */ struct completion barrier_completion; /* Wake at barrier end. */ unsigned long barrier_sequence; /* ++ at start and end of */ - /* _rcu_barrier(). */ + /* rcu_barrier(). */ /* End of fields guarded by barrier_mutex. */ struct mutex exp_mutex; /* Serialize expedited GP. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 7add1c297500..beaaca7a11f4 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1993,7 +1993,7 @@ static bool rcu_nocb_cpu_needs_barrier(int cpu) * There needs to be a barrier before this function is called, * but associated with a prior determination that no more * callbacks would be posted. In the worst case, the first - * barrier in _rcu_barrier() suffices (but the caller cannot + * barrier in rcu_barrier() suffices (but the caller cannot * necessarily rely on this, not a substitute for the caller * getting the concurrency design right!). There must also be * a barrier between the following load an posting of a callback From 92aa39e9dc77481b90cbef25e547d66cab901496 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 9 Jul 2018 13:47:30 -0700 Subject: [PATCH 118/140] rcu: Make need_resched() respond to urgent RCU-QS needs The per-CPU rcu_dynticks.rcu_urgent_qs variable communicates an urgent need for an RCU quiescent state from the force-quiescent-state processing within the grace-period kthread to context switches and to cond_resched(). Unfortunately, such urgent needs are not communicated to need_resched(), which is sometimes used to decide when to invoke cond_resched(), for but one example, within the KVM vcpu_run() function. As of v4.15, this can result in synchronize_sched() being delayed by up to ten seconds, which can be problematic, to say nothing of annoying. This commit therefore checks rcu_dynticks.rcu_urgent_qs from within rcu_check_callbacks(), which is invoked from the scheduling-clock interrupt handler. If the current task is not an idle task and is not executing in usermode, a context switch is forced, and either way, the rcu_dynticks.rcu_urgent_qs variable is set to false. If the current task is an idle task, then RCU's dyntick-idle code will detect the quiescent state, so no further action is required. Similarly, if the task is executing in usermode, other code in rcu_check_callbacks() and its called functions will report the corresponding quiescent state. Reported-by: Marius Hillenbrand Reported-by: David Woodhouse Suggested-by: Peter Zijlstra Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index ce16b8da2c6f..f47ac7a4719f 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2471,6 +2471,15 @@ void rcu_check_callbacks(int user) { trace_rcu_utilization(TPS("Start scheduler-tick")); raw_cpu_inc(rcu_data.ticks_this_gp); + /* The load-acquire pairs with the store-release setting to true. */ + if (smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) { + /* Idle and userspace execution already are quiescent states. */ + if (!is_idle_task(current) && !user) { + set_tsk_need_resched(current); + set_preempt_need_resched(); + } + __this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); + } rcu_flavor_check_callbacks(user); if (rcu_pending()) invoke_rcu_core(); From a0ef9ec24144799b5b47fa54c38f9a0f9dfe9a59 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 9 Jul 2018 15:50:16 -0700 Subject: [PATCH 119/140] rcu: Provide improved interrupt-from-idle check in rcu_check_callbacks() The patch making need_resched() respond to urgent RCU-QS needs used is_idle_task(current) to detect an interrupt from idle, which does work reasonably, but is (in theory at least) vulnerable to loops containing need_resched() invoked from within RCU_NONIDLE() or its tracepoint equivalent. This commit therefore moves rcu_is_cpu_rrupt_from_idle() to a place from which rcu_check_callbacks() can invoke it and replaces the is_idle_task(current) with rcu_is_cpu_rrupt_from_idle(). Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index f47ac7a4719f..77d2cbf7c831 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2474,7 +2474,7 @@ void rcu_check_callbacks(int user) /* The load-acquire pairs with the store-release setting to true. */ if (smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) { /* Idle and userspace execution already are quiescent states. */ - if (!is_idle_task(current) && !user) { + if (!rcu_is_cpu_rrupt_from_idle() && !user) { set_tsk_need_resched(current); set_preempt_need_resched(); } From c116dba68d19246639e4fdb8c75756c67d6d268f Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 13 Jul 2018 12:09:14 -0700 Subject: [PATCH 120/140] rcutorture: Dump reader protection sequence if failures or close calls Now that RCU can have readers with multiple segments, it is quite possible that a specific sequence of reader segments might result in an rcutorture failure (reader spans a full grace period as detected by one of the grace-period primitives) or an rcutorture close call (reader potentially spans a full grace period based on reading out the RCU implementation's grace-period counter, but with no ordering). In such cases, it would clearly ease debugging if the offending specific sequence was known. For the first reader encountering a failure or a close call, this commit therefore dumps out the segments, delay durations, and whether or not the reader was preempted. Signed-off-by: Paul E. McKenney [ paulmck: Mark variables static, as suggested by kbuild test robot. ] --- kernel/rcu/rcutorture.c | 119 ++++++++++++++++++++++++++++++++-------- 1 file changed, 96 insertions(+), 23 deletions(-) diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 294b3f6b7eb6..1141e0d84ff1 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -78,6 +78,7 @@ MODULE_AUTHOR("Paul E. McKenney and Josh Triplett get_gp_seq(); ts = rcu_trace_clock_local(); mdelay(longdelay_ms); + rtrsp->rt_delay_ms = longdelay_ms; completed = cur_ops->get_gp_seq(); do_trace_rcu_torture_read(cur_ops->name, NULL, ts, started, completed); } - if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us))) + if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us))) { udelay(shortdelay_us); + rtrsp->rt_delay_us = shortdelay_us; + } if (!preempt_count() && - !(torture_random(rrsp) % (nrealreaders * 500))) + !(torture_random(rrsp) % (nrealreaders * 500))) { torture_preempt_schedule(); /* QS only if preemptible. */ + rtrsp->rt_preempted = true; + } } static void rcu_torture_read_unlock(int idx) __releases(RCU) @@ -494,7 +514,8 @@ static int srcu_torture_read_lock(void) __acquires(srcu_ctlp) return srcu_read_lock(srcu_ctlp); } -static void srcu_read_delay(struct torture_random_state *rrsp) +static void +srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp) { long delay; const long uspertick = 1000000 / HZ; @@ -504,10 +525,12 @@ static void srcu_read_delay(struct torture_random_state *rrsp) delay = torture_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick); - if (!delay && in_task()) + if (!delay && in_task()) { schedule_timeout_interruptible(longdelay); - else - rcu_read_delay(rrsp); + rtrsp->rt_delay_jiffies = longdelay; + } else { + rcu_read_delay(rrsp, rtrsp); + } } static void srcu_torture_read_unlock(int idx) __releases(srcu_ctlp) @@ -1120,7 +1143,8 @@ static void rcu_torture_timer_cb(struct rcu_head *rhp) * change, do a ->read_delay(). */ static void rcutorture_one_extend(int *readstate, int newstate, - struct torture_random_state *trsp) + struct torture_random_state *trsp, + struct rt_read_seg *rtrsp) { int idxnew = -1; int idxold = *readstate; @@ -1129,6 +1153,7 @@ static void rcutorture_one_extend(int *readstate, int newstate, WARN_ON_ONCE(idxold < 0); WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1); + rtrsp->rt_readstate = newstate; /* First, put new protection in place to avoid critical-section gap. */ if (statesnew & RCUTORTURE_RDR_BH) @@ -1160,7 +1185,7 @@ static void rcutorture_one_extend(int *readstate, int newstate, /* Delay if neither beginning nor end and there was a change. */ if ((statesnew || statesold) && *readstate && newstate) - cur_ops->read_delay(trsp); + cur_ops->read_delay(trsp, rtrsp); /* Update the reader state. */ if (idxnew == -1) @@ -1189,11 +1214,11 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp) { int mask = rcutorture_extend_mask_max(); unsigned long randmask1 = torture_random(trsp) >> 8; - unsigned long randmask2 = randmask1 >> 1; + unsigned long randmask2 = randmask1 >> 3; WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT); - /* Half the time lots of bits, half the time only one bit. */ - if (randmask1 & 0x1) + /* Most of the time lots of bits, half the time only one bit. */ + if (!(randmask1 & 0x7)) mask = mask & randmask2; else mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS)); @@ -1213,20 +1238,25 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp) * Do a randomly selected number of extensions of an existing RCU read-side * critical section. */ -static void rcutorture_loop_extend(int *readstate, - struct torture_random_state *trsp) +static struct rt_read_seg * +rcutorture_loop_extend(int *readstate, struct torture_random_state *trsp, + struct rt_read_seg *rtrsp) { int i; + int j; int mask = rcutorture_extend_mask_max(); WARN_ON_ONCE(!*readstate); /* -Existing- RCU read-side critsect! */ if (!((mask - 1) & mask)) - return; /* Current RCU reader not extendable. */ - i = (torture_random(trsp) >> 3) & RCUTORTURE_RDR_MAX_LOOPS; - while (i--) { + return rtrsp; /* Current RCU reader not extendable. */ + /* Bias towards larger numbers of loops. */ + i = (torture_random(trsp) >> 3); + i = ((i | (i >> 3)) & RCUTORTURE_RDR_MAX_LOOPS) + 1; + for (j = 0; j < i; j++) { mask = rcutorture_extend_mask(*readstate, trsp); - rcutorture_one_extend(readstate, mask, trsp); + rcutorture_one_extend(readstate, mask, trsp, &rtrsp[j]); } + return &rtrsp[j]; } /* @@ -1236,16 +1266,20 @@ static void rcutorture_loop_extend(int *readstate, */ static bool rcu_torture_one_read(struct torture_random_state *trsp) { + int i; unsigned long started; unsigned long completed; int newstate; struct rcu_torture *p; int pipe_count; int readstate = 0; + struct rt_read_seg rtseg[RCUTORTURE_RDR_MAX_SEGS] = { { 0 } }; + struct rt_read_seg *rtrsp = &rtseg[0]; + struct rt_read_seg *rtrsp1; unsigned long long ts; newstate = rcutorture_extend_mask(readstate, trsp); - rcutorture_one_extend(&readstate, newstate, trsp); + rcutorture_one_extend(&readstate, newstate, trsp, rtrsp++); started = cur_ops->get_gp_seq(); ts = rcu_trace_clock_local(); p = rcu_dereference_check(rcu_torture_current, @@ -1255,12 +1289,12 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp) torturing_tasks()); if (p == NULL) { /* Wait for rcu_torture_writer to get underway */ - rcutorture_one_extend(&readstate, 0, trsp); + rcutorture_one_extend(&readstate, 0, trsp, rtrsp); return false; } if (p->rtort_mbtest == 0) atomic_inc(&n_rcu_torture_mberror); - rcutorture_loop_extend(&readstate, trsp); + rtrsp = rcutorture_loop_extend(&readstate, trsp, rtrsp); preempt_disable(); pipe_count = p->rtort_pipe_count; if (pipe_count > RCU_TORTURE_PIPE_LEN) { @@ -1281,8 +1315,17 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp) } __this_cpu_inc(rcu_torture_batch[completed]); preempt_enable(); - rcutorture_one_extend(&readstate, 0, trsp); + rcutorture_one_extend(&readstate, 0, trsp, rtrsp); WARN_ON_ONCE(readstate & RCUTORTURE_RDR_MASK); + + /* If error or close call, record the sequence of reader protections. */ + if ((pipe_count > 1 || completed > 1) && !xchg(&err_segs_recorded, 1)) { + i = 0; + for (rtrsp1 = &rtseg[0]; rtrsp1 < rtrsp; rtrsp1++) + err_segs[i++] = *rtrsp1; + rt_read_nsegs = i; + } + return true; } @@ -1747,6 +1790,7 @@ static enum cpuhp_state rcutor_hp; static void rcu_torture_cleanup(void) { + int firsttime; int flags = 0; unsigned long gp_seq = 0; int i; @@ -1800,6 +1844,33 @@ rcu_torture_cleanup(void) rcu_torture_stats_print(); /* -After- the stats thread is stopped! */ + if (err_segs_recorded) { + pr_alert("Failure/close-call rcutorture reader segments:\n"); + if (rt_read_nsegs == 0) + pr_alert("\t: No segments recorded!!!\n"); + firsttime = 1; + for (i = 0; i < rt_read_nsegs; i++) { + pr_alert("\t%d: %#x ", i, err_segs[i].rt_readstate); + if (err_segs[i].rt_delay_jiffies != 0) { + pr_cont("%s%ldjiffies", firsttime ? "" : "+", + err_segs[i].rt_delay_jiffies); + firsttime = 0; + } + if (err_segs[i].rt_delay_ms != 0) { + pr_cont("%s%ldms", firsttime ? "" : "+", + err_segs[i].rt_delay_ms); + firsttime = 0; + } + if (err_segs[i].rt_delay_us != 0) { + pr_cont("%s%ldus", firsttime ? "" : "+", + err_segs[i].rt_delay_us); + firsttime = 0; + } + pr_cont("%s\n", + err_segs[i].rt_preempted ? "preempted" : ""); + + } + } if (atomic_read(&n_rcu_torture_error) || n_rcu_torture_barrier_error) rcu_torture_print_module_parms(cur_ops, "End of test: FAILURE"); else if (torture_onoff_failures()) @@ -1943,6 +2014,8 @@ rcu_torture_init(void) per_cpu(rcu_torture_batch, cpu)[i] = 0; } } + err_segs_recorded = 0; + rt_read_nsegs = 0; /* Start up the kthreads. */ From c5bacd94173ec49d7dce7ac7c64bbdde3a6e69ae Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 20 Jul 2018 14:18:23 -0700 Subject: [PATCH 121/140] rcu: Motivate Tiny RCU forward progress If a long-running CPU-bound in-kernel task invokes call_rcu(), the callback won't be invoked until the next context switch. If there are no other runnable tasks (which is not an uncommon situation on deep embedded systems), the callback might never be invoked. This commit therefore causes rcu_check_callbacks() to ask the scheduler for a context switch if there are callbacks posted that are still waiting for a grace period. Suggested-by: Peter Zijlstra Signed-off-by: Paul E. McKenney --- kernel/rcu/tiny.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index a77853b73bfe..1745d30e170e 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -78,8 +78,12 @@ void rcu_qs(void) */ void rcu_check_callbacks(int user) { - if (user) + if (user) { rcu_qs(); + } else if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) { + set_tsk_need_resched(current); + set_preempt_need_resched(); + } } /* Invoke the RCU callbacks whose grace period has elapsed. */ From 7e28c5af4ef6b539334aa5de40feca0c041c94df Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 11 Jul 2018 08:09:28 -0700 Subject: [PATCH 122/140] rcu: Eliminate ->rcu_qs_ctr from the rcu_dynticks structure The ->rcu_qs_ctr counter was intended to allow providing a lightweight report of a quiescent state to all RCU flavors. But now that there is only one flavor of RCU in any one running kernel, there is no point in having this feature. This commit therefore removes the ->rcu_qs_ctr field from the rcu_dynticks structure and the ->rcu_qs_ctr_snap field from the rcu_data structure. This results in the "rqc" option to the rcu_fqs trace event no longer being used, so this commit also removes the "rqc" description from the header comment. While in the neighborhood, this commit also causes the forward-progress request .rcu_need_heavy_qs be set one jiffies_till_sched_qs interval later in the grace period than the first setting of .rcu_urgent_qs. Signed-off-by: Paul E. McKenney --- include/trace/events/rcu.h | 5 ++-- kernel/rcu/tree.c | 52 ++++++++++---------------------------- kernel/rcu/tree.h | 3 --- kernel/rcu/tree_plugin.h | 5 +--- 4 files changed, 17 insertions(+), 48 deletions(-) diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h index 175e0bce22bd..f0c4d10e614b 100644 --- a/include/trace/events/rcu.h +++ b/include/trace/events/rcu.h @@ -393,9 +393,8 @@ TRACE_EVENT(rcu_quiescent_state_report, * Tracepoint for quiescent states detected by force_quiescent_state(). * These trace events include the type of RCU, the grace-period number * that was blocked by the CPU, the CPU itself, and the type of quiescent - * state, which can be "dti" for dyntick-idle mode, "kick" when kicking - * a CPU that has been in dyntick-idle mode for too long, or "rqc" if the - * CPU got a quiescent state via its rcu_qs_ctr. + * state, which can be "dti" for dyntick-idle mode or "kick" when kicking + * a CPU that has been in dyntick-idle mode for too long. */ TRACE_EVENT(rcu_fqs, diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 77d2cbf7c831..bc42c600027c 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1018,25 +1018,6 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) return 1; } - /* - * Has this CPU encountered a cond_resched() since the beginning - * of the grace period? For this to be the case, the CPU has to - * have noticed the current grace period. This might not be the - * case for nohz_full CPUs looping in the kernel. - */ - jtsq = jiffies_till_sched_qs; - ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu); - if (time_after(jiffies, rcu_state.gp_start + jtsq) && - READ_ONCE(rdp->rcu_qs_ctr_snap) != per_cpu(rcu_dynticks.rcu_qs_ctr, rdp->cpu) && - rcu_seq_current(&rdp->gp_seq) == rnp->gp_seq && !rdp->gpwrap) { - trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("rqc")); - rcu_gpnum_ovf(rnp, rdp); - return 1; - } else if (time_after(jiffies, rcu_state.gp_start + jtsq)) { - /* Load rcu_qs_ctr before store to rcu_urgent_qs. */ - smp_store_release(ruqp, true); - } - /* If waiting too long on an offline CPU, complain. */ if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp)) && time_after(jiffies, rcu_state.gp_start + HZ)) { @@ -1060,29 +1041,27 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) /* * A CPU running for an extended time within the kernel can - * delay RCU grace periods. When the CPU is in NO_HZ_FULL mode, - * even context-switching back and forth between a pair of - * in-kernel CPU-bound tasks cannot advance grace periods. - * So if the grace period is old enough, make the CPU pay attention. - * Note that the unsynchronized assignments to the per-CPU - * rcu_need_heavy_qs variable are safe. Yes, setting of - * bits can be lost, but they will be set again on the next - * force-quiescent-state pass. So lost bit sets do not result - * in incorrect behavior, merely in a grace period lasting - * a few jiffies longer than it might otherwise. Because - * there are at most four threads involved, and because the - * updates are only once every few jiffies, the probability of - * lossage (and thus of slight grace-period extension) is - * quite low. + * delay RCU grace periods: (1) At age jiffies_till_sched_qs, + * set .rcu_urgent_qs, (2) At age 2*jiffies_till_sched_qs, set + * both .rcu_need_heavy_qs and .rcu_urgent_qs. Note that the + * unsynchronized assignments to the per-CPU rcu_need_heavy_qs + * variable are safe because the assignments are repeated if this + * CPU failed to pass through a quiescent state. This code + * also checks .jiffies_resched in case jiffies_till_sched_qs + * is set way high. */ + jtsq = jiffies_till_sched_qs; + ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu); rnhqp = &per_cpu(rcu_dynticks.rcu_need_heavy_qs, rdp->cpu); if (!READ_ONCE(*rnhqp) && - (time_after(jiffies, rcu_state.gp_start + jtsq) || + (time_after(jiffies, rcu_state.gp_start + jtsq * 2) || time_after(jiffies, rcu_state.jiffies_resched))) { WRITE_ONCE(*rnhqp, true); /* Store rcu_need_heavy_qs before rcu_urgent_qs. */ smp_store_release(ruqp, true); rcu_state.jiffies_resched += jtsq; /* Re-enable beating. */ + } else if (time_after(jiffies, rcu_state.gp_start + jtsq)) { + WRITE_ONCE(*ruqp, true); } /* @@ -1091,7 +1070,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) * see if the CPU is getting hammered with interrupts, but only * once per grace period, just to keep the IPIs down to a dull roar. */ - if (jiffies - rcu_state.gp_start > rcu_jiffies_till_stall_check() / 2) { + if (time_after(jiffies, rcu_state.jiffies_resched)) { resched_cpu(rdp->cpu); if (IS_ENABLED(CONFIG_IRQ_WORK) && !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq && @@ -1669,7 +1648,6 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, TPS("cpustart")); need_gp = !!(rnp->qsmask & rdp->grpmask); rdp->cpu_no_qs.b.norm = need_gp; - rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_dynticks.rcu_qs_ctr); rdp->core_needs_qs = need_gp; zero_cpu_stall_ticks(rdp); } @@ -2230,7 +2208,6 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) * within the current grace period. */ rdp->cpu_no_qs.b.norm = true; /* need qs for new gp. */ - rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_dynticks.rcu_qs_ctr); raw_spin_unlock_irqrestore_rcu_node(rnp, flags); return; } @@ -3213,7 +3190,6 @@ int rcutree_prepare_cpu(unsigned int cpu) rdp->gp_seq = rnp->gp_seq; rdp->gp_seq_needed = rnp->gp_seq; rdp->cpu_no_qs.b.norm = true; - rdp->rcu_qs_ctr_snap = per_cpu(rcu_dynticks.rcu_qs_ctr, cpu); rdp->core_needs_qs = false; rdp->rcu_iw_pending = false; rdp->rcu_iw_gp_seq = rnp->gp_seq - 1; diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 8cf93ac277ec..4866fa44ab0b 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -42,7 +42,6 @@ struct rcu_dynticks { long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ atomic_t dynticks; /* Even value for idle, else odd. */ bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */ - unsigned long rcu_qs_ctr; /* Light universal quiescent state ctr. */ bool rcu_urgent_qs; /* GP old need light quiescent state. */ #ifdef CONFIG_RCU_FAST_NO_HZ bool all_lazy; /* Are all CPU's CBs lazy? */ @@ -188,8 +187,6 @@ struct rcu_data { /* 1) quiescent-state and grace-period handling : */ unsigned long gp_seq; /* Track rsp->rcu_gp_seq counter. */ unsigned long gp_seq_needed; /* Track rsp->rcu_gp_seq_needed ctr. */ - unsigned long rcu_qs_ctr_snap;/* Snapshot of rcu_qs_ctr to check */ - /* for rcu_all_qs() invocations. */ union rcu_noqs cpu_no_qs; /* No QSes yet for this CPU. */ bool core_needs_qs; /* Core waits for quiesc state. */ bool beenonline; /* CPU online at least once. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index beaaca7a11f4..726d57708849 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -978,9 +978,7 @@ void rcu_all_qs(void) rcu_momentary_dyntick_idle(); local_irq_restore(flags); } - if (unlikely(raw_cpu_read(rcu_data.cpu_no_qs.b.exp))) - rcu_qs(); - this_cpu_inc(rcu_dynticks.rcu_qs_ctr); + rcu_qs(); barrier(); /* Avoid RCU read-side critical sections leaking up. */ preempt_enable(); } @@ -1000,7 +998,6 @@ void rcu_note_context_switch(bool preempt) this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) rcu_momentary_dyntick_idle(); - this_cpu_inc(rcu_dynticks.rcu_qs_ctr); if (!preempt) rcu_tasks_qs(current); out: From 74de6960c99d8df0d09fb29a7b014cb9c5571e2b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 24 Jul 2018 15:28:09 -0700 Subject: [PATCH 123/140] rcu: Provide functions for determining if call_rcu() has been invoked This commit adds rcu_head_init() and rcu_head_after_call_rcu() functions to help RCU users detect when another CPU has passed the specified rcu_head structure and function to call_rcu(). The rcu_head_init() should be invoked before making the structure visible to RCU readers, and then the rcu_head_after_call_rcu() may be invoked from within an RCU read-side critical section on an rcu_head structure that was obtained during a traversal of the data structure in question. The rcu_head_after_call_rcu() function will return true if the rcu_head structure has already been passed (with the specified function) to call_rcu(), otherwise it will return false. If rcu_head_init() has not been invoked on the rcu_head structure or if the rcu_head (AKA callback) has already been invoked, then rcu_head_after_call_rcu() will do WARN_ON_ONCE(). Reported-by: NeilBrown Signed-off-by: Paul E. McKenney [ paulmck: Apply neilb naming feedback. ] --- include/linux/rcupdate.h | 40 ++++++++++++++++++++++++++++++++++++++++ kernel/rcu/rcu.h | 5 ++++- 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index e4f821165d0b..4db8bcacc51a 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -857,6 +857,46 @@ static inline notrace void rcu_read_unlock_sched_notrace(void) #endif /* #else #ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE */ +/* Has the specified rcu_head structure been handed to call_rcu()? */ + +/* + * rcu_head_init - Initialize rcu_head for rcu_head_after_call_rcu() + * @rhp: The rcu_head structure to initialize. + * + * If you intend to invoke rcu_head_after_call_rcu() to test whether a + * given rcu_head structure has already been passed to call_rcu(), then + * you must also invoke this rcu_head_init() function on it just after + * allocating that structure. Calls to this function must not race with + * calls to call_rcu(), rcu_head_after_call_rcu(), or callback invocation. + */ +static inline void rcu_head_init(struct rcu_head *rhp) +{ + rhp->func = (rcu_callback_t)~0L; +} + +/* + * rcu_head_after_call_rcu - Has this rcu_head been passed to call_rcu()? + * @rhp: The rcu_head structure to test. + * @func: The function passed to call_rcu() along with @rhp. + * + * Returns @true if the @rhp has been passed to call_rcu() with @func, + * and @false otherwise. Emits a warning in any other case, including + * the case where @rhp has already been invoked after a grace period. + * Calls to this function must not race with callback invocation. One way + * to avoid such races is to enclose the call to rcu_head_after_call_rcu() + * in an RCU read-side critical section that includes a read-side fetch + * of the pointer to the structure containing @rhp. + */ +static inline bool +rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f) +{ + if (READ_ONCE(rhp->func) == f) + return true; + WARN_ON_ONCE(READ_ONCE(rhp->func) != (rcu_callback_t)~0L); + return false; +} + + /* Transitional pre-consolidation compatibility definitions. */ static inline void synchronize_rcu_bh(void) diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index 5dec94509a7e..4c56c1d98fb3 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -224,6 +224,7 @@ void kfree(const void *); */ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head) { + rcu_callback_t f; unsigned long offset = (unsigned long)head->func; rcu_lock_acquire(&rcu_callback_map); @@ -234,7 +235,9 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head) return true; } else { RCU_TRACE(trace_rcu_invoke_callback(rn, head);) - head->func(head); + f = head->func; + WRITE_ONCE(head->func, (rcu_callback_t)0L); + f(head); rcu_lock_release(&rcu_callback_map); return false; } From c06aed0e31008a248c1841f1b7fc80e9ee242a31 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 25 Jul 2018 11:25:23 -0700 Subject: [PATCH 124/140] rcu: Compute jiffies_till_sched_qs from other kernel parameters The jiffies_till_sched_qs value used to determine how old a grace period must be before RCU enlists the help of the scheduler to force a quiescent state on the holdout CPU. Currently, this defaults to HZ/10 regardless of system size and may be set only at boot time. This can be a problem for very large systems, because if the values of the jiffies_till_first_fqs and jiffies_till_next_fqs kernel parameters are left at their defaults, they are calculated to increase as the number of CPUs actually configured on the system increases. Thus, on a sufficiently large system, RCU would enlist the help of the scheduler before the grace-period kthread had a chance to scan for idle CPUs, which wastes CPU time. This commit therefore allows jiffies_till_sched_qs to be set, if desired, but if left as default, computes is as jiffies_till_first_fqs plus twice jiffies_till_next_fqs, thus allowing three force-quiescent-state scans for idle CPUs. This scales with the number of CPUs, providing sensible default values. Signed-off-by: Paul E. McKenney --- .../admin-guide/kernel-parameters.txt | 9 ++- kernel/rcu/tree.c | 63 ++++++++++++++----- kernel/rcu/tree_plugin.h | 2 + 3 files changed, 57 insertions(+), 17 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index aa96e669bcb8..6153fb62abe1 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3595,7 +3595,14 @@ Set required age in jiffies for a given grace period before RCU starts soliciting quiescent-state help from - rcu_note_context_switch(). + rcu_note_context_switch(). If not specified, the + kernel will calculate a value based on the most + recent settings of rcutree.jiffies_till_first_fqs + and rcutree.jiffies_till_next_fqs. + This calculated value may be viewed in + rcutree.jiffies_to_sched_qs. Any attempt to + set rcutree.jiffies_to_sched_qs will be + cheerfully overwritten. rcutree.jiffies_till_first_fqs= [KNL] Set delay from grace-period initialization to diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index bc42c600027c..6bd0951a5f3a 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -396,13 +396,47 @@ static ulong jiffies_till_first_fqs = ULONG_MAX; static ulong jiffies_till_next_fqs = ULONG_MAX; static bool rcu_kick_kthreads; +/* + * How long the grace period must be before we start recruiting + * quiescent-state help from rcu_note_context_switch(). + */ +static ulong jiffies_till_sched_qs = ULONG_MAX; +module_param(jiffies_till_sched_qs, ulong, 0444); +static ulong jiffies_to_sched_qs; /* Adjusted version of above if not default */ +module_param(jiffies_to_sched_qs, ulong, 0444); /* Display only! */ + +/* + * Make sure that we give the grace-period kthread time to detect any + * idle CPUs before taking active measures to force quiescent states. + * However, don't go below 100 milliseconds, adjusted upwards for really + * large systems. + */ +static void adjust_jiffies_till_sched_qs(void) +{ + unsigned long j; + + /* If jiffies_till_sched_qs was specified, respect the request. */ + if (jiffies_till_sched_qs != ULONG_MAX) { + WRITE_ONCE(jiffies_to_sched_qs, jiffies_till_sched_qs); + return; + } + j = READ_ONCE(jiffies_till_first_fqs) + + 2 * READ_ONCE(jiffies_till_next_fqs); + if (j < HZ / 10 + nr_cpu_ids / RCU_JIFFIES_FQS_DIV) + j = HZ / 10 + nr_cpu_ids / RCU_JIFFIES_FQS_DIV; + pr_info("RCU calculated value of scheduler-enlistment delay is %ld jiffies.\n", j); + WRITE_ONCE(jiffies_to_sched_qs, j); +} + static int param_set_first_fqs_jiffies(const char *val, const struct kernel_param *kp) { ulong j; int ret = kstrtoul(val, 0, &j); - if (!ret) + if (!ret) { WRITE_ONCE(*(ulong *)kp->arg, (j > HZ) ? HZ : j); + adjust_jiffies_till_sched_qs(); + } return ret; } @@ -411,8 +445,10 @@ static int param_set_next_fqs_jiffies(const char *val, const struct kernel_param ulong j; int ret = kstrtoul(val, 0, &j); - if (!ret) + if (!ret) { WRITE_ONCE(*(ulong *)kp->arg, (j > HZ) ? HZ : (j ?: 1)); + adjust_jiffies_till_sched_qs(); + } return ret; } @@ -430,13 +466,6 @@ module_param_cb(jiffies_till_first_fqs, &first_fqs_jiffies_ops, &jiffies_till_fi module_param_cb(jiffies_till_next_fqs, &next_fqs_jiffies_ops, &jiffies_till_next_fqs, 0644); module_param(rcu_kick_kthreads, bool, 0644); -/* - * How long the grace period must be before we start recruiting - * quiescent-state help from rcu_note_context_switch(). - */ -static ulong jiffies_till_sched_qs = HZ / 10; -module_param(jiffies_till_sched_qs, ulong, 0444); - static void force_qs_rnp(int (*f)(struct rcu_data *rdp)); static void force_quiescent_state(void); static int rcu_pending(void); @@ -1041,16 +1070,16 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) /* * A CPU running for an extended time within the kernel can - * delay RCU grace periods: (1) At age jiffies_till_sched_qs, - * set .rcu_urgent_qs, (2) At age 2*jiffies_till_sched_qs, set + * delay RCU grace periods: (1) At age jiffies_to_sched_qs, + * set .rcu_urgent_qs, (2) At age 2*jiffies_to_sched_qs, set * both .rcu_need_heavy_qs and .rcu_urgent_qs. Note that the * unsynchronized assignments to the per-CPU rcu_need_heavy_qs * variable are safe because the assignments are repeated if this * CPU failed to pass through a quiescent state. This code - * also checks .jiffies_resched in case jiffies_till_sched_qs + * also checks .jiffies_resched in case jiffies_to_sched_qs * is set way high. */ - jtsq = jiffies_till_sched_qs; + jtsq = READ_ONCE(jiffies_to_sched_qs); ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu); rnhqp = &per_cpu(rcu_dynticks.rcu_need_heavy_qs, rdp->cpu); if (!READ_ONCE(*rnhqp) && @@ -1236,7 +1265,7 @@ static void print_other_cpu_stall(unsigned long gp_seq) gpa = READ_ONCE(rcu_state.gp_activity); pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n", rcu_state.name, j - gpa, j, gpa, - jiffies_till_next_fqs, + READ_ONCE(jiffies_till_next_fqs), rcu_get_root()->qsmask); /* In this case, the current CPU might be at fault. */ sched_show_task(current); @@ -1874,7 +1903,7 @@ static void rcu_gp_fqs_loop(void) struct rcu_node *rnp = rcu_get_root(); first_gp_fqs = true; - j = jiffies_till_first_fqs; + j = READ_ONCE(jiffies_till_first_fqs); ret = 0; for (;;) { if (!ret) { @@ -1908,7 +1937,7 @@ static void rcu_gp_fqs_loop(void) cond_resched_tasks_rcu_qs(); WRITE_ONCE(rcu_state.gp_activity, jiffies); ret = 0; /* Force full wait till next FQS. */ - j = jiffies_till_next_fqs; + j = READ_ONCE(jiffies_till_next_fqs); } else { /* Deal with stray signal. */ cond_resched_tasks_rcu_qs(); @@ -3579,6 +3608,8 @@ static void __init rcu_init_geometry(void) jiffies_till_first_fqs = d; if (jiffies_till_next_fqs == ULONG_MAX) jiffies_till_next_fqs = d; + if (jiffies_till_sched_qs == ULONG_MAX) + adjust_jiffies_till_sched_qs(); /* If the compile-time values are accurate, just leave. */ if (rcu_fanout_leaf == RCU_FANOUT_LEAF && diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 726d57708849..7ec366268e2e 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -105,6 +105,8 @@ static void __init rcu_bootup_announce_oddness(void) pr_info("\tBoot-time adjustment of first FQS scan delay to %ld jiffies.\n", jiffies_till_first_fqs); if (jiffies_till_next_fqs != ULONG_MAX) pr_info("\tBoot-time adjustment of subsequent FQS scan delay to %ld jiffies.\n", jiffies_till_next_fqs); + if (jiffies_till_sched_qs != ULONG_MAX) + pr_info("\tBoot-time adjustment of scheduler-enlistment delay to %ld jiffies.\n", jiffies_till_sched_qs); if (rcu_kick_kthreads) pr_info("\tKick kthreads if too-long grace period.\n"); if (IS_ENABLED(CONFIG_DEBUG_OBJECTS_RCU_HEAD)) From d3052109c0bc9e536d17d627ae628ed8ceb6928c Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 25 Jul 2018 11:49:47 -0700 Subject: [PATCH 125/140] rcu: More aggressively enlist scheduler aid for nohz_full CPUs Because nohz_full CPUs can leave the scheduler-clock interrupt disabled even when in kernel mode, RCU cannot rely on rcu_check_callbacks() to enlist the scheduler's aid in extracting a quiescent state from such CPUs. This commit therefore more aggressively uses resched_cpu() on nohz_full CPUs that fail to pass through a quiescent state in a timely manner. By default, the resched_cpu() beating starts 300 milliseconds into the quiescent state. While in the neighborhood, add a ->last_fqs_resched field to the rcu_data structure in order to rate-limit resched_cpu() calls from the RCU grace-period kthread. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 32 ++++++++++++++++++++++++++------ kernel/rcu/tree.h | 1 + kernel/rcu/tree_plugin.h | 1 + 3 files changed, 28 insertions(+), 6 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 6bd0951a5f3a..96731f62594a 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -61,6 +61,7 @@ #include #include #include +#include #include "tree.h" #include "rcu.h" @@ -1088,19 +1089,38 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) WRITE_ONCE(*rnhqp, true); /* Store rcu_need_heavy_qs before rcu_urgent_qs. */ smp_store_release(ruqp, true); - rcu_state.jiffies_resched += jtsq; /* Re-enable beating. */ } else if (time_after(jiffies, rcu_state.gp_start + jtsq)) { WRITE_ONCE(*ruqp, true); } /* - * If more than halfway to RCU CPU stall-warning time, do a - * resched_cpu() to try to loosen things up a bit. Also check to - * see if the CPU is getting hammered with interrupts, but only - * once per grace period, just to keep the IPIs down to a dull roar. + * NO_HZ_FULL CPUs can run in-kernel without rcu_check_callbacks! + * The above code handles this, but only for straight cond_resched(). + * And some in-kernel loops check need_resched() before calling + * cond_resched(), which defeats the above code for CPUs that are + * running in-kernel with scheduling-clock interrupts disabled. + * So hit them over the head with the resched_cpu() hammer! + */ + if (tick_nohz_full_cpu(rdp->cpu) && + time_after(jiffies, + READ_ONCE(rdp->last_fqs_resched) + jtsq * 3)) { + resched_cpu(rdp->cpu); + WRITE_ONCE(rdp->last_fqs_resched, jiffies); + } + + /* + * If more than halfway to RCU CPU stall-warning time, invoke + * resched_cpu() more frequently to try to loosen things up a bit. + * Also check to see if the CPU is getting hammered with interrupts, + * but only once per grace period, just to keep the IPIs down to + * a dull roar. */ if (time_after(jiffies, rcu_state.jiffies_resched)) { - resched_cpu(rdp->cpu); + if (time_after(jiffies, + READ_ONCE(rdp->last_fqs_resched) + jtsq)) { + resched_cpu(rdp->cpu); + WRITE_ONCE(rdp->last_fqs_resched, jiffies); + } if (IS_ENABLED(CONFIG_IRQ_WORK) && !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq && (rnp->ffmask & rdp->grpmask)) { diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 4866fa44ab0b..8f053bb1eec8 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -260,6 +260,7 @@ struct rcu_data { short rcu_ofl_gp_flags; /* ->gp_flags at last offline. */ unsigned long rcu_onl_gp_seq; /* ->gp_seq at last online. */ short rcu_onl_gp_flags; /* ->gp_flags at last online. */ + unsigned long last_fqs_resched; /* Time of last rcu_resched(). */ int cpu; }; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 7ec366268e2e..1e80a0da7924 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1850,6 +1850,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp) { rdp->ticks_this_gp = 0; rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id()); + WRITE_ONCE(rdp->last_fqs_resched, jiffies); } #ifdef CONFIG_RCU_NOCB_CPU From fced9c8cfe6bc8a26dbbf785927aa673c83a7a35 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 26 Jul 2018 13:44:00 -0700 Subject: [PATCH 126/140] rcu: Avoid resched_cpu() when rescheduling the current CPU The resched_cpu() interface is quite handy, but it does acquire the specified CPU's runqueue lock, which does not come for free. This commit therefore substitutes the following when directing resched_cpu() at the current CPU: set_tsk_need_resched(current); set_preempt_need_resched(); Signed-off-by: Paul E. McKenney Cc: Peter Zijlstra --- kernel/rcu/tree.c | 11 +++++++---- kernel/rcu/tree_exp.h | 17 ++++++++++------- kernel/rcu/tree_plugin.h | 6 ++++-- 3 files changed, 21 insertions(+), 13 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 96731f62594a..92346ab8077d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1354,7 +1354,8 @@ static void print_cpu_stall(void) * progress and it could be we're stuck in kernel space without context * switches for an entirely unreasonable amount of time. */ - resched_cpu(smp_processor_id()); + set_tsk_need_resched(current); + set_preempt_need_resched(); } static void check_cpu_stall(struct rcu_data *rdp) @@ -2675,10 +2676,12 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused WARN_ON_ONCE(!rdp->beenonline); /* Report any deferred quiescent states if preemption enabled. */ - if (!(preempt_count() & PREEMPT_MASK)) + if (!(preempt_count() & PREEMPT_MASK)) { rcu_preempt_deferred_qs(current); - else if (rcu_preempt_need_deferred_qs(current)) - resched_cpu(rdp->cpu); /* Provoke future context switch. */ + } else if (rcu_preempt_need_deferred_qs(current)) { + set_tsk_need_resched(current); + set_preempt_need_resched(); + } /* Update RCU state based on any recent quiescent states. */ rcu_check_quiescent_state(rdp); diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 78553a8fa3c6..030df96e0d3c 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -672,7 +672,8 @@ static void sync_rcu_exp_handler(void *unused) rcu_report_exp_rdp(rdp); } else { rdp->deferred_qs = true; - resched_cpu(rdp->cpu); + set_tsk_need_resched(t); + set_preempt_need_resched(); } return; } @@ -710,15 +711,16 @@ static void sync_rcu_exp_handler(void *unused) * because we are in an interrupt handler, which will cause that * function to take an early exit without doing anything. * - * Otherwise, use resched_cpu() to force a context switch after - * the CPU enables everything. + * Otherwise, force a context switch after the CPU enables everything. */ rdp->deferred_qs = true; if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) || - WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs())) + WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs())) { rcu_preempt_deferred_qs(t); - else - resched_cpu(rdp->cpu); + } else { + set_tsk_need_resched(t); + set_preempt_need_resched(); + } } /* PREEMPT=y, so no PREEMPT=n expedited grace period to clean up after. */ @@ -779,7 +781,8 @@ static void sync_sched_exp_handler(void *unused) __this_cpu_write(rcu_data.cpu_no_qs.b.exp, true); /* Store .exp before .rcu_urgent_qs. */ smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true); - resched_cpu(smp_processor_id()); + set_tsk_need_resched(current); + set_preempt_need_resched(); } /* Send IPI for expedited cleanup if needed at end of CPU-hotplug operation. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 1e80a0da7924..978ce3539809 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -791,8 +791,10 @@ static void rcu_flavor_check_callbacks(int user) if (t->rcu_read_lock_nesting > 0 || (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) { /* No QS, force context switch if deferred. */ - if (rcu_preempt_need_deferred_qs(t)) - resched_cpu(smp_processor_id()); + if (rcu_preempt_need_deferred_qs(t)) { + set_tsk_need_resched(t); + set_preempt_need_resched(); + } } else if (rcu_preempt_need_deferred_qs(t)) { rcu_preempt_deferred_qs(t); /* Report deferred QS. */ return; From df63fa5bc11aadf81126d4a1785080c800e2ece3 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 31 Jul 2018 09:49:20 -0700 Subject: [PATCH 127/140] rcu: Convert "1UL << x" to "BIT(x)" This commit saves a few characters by converting "1UL << x" to "BIT(x)". Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 2 +- kernel/rcu/tree.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 92346ab8077d..e778fd5546d1 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3582,7 +3582,7 @@ static void __init rcu_init_one(void) rnp->parent = NULL; } else { rnp->grpnum = j % levelspread[i - 1]; - rnp->grpmask = 1UL << rnp->grpnum; + rnp->grpmask = BIT(rnp->grpnum); rnp->parent = rcu_state.level[i - 1] + j / levelspread[i - 1]; } diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 8f053bb1eec8..5e561f1339d4 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -168,7 +168,7 @@ struct rcu_node { * are indexed relative to this interval rather than the global CPU ID space. * This generates the bit for a CPU in node-local masks. */ -#define leaf_node_cpu_bit(rnp, cpu) (1UL << ((cpu) - (rnp)->grplo)) +#define leaf_node_cpu_bit(rnp, cpu) (BIT((cpu) - (rnp)->grplo)) /* * Union to allow "aggregate OR" operation on the need for a quiescent From 31ab604bf3232374e6471a2df9a83c4e75538390 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 3 Aug 2018 16:15:54 -0700 Subject: [PATCH 128/140] rcu: Remove unused rcu_dynticks_snap() from Tiny RCU The rcu_dynticks_snap() function is defined in include/linux/rcutiny.h, but is no longer used by Tiny RCU. This commit therefore removes it. Signed-off-by: Paul E. McKenney --- include/linux/rcutiny.h | 6 ------ 1 file changed, 6 deletions(-) diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index 7fa4fb9e899e..f183683bdf79 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -27,12 +27,6 @@ #include -struct rcu_dynticks; -static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp) -{ - return 0; -} - /* Never flag non-existent other CPUs! */ static inline bool rcu_eqs_special_set(int cpu) { return false; } From cc72046cc3cce8438778f306d59442febf4b7683 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 3 Aug 2018 19:31:39 -0700 Subject: [PATCH 129/140] rcu: Merge rcu_dynticks structure into rcu_data structure Now that there is only ever one rcu_data structure per CPU, there is no need for a separate rcu_dynticks structure. This commit therefore adds the rcu_dynticks fields into the rcu_data structure in preparation for removing the rcu_dynticks structure entirely. Note that the ->dynticks field will be handled specially because there is a field by that name in both structures. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 5e561f1339d4..d35cd9677b08 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -212,6 +212,23 @@ struct rcu_data { /* 3) dynticks interface. */ struct rcu_dynticks *dynticks; /* Shared per-CPU dynticks state. */ int dynticks_snap; /* Per-GP tracking for dynticks. */ + long dynticks_nesting; /* Track process nesting level. */ + long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ + // atomic_t dynticks; /* Even value for idle, else odd. */ + bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */ + bool rcu_urgent_qs; /* GP old need light quiescent state. */ +#ifdef CONFIG_RCU_FAST_NO_HZ + bool all_lazy; /* Are all CPU's CBs lazy? */ + unsigned long nonlazy_posted; + /* # times non-lazy CBs posted to CPU. */ + unsigned long nonlazy_posted_snap; + /* idle-period nonlazy_posted snapshot. */ + unsigned long last_accelerate; + /* Last jiffy CBs were accelerated. */ + unsigned long last_advance_all; + /* Last jiffy CBs were all advanced. */ + int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */ +#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ /* 4) reasons this CPU needed to be kicked by force_quiescent_state */ unsigned long dynticks_fqs; /* Kicked due to dynticks idle. */ From 0fd79e7521bc944522c3c97f40f3d25619e329f4 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 3 Aug 2018 21:00:38 -0700 Subject: [PATCH 130/140] rcu: Switch ->tick_nohz_enabled_snap to rcu_data structure This commit removes ->tick_nohz_enabled_snap from the rcu_dynticks structure and updates the code to access it from the rcu_data structure. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 1 - kernel/rcu/tree_plugin.h | 10 +++++----- 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index d35cd9677b08..5d447ceba769 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -53,7 +53,6 @@ struct rcu_dynticks { /* Last jiffy CBs were accelerated. */ unsigned long last_advance_all; /* Last jiffy CBs were all advanced. */ - int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */ #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ }; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 978ce3539809..6511032371c1 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1681,7 +1681,7 @@ int rcu_needs_cpu(u64 basemono, u64 *nextevt) static void rcu_prepare_for_idle(void) { bool needwake; - struct rcu_data *rdp; + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); struct rcu_node *rnp; int tne; @@ -1692,10 +1692,10 @@ static void rcu_prepare_for_idle(void) /* Handle nohz enablement switches conservatively. */ tne = READ_ONCE(tick_nohz_active); - if (tne != rdtp->tick_nohz_enabled_snap) { + if (tne != rdp->tick_nohz_enabled_snap) { if (rcu_cpu_has_callbacks(NULL)) invoke_rcu_core(); /* force nohz to see update. */ - rdtp->tick_nohz_enabled_snap = tne; + rdp->tick_nohz_enabled_snap = tne; return; } if (!tne) @@ -1721,7 +1721,6 @@ static void rcu_prepare_for_idle(void) if (rdtp->last_accelerate == jiffies) return; rdtp->last_accelerate = jiffies; - rdp = this_cpu_ptr(&rcu_data); if (rcu_segcblist_pend_cbs(&rdp->cblist)) { rnp = rdp->mynode; raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */ @@ -1765,6 +1764,7 @@ static void rcu_idle_count_callbacks_posted(void) static void print_cpu_stall_fast_no_hz(char *cp, int cpu) { + struct rcu_data *rdp = &per_cpu(rcu_data, cpu); struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu); unsigned long nlpd = rdtp->nonlazy_posted - rdtp->nonlazy_posted_snap; @@ -1772,7 +1772,7 @@ static void print_cpu_stall_fast_no_hz(char *cp, int cpu) rdtp->last_accelerate & 0xffff, jiffies & 0xffff, ulong2long(nlpd), rdtp->all_lazy ? 'L' : '.', - rdtp->tick_nohz_enabled_snap ? '.' : 'D'); + rdp->tick_nohz_enabled_snap ? '.' : 'D'); } #else /* #ifdef CONFIG_RCU_FAST_NO_HZ */ From 5998a75adbf4f85e63b06fa7723633cc84d7129b Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 3 Aug 2018 21:00:38 -0700 Subject: [PATCH 131/140] rcu: Switch last accelerate/advance to rcu_data structure This commit removes ->last_accelerate and ->last_advance_all from the rcu_dynticks structure and updates the code to access them from the rcu_data structure. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 4 ---- kernel/rcu/tree_plugin.h | 17 ++++++++--------- 2 files changed, 8 insertions(+), 13 deletions(-) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 5d447ceba769..69bd6bec05bb 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -49,10 +49,6 @@ struct rcu_dynticks { /* # times non-lazy CBs posted to CPU. */ unsigned long nonlazy_posted_snap; /* idle-period nonlazy_posted snapshot. */ - unsigned long last_accelerate; - /* Last jiffy CBs were accelerated. */ - unsigned long last_advance_all; - /* Last jiffy CBs were all advanced. */ #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ }; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 6511032371c1..45708164ddf9 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1597,16 +1597,14 @@ module_param(rcu_idle_lazy_gp_delay, int, 0644); static bool __maybe_unused rcu_try_advance_all_cbs(void) { bool cbs_ready = false; - struct rcu_data *rdp; - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_node *rnp; /* Exit early if we advanced recently. */ - if (jiffies == rdtp->last_advance_all) + if (jiffies == rdp->last_advance_all) return false; - rdtp->last_advance_all = jiffies; + rdp->last_advance_all = jiffies; - rdp = this_cpu_ptr(&rcu_data); rnp = rdp->mynode; /* @@ -1635,6 +1633,7 @@ static bool __maybe_unused rcu_try_advance_all_cbs(void) */ int rcu_needs_cpu(u64 basemono, u64 *nextevt) { + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); unsigned long dj; @@ -1655,7 +1654,7 @@ int rcu_needs_cpu(u64 basemono, u64 *nextevt) invoke_rcu_core(); return 1; } - rdtp->last_accelerate = jiffies; + rdp->last_accelerate = jiffies; /* Request timer delay depending on laziness, and round. */ if (!rdtp->all_lazy) { @@ -1718,9 +1717,9 @@ static void rcu_prepare_for_idle(void) * If we have not yet accelerated this jiffy, accelerate all * callbacks on this CPU. */ - if (rdtp->last_accelerate == jiffies) + if (rdp->last_accelerate == jiffies) return; - rdtp->last_accelerate = jiffies; + rdp->last_accelerate = jiffies; if (rcu_segcblist_pend_cbs(&rdp->cblist)) { rnp = rdp->mynode; raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */ @@ -1769,7 +1768,7 @@ static void print_cpu_stall_fast_no_hz(char *cp, int cpu) unsigned long nlpd = rdtp->nonlazy_posted - rdtp->nonlazy_posted_snap; sprintf(cp, "last_accelerate: %04lx/%04lx, nonlazy_posted: %ld, %c%c", - rdtp->last_accelerate & 0xffff, jiffies & 0xffff, + rdp->last_accelerate & 0xffff, jiffies & 0xffff, ulong2long(nlpd), rdtp->all_lazy ? 'L' : '.', rdp->tick_nohz_enabled_snap ? '.' : 'D'); From c458a89e964dbf3c56b23eca2018bd0e2380969d Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 3 Aug 2018 21:00:38 -0700 Subject: [PATCH 132/140] rcu: Switch lazy counts to rcu_data structure This commit removes ->all_lazy, ->nonlazy_posted and ->nonlazy_posted_snap from the rcu_dynticks structure and updates the code to access them from the rcu_data structure. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.h | 7 ------- kernel/rcu/tree_plugin.h | 23 ++++++++++------------- 2 files changed, 10 insertions(+), 20 deletions(-) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 69bd6bec05bb..36a47c7bd882 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -43,13 +43,6 @@ struct rcu_dynticks { atomic_t dynticks; /* Even value for idle, else odd. */ bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */ bool rcu_urgent_qs; /* GP old need light quiescent state. */ -#ifdef CONFIG_RCU_FAST_NO_HZ - bool all_lazy; /* Are all CPU's CBs lazy? */ - unsigned long nonlazy_posted; - /* # times non-lazy CBs posted to CPU. */ - unsigned long nonlazy_posted_snap; - /* idle-period nonlazy_posted snapshot. */ -#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ }; /* Communicate arguments to a workqueue handler. */ diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 45708164ddf9..b5aeb2fe4cfe 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1634,16 +1634,15 @@ static bool __maybe_unused rcu_try_advance_all_cbs(void) int rcu_needs_cpu(u64 basemono, u64 *nextevt) { struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); unsigned long dj; lockdep_assert_irqs_disabled(); /* Snapshot to detect later posting of non-lazy callback. */ - rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted; + rdp->nonlazy_posted_snap = rdp->nonlazy_posted; /* If no callbacks, RCU doesn't need the CPU. */ - if (!rcu_cpu_has_callbacks(&rdtp->all_lazy)) { + if (!rcu_cpu_has_callbacks(&rdp->all_lazy)) { *nextevt = KTIME_MAX; return 0; } @@ -1657,7 +1656,7 @@ int rcu_needs_cpu(u64 basemono, u64 *nextevt) rdp->last_accelerate = jiffies; /* Request timer delay depending on laziness, and round. */ - if (!rdtp->all_lazy) { + if (!rdp->all_lazy) { dj = round_up(rcu_idle_gp_delay + jiffies, rcu_idle_gp_delay) - jiffies; } else { @@ -1681,7 +1680,6 @@ static void rcu_prepare_for_idle(void) { bool needwake; struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); struct rcu_node *rnp; int tne; @@ -1705,10 +1703,10 @@ static void rcu_prepare_for_idle(void) * callbacks, invoke RCU core for the side-effect of recalculating * idle duration on re-entry to idle. */ - if (rdtp->all_lazy && - rdtp->nonlazy_posted != rdtp->nonlazy_posted_snap) { - rdtp->all_lazy = false; - rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted; + if (rdp->all_lazy && + rdp->nonlazy_posted != rdp->nonlazy_posted_snap) { + rdp->all_lazy = false; + rdp->nonlazy_posted_snap = rdp->nonlazy_posted; invoke_rcu_core(); return; } @@ -1754,7 +1752,7 @@ static void rcu_cleanup_after_idle(void) */ static void rcu_idle_count_callbacks_posted(void) { - __this_cpu_add(rcu_dynticks.nonlazy_posted, 1); + __this_cpu_add(rcu_data.nonlazy_posted, 1); } #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ @@ -1764,13 +1762,12 @@ static void rcu_idle_count_callbacks_posted(void) static void print_cpu_stall_fast_no_hz(char *cp, int cpu) { struct rcu_data *rdp = &per_cpu(rcu_data, cpu); - struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu); - unsigned long nlpd = rdtp->nonlazy_posted - rdtp->nonlazy_posted_snap; + unsigned long nlpd = rdp->nonlazy_posted - rdp->nonlazy_posted_snap; sprintf(cp, "last_accelerate: %04lx/%04lx, nonlazy_posted: %ld, %c%c", rdp->last_accelerate & 0xffff, jiffies & 0xffff, ulong2long(nlpd), - rdtp->all_lazy ? 'L' : '.', + rdp->all_lazy ? 'L' : '.', rdp->tick_nohz_enabled_snap ? '.' : 'D'); } From 2dba13f0b6c2b26ff371b8927ac58d20a7d94713 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 3 Aug 2018 21:00:38 -0700 Subject: [PATCH 133/140] rcu: Switch urgent quiescent-state requests to rcu_data structure This commit removes ->rcu_need_heavy_qs and ->rcu_urgent_qs from the rcu_dynticks structure and updates the code to access them from the rcu_data structure. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 12 ++++++------ kernel/rcu/tree.h | 2 -- kernel/rcu/tree_exp.h | 2 +- kernel/rcu/tree_plugin.h | 14 +++++++------- 4 files changed, 14 insertions(+), 16 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e778fd5546d1..7ec0ba885273 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -362,7 +362,7 @@ static void __maybe_unused rcu_momentary_dyntick_idle(void) struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); int special; - raw_cpu_write(rcu_dynticks.rcu_need_heavy_qs, false); + raw_cpu_write(rcu_data.rcu_need_heavy_qs, false); special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); /* It is illegal to call this from idle state. */ WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR)); @@ -928,7 +928,7 @@ void rcu_request_urgent_qs_task(struct task_struct *t) cpu = task_cpu(t); if (!task_curr(t)) return; /* This task is not running on that CPU. */ - smp_store_release(per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, cpu), true); + smp_store_release(per_cpu_ptr(&rcu_data.rcu_urgent_qs, cpu), true); } #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) @@ -1081,8 +1081,8 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) * is set way high. */ jtsq = READ_ONCE(jiffies_to_sched_qs); - ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu); - rnhqp = &per_cpu(rcu_dynticks.rcu_need_heavy_qs, rdp->cpu); + ruqp = per_cpu_ptr(&rcu_data.rcu_urgent_qs, rdp->cpu); + rnhqp = &per_cpu(rcu_data.rcu_need_heavy_qs, rdp->cpu); if (!READ_ONCE(*rnhqp) && (time_after(jiffies, rcu_state.gp_start + jtsq * 2) || time_after(jiffies, rcu_state.jiffies_resched))) { @@ -2499,13 +2499,13 @@ void rcu_check_callbacks(int user) trace_rcu_utilization(TPS("Start scheduler-tick")); raw_cpu_inc(rcu_data.ticks_this_gp); /* The load-acquire pairs with the store-release setting to true. */ - if (smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) { + if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) { /* Idle and userspace execution already are quiescent states. */ if (!rcu_is_cpu_rrupt_from_idle() && !user) { set_tsk_need_resched(current); set_preempt_need_resched(); } - __this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); + __this_cpu_write(rcu_data.rcu_urgent_qs, false); } rcu_flavor_check_callbacks(user); if (rcu_pending()) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 36a47c7bd882..4c31066ddb94 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -41,8 +41,6 @@ struct rcu_dynticks { long dynticks_nesting; /* Track process nesting level. */ long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ atomic_t dynticks; /* Even value for idle, else odd. */ - bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */ - bool rcu_urgent_qs; /* GP old need light quiescent state. */ }; /* Communicate arguments to a workqueue handler. */ diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 030df96e0d3c..11387fcd4d85 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -780,7 +780,7 @@ static void sync_sched_exp_handler(void *unused) } __this_cpu_write(rcu_data.cpu_no_qs.b.exp, true); /* Store .exp before .rcu_urgent_qs. */ - smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true); + smp_store_release(this_cpu_ptr(&rcu_data.rcu_urgent_qs), true); set_tsk_need_resched(current); set_preempt_need_resched(); } diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index b5aeb2fe4cfe..161760957a07 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -967,17 +967,17 @@ void rcu_all_qs(void) { unsigned long flags; - if (!raw_cpu_read(rcu_dynticks.rcu_urgent_qs)) + if (!raw_cpu_read(rcu_data.rcu_urgent_qs)) return; preempt_disable(); /* Load rcu_urgent_qs before other flags. */ - if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) { + if (!smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) { preempt_enable(); return; } - this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); + this_cpu_write(rcu_data.rcu_urgent_qs, false); barrier(); /* Avoid RCU read-side critical sections leaking down. */ - if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) { + if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) { local_irq_save(flags); rcu_momentary_dyntick_idle(); local_irq_restore(flags); @@ -997,10 +997,10 @@ void rcu_note_context_switch(bool preempt) trace_rcu_utilization(TPS("Start context switch")); rcu_qs(); /* Load rcu_urgent_qs before other flags. */ - if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) + if (!smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) goto out; - this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); - if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) + this_cpu_write(rcu_data.rcu_urgent_qs, false); + if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) rcu_momentary_dyntick_idle(); if (!preempt) rcu_tasks_qs(current); From 4c5273bf2b5ed9b585e470dda19c09c875a9fbbd Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 3 Aug 2018 21:00:38 -0700 Subject: [PATCH 134/140] rcu: Switch dyntick nesting counters to rcu_data structure This commit removes ->dynticks_nesting and ->dynticks_nmi_nesting from the rcu_dynticks structure and updates the code to access them from the rcu_data structure. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 79 +++++++++++++++++++++------------------- kernel/rcu/tree.h | 2 - kernel/rcu/tree_plugin.h | 2 +- 3 files changed, 43 insertions(+), 40 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 7ec0ba885273..bfa264a6f3fc 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -73,7 +73,10 @@ /* Data structures. */ -static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data); +static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = { + .dynticks_nesting = 1, + .dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE, +}; struct rcu_state rcu_state = { .level = { &rcu_state.node[0] }, .gp_state = RCU_GP_IDLE, @@ -210,8 +213,6 @@ void rcu_softirq_qs(void) #endif static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = { - .dynticks_nesting = 1, - .dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE, .dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR), }; @@ -378,8 +379,8 @@ static void __maybe_unused rcu_momentary_dyntick_idle(void) */ static int rcu_is_cpu_rrupt_from_idle(void) { - return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 0 && - __this_cpu_read(rcu_dynticks.dynticks_nmi_nesting) <= 1; + return __this_cpu_read(rcu_data.dynticks_nesting) <= 0 && + __this_cpu_read(rcu_data.dynticks_nmi_nesting) <= 1; } #define DEFAULT_RCU_BLIMIT 10 /* Maximum callbacks per rcu_do_batch. */ @@ -571,27 +572,27 @@ static struct rcu_node *rcu_get_root(void) */ static void rcu_eqs_enter(bool user) { - struct rcu_data *rdp; + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); struct rcu_dynticks *rdtp; rdtp = this_cpu_ptr(&rcu_dynticks); - WARN_ON_ONCE(rdtp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE); - WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0); + WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE); + WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && - rdtp->dynticks_nesting == 0); - if (rdtp->dynticks_nesting != 1) { - rdtp->dynticks_nesting--; + rdp->dynticks_nesting == 0); + if (rdp->dynticks_nesting != 1) { + rdp->dynticks_nesting--; return; } lockdep_assert_irqs_disabled(); - trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0, rdtp->dynticks); + trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, rdtp->dynticks); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); rdp = this_cpu_ptr(&rcu_data); do_nocb_deferred_wakeup(rdp); rcu_prepare_for_idle(); rcu_preempt_deferred_qs(current); - WRITE_ONCE(rdtp->dynticks_nesting, 0); /* Avoid irq-access tearing. */ + WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid irq-access tearing. */ rcu_dynticks_eqs_enter(); rcu_dynticks_task_enter(); } @@ -634,7 +635,7 @@ void rcu_user_enter(void) /* * If we are returning from the outermost NMI handler that interrupted an - * RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting + * RCU-idle period, update rdtp->dynticks and rdp->dynticks_nmi_nesting * to let the RCU grace-period handling know that the CPU is back to * being RCU-idle. * @@ -643,30 +644,31 @@ void rcu_user_enter(void) */ static __always_inline void rcu_nmi_exit_common(bool irq) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); + struct rcu_dynticks __maybe_unused *rdtp = this_cpu_ptr(&rcu_dynticks); /* * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks. * (We are exiting an NMI handler, so RCU better be paying attention * to us!) */ - WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0); + WARN_ON_ONCE(rdp->dynticks_nmi_nesting <= 0); WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs()); /* * If the nesting level is not 1, the CPU wasn't RCU-idle, so * leave it in non-RCU-idle state. */ - if (rdtp->dynticks_nmi_nesting != 1) { - trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nmi_nesting, rdtp->dynticks_nmi_nesting - 2, rdtp->dynticks); - WRITE_ONCE(rdtp->dynticks_nmi_nesting, /* No store tearing. */ - rdtp->dynticks_nmi_nesting - 2); + if (rdp->dynticks_nmi_nesting != 1) { + trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdtp->dynticks); + WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */ + rdp->dynticks_nmi_nesting - 2); return; } /* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */ - trace_rcu_dyntick(TPS("Startirq"), rdtp->dynticks_nmi_nesting, 0, rdtp->dynticks); - WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */ + trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, rdtp->dynticks); + WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */ if (irq) rcu_prepare_for_idle(); @@ -739,25 +741,27 @@ void rcu_irq_exit_irqson(void) */ static void rcu_eqs_exit(bool user) { + struct rcu_data *rdp; struct rcu_dynticks *rdtp; long oldval; lockdep_assert_irqs_disabled(); rdtp = this_cpu_ptr(&rcu_dynticks); - oldval = rdtp->dynticks_nesting; + rdp = this_cpu_ptr(&rcu_data); + oldval = rdp->dynticks_nesting; WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0); if (oldval) { - rdtp->dynticks_nesting++; + rdp->dynticks_nesting++; return; } rcu_dynticks_task_exit(); rcu_dynticks_eqs_exit(); rcu_cleanup_after_idle(); - trace_rcu_dyntick(TPS("End"), rdtp->dynticks_nesting, 1, rdtp->dynticks); + trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, rdtp->dynticks); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); - WRITE_ONCE(rdtp->dynticks_nesting, 1); - WARN_ON_ONCE(rdtp->dynticks_nmi_nesting); - WRITE_ONCE(rdtp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE); + WRITE_ONCE(rdp->dynticks_nesting, 1); + WARN_ON_ONCE(rdp->dynticks_nmi_nesting); + WRITE_ONCE(rdp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE); } /** @@ -799,7 +803,7 @@ void rcu_user_exit(void) * @irq: Is this call from rcu_irq_enter? * * If the CPU was idle from RCU's viewpoint, update rdtp->dynticks and - * rdtp->dynticks_nmi_nesting to let the RCU grace-period handling know + * rdp->dynticks_nmi_nesting to let the RCU grace-period handling know * that the CPU is active. This implementation permits nested NMIs, as * long as the nesting level does not overflow an int. (You will probably * run out of stack space first.) @@ -809,11 +813,12 @@ void rcu_user_exit(void) */ static __always_inline void rcu_nmi_enter_common(bool irq) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); + struct rcu_dynticks __maybe_unused *rdtp = this_cpu_ptr(&rcu_dynticks); long incby = 2; /* Complain about underflow. */ - WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 0); + WARN_ON_ONCE(rdp->dynticks_nmi_nesting < 0); /* * If idle from RCU viewpoint, atomically increment ->dynticks @@ -836,10 +841,10 @@ static __always_inline void rcu_nmi_enter_common(bool irq) incby = 1; } trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="), - rdtp->dynticks_nmi_nesting, - rdtp->dynticks_nmi_nesting + incby, rdtp->dynticks); - WRITE_ONCE(rdtp->dynticks_nmi_nesting, /* Prevent store tearing. */ - rdtp->dynticks_nmi_nesting + incby); + rdp->dynticks_nmi_nesting, + rdp->dynticks_nmi_nesting + incby, rdtp->dynticks); + WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */ + rdp->dynticks_nmi_nesting + incby); barrier(); } @@ -3194,7 +3199,7 @@ rcu_boot_init_percpu_data(int cpu) /* Set up local state, ensuring consistent view of global state. */ rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu); rdp->dynticks = &per_cpu(rcu_dynticks, cpu); - WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != 1); + WARN_ON_ONCE(rdp->dynticks_nesting != 1); WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp->dynticks))); rdp->rcu_ofl_gp_seq = rcu_state.gp_seq; rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED; @@ -3227,7 +3232,7 @@ int rcutree_prepare_cpu(unsigned int cpu) if (rcu_segcblist_empty(&rdp->cblist) && /* No early-boot CBs? */ !init_nocb_callback_list(rdp)) rcu_segcblist_init(&rdp->cblist); /* Re-enable callbacks. */ - rdp->dynticks->dynticks_nesting = 1; /* CPU not up, no tearing. */ + rdp->dynticks_nesting = 1; /* CPU not up, no tearing. */ rcu_dynticks_eqs_online(); raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 4c31066ddb94..2e5eec48a94a 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -38,8 +38,6 @@ * Dynticks per-CPU state. */ struct rcu_dynticks { - long dynticks_nesting; /* Track process nesting level. */ - long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ atomic_t dynticks; /* Even value for idle, else odd. */ }; diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 161760957a07..7087ee3e1ea5 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1831,7 +1831,7 @@ static void print_cpu_stall_info(int cpu) "!."[!delta], ticks_value, ticks_title, rcu_dynticks_snap(rdtp) & 0xfff, - rdtp->dynticks_nesting, rdtp->dynticks_nmi_nesting, + rdp->dynticks_nesting, rdp->dynticks_nmi_nesting, rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart, fast_no_hz); From dc5a4f2932f18568bb9d8cdbe2139a8ddbc28bb8 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 3 Aug 2018 21:00:38 -0700 Subject: [PATCH 135/140] rcu: Switch ->dynticks to rcu_data structure, remove rcu_dynticks This commit move ->dynticks from the rcu_dynticks structure to the rcu_data structure, replacing the field of the same name. It also updates the code to access ->dynticks from the rcu_data structure and to use the rcu_data structure rather than following to now-gone ->dynticks field to the now-gone rcu_dynticks structure. While in the area, this commit also fixes up comments. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 92 ++++++++++++++++++---------------------- kernel/rcu/tree.h | 35 ++++++--------- kernel/rcu/tree_exp.h | 6 +-- kernel/rcu/tree_plugin.h | 3 +- 4 files changed, 56 insertions(+), 80 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index bfa264a6f3fc..32f500fb24d3 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -73,9 +73,20 @@ /* Data structures. */ +/* + * Steal a bit from the bottom of ->dynticks for idle entry/exit + * control. Initially this is for TLB flushing. + */ +#define RCU_DYNTICK_CTRL_MASK 0x1 +#define RCU_DYNTICK_CTRL_CTR (RCU_DYNTICK_CTRL_MASK + 1) +#ifndef rcu_eqs_special_exit +#define rcu_eqs_special_exit() do { } while (0) +#endif + static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = { .dynticks_nesting = 1, .dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE, + .dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR), }; struct rcu_state rcu_state = { .level = { &rcu_state.node[0] }, @@ -202,27 +213,13 @@ void rcu_softirq_qs(void) rcu_preempt_deferred_qs(current); } -/* - * Steal a bit from the bottom of ->dynticks for idle entry/exit - * control. Initially this is for TLB flushing. - */ -#define RCU_DYNTICK_CTRL_MASK 0x1 -#define RCU_DYNTICK_CTRL_CTR (RCU_DYNTICK_CTRL_MASK + 1) -#ifndef rcu_eqs_special_exit -#define rcu_eqs_special_exit() do { } while (0) -#endif - -static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = { - .dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR), -}; - /* * Record entry into an extended quiescent state. This is only to be * called when not already in an extended quiescent state. */ static void rcu_dynticks_eqs_enter(void) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); int seq; /* @@ -230,7 +227,7 @@ static void rcu_dynticks_eqs_enter(void) * critical sections, and we also must force ordering with the * next idle sojourn. */ - seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); + seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks); /* Better be in an extended quiescent state! */ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & RCU_DYNTICK_CTRL_CTR)); @@ -245,7 +242,7 @@ static void rcu_dynticks_eqs_enter(void) */ static void rcu_dynticks_eqs_exit(void) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); int seq; /* @@ -253,11 +250,11 @@ static void rcu_dynticks_eqs_exit(void) * and we also must force ordering with the next RCU read-side * critical section. */ - seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); + seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & RCU_DYNTICK_CTRL_CTR)); if (seq & RCU_DYNTICK_CTRL_MASK) { - atomic_andnot(RCU_DYNTICK_CTRL_MASK, &rdtp->dynticks); + atomic_andnot(RCU_DYNTICK_CTRL_MASK, &rdp->dynticks); smp_mb__after_atomic(); /* _exit after clearing mask. */ /* Prefer duplicate flushes to losing a flush. */ rcu_eqs_special_exit(); @@ -276,11 +273,11 @@ static void rcu_dynticks_eqs_exit(void) */ static void rcu_dynticks_eqs_online(void) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - if (atomic_read(&rdtp->dynticks) & RCU_DYNTICK_CTRL_CTR) + if (atomic_read(&rdp->dynticks) & RCU_DYNTICK_CTRL_CTR) return; - atomic_add(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); + atomic_add(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks); } /* @@ -290,18 +287,18 @@ static void rcu_dynticks_eqs_online(void) */ bool rcu_dynticks_curr_cpu_in_eqs(void) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - return !(atomic_read(&rdtp->dynticks) & RCU_DYNTICK_CTRL_CTR); + return !(atomic_read(&rdp->dynticks) & RCU_DYNTICK_CTRL_CTR); } /* * Snapshot the ->dynticks counter with full ordering so as to allow * stable comparison of this counter with past and future snapshots. */ -int rcu_dynticks_snap(struct rcu_dynticks *rdtp) +int rcu_dynticks_snap(struct rcu_data *rdp) { - int snap = atomic_add_return(0, &rdtp->dynticks); + int snap = atomic_add_return(0, &rdp->dynticks); return snap & ~RCU_DYNTICK_CTRL_MASK; } @@ -316,13 +313,13 @@ static bool rcu_dynticks_in_eqs(int snap) } /* - * Return true if the CPU corresponding to the specified rcu_dynticks + * Return true if the CPU corresponding to the specified rcu_data * structure has spent some time in an extended quiescent state since * rcu_dynticks_snap() returned the specified snapshot. */ -static bool rcu_dynticks_in_eqs_since(struct rcu_dynticks *rdtp, int snap) +static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap) { - return snap != rcu_dynticks_snap(rdtp); + return snap != rcu_dynticks_snap(rdp); } /* @@ -336,14 +333,14 @@ bool rcu_eqs_special_set(int cpu) { int old; int new; - struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu); + struct rcu_data *rdp = &per_cpu(rcu_data, cpu); do { - old = atomic_read(&rdtp->dynticks); + old = atomic_read(&rdp->dynticks); if (old & RCU_DYNTICK_CTRL_CTR) return false; new = old | RCU_DYNTICK_CTRL_MASK; - } while (atomic_cmpxchg(&rdtp->dynticks, old, new) != old); + } while (atomic_cmpxchg(&rdp->dynticks, old, new) != old); return true; } @@ -360,11 +357,11 @@ bool rcu_eqs_special_set(int cpu) */ static void __maybe_unused rcu_momentary_dyntick_idle(void) { - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); int special; raw_cpu_write(rcu_data.rcu_need_heavy_qs, false); - special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); + special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR, + &this_cpu_ptr(&rcu_data)->dynticks); /* It is illegal to call this from idle state. */ WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR)); rcu_preempt_deferred_qs(current); @@ -573,9 +570,7 @@ static struct rcu_node *rcu_get_root(void) static void rcu_eqs_enter(bool user) { struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - struct rcu_dynticks *rdtp; - rdtp = this_cpu_ptr(&rcu_dynticks); WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE); WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && @@ -586,7 +581,7 @@ static void rcu_eqs_enter(bool user) } lockdep_assert_irqs_disabled(); - trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, rdtp->dynticks); + trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, rdp->dynticks); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); rdp = this_cpu_ptr(&rcu_data); do_nocb_deferred_wakeup(rdp); @@ -635,7 +630,7 @@ void rcu_user_enter(void) /* * If we are returning from the outermost NMI handler that interrupted an - * RCU-idle period, update rdtp->dynticks and rdp->dynticks_nmi_nesting + * RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting * to let the RCU grace-period handling know that the CPU is back to * being RCU-idle. * @@ -645,7 +640,6 @@ void rcu_user_enter(void) static __always_inline void rcu_nmi_exit_common(bool irq) { struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - struct rcu_dynticks __maybe_unused *rdtp = this_cpu_ptr(&rcu_dynticks); /* * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks. @@ -660,14 +654,14 @@ static __always_inline void rcu_nmi_exit_common(bool irq) * leave it in non-RCU-idle state. */ if (rdp->dynticks_nmi_nesting != 1) { - trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdtp->dynticks); + trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdp->dynticks); WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */ rdp->dynticks_nmi_nesting - 2); return; } /* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */ - trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, rdtp->dynticks); + trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, rdp->dynticks); WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */ if (irq) @@ -742,11 +736,9 @@ void rcu_irq_exit_irqson(void) static void rcu_eqs_exit(bool user) { struct rcu_data *rdp; - struct rcu_dynticks *rdtp; long oldval; lockdep_assert_irqs_disabled(); - rdtp = this_cpu_ptr(&rcu_dynticks); rdp = this_cpu_ptr(&rcu_data); oldval = rdp->dynticks_nesting; WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0); @@ -757,7 +749,7 @@ static void rcu_eqs_exit(bool user) rcu_dynticks_task_exit(); rcu_dynticks_eqs_exit(); rcu_cleanup_after_idle(); - trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, rdtp->dynticks); + trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, rdp->dynticks); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); WRITE_ONCE(rdp->dynticks_nesting, 1); WARN_ON_ONCE(rdp->dynticks_nmi_nesting); @@ -802,7 +794,7 @@ void rcu_user_exit(void) * rcu_nmi_enter_common - inform RCU of entry to NMI context * @irq: Is this call from rcu_irq_enter? * - * If the CPU was idle from RCU's viewpoint, update rdtp->dynticks and + * If the CPU was idle from RCU's viewpoint, update rdp->dynticks and * rdp->dynticks_nmi_nesting to let the RCU grace-period handling know * that the CPU is active. This implementation permits nested NMIs, as * long as the nesting level does not overflow an int. (You will probably @@ -814,7 +806,6 @@ void rcu_user_exit(void) static __always_inline void rcu_nmi_enter_common(bool irq) { struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - struct rcu_dynticks __maybe_unused *rdtp = this_cpu_ptr(&rcu_dynticks); long incby = 2; /* Complain about underflow. */ @@ -842,7 +833,7 @@ static __always_inline void rcu_nmi_enter_common(bool irq) } trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="), rdp->dynticks_nmi_nesting, - rdp->dynticks_nmi_nesting + incby, rdtp->dynticks); + rdp->dynticks_nmi_nesting + incby, rdp->dynticks); WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */ rdp->dynticks_nmi_nesting + incby); barrier(); @@ -995,7 +986,7 @@ static void rcu_gpnum_ovf(struct rcu_node *rnp, struct rcu_data *rdp) */ static int dyntick_save_progress_counter(struct rcu_data *rdp) { - rdp->dynticks_snap = rcu_dynticks_snap(rdp->dynticks); + rdp->dynticks_snap = rcu_dynticks_snap(rdp); if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) { trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti")); rcu_gpnum_ovf(rdp->mynode, rdp); @@ -1046,7 +1037,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) * read-side critical section that started before the beginning * of the current RCU grace period. */ - if (rcu_dynticks_in_eqs_since(rdp->dynticks, rdp->dynticks_snap)) { + if (rcu_dynticks_in_eqs_since(rdp, rdp->dynticks_snap)) { trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti")); rdp->dynticks_fqs++; rcu_gpnum_ovf(rnp, rdp); @@ -3198,9 +3189,8 @@ rcu_boot_init_percpu_data(int cpu) /* Set up local state, ensuring consistent view of global state. */ rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu); - rdp->dynticks = &per_cpu(rcu_dynticks, cpu); WARN_ON_ONCE(rdp->dynticks_nesting != 1); - WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp->dynticks))); + WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp))); rdp->rcu_ofl_gp_seq = rcu_state.gp_seq; rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED; rdp->rcu_onl_gp_seq = rcu_state.gp_seq; diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 2e5eec48a94a..af8681fec23b 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -34,13 +34,6 @@ #include "rcu_segcblist.h" -/* - * Dynticks per-CPU state. - */ -struct rcu_dynticks { - atomic_t dynticks; /* Even value for idle, else odd. */ -}; - /* Communicate arguments to a workqueue handler. */ struct rcu_exp_work { smp_call_func_t rew_func; @@ -194,24 +187,20 @@ struct rcu_data { long blimit; /* Upper limit on a processed batch */ /* 3) dynticks interface. */ - struct rcu_dynticks *dynticks; /* Shared per-CPU dynticks state. */ int dynticks_snap; /* Per-GP tracking for dynticks. */ - long dynticks_nesting; /* Track process nesting level. */ - long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ - // atomic_t dynticks; /* Even value for idle, else odd. */ - bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */ - bool rcu_urgent_qs; /* GP old need light quiescent state. */ + long dynticks_nesting; /* Track process nesting level. */ + long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ + atomic_t dynticks; /* Even value for idle, else odd. */ + bool rcu_need_heavy_qs; /* GP old, so heavy quiescent state! */ + bool rcu_urgent_qs; /* GP old need light quiescent state. */ #ifdef CONFIG_RCU_FAST_NO_HZ - bool all_lazy; /* Are all CPU's CBs lazy? */ - unsigned long nonlazy_posted; - /* # times non-lazy CBs posted to CPU. */ + bool all_lazy; /* Are all CPU's CBs lazy? */ + unsigned long nonlazy_posted; /* # times non-lazy CB posted to CPU. */ unsigned long nonlazy_posted_snap; - /* idle-period nonlazy_posted snapshot. */ - unsigned long last_accelerate; - /* Last jiffy CBs were accelerated. */ - unsigned long last_advance_all; - /* Last jiffy CBs were all advanced. */ - int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */ + /* Nonlazy_posted snapshot. */ + unsigned long last_accelerate; /* Last jiffy CBs were accelerated. */ + unsigned long last_advance_all; /* Last jiffy CBs were all advanced. */ + int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */ #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ /* 4) reasons this CPU needed to be kicked by force_quiescent_state */ @@ -426,7 +415,7 @@ extern struct rcu_state rcu_bh_state; extern struct rcu_state rcu_preempt_state; #endif /* #ifdef CONFIG_PREEMPT_RCU */ -int rcu_dynticks_snap(struct rcu_dynticks *rdtp); +int rcu_dynticks_snap(struct rcu_data *rdp); #ifdef CONFIG_RCU_BOOST DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_status); diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 11387fcd4d85..8d18c1014e2b 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -360,14 +360,13 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp) for_each_leaf_node_cpu_mask(rnp, cpu, rnp->expmask) { unsigned long mask = leaf_node_cpu_bit(rnp, cpu); struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); - struct rcu_dynticks *rdtp = per_cpu_ptr(&rcu_dynticks, cpu); int snap; if (raw_smp_processor_id() == cpu || !(rnp->qsmaskinitnext & mask)) { mask_ofl_test |= mask; } else { - snap = rcu_dynticks_snap(rdtp); + snap = rcu_dynticks_snap(rdp); if (rcu_dynticks_in_eqs(snap)) mask_ofl_test |= mask; else @@ -393,8 +392,7 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp) if (!(mask_ofl_ipi & mask)) continue; retry_ipi: - if (rcu_dynticks_in_eqs_since(rdp->dynticks, - rdp->exp_dynticks_snap)) { + if (rcu_dynticks_in_eqs_since(rdp, rdp->exp_dynticks_snap)) { mask_ofl_test |= mask; continue; } diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 7087ee3e1ea5..05915e536336 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1802,7 +1802,6 @@ static void print_cpu_stall_info(int cpu) unsigned long delta; char fast_no_hz[72]; struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); - struct rcu_dynticks *rdtp = rdp->dynticks; char *ticks_title; unsigned long ticks_value; @@ -1830,7 +1829,7 @@ static void print_cpu_stall_info(int cpu) rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' : "!."[!delta], ticks_value, ticks_title, - rcu_dynticks_snap(rdtp) & 0xfff, + rcu_dynticks_snap(rdp) & 0xfff, rdp->dynticks_nesting, rdp->dynticks_nmi_nesting, rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart, From 8d8a9d0e7eda9feeee4af7be31932e14b512d3ad Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sat, 4 Aug 2018 20:32:07 -0700 Subject: [PATCH 136/140] rcu: Remove obsolete ->dynticks_fqs and ->cond_resched_completed The rcu_data structure's ->dynticks_fqs is incremented but never accesses. Its ->cond_resched_completed field isn't used at all. This commit therefore removes both fields. Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 1 - kernel/rcu/tree.h | 12 +++--------- 2 files changed, 3 insertions(+), 10 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 32f500fb24d3..85c2c2dc4c4a 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1039,7 +1039,6 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) */ if (rcu_dynticks_in_eqs_since(rdp, rdp->dynticks_snap)) { trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti")); - rdp->dynticks_fqs++; rcu_gpnum_ovf(rnp, rdp); return 1; } diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index af8681fec23b..bfbf97a1c29d 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -203,17 +203,11 @@ struct rcu_data { int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */ #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ - /* 4) reasons this CPU needed to be kicked by force_quiescent_state */ - unsigned long dynticks_fqs; /* Kicked due to dynticks idle. */ - unsigned long cond_resched_completed; - /* Grace period that needs help */ - /* from cond_resched(). */ - - /* 5) rcu_barrier(), OOM callbacks, and expediting. */ + /* 4) rcu_barrier(), OOM callbacks, and expediting. */ struct rcu_head barrier_head; int exp_dynticks_snap; /* Double-check need for IPI. */ - /* 6) Callback offloading. */ + /* 5) Callback offloading. */ #ifdef CONFIG_RCU_NOCB_CPU struct rcu_head *nocb_head; /* CBs waiting for kthread. */ struct rcu_head **nocb_tail; @@ -240,7 +234,7 @@ struct rcu_data { /* Leader CPU takes GP-end wakeups. */ #endif /* #ifdef CONFIG_RCU_NOCB_CPU */ - /* 7) Diagnostic data, including RCU CPU stall warnings. */ + /* 6) Diagnostic data, including RCU CPU stall warnings. */ unsigned int softirq_snap; /* Snapshot of softirq activity. */ /* ->rcu_iw* fields protected by leaf rcu_node ->lock. */ struct irq_work rcu_iw; /* Check for non-irq activity. */ From 894d45bbf7e7569ec2aa845155801fd503b5f1bf Mon Sep 17 00:00:00 2001 From: Mike Galbraith Date: Wed, 15 Aug 2018 09:05:29 -0700 Subject: [PATCH 137/140] rcu: Convert rcu_state.ofl_lock to raw_spinlock_t 1e64b15a4b10 ("rcu: Fix grace-period hangs due to race with CPU offline") added spinlock_t ofl_lock to the rcu_state structure, then takes it with preemption disabled during CPU offline, which gives the -rt patchset's sleeping spinlock heartburn. This commit therefore converts ->ofl_lock to raw_spinlock_t. Signed-off-by: Mike Galbraith Signed-off-by: Paul E. McKenney Cc: Sebastian Andrzej Siewior --- kernel/rcu/tree.c | 12 ++++++------ kernel/rcu/tree.h | 2 +- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 85c2c2dc4c4a..58aa6c2fd7fa 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -97,7 +97,7 @@ struct rcu_state rcu_state = { .abbr = RCU_ABBR, .exp_mutex = __MUTEX_INITIALIZER(rcu_state.exp_mutex), .exp_wake_mutex = __MUTEX_INITIALIZER(rcu_state.exp_wake_mutex), - .ofl_lock = __SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), + .ofl_lock = __RAW_SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), }; /* Dump rcu_node combining tree at boot to verify correct setup. */ @@ -1776,13 +1776,13 @@ static bool rcu_gp_init(void) */ rcu_state.gp_state = RCU_GP_ONOFF; rcu_for_each_leaf_node(rnp) { - spin_lock(&rcu_state.ofl_lock); + raw_spin_lock(&rcu_state.ofl_lock); raw_spin_lock_irq_rcu_node(rnp); if (rnp->qsmaskinit == rnp->qsmaskinitnext && !rnp->wait_blkd_tasks) { /* Nothing to do on this leaf rcu_node structure. */ raw_spin_unlock_irq_rcu_node(rnp); - spin_unlock(&rcu_state.ofl_lock); + raw_spin_unlock(&rcu_state.ofl_lock); continue; } @@ -1818,7 +1818,7 @@ static bool rcu_gp_init(void) } raw_spin_unlock_irq_rcu_node(rnp); - spin_unlock(&rcu_state.ofl_lock); + raw_spin_unlock(&rcu_state.ofl_lock); } rcu_gp_slow(gp_preinit_delay); /* Races with CPU hotplug. */ @@ -3377,7 +3377,7 @@ void rcu_report_dead(unsigned int cpu) /* Remove outgoing CPU from mask in the leaf rcu_node structure. */ mask = rdp->grpmask; - spin_lock(&rcu_state.ofl_lock); + raw_spin_lock(&rcu_state.ofl_lock); raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */ rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq); rdp->rcu_ofl_gp_flags = READ_ONCE(rcu_state.gp_flags); @@ -3388,7 +3388,7 @@ void rcu_report_dead(unsigned int cpu) } rnp->qsmaskinitnext &= ~mask; raw_spin_unlock_irqrestore_rcu_node(rnp, flags); - spin_unlock(&rcu_state.ofl_lock); + raw_spin_unlock(&rcu_state.ofl_lock); per_cpu(rcu_cpu_started, cpu) = 0; } diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index bfbf97a1c29d..703e19ff532d 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -343,7 +343,7 @@ struct rcu_state { const char *name; /* Name of structure. */ char abbr; /* Abbreviated name. */ - spinlock_t ofl_lock ____cacheline_internodealigned_in_smp; + raw_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp; /* Synchronize offline with */ /* GP pre-initialization. */ }; From e0fcba9ac02af5aeb1e1c3e842eab987f817c309 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 14 Aug 2018 08:45:54 -0700 Subject: [PATCH 138/140] srcu: Make call_srcu() available during very early boot Event tracing is moving to SRCU in order to take advantage of the fact that SRCU may be safely used from idle and even offline CPUs. However, event tracing can invoke call_srcu() very early in the boot process, even before workqueue_init_early() is invoked (let alone rcu_init()). Therefore, call_srcu()'s attempts to queue work fail miserably. This commit therefore detects this situation, and refrains from attempting to queue work before rcu_init() time, but does everything else that it would have done, and in addition, adds the srcu_struct to a global list. The rcu_init() function now invokes a new srcu_init() function, which is empty if CONFIG_SRCU=n. Otherwise, srcu_init() queues work for each srcu_struct on the list. This all happens early enough in boot that there is but a single CPU with interrupts disabled, which allows synchronization to be dispensed with. Of course, the queued work won't actually be invoked until after workqueue_init() is invoked, which happens shortly after the scheduler is up and running. This means that although call_srcu() may be invoked any time after per-CPU variables have been set up, there is still a very narrow window when synchronize_srcu() won't work, and this window extends from the time that the scheduler starts until the time that workqueue_init() returns. This can be fixed in a manner similar to the fix for synchronize_rcu_expedited() and friends, but until someone actually needs to use synchronize_srcu() during this window, this fix is added churn for no benefit. Finally, note that Tree SRCU's new srcu_init() function invokes queue_work() rather than the queue_delayed_work() function that is invoked post-boot. The reason is that queue_delayed_work() will (as you would expect) post a timer, and timers have not yet been initialized. So use of queue_work() avoids the complaints about use of uninitialized spinlocks that would otherwise result. Besides, some delay is already provide by the aforementioned fact that the queued work won't actually be invoked until after the scheduler is up and running. Requested-by: Steven Rostedt Signed-off-by: Paul E. McKenney Tested-by: Steven Rostedt (VMware) --- include/linux/srcutiny.h | 2 ++ include/linux/srcutree.h | 14 ++++++++------ kernel/rcu/rcu.h | 6 ++++++ kernel/rcu/srcutiny.c | 29 +++++++++++++++++++++++++++-- kernel/rcu/srcutree.c | 26 ++++++++++++++++++++++++-- kernel/rcu/tiny.c | 1 + kernel/rcu/tree.c | 1 + kernel/rcu/update.c | 9 +++++++++ 8 files changed, 78 insertions(+), 10 deletions(-) diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h index f41d2fb09f87..2b5c0822e683 100644 --- a/include/linux/srcutiny.h +++ b/include/linux/srcutiny.h @@ -36,6 +36,7 @@ struct srcu_struct { struct rcu_head *srcu_cb_head; /* Pending callbacks: Head. */ struct rcu_head **srcu_cb_tail; /* Pending callbacks: Tail. */ struct work_struct srcu_work; /* For driving grace periods. */ + struct list_head srcu_boot_entry; /* Early-boot callbacks. */ #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ @@ -48,6 +49,7 @@ void srcu_drive_gp(struct work_struct *wp); .srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq), \ .srcu_cb_tail = &name.srcu_cb_head, \ .srcu_work = __WORK_INITIALIZER(name.srcu_work, srcu_drive_gp), \ + .srcu_boot_entry = LIST_HEAD_INIT(name.srcu_boot_entry), \ __SRCU_DEP_MAP_INIT(name) \ } diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h index 745d4ca4dd50..9cfa4610113a 100644 --- a/include/linux/srcutree.h +++ b/include/linux/srcutree.h @@ -94,6 +94,7 @@ struct srcu_struct { /* callback for the barrier */ /* operation. */ struct delayed_work work; + struct list_head srcu_boot_entry; /* Early-boot callbacks. */ #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ @@ -105,12 +106,13 @@ struct srcu_struct { #define SRCU_STATE_SCAN2 2 #define __SRCU_STRUCT_INIT(name, pcpu_name) \ - { \ - .sda = &pcpu_name, \ - .lock = __SPIN_LOCK_UNLOCKED(name.lock), \ - .srcu_gp_seq_needed = 0 - 1, \ - __SRCU_DEP_MAP_INIT(name) \ - } +{ \ + .sda = &pcpu_name, \ + .lock = __SPIN_LOCK_UNLOCKED(name.lock), \ + .srcu_gp_seq_needed = -1UL, \ + .srcu_boot_entry = LIST_HEAD_INIT(name.srcu_boot_entry), \ + __SRCU_DEP_MAP_INIT(name) \ +} /* * Define and initialize a srcu struct at build time. diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index 4d04683c31b2..e1b5aec5ec1c 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -435,6 +435,12 @@ do { \ #endif /* #if defined(SRCU) || !defined(TINY_RCU) */ +#ifdef CONFIG_SRCU +void srcu_init(void); +#else /* #ifdef CONFIG_SRCU */ +static inline void srcu_init(void) { } +#endif /* #else #ifdef CONFIG_SRCU */ + #ifdef CONFIG_TINY_RCU /* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */ static inline bool rcu_gp_is_normal(void) { return true; } diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c index 04fc2ed71af8..d233f0c63f6f 100644 --- a/kernel/rcu/srcutiny.c +++ b/kernel/rcu/srcutiny.c @@ -34,6 +34,8 @@ #include "rcu.h" int rcu_scheduler_active __read_mostly; +static LIST_HEAD(srcu_boot_list); +static bool srcu_init_done; static int init_srcu_struct_fields(struct srcu_struct *sp) { @@ -46,6 +48,7 @@ static int init_srcu_struct_fields(struct srcu_struct *sp) sp->srcu_gp_waiting = false; sp->srcu_idx = 0; INIT_WORK(&sp->srcu_work, srcu_drive_gp); + INIT_LIST_HEAD(&sp->srcu_boot_entry); return 0; } @@ -179,8 +182,12 @@ void call_srcu(struct srcu_struct *sp, struct rcu_head *rhp, *sp->srcu_cb_tail = rhp; sp->srcu_cb_tail = &rhp->next; local_irq_restore(flags); - if (!READ_ONCE(sp->srcu_gp_running)) - schedule_work(&sp->srcu_work); + if (!READ_ONCE(sp->srcu_gp_running)) { + if (likely(srcu_init_done)) + schedule_work(&sp->srcu_work); + else if (list_empty(&sp->srcu_boot_entry)) + list_add(&sp->srcu_boot_entry, &srcu_boot_list); + } } EXPORT_SYMBOL_GPL(call_srcu); @@ -204,3 +211,21 @@ void __init rcu_scheduler_starting(void) { rcu_scheduler_active = RCU_SCHEDULER_RUNNING; } + +/* + * Queue work for srcu_struct structures with early boot callbacks. + * The work won't actually execute until the workqueue initialization + * phase that takes place after the scheduler starts. + */ +void __init srcu_init(void) +{ + struct srcu_struct *sp; + + srcu_init_done = true; + while (!list_empty(&srcu_boot_list)) { + sp = list_first_entry(&srcu_boot_list, + struct srcu_struct, srcu_boot_entry); + list_del_init(&sp->srcu_boot_entry); + schedule_work(&sp->srcu_work); + } +} diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 6c9866a854b1..2e7f6b460150 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -51,6 +51,10 @@ module_param(exp_holdoff, ulong, 0444); static ulong counter_wrap_check = (ULONG_MAX >> 2); module_param(counter_wrap_check, ulong, 0444); +/* Early-boot callback-management, so early that no lock is required! */ +static LIST_HEAD(srcu_boot_list); +static bool __read_mostly srcu_init_done; + static void srcu_invoke_callbacks(struct work_struct *work); static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay); static void process_srcu(struct work_struct *work); @@ -182,6 +186,7 @@ static int init_srcu_struct_fields(struct srcu_struct *sp, bool is_static) mutex_init(&sp->srcu_barrier_mutex); atomic_set(&sp->srcu_barrier_cpu_cnt, 0); INIT_DELAYED_WORK(&sp->work, process_srcu); + INIT_LIST_HEAD(&sp->srcu_boot_entry); if (!is_static) sp->sda = alloc_percpu(struct srcu_data); init_srcu_struct_nodes(sp, is_static); @@ -235,7 +240,6 @@ static void check_init_srcu_struct(struct srcu_struct *sp) { unsigned long flags; - WARN_ON_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INIT); /* The smp_load_acquire() pairs with the smp_store_release(). */ if (!rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq_needed))) /*^^^*/ return; /* Already initialized. */ @@ -701,7 +705,11 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp, rcu_seq_state(sp->srcu_gp_seq) == SRCU_STATE_IDLE) { WARN_ON_ONCE(ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed)); srcu_gp_start(sp); - queue_delayed_work(rcu_gp_wq, &sp->work, srcu_get_delay(sp)); + if (likely(srcu_init_done)) + queue_delayed_work(rcu_gp_wq, &sp->work, + srcu_get_delay(sp)); + else if (list_empty(&sp->srcu_boot_entry)) + list_add(&sp->srcu_boot_entry, &srcu_boot_list); } spin_unlock_irqrestore_rcu_node(sp, flags); } @@ -1308,3 +1316,17 @@ static int __init srcu_bootup_announce(void) return 0; } early_initcall(srcu_bootup_announce); + +void __init srcu_init(void) +{ + struct srcu_struct *sp; + + srcu_init_done = true; + while (!list_empty(&srcu_boot_list)) { + sp = list_first_entry(&srcu_boot_list, + struct srcu_struct, srcu_boot_entry); + check_init_srcu_struct(sp); + list_del_init(&sp->srcu_boot_entry); + queue_work(rcu_gp_wq, &sp->work.work); + } +} diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index befc9321a89c..101ed5bb836c 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -236,4 +236,5 @@ void __init rcu_init(void) { open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); rcu_early_boot_tests(); + srcu_init(); } diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0b760c1369f7..43c806291208 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -4164,6 +4164,7 @@ void __init rcu_init(void) WARN_ON(!rcu_gp_wq); rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0); WARN_ON(!rcu_par_gp_wq); + srcu_init(); } #include "tree_exp.h" diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index 39cb23d22109..7d057d0aaec4 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -888,11 +888,16 @@ static void test_callback(struct rcu_head *r) pr_info("RCU test callback executed %d\n", rcu_self_test_counter); } +DEFINE_STATIC_SRCU(early_srcu); + static void early_boot_test_call_rcu(void) { static struct rcu_head head; + static struct rcu_head shead; call_rcu(&head, test_callback); + if (IS_ENABLED(CONFIG_SRCU)) + call_srcu(&early_srcu, &shead, test_callback); } static void early_boot_test_call_rcu_bh(void) @@ -930,6 +935,10 @@ static int rcu_verify_early_boot_tests(void) if (rcu_self_test) { early_boot_test_counter++; rcu_barrier(); + if (IS_ENABLED(CONFIG_SRCU)) { + early_boot_test_counter++; + srcu_barrier(&early_srcu); + } } if (rcu_self_test_bh) { early_boot_test_counter++; From 55cda2290bf9d8510fbe7c1939a36680476c69c4 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 14 Aug 2018 09:19:05 -0700 Subject: [PATCH 139/140] rcutorture: Test early boot call_srcu() Now that SRCU permits call_srcu() to be invoked at early boot, this commit ensures that the rcutorture scripting tests early boot call_srcu(). Signed-off-by: Paul E. McKenney --- tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot | 1 + tools/testing/selftests/rcutorture/configs/rcu/SRCU-u.boot | 1 + tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot | 1 + 3 files changed, 3 insertions(+) diff --git a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot index 84a7d51b7481..ce48c7b82673 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot @@ -1 +1,2 @@ rcutorture.torture_type=srcud +rcupdate.rcu_self_test=1 diff --git a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-u.boot b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-u.boot index 84a7d51b7481..ce48c7b82673 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-u.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-u.boot @@ -1 +1,2 @@ rcutorture.torture_type=srcud +rcupdate.rcu_self_test=1 diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot index c7fd050dfcd9..dfebc82932ca 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot @@ -3,3 +3,4 @@ rcupdate.rcu_self_test_sched=1 rcutree.gp_preinit_delay=3 rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 +rcupdate.rcu_self_test=1 From 4e6ea4ef56f9425cd239ffdb6be45b3aeeb347fd Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 14 Aug 2018 14:41:49 -0700 Subject: [PATCH 140/140] srcu: Make early-boot call_srcu() reuse workqueue lists Allocating a list_head structure that is almost never used, and, when used, is used only during early boot (rcu_init() and earlier), is a bit wasteful. This commit therefore eliminates that list_head in favor of the one in the work_struct structure. This is safe because the work_struct structure cannot be used until after rcu_init() returns. Reported-by: Steven Rostedt Signed-off-by: Paul E. McKenney Cc: Tejun Heo Cc: Lai Jiangshan Tested-by: Steven Rostedt (VMware) --- include/linux/srcutiny.h | 2 -- include/linux/srcutree.h | 3 +-- kernel/rcu/srcutiny.c | 10 +++++----- kernel/rcu/srcutree.c | 11 +++++------ 4 files changed, 11 insertions(+), 15 deletions(-) diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h index 2b5c0822e683..f41d2fb09f87 100644 --- a/include/linux/srcutiny.h +++ b/include/linux/srcutiny.h @@ -36,7 +36,6 @@ struct srcu_struct { struct rcu_head *srcu_cb_head; /* Pending callbacks: Head. */ struct rcu_head **srcu_cb_tail; /* Pending callbacks: Tail. */ struct work_struct srcu_work; /* For driving grace periods. */ - struct list_head srcu_boot_entry; /* Early-boot callbacks. */ #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ @@ -49,7 +48,6 @@ void srcu_drive_gp(struct work_struct *wp); .srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq), \ .srcu_cb_tail = &name.srcu_cb_head, \ .srcu_work = __WORK_INITIALIZER(name.srcu_work, srcu_drive_gp), \ - .srcu_boot_entry = LIST_HEAD_INIT(name.srcu_boot_entry), \ __SRCU_DEP_MAP_INIT(name) \ } diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h index 9cfa4610113a..0ae91b3a7406 100644 --- a/include/linux/srcutree.h +++ b/include/linux/srcutree.h @@ -94,7 +94,6 @@ struct srcu_struct { /* callback for the barrier */ /* operation. */ struct delayed_work work; - struct list_head srcu_boot_entry; /* Early-boot callbacks. */ #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ @@ -110,7 +109,7 @@ struct srcu_struct { .sda = &pcpu_name, \ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \ .srcu_gp_seq_needed = -1UL, \ - .srcu_boot_entry = LIST_HEAD_INIT(name.srcu_boot_entry), \ + .work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \ __SRCU_DEP_MAP_INIT(name) \ } diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c index d233f0c63f6f..b46e6683f8c9 100644 --- a/kernel/rcu/srcutiny.c +++ b/kernel/rcu/srcutiny.c @@ -48,7 +48,7 @@ static int init_srcu_struct_fields(struct srcu_struct *sp) sp->srcu_gp_waiting = false; sp->srcu_idx = 0; INIT_WORK(&sp->srcu_work, srcu_drive_gp); - INIT_LIST_HEAD(&sp->srcu_boot_entry); + INIT_LIST_HEAD(&sp->srcu_work.entry); return 0; } @@ -185,8 +185,8 @@ void call_srcu(struct srcu_struct *sp, struct rcu_head *rhp, if (!READ_ONCE(sp->srcu_gp_running)) { if (likely(srcu_init_done)) schedule_work(&sp->srcu_work); - else if (list_empty(&sp->srcu_boot_entry)) - list_add(&sp->srcu_boot_entry, &srcu_boot_list); + else if (list_empty(&sp->srcu_work.entry)) + list_add(&sp->srcu_work.entry, &srcu_boot_list); } } EXPORT_SYMBOL_GPL(call_srcu); @@ -224,8 +224,8 @@ void __init srcu_init(void) srcu_init_done = true; while (!list_empty(&srcu_boot_list)) { sp = list_first_entry(&srcu_boot_list, - struct srcu_struct, srcu_boot_entry); - list_del_init(&sp->srcu_boot_entry); + struct srcu_struct, srcu_work.entry); + list_del_init(&sp->srcu_work.entry); schedule_work(&sp->srcu_work); } } diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 2e7f6b460150..86c7fd0a1bfe 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -186,7 +186,6 @@ static int init_srcu_struct_fields(struct srcu_struct *sp, bool is_static) mutex_init(&sp->srcu_barrier_mutex); atomic_set(&sp->srcu_barrier_cpu_cnt, 0); INIT_DELAYED_WORK(&sp->work, process_srcu); - INIT_LIST_HEAD(&sp->srcu_boot_entry); if (!is_static) sp->sda = alloc_percpu(struct srcu_data); init_srcu_struct_nodes(sp, is_static); @@ -708,8 +707,8 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp, if (likely(srcu_init_done)) queue_delayed_work(rcu_gp_wq, &sp->work, srcu_get_delay(sp)); - else if (list_empty(&sp->srcu_boot_entry)) - list_add(&sp->srcu_boot_entry, &srcu_boot_list); + else if (list_empty(&sp->work.work.entry)) + list_add(&sp->work.work.entry, &srcu_boot_list); } spin_unlock_irqrestore_rcu_node(sp, flags); } @@ -1323,10 +1322,10 @@ void __init srcu_init(void) srcu_init_done = true; while (!list_empty(&srcu_boot_list)) { - sp = list_first_entry(&srcu_boot_list, - struct srcu_struct, srcu_boot_entry); + sp = list_first_entry(&srcu_boot_list, struct srcu_struct, + work.work.entry); check_init_srcu_struct(sp); - list_del_init(&sp->srcu_boot_entry); + list_del_init(&sp->work.work.entry); queue_work(rcu_gp_wq, &sp->work.work); } }