Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

  "The largest feature of this series is shrinking and simplification, with
   the following diffstat summary:

      79 files changed, 1496 insertions(+), 4211 deletions(-)

   In other words, this series represents a net reduction of more than 2700
   lines of code."

Signed-off-by: Ingo Molnar <mingo@kernel.org>
commit 567b64aaef
@@ -28,8 +28,6 @@ stallwarn.txt
- RCU CPU stall warnings (module parameter rcu_cpu_stall_suppress)
torture.txt
- RCU Torture Test Operation (CONFIG_RCU_TORTURE_TEST)
trace.txt
- CONFIG_RCU_TRACE debugfs files and formats
UP.txt
- RCU on Uniprocessor Systems
whatisRCU.txt
@@ -559,9 +559,7 @@ The <tt>rcu_access_pointer()</tt> on line 6 is similar to
For <tt>remove_gp_synchronous()</tt>, as long as all modifications
to <tt>gp</tt> are carried out while holding <tt>gp_lock</tt>,
the above optimizations are harmless.
However,
with <tt>CONFIG_SPARSE_RCU_POINTER=y</tt>,
<tt>sparse</tt> will complain if you
However, <tt>sparse</tt> will complain if you
define <tt>gp</tt> with <tt>__rcu</tt> and then
access it without using
either <tt>rcu_access_pointer()</tt> or <tt>rcu_dereference()</tt>.
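For concreteness, a minimal sketch of the access pattern being described,
assuming gp points to a structure with an int field a and is guarded by
gp_lock as in the surrounding example (not part of the patch itself):

    struct foo { int a; };
    static struct foo __rcu *gp;        /* __rcu tag enables sparse checking */
    static DEFINE_SPINLOCK(gp_lock);

    static int read_gp_a(void)
    {
        int a = -1;
        struct foo *p;

        rcu_read_lock();
        p = rcu_dereference(gp);        /* marked access: sparse stays quiet */
        if (p)
            a = p->a;
        rcu_read_unlock();
        return a;                       /* "p = gp;" here would draw a warning */
    }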
@@ -1849,7 +1847,8 @@ mass storage, or user patience, whichever comes first.
If the nesting is not visible to the compiler, as is the case with
mutually recursive functions each in its own translation unit,
stack overflow will result.
If the nesting takes the form of loops, either the control variable
If the nesting takes the form of loops, perhaps in the guise of tail
recursion, either the control variable
will overflow or (in the Linux kernel) you will get an RCU CPU stall warning.
Nevertheless, this class of RCU implementations is one
of the most composable constructs in existence.
@@ -1977,9 +1976,8 @@ guard against mishaps and misuse:
and <tt>rcu_dereference()</tt>, perhaps (incorrectly)
substituting a simple assignment.
To catch this sort of error, a given RCU-protected pointer may be
tagged with <tt>__rcu</tt>, after which running sparse
with <tt>CONFIG_SPARSE_RCU_POINTER=y</tt> will complain
about simple-assignment accesses to that pointer.
tagged with <tt>__rcu</tt>, after which sparse
will complain about simple-assignment accesses to that pointer.
Arnd Bergmann made me aware of this requirement, and also
supplied the needed
<a href="https://lwn.net/Articles/376011/">patch series</a>.
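On the update side, the same pointer is published with rcu_assign_pointer();
a plain assignment to an __rcu-tagged pointer is exactly the simple-assignment
access that sparse flags. A minimal sketch, reusing the gp and gp_lock names
assumed in the earlier sketch:

    static void publish_gp(struct foo *newp)
    {
        spin_lock(&gp_lock);
        rcu_assign_pointer(gp, newp);   /* ordered publication, sparse-clean */
        /* "gp = newp;" would be reported as a simple-assignment access */
        spin_unlock(&gp_lock);
    }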
@@ -2036,7 +2034,7 @@ guard against mishaps and misuse:
some other synchronization mechanism, for example, reference
counting.
<li> In kernels built with <tt>CONFIG_RCU_TRACE=y</tt>, RCU-related
information is provided via both debugfs and event tracing.
information is provided via event tracing.
<li> Open-coded use of <tt>rcu_assign_pointer()</tt> and
<tt>rcu_dereference()</tt> to create typical linked
data structures can be surprisingly error-prone.
@@ -2519,11 +2517,7 @@ It is similarly socially unacceptable to interrupt an
<tt>nohz_full</tt> CPU running in userspace.
RCU must therefore track <tt>nohz_full</tt> userspace
execution.
And in
<a href="https://lwn.net/Articles/558284/"><tt>CONFIG_NO_HZ_FULL_SYSIDLE=y</tt></a>
kernels, RCU must separately track idle CPUs on the one hand and
CPUs that are either idle or executing in userspace on the other.
In both cases, RCU must be able to sample state at two points in
RCU must therefore be able to sample state at two points in
time, and be able to determine whether or not some other CPU spent
any time idle and/or executing in userspace.

@@ -2935,6 +2929,20 @@ The reason that this is possible is that SRCU is insensitive
to whether or not a CPU is online, which means that <tt>srcu_barrier()</tt>
need not exclude CPU-hotplug operations.

<p>
SRCU also differs from other RCU flavors in that SRCU's expedited and
non-expedited grace periods are implemented by the same mechanism.
This means that in the current SRCU implementation, expediting a
future grace period has the side effect of expediting all prior
grace periods that have not yet completed.
(But please note that this is a property of the current implementation,
not necessarily of future implementations.)
In addition, if SRCU has been idle for longer than the interval
specified by the <tt>srcutree.exp_holdoff</tt> kernel boot parameter
(25 microseconds by default),
and if a <tt>synchronize_srcu()</tt> invocation ends this idle period,
that invocation will be automatically expedited.

<p>
As of v4.12, SRCU's callbacks are maintained per-CPU, eliminating
a locking bottleneck present in prior kernel versions.
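For reference, a minimal SRCU usage sketch (the srcu_struct name my_srcu and
the reader data are assumptions, not taken from this series): readers may
block, and synchronize_srcu() waits only for readers of the same srcu_struct.

    DEFINE_SRCU(my_srcu);
    static struct foo __rcu *srcu_gp;

    static int srcu_reader(void)
    {
        int idx, a = -1;
        struct foo *p;

        idx = srcu_read_lock(&my_srcu);
        p = srcu_dereference(srcu_gp, &my_srcu);
        if (p)
            a = p->a;
        srcu_read_unlock(&my_srcu, idx);
        return a;
    }

    static void srcu_updater(struct foo *newp)
    {
        struct foo *oldp = rcu_dereference_protected(srcu_gp, 1);

        rcu_assign_pointer(srcu_gp, newp);
        synchronize_srcu(&my_srcu);     /* waits for pre-existing my_srcu readers */
        kfree(oldp);
    }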
@@ -413,11 +413,11 @@ over a rather long period of time, but improvements are always welcome!
read-side critical sections. It is the responsibility of the
RCU update-side primitives to deal with this.

17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
__rcu sparse checks (enabled by CONFIG_SPARSE_RCU_POINTER) to
validate your RCU code. These can help find problems as follows:
17. Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
__rcu sparse checks to validate your RCU code. These can help
find problems as follows:

CONFIG_PROVE_RCU: check that accesses to RCU-protected data
CONFIG_PROVE_LOCKING: check that accesses to RCU-protected data
structures are carried out under the proper RCU
read-side critical section, while holding the right
combination of locks, or whatever other conditions
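A hedged sketch of the kind of access this checking validates (gp and gp_lock
are illustrative names, not from the patch): rcu_dereference_check() lets
lockdep verify that the caller is either in an RCU read-side critical section
or holding the update-side lock.

    /* Legal from a reader or from the update side while holding gp_lock. */
    static struct foo *get_gp_checked(void)
    {
        return rcu_dereference_check(gp, lockdep_is_held(&gp_lock));
    }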
@@ -1,535 +0,0 @@
|
||||
CONFIG_RCU_TRACE debugfs Files and Formats
|
||||
|
||||
|
||||
The rcutree and rcutiny implementations of RCU provide debugfs trace
|
||||
output that summarizes counters and state. This information is useful for
|
||||
debugging RCU itself, and can sometimes also help to debug abuses of RCU.
|
||||
The following sections describe the debugfs files and formats, first
|
||||
for rcutree and next for rcutiny.
|
||||
|
||||
|
||||
CONFIG_TREE_RCU and CONFIG_PREEMPT_RCU debugfs Files and Formats
|
||||
|
||||
These implementations of RCU provide several debugfs directories under the
|
||||
top-level directory "rcu":
|
||||
|
||||
rcu/rcu_bh
|
||||
rcu/rcu_preempt
|
||||
rcu/rcu_sched
|
||||
|
||||
Each directory contains files for the corresponding flavor of RCU.
|
||||
Note that rcu/rcu_preempt is only present for CONFIG_PREEMPT_RCU.
|
||||
For CONFIG_TREE_RCU, the RCU flavor maps onto the RCU-sched flavor,
|
||||
so that activity for both appears in rcu/rcu_sched.
|
||||
|
||||
In addition, the following file appears in the top-level directory:
|
||||
rcu/rcutorture. This file displays rcutorture test progress. The output
|
||||
of "cat rcu/rcutorture" looks as follows:
|
||||
|
||||
rcutorture test sequence: 0 (test in progress)
|
||||
rcutorture update version number: 615
|
||||
|
||||
The first line shows the number of rcutorture tests that have completed
|
||||
since boot. If a test is currently running, the "(test in progress)"
|
||||
string will appear as shown above. The second line shows the number of
|
||||
update cycles that the current test has started, or zero if there is
|
||||
no test in progress.
|
||||
|
||||
|
||||
Within each flavor directory (rcu/rcu_bh, rcu/rcu_sched, and possibly
|
||||
also rcu/rcu_preempt) the following files will be present:
|
||||
|
||||
rcudata:
|
||||
Displays fields in struct rcu_data.
|
||||
rcuexp:
|
||||
Displays statistics for expedited grace periods.
|
||||
rcugp:
|
||||
Displays grace-period counters.
|
||||
rcuhier:
|
||||
Displays the struct rcu_node hierarchy.
|
||||
rcu_pending:
|
||||
Displays counts of the reasons rcu_pending() decided that RCU had
|
||||
work to do.
|
||||
rcuboost:
|
||||
Displays RCU boosting statistics. Only present if
|
||||
CONFIG_RCU_BOOST=y.
|
||||
|
||||
The output of "cat rcu/rcu_preempt/rcudata" looks as follows:
|
||||
|
||||
0!c=30455 g=30456 cnq=1/0:1 dt=126535/140000000000000/0 df=2002 of=4 ql=0/0 qs=N... b=10 ci=74572 nci=0 co=1131 ca=716
|
||||
1!c=30719 g=30720 cnq=1/0:0 dt=132007/140000000000000/0 df=1874 of=10 ql=0/0 qs=N... b=10 ci=123209 nci=0 co=685 ca=982
|
||||
2!c=30150 g=30151 cnq=1/1:1 dt=138537/140000000000000/0 df=1707 of=8 ql=0/0 qs=N... b=10 ci=80132 nci=0 co=1328 ca=1458
|
||||
3 c=31249 g=31250 cnq=1/1:0 dt=107255/140000000000000/0 df=1749 of=6 ql=0/450 qs=NRW. b=10 ci=151700 nci=0 co=509 ca=622
|
||||
4!c=29502 g=29503 cnq=1/0:1 dt=83647/140000000000000/0 df=965 of=5 ql=0/0 qs=N... b=10 ci=65643 nci=0 co=1373 ca=1521
|
||||
5 c=31201 g=31202 cnq=1/0:1 dt=70422/0/0 df=535 of=7 ql=0/0 qs=.... b=10 ci=58500 nci=0 co=764 ca=698
|
||||
6!c=30253 g=30254 cnq=1/0:1 dt=95363/140000000000000/0 df=780 of=5 ql=0/0 qs=N... b=10 ci=100607 nci=0 co=1414 ca=1353
|
||||
7 c=31178 g=31178 cnq=1/0:0 dt=91536/0/0 df=547 of=4 ql=0/0 qs=.... b=10 ci=109819 nci=0 co=1115 ca=969
|
||||
|
||||
This file has one line per CPU, or eight for this 8-CPU system.
|
||||
The fields are as follows:
|
||||
|
||||
o The number at the beginning of each line is the CPU number.
|
||||
CPU numbers followed by an exclamation mark are offline,
|
||||
but have been online at least once since boot. There will be
|
||||
no output for CPUs that have never been online, which can be
|
||||
a good thing in the surprisingly common case where NR_CPUS is
|
||||
substantially larger than the number of actual CPUs.
|
||||
|
||||
o "c" is the count of grace periods that this CPU believes have
|
||||
completed. Offlined CPUs and CPUs in dynticks idle mode may lag
|
||||
quite a ways behind, for example, CPU 4 under "rcu_sched" above,
|
||||
which has been offline through 16 RCU grace periods. It is not
|
||||
unusual to see offline CPUs lagging by thousands of grace periods.
|
||||
Note that although the grace-period number is an unsigned long,
|
||||
it is printed out as a signed long to allow more human-friendly
|
||||
representation near boot time.
|
||||
|
||||
o "g" is the count of grace periods that this CPU believes have
|
||||
started. Again, offlined CPUs and CPUs in dynticks idle mode
|
||||
may lag behind. If the "c" and "g" values are equal, this CPU
|
||||
has already reported a quiescent state for the last RCU grace
|
||||
period that it is aware of, otherwise, the CPU believes that it
|
||||
owes RCU a quiescent state.
|
||||
|
||||
o "pq" indicates that this CPU has passed through a quiescent state
|
||||
for the current grace period. It is possible for "pq" to be
|
||||
"1" and "c" different than "g", which indicates that although
|
||||
the CPU has passed through a quiescent state, either (1) this
|
||||
CPU has not yet reported that fact, (2) some other CPU has not
|
||||
yet reported for this grace period, or (3) both.
|
||||
|
||||
o "qp" indicates that RCU still expects a quiescent state from
|
||||
this CPU. Offlined CPUs and CPUs in dyntick idle mode might
|
||||
well have qp=1, which is OK: RCU is still ignoring them.
|
||||
|
||||
o "dt" is the current value of the dyntick counter that is incremented
|
||||
when entering or leaving idle, either due to a context switch or
|
||||
due to an interrupt. This number is even if the CPU is in idle
|
||||
from RCU's viewpoint and odd otherwise. The number after the
|
||||
first "/" is the interrupt nesting depth when in idle state,
|
||||
or a large number added to the interrupt-nesting depth when
|
||||
running a non-idle task. Some architectures do not accurately
|
||||
count interrupt nesting when running in non-idle kernel context,
|
||||
which can result in interesting anomalies such as negative
|
||||
interrupt-nesting levels. The number after the second "/"
|
||||
is the NMI nesting depth.
|
||||
|
||||
o "df" is the number of times that some other CPU has forced a
|
||||
quiescent state on behalf of this CPU due to this CPU being in
|
||||
idle state.
|
||||
|
||||
o "of" is the number of times that some other CPU has forced a
|
||||
quiescent state on behalf of this CPU due to this CPU being
|
||||
offline. In a perfect world, this might never happen, but it
|
||||
turns out that offlining and onlining a CPU can take several grace
|
||||
periods, and so there is likely to be an extended period of time
|
||||
when RCU believes that the CPU is online when it really is not.
|
||||
Please note that erring in the other direction (RCU believing a
|
||||
CPU is offline when it is really alive and kicking) is a fatal
|
||||
error, so it makes sense to err conservatively.
|
||||
|
||||
o "ql" is the number of RCU callbacks currently residing on
|
||||
this CPU. The first number is the number of "lazy" callbacks
|
||||
that are known to RCU to only be freeing memory, and the number
|
||||
after the "/" is the total number of callbacks, lazy or not.
|
||||
These counters count callbacks regardless of what phase of
|
||||
grace-period processing that they are in (new, waiting for
|
||||
grace period to start, waiting for grace period to end, ready
|
||||
to invoke).
|
||||
|
||||
o "qs" gives an indication of the state of the callback queue
|
||||
with four characters:
|
||||
|
||||
"N" Indicates that there are callbacks queued that are not
|
||||
ready to be handled by the next grace period, and thus
|
||||
will be handled by the grace period following the next
|
||||
one.
|
||||
|
||||
"R" Indicates that there are callbacks queued that are
|
||||
ready to be handled by the next grace period.
|
||||
|
||||
"W" Indicates that there are callbacks queued that are
|
||||
waiting on the current grace period.
|
||||
|
||||
"D" Indicates that there are callbacks queued that have
|
||||
already been handled by a prior grace period, and are
|
||||
thus waiting to be invoked. Note that callbacks in
|
||||
the process of being invoked are not counted here.
|
||||
Callbacks in the process of being invoked are those
|
||||
that have been removed from the rcu_data structures
|
||||
queues by rcu_do_batch(), but which have not yet been
|
||||
invoked.
|
||||
|
||||
If there are no callbacks in a given one of the above states,
|
||||
the corresponding character is replaced by ".".
|
||||
|
||||
o "b" is the batch limit for this CPU. If more than this number
|
||||
of RCU callbacks is ready to invoke, then the remainder will
|
||||
be deferred.
|
||||
|
||||
o "ci" is the number of RCU callbacks that have been invoked for
|
||||
this CPU. Note that ci+nci+ql is the number of callbacks that have
|
||||
been registered in absence of CPU-hotplug activity.
|
||||
|
||||
o "nci" is the number of RCU callbacks that have been offloaded from
|
||||
this CPU. This will always be zero unless the kernel was built
|
||||
with CONFIG_RCU_NOCB_CPU=y and the "rcu_nocbs=" kernel boot
|
||||
parameter was specified.
|
||||
|
||||
o "co" is the number of RCU callbacks that have been orphaned due to
|
||||
this CPU going offline. These orphaned callbacks have been moved
|
||||
to an arbitrarily chosen online CPU.
|
||||
|
||||
o "ca" is the number of RCU callbacks that have been adopted by this
|
||||
CPU due to other CPUs going offline. Note that ci+co-ca+ql is
|
||||
the number of RCU callbacks registered on this CPU.
|
||||
|
||||
|
||||
Kernels compiled with CONFIG_RCU_BOOST=y display the following from
|
||||
/debug/rcu/rcu_preempt/rcudata:
|
||||
|
||||
0!c=12865 g=12866 cnq=1/0:1 dt=83113/140000000000000/0 df=288 of=11 ql=0/0 qs=N... kt=0/O ktl=944 b=10 ci=60709 nci=0 co=748 ca=871
|
||||
1 c=14407 g=14408 cnq=1/0:0 dt=100679/140000000000000/0 df=378 of=7 ql=0/119 qs=NRW. kt=0/W ktl=9b6 b=10 ci=109740 nci=0 co=589 ca=485
|
||||
2 c=14407 g=14408 cnq=1/0:0 dt=105486/0/0 df=90 of=9 ql=0/89 qs=NRW. kt=0/W ktl=c0c b=10 ci=83113 nci=0 co=533 ca=490
|
||||
3 c=14407 g=14408 cnq=1/0:0 dt=107138/0/0 df=142 of=8 ql=0/188 qs=NRW. kt=0/W ktl=b96 b=10 ci=121114 nci=0 co=426 ca=290
|
||||
4 c=14405 g=14406 cnq=1/0:1 dt=50238/0/0 df=706 of=7 ql=0/0 qs=.... kt=0/W ktl=812 b=10 ci=34929 nci=0 co=643 ca=114
|
||||
5!c=14168 g=14169 cnq=1/0:0 dt=45465/140000000000000/0 df=161 of=11 ql=0/0 qs=N... kt=0/O ktl=b4d b=10 ci=47712 nci=0 co=677 ca=722
|
||||
6 c=14404 g=14405 cnq=1/0:0 dt=59454/0/0 df=94 of=6 ql=0/0 qs=.... kt=0/W ktl=e57 b=10 ci=55597 nci=0 co=701 ca=811
|
||||
7 c=14407 g=14408 cnq=1/0:1 dt=68850/0/0 df=31 of=8 ql=0/0 qs=.... kt=0/W ktl=14bd b=10 ci=77475 nci=0 co=508 ca=1042
|
||||
|
||||
This is similar to the output discussed above, but contains the following
|
||||
additional fields:
|
||||
|
||||
o "kt" is the per-CPU kernel-thread state. The digit preceding
|
||||
the first slash is zero if there is no work pending and 1
|
||||
otherwise. The character between the first pair of slashes is
|
||||
as follows:
|
||||
|
||||
"S" The kernel thread is stopped, in other words, all
|
||||
CPUs corresponding to this rcu_node structure are
|
||||
offline.
|
||||
|
||||
"R" The kernel thread is running.
|
||||
|
||||
"W" The kernel thread is waiting because there is no work
|
||||
for it to do.
|
||||
|
||||
"O" The kernel thread is waiting because it has been
|
||||
forced off of its designated CPU or because its
|
||||
->cpus_allowed mask permits it to run on other than
|
||||
its designated CPU.
|
||||
|
||||
"Y" The kernel thread is yielding to avoid hogging CPU.
|
||||
|
||||
"?" Unknown value, indicates a bug.
|
||||
|
||||
The number after the final slash is the CPU that the kthread
|
||||
is actually running on.
|
||||
|
||||
This field is displayed only for CONFIG_RCU_BOOST kernels.
|
||||
|
||||
o "ktl" is the low-order 16 bits (in hexadecimal) of the count of
|
||||
the number of times that this CPU's per-CPU kthread has gone
|
||||
through its loop servicing invoke_rcu_cpu_kthread() requests.
|
||||
|
||||
This field is displayed only for CONFIG_RCU_BOOST kernels.
|
||||
|
||||
|
||||
The output of "cat rcu/rcu_preempt/rcuexp" looks as follows:
|
||||
|
||||
s=21872 wd1=0 wd2=0 wd3=5 enq=0 sc=21872
|
||||
|
||||
These fields are as follows:
|
||||
|
||||
o "s" is the sequence number, with an odd number indicating that
|
||||
an expedited grace period is in progress.
|
||||
|
||||
o "wd1", "wd2", and "wd3" are the number of times that an attempt
|
||||
to start an expedited grace period found that someone else had
|
||||
completed an expedited grace period that satisfies the attempted
|
||||
request. "Our work is done."
|
||||
|
||||
o "enq" is the number of quiescent states still outstanding.
|
||||
|
||||
o "sc" is the number of times that the attempt to start a
|
||||
new expedited grace period succeeded.
|
||||
|
||||
|
||||
The output of "cat rcu/rcu_preempt/rcugp" looks as follows:
|
||||
|
||||
completed=31249 gpnum=31250 age=1 max=18
|
||||
|
||||
These fields are taken from the rcu_state structure, and are as follows:
|
||||
|
||||
o "completed" is the number of grace periods that have completed.
|
||||
It is comparable to the "c" field from rcu/rcudata in that a
|
||||
CPU whose "c" field matches the value of "completed" is aware
|
||||
that the corresponding RCU grace period has completed.
|
||||
|
||||
o "gpnum" is the number of grace periods that have started. It is
|
||||
similarly comparable to the "g" field from rcu/rcudata in that
|
||||
a CPU whose "g" field matches the value of "gpnum" is aware that
|
||||
the corresponding RCU grace period has started.
|
||||
|
||||
If these two fields are equal, then there is no grace period
|
||||
in progress, in other words, RCU is idle. On the other hand,
|
||||
if the two fields differ (as they are above), then an RCU grace
|
||||
period is in progress.
|
||||
|
||||
o "age" is the number of jiffies that the current grace period
|
||||
has extended for, or zero if there is no grace period currently
|
||||
in effect.
|
||||
|
||||
o "max" is the age in jiffies of the longest-duration grace period
|
||||
thus far.
|
||||
|
||||
The output of "cat rcu/rcu_preempt/rcuhier" looks as follows:
|
||||
|
||||
c=14407 g=14408 s=0 jfq=2 j=c863 nfqs=12040/nfqsng=0(12040) fqlh=1051 oqlen=0/0
|
||||
3/3 ..>. 0:7 ^0
|
||||
e/e ..>. 0:3 ^0 d/d ..>. 4:7 ^1
|
||||
|
||||
The fields are as follows:
|
||||
|
||||
o "c" is exactly the same as "completed" under rcu/rcu_preempt/rcugp.
|
||||
|
||||
o "g" is exactly the same as "gpnum" under rcu/rcu_preempt/rcugp.
|
||||
|
||||
o "s" is the current state of the force_quiescent_state()
|
||||
state machine.
|
||||
|
||||
o "jfq" is the number of jiffies remaining for this grace period
|
||||
before force_quiescent_state() is invoked to help push things
|
||||
along. Note that CPUs in idle mode throughout the grace period
|
||||
will not report on their own, but rather must be checked by some
|
||||
other CPU via force_quiescent_state().
|
||||
|
||||
o "j" is the low-order four hex digits of the jiffies counter.
|
||||
Yes, Paul did run into a number of problems that turned out to
|
||||
be due to the jiffies counter no longer counting. Why do you ask?
|
||||
|
||||
o "nfqs" is the number of calls to force_quiescent_state() since
|
||||
boot.
|
||||
|
||||
o "nfqsng" is the number of useless calls to force_quiescent_state(),
|
||||
where there wasn't actually a grace period active. This can
|
||||
no longer happen due to grace-period processing being pushed
|
||||
into a kthread. The number in parentheses is the difference
|
||||
between "nfqs" and "nfqsng", or the number of times that
|
||||
force_quiescent_state() actually did some real work.
|
||||
|
||||
o "fqlh" is the number of calls to force_quiescent_state() that
|
||||
exited immediately (without even being counted in nfqs above)
|
||||
due to contention on ->fqslock.
|
||||
|
||||
o Each element of the form "3/3 ..>. 0:7 ^0" represents one rcu_node
|
||||
structure. Each line represents one level of the hierarchy,
|
||||
from root to leaves. It is best to think of the rcu_data
|
||||
structures as forming yet another level after the leaves.
|
||||
Note that there might be either one, two, three, or even four
|
||||
levels of rcu_node structures, depending on the relationship
|
||||
between CONFIG_RCU_FANOUT, CONFIG_RCU_FANOUT_LEAF (possibly
|
||||
adjusted using the rcu_fanout_leaf kernel boot parameter), and
|
||||
CONFIG_NR_CPUS (possibly adjusted using the nr_cpu_ids count of
|
||||
possible CPUs for the booting hardware).
|
||||
|
||||
o The numbers separated by the "/" are the qsmask followed
|
||||
by the qsmaskinit. The qsmask will have one bit
|
||||
set for each entity in the next lower level that has
|
||||
not yet checked in for the current grace period ("e"
|
||||
indicating CPUs 5, 6, and 7 in the example above).
|
||||
The qsmaskinit will have one bit for each entity that is
|
||||
currently expected to check in during each grace period.
|
||||
The value of qsmaskinit is assigned to that of qsmask
|
||||
at the beginning of each grace period.
|
||||
|
||||
o The characters separated by the ">" indicate the state
|
||||
of the blocked-tasks lists. A "G" preceding the ">"
|
||||
indicates that at least one task blocked in an RCU
|
||||
read-side critical section blocks the current grace
|
||||
period, while a "E" preceding the ">" indicates that
|
||||
at least one task blocked in an RCU read-side critical
|
||||
section blocks the current expedited grace period.
|
||||
A "T" character following the ">" indicates that at
|
||||
least one task is blocked within an RCU read-side
|
||||
critical section, regardless of whether any current
|
||||
grace period (expedited or normal) is inconvenienced.
|
||||
A "." character appears if the corresponding condition
|
||||
does not hold, so that "..>." indicates that no tasks
|
||||
are blocked. In contrast, "GE>T" indicates maximal
|
||||
inconvenience from blocked tasks. CONFIG_TREE_RCU
|
||||
builds of the kernel will always show "..>.".
|
||||
|
||||
o The numbers separated by the ":" are the range of CPUs
|
||||
served by this struct rcu_node. This can be helpful
|
||||
in working out how the hierarchy is wired together.
|
||||
|
||||
For example, the example rcu_node structure shown above
|
||||
has "0:7", indicating that it covers CPUs 0 through 7.
|
||||
|
||||
o The number after the "^" indicates the bit in the
|
||||
next higher level rcu_node structure that this rcu_node
|
||||
structure corresponds to. For example, the "d/d ..>. 4:7
|
||||
^1" has a "1" in this position, indicating that it
|
||||
corresponds to the "1" bit in the "3" shown in the
|
||||
"3/3 ..>. 0:7 ^0" entry on the next level up.
|
||||
|
||||
|
||||
The output of "cat rcu/rcu_sched/rcu_pending" looks as follows:
|
||||
|
||||
0!np=26111 qsp=29 rpq=5386 cbr=1 cng=570 gpc=3674 gps=577 nn=15903 ndw=0
|
||||
1!np=28913 qsp=35 rpq=6097 cbr=1 cng=448 gpc=3700 gps=554 nn=18113 ndw=0
|
||||
2!np=32740 qsp=37 rpq=6202 cbr=0 cng=476 gpc=4627 gps=546 nn=20889 ndw=0
|
||||
3 np=23679 qsp=22 rpq=5044 cbr=1 cng=415 gpc=3403 gps=347 nn=14469 ndw=0
|
||||
4!np=30714 qsp=4 rpq=5574 cbr=0 cng=528 gpc=3931 gps=639 nn=20042 ndw=0
|
||||
5 np=28910 qsp=2 rpq=5246 cbr=0 cng=428 gpc=4105 gps=709 nn=18422 ndw=0
|
||||
6!np=38648 qsp=5 rpq=7076 cbr=0 cng=840 gpc=4072 gps=961 nn=25699 ndw=0
|
||||
7 np=37275 qsp=2 rpq=6873 cbr=0 cng=868 gpc=3416 gps=971 nn=25147 ndw=0
|
||||
|
||||
The fields are as follows:
|
||||
|
||||
o The leading number is the CPU number, with "!" indicating
|
||||
an offline CPU.
|
||||
|
||||
o "np" is the number of times that __rcu_pending() has been invoked
|
||||
for the corresponding flavor of RCU.
|
||||
|
||||
o "qsp" is the number of times that the RCU was waiting for a
|
||||
quiescent state from this CPU.
|
||||
|
||||
o "rpq" is the number of times that the CPU had passed through
|
||||
a quiescent state, but not yet reported it to RCU.
|
||||
|
||||
o "cbr" is the number of times that this CPU had RCU callbacks
|
||||
that had passed through a grace period, and were thus ready
|
||||
to be invoked.
|
||||
|
||||
o "cng" is the number of times that this CPU needed another
|
||||
grace period while RCU was idle.
|
||||
|
||||
o "gpc" is the number of times that an old grace period had
|
||||
completed, but this CPU was not yet aware of it.
|
||||
|
||||
o "gps" is the number of times that a new grace period had started,
|
||||
but this CPU was not yet aware of it.
|
||||
|
||||
o "ndw" is the number of times that a wakeup of an rcuo
|
||||
callback-offload kthread had to be deferred in order to avoid
|
||||
deadlock.
|
||||
|
||||
o "nn" is the number of times that this CPU needed nothing.
|
||||
|
||||
|
||||
The output of "cat rcu/rcuboost" looks as follows:
|
||||
|
||||
0:3 tasks=.... kt=W ntb=0 neb=0 nnb=0 j=c864 bt=c894
|
||||
balk: nt=0 egt=4695 bt=0 nb=0 ny=56 nos=0
|
||||
4:7 tasks=.... kt=W ntb=0 neb=0 nnb=0 j=c864 bt=c894
|
||||
balk: nt=0 egt=6541 bt=0 nb=0 ny=126 nos=0
|
||||
|
||||
This information is output only for rcu_preempt. Each two-line entry
|
||||
corresponds to a leaf rcu_node structure. The fields are as follows:
|
||||
|
||||
o "n:m" is the CPU-number range for the corresponding two-line
|
||||
entry. In the sample output above, the first entry covers
|
||||
CPUs zero through three and the second entry covers CPUs four
|
||||
through seven.
|
||||
|
||||
o "tasks=TNEB" gives the state of the various segments of the
|
||||
rnp->blocked_tasks list:
|
||||
|
||||
"T" This indicates that there are some tasks that blocked
|
||||
while running on one of the corresponding CPUs while
|
||||
in an RCU read-side critical section.
|
||||
|
||||
"N" This indicates that some of the blocked tasks are preventing
|
||||
the current normal (non-expedited) grace period from
|
||||
completing.
|
||||
|
||||
"E" This indicates that some of the blocked tasks are preventing
|
||||
the current expedited grace period from completing.
|
||||
|
||||
"B" This indicates that some of the blocked tasks are in
|
||||
need of RCU priority boosting.
|
||||
|
||||
Each character is replaced with "." if the corresponding
|
||||
condition does not hold.
|
||||
|
||||
o "kt" is the state of the RCU priority-boosting kernel
|
||||
thread associated with the corresponding rcu_node structure.
|
||||
The state can be one of the following:
|
||||
|
||||
"S" The kernel thread is stopped, in other words, all
|
||||
CPUs corresponding to this rcu_node structure are
|
||||
offline.
|
||||
|
||||
"R" The kernel thread is running.
|
||||
|
||||
"W" The kernel thread is waiting because there is no work
|
||||
for it to do.
|
||||
|
||||
"Y" The kernel thread is yielding to avoid hogging CPU.
|
||||
|
||||
"?" Unknown value, indicates a bug.
|
||||
|
||||
o "ntb" is the number of tasks boosted.
|
||||
|
||||
o "neb" is the number of tasks boosted in order to complete an
|
||||
expedited grace period.
|
||||
|
||||
o "nnb" is the number of tasks boosted in order to complete a
|
||||
normal (non-expedited) grace period. When boosting a task
|
||||
that was blocking both an expedited and a normal grace period,
|
||||
it is counted against the expedited total above.
|
||||
|
||||
o "j" is the low-order 16 bits of the jiffies counter in
|
||||
hexadecimal.
|
||||
|
||||
o "bt" is the low-order 16 bits of the value that the jiffies
|
||||
counter will have when we next start boosting, assuming that
|
||||
the current grace period does not end beforehand. This is
|
||||
also in hexadecimal.
|
||||
|
||||
o "balk: nt" counts the number of times we didn't boost (in
|
||||
other words, we balked) even though it was time to boost because
|
||||
there were no blocked tasks to boost. This situation occurs
|
||||
when there is one blocked task on one rcu_node structure and
|
||||
none on some other rcu_node structure.
|
||||
|
||||
o "egt" counts the number of times we balked because although
|
||||
there were blocked tasks, none of them were blocking the
|
||||
current grace period, whether expedited or otherwise.
|
||||
|
||||
o "bt" counts the number of times we balked because boosting
|
||||
had already been initiated for the current grace period.
|
||||
|
||||
o "nb" counts the number of times we balked because there
|
||||
was at least one task blocking the current non-expedited grace
|
||||
period that never had blocked. If it is already running, it
|
||||
just won't help to boost its priority!
|
||||
|
||||
o "ny" counts the number of times we balked because it was
|
||||
not yet time to start boosting.
|
||||
|
||||
o "nos" counts the number of times we balked for other
|
||||
reasons, e.g., the grace period ended first.
|
||||
|
||||
|
||||
CONFIG_TINY_RCU debugfs Files and Formats
|
||||
|
||||
These implementations of RCU provide a single debugfs file under the
|
||||
top-level directory RCU, namely rcu/rcudata, which displays fields in
|
||||
rcu_bh_ctrlblk and rcu_sched_ctrlblk.
|
||||
|
||||
The output of "cat rcu/rcudata" is as follows:
|
||||
|
||||
rcu_sched: qlen: 0
|
||||
rcu_bh: qlen: 0
|
||||
|
||||
This is split into rcu_sched and rcu_bh sections. The field is as
|
||||
follows:
|
||||
|
||||
o "qlen" is the number of RCU callbacks currently waiting either
|
||||
for an RCU grace period or waiting to be invoked. This is the
|
||||
only field present for rcu_sched and rcu_bh, due to the
|
||||
short-circuiting of grace period in those two cases.
|
@@ -3238,21 +3238,17 @@
|
||||
|
||||
rcutree.gp_cleanup_delay= [KNL]
|
||||
Set the number of jiffies to delay each step of
|
||||
RCU grace-period cleanup. This only has effect
|
||||
when CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP is set.
|
||||
RCU grace-period cleanup.
|
||||
|
||||
rcutree.gp_init_delay= [KNL]
|
||||
Set the number of jiffies to delay each step of
|
||||
RCU grace-period initialization. This only has
|
||||
effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT
|
||||
is set.
|
||||
RCU grace-period initialization.
|
||||
|
||||
rcutree.gp_preinit_delay= [KNL]
|
||||
Set the number of jiffies to delay each step of
|
||||
RCU grace-period pre-initialization, that is,
|
||||
the propagation of recent CPU-hotplug changes up
|
||||
the rcu_node combining tree. This only has effect
|
||||
when CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT is set.
|
||||
the rcu_node combining tree.
|
||||
|
||||
rcutree.rcu_fanout_exact= [KNL]
|
||||
Disable autobalancing of the rcu_node combining
|
||||
@@ -3328,6 +3324,17 @@
|
||||
This wake_up() will be accompanied by a
|
||||
WARN_ONCE() splat and an ftrace_dump().
|
||||
|
||||
rcuperf.gp_async= [KNL]
|
||||
Measure performance of asynchronous
|
||||
grace-period primitives such as call_rcu().
|
||||
|
||||
rcuperf.gp_async_max= [KNL]
|
||||
Specify the maximum number of outstanding
|
||||
callbacks per writer thread. When a writer
|
||||
thread exceeds this limit, it invokes the
|
||||
corresponding flavor of rcu_barrier() to allow
|
||||
previously posted callbacks to drain.
|
||||
|
||||
rcuperf.gp_exp= [KNL]
|
||||
Measure performance of expedited synchronous
|
||||
grace-period primitives.
|
||||
@@ -3355,17 +3362,22 @@
|
||||
rcuperf.perf_runnable= [BOOT]
|
||||
Start rcuperf running at boot time.
|
||||
|
||||
rcuperf.perf_type= [KNL]
|
||||
Specify the RCU implementation to test.
|
||||
|
||||
rcuperf.shutdown= [KNL]
|
||||
Shut the system down after performance tests
|
||||
complete. This is useful for hands-off automated
|
||||
testing.
|
||||
|
||||
rcuperf.perf_type= [KNL]
|
||||
Specify the RCU implementation to test.
|
||||
|
||||
rcuperf.verbose= [KNL]
|
||||
Enable additional printk() statements.
|
||||
|
||||
rcuperf.writer_holdoff= [KNL]
|
||||
Write-side holdoff between grace periods,
|
||||
in microseconds. The default of zero says
|
||||
no holdoff.
|
||||
|
||||
rcutorture.cbflood_inter_holdoff= [KNL]
|
||||
Set holdoff time (jiffies) between successive
|
||||
callback-flood tests.
|
||||
@@ -3803,6 +3815,15 @@
|
||||
spia_pedr=
|
||||
spia_peddr=
|
||||
|
||||
srcutree.counter_wrap_check [KNL]
|
||||
Specifies how frequently to check for
|
||||
grace-period sequence counter wrap for the
|
||||
srcu_data structure's ->srcu_gp_seq_needed field.
|
||||
The greater the number of bits set in this kernel
|
||||
parameter, the less frequently counter wrap will
|
||||
be checked for. Note that the bottom two bits
|
||||
are ignored.
|
||||
|
||||
srcutree.exp_holdoff [KNL]
|
||||
Specifies how many nanoseconds must elapse
|
||||
since the end of the last SRCU grace period for
|
||||
|
@@ -303,6 +303,11 @@ defined which accomplish this::
|
||||
void smp_mb__before_atomic(void);
|
||||
void smp_mb__after_atomic(void);
|
||||
|
||||
Preceding a non-value-returning read-modify-write atomic operation with
|
||||
smp_mb__before_atomic() and following it with smp_mb__after_atomic()
|
||||
provides the same full ordering that is provided by value-returning
|
||||
read-modify-write atomic operations.
|
||||
|
||||
For example, smp_mb__before_atomic() can be used like so::
|
||||
|
||||
obj->dead = 1;
|
||||
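A sketch of how such a sequence typically continues (obj is assumed to have an
int dead flag and an atomic_t ref_count): the barrier orders the plain store
before the non-value-returning atomic decrement.

    obj->dead = 1;
    smp_mb__before_atomic();        /* order the store above before the RMW below */
    atomic_dec(&obj->ref_count);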
|
@@ -103,9 +103,3 @@ have already built it.
|
||||
|
||||
The optional make variable CF can be used to pass arguments to sparse. The
|
||||
build system passes -Wbitwise to sparse automatically.
|
||||
|
||||
Checking RCU annotations
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
RCU annotations are not checked by default. To enable RCU annotation
|
||||
checks, include -DCONFIG_SPARSE_RCU_POINTER in your CF flags.
|
||||
|
@@ -109,13 +109,12 @@ SCHED_SOFTIRQ: Do all of the following:
|
||||
on that CPU. If a thread that expects to run on the de-jittered
|
||||
CPU awakens, the scheduler will send an IPI that can result in
|
||||
a subsequent SCHED_SOFTIRQ.
|
||||
2. Build with CONFIG_RCU_NOCB_CPU=y, CONFIG_RCU_NOCB_CPU_ALL=y,
|
||||
CONFIG_NO_HZ_FULL=y, and, in addition, ensure that the CPU
|
||||
to be de-jittered is marked as an adaptive-ticks CPU using the
|
||||
"nohz_full=" boot parameter. This reduces the number of
|
||||
scheduler-clock interrupts that the de-jittered CPU receives,
|
||||
minimizing its chances of being selected to do the load balancing
|
||||
work that runs in SCHED_SOFTIRQ context.
|
||||
2. CONFIG_NO_HZ_FULL=y and ensure that the CPU to be de-jittered
|
||||
is marked as an adaptive-ticks CPU using the "nohz_full="
|
||||
boot parameter. This reduces the number of scheduler-clock
|
||||
interrupts that the de-jittered CPU receives, minimizing its
|
||||
chances of being selected to do the load balancing work that
|
||||
runs in SCHED_SOFTIRQ context.
|
||||
3. To the extent possible, keep the CPU out of the kernel when it
|
||||
is non-idle, for example, by avoiding system calls and by
|
||||
forcing both kernel threads and interrupts to execute elsewhere.
|
||||
@@ -135,11 +134,10 @@ HRTIMER_SOFTIRQ: Do all of the following:
|
||||
RCU_SOFTIRQ: Do at least one of the following:
|
||||
1. Offload callbacks and keep the CPU in either dyntick-idle or
|
||||
adaptive-ticks state by doing all of the following:
|
||||
a. Build with CONFIG_RCU_NOCB_CPU=y, CONFIG_RCU_NOCB_CPU_ALL=y,
|
||||
CONFIG_NO_HZ_FULL=y, and, in addition ensure that the CPU
|
||||
to be de-jittered is marked as an adaptive-ticks CPU using
|
||||
the "nohz_full=" boot parameter. Bind the rcuo kthreads
|
||||
to housekeeping CPUs, which can tolerate OS jitter.
|
||||
a. CONFIG_NO_HZ_FULL=y and ensure that the CPU to be
|
||||
de-jittered is marked as an adaptive-ticks CPU using the
|
||||
"nohz_full=" boot parameter. Bind the rcuo kthreads to
|
||||
housekeeping CPUs, which can tolerate OS jitter.
|
||||
b. To the extent possible, keep the CPU out of the kernel
|
||||
when it is non-idle, for example, by avoiding system
|
||||
calls and by forcing both kernel threads and interrupts
|
||||
@@ -236,11 +234,10 @@ To reduce its OS jitter, do at least one of the following:
|
||||
is feasible only if your workload never requires RCU priority
|
||||
boosting, for example, if you ensure frequent idle time on all
|
||||
CPUs that might execute within the kernel.
|
||||
3. Build with CONFIG_RCU_NOCB_CPU=y and CONFIG_RCU_NOCB_CPU_ALL=y,
|
||||
which offloads all RCU callbacks to kthreads that can be moved
|
||||
off of CPUs susceptible to OS jitter. This approach prevents the
|
||||
rcuc/%u kthreads from having any work to do, so that they are
|
||||
never awakened.
|
||||
3. Build with CONFIG_RCU_NOCB_CPU=y and boot with the rcu_nocbs=
|
||||
boot parameter offloading RCU callbacks from all CPUs susceptible
|
||||
to OS jitter. This approach prevents the rcuc/%u kthreads from
|
||||
having any work to do, so that they are never awakened.
|
||||
4. Ensure that the CPU never enters the kernel, and, in particular,
|
||||
avoid initiating any CPU hotplug operations on this CPU. This is
|
||||
another way of preventing any callbacks from being queued on the
|
||||
|
@@ -27,7 +27,7 @@ The purpose of this document is twofold:
|
||||
(2) to provide a guide as to how to use the barriers that are available.
|
||||
|
||||
Note that an architecture can provide more than the minimum requirement
|
||||
for any particular barrier, but if the architecure provides less than
|
||||
for any particular barrier, but if the architecture provides less than
|
||||
that, that architecture is incorrect.
|
||||
|
||||
Note also that it is possible that a barrier may be a no-op for an
|
||||
|
@@ -194,32 +194,9 @@ that the RCU callbacks are processed in a timely fashion.
|
||||
|
||||
Another approach is to offload RCU callback processing to "rcuo" kthreads
|
||||
using the CONFIG_RCU_NOCB_CPU=y Kconfig option. The specific CPUs to
|
||||
offload may be selected via several methods:
|
||||
|
||||
1. One of three mutually exclusive Kconfig options specify a
|
||||
build-time default for the CPUs to offload:
|
||||
|
||||
a. The CONFIG_RCU_NOCB_CPU_NONE=y Kconfig option results in
|
||||
no CPUs being offloaded.
|
||||
|
||||
b. The CONFIG_RCU_NOCB_CPU_ZERO=y Kconfig option causes
|
||||
CPU 0 to be offloaded.
|
||||
|
||||
c. The CONFIG_RCU_NOCB_CPU_ALL=y Kconfig option causes all
|
||||
CPUs to be offloaded. Note that the callbacks will be
|
||||
offloaded to "rcuo" kthreads, and that those kthreads
|
||||
will in fact run on some CPU. However, this approach
|
||||
gives fine-grained control on exactly which CPUs the
|
||||
callbacks run on, along with their scheduling priority
|
||||
(including the default of SCHED_OTHER), and it further
|
||||
allows this control to be varied dynamically at runtime.
|
||||
|
||||
2. The "rcu_nocbs=" kernel boot parameter, which takes a comma-separated
|
||||
list of CPUs and CPU ranges, for example, "1,3-5" selects CPUs 1,
|
||||
3, 4, and 5. The specified CPUs will be offloaded in addition to
|
||||
any CPUs specified as offloaded by CONFIG_RCU_NOCB_CPU_ZERO=y or
|
||||
CONFIG_RCU_NOCB_CPU_ALL=y. This means that the "rcu_nocbs=" boot
|
||||
parameter has no effect for kernels built with RCU_NOCB_CPU_ALL=y.
|
||||
offload may be selected using the "rcu_nocbs=" kernel boot parameter,
|
||||
which takes a comma-separated list of CPUs and CPU ranges, for example,
|
||||
"1,3-5" selects CPUs 1, 3, 4, and 5.
|
||||
|
||||
The offloaded CPUs will never queue RCU callbacks, and therefore RCU
|
||||
never prevents offloaded CPUs from entering either dyntick-idle mode
|
||||
|
@@ -8,6 +8,7 @@
|
||||
#ifndef __BCM47XX_NVRAM_H
|
||||
#define __BCM47XX_NVRAM_H
|
||||
|
||||
#include <linux/errno.h>
|
||||
#include <linux/types.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/vmalloc.h>
|
||||
|
@@ -17,11 +17,7 @@
|
||||
# define __release(x) __context__(x,-1)
|
||||
# define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0)
|
||||
# define __percpu __attribute__((noderef, address_space(3)))
|
||||
#ifdef CONFIG_SPARSE_RCU_POINTER
|
||||
# define __rcu __attribute__((noderef, address_space(4)))
|
||||
#else /* CONFIG_SPARSE_RCU_POINTER */
|
||||
# define __rcu
|
||||
#endif /* CONFIG_SPARSE_RCU_POINTER */
|
||||
# define __private __attribute__((noderef))
|
||||
extern void __chk_user_ptr(const volatile void __user *);
|
||||
extern void __chk_io_ptr(const volatile void __iomem *);
|
||||
|
@@ -7,6 +7,10 @@
|
||||
* unlimited scalability while maintaining a constant level of contention
|
||||
* on the root node.
|
||||
*
|
||||
* This seemingly RCU-private file must be available to SRCU users
|
||||
* because the size of the TREE SRCU srcu_struct structure depends
|
||||
* on these definitions.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
|
@@ -1,6 +1,10 @@
|
||||
/*
|
||||
* RCU segmented callback lists
|
||||
*
|
||||
* This seemingly RCU-private file must be available to SRCU users
|
||||
* because the size of the TREE SRCU srcu_struct structure depends
|
||||
* on these definitions.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
|
@@ -34,104 +34,15 @@
|
||||
#define __LINUX_RCUPDATE_H
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <linux/cache.h>
|
||||
#include <linux/spinlock.h>
|
||||
#include <linux/threads.h>
|
||||
#include <linux/cpumask.h>
|
||||
#include <linux/seqlock.h>
|
||||
#include <linux/lockdep.h>
|
||||
#include <linux/debugobjects.h>
|
||||
#include <linux/bug.h>
|
||||
#include <linux/compiler.h>
|
||||
#include <linux/ktime.h>
|
||||
#include <linux/atomic.h>
|
||||
#include <linux/irqflags.h>
|
||||
#include <linux/preempt.h>
|
||||
#include <linux/bottom_half.h>
|
||||
#include <linux/lockdep.h>
|
||||
#include <asm/processor.h>
|
||||
#include <linux/cpumask.h>
|
||||
|
||||
#include <asm/barrier.h>
|
||||
|
||||
#ifndef CONFIG_TINY_RCU
|
||||
extern int rcu_expedited; /* for sysctl */
|
||||
extern int rcu_normal; /* also for sysctl */
|
||||
#endif /* #ifndef CONFIG_TINY_RCU */
|
||||
|
||||
#ifdef CONFIG_TINY_RCU
|
||||
/* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
|
||||
static inline bool rcu_gp_is_normal(void) /* Internal RCU use. */
|
||||
{
|
||||
return true;
|
||||
}
|
||||
static inline bool rcu_gp_is_expedited(void) /* Internal RCU use. */
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline void rcu_expedite_gp(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_unexpedite_gp(void)
|
||||
{
|
||||
}
|
||||
#else /* #ifdef CONFIG_TINY_RCU */
|
||||
bool rcu_gp_is_normal(void); /* Internal RCU use. */
|
||||
bool rcu_gp_is_expedited(void); /* Internal RCU use. */
|
||||
void rcu_expedite_gp(void);
|
||||
void rcu_unexpedite_gp(void);
|
||||
#endif /* #else #ifdef CONFIG_TINY_RCU */
|
||||
|
||||
enum rcutorture_type {
|
||||
RCU_FLAVOR,
|
||||
RCU_BH_FLAVOR,
|
||||
RCU_SCHED_FLAVOR,
|
||||
RCU_TASKS_FLAVOR,
|
||||
SRCU_FLAVOR,
|
||||
INVALID_RCU_FLAVOR
|
||||
};
|
||||
|
||||
#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
|
||||
void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
|
||||
unsigned long *gpnum, unsigned long *completed);
|
||||
void rcutorture_record_test_transition(void);
|
||||
void rcutorture_record_progress(unsigned long vernum);
|
||||
void do_trace_rcu_torture_read(const char *rcutorturename,
|
||||
struct rcu_head *rhp,
|
||||
unsigned long secs,
|
||||
unsigned long c_old,
|
||||
unsigned long c);
|
||||
bool rcu_irq_enter_disabled(void);
|
||||
#else
|
||||
static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
int *flags,
|
||||
unsigned long *gpnum,
|
||||
unsigned long *completed)
|
||||
{
|
||||
*flags = 0;
|
||||
*gpnum = 0;
|
||||
*completed = 0;
|
||||
}
|
||||
static inline void rcutorture_record_test_transition(void)
|
||||
{
|
||||
}
|
||||
static inline void rcutorture_record_progress(unsigned long vernum)
|
||||
{
|
||||
}
|
||||
static inline bool rcu_irq_enter_disabled(void)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
#ifdef CONFIG_RCU_TRACE
|
||||
void do_trace_rcu_torture_read(const char *rcutorturename,
|
||||
struct rcu_head *rhp,
|
||||
unsigned long secs,
|
||||
unsigned long c_old,
|
||||
unsigned long c);
|
||||
#else
|
||||
#define do_trace_rcu_torture_read(rcutorturename, rhp, secs, c_old, c) \
|
||||
do { } while (0)
|
||||
#endif
|
||||
#endif
|
||||
|
||||
#define UINT_CMP_GE(a, b) (UINT_MAX / 2 >= (a) - (b))
|
||||
#define UINT_CMP_LT(a, b) (UINT_MAX / 2 < (a) - (b))
|
||||
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
|
||||
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))
|
||||
#define ulong2long(a) (*(long *)(&(a)))
|
||||
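These helpers compare free-running counters in a wraparound-safe way; a small
sketch of the typical use, assuming two grace-period numbers sampled from the
same unsigned counter:

    /* True if grace period "gp" had completed when "completed" was sampled,
     * even if the unsigned counter has wrapped in between.
     */
    static bool gp_has_completed(unsigned long completed, unsigned long gp)
    {
        return ULONG_CMP_GE(completed, gp);
    }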
@@ -139,115 +50,14 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
|
||||
/* Exported common interfaces */
|
||||
|
||||
#ifdef CONFIG_PREEMPT_RCU
|
||||
|
||||
/**
|
||||
* call_rcu() - Queue an RCU callback for invocation after a grace period.
|
||||
* @head: structure to be used for queueing the RCU updates.
|
||||
* @func: actual callback function to be invoked after the grace period
|
||||
*
|
||||
* The callback function will be invoked some time after a full grace
|
||||
* period elapses, in other words after all pre-existing RCU read-side
|
||||
* critical sections have completed. However, the callback function
|
||||
* might well execute concurrently with RCU read-side critical sections
|
||||
* that started after call_rcu() was invoked. RCU read-side critical
|
||||
* sections are delimited by rcu_read_lock() and rcu_read_unlock(),
|
||||
* and may be nested.
|
||||
*
|
||||
* Note that all CPUs must agree that the grace period extended beyond
|
||||
* all pre-existing RCU read-side critical section. On systems with more
|
||||
* than one CPU, this means that when "func()" is invoked, each CPU is
|
||||
* guaranteed to have executed a full memory barrier since the end of its
|
||||
* last RCU read-side critical section whose beginning preceded the call
|
||||
* to call_rcu(). It also means that each CPU executing an RCU read-side
|
||||
* critical section that continues beyond the start of "func()" must have
|
||||
* executed a memory barrier after the call_rcu() but before the beginning
|
||||
* of that RCU read-side critical section. Note that these guarantees
|
||||
* include CPUs that are offline, idle, or executing in user mode, as
|
||||
* well as CPUs that are executing in the kernel.
|
||||
*
|
||||
* Furthermore, if CPU A invoked call_rcu() and CPU B invoked the
|
||||
* resulting RCU callback function "func()", then both CPU A and CPU B are
|
||||
* guaranteed to execute a full memory barrier during the time interval
|
||||
* between the call to call_rcu() and the invocation of "func()" -- even
|
||||
* if CPU A and CPU B are the same CPU (but again only if the system has
|
||||
* more than one CPU).
|
||||
*/
|
||||
void call_rcu(struct rcu_head *head,
|
||||
rcu_callback_t func);
|
||||
|
||||
void call_rcu(struct rcu_head *head, rcu_callback_t func);
|
||||
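A minimal usage sketch matching the comment above (struct foo and its list
linkage are assumptions): the callback runs only after a full grace period,
so readers traversing the list can no longer hold a reference.

    struct foo {
        struct list_head list;
        struct rcu_head rcu;
        int a;
    };

    static void foo_reclaim(struct rcu_head *rhp)
    {
        kfree(container_of(rhp, struct foo, rcu));  /* after a grace period */
    }

    static void foo_remove(struct foo *fp)
    {
        list_del_rcu(&fp->list);        /* caller holds the update-side lock */
        call_rcu(&fp->rcu, foo_reclaim);
    }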
#else /* #ifdef CONFIG_PREEMPT_RCU */
|
||||
|
||||
/* In classic RCU, call_rcu() is just call_rcu_sched(). */
|
||||
#define call_rcu call_rcu_sched
|
||||
|
||||
#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
|
||||
|
||||
/**
|
||||
* call_rcu_bh() - Queue an RCU for invocation after a quicker grace period.
|
||||
* @head: structure to be used for queueing the RCU updates.
|
||||
* @func: actual callback function to be invoked after the grace period
|
||||
*
|
||||
* The callback function will be invoked some time after a full grace
|
||||
* period elapses, in other words after all currently executing RCU
|
||||
* read-side critical sections have completed. call_rcu_bh() assumes
|
||||
* that the read-side critical sections end on completion of a softirq
|
||||
* handler. This means that read-side critical sections in process
|
||||
* context must not be interrupted by softirqs. This interface is to be
|
||||
* used when most of the read-side critical sections are in softirq context.
|
||||
* RCU read-side critical sections are delimited by :
|
||||
* - rcu_read_lock() and rcu_read_unlock(), if in interrupt context.
|
||||
* OR
|
||||
* - rcu_read_lock_bh() and rcu_read_unlock_bh(), if in process context.
|
||||
* These may be nested.
|
||||
*
|
||||
* See the description of call_rcu() for more detailed information on
|
||||
* memory ordering guarantees.
|
||||
*/
|
||||
void call_rcu_bh(struct rcu_head *head,
|
||||
rcu_callback_t func);
|
||||
|
||||
/**
|
||||
* call_rcu_sched() - Queue an RCU for invocation after sched grace period.
|
||||
* @head: structure to be used for queueing the RCU updates.
|
||||
* @func: actual callback function to be invoked after the grace period
|
||||
*
|
||||
* The callback function will be invoked some time after a full grace
|
||||
* period elapses, in other words after all currently executing RCU
|
||||
* read-side critical sections have completed. call_rcu_sched() assumes
|
||||
* that the read-side critical sections end on enabling of preemption
|
||||
* or on voluntary preemption.
|
||||
* RCU read-side critical sections are delimited by :
|
||||
* - rcu_read_lock_sched() and rcu_read_unlock_sched(),
|
||||
* OR
|
||||
* anything that disables preemption.
|
||||
* These may be nested.
|
||||
*
|
||||
* See the description of call_rcu() for more detailed information on
|
||||
* memory ordering guarantees.
|
||||
*/
|
||||
void call_rcu_sched(struct rcu_head *head,
|
||||
rcu_callback_t func);
|
||||
|
||||
void call_rcu_bh(struct rcu_head *head, rcu_callback_t func);
|
||||
void call_rcu_sched(struct rcu_head *head, rcu_callback_t func);
|
||||
void synchronize_sched(void);
|
||||
|
||||
/**
|
||||
* call_rcu_tasks() - Queue an RCU for invocation task-based grace period
|
||||
* @head: structure to be used for queueing the RCU updates.
|
||||
* @func: actual callback function to be invoked after the grace period
|
||||
*
|
||||
* The callback function will be invoked some time after a full grace
|
||||
* period elapses, in other words after all currently executing RCU
|
||||
* read-side critical sections have completed. call_rcu_tasks() assumes
|
||||
* that the read-side critical sections end at a voluntary context
|
||||
* switch (not a preemption!), entry into idle, or transition to usermode
|
||||
* execution. As such, there are no read-side primitives analogous to
|
||||
* rcu_read_lock() and rcu_read_unlock() because this primitive is intended
|
||||
* to determine that all tasks have passed through a safe state, not so
|
||||
* much for data-structure synchronization.
|
||||
*
|
||||
* See the description of call_rcu() for more detailed information on
|
||||
* memory ordering guarantees.
|
||||
*/
|
||||
void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
|
||||
void synchronize_rcu_tasks(void);
|
||||
void rcu_barrier_tasks(void);
|
||||
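A hedged sketch of the tasks-RCU pattern these declarations support, as used
when tearing down something like a tracing trampoline (the trampoline type and
helper functions are hypothetical):

    struct my_trampoline { void *text; };

    static void remove_trampoline(struct my_trampoline *tp)
    {
        unlink_trampoline(tp);          /* hypothetical: no new tasks can enter */
        synchronize_rcu_tasks();        /* every task has voluntarily switched,
                                         * gone idle, or entered usermode */
        free_trampoline_text(tp);       /* hypothetical: now safe to free */
    }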
@@ -301,22 +111,12 @@ void rcu_check_callbacks(int user);
|
||||
void rcu_report_dead(unsigned int cpu);
|
||||
void rcu_cpu_starting(unsigned int cpu);
|
||||
|
||||
#ifndef CONFIG_TINY_RCU
|
||||
void rcu_end_inkernel_boot(void);
|
||||
#else /* #ifndef CONFIG_TINY_RCU */
|
||||
static inline void rcu_end_inkernel_boot(void) { }
|
||||
#endif /* #ifndef CONFIG_TINY_RCU */
|
||||
|
||||
#ifdef CONFIG_RCU_STALL_COMMON
|
||||
void rcu_sysrq_start(void);
|
||||
void rcu_sysrq_end(void);
|
||||
#else /* #ifdef CONFIG_RCU_STALL_COMMON */
|
||||
static inline void rcu_sysrq_start(void)
|
||||
{
|
||||
}
|
||||
static inline void rcu_sysrq_end(void)
|
||||
{
|
||||
}
|
||||
static inline void rcu_sysrq_start(void) { }
|
||||
static inline void rcu_sysrq_end(void) { }
|
||||
#endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */
|
||||
|
||||
#ifdef CONFIG_NO_HZ_FULL
|
||||
@@ -330,9 +130,7 @@ static inline void rcu_user_exit(void) { }
|
||||
#ifdef CONFIG_RCU_NOCB_CPU
|
||||
void rcu_init_nohz(void);
|
||||
#else /* #ifdef CONFIG_RCU_NOCB_CPU */
|
||||
static inline void rcu_init_nohz(void)
|
||||
{
|
||||
}
|
||||
static inline void rcu_init_nohz(void) { }
|
||||
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
|
||||
|
||||
/**
|
||||
@@ -397,10 +195,6 @@ do { \
|
||||
rcu_note_voluntary_context_switch(current); \
|
||||
} while (0)
|
||||
|
||||
#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP)
|
||||
bool __rcu_is_watching(void);
|
||||
#endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */
|
||||
|
||||
/*
|
||||
* Infrastructure to implement the synchronize_() primitives in
|
||||
* TREE_RCU and rcu_barrier_() primitives in TINY_RCU.
|
||||
@@ -414,10 +208,6 @@ bool __rcu_is_watching(void);
|
||||
#error "Unknown RCU implementation specified to kernel configuration"
|
||||
#endif
|
||||
|
||||
#define RCU_SCHEDULER_INACTIVE 0
|
||||
#define RCU_SCHEDULER_INIT 1
|
||||
#define RCU_SCHEDULER_RUNNING 2
|
||||
|
||||
/*
|
||||
* init_rcu_head_on_stack()/destroy_rcu_head_on_stack() are needed for dynamic
|
||||
* initialization and destruction of rcu_head on the stack. rcu_head structures
|
||||
@@ -430,30 +220,16 @@ void destroy_rcu_head(struct rcu_head *head);
|
||||
void init_rcu_head_on_stack(struct rcu_head *head);
|
||||
void destroy_rcu_head_on_stack(struct rcu_head *head);
|
||||
#else /* !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
|
||||
static inline void init_rcu_head(struct rcu_head *head)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void destroy_rcu_head(struct rcu_head *head)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void init_rcu_head_on_stack(struct rcu_head *head)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void destroy_rcu_head_on_stack(struct rcu_head *head)
|
||||
{
|
||||
}
|
||||
static inline void init_rcu_head(struct rcu_head *head) { }
|
||||
static inline void destroy_rcu_head(struct rcu_head *head) { }
|
||||
static inline void init_rcu_head_on_stack(struct rcu_head *head) { }
|
||||
static inline void destroy_rcu_head_on_stack(struct rcu_head *head) { }
|
||||
#endif /* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
|
||||
|
||||
#if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU)
|
||||
bool rcu_lockdep_current_cpu_online(void);
|
||||
#else /* #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU) */
|
||||
static inline bool rcu_lockdep_current_cpu_online(void)
|
||||
{
|
||||
return true;
|
||||
}
|
||||
static inline bool rcu_lockdep_current_cpu_online(void) { return true; }
|
||||
#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU) */
|
||||
|
||||
#ifdef CONFIG_DEBUG_LOCK_ALLOC
|
||||
@ -473,18 +249,8 @@ extern struct lockdep_map rcu_bh_lock_map;
extern struct lockdep_map rcu_sched_lock_map;
extern struct lockdep_map rcu_callback_map;
int debug_lockdep_rcu_enabled(void);

int rcu_read_lock_held(void);
int rcu_read_lock_bh_held(void);

/**
 * rcu_read_lock_sched_held() - might we be in RCU-sched read-side critical section?
 *
 * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an
 * RCU-sched read-side critical section.  In absence of
 * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
 * critical section unless it can prove otherwise.
 */
int rcu_read_lock_sched_held(void);

#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
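As a rough sketch of how this lockdep hook is typically consumed (the gp pointer, gp_lock, and do_something_with() below are hypothetical names, not part of this patch), a dereference that is legal either under rcu_read_lock_sched() or while holding the update-side lock can be checked as:

	struct foo *p;

	/* Complain (under lockdep) unless in a reader or holding gp_lock. */
	p = rcu_dereference_check(gp, rcu_read_lock_sched_held() ||
				      lockdep_is_held(&gp_lock));
	if (p)
		do_something_with(p->a);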
@ -531,9 +297,7 @@ static inline void rcu_preempt_sleep_check(void)
|
||||
"Illegal context switch in RCU read-side critical section");
|
||||
}
|
||||
#else /* #ifdef CONFIG_PROVE_RCU */
|
||||
static inline void rcu_preempt_sleep_check(void)
|
||||
{
|
||||
}
|
||||
static inline void rcu_preempt_sleep_check(void) { }
|
||||
#endif /* #else #ifdef CONFIG_PROVE_RCU */
|
||||
|
||||
#define rcu_sleep_check() \
|
||||
@ -1084,52 +848,6 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
#define kfree_rcu(ptr, rcu_head) \
	__kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))

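As an illustration of the intended use (struct foo, its list linkage, and foo_del() are hypothetical), the second argument names the rcu_head field embedded in the structure being freed:

	struct foo {
		struct list_head list;
		int data;
		struct rcu_head rcu;
	};

	static void foo_del(struct foo *fp)
	{
		list_del_rcu(&fp->list);	/* Unpublish from the RCU-protected list. */
		kfree_rcu(fp, rcu);		/* kfree(fp) after a grace period elapses. */
	}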
#ifdef CONFIG_TINY_RCU
|
||||
static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
|
||||
{
|
||||
*nextevt = KTIME_MAX;
|
||||
return 0;
|
||||
}
|
||||
#endif /* #ifdef CONFIG_TINY_RCU */
|
||||
|
||||
#if defined(CONFIG_RCU_NOCB_CPU_ALL)
|
||||
static inline bool rcu_is_nocb_cpu(int cpu) { return true; }
|
||||
#elif defined(CONFIG_RCU_NOCB_CPU)
|
||||
bool rcu_is_nocb_cpu(int cpu);
|
||||
#else
|
||||
static inline bool rcu_is_nocb_cpu(int cpu) { return false; }
|
||||
#endif
|
||||
|
||||
|
||||
/* Only for use by adaptive-ticks code. */
|
||||
#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
|
||||
bool rcu_sys_is_idle(void);
|
||||
void rcu_sysidle_force_exit(void);
|
||||
#else /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
|
||||
|
||||
static inline bool rcu_sys_is_idle(void)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline void rcu_sysidle_force_exit(void)
|
||||
{
|
||||
}
|
||||
|
||||
#endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
|
||||
|
||||
|
||||
/*
|
||||
* Dump the ftrace buffer, but only one time per callsite per boot.
|
||||
*/
|
||||
#define rcu_ftrace_dump(oops_dump_mode) \
|
||||
do { \
|
||||
static atomic_t ___rfd_beenhere = ATOMIC_INIT(0); \
|
||||
\
|
||||
if (!atomic_read(&___rfd_beenhere) && \
|
||||
!atomic_xchg(&___rfd_beenhere, 1)) \
|
||||
ftrace_dump(oops_dump_mode); \
|
||||
} while (0)
|
||||
|
||||
/*
|
||||
* Place this after a lock-acquisition primitive to guarantee that
|
||||
|
@ -25,7 +25,7 @@
|
||||
#ifndef __LINUX_TINY_H
|
||||
#define __LINUX_TINY_H
|
||||
|
||||
#include <linux/cache.h>
|
||||
#include <linux/ktime.h>
|
||||
|
||||
struct rcu_dynticks;
|
||||
static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp)
|
||||
@ -33,10 +33,8 @@ static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp)
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline bool rcu_eqs_special_set(int cpu)
|
||||
{
|
||||
return false; /* Never flag non-existent other CPUs! */
|
||||
}
|
||||
/* Never flag non-existent other CPUs! */
|
||||
static inline bool rcu_eqs_special_set(int cpu) { return false; }
|
||||
|
||||
static inline unsigned long get_state_synchronize_rcu(void)
|
||||
{
|
||||
@ -98,159 +96,38 @@ static inline void kfree_call_rcu(struct rcu_head *head,
|
||||
rcu_note_voluntary_context_switch_lite(current); \
|
||||
} while (0)
|
||||
|
||||
static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
|
||||
{
|
||||
*nextevt = KTIME_MAX;
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Take advantage of the fact that there is only one CPU, which
|
||||
* allows us to ignore virtualization-based context switches.
|
||||
*/
|
||||
static inline void rcu_virt_note_context_switch(int cpu)
|
||||
{
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of grace periods started.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_started(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of bottom-half grace periods started.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_started_bh(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of sched grace periods started.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_started_sched(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_completed(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of bottom-half grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_completed_bh(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of sched grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_completed_sched(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of expedited grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_exp_batches_completed(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of expedited sched grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_exp_batches_completed_sched(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline void rcu_force_quiescent_state(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_bh_force_quiescent_state(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_sched_force_quiescent_state(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void show_rcu_gp_kthreads(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_cpu_stall_reset(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_idle_enter(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_idle_exit(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_irq_enter(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_irq_exit_irqson(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_irq_enter_irqson(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_irq_exit(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void exit_rcu(void)
|
||||
{
|
||||
}
|
||||
static inline void rcu_virt_note_context_switch(int cpu) { }
|
||||
static inline void rcu_cpu_stall_reset(void) { }
|
||||
static inline void rcu_idle_enter(void) { }
|
||||
static inline void rcu_idle_exit(void) { }
|
||||
static inline void rcu_irq_enter(void) { }
|
||||
static inline bool rcu_irq_enter_disabled(void) { return false; }
|
||||
static inline void rcu_irq_exit_irqson(void) { }
|
||||
static inline void rcu_irq_enter_irqson(void) { }
|
||||
static inline void rcu_irq_exit(void) { }
|
||||
static inline void exit_rcu(void) { }
|
||||
|
||||
#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU)
|
||||
extern int rcu_scheduler_active __read_mostly;
|
||||
void rcu_scheduler_starting(void);
|
||||
#else /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU) */
|
||||
static inline void rcu_scheduler_starting(void)
|
||||
{
|
||||
}
|
||||
static inline void rcu_scheduler_starting(void) { }
|
||||
#endif /* #else #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU) */
|
||||
static inline void rcu_end_inkernel_boot(void) { }
|
||||
static inline bool rcu_is_watching(void) { return true; }
|
||||
|
||||
#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE)
|
||||
|
||||
static inline bool rcu_is_watching(void)
|
||||
{
|
||||
return __rcu_is_watching();
|
||||
}
|
||||
|
||||
#else /* defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) */
|
||||
|
||||
static inline bool rcu_is_watching(void)
|
||||
{
|
||||
return true;
|
||||
}
|
||||
|
||||
#endif /* #else defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) */
|
||||
|
||||
static inline void rcu_request_urgent_qs_task(struct task_struct *t)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_all_qs(void)
|
||||
{
|
||||
barrier(); /* Avoid RCU read-side critical sections leaking across. */
|
||||
}
|
||||
/* Avoid RCU read-side critical sections leaking across. */
|
||||
static inline void rcu_all_qs(void) { barrier(); }
|
||||
|
||||
/* RCUtree hotplug events */
|
||||
#define rcutree_prepare_cpu NULL
|
||||
|
@ -79,37 +79,20 @@ void cond_synchronize_rcu(unsigned long oldstate);
|
||||
unsigned long get_state_synchronize_sched(void);
|
||||
void cond_synchronize_sched(unsigned long oldstate);
|
||||
|
||||
extern unsigned long rcutorture_testseq;
|
||||
extern unsigned long rcutorture_vernum;
|
||||
unsigned long rcu_batches_started(void);
|
||||
unsigned long rcu_batches_started_bh(void);
|
||||
unsigned long rcu_batches_started_sched(void);
|
||||
unsigned long rcu_batches_completed(void);
|
||||
unsigned long rcu_batches_completed_bh(void);
|
||||
unsigned long rcu_batches_completed_sched(void);
|
||||
unsigned long rcu_exp_batches_completed(void);
|
||||
unsigned long rcu_exp_batches_completed_sched(void);
|
||||
void show_rcu_gp_kthreads(void);
|
||||
|
||||
void rcu_force_quiescent_state(void);
|
||||
void rcu_bh_force_quiescent_state(void);
|
||||
void rcu_sched_force_quiescent_state(void);
|
||||
|
||||
void rcu_idle_enter(void);
|
||||
void rcu_idle_exit(void);
|
||||
void rcu_irq_enter(void);
|
||||
void rcu_irq_exit(void);
|
||||
void rcu_irq_enter_irqson(void);
|
||||
void rcu_irq_exit_irqson(void);
|
||||
bool rcu_irq_enter_disabled(void);
|
||||
|
||||
void exit_rcu(void);
|
||||
|
||||
void rcu_scheduler_starting(void);
|
||||
extern int rcu_scheduler_active __read_mostly;
|
||||
|
||||
void rcu_end_inkernel_boot(void);
|
||||
bool rcu_is_watching(void);
|
||||
void rcu_request_urgent_qs_task(struct task_struct *t);
|
||||
|
||||
void rcu_all_qs(void);
|
||||
|
||||
/* RCUtree hotplug events */
|
||||
|
@ -369,6 +369,26 @@ static __always_inline int spin_trylock_irq(spinlock_t *lock)
	raw_spin_trylock_irqsave(spinlock_check(lock), flags); \
})

/**
 * spin_unlock_wait - Interpose between successive critical sections
 * @lock: the spinlock whose critical sections are to be interposed.
 *
 * Semantically this is equivalent to a spin_lock() immediately
 * followed by a spin_unlock().  However, most architectures have
 * more efficient implementations in which the spin_unlock_wait()
 * cannot block concurrent lock acquisition, and in some cases
 * where spin_unlock_wait() does not write to the lock variable.
 * Nevertheless, spin_unlock_wait() can have high overhead, so if
 * you feel the need to use it, please check to see if there is
 * a better way to get your job done.
 *
 * The ordering guarantees provided by spin_unlock_wait() are:
 *
 * 1.  All accesses preceding the spin_unlock_wait() happen before
 *     any accesses in later critical sections for this same lock.
 * 2.  All accesses following the spin_unlock_wait() happen after
 *     any accesses in earlier critical sections for this same lock.
 */
static __always_inline void spin_unlock_wait(spinlock_t *lock)
{
	raw_spin_unlock_wait(&lock->rlock);
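A rough sketch of the interposition pattern described in the comment above, assuming a hypothetical foo structure with a shutting_down flag and a per-object lock (the names and the teardown helper are illustrative, not taken from this patch):

	static void foo_shutdown(struct foo *foo)
	{
		WRITE_ONCE(foo->shutting_down, true);

		/*
		 * Guarantee 1: the store above happens before anything done in
		 * critical sections that begin after this point.  Guarantee 2:
		 * we now observe the effects of all earlier critical sections.
		 */
		spin_unlock_wait(&foo->lock);

		foo_free_resources(foo);	/* Hypothetical teardown helper. */
	}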
|
@ -60,32 +60,15 @@ int init_srcu_struct(struct srcu_struct *sp);
#include <linux/srcutiny.h>
#elif defined(CONFIG_TREE_SRCU)
#include <linux/srcutree.h>
#elif defined(CONFIG_CLASSIC_SRCU)
#include <linux/srcuclassic.h>
#else
#elif defined(CONFIG_SRCU)
#error "Unknown SRCU implementation specified to kernel configuration"
#else
/* Dummy definition for things like notifiers.  Actual use gets link error. */
struct srcu_struct { };
#endif

/**
 * call_srcu() - Queue a callback for invocation after an SRCU grace period
 * @sp: srcu_struct in queue the callback
 * @head: structure to be used for queueing the SRCU callback.
 * @func: function to be invoked after the SRCU grace period
 *
 * The callback function will be invoked some time after a full SRCU
 * grace period elapses, in other words after all pre-existing SRCU
 * read-side critical sections have completed.  However, the callback
 * function might well execute concurrently with other SRCU read-side
 * critical sections that started after call_srcu() was invoked.  SRCU
 * read-side critical sections are delimited by srcu_read_lock() and
 * srcu_read_unlock(), and may be nested.
 *
 * The callback will be invoked from process context, but must nevertheless
 * be fast and must not block.
 */
void call_srcu(struct srcu_struct *sp, struct rcu_head *head,
		void (*func)(struct rcu_head *head));

void cleanup_srcu_struct(struct srcu_struct *sp);
int __srcu_read_lock(struct srcu_struct *sp) __acquires(sp);
void __srcu_read_unlock(struct srcu_struct *sp, int idx) __releases(sp);
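A minimal usage sketch for call_srcu() under the contract described in the kernel-doc above (struct foo, foo_srcu, and the reclaim callback are hypothetical names):

	static DEFINE_STATIC_SRCU(foo_srcu);

	struct foo {
		int data;
		struct rcu_head rh;
	};

	static void foo_reclaim(struct rcu_head *head)
	{
		/* Runs from process context, after all pre-existing readers finish. */
		kfree(container_of(head, struct foo, rh));
	}

	static void foo_remove(struct foo *fp)
	{
		/* After unpublishing fp from the readers' data structure: */
		call_srcu(&foo_srcu, &fp->rh, foo_reclaim);
	}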
|
@ -1,115 +0,0 @@
|
||||
/*
|
||||
* Sleepable Read-Copy Update mechanism for mutual exclusion,
|
||||
* classic v4.11 variant.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
* (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright (C) IBM Corporation, 2017
|
||||
*
|
||||
* Author: Paul McKenney <paulmck@us.ibm.com>
|
||||
*/
|
||||
|
||||
#ifndef _LINUX_SRCU_CLASSIC_H
|
||||
#define _LINUX_SRCU_CLASSIC_H
|
||||
|
||||
struct srcu_array {
|
||||
unsigned long lock_count[2];
|
||||
unsigned long unlock_count[2];
|
||||
};
|
||||
|
||||
struct rcu_batch {
|
||||
struct rcu_head *head, **tail;
|
||||
};
|
||||
|
||||
#define RCU_BATCH_INIT(name) { NULL, &(name.head) }
|
||||
|
||||
struct srcu_struct {
|
||||
unsigned long completed;
|
||||
struct srcu_array __percpu *per_cpu_ref;
|
||||
spinlock_t queue_lock; /* protect ->batch_queue, ->running */
|
||||
bool running;
|
||||
/* callbacks just queued */
|
||||
struct rcu_batch batch_queue;
|
||||
/* callbacks try to do the first check_zero */
|
||||
struct rcu_batch batch_check0;
|
||||
/* callbacks done with the first check_zero and the flip */
|
||||
struct rcu_batch batch_check1;
|
||||
struct rcu_batch batch_done;
|
||||
struct delayed_work work;
|
||||
#ifdef CONFIG_DEBUG_LOCK_ALLOC
|
||||
struct lockdep_map dep_map;
|
||||
#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
|
||||
};
|
||||
|
||||
void process_srcu(struct work_struct *work);
|
||||
|
||||
#define __SRCU_STRUCT_INIT(name) \
|
||||
{ \
|
||||
.completed = -300, \
|
||||
.per_cpu_ref = &name##_srcu_array, \
|
||||
.queue_lock = __SPIN_LOCK_UNLOCKED(name.queue_lock), \
|
||||
.running = false, \
|
||||
.batch_queue = RCU_BATCH_INIT(name.batch_queue), \
|
||||
.batch_check0 = RCU_BATCH_INIT(name.batch_check0), \
|
||||
.batch_check1 = RCU_BATCH_INIT(name.batch_check1), \
|
||||
.batch_done = RCU_BATCH_INIT(name.batch_done), \
|
||||
.work = __DELAYED_WORK_INITIALIZER(name.work, process_srcu, 0),\
|
||||
__SRCU_DEP_MAP_INIT(name) \
|
||||
}
|
||||
|
||||
/*
|
||||
* Define and initialize a srcu struct at build time.
|
||||
* Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
|
||||
*
|
||||
* Note that although DEFINE_STATIC_SRCU() hides the name from other
|
||||
* files, the per-CPU variable rules nevertheless require that the
|
||||
* chosen name be globally unique. These rules also prohibit use of
|
||||
* DEFINE_STATIC_SRCU() within a function. If these rules are too
|
||||
* restrictive, declare the srcu_struct manually. For example, in
|
||||
* each file:
|
||||
*
|
||||
* static struct srcu_struct my_srcu;
|
||||
*
|
||||
* Then, before the first use of each my_srcu, manually initialize it:
|
||||
*
|
||||
* init_srcu_struct(&my_srcu);
|
||||
*
|
||||
* See include/linux/percpu-defs.h for the rules on per-CPU variables.
|
||||
*/
|
||||
#define __DEFINE_SRCU(name, is_static) \
|
||||
static DEFINE_PER_CPU(struct srcu_array, name##_srcu_array);\
|
||||
is_static struct srcu_struct name = __SRCU_STRUCT_INIT(name)
|
||||
#define DEFINE_SRCU(name) __DEFINE_SRCU(name, /* not static */)
|
||||
#define DEFINE_STATIC_SRCU(name) __DEFINE_SRCU(name, static)
|
||||
|
||||
void synchronize_srcu_expedited(struct srcu_struct *sp);
|
||||
void srcu_barrier(struct srcu_struct *sp);
|
||||
unsigned long srcu_batches_completed(struct srcu_struct *sp);
|
||||
|
||||
static inline void srcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
struct srcu_struct *sp, int *flags,
|
||||
unsigned long *gpnum,
|
||||
unsigned long *completed)
|
||||
{
|
||||
if (test_type != SRCU_FLAVOR)
|
||||
return;
|
||||
*flags = 0;
|
||||
*completed = sp->completed;
|
||||
*gpnum = *completed;
|
||||
if (sp->batch_queue.head || sp->batch_check0.head || sp->batch_check0.head)
|
||||
(*gpnum)++;
|
||||
}
|
||||
|
||||
#endif
|
@ -27,15 +27,14 @@
#include <linux/swait.h>

struct srcu_struct {
	int srcu_lock_nesting[2];	/* srcu_read_lock() nesting depth. */
	short srcu_lock_nesting[2];	/* srcu_read_lock() nesting depth. */
	short srcu_idx;			/* Current reader array element. */
	u8 srcu_gp_running;		/* GP workqueue running? */
	u8 srcu_gp_waiting;		/* GP waiting for readers? */
	struct swait_queue_head srcu_wq;
					/* Last srcu_read_unlock() wakes GP. */
	unsigned long srcu_gp_seq;	/* GP seq # for callback tagging. */
	struct rcu_segcblist srcu_cblist;
					/* Pending SRCU callbacks. */
	int srcu_idx;			/* Current reader array element. */
	bool srcu_gp_running;		/* GP workqueue running? */
	bool srcu_gp_waiting;		/* GP waiting for readers? */
	struct rcu_head *srcu_cb_head;	/* Pending callbacks: Head. */
	struct rcu_head **srcu_cb_tail;	/* Pending callbacks: Tail. */
	struct work_struct srcu_work;	/* For driving grace periods. */
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	struct lockdep_map dep_map;
@ -47,7 +46,7 @@ void srcu_drive_gp(struct work_struct *wp);
#define __SRCU_STRUCT_INIT(name)					\
{									\
	.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq),	\
	.srcu_cblist = RCU_SEGCBLIST_INITIALIZER(name.srcu_cblist),	\
	.srcu_cb_tail = &name.srcu_cb_head,				\
	.srcu_work = __WORK_INITIALIZER(name.srcu_work, srcu_drive_gp),	\
	__SRCU_DEP_MAP_INIT(name)					\
}
@ -63,6 +62,21 @@ void srcu_drive_gp(struct work_struct *wp);

void synchronize_srcu(struct srcu_struct *sp);

/*
 * Counts the new reader in the appropriate per-CPU element of the
 * srcu_struct.  Can be invoked from irq/bh handlers, but the matching
 * __srcu_read_unlock() must be in the same handler instance.  Returns an
 * index that must be passed to the matching srcu_read_unlock().
 */
static inline int __srcu_read_lock(struct srcu_struct *sp)
{
	int idx;

	idx = READ_ONCE(sp->srcu_idx);
	WRITE_ONCE(sp->srcu_lock_nesting[idx], sp->srcu_lock_nesting[idx] + 1);
	return idx;
}
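For context, a reader that exercises this path typically looks like the following sketch (foo_srcu and the foo pointer are hypothetical; srcu_read_lock() wraps the __srcu_read_lock() shown above):

	int idx;
	struct foo *p;

	idx = srcu_read_lock(&foo_srcu);
	p = srcu_dereference(foo_ptr, &foo_srcu);
	if (p)
		do_something_sleepy(p);		/* SRCU readers are allowed to block. */
	srcu_read_unlock(&foo_srcu, idx);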

static inline void synchronize_srcu_expedited(struct srcu_struct *sp)
{
	synchronize_srcu(sp);
@ -73,21 +87,4 @@ static inline void srcu_barrier(struct srcu_struct *sp)
	synchronize_srcu(sp);
}

static inline unsigned long srcu_batches_completed(struct srcu_struct *sp)
{
	return 0;
}

static inline void srcutorture_get_gp_data(enum rcutorture_type test_type,
					   struct srcu_struct *sp, int *flags,
					   unsigned long *gpnum,
					   unsigned long *completed)
{
	if (test_type != SRCU_FLAVOR)
		return;
	*flags = 0;
	*completed = sp->srcu_gp_seq;
	*gpnum = *completed;
}

#endif
|
@ -40,7 +40,7 @@ struct srcu_data {
|
||||
unsigned long srcu_unlock_count[2]; /* Unlocks per CPU. */
|
||||
|
||||
/* Update-side state. */
|
||||
spinlock_t lock ____cacheline_internodealigned_in_smp;
|
||||
raw_spinlock_t __private lock ____cacheline_internodealigned_in_smp;
|
||||
struct rcu_segcblist srcu_cblist; /* List of callbacks.*/
|
||||
unsigned long srcu_gp_seq_needed; /* Furthest future GP needed. */
|
||||
unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */
|
||||
@ -58,7 +58,7 @@ struct srcu_data {
|
||||
* Node in SRCU combining tree, similar in function to rcu_data.
|
||||
*/
|
||||
struct srcu_node {
|
||||
spinlock_t lock;
|
||||
raw_spinlock_t __private lock;
|
||||
unsigned long srcu_have_cbs[4]; /* GP seq for children */
|
||||
/* having CBs, but only */
|
||||
/* is > ->srcu_gq_seq. */
|
||||
@ -78,7 +78,7 @@ struct srcu_struct {
|
||||
struct srcu_node *level[RCU_NUM_LVLS + 1];
|
||||
/* First node at each level. */
|
||||
struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
|
||||
spinlock_t gp_lock; /* protect ->srcu_cblist */
|
||||
raw_spinlock_t __private lock; /* Protect counters */
|
||||
struct mutex srcu_gp_mutex; /* Serialize GP work. */
|
||||
unsigned int srcu_idx; /* Current rdr array element. */
|
||||
unsigned long srcu_gp_seq; /* Grace-period seq #. */
|
||||
@ -109,7 +109,7 @@ void process_srcu(struct work_struct *work);
|
||||
#define __SRCU_STRUCT_INIT(name) \
|
||||
{ \
|
||||
.sda = &name##_srcu_data, \
|
||||
.gp_lock = __SPIN_LOCK_UNLOCKED(name.gp_lock), \
|
||||
.lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \
|
||||
.srcu_gp_seq_needed = 0 - 1, \
|
||||
__SRCU_DEP_MAP_INIT(name) \
|
||||
}
|
||||
@ -141,10 +141,5 @@ void process_srcu(struct work_struct *work);
|
||||
|
||||
void synchronize_srcu_expedited(struct srcu_struct *sp);
|
||||
void srcu_barrier(struct srcu_struct *sp);
|
||||
unsigned long srcu_batches_completed(struct srcu_struct *sp);
|
||||
|
||||
void srcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
struct srcu_struct *sp, int *flags,
|
||||
unsigned long *gpnum, unsigned long *completed);
|
||||
|
||||
#endif
|
||||
|
@ -742,6 +742,7 @@ TRACE_EVENT(rcu_torture_read,
 *	"OnlineQ": _rcu_barrier() found online CPU with callbacks.
 *	"OnlineNQ": _rcu_barrier() found online CPU, no callbacks.
 *	"IRQ": An rcu_barrier_callback() callback posted on remote CPU.
 *	"IRQNQ": An rcu_barrier_callback() callback found no callbacks.
 *	"CB": An rcu_barrier_callback() invoked a callback, not the last.
 *	"LastCB": An rcu_barrier_callback() invoked the last callback.
 *	"Inc2": _rcu_barrier() piggyback check counter incremented.
|
349
init/Kconfig
@ -472,354 +472,7 @@ config TASK_IO_ACCOUNTING
|
||||
|
||||
endmenu # "CPU/Task time and stats accounting"
|
||||
|
||||
menu "RCU Subsystem"
|
||||
|
||||
config TREE_RCU
|
||||
bool
|
||||
default y if !PREEMPT && SMP
|
||||
help
|
||||
This option selects the RCU implementation that is
|
||||
designed for very large SMP system with hundreds or
|
||||
thousands of CPUs. It also scales down nicely to
|
||||
smaller systems.
|
||||
|
||||
config PREEMPT_RCU
|
||||
bool
|
||||
default y if PREEMPT
|
||||
help
|
||||
This option selects the RCU implementation that is
|
||||
designed for very large SMP systems with hundreds or
|
||||
thousands of CPUs, but for which real-time response
|
||||
is also required. It also scales down nicely to
|
||||
smaller systems.
|
||||
|
||||
Select this option if you are unsure.
|
||||
|
||||
config TINY_RCU
|
||||
bool
|
||||
default y if !PREEMPT && !SMP
|
||||
help
|
||||
This option selects the RCU implementation that is
|
||||
designed for UP systems from which real-time response
|
||||
is not required. This option greatly reduces the
|
||||
memory footprint of RCU.
|
||||
|
||||
config RCU_EXPERT
|
||||
bool "Make expert-level adjustments to RCU configuration"
|
||||
default n
|
||||
help
|
||||
This option needs to be enabled if you wish to make
|
||||
expert-level adjustments to RCU configuration. By default,
|
||||
no such adjustments can be made, which has the often-beneficial
|
||||
side-effect of preventing "make oldconfig" from asking you all
|
||||
sorts of detailed questions about how you would like numerous
|
||||
obscure RCU options to be set up.
|
||||
|
||||
Say Y if you need to make expert-level adjustments to RCU.
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config SRCU
|
||||
bool
|
||||
default y
|
||||
help
|
||||
This option selects the sleepable version of RCU. This version
|
||||
permits arbitrary sleeping or blocking within RCU read-side critical
|
||||
sections.
|
||||
|
||||
config CLASSIC_SRCU
|
||||
bool "Use v4.11 classic SRCU implementation"
|
||||
default n
|
||||
depends on RCU_EXPERT && SRCU
|
||||
help
|
||||
This option selects the traditional well-tested classic SRCU
|
||||
implementation from v4.11, as might be desired for enterprise
|
||||
Linux distributions. Without this option, the shiny new
|
||||
Tiny SRCU and Tree SRCU implementations are used instead.
|
||||
At some point, it is hoped that Tiny SRCU and Tree SRCU
|
||||
will accumulate enough test time and confidence to allow
|
||||
Classic SRCU to be dropped entirely.
|
||||
|
||||
Say Y if you need a rock-solid SRCU.
|
||||
|
||||
Say N if you would like help test Tree SRCU.
|
||||
|
||||
config TINY_SRCU
|
||||
bool
|
||||
default y if SRCU && TINY_RCU && !CLASSIC_SRCU
|
||||
help
|
||||
This option selects the single-CPU non-preemptible version of SRCU.
|
||||
|
||||
config TREE_SRCU
|
||||
bool
|
||||
default y if SRCU && !TINY_RCU && !CLASSIC_SRCU
|
||||
help
|
||||
This option selects the full-fledged version of SRCU.
|
||||
|
||||
config TASKS_RCU
|
||||
bool
|
||||
default n
|
||||
select SRCU
|
||||
help
|
||||
This option enables a task-based RCU implementation that uses
|
||||
only voluntary context switch (not preemption!), idle, and
|
||||
user-mode execution as quiescent states.
|
||||
|
||||
config RCU_STALL_COMMON
|
||||
def_bool ( TREE_RCU || PREEMPT_RCU || RCU_TRACE )
|
||||
help
|
||||
This option enables RCU CPU stall code that is common between
|
||||
the TINY and TREE variants of RCU. The purpose is to allow
|
||||
the tiny variants to disable RCU CPU stall warnings, while
|
||||
making these warnings mandatory for the tree variants.
|
||||
|
||||
config RCU_NEED_SEGCBLIST
|
||||
def_bool ( TREE_RCU || PREEMPT_RCU || TINY_SRCU || TREE_SRCU )
|
||||
|
||||
config CONTEXT_TRACKING
|
||||
bool
|
||||
|
||||
config CONTEXT_TRACKING_FORCE
|
||||
bool "Force context tracking"
|
||||
depends on CONTEXT_TRACKING
|
||||
default y if !NO_HZ_FULL
|
||||
help
|
||||
The major pre-requirement for full dynticks to work is to
|
||||
support the context tracking subsystem. But there are also
|
||||
other dependencies to provide in order to make the full
|
||||
dynticks working.
|
||||
|
||||
This option stands for testing when an arch implements the
|
||||
context tracking backend but doesn't yet fullfill all the
|
||||
requirements to make the full dynticks feature working.
|
||||
Without the full dynticks, there is no way to test the support
|
||||
for context tracking and the subsystems that rely on it: RCU
|
||||
userspace extended quiescent state and tickless cputime
|
||||
accounting. This option copes with the absence of the full
|
||||
dynticks subsystem by forcing the context tracking on all
|
||||
CPUs in the system.
|
||||
|
||||
Say Y only if you're working on the development of an
|
||||
architecture backend for the context tracking.
|
||||
|
||||
Say N otherwise, this option brings an overhead that you
|
||||
don't want in production.
|
||||
|
||||
|
||||
config RCU_FANOUT
|
||||
int "Tree-based hierarchical RCU fanout value"
|
||||
range 2 64 if 64BIT
|
||||
range 2 32 if !64BIT
|
||||
depends on (TREE_RCU || PREEMPT_RCU) && RCU_EXPERT
|
||||
default 64 if 64BIT
|
||||
default 32 if !64BIT
|
||||
help
|
||||
This option controls the fanout of hierarchical implementations
|
||||
of RCU, allowing RCU to work efficiently on machines with
|
||||
large numbers of CPUs. This value must be at least the fourth
|
||||
root of NR_CPUS, which allows NR_CPUS to be insanely large.
|
||||
The default value of RCU_FANOUT should be used for production
|
||||
systems, but if you are stress-testing the RCU implementation
|
||||
itself, small RCU_FANOUT values allow you to test large-system
|
||||
code paths on small(er) systems.
|
||||
|
||||
Select a specific number if testing RCU itself.
|
||||
Take the default if unsure.
|
||||
|
||||
config RCU_FANOUT_LEAF
|
||||
int "Tree-based hierarchical RCU leaf-level fanout value"
|
||||
range 2 64 if 64BIT
|
||||
range 2 32 if !64BIT
|
||||
depends on (TREE_RCU || PREEMPT_RCU) && RCU_EXPERT
|
||||
default 16
|
||||
help
|
||||
This option controls the leaf-level fanout of hierarchical
|
||||
implementations of RCU, and allows trading off cache misses
|
||||
against lock contention. Systems that synchronize their
|
||||
scheduling-clock interrupts for energy-efficiency reasons will
|
||||
want the default because the smaller leaf-level fanout keeps
|
||||
lock contention levels acceptably low. Very large systems
|
||||
(hundreds or thousands of CPUs) will instead want to set this
|
||||
value to the maximum value possible in order to reduce the
|
||||
number of cache misses incurred during RCU's grace-period
|
||||
initialization. These systems tend to run CPU-bound, and thus
|
||||
are not helped by synchronized interrupts, and thus tend to
|
||||
skew them, which reduces lock contention enough that large
|
||||
leaf-level fanouts work well. That said, setting leaf-level
|
||||
fanout to a large number will likely cause problematic
|
||||
lock contention on the leaf-level rcu_node structures unless
|
||||
you boot with the skew_tick kernel parameter.
|
||||
|
||||
Select a specific number if testing RCU itself.
|
||||
|
||||
Select the maximum permissible value for large systems, but
|
||||
please understand that you may also need to set the skew_tick
|
||||
kernel boot parameter to avoid contention on the rcu_node
|
||||
structure's locks.
|
||||
|
||||
Take the default if unsure.
|
||||
|
||||
config RCU_FAST_NO_HZ
|
||||
bool "Accelerate last non-dyntick-idle CPU's grace periods"
|
||||
depends on NO_HZ_COMMON && SMP && RCU_EXPERT
|
||||
default n
|
||||
help
|
||||
This option permits CPUs to enter dynticks-idle state even if
|
||||
they have RCU callbacks queued, and prevents RCU from waking
|
||||
these CPUs up more than roughly once every four jiffies (by
|
||||
default, you can adjust this using the rcutree.rcu_idle_gp_delay
|
||||
parameter), thus improving energy efficiency. On the other
|
||||
hand, this option increases the duration of RCU grace periods,
|
||||
for example, slowing down synchronize_rcu().
|
||||
|
||||
Say Y if energy efficiency is critically important, and you
|
||||
don't care about increased grace-period durations.
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config TREE_RCU_TRACE
|
||||
def_bool RCU_TRACE && ( TREE_RCU || PREEMPT_RCU )
|
||||
select DEBUG_FS
|
||||
help
|
||||
This option provides tracing for the TREE_RCU and
|
||||
PREEMPT_RCU implementations, permitting Makefile to
|
||||
trivially select kernel/rcutree_trace.c.
|
||||
|
||||
config RCU_BOOST
|
||||
bool "Enable RCU priority boosting"
|
||||
depends on RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT
|
||||
default n
|
||||
help
|
||||
This option boosts the priority of preempted RCU readers that
|
||||
block the current preemptible RCU grace period for too long.
|
||||
This option also prevents heavy loads from blocking RCU
|
||||
callback invocation for all flavors of RCU.
|
||||
|
||||
Say Y here if you are working with real-time apps or heavy loads
|
||||
Say N here if you are unsure.
|
||||
|
||||
config RCU_KTHREAD_PRIO
|
||||
int "Real-time priority to use for RCU worker threads"
|
||||
range 1 99 if RCU_BOOST
|
||||
range 0 99 if !RCU_BOOST
|
||||
default 1 if RCU_BOOST
|
||||
default 0 if !RCU_BOOST
|
||||
depends on RCU_EXPERT
|
||||
help
|
||||
This option specifies the SCHED_FIFO priority value that will be
|
||||
assigned to the rcuc/n and rcub/n threads and is also the value
|
||||
used for RCU_BOOST (if enabled). If you are working with a
|
||||
real-time application that has one or more CPU-bound threads
|
||||
running at a real-time priority level, you should set
|
||||
RCU_KTHREAD_PRIO to a priority higher than the highest-priority
|
||||
real-time CPU-bound application thread. The default RCU_KTHREAD_PRIO
|
||||
value of 1 is appropriate in the common case, which is real-time
|
||||
applications that do not have any CPU-bound threads.
|
||||
|
||||
Some real-time applications might not have a single real-time
|
||||
thread that saturates a given CPU, but instead might have
|
||||
multiple real-time threads that, taken together, fully utilize
|
||||
that CPU. In this case, you should set RCU_KTHREAD_PRIO to
|
||||
a priority higher than the lowest-priority thread that is
|
||||
conspiring to prevent the CPU from running any non-real-time
|
||||
tasks. For example, if one thread at priority 10 and another
|
||||
thread at priority 5 are between themselves fully consuming
|
||||
the CPU time on a given CPU, then RCU_KTHREAD_PRIO should be
|
||||
set to priority 6 or higher.
|
||||
|
||||
Specify the real-time priority, or take the default if unsure.
|
||||
|
||||
config RCU_BOOST_DELAY
|
||||
int "Milliseconds to delay boosting after RCU grace-period start"
|
||||
range 0 3000
|
||||
depends on RCU_BOOST
|
||||
default 500
|
||||
help
|
||||
This option specifies the time to wait after the beginning of
|
||||
a given grace period before priority-boosting preempted RCU
|
||||
readers blocking that grace period. Note that any RCU reader
|
||||
blocking an expedited RCU grace period is boosted immediately.
|
||||
|
||||
Accept the default if unsure.
|
||||
|
||||
config RCU_NOCB_CPU
|
||||
bool "Offload RCU callback processing from boot-selected CPUs"
|
||||
depends on TREE_RCU || PREEMPT_RCU
|
||||
depends on RCU_EXPERT || NO_HZ_FULL
|
||||
default n
|
||||
help
|
||||
Use this option to reduce OS jitter for aggressive HPC or
|
||||
real-time workloads. It can also be used to offload RCU
|
||||
callback invocation to energy-efficient CPUs in battery-powered
|
||||
asymmetric multiprocessors.
|
||||
|
||||
This option offloads callback invocation from the set of
|
||||
CPUs specified at boot time by the rcu_nocbs parameter.
|
||||
For each such CPU, a kthread ("rcuox/N") will be created to
|
||||
invoke callbacks, where the "N" is the CPU being offloaded,
|
||||
and where the "x" is "b" for RCU-bh, "p" for RCU-preempt, and
|
||||
"s" for RCU-sched. Nothing prevents this kthread from running
|
||||
on the specified CPUs, but (1) the kthreads may be preempted
|
||||
between each callback, and (2) affinity or cgroups can be used
|
||||
to force the kthreads to run on whatever set of CPUs is desired.
|
||||
|
||||
Say Y here if you want to help to debug reduced OS jitter.
|
||||
Say N here if you are unsure.
|
||||
|
||||
choice
|
||||
prompt "Build-forced no-CBs CPUs"
|
||||
default RCU_NOCB_CPU_NONE
|
||||
depends on RCU_NOCB_CPU
|
||||
help
|
||||
This option allows no-CBs CPUs (whose RCU callbacks are invoked
|
||||
from kthreads rather than from softirq context) to be specified
|
||||
at build time. Additional no-CBs CPUs may be specified by
|
||||
the rcu_nocbs= boot parameter.
|
||||
|
||||
config RCU_NOCB_CPU_NONE
|
||||
bool "No build_forced no-CBs CPUs"
|
||||
help
|
||||
This option does not force any of the CPUs to be no-CBs CPUs.
|
||||
Only CPUs designated by the rcu_nocbs= boot parameter will be
|
||||
no-CBs CPUs, whose RCU callbacks will be invoked by per-CPU
|
||||
kthreads whose names begin with "rcuo". All other CPUs will
|
||||
invoke their own RCU callbacks in softirq context.
|
||||
|
||||
Select this option if you want to choose no-CBs CPUs at
|
||||
boot time, for example, to allow testing of different no-CBs
|
||||
configurations without having to rebuild the kernel each time.
|
||||
|
||||
config RCU_NOCB_CPU_ZERO
|
||||
bool "CPU 0 is a build_forced no-CBs CPU"
|
||||
help
|
||||
This option forces CPU 0 to be a no-CBs CPU, so that its RCU
|
||||
callbacks are invoked by a per-CPU kthread whose name begins
|
||||
with "rcuo". Additional CPUs may be designated as no-CBs
|
||||
CPUs using the rcu_nocbs= boot parameter will be no-CBs CPUs.
|
||||
All other CPUs will invoke their own RCU callbacks in softirq
|
||||
context.
|
||||
|
||||
Select this if CPU 0 needs to be a no-CBs CPU for real-time
|
||||
or energy-efficiency reasons, but the real reason it exists
|
||||
is to ensure that randconfig testing covers mixed systems.
|
||||
|
||||
config RCU_NOCB_CPU_ALL
|
||||
bool "All CPUs are build_forced no-CBs CPUs"
|
||||
help
|
||||
This option forces all CPUs to be no-CBs CPUs. The rcu_nocbs=
|
||||
boot parameter will be ignored. All CPUs' RCU callbacks will
|
||||
be executed in the context of per-CPU rcuo kthreads created for
|
||||
this purpose. Assuming that the kthreads whose names start with
|
||||
"rcuo" are bound to "housekeeping" CPUs, this reduces OS jitter
|
||||
on the remaining CPUs, but might decrease memory locality during
|
||||
RCU-callback invocation, thus potentially degrading throughput.
|
||||
|
||||
Select this if all CPUs need to be no-CBs CPUs for real-time
|
||||
or energy-efficiency reasons.
|
||||
|
||||
endchoice
|
||||
|
||||
endmenu # "RCU Subsystem"
|
||||
source "kernel/rcu/Kconfig"
|
||||
|
||||
config BUILD_BIN2C
|
||||
bool
|
||||
|
@ -1157,18 +1157,18 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
|
||||
if (debug_locks_silent)
|
||||
return 0;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("======================================================\n");
|
||||
pr_warn("WARNING: possible circular locking dependency detected\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("------------------------------------------------------\n");
|
||||
printk("%s/%d is trying to acquire lock:\n",
|
||||
pr_warn("%s/%d is trying to acquire lock:\n",
|
||||
curr->comm, task_pid_nr(curr));
|
||||
print_lock(check_src);
|
||||
printk("\nbut task is already holding lock:\n");
|
||||
pr_warn("\nbut task is already holding lock:\n");
|
||||
print_lock(check_tgt);
|
||||
printk("\nwhich lock already depends on the new lock.\n\n");
|
||||
printk("\nthe existing dependency chain (in reverse order) is:\n");
|
||||
pr_warn("\nwhich lock already depends on the new lock.\n\n");
|
||||
pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
|
||||
|
||||
print_circular_bug_entry(entry, depth);
|
||||
|
||||
@ -1495,13 +1495,13 @@ print_bad_irq_dependency(struct task_struct *curr,
|
||||
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
|
||||
return 0;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("=====================================================\n");
|
||||
pr_warn("WARNING: %s-safe -> %s-unsafe lock order detected\n",
|
||||
irqclass, irqclass);
|
||||
print_kernel_ident();
|
||||
pr_warn("-----------------------------------------------------\n");
|
||||
printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
|
||||
pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
|
||||
curr->comm, task_pid_nr(curr),
|
||||
curr->hardirq_context, hardirq_count() >> HARDIRQ_SHIFT,
|
||||
curr->softirq_context, softirq_count() >> SOFTIRQ_SHIFT,
|
||||
@ -1509,46 +1509,46 @@ print_bad_irq_dependency(struct task_struct *curr,
|
||||
curr->softirqs_enabled);
|
||||
print_lock(next);
|
||||
|
||||
printk("\nand this task is already holding:\n");
|
||||
pr_warn("\nand this task is already holding:\n");
|
||||
print_lock(prev);
|
||||
printk("which would create a new lock dependency:\n");
|
||||
pr_warn("which would create a new lock dependency:\n");
|
||||
print_lock_name(hlock_class(prev));
|
||||
printk(KERN_CONT " ->");
|
||||
pr_cont(" ->");
|
||||
print_lock_name(hlock_class(next));
|
||||
printk(KERN_CONT "\n");
|
||||
pr_cont("\n");
|
||||
|
||||
printk("\nbut this new dependency connects a %s-irq-safe lock:\n",
|
||||
pr_warn("\nbut this new dependency connects a %s-irq-safe lock:\n",
|
||||
irqclass);
|
||||
print_lock_name(backwards_entry->class);
|
||||
printk("\n... which became %s-irq-safe at:\n", irqclass);
|
||||
pr_warn("\n... which became %s-irq-safe at:\n", irqclass);
|
||||
|
||||
print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
|
||||
|
||||
printk("\nto a %s-irq-unsafe lock:\n", irqclass);
|
||||
pr_warn("\nto a %s-irq-unsafe lock:\n", irqclass);
|
||||
print_lock_name(forwards_entry->class);
|
||||
printk("\n... which became %s-irq-unsafe at:\n", irqclass);
|
||||
printk("...");
|
||||
pr_warn("\n... which became %s-irq-unsafe at:\n", irqclass);
|
||||
pr_warn("...");
|
||||
|
||||
print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
|
||||
|
||||
printk("\nother info that might help us debug this:\n\n");
|
||||
pr_warn("\nother info that might help us debug this:\n\n");
|
||||
print_irq_lock_scenario(backwards_entry, forwards_entry,
|
||||
hlock_class(prev), hlock_class(next));
|
||||
|
||||
lockdep_print_held_locks(curr);
|
||||
|
||||
printk("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", irqclass);
|
||||
pr_warn("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", irqclass);
|
||||
if (!save_trace(&prev_root->trace))
|
||||
return 0;
|
||||
print_shortest_lock_dependencies(backwards_entry, prev_root);
|
||||
|
||||
printk("\nthe dependencies between the lock to be acquired");
|
||||
printk(" and %s-irq-unsafe lock:\n", irqclass);
|
||||
pr_warn("\nthe dependencies between the lock to be acquired");
|
||||
pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
|
||||
if (!save_trace(&next_root->trace))
|
||||
return 0;
|
||||
print_shortest_lock_dependencies(forwards_entry, next_root);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
|
||||
return 0;
|
||||
@ -1724,22 +1724,22 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
|
||||
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
|
||||
return 0;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("============================================\n");
|
||||
pr_warn("WARNING: possible recursive locking detected\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("--------------------------------------------\n");
|
||||
printk("%s/%d is trying to acquire lock:\n",
|
||||
pr_warn("%s/%d is trying to acquire lock:\n",
|
||||
curr->comm, task_pid_nr(curr));
|
||||
print_lock(next);
|
||||
printk("\nbut task is already holding lock:\n");
|
||||
pr_warn("\nbut task is already holding lock:\n");
|
||||
print_lock(prev);
|
||||
|
||||
printk("\nother info that might help us debug this:\n");
|
||||
pr_warn("\nother info that might help us debug this:\n");
|
||||
print_deadlock_scenario(next, prev);
|
||||
lockdep_print_held_locks(curr);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
|
||||
return 0;
|
||||
@ -2074,21 +2074,21 @@ static void print_collision(struct task_struct *curr,
|
||||
struct held_lock *hlock_next,
|
||||
struct lock_chain *chain)
|
||||
{
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("============================\n");
|
||||
pr_warn("WARNING: chain_key collision\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("----------------------------\n");
|
||||
printk("%s/%d: ", current->comm, task_pid_nr(current));
|
||||
printk("Hash chain already cached but the contents don't match!\n");
|
||||
pr_warn("%s/%d: ", current->comm, task_pid_nr(current));
|
||||
pr_warn("Hash chain already cached but the contents don't match!\n");
|
||||
|
||||
printk("Held locks:");
|
||||
pr_warn("Held locks:");
|
||||
print_chain_keys_held_locks(curr, hlock_next);
|
||||
|
||||
printk("Locks in cached chain:");
|
||||
pr_warn("Locks in cached chain:");
|
||||
print_chain_keys_chain(chain);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
}
|
||||
#endif
|
||||
@ -2373,16 +2373,16 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
|
||||
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
|
||||
return 0;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("================================\n");
|
||||
pr_warn("WARNING: inconsistent lock state\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("--------------------------------\n");
|
||||
|
||||
printk("inconsistent {%s} -> {%s} usage.\n",
|
||||
pr_warn("inconsistent {%s} -> {%s} usage.\n",
|
||||
usage_str[prev_bit], usage_str[new_bit]);
|
||||
|
||||
printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] takes:\n",
|
||||
pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] takes:\n",
|
||||
curr->comm, task_pid_nr(curr),
|
||||
trace_hardirq_context(curr), hardirq_count() >> HARDIRQ_SHIFT,
|
||||
trace_softirq_context(curr), softirq_count() >> SOFTIRQ_SHIFT,
|
||||
@ -2390,16 +2390,16 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
|
||||
trace_softirqs_enabled(curr));
|
||||
print_lock(this);
|
||||
|
||||
printk("{%s} state was registered at:\n", usage_str[prev_bit]);
|
||||
pr_warn("{%s} state was registered at:\n", usage_str[prev_bit]);
|
||||
print_stack_trace(hlock_class(this)->usage_traces + prev_bit, 1);
|
||||
|
||||
print_irqtrace_events(curr);
|
||||
printk("\nother info that might help us debug this:\n");
|
||||
pr_warn("\nother info that might help us debug this:\n");
|
||||
print_usage_bug_scenario(this);
|
||||
|
||||
lockdep_print_held_locks(curr);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
|
||||
return 0;
|
||||
@ -2438,28 +2438,28 @@ print_irq_inversion_bug(struct task_struct *curr,
|
||||
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
|
||||
return 0;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("========================================================\n");
|
||||
pr_warn("WARNING: possible irq lock inversion dependency detected\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("--------------------------------------------------------\n");
|
||||
printk("%s/%d just changed the state of lock:\n",
|
||||
pr_warn("%s/%d just changed the state of lock:\n",
|
||||
curr->comm, task_pid_nr(curr));
|
||||
print_lock(this);
|
||||
if (forwards)
|
||||
printk("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
|
||||
pr_warn("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
|
||||
else
|
||||
printk("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
|
||||
pr_warn("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
|
||||
print_lock_name(other->class);
|
||||
printk("\n\nand interrupts could create inverse lock ordering between them.\n\n");
|
||||
pr_warn("\n\nand interrupts could create inverse lock ordering between them.\n\n");
|
||||
|
||||
printk("\nother info that might help us debug this:\n");
|
||||
pr_warn("\nother info that might help us debug this:\n");
|
||||
|
||||
/* Find a middle lock (if one exists) */
|
||||
depth = get_lock_depth(other);
|
||||
do {
|
||||
if (depth == 0 && (entry != root)) {
|
||||
printk("lockdep:%s bad path found in chain graph\n", __func__);
|
||||
pr_warn("lockdep:%s bad path found in chain graph\n", __func__);
|
||||
break;
|
||||
}
|
||||
middle = entry;
|
||||
@ -2475,12 +2475,12 @@ print_irq_inversion_bug(struct task_struct *curr,
|
||||
|
||||
lockdep_print_held_locks(curr);
|
||||
|
||||
printk("\nthe shortest dependencies between 2nd lock and 1st lock:\n");
|
||||
pr_warn("\nthe shortest dependencies between 2nd lock and 1st lock:\n");
|
||||
if (!save_trace(&root->trace))
|
||||
return 0;
|
||||
print_shortest_lock_dependencies(other, root);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
|
||||
return 0;
|
||||
@ -3189,25 +3189,25 @@ print_lock_nested_lock_not_held(struct task_struct *curr,
|
||||
if (debug_locks_silent)
|
||||
return 0;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("==================================\n");
|
||||
pr_warn("WARNING: Nested lock was not taken\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("----------------------------------\n");
|
||||
|
||||
printk("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
|
||||
pr_warn("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
|
||||
print_lock(hlock);
|
||||
|
||||
printk("\nbut this task is not holding:\n");
|
||||
printk("%s\n", hlock->nest_lock->name);
|
||||
pr_warn("\nbut this task is not holding:\n");
|
||||
pr_warn("%s\n", hlock->nest_lock->name);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
|
||||
printk("\nother info that might help us debug this:\n");
|
||||
pr_warn("\nother info that might help us debug this:\n");
|
||||
lockdep_print_held_locks(curr);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
|
||||
return 0;
|
||||
@ -3402,21 +3402,21 @@ print_unlock_imbalance_bug(struct task_struct *curr, struct lockdep_map *lock,
|
||||
if (debug_locks_silent)
|
||||
return 0;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("=====================================\n");
|
||||
pr_warn("WARNING: bad unlock balance detected!\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("-------------------------------------\n");
|
||||
printk("%s/%d is trying to release lock (",
|
||||
pr_warn("%s/%d is trying to release lock (",
|
||||
curr->comm, task_pid_nr(curr));
|
||||
print_lockdep_cache(lock);
|
||||
printk(KERN_CONT ") at:\n");
|
||||
pr_cont(") at:\n");
|
||||
print_ip_sym(ip);
|
||||
printk("but there are no more locks to release!\n");
|
||||
printk("\nother info that might help us debug this:\n");
|
||||
pr_warn("but there are no more locks to release!\n");
|
||||
pr_warn("\nother info that might help us debug this:\n");
|
||||
lockdep_print_held_locks(curr);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
|
||||
return 0;
|
||||
@ -3974,21 +3974,21 @@ print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock,
|
||||
if (debug_locks_silent)
|
||||
return 0;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("=================================\n");
|
||||
pr_warn("WARNING: bad contention detected!\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("---------------------------------\n");
|
||||
printk("%s/%d is trying to contend lock (",
|
||||
pr_warn("%s/%d is trying to contend lock (",
|
||||
curr->comm, task_pid_nr(curr));
|
||||
print_lockdep_cache(lock);
|
||||
printk(KERN_CONT ") at:\n");
|
||||
pr_cont(") at:\n");
|
||||
print_ip_sym(ip);
|
||||
printk("but there are no locks held!\n");
|
||||
printk("\nother info that might help us debug this:\n");
|
||||
pr_warn("but there are no locks held!\n");
|
||||
pr_warn("\nother info that might help us debug this:\n");
|
||||
lockdep_print_held_locks(curr);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
|
||||
return 0;
|
||||
@ -4318,17 +4318,17 @@ print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
|
||||
if (debug_locks_silent)
|
||||
return;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("=========================\n");
|
||||
pr_warn("WARNING: held lock freed!\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("-------------------------\n");
|
||||
printk("%s/%d is freeing memory %p-%p, with a lock still held there!\n",
|
||||
pr_warn("%s/%d is freeing memory %p-%p, with a lock still held there!\n",
|
||||
curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
|
||||
print_lock(hlock);
|
||||
lockdep_print_held_locks(curr);
|
||||
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
}
|
||||
|
||||
@ -4376,14 +4376,14 @@ static void print_held_locks_bug(void)
|
||||
if (debug_locks_silent)
|
||||
return;
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("====================================\n");
|
||||
pr_warn("WARNING: %s/%d still has locks held!\n",
|
||||
current->comm, task_pid_nr(current));
|
||||
print_kernel_ident();
|
||||
pr_warn("------------------------------------\n");
|
||||
lockdep_print_held_locks(current);
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
}
|
||||
|
||||
@ -4402,10 +4402,10 @@ void debug_show_all_locks(void)
|
||||
int unlock = 1;
|
||||
|
||||
if (unlikely(!debug_locks)) {
|
||||
printk("INFO: lockdep is turned off.\n");
|
||||
pr_warn("INFO: lockdep is turned off.\n");
|
||||
return;
|
||||
}
|
||||
printk("\nShowing all locks held in the system:\n");
|
||||
pr_warn("\nShowing all locks held in the system:\n");
|
||||
|
||||
/*
|
||||
* Here we try to get the tasklist_lock as hard as possible,
|
||||
@ -4416,18 +4416,18 @@ void debug_show_all_locks(void)
|
||||
retry:
|
||||
if (!read_trylock(&tasklist_lock)) {
|
||||
if (count == 10)
|
||||
printk("hm, tasklist_lock locked, retrying... ");
|
||||
pr_warn("hm, tasklist_lock locked, retrying... ");
|
||||
if (count) {
|
||||
count--;
|
||||
printk(" #%d", 10-count);
|
||||
pr_cont(" #%d", 10-count);
|
||||
mdelay(200);
|
||||
goto retry;
|
||||
}
|
||||
printk(" ignoring it.\n");
|
||||
pr_cont(" ignoring it.\n");
|
||||
unlock = 0;
|
||||
} else {
|
||||
if (count != 10)
|
||||
printk(KERN_CONT " locked it.\n");
|
||||
pr_cont(" locked it.\n");
|
||||
}
|
||||
|
||||
do_each_thread(g, p) {
|
||||
@ -4445,7 +4445,7 @@ retry:
|
||||
unlock = 1;
|
||||
} while_each_thread(g, p);
|
||||
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("=============================================\n\n");
|
||||
|
||||
if (unlock)
|
||||
@ -4475,12 +4475,12 @@ asmlinkage __visible void lockdep_sys_exit(void)
|
||||
if (unlikely(curr->lockdep_depth)) {
|
||||
if (!debug_locks_off())
|
||||
return;
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("================================================\n");
|
||||
pr_warn("WARNING: lock held when returning to user space!\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("------------------------------------------------\n");
|
||||
printk("%s/%d is leaving the kernel with locks still held!\n",
|
||||
pr_warn("%s/%d is leaving the kernel with locks still held!\n",
|
||||
curr->comm, curr->pid);
|
||||
lockdep_print_held_locks(curr);
|
||||
}
|
||||
@ -4490,19 +4490,15 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
|
||||
{
|
||||
struct task_struct *curr = current;
|
||||
|
||||
#ifndef CONFIG_PROVE_RCU_REPEATEDLY
|
||||
if (!debug_locks_off())
|
||||
return;
|
||||
#endif /* #ifdef CONFIG_PROVE_RCU_REPEATEDLY */
|
||||
/* Note: the following can be executed concurrently, so be careful. */
|
||||
printk("\n");
|
||||
pr_warn("\n");
|
||||
pr_warn("=============================\n");
|
||||
pr_warn("WARNING: suspicious RCU usage\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("-----------------------------\n");
|
||||
printk("%s:%d %s!\n", file, line, s);
|
||||
printk("\nother info that might help us debug this:\n\n");
|
||||
printk("\n%srcu_scheduler_active = %d, debug_locks = %d\n",
|
||||
pr_warn("%s:%d %s!\n", file, line, s);
|
||||
pr_warn("\nother info that might help us debug this:\n\n");
|
||||
pr_warn("\n%srcu_scheduler_active = %d, debug_locks = %d\n",
|
||||
!rcu_lockdep_current_cpu_online()
|
||||
? "RCU used illegally from offline CPU!\n"
|
||||
: !rcu_is_watching()
|
||||
@ -4529,10 +4525,10 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
|
||||
* rcu_read_lock_bh() and so on from extended quiescent states.
|
||||
*/
|
||||
if (!rcu_is_watching())
|
||||
printk("RCU used illegally from extended quiescent state!\n");
|
||||
pr_warn("RCU used illegally from extended quiescent state!\n");
|
||||
|
||||
lockdep_print_held_locks(curr);
|
||||
printk("\nstack backtrace:\n");
|
||||
pr_warn("\nstack backtrace:\n");
|
||||
dump_stack();
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
|
||||
|
242
kernel/rcu/Kconfig
Normal file
242
kernel/rcu/Kconfig
Normal file
@ -0,0 +1,242 @@
|
||||
#
|
||||
# RCU-related configuration options
|
||||
#
|
||||
|
||||
menu "RCU Subsystem"
|
||||
|
||||
config TREE_RCU
|
||||
bool
|
||||
default y if !PREEMPT && SMP
|
||||
help
|
||||
This option selects the RCU implementation that is
|
||||
designed for very large SMP system with hundreds or
|
||||
thousands of CPUs. It also scales down nicely to
|
||||
smaller systems.
|
||||
|
||||
config PREEMPT_RCU
|
||||
bool
|
||||
default y if PREEMPT
|
||||
help
|
||||
This option selects the RCU implementation that is
|
||||
designed for very large SMP systems with hundreds or
|
||||
thousands of CPUs, but for which real-time response
|
||||
is also required. It also scales down nicely to
|
||||
smaller systems.
|
||||
|
||||
Select this option if you are unsure.
|
||||
|
||||
config TINY_RCU
|
||||
bool
|
||||
default y if !PREEMPT && !SMP
|
||||
help
|
||||
This option selects the RCU implementation that is
|
||||
designed for UP systems from which real-time response
|
||||
is not required. This option greatly reduces the
|
||||
memory footprint of RCU.
|
||||
|
||||
config RCU_EXPERT
|
||||
bool "Make expert-level adjustments to RCU configuration"
|
||||
default n
|
||||
help
|
||||
This option needs to be enabled if you wish to make
|
||||
expert-level adjustments to RCU configuration. By default,
|
||||
no such adjustments can be made, which has the often-beneficial
|
||||
side-effect of preventing "make oldconfig" from asking you all
|
||||
sorts of detailed questions about how you would like numerous
|
||||
obscure RCU options to be set up.
|
||||
|
||||
Say Y if you need to make expert-level adjustments to RCU.
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config SRCU
|
||||
bool
|
||||
help
|
||||
This option selects the sleepable version of RCU. This version
|
||||
permits arbitrary sleeping or blocking within RCU read-side critical
|
||||
sections.
|
||||
|
||||
config TINY_SRCU
|
||||
bool
|
||||
default y if SRCU && TINY_RCU
|
||||
help
|
||||
This option selects the single-CPU non-preemptible version of SRCU.
|
||||
|
||||
config TREE_SRCU
|
||||
bool
|
||||
default y if SRCU && !TINY_RCU
|
||||
help
|
||||
This option selects the full-fledged version of SRCU.
|
||||
|
||||
config TASKS_RCU
|
||||
bool
|
||||
default n
|
||||
select SRCU
|
||||
help
|
||||
This option enables a task-based RCU implementation that uses
|
||||
only voluntary context switch (not preemption!), idle, and
|
||||
user-mode execution as quiescent states.
|
||||
|
||||
config RCU_STALL_COMMON
|
||||
def_bool ( TREE_RCU || PREEMPT_RCU )
|
||||
help
|
||||
This option enables RCU CPU stall code that is common between
|
||||
the TINY and TREE variants of RCU. The purpose is to allow
|
||||
the tiny variants to disable RCU CPU stall warnings, while
|
||||
making these warnings mandatory for the tree variants.
|
||||
|
||||
config RCU_NEED_SEGCBLIST
|
||||
def_bool ( TREE_RCU || PREEMPT_RCU || TREE_SRCU )
|
||||
|
||||
config CONTEXT_TRACKING
|
||||
bool
|
||||
|
||||
config CONTEXT_TRACKING_FORCE
|
||||
bool "Force context tracking"
|
||||
depends on CONTEXT_TRACKING
|
||||
default y if !NO_HZ_FULL
|
||||
help
|
||||
The major pre-requirement for full dynticks to work is to
|
||||
support the context tracking subsystem. But there are also
|
||||
other dependencies to provide in order to make the full
|
||||
dynticks working.
|
||||
|
||||
This option stands for testing when an arch implements the
|
||||
context tracking backend but doesn't yet fullfill all the
|
||||
requirements to make the full dynticks feature working.
|
||||
Without the full dynticks, there is no way to test the support
|
||||
for context tracking and the subsystems that rely on it: RCU
|
||||
userspace extended quiescent state and tickless cputime
|
||||
accounting. This option copes with the absence of the full
|
||||
dynticks subsystem by forcing the context tracking on all
|
||||
CPUs in the system.
|
||||
|
||||
Say Y only if you're working on the development of an
|
||||
architecture backend for the context tracking.
|
||||
|
||||
Say N otherwise, this option brings an overhead that you
|
||||
don't want in production.
|
||||
|
||||
|
||||
config RCU_FANOUT
|
||||
int "Tree-based hierarchical RCU fanout value"
|
||||
range 2 64 if 64BIT
|
||||
range 2 32 if !64BIT
|
||||
depends on (TREE_RCU || PREEMPT_RCU) && RCU_EXPERT
|
||||
default 64 if 64BIT
|
||||
default 32 if !64BIT
|
||||
help
|
||||
This option controls the fanout of hierarchical implementations
|
||||
of RCU, allowing RCU to work efficiently on machines with
|
||||
large numbers of CPUs. This value must be at least the fourth
|
||||
root of NR_CPUS, which allows NR_CPUS to be insanely large.
|
||||
The default value of RCU_FANOUT should be used for production
|
||||
systems, but if you are stress-testing the RCU implementation
|
||||
itself, small RCU_FANOUT values allow you to test large-system
|
||||
code paths on small(er) systems.
|
||||
|
||||
Select a specific number if testing RCU itself.
|
||||
Take the default if unsure.
|
||||
|
||||
config RCU_FANOUT_LEAF
|
||||
int "Tree-based hierarchical RCU leaf-level fanout value"
|
||||
range 2 64 if 64BIT
|
||||
range 2 32 if !64BIT
|
||||
depends on (TREE_RCU || PREEMPT_RCU) && RCU_EXPERT
|
||||
default 16
|
||||
help
|
||||
This option controls the leaf-level fanout of hierarchical
|
||||
implementations of RCU, and allows trading off cache misses
|
||||
against lock contention. Systems that synchronize their
|
||||
scheduling-clock interrupts for energy-efficiency reasons will
|
||||
want the default because the smaller leaf-level fanout keeps
|
||||
lock contention levels acceptably low. Very large systems
|
||||
(hundreds or thousands of CPUs) will instead want to set this
|
||||
value to the maximum value possible in order to reduce the
|
||||
number of cache misses incurred during RCU's grace-period
|
||||
initialization. These systems tend to run CPU-bound, and thus
|
||||
are not helped by synchronized interrupts, and thus tend to
|
||||
skew them, which reduces lock contention enough that large
|
||||
leaf-level fanouts work well. That said, setting leaf-level
|
||||
fanout to a large number will likely cause problematic
|
||||
lock contention on the leaf-level rcu_node structures unless
|
||||
you boot with the skew_tick kernel parameter.
|
||||
|
||||
Select a specific number if testing RCU itself.
|
||||
|
||||
Select the maximum permissible value for large systems, but
|
||||
please understand that you may also need to set the skew_tick
|
||||
kernel boot parameter to avoid contention on the rcu_node
|
||||
structure's locks.
|
||||
|
||||
Take the default if unsure.
|
||||
|
||||
config RCU_FAST_NO_HZ
|
||||
bool "Accelerate last non-dyntick-idle CPU's grace periods"
|
||||
depends on NO_HZ_COMMON && SMP && RCU_EXPERT
|
||||
default n
|
||||
help
|
||||
This option permits CPUs to enter dynticks-idle state even if
|
||||
they have RCU callbacks queued, and prevents RCU from waking
|
||||
these CPUs up more than roughly once every four jiffies (by
|
||||
default, you can adjust this using the rcutree.rcu_idle_gp_delay
|
||||
parameter), thus improving energy efficiency. On the other
|
||||
hand, this option increases the duration of RCU grace periods,
|
||||
for example, slowing down synchronize_rcu().
|
||||
|
||||
Say Y if energy efficiency is critically important, and you
|
||||
don't care about increased grace-period durations.
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config RCU_BOOST
|
||||
bool "Enable RCU priority boosting"
|
||||
depends on RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT
|
||||
default n
|
||||
help
|
||||
This option boosts the priority of preempted RCU readers that
|
||||
block the current preemptible RCU grace period for too long.
|
||||
This option also prevents heavy loads from blocking RCU
|
||||
callback invocation for all flavors of RCU.
|
||||
|
||||
Say Y here if you are working with real-time apps or heavy loads
|
||||
Say N here if you are unsure.
|
||||
|
||||
config RCU_BOOST_DELAY
|
||||
int "Milliseconds to delay boosting after RCU grace-period start"
|
||||
range 0 3000
|
||||
depends on RCU_BOOST
|
||||
default 500
|
||||
help
|
||||
This option specifies the time to wait after the beginning of
|
||||
a given grace period before priority-boosting preempted RCU
|
||||
readers blocking that grace period. Note that any RCU reader
|
||||
blocking an expedited RCU grace period is boosted immediately.
|
||||
|
||||
Accept the default if unsure.
|
||||
|
||||
config RCU_NOCB_CPU
|
||||
bool "Offload RCU callback processing from boot-selected CPUs"
|
||||
depends on TREE_RCU || PREEMPT_RCU
|
||||
depends on RCU_EXPERT || NO_HZ_FULL
|
||||
default n
|
||||
help
|
||||
Use this option to reduce OS jitter for aggressive HPC or
|
||||
real-time workloads. It can also be used to offload RCU
|
||||
callback invocation to energy-efficient CPUs in battery-powered
|
||||
asymmetric multiprocessors.
|
||||
|
||||
This option offloads callback invocation from the set of
|
||||
CPUs specified at boot time by the rcu_nocbs parameter.
|
||||
For each such CPU, a kthread ("rcuox/N") will be created to
|
||||
invoke callbacks, where the "N" is the CPU being offloaded,
|
||||
and where the "x" is "b" for RCU-bh, "p" for RCU-preempt, and
|
||||
"s" for RCU-sched. Nothing prevents this kthread from running
|
||||
on the specified CPUs, but (1) the kthreads may be preempted
|
||||
between each callback, and (2) affinity or cgroups can be used
|
||||
to force the kthreads to run on whatever set of CPUs is desired.
|
||||
|
||||
Say Y here if you want to help to debug reduced OS jitter.
|
||||
Say N here if you are unsure.
|
||||
|
||||
endmenu # "RCU Subsystem"
|
82
kernel/rcu/Kconfig.debug
Normal file
82
kernel/rcu/Kconfig.debug
Normal file
@ -0,0 +1,82 @@
|
||||
#
|
||||
# RCU-related debugging configuration options
|
||||
#
|
||||
|
||||
menu "RCU Debugging"
|
||||
|
||||
config PROVE_RCU
|
||||
def_bool PROVE_LOCKING
|
||||
|
||||
config TORTURE_TEST
|
||||
tristate
|
||||
default n
|
||||
|
||||
config RCU_PERF_TEST
|
||||
tristate "performance tests for RCU"
|
||||
depends on DEBUG_KERNEL
|
||||
select TORTURE_TEST
|
||||
select SRCU
|
||||
select TASKS_RCU
|
||||
default n
|
||||
help
|
||||
This option provides a kernel module that runs performance
|
||||
tests on the RCU infrastructure. The kernel module may be built
|
||||
after the fact on the running kernel to be tested, if desired.
|
||||
|
||||
Say Y here if you want RCU performance tests to be built into
|
||||
the kernel.
|
||||
Say M if you want the RCU performance tests to build as a module.
|
||||
Say N if you are unsure.
|
||||
|
||||
config RCU_TORTURE_TEST
|
||||
tristate "torture tests for RCU"
|
||||
depends on DEBUG_KERNEL
|
||||
select TORTURE_TEST
|
||||
select SRCU
|
||||
select TASKS_RCU
|
||||
default n
|
||||
help
|
||||
This option provides a kernel module that runs torture tests
|
||||
on the RCU infrastructure. The kernel module may be built
|
||||
after the fact on the running kernel to be tested, if desired.
|
||||
|
||||
Say Y here if you want RCU torture tests to be built into
|
||||
the kernel.
|
||||
Say M if you want the RCU torture tests to build as a module.
|
||||
Say N if you are unsure.
|
||||
|
||||
config RCU_CPU_STALL_TIMEOUT
|
||||
int "RCU CPU stall timeout in seconds"
|
||||
depends on RCU_STALL_COMMON
|
||||
range 3 300
|
||||
default 21
|
||||
help
|
||||
If a given RCU grace period extends more than the specified
|
||||
number of seconds, a CPU stall warning is printed. If the
|
||||
RCU grace period persists, additional CPU stall warnings are
|
||||
printed at more widely spaced intervals.
|
||||
|
||||
config RCU_TRACE
|
||||
bool "Enable tracing for RCU"
|
||||
depends on DEBUG_KERNEL
|
||||
default y if TREE_RCU
|
||||
select TRACE_CLOCK
|
||||
help
|
||||
This option enables additional tracepoints for ftrace-style
|
||||
event tracing.
|
||||
|
||||
Say Y here if you want to enable RCU tracing
|
||||
Say N if you are unsure.
|
||||
|
||||
config RCU_EQS_DEBUG
|
||||
bool "Provide debugging asserts for adding NO_HZ support to an arch"
|
||||
depends on DEBUG_KERNEL
|
||||
help
|
||||
This option provides consistency checks in RCU's handling of
|
||||
NO_HZ. These checks have proven quite helpful in detecting
|
||||
bugs in arch-specific NO_HZ code.
|
||||
|
||||
Say N here if you need ultimate kernel/user switch latencies
|
||||
Say Y if you are unsure
|
||||
|
||||
endmenu # "RCU Debugging"
|
@ -3,13 +3,11 @@
|
||||
KCOV_INSTRUMENT := n
|
||||
|
||||
obj-y += update.o sync.o
|
||||
obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
|
||||
obj-$(CONFIG_TREE_SRCU) += srcutree.o
|
||||
obj-$(CONFIG_TINY_SRCU) += srcutiny.o
|
||||
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
|
||||
obj-$(CONFIG_RCU_PERF_TEST) += rcuperf.o
|
||||
obj-$(CONFIG_TREE_RCU) += tree.o
|
||||
obj-$(CONFIG_PREEMPT_RCU) += tree.o
|
||||
obj-$(CONFIG_TREE_RCU_TRACE) += tree_trace.o
|
||||
obj-$(CONFIG_TINY_RCU) += tiny.o
|
||||
obj-$(CONFIG_RCU_NEED_SEGCBLIST) += rcu_segcblist.o
|
||||
|
277
kernel/rcu/rcu.h
277
kernel/rcu/rcu.h
@ -212,6 +212,18 @@ int rcu_jiffies_till_stall_check(void);
|
||||
*/
|
||||
#define TPS(x) tracepoint_string(x)
|
||||
|
||||
/*
|
||||
* Dump the ftrace buffer, but only one time per callsite per boot.
|
||||
*/
|
||||
#define rcu_ftrace_dump(oops_dump_mode) \
|
||||
do { \
|
||||
static atomic_t ___rfd_beenhere = ATOMIC_INIT(0); \
|
||||
\
|
||||
if (!atomic_read(&___rfd_beenhere) && \
|
||||
!atomic_xchg(&___rfd_beenhere, 1)) \
|
||||
ftrace_dump(oops_dump_mode); \
|
||||
} while (0)
|
||||
|
||||
void rcu_early_boot_tests(void);
|
||||
void rcu_test_sync_prims(void);
|
||||
|
||||
@ -291,6 +303,271 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt)
|
||||
cpu <= rnp->grphi; \
|
||||
cpu = cpumask_next((cpu), cpu_possible_mask))
|
||||
|
||||
/*
|
||||
* Wrappers for the rcu_node::lock acquire and release.
|
||||
*
|
||||
* Because the rcu_nodes form a tree, the tree traversal locking will observe
|
||||
* different lock values, this in turn means that an UNLOCK of one level
|
||||
* followed by a LOCK of another level does not imply a full memory barrier;
|
||||
* and most importantly transitivity is lost.
|
||||
*
|
||||
* In order to restore full ordering between tree levels, augment the regular
|
||||
* lock acquire functions with smp_mb__after_unlock_lock().
|
||||
*
|
||||
* As ->lock of struct rcu_node is a __private field, therefore one should use
|
||||
* these wrappers rather than directly call raw_spin_{lock,unlock}* on ->lock.
|
||||
*/
|
||||
#define raw_spin_lock_rcu_node(p) \
|
||||
do { \
|
||||
raw_spin_lock(&ACCESS_PRIVATE(p, lock)); \
|
||||
smp_mb__after_unlock_lock(); \
|
||||
} while (0)
|
||||
|
||||
#define raw_spin_unlock_rcu_node(p) raw_spin_unlock(&ACCESS_PRIVATE(p, lock))
|
||||
|
||||
#define raw_spin_lock_irq_rcu_node(p) \
|
||||
do { \
|
||||
raw_spin_lock_irq(&ACCESS_PRIVATE(p, lock)); \
|
||||
smp_mb__after_unlock_lock(); \
|
||||
} while (0)
|
||||
|
||||
#define raw_spin_unlock_irq_rcu_node(p) \
|
||||
raw_spin_unlock_irq(&ACCESS_PRIVATE(p, lock))
|
||||
|
||||
#define raw_spin_lock_irqsave_rcu_node(p, flags) \
|
||||
do { \
|
||||
raw_spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
|
||||
smp_mb__after_unlock_lock(); \
|
||||
} while (0)
|
||||
|
||||
#define raw_spin_unlock_irqrestore_rcu_node(p, flags) \
|
||||
raw_spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags) \
|
||||
|
||||
#define raw_spin_trylock_rcu_node(p) \
|
||||
({ \
|
||||
bool ___locked = raw_spin_trylock(&ACCESS_PRIVATE(p, lock)); \
|
||||
\
|
||||
if (___locked) \
|
||||
smp_mb__after_unlock_lock(); \
|
||||
___locked; \
|
||||
})
|
||||
|
||||
#endif /* #if defined(SRCU) || !defined(TINY_RCU) */
|
||||
|
||||
#ifdef CONFIG_TINY_RCU
|
||||
/* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
|
||||
static inline bool rcu_gp_is_normal(void) /* Internal RCU use. */
|
||||
{
|
||||
return true;
|
||||
}
|
||||
static inline bool rcu_gp_is_expedited(void) /* Internal RCU use. */
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline void rcu_expedite_gp(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_unexpedite_gp(void)
|
||||
{
|
||||
}
|
||||
#else /* #ifdef CONFIG_TINY_RCU */
|
||||
bool rcu_gp_is_normal(void); /* Internal RCU use. */
|
||||
bool rcu_gp_is_expedited(void); /* Internal RCU use. */
|
||||
void rcu_expedite_gp(void);
|
||||
void rcu_unexpedite_gp(void);
|
||||
void rcupdate_announce_bootup_oddness(void);
|
||||
#endif /* #else #ifdef CONFIG_TINY_RCU */
|
||||
|
||||
#define RCU_SCHEDULER_INACTIVE 0
|
||||
#define RCU_SCHEDULER_INIT 1
|
||||
#define RCU_SCHEDULER_RUNNING 2
|
||||
|
||||
#ifdef CONFIG_TINY_RCU
|
||||
static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
|
||||
#else /* #ifdef CONFIG_TINY_RCU */
|
||||
void rcu_request_urgent_qs_task(struct task_struct *t);
|
||||
#endif /* #else #ifdef CONFIG_TINY_RCU */
|
||||
|
||||
enum rcutorture_type {
|
||||
RCU_FLAVOR,
|
||||
RCU_BH_FLAVOR,
|
||||
RCU_SCHED_FLAVOR,
|
||||
RCU_TASKS_FLAVOR,
|
||||
SRCU_FLAVOR,
|
||||
INVALID_RCU_FLAVOR
|
||||
};
|
||||
|
||||
#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
|
||||
void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
|
||||
unsigned long *gpnum, unsigned long *completed);
|
||||
void rcutorture_record_test_transition(void);
|
||||
void rcutorture_record_progress(unsigned long vernum);
|
||||
void do_trace_rcu_torture_read(const char *rcutorturename,
|
||||
struct rcu_head *rhp,
|
||||
unsigned long secs,
|
||||
unsigned long c_old,
|
||||
unsigned long c);
|
||||
#else
|
||||
static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
int *flags,
|
||||
unsigned long *gpnum,
|
||||
unsigned long *completed)
|
||||
{
|
||||
*flags = 0;
|
||||
*gpnum = 0;
|
||||
*completed = 0;
|
||||
}
|
||||
static inline void rcutorture_record_test_transition(void)
|
||||
{
|
||||
}
|
||||
static inline void rcutorture_record_progress(unsigned long vernum)
|
||||
{
|
||||
}
|
||||
#ifdef CONFIG_RCU_TRACE
|
||||
void do_trace_rcu_torture_read(const char *rcutorturename,
|
||||
struct rcu_head *rhp,
|
||||
unsigned long secs,
|
||||
unsigned long c_old,
|
||||
unsigned long c);
|
||||
#else
|
||||
#define do_trace_rcu_torture_read(rcutorturename, rhp, secs, c_old, c) \
|
||||
do { } while (0)
|
||||
#endif
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_TINY_SRCU
|
||||
|
||||
static inline void srcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
struct srcu_struct *sp, int *flags,
|
||||
unsigned long *gpnum,
|
||||
unsigned long *completed)
|
||||
{
|
||||
if (test_type != SRCU_FLAVOR)
|
||||
return;
|
||||
*flags = 0;
|
||||
*completed = sp->srcu_idx;
|
||||
*gpnum = *completed;
|
||||
}
|
||||
|
||||
#elif defined(CONFIG_TREE_SRCU)
|
||||
|
||||
void srcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
struct srcu_struct *sp, int *flags,
|
||||
unsigned long *gpnum, unsigned long *completed);
|
||||
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_TINY_RCU
|
||||
|
||||
/*
|
||||
* Return the number of grace periods started.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_started(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of bottom-half grace periods started.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_started_bh(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of sched grace periods started.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_started_sched(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_completed(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of bottom-half grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_completed_bh(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of sched grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_batches_completed_sched(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of expedited grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_exp_batches_completed(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return the number of expedited sched grace periods completed.
|
||||
*/
|
||||
static inline unsigned long rcu_exp_batches_completed_sched(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline unsigned long srcu_batches_completed(struct srcu_struct *sp)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline void rcu_force_quiescent_state(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_bh_force_quiescent_state(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rcu_sched_force_quiescent_state(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void show_rcu_gp_kthreads(void)
|
||||
{
|
||||
}
|
||||
|
||||
#else /* #ifdef CONFIG_TINY_RCU */
|
||||
extern unsigned long rcutorture_testseq;
|
||||
extern unsigned long rcutorture_vernum;
|
||||
unsigned long rcu_batches_started(void);
|
||||
unsigned long rcu_batches_started_bh(void);
|
||||
unsigned long rcu_batches_started_sched(void);
|
||||
unsigned long rcu_batches_completed(void);
|
||||
unsigned long rcu_batches_completed_bh(void);
|
||||
unsigned long rcu_batches_completed_sched(void);
|
||||
unsigned long rcu_exp_batches_completed(void);
|
||||
unsigned long rcu_exp_batches_completed_sched(void);
|
||||
unsigned long srcu_batches_completed(struct srcu_struct *sp);
|
||||
void show_rcu_gp_kthreads(void);
|
||||
void rcu_force_quiescent_state(void);
|
||||
void rcu_bh_force_quiescent_state(void);
|
||||
void rcu_sched_force_quiescent_state(void);
|
||||
#endif /* #else #ifdef CONFIG_TINY_RCU */
|
||||
|
||||
#ifdef CONFIG_RCU_NOCB_CPU
|
||||
bool rcu_is_nocb_cpu(int cpu);
|
||||
#else
|
||||
static inline bool rcu_is_nocb_cpu(int cpu) { return false; }
|
||||
#endif
|
||||
|
||||
#endif /* __LINUX_RCU_H */
|
||||
|
@ -48,6 +48,8 @@
|
||||
#include <linux/torture.h>
|
||||
#include <linux/vmalloc.h>
|
||||
|
||||
#include "rcu.h"
|
||||
|
||||
MODULE_LICENSE("GPL");
|
||||
MODULE_AUTHOR("Paul E. McKenney <paulmck@linux.vnet.ibm.com>");
|
||||
|
||||
@ -59,12 +61,16 @@ MODULE_AUTHOR("Paul E. McKenney <paulmck@linux.vnet.ibm.com>");
|
||||
#define VERBOSE_PERFOUT_ERRSTRING(s) \
|
||||
do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
|
||||
|
||||
torture_param(bool, gp_async, false, "Use asynchronous GP wait primitives");
|
||||
torture_param(int, gp_async_max, 1000, "Max # outstanding waits per reader");
|
||||
torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
|
||||
torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
|
||||
torture_param(int, nreaders, -1, "Number of RCU reader threads");
|
||||
torture_param(int, nreaders, 0, "Number of RCU reader threads");
|
||||
torture_param(int, nwriters, -1, "Number of RCU updater threads");
|
||||
torture_param(bool, shutdown, false, "Shutdown at end of performance tests.");
|
||||
torture_param(bool, shutdown, !IS_ENABLED(MODULE),
|
||||
"Shutdown at end of performance tests.");
|
||||
torture_param(bool, verbose, true, "Enable verbose debugging printk()s");
|
||||
torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable");
|
||||
|
||||
static char *perf_type = "rcu";
|
||||
module_param(perf_type, charp, 0444);
|
||||
@ -86,13 +92,16 @@ static u64 t_rcu_perf_writer_started;
|
||||
static u64 t_rcu_perf_writer_finished;
|
||||
static unsigned long b_rcu_perf_writer_started;
|
||||
static unsigned long b_rcu_perf_writer_finished;
|
||||
static DEFINE_PER_CPU(atomic_t, n_async_inflight);
|
||||
|
||||
static int rcu_perf_writer_state;
|
||||
#define RTWS_INIT 0
|
||||
#define RTWS_EXP_SYNC 1
|
||||
#define RTWS_SYNC 2
|
||||
#define RTWS_IDLE 2
|
||||
#define RTWS_STOPPING 3
|
||||
#define RTWS_ASYNC 1
|
||||
#define RTWS_BARRIER 2
|
||||
#define RTWS_EXP_SYNC 3
|
||||
#define RTWS_SYNC 4
|
||||
#define RTWS_IDLE 5
|
||||
#define RTWS_STOPPING 6
|
||||
|
||||
#define MAX_MEAS 10000
|
||||
#define MIN_MEAS 100
|
||||
@ -114,6 +123,8 @@ struct rcu_perf_ops {
|
||||
unsigned long (*started)(void);
|
||||
unsigned long (*completed)(void);
|
||||
unsigned long (*exp_completed)(void);
|
||||
void (*async)(struct rcu_head *head, rcu_callback_t func);
|
||||
void (*gp_barrier)(void);
|
||||
void (*sync)(void);
|
||||
void (*exp_sync)(void);
|
||||
const char *name;
|
||||
@ -153,6 +164,8 @@ static struct rcu_perf_ops rcu_ops = {
|
||||
.started = rcu_batches_started,
|
||||
.completed = rcu_batches_completed,
|
||||
.exp_completed = rcu_exp_batches_completed,
|
||||
.async = call_rcu,
|
||||
.gp_barrier = rcu_barrier,
|
||||
.sync = synchronize_rcu,
|
||||
.exp_sync = synchronize_rcu_expedited,
|
||||
.name = "rcu"
|
||||
@ -181,6 +194,8 @@ static struct rcu_perf_ops rcu_bh_ops = {
|
||||
.started = rcu_batches_started_bh,
|
||||
.completed = rcu_batches_completed_bh,
|
||||
.exp_completed = rcu_exp_batches_completed_sched,
|
||||
.async = call_rcu_bh,
|
||||
.gp_barrier = rcu_barrier_bh,
|
||||
.sync = synchronize_rcu_bh,
|
||||
.exp_sync = synchronize_rcu_bh_expedited,
|
||||
.name = "rcu_bh"
|
||||
@ -208,6 +223,16 @@ static unsigned long srcu_perf_completed(void)
|
||||
return srcu_batches_completed(srcu_ctlp);
|
||||
}
|
||||
|
||||
static void srcu_call_rcu(struct rcu_head *head, rcu_callback_t func)
|
||||
{
|
||||
call_srcu(srcu_ctlp, head, func);
|
||||
}
|
||||
|
||||
static void srcu_rcu_barrier(void)
|
||||
{
|
||||
srcu_barrier(srcu_ctlp);
|
||||
}
|
||||
|
||||
static void srcu_perf_synchronize(void)
|
||||
{
|
||||
synchronize_srcu(srcu_ctlp);
|
||||
@ -226,11 +251,42 @@ static struct rcu_perf_ops srcu_ops = {
|
||||
.started = NULL,
|
||||
.completed = srcu_perf_completed,
|
||||
.exp_completed = srcu_perf_completed,
|
||||
.async = srcu_call_rcu,
|
||||
.gp_barrier = srcu_rcu_barrier,
|
||||
.sync = srcu_perf_synchronize,
|
||||
.exp_sync = srcu_perf_synchronize_expedited,
|
||||
.name = "srcu"
|
||||
};
|
||||
|
||||
static struct srcu_struct srcud;
|
||||
|
||||
static void srcu_sync_perf_init(void)
|
||||
{
|
||||
srcu_ctlp = &srcud;
|
||||
init_srcu_struct(srcu_ctlp);
|
||||
}
|
||||
|
||||
static void srcu_sync_perf_cleanup(void)
|
||||
{
|
||||
cleanup_srcu_struct(srcu_ctlp);
|
||||
}
|
||||
|
||||
static struct rcu_perf_ops srcud_ops = {
|
||||
.ptype = SRCU_FLAVOR,
|
||||
.init = srcu_sync_perf_init,
|
||||
.cleanup = srcu_sync_perf_cleanup,
|
||||
.readlock = srcu_perf_read_lock,
|
||||
.readunlock = srcu_perf_read_unlock,
|
||||
.started = NULL,
|
||||
.completed = srcu_perf_completed,
|
||||
.exp_completed = srcu_perf_completed,
|
||||
.async = srcu_call_rcu,
|
||||
.gp_barrier = srcu_rcu_barrier,
|
||||
.sync = srcu_perf_synchronize,
|
||||
.exp_sync = srcu_perf_synchronize_expedited,
|
||||
.name = "srcud"
|
||||
};
|
||||
|
||||
/*
|
||||
* Definitions for sched perf testing.
|
||||
*/
|
||||
@ -254,6 +310,8 @@ static struct rcu_perf_ops sched_ops = {
|
||||
.started = rcu_batches_started_sched,
|
||||
.completed = rcu_batches_completed_sched,
|
||||
.exp_completed = rcu_exp_batches_completed_sched,
|
||||
.async = call_rcu_sched,
|
||||
.gp_barrier = rcu_barrier_sched,
|
||||
.sync = synchronize_sched,
|
||||
.exp_sync = synchronize_sched_expedited,
|
||||
.name = "sched"
|
||||
@ -281,6 +339,8 @@ static struct rcu_perf_ops tasks_ops = {
|
||||
.readunlock = tasks_perf_read_unlock,
|
||||
.started = rcu_no_completed,
|
||||
.completed = rcu_no_completed,
|
||||
.async = call_rcu_tasks,
|
||||
.gp_barrier = rcu_barrier_tasks,
|
||||
.sync = synchronize_rcu_tasks,
|
||||
.exp_sync = synchronize_rcu_tasks,
|
||||
.name = "tasks"
|
||||
@ -343,6 +403,15 @@ rcu_perf_reader(void *arg)
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Callback function for asynchronous grace periods from rcu_perf_writer().
|
||||
*/
|
||||
static void rcu_perf_async_cb(struct rcu_head *rhp)
|
||||
{
|
||||
atomic_dec(this_cpu_ptr(&n_async_inflight));
|
||||
kfree(rhp);
|
||||
}
|
||||
|
||||
/*
|
||||
* RCU perf writer kthread. Repeatedly does a grace period.
|
||||
*/
|
||||
@ -352,6 +421,7 @@ rcu_perf_writer(void *arg)
|
||||
int i = 0;
|
||||
int i_max;
|
||||
long me = (long)arg;
|
||||
struct rcu_head *rhp = NULL;
|
||||
struct sched_param sp;
|
||||
bool started = false, done = false, alldone = false;
|
||||
u64 t;
|
||||
@ -380,9 +450,27 @@ rcu_perf_writer(void *arg)
|
||||
}
|
||||
|
||||
do {
|
||||
if (writer_holdoff)
|
||||
udelay(writer_holdoff);
|
||||
wdp = &wdpp[i];
|
||||
*wdp = ktime_get_mono_fast_ns();
|
||||
if (gp_exp) {
|
||||
if (gp_async) {
|
||||
retry:
|
||||
if (!rhp)
|
||||
rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
|
||||
if (rhp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) {
|
||||
rcu_perf_writer_state = RTWS_ASYNC;
|
||||
atomic_inc(this_cpu_ptr(&n_async_inflight));
|
||||
cur_ops->async(rhp, rcu_perf_async_cb);
|
||||
rhp = NULL;
|
||||
} else if (!kthread_should_stop()) {
|
||||
rcu_perf_writer_state = RTWS_BARRIER;
|
||||
cur_ops->gp_barrier();
|
||||
goto retry;
|
||||
} else {
|
||||
kfree(rhp); /* Because we are stopping. */
|
||||
}
|
||||
} else if (gp_exp) {
|
||||
rcu_perf_writer_state = RTWS_EXP_SYNC;
|
||||
cur_ops->exp_sync();
|
||||
} else {
|
||||
@ -429,6 +517,10 @@ rcu_perf_writer(void *arg)
|
||||
i++;
|
||||
rcu_perf_wait_shutdown();
|
||||
} while (!torture_must_stop());
|
||||
if (gp_async) {
|
||||
rcu_perf_writer_state = RTWS_BARRIER;
|
||||
cur_ops->gp_barrier();
|
||||
}
|
||||
rcu_perf_writer_state = RTWS_STOPPING;
|
||||
writer_n_durations[me] = i_max;
|
||||
torture_kthread_stopping("rcu_perf_writer");
|
||||
@ -452,6 +544,17 @@ rcu_perf_cleanup(void)
|
||||
u64 *wdp;
|
||||
u64 *wdpp;
|
||||
|
||||
/*
|
||||
* Would like warning at start, but everything is expedited
|
||||
* during the mid-boot phase, so have to wait till the end.
|
||||
*/
|
||||
if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
|
||||
VERBOSE_PERFOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
|
||||
if (rcu_gp_is_normal() && gp_exp)
|
||||
VERBOSE_PERFOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
|
||||
if (gp_exp && gp_async)
|
||||
VERBOSE_PERFOUT_ERRSTRING("No expedited async GPs, so went with async!");
|
||||
|
||||
if (torture_cleanup_begin())
|
||||
return;
|
||||
|
||||
@ -554,7 +657,7 @@ rcu_perf_init(void)
|
||||
long i;
|
||||
int firsterr = 0;
|
||||
static struct rcu_perf_ops *perf_ops[] = {
|
||||
&rcu_ops, &rcu_bh_ops, &srcu_ops, &sched_ops,
|
||||
&rcu_ops, &rcu_bh_ops, &srcu_ops, &srcud_ops, &sched_ops,
|
||||
RCUPERF_TASKS_OPS
|
||||
};
|
||||
|
||||
@ -624,16 +727,6 @@ rcu_perf_init(void)
|
||||
firsterr = -ENOMEM;
|
||||
goto unwind;
|
||||
}
|
||||
if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp) {
|
||||
VERBOSE_PERFOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
|
||||
firsterr = -EINVAL;
|
||||
goto unwind;
|
||||
}
|
||||
if (rcu_gp_is_normal() && gp_exp) {
|
||||
VERBOSE_PERFOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
|
||||
firsterr = -EINVAL;
|
||||
goto unwind;
|
||||
}
|
||||
for (i = 0; i < nrealwriters; i++) {
|
||||
writer_durations[i] =
|
||||
kcalloc(MAX_MEAS, sizeof(*writer_durations[i]),
|
||||
|
@ -52,6 +52,8 @@
|
||||
#include <linux/torture.h>
|
||||
#include <linux/vmalloc.h>
|
||||
|
||||
#include "rcu.h"
|
||||
|
||||
MODULE_LICENSE("GPL");
|
||||
MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com> and Josh Triplett <josh@joshtriplett.org>");
|
||||
|
||||
@ -562,31 +564,19 @@ static void srcu_torture_stats(void)
|
||||
int __maybe_unused cpu;
|
||||
int idx;
|
||||
|
||||
#if defined(CONFIG_TREE_SRCU) || defined(CONFIG_CLASSIC_SRCU)
|
||||
#ifdef CONFIG_TREE_SRCU
|
||||
idx = srcu_ctlp->srcu_idx & 0x1;
|
||||
#else /* #ifdef CONFIG_TREE_SRCU */
|
||||
idx = srcu_ctlp->completed & 0x1;
|
||||
#endif /* #else #ifdef CONFIG_TREE_SRCU */
|
||||
pr_alert("%s%s Tree SRCU per-CPU(idx=%d):",
|
||||
torture_type, TORTURE_FLAG, idx);
|
||||
for_each_possible_cpu(cpu) {
|
||||
unsigned long l0, l1;
|
||||
unsigned long u0, u1;
|
||||
long c0, c1;
|
||||
#ifdef CONFIG_TREE_SRCU
|
||||
struct srcu_data *counts;
|
||||
|
||||
counts = per_cpu_ptr(srcu_ctlp->sda, cpu);
|
||||
u0 = counts->srcu_unlock_count[!idx];
|
||||
u1 = counts->srcu_unlock_count[idx];
|
||||
#else /* #ifdef CONFIG_TREE_SRCU */
|
||||
struct srcu_array *counts;
|
||||
|
||||
counts = per_cpu_ptr(srcu_ctlp->per_cpu_ref, cpu);
|
||||
u0 = counts->unlock_count[!idx];
|
||||
u1 = counts->unlock_count[idx];
|
||||
#endif /* #else #ifdef CONFIG_TREE_SRCU */
|
||||
|
||||
/*
|
||||
* Make sure that a lock is always counted if the corresponding
|
||||
@ -594,13 +584,8 @@ static void srcu_torture_stats(void)
|
||||
*/
|
||||
smp_rmb();
|
||||
|
||||
#ifdef CONFIG_TREE_SRCU
|
||||
l0 = counts->srcu_lock_count[!idx];
|
||||
l1 = counts->srcu_lock_count[idx];
|
||||
#else /* #ifdef CONFIG_TREE_SRCU */
|
||||
l0 = counts->lock_count[!idx];
|
||||
l1 = counts->lock_count[idx];
|
||||
#endif /* #else #ifdef CONFIG_TREE_SRCU */
|
||||
|
||||
c0 = l0 - u0;
|
||||
c1 = l1 - u1;
|
||||
@ -609,7 +594,7 @@ static void srcu_torture_stats(void)
|
||||
pr_cont("\n");
|
||||
#elif defined(CONFIG_TINY_SRCU)
|
||||
idx = READ_ONCE(srcu_ctlp->srcu_idx) & 0x1;
|
||||
pr_alert("%s%s Tiny SRCU per-CPU(idx=%d): (%d,%d)\n",
|
||||
pr_alert("%s%s Tiny SRCU per-CPU(idx=%d): (%hd,%hd)\n",
|
||||
torture_type, TORTURE_FLAG, idx,
|
||||
READ_ONCE(srcu_ctlp->srcu_lock_nesting[!idx]),
|
||||
READ_ONCE(srcu_ctlp->srcu_lock_nesting[idx]));
|
||||
|
@ -1,661 +0,0 @@
|
||||
/*
|
||||
* Sleepable Read-Copy Update mechanism for mutual exclusion.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
* (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright (C) IBM Corporation, 2006
|
||||
* Copyright (C) Fujitsu, 2012
|
||||
*
|
||||
* Author: Paul McKenney <paulmck@us.ibm.com>
|
||||
* Lai Jiangshan <laijs@cn.fujitsu.com>
|
||||
*
|
||||
* For detailed explanation of Read-Copy Update mechanism see -
|
||||
* Documentation/RCU/ *.txt
|
||||
*
|
||||
*/
|
||||
|
||||
#include <linux/export.h>
|
||||
#include <linux/mutex.h>
|
||||
#include <linux/percpu.h>
|
||||
#include <linux/preempt.h>
|
||||
#include <linux/rcupdate_wait.h>
|
||||
#include <linux/sched.h>
|
||||
#include <linux/smp.h>
|
||||
#include <linux/delay.h>
|
||||
#include <linux/srcu.h>
|
||||
|
||||
#include "rcu.h"
|
||||
|
||||
/*
|
||||
* Initialize an rcu_batch structure to empty.
|
||||
*/
|
||||
static inline void rcu_batch_init(struct rcu_batch *b)
|
||||
{
|
||||
b->head = NULL;
|
||||
b->tail = &b->head;
|
||||
}
|
||||
|
||||
/*
|
||||
* Enqueue a callback onto the tail of the specified rcu_batch structure.
|
||||
*/
|
||||
static inline void rcu_batch_queue(struct rcu_batch *b, struct rcu_head *head)
|
||||
{
|
||||
*b->tail = head;
|
||||
b->tail = &head->next;
|
||||
}
|
||||
|
||||
/*
|
||||
* Is the specified rcu_batch structure empty?
|
||||
*/
|
||||
static inline bool rcu_batch_empty(struct rcu_batch *b)
|
||||
{
|
||||
return b->tail == &b->head;
|
||||
}
|
||||
|
||||
/*
|
||||
* Remove the callback at the head of the specified rcu_batch structure
|
||||
* and return a pointer to it, or return NULL if the structure is empty.
|
||||
*/
|
||||
static inline struct rcu_head *rcu_batch_dequeue(struct rcu_batch *b)
|
||||
{
|
||||
struct rcu_head *head;
|
||||
|
||||
if (rcu_batch_empty(b))
|
||||
return NULL;
|
||||
|
||||
head = b->head;
|
||||
b->head = head->next;
|
||||
if (b->tail == &head->next)
|
||||
rcu_batch_init(b);
|
||||
|
||||
return head;
|
||||
}
|
||||
|
||||
/*
|
||||
* Move all callbacks from the rcu_batch structure specified by "from" to
|
||||
* the structure specified by "to".
|
||||
*/
|
||||
static inline void rcu_batch_move(struct rcu_batch *to, struct rcu_batch *from)
|
||||
{
|
||||
if (!rcu_batch_empty(from)) {
|
||||
*to->tail = from->head;
|
||||
to->tail = from->tail;
|
||||
rcu_batch_init(from);
|
||||
}
|
||||
}
|
||||
|
||||
static int init_srcu_struct_fields(struct srcu_struct *sp)
|
||||
{
|
||||
sp->completed = 0;
|
||||
spin_lock_init(&sp->queue_lock);
|
||||
sp->running = false;
|
||||
rcu_batch_init(&sp->batch_queue);
|
||||
rcu_batch_init(&sp->batch_check0);
|
||||
rcu_batch_init(&sp->batch_check1);
|
||||
rcu_batch_init(&sp->batch_done);
|
||||
INIT_DELAYED_WORK(&sp->work, process_srcu);
|
||||
sp->per_cpu_ref = alloc_percpu(struct srcu_array);
|
||||
return sp->per_cpu_ref ? 0 : -ENOMEM;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_DEBUG_LOCK_ALLOC
|
||||
|
||||
int __init_srcu_struct(struct srcu_struct *sp, const char *name,
|
||||
struct lock_class_key *key)
|
||||
{
|
||||
/* Don't re-initialize a lock while it is held. */
|
||||
debug_check_no_locks_freed((void *)sp, sizeof(*sp));
|
||||
lockdep_init_map(&sp->dep_map, name, key, 0);
|
||||
return init_srcu_struct_fields(sp);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(__init_srcu_struct);
|
||||
|
||||
#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
|
||||
|
||||
/**
|
||||
* init_srcu_struct - initialize a sleep-RCU structure
|
||||
* @sp: structure to initialize.
|
||||
*
|
||||
* Must invoke this on a given srcu_struct before passing that srcu_struct
|
||||
* to any other function. Each srcu_struct represents a separate domain
|
||||
* of SRCU protection.
|
||||
*/
|
||||
int init_srcu_struct(struct srcu_struct *sp)
|
||||
{
|
||||
return init_srcu_struct_fields(sp);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(init_srcu_struct);
|
||||
|
||||
#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
|
||||
|
||||
/*
|
||||
* Returns approximate total of the readers' ->lock_count[] values for the
|
||||
* rank of per-CPU counters specified by idx.
|
||||
*/
|
||||
static unsigned long srcu_readers_lock_idx(struct srcu_struct *sp, int idx)
|
||||
{
|
||||
int cpu;
|
||||
unsigned long sum = 0;
|
||||
|
||||
for_each_possible_cpu(cpu) {
|
||||
struct srcu_array *cpuc = per_cpu_ptr(sp->per_cpu_ref, cpu);
|
||||
|
||||
sum += READ_ONCE(cpuc->lock_count[idx]);
|
||||
}
|
||||
return sum;
|
||||
}
|
||||
|
||||
/*
|
||||
* Returns approximate total of the readers' ->unlock_count[] values for the
|
||||
* rank of per-CPU counters specified by idx.
|
||||
*/
|
||||
static unsigned long srcu_readers_unlock_idx(struct srcu_struct *sp, int idx)
|
||||
{
|
||||
int cpu;
|
||||
unsigned long sum = 0;
|
||||
|
||||
for_each_possible_cpu(cpu) {
|
||||
struct srcu_array *cpuc = per_cpu_ptr(sp->per_cpu_ref, cpu);
|
||||
|
||||
sum += READ_ONCE(cpuc->unlock_count[idx]);
|
||||
}
|
||||
return sum;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return true if the number of pre-existing readers is determined to
|
||||
* be zero.
|
||||
*/
|
||||
static bool srcu_readers_active_idx_check(struct srcu_struct *sp, int idx)
|
||||
{
|
||||
unsigned long unlocks;
|
||||
|
||||
unlocks = srcu_readers_unlock_idx(sp, idx);
|
||||
|
||||
/*
|
||||
* Make sure that a lock is always counted if the corresponding unlock
|
||||
* is counted. Needs to be a smp_mb() as the read side may contain a
|
||||
* read from a variable that is written to before the synchronize_srcu()
|
||||
* in the write side. In this case smp_mb()s A and B act like the store
|
||||
* buffering pattern.
|
||||
*
|
||||
* This smp_mb() also pairs with smp_mb() C to prevent accesses after the
|
||||
* synchronize_srcu() from being executed before the grace period ends.
|
||||
*/
|
||||
smp_mb(); /* A */
|
||||
|
||||
/*
|
||||
* If the locks are the same as the unlocks, then there must have
|
||||
* been no readers on this index at some time in between. This does not
|
||||
* mean that there are no more readers, as one could have read the
|
||||
* current index but not have incremented the lock counter yet.
|
||||
*
|
||||
* Possible bug: There is no guarantee that there haven't been ULONG_MAX
|
||||
* increments of ->lock_count[] since the unlocks were counted, meaning
|
||||
* that this could return true even if there are still active readers.
|
||||
* Since there are no memory barriers around srcu_flip(), the CPU is not
|
||||
* required to increment ->completed before running
|
||||
* srcu_readers_unlock_idx(), which means that there could be an
|
||||
* arbitrarily large number of critical sections that execute after
|
||||
* srcu_readers_unlock_idx() but use the old value of ->completed.
|
||||
*/
|
||||
return srcu_readers_lock_idx(sp, idx) == unlocks;
|
||||
}
|
||||
|
||||
/**
|
||||
* srcu_readers_active - returns true if there are readers. and false
|
||||
* otherwise
|
||||
* @sp: which srcu_struct to count active readers (holding srcu_read_lock).
|
||||
*
|
||||
* Note that this is not an atomic primitive, and can therefore suffer
|
||||
* severe errors when invoked on an active srcu_struct. That said, it
|
||||
* can be useful as an error check at cleanup time.
|
||||
*/
|
||||
static bool srcu_readers_active(struct srcu_struct *sp)
|
||||
{
|
||||
int cpu;
|
||||
unsigned long sum = 0;
|
||||
|
||||
for_each_possible_cpu(cpu) {
|
||||
struct srcu_array *cpuc = per_cpu_ptr(sp->per_cpu_ref, cpu);
|
||||
|
||||
sum += READ_ONCE(cpuc->lock_count[0]);
|
||||
sum += READ_ONCE(cpuc->lock_count[1]);
|
||||
sum -= READ_ONCE(cpuc->unlock_count[0]);
|
||||
sum -= READ_ONCE(cpuc->unlock_count[1]);
|
||||
}
|
||||
return sum;
|
||||
}
|
||||
|
||||
/**
|
||||
* cleanup_srcu_struct - deconstruct a sleep-RCU structure
|
||||
* @sp: structure to clean up.
|
||||
*
|
||||
* Must invoke this only after you are finished using a given srcu_struct
|
||||
* that was initialized via init_srcu_struct(). This code does some
|
||||
* probabalistic checking, spotting late uses of srcu_read_lock(),
|
||||
* synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu().
|
||||
* If any such late uses are detected, the per-CPU memory associated with
|
||||
* the srcu_struct is simply leaked and WARN_ON() is invoked. If the
|
||||
* caller frees the srcu_struct itself, a use-after-free crash will likely
|
||||
* ensue, but at least there will be a warning printed.
|
||||
*/
|
||||
void cleanup_srcu_struct(struct srcu_struct *sp)
|
||||
{
|
||||
if (WARN_ON(srcu_readers_active(sp)))
|
||||
return; /* Leakage unless caller handles error. */
|
||||
free_percpu(sp->per_cpu_ref);
|
||||
sp->per_cpu_ref = NULL;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
|
||||
|
||||
/*
|
||||
* Counts the new reader in the appropriate per-CPU element of the
|
||||
* srcu_struct.
|
||||
* Returns an index that must be passed to the matching srcu_read_unlock().
|
||||
*/
|
||||
int __srcu_read_lock(struct srcu_struct *sp)
|
||||
{
|
||||
int idx;
|
||||
|
||||
idx = READ_ONCE(sp->completed) & 0x1;
|
||||
this_cpu_inc(sp->per_cpu_ref->lock_count[idx]);
|
||||
smp_mb(); /* B */ /* Avoid leaking the critical section. */
|
||||
return idx;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(__srcu_read_lock);
|
||||
|
||||
/*
|
||||
* Removes the count for the old reader from the appropriate per-CPU
|
||||
* element of the srcu_struct. Note that this may well be a different
|
||||
* CPU than that which was incremented by the corresponding srcu_read_lock().
|
||||
*/
|
||||
void __srcu_read_unlock(struct srcu_struct *sp, int idx)
|
||||
{
|
||||
smp_mb(); /* C */ /* Avoid leaking the critical section. */
|
||||
this_cpu_inc(sp->per_cpu_ref->unlock_count[idx]);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(__srcu_read_unlock);
|
||||
|
||||
/*
|
||||
* We use an adaptive strategy for synchronize_srcu() and especially for
|
||||
* synchronize_srcu_expedited(). We spin for a fixed time period
|
||||
* (defined below) to allow SRCU readers to exit their read-side critical
|
||||
* sections. If there are still some readers after 10 microseconds,
|
||||
* we repeatedly block for 1-millisecond time periods. This approach
|
||||
* has done well in testing, so there is no need for a config parameter.
|
||||
*/
|
||||
#define SRCU_RETRY_CHECK_DELAY 5
|
||||
#define SYNCHRONIZE_SRCU_TRYCOUNT 2
|
||||
#define SYNCHRONIZE_SRCU_EXP_TRYCOUNT 12
|
||||
|
||||
/*
|
||||
* @@@ Wait until all pre-existing readers complete. Such readers
|
||||
* will have used the index specified by "idx".
|
||||
* the caller should ensures the ->completed is not changed while checking
|
||||
* and idx = (->completed & 1) ^ 1
|
||||
*/
|
||||
static bool try_check_zero(struct srcu_struct *sp, int idx, int trycount)
|
||||
{
|
||||
for (;;) {
|
||||
if (srcu_readers_active_idx_check(sp, idx))
|
||||
return true;
|
||||
if (--trycount <= 0)
|
||||
return false;
|
||||
udelay(SRCU_RETRY_CHECK_DELAY);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Increment the ->completed counter so that future SRCU readers will
|
||||
* use the other rank of the ->(un)lock_count[] arrays. This allows
|
||||
* us to wait for pre-existing readers in a starvation-free manner.
|
||||
*/
|
||||
static void srcu_flip(struct srcu_struct *sp)
|
||||
{
|
||||
WRITE_ONCE(sp->completed, sp->completed + 1);
|
||||
|
||||
/*
|
||||
* Ensure that if the updater misses an __srcu_read_unlock()
|
||||
* increment, that task's next __srcu_read_lock() will see the
|
||||
* above counter update. Note that both this memory barrier
|
||||
* and the one in srcu_readers_active_idx_check() provide the
|
||||
* guarantee for __srcu_read_lock().
|
||||
*/
|
||||
smp_mb(); /* D */ /* Pairs with C. */
|
||||
}
|
||||
|
||||
/*
|
||||
* Enqueue an SRCU callback on the specified srcu_struct structure,
|
||||
* initiating grace-period processing if it is not already running.
|
||||
*
|
||||
* Note that all CPUs must agree that the grace period extended beyond
|
||||
* all pre-existing SRCU read-side critical section. On systems with
|
||||
* more than one CPU, this means that when "func()" is invoked, each CPU
|
||||
* is guaranteed to have executed a full memory barrier since the end of
|
||||
* its last corresponding SRCU read-side critical section whose beginning
|
||||
* preceded the call to call_rcu(). It also means that each CPU executing
|
||||
* an SRCU read-side critical section that continues beyond the start of
|
||||
* "func()" must have executed a memory barrier after the call_rcu()
|
||||
* but before the beginning of that SRCU read-side critical section.
|
||||
* Note that these guarantees include CPUs that are offline, idle, or
|
||||
* executing in user mode, as well as CPUs that are executing in the kernel.
|
||||
*
|
||||
* Furthermore, if CPU A invoked call_rcu() and CPU B invoked the
|
||||
* resulting SRCU callback function "func()", then both CPU A and CPU
|
||||
* B are guaranteed to execute a full memory barrier during the time
|
||||
* interval between the call to call_rcu() and the invocation of "func()".
|
||||
* This guarantee applies even if CPU A and CPU B are the same CPU (but
|
||||
* again only if the system has more than one CPU).
|
||||
*
|
||||
* Of course, these guarantees apply only for invocations of call_srcu(),
|
||||
* srcu_read_lock(), and srcu_read_unlock() that are all passed the same
|
||||
* srcu_struct structure.
|
||||
*/
|
||||
void call_srcu(struct srcu_struct *sp, struct rcu_head *head,
|
||||
rcu_callback_t func)
|
||||
{
|
||||
unsigned long flags;
|
||||
|
||||
head->next = NULL;
|
||||
head->func = func;
|
||||
spin_lock_irqsave(&sp->queue_lock, flags);
|
||||
smp_mb__after_unlock_lock(); /* Caller's prior accesses before GP. */
|
||||
rcu_batch_queue(&sp->batch_queue, head);
|
||||
if (!sp->running) {
|
||||
sp->running = true;
|
||||
queue_delayed_work(system_power_efficient_wq, &sp->work, 0);
|
||||
}
|
||||
spin_unlock_irqrestore(&sp->queue_lock, flags);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(call_srcu);
|
||||
|
||||
static void srcu_advance_batches(struct srcu_struct *sp, int trycount);
|
||||
static void srcu_reschedule(struct srcu_struct *sp);
|
||||
|
||||
/*
|
||||
* Helper function for synchronize_srcu() and synchronize_srcu_expedited().
|
||||
*/
|
||||
static void __synchronize_srcu(struct srcu_struct *sp, int trycount)
|
||||
{
|
||||
struct rcu_synchronize rcu;
|
||||
struct rcu_head *head = &rcu.head;
|
||||
bool done = false;
|
||||
|
||||
RCU_LOCKDEP_WARN(lock_is_held(&sp->dep_map) ||
|
||||
lock_is_held(&rcu_bh_lock_map) ||
|
||||
lock_is_held(&rcu_lock_map) ||
|
||||
lock_is_held(&rcu_sched_lock_map),
|
||||
"Illegal synchronize_srcu() in same-type SRCU (or in RCU) read-side critical section");
|
||||
|
||||
might_sleep();
|
||||
init_completion(&rcu.completion);
|
||||
|
||||
head->next = NULL;
|
||||
head->func = wakeme_after_rcu;
|
||||
spin_lock_irq(&sp->queue_lock);
|
||||
smp_mb__after_unlock_lock(); /* Caller's prior accesses before GP. */
|
||||
if (!sp->running) {
|
||||
/* steal the processing owner */
|
||||
sp->running = true;
|
||||
rcu_batch_queue(&sp->batch_check0, head);
|
||||
spin_unlock_irq(&sp->queue_lock);
|
||||
|
||||
srcu_advance_batches(sp, trycount);
|
||||
if (!rcu_batch_empty(&sp->batch_done)) {
|
||||
BUG_ON(sp->batch_done.head != head);
|
||||
rcu_batch_dequeue(&sp->batch_done);
|
||||
done = true;
|
||||
}
|
||||
/* give the processing owner to work_struct */
|
||||
srcu_reschedule(sp);
|
||||
} else {
|
||||
rcu_batch_queue(&sp->batch_queue, head);
|
||||
spin_unlock_irq(&sp->queue_lock);
|
||||
}
|
||||
|
||||
if (!done) {
|
||||
wait_for_completion(&rcu.completion);
|
||||
smp_mb(); /* Caller's later accesses after GP. */
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
/**
|
||||
* synchronize_srcu - wait for prior SRCU read-side critical-section completion
|
||||
* @sp: srcu_struct with which to synchronize.
|
||||
*
|
||||
* Wait for the count to drain to zero of both indexes. To avoid the
|
||||
* possible starvation of synchronize_srcu(), it waits for the count of
|
||||
* the index=((->completed & 1) ^ 1) to drain to zero at first,
|
||||
* and then flip the completed and wait for the count of the other index.
|
||||
*
|
||||
* Can block; must be called from process context.
|
||||
*
|
||||
* Note that it is illegal to call synchronize_srcu() from the corresponding
|
||||
* SRCU read-side critical section; doing so will result in deadlock.
|
||||
* However, it is perfectly legal to call synchronize_srcu() on one
|
||||
* srcu_struct from some other srcu_struct's read-side critical section,
|
||||
* as long as the resulting graph of srcu_structs is acyclic.
|
||||
*
|
||||
* There are memory-ordering constraints implied by synchronize_srcu().
|
||||
* On systems with more than one CPU, when synchronize_srcu() returns,
|
||||
* each CPU is guaranteed to have executed a full memory barrier since
|
||||
* the end of its last corresponding SRCU-sched read-side critical section
|
||||
* whose beginning preceded the call to synchronize_srcu(). In addition,
|
||||
* each CPU having an SRCU read-side critical section that extends beyond
|
||||
* the return from synchronize_srcu() is guaranteed to have executed a
|
||||
* full memory barrier after the beginning of synchronize_srcu() and before
|
||||
* the beginning of that SRCU read-side critical section. Note that these
|
||||
* guarantees include CPUs that are offline, idle, or executing in user mode,
|
||||
* as well as CPUs that are executing in the kernel.
|
||||
*
|
||||
* Furthermore, if CPU A invoked synchronize_srcu(), which returned
|
||||
* to its caller on CPU B, then both CPU A and CPU B are guaranteed
|
||||
* to have executed a full memory barrier during the execution of
|
||||
* synchronize_srcu(). This guarantee applies even if CPU A and CPU B
|
||||
* are the same CPU, but again only if the system has more than one CPU.
|
||||
*
|
||||
* Of course, these memory-ordering guarantees apply only when
|
||||
* synchronize_srcu(), srcu_read_lock(), and srcu_read_unlock() are
|
||||
* passed the same srcu_struct structure.
|
||||
*/
|
||||
void synchronize_srcu(struct srcu_struct *sp)
|
||||
{
|
||||
__synchronize_srcu(sp, (rcu_gp_is_expedited() && !rcu_gp_is_normal())
|
||||
? SYNCHRONIZE_SRCU_EXP_TRYCOUNT
|
||||
: SYNCHRONIZE_SRCU_TRYCOUNT);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(synchronize_srcu);
|
||||
|
||||
/**
|
||||
* synchronize_srcu_expedited - Brute-force SRCU grace period
|
||||
* @sp: srcu_struct with which to synchronize.
|
||||
*
|
||||
* Wait for an SRCU grace period to elapse, but be more aggressive about
|
||||
* spinning rather than blocking when waiting.
|
||||
*
|
||||
* Note that synchronize_srcu_expedited() has the same deadlock and
|
||||
* memory-ordering properties as does synchronize_srcu().
|
||||
*/
|
||||
void synchronize_srcu_expedited(struct srcu_struct *sp)
|
||||
{
|
||||
__synchronize_srcu(sp, SYNCHRONIZE_SRCU_EXP_TRYCOUNT);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(synchronize_srcu_expedited);
|
||||
|
||||
/**
|
||||
* srcu_barrier - Wait until all in-flight call_srcu() callbacks complete.
|
||||
* @sp: srcu_struct on which to wait for in-flight callbacks.
|
||||
*/
|
||||
void srcu_barrier(struct srcu_struct *sp)
|
||||
{
|
||||
synchronize_srcu(sp);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(srcu_barrier);
|
||||
|
||||
/**
|
||||
* srcu_batches_completed - return batches completed.
|
||||
* @sp: srcu_struct on which to report batch completion.
|
||||
*
|
||||
* Report the number of batches, correlated with, but not necessarily
|
||||
* precisely the same as, the number of grace periods that have elapsed.
|
||||
*/
|
||||
unsigned long srcu_batches_completed(struct srcu_struct *sp)
|
||||
{
|
||||
return sp->completed;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(srcu_batches_completed);
|
||||
|
||||
#define SRCU_CALLBACK_BATCH 10
|
||||
#define SRCU_INTERVAL 1
|
||||
|
||||
/*
|
||||
* Move any new SRCU callbacks to the first stage of the SRCU grace
|
||||
* period pipeline.
|
||||
*/
|
||||
static void srcu_collect_new(struct srcu_struct *sp)
|
||||
{
|
||||
if (!rcu_batch_empty(&sp->batch_queue)) {
|
||||
spin_lock_irq(&sp->queue_lock);
|
||||
rcu_batch_move(&sp->batch_check0, &sp->batch_queue);
|
||||
spin_unlock_irq(&sp->queue_lock);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Core SRCU state machine. Advance callbacks from ->batch_check0 to
|
||||
* ->batch_check1 and then to ->batch_done as readers drain.
|
||||
*/
|
||||
static void srcu_advance_batches(struct srcu_struct *sp, int trycount)
|
||||
{
|
||||
int idx = 1 ^ (sp->completed & 1);
|
||||
|
||||
/*
|
||||
* Because readers might be delayed for an extended period after
|
||||
* fetching ->completed for their index, at any point in time there
|
||||
* might well be readers using both idx=0 and idx=1. We therefore
|
||||
* need to wait for readers to clear from both index values before
|
||||
* invoking a callback.
|
||||
*/
|
||||
|
||||
if (rcu_batch_empty(&sp->batch_check0) &&
|
||||
rcu_batch_empty(&sp->batch_check1))
|
||||
return; /* no callbacks need to be advanced */
|
||||
|
||||
if (!try_check_zero(sp, idx, trycount))
|
||||
return; /* failed to advance, will try after SRCU_INTERVAL */
|
||||
|
||||
	/*
	 * The callbacks in ->batch_check1 already went through their first
	 * zero check and the counter flip back when they were enqueued on
	 * ->batch_check0 during a previous invocation of srcu_advance_batches().
	 * (Presumably try_check_zero() returned false during that
	 * invocation, leaving the callbacks stranded on ->batch_check1.)
	 * They are therefore ready to invoke, so move them to ->batch_done.
	 */
|
||||
rcu_batch_move(&sp->batch_done, &sp->batch_check1);
|
||||
|
||||
if (rcu_batch_empty(&sp->batch_check0))
|
||||
return; /* no callbacks need to be advanced */
|
||||
srcu_flip(sp);
|
||||
|
||||
	/*
	 * The callbacks in ->batch_check0 just finished their first zero
	 * check and the counter flip, so move them to ->batch_check1
	 * for future checking on the other idx.
	 */
|
||||
rcu_batch_move(&sp->batch_check1, &sp->batch_check0);
|
||||
|
||||
/*
|
||||
* SRCU read-side critical sections are normally short, so check
|
||||
* at least twice in quick succession after a flip.
|
||||
*/
|
||||
trycount = trycount < 2 ? 2 : trycount;
|
||||
if (!try_check_zero(sp, idx^1, trycount))
|
||||
return; /* failed to advance, will try after SRCU_INTERVAL */
|
||||
|
||||
/*
|
||||
* The callbacks in ->batch_check1 have now waited for all
|
||||
* pre-existing readers using both idx values. They are therefore
|
||||
* ready to invoke, so move them to ->batch_done.
|
||||
*/
|
||||
rcu_batch_move(&sp->batch_done, &sp->batch_check1);
|
||||
}
|
||||
|
||||
/*
|
||||
* Invoke a limited number of SRCU callbacks that have passed through
|
||||
* their grace period. If there are more to do, SRCU will reschedule
|
||||
* the workqueue. Note that needed memory barriers have been executed
|
||||
* in this task's context by srcu_readers_active_idx_check().
|
||||
*/
|
||||
static void srcu_invoke_callbacks(struct srcu_struct *sp)
|
||||
{
|
||||
int i;
|
||||
struct rcu_head *head;
|
||||
|
||||
for (i = 0; i < SRCU_CALLBACK_BATCH; i++) {
|
||||
head = rcu_batch_dequeue(&sp->batch_done);
|
||||
if (!head)
|
||||
break;
|
||||
local_bh_disable();
|
||||
head->func(head);
|
||||
local_bh_enable();
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Finished one round of SRCU grace period. Start another if there are
|
||||
* more SRCU callbacks queued, otherwise put SRCU into not-running state.
|
||||
*/
|
||||
static void srcu_reschedule(struct srcu_struct *sp)
|
||||
{
|
||||
bool pending = true;
|
||||
|
||||
if (rcu_batch_empty(&sp->batch_done) &&
|
||||
rcu_batch_empty(&sp->batch_check1) &&
|
||||
rcu_batch_empty(&sp->batch_check0) &&
|
||||
rcu_batch_empty(&sp->batch_queue)) {
|
||||
spin_lock_irq(&sp->queue_lock);
|
||||
if (rcu_batch_empty(&sp->batch_done) &&
|
||||
rcu_batch_empty(&sp->batch_check1) &&
|
||||
rcu_batch_empty(&sp->batch_check0) &&
|
||||
rcu_batch_empty(&sp->batch_queue)) {
|
||||
sp->running = false;
|
||||
pending = false;
|
||||
}
|
||||
spin_unlock_irq(&sp->queue_lock);
|
||||
}
|
||||
|
||||
if (pending)
|
||||
queue_delayed_work(system_power_efficient_wq,
|
||||
&sp->work, SRCU_INTERVAL);
|
||||
}
|
||||
|
||||
/*
|
||||
* This is the work-queue function that handles SRCU grace periods.
|
||||
*/
|
||||
void process_srcu(struct work_struct *work)
|
||||
{
|
||||
struct srcu_struct *sp;
|
||||
|
||||
sp = container_of(work, struct srcu_struct, work.work);
|
||||
|
||||
srcu_collect_new(sp);
|
||||
srcu_advance_batches(sp, 1);
|
||||
srcu_invoke_callbacks(sp);
|
||||
srcu_reschedule(sp);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(process_srcu);
|
@ -38,8 +38,8 @@ static int init_srcu_struct_fields(struct srcu_struct *sp)
|
||||
sp->srcu_lock_nesting[0] = 0;
|
||||
sp->srcu_lock_nesting[1] = 0;
|
||||
init_swait_queue_head(&sp->srcu_wq);
|
||||
sp->srcu_gp_seq = 0;
|
||||
rcu_segcblist_init(&sp->srcu_cblist);
|
||||
sp->srcu_cb_head = NULL;
|
||||
sp->srcu_cb_tail = &sp->srcu_cb_head;
|
||||
sp->srcu_gp_running = false;
|
||||
sp->srcu_gp_waiting = false;
|
||||
sp->srcu_idx = 0;
|
||||
@ -88,29 +88,13 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
|
||||
{
|
||||
WARN_ON(sp->srcu_lock_nesting[0] || sp->srcu_lock_nesting[1]);
|
||||
flush_work(&sp->srcu_work);
|
||||
WARN_ON(rcu_seq_state(sp->srcu_gp_seq));
|
||||
WARN_ON(sp->srcu_gp_running);
|
||||
WARN_ON(sp->srcu_gp_waiting);
|
||||
WARN_ON(!rcu_segcblist_empty(&sp->srcu_cblist));
|
||||
WARN_ON(sp->srcu_cb_head);
|
||||
WARN_ON(&sp->srcu_cb_head != sp->srcu_cb_tail);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
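As a usage aside (not part of this patch), the dynamic-initialization variant pairs with the cleanup above roughly as follows; the module and identifier names are assumptions for illustration:

	#include <linux/module.h>
	#include <linux/srcu.h>

	static struct srcu_struct example_srcu;		/* dynamically initialized */

	static int __init example_init(void)
	{
		return init_srcu_struct(&example_srcu);
	}

	static void __exit example_exit(void)
	{
		srcu_barrier(&example_srcu);		/* flush any call_srcu() callbacks */
		cleanup_srcu_struct(&example_srcu);	/* keeps the WARN_ONs above quiet */
	}
	module_init(example_init);
	module_exit(example_exit);
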
|
||||
|
||||
/*
 * Counts the new reader in the appropriate per-CPU element of the
 * srcu_struct.  Can be invoked from irq/bh handlers, but the matching
 * __srcu_read_unlock() must be in the same handler instance.  Returns an
 * index that must be passed to the matching srcu_read_unlock().
 */
int __srcu_read_lock(struct srcu_struct *sp)
{
	int idx;

	idx = READ_ONCE(sp->srcu_idx);
	WRITE_ONCE(sp->srcu_lock_nesting[idx], sp->srcu_lock_nesting[idx] + 1);
	return idx;
}
EXPORT_SYMBOL_GPL(__srcu_read_lock);
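A hedged sketch (not from this commit) of the rule stated in the comment above: the index returned by srcu_read_lock() is local to the context that took it, so an interrupt handler nests its own lock/unlock pair rather than borrowing the interrupted task's index. The handler and srcu_struct names are assumptions:

	#include <linux/interrupt.h>
	#include <linux/srcu.h>

	DEFINE_STATIC_SRCU(example_srcu);

	static irqreturn_t example_irq_handler(int irq, void *arg)
	{
		int idx = srcu_read_lock(&example_srcu);	/* handler's own idx */

		/* ... read SRCU-protected state ... */
		srcu_read_unlock(&example_srcu, idx);		/* same handler instance */
		return IRQ_HANDLED;
	}
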
|
||||
|
||||
/*
|
||||
* Removes the count for the old reader from the appropriate element of
|
||||
* the srcu_struct.
|
||||
@ -133,52 +117,44 @@ EXPORT_SYMBOL_GPL(__srcu_read_unlock);
|
||||
void srcu_drive_gp(struct work_struct *wp)
|
||||
{
|
||||
int idx;
|
||||
struct rcu_cblist ready_cbs;
|
||||
struct srcu_struct *sp;
|
||||
struct rcu_head *lh;
|
||||
struct rcu_head *rhp;
|
||||
struct srcu_struct *sp;
|
||||
|
||||
sp = container_of(wp, struct srcu_struct, srcu_work);
|
||||
if (sp->srcu_gp_running || rcu_segcblist_empty(&sp->srcu_cblist))
|
||||
if (sp->srcu_gp_running || !READ_ONCE(sp->srcu_cb_head))
|
||||
return; /* Already running or nothing to do. */
|
||||
|
||||
/* Tag recently arrived callbacks and wait for readers. */
|
||||
/* Remove recently arrived callbacks and wait for readers. */
|
||||
WRITE_ONCE(sp->srcu_gp_running, true);
|
||||
rcu_segcblist_accelerate(&sp->srcu_cblist,
|
||||
rcu_seq_snap(&sp->srcu_gp_seq));
|
||||
rcu_seq_start(&sp->srcu_gp_seq);
|
||||
local_irq_disable();
|
||||
lh = sp->srcu_cb_head;
|
||||
sp->srcu_cb_head = NULL;
|
||||
sp->srcu_cb_tail = &sp->srcu_cb_head;
|
||||
local_irq_enable();
|
||||
idx = sp->srcu_idx;
|
||||
WRITE_ONCE(sp->srcu_idx, !sp->srcu_idx);
|
||||
WRITE_ONCE(sp->srcu_gp_waiting, true); /* srcu_read_unlock() wakes! */
|
||||
swait_event(sp->srcu_wq, !READ_ONCE(sp->srcu_lock_nesting[idx]));
|
||||
WRITE_ONCE(sp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */
|
||||
rcu_seq_end(&sp->srcu_gp_seq);
|
||||
|
||||
/* Update callback list based on GP, and invoke ready callbacks. */
|
||||
rcu_segcblist_advance(&sp->srcu_cblist,
|
||||
rcu_seq_current(&sp->srcu_gp_seq));
|
||||
if (rcu_segcblist_ready_cbs(&sp->srcu_cblist)) {
|
||||
rcu_cblist_init(&ready_cbs);
|
||||
local_irq_disable();
|
||||
rcu_segcblist_extract_done_cbs(&sp->srcu_cblist, &ready_cbs);
|
||||
local_irq_enable();
|
||||
rhp = rcu_cblist_dequeue(&ready_cbs);
|
||||
for (; rhp != NULL; rhp = rcu_cblist_dequeue(&ready_cbs)) {
|
||||
local_bh_disable();
|
||||
rhp->func(rhp);
|
||||
local_bh_enable();
|
||||
}
|
||||
local_irq_disable();
|
||||
rcu_segcblist_insert_count(&sp->srcu_cblist, &ready_cbs);
|
||||
local_irq_enable();
|
||||
/* Invoke the callbacks we removed above. */
|
||||
while (lh) {
|
||||
rhp = lh;
|
||||
lh = lh->next;
|
||||
local_bh_disable();
|
||||
rhp->func(rhp);
|
||||
local_bh_enable();
|
||||
}
|
||||
WRITE_ONCE(sp->srcu_gp_running, false);
|
||||
|
||||
/*
|
||||
* If more callbacks, reschedule ourselves. This can race with
|
||||
* a call_srcu() at interrupt level, but the ->srcu_gp_running
|
||||
* checks will straighten that out.
|
||||
* Enable rescheduling, and if there are more callbacks,
|
||||
* reschedule ourselves. This can race with a call_srcu()
|
||||
* at interrupt level, but the ->srcu_gp_running checks will
|
||||
* straighten that out.
|
||||
*/
|
||||
if (!rcu_segcblist_empty(&sp->srcu_cblist))
|
||||
WRITE_ONCE(sp->srcu_gp_running, false);
|
||||
if (READ_ONCE(sp->srcu_cb_head))
|
||||
schedule_work(&sp->srcu_work);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(srcu_drive_gp);
|
||||
@ -187,14 +163,16 @@ EXPORT_SYMBOL_GPL(srcu_drive_gp);
|
||||
* Enqueue an SRCU callback on the specified srcu_struct structure,
|
||||
* initiating grace-period processing if it is not already running.
|
||||
*/
|
||||
void call_srcu(struct srcu_struct *sp, struct rcu_head *head,
|
||||
void call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
|
||||
rcu_callback_t func)
|
||||
{
|
||||
unsigned long flags;
|
||||
|
||||
head->func = func;
|
||||
rhp->func = func;
|
||||
rhp->next = NULL;
|
||||
local_irq_save(flags);
|
||||
rcu_segcblist_enqueue(&sp->srcu_cblist, head, false);
|
||||
*sp->srcu_cb_tail = rhp;
|
||||
sp->srcu_cb_tail = &rhp->next;
|
||||
local_irq_restore(flags);
|
||||
if (!READ_ONCE(sp->srcu_gp_running))
|
||||
schedule_work(&sp->srcu_work);
|
||||
|
@ -40,9 +40,15 @@
|
||||
#include "rcu.h"
|
||||
#include "rcu_segcblist.h"
|
||||
|
||||
ulong exp_holdoff = 25 * 1000; /* Holdoff (ns) for auto-expediting. */
|
||||
/* Holdoff in nanoseconds for auto-expediting. */
|
||||
#define DEFAULT_SRCU_EXP_HOLDOFF (25 * 1000)
|
||||
static ulong exp_holdoff = DEFAULT_SRCU_EXP_HOLDOFF;
|
||||
module_param(exp_holdoff, ulong, 0444);
|
||||
|
||||
/* Overflow-check frequency. N bits roughly says every 2**N grace periods. */
|
||||
static ulong counter_wrap_check = (ULONG_MAX >> 2);
|
||||
module_param(counter_wrap_check, ulong, 0444);
|
||||
|
||||
static void srcu_invoke_callbacks(struct work_struct *work);
|
||||
static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay);
|
||||
|
||||
@ -70,7 +76,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *sp, bool is_static)
|
||||
|
||||
/* Each pass through this loop initializes one srcu_node structure. */
|
||||
rcu_for_each_node_breadth_first(sp, snp) {
|
||||
spin_lock_init(&snp->lock);
|
||||
raw_spin_lock_init(&ACCESS_PRIVATE(snp, lock));
|
||||
WARN_ON_ONCE(ARRAY_SIZE(snp->srcu_have_cbs) !=
|
||||
ARRAY_SIZE(snp->srcu_data_have_cbs));
|
||||
for (i = 0; i < ARRAY_SIZE(snp->srcu_have_cbs); i++) {
|
||||
@ -104,7 +110,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *sp, bool is_static)
|
||||
snp_first = sp->level[level];
|
||||
for_each_possible_cpu(cpu) {
|
||||
sdp = per_cpu_ptr(sp->sda, cpu);
|
||||
spin_lock_init(&sdp->lock);
|
||||
raw_spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
|
||||
rcu_segcblist_init(&sdp->srcu_cblist);
|
||||
sdp->srcu_cblist_invoking = false;
|
||||
sdp->srcu_gp_seq_needed = sp->srcu_gp_seq;
|
||||
@ -163,7 +169,7 @@ int __init_srcu_struct(struct srcu_struct *sp, const char *name,
|
||||
/* Don't re-initialize a lock while it is held. */
|
||||
debug_check_no_locks_freed((void *)sp, sizeof(*sp));
|
||||
lockdep_init_map(&sp->dep_map, name, key, 0);
|
||||
spin_lock_init(&sp->gp_lock);
|
||||
raw_spin_lock_init(&ACCESS_PRIVATE(sp, lock));
|
||||
return init_srcu_struct_fields(sp, false);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(__init_srcu_struct);
|
||||
@ -180,7 +186,7 @@ EXPORT_SYMBOL_GPL(__init_srcu_struct);
|
||||
*/
|
||||
int init_srcu_struct(struct srcu_struct *sp)
|
||||
{
|
||||
spin_lock_init(&sp->gp_lock);
|
||||
raw_spin_lock_init(&ACCESS_PRIVATE(sp, lock));
|
||||
return init_srcu_struct_fields(sp, false);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(init_srcu_struct);
|
||||
@ -191,7 +197,7 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
|
||||
* First-use initialization of statically allocated srcu_struct
|
||||
* structure. Wiring up the combining tree is more than can be
|
||||
* done with compile-time initialization, so this check is added
|
||||
* to each update-side SRCU primitive. Use ->gp_lock, which -is-
|
||||
* to each update-side SRCU primitive. Use sp->lock, which -is-
|
||||
* compile-time initialized, to resolve races involving multiple
|
||||
* CPUs trying to garner first-use privileges.
|
||||
*/
|
||||
@ -203,13 +209,13 @@ static void check_init_srcu_struct(struct srcu_struct *sp)
|
||||
/* The smp_load_acquire() pairs with the smp_store_release(). */
|
||||
if (!rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq_needed))) /*^^^*/
|
||||
return; /* Already initialized. */
|
||||
spin_lock_irqsave(&sp->gp_lock, flags);
|
||||
raw_spin_lock_irqsave_rcu_node(sp, flags);
|
||||
if (!rcu_seq_state(sp->srcu_gp_seq_needed)) {
|
||||
spin_unlock_irqrestore(&sp->gp_lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(sp, flags);
|
||||
return;
|
||||
}
|
||||
init_srcu_struct_fields(sp, true);
|
||||
spin_unlock_irqrestore(&sp->gp_lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(sp, flags);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -275,15 +281,20 @@ static bool srcu_readers_active_idx_check(struct srcu_struct *sp, int idx)
|
||||
* not mean that there are no more readers, as one could have read
|
||||
* the current index but not have incremented the lock counter yet.
|
||||
*
|
||||
* Possible bug: There is no guarantee that there haven't been
|
||||
* ULONG_MAX increments of ->srcu_lock_count[] since the unlocks were
|
||||
* counted, meaning that this could return true even if there are
|
||||
* still active readers. Since there are no memory barriers around
|
||||
* srcu_flip(), the CPU is not required to increment ->srcu_idx
|
||||
* before running srcu_readers_unlock_idx(), which means that there
|
||||
* could be an arbitrarily large number of critical sections that
|
||||
* execute after srcu_readers_unlock_idx() but use the old value
|
||||
* of ->srcu_idx.
|
||||
* So suppose that the updater is preempted here for so long
|
||||
* that more than ULONG_MAX non-nested readers come and go in
|
||||
* the meantime. It turns out that this cannot result in overflow
|
||||
* because if a reader modifies its unlock count after we read it
|
||||
* above, then that reader's next load of ->srcu_idx is guaranteed
|
||||
* to get the new value, which will cause it to operate on the
|
||||
* other bank of counters, where it cannot contribute to the
|
||||
* overflow of these counters. This means that there is a maximum
|
||||
* of 2*NR_CPUS increments, which cannot overflow given current
|
||||
* systems, especially not on 64-bit systems.
|
||||
*
|
||||
* OK, how about nesting? This does impose a limit on nesting
|
||||
* of floor(ULONG_MAX/NR_CPUS/2), which should be sufficient,
|
||||
* especially on 64-bit systems.
|
||||
*/
|
||||
return srcu_readers_lock_idx(sp, idx) == unlocks;
|
||||
}
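As a worked bound (assuming NR_CPUS=4096 purely for illustration): on a 64-bit system ULONG_MAX is about 1.8 * 10^19, so floor(ULONG_MAX / NR_CPUS / 2) comes to roughly 2.2 * 10^15 nested SRCU read-side critical sections, far beyond any plausible nesting depth.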
|
||||
@ -400,8 +411,7 @@ static void srcu_gp_start(struct srcu_struct *sp)
|
||||
struct srcu_data *sdp = this_cpu_ptr(sp->sda);
|
||||
int state;
|
||||
|
||||
RCU_LOCKDEP_WARN(!lockdep_is_held(&sp->gp_lock),
|
||||
"Invoked srcu_gp_start() without ->gp_lock!");
|
||||
lockdep_assert_held(&sp->lock);
|
||||
WARN_ON_ONCE(ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed));
|
||||
rcu_segcblist_advance(&sdp->srcu_cblist,
|
||||
rcu_seq_current(&sp->srcu_gp_seq));
|
||||
@ -489,17 +499,20 @@ static void srcu_gp_end(struct srcu_struct *sp)
|
||||
{
|
||||
unsigned long cbdelay;
|
||||
bool cbs;
|
||||
int cpu;
|
||||
unsigned long flags;
|
||||
unsigned long gpseq;
|
||||
int idx;
|
||||
int idxnext;
|
||||
unsigned long mask;
|
||||
struct srcu_data *sdp;
|
||||
struct srcu_node *snp;
|
||||
|
||||
/* Prevent more than one additional grace period. */
|
||||
mutex_lock(&sp->srcu_cb_mutex);
|
||||
|
||||
/* End the current grace period. */
|
||||
spin_lock_irq(&sp->gp_lock);
|
||||
raw_spin_lock_irq_rcu_node(sp);
|
||||
idx = rcu_seq_state(sp->srcu_gp_seq);
|
||||
WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
|
||||
cbdelay = srcu_get_delay(sp);
|
||||
@ -508,7 +521,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
|
||||
gpseq = rcu_seq_current(&sp->srcu_gp_seq);
|
||||
if (ULONG_CMP_LT(sp->srcu_gp_seq_needed_exp, gpseq))
|
||||
sp->srcu_gp_seq_needed_exp = gpseq;
|
||||
spin_unlock_irq(&sp->gp_lock);
|
||||
raw_spin_unlock_irq_rcu_node(sp);
|
||||
mutex_unlock(&sp->srcu_gp_mutex);
|
||||
/* A new grace period can start at this point. But only one. */
|
||||
|
||||
@ -516,7 +529,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
|
||||
idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
|
||||
idxnext = (idx + 1) % ARRAY_SIZE(snp->srcu_have_cbs);
|
||||
rcu_for_each_node_breadth_first(sp, snp) {
|
||||
spin_lock_irq(&snp->lock);
|
||||
raw_spin_lock_irq_rcu_node(snp);
|
||||
cbs = false;
|
||||
if (snp >= sp->level[rcu_num_lvls - 1])
|
||||
cbs = snp->srcu_have_cbs[idx] == gpseq;
|
||||
@ -526,28 +539,37 @@ static void srcu_gp_end(struct srcu_struct *sp)
|
||||
snp->srcu_gp_seq_needed_exp = gpseq;
|
||||
mask = snp->srcu_data_have_cbs[idx];
|
||||
snp->srcu_data_have_cbs[idx] = 0;
|
||||
spin_unlock_irq(&snp->lock);
|
||||
if (cbs) {
|
||||
smp_mb(); /* GP end before CB invocation. */
|
||||
raw_spin_unlock_irq_rcu_node(snp);
|
||||
if (cbs)
|
||||
srcu_schedule_cbs_snp(sp, snp, mask, cbdelay);
|
||||
}
|
||||
|
||||
/* Occasionally prevent srcu_data counter wrap. */
|
||||
if (!(gpseq & counter_wrap_check))
|
||||
for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
|
||||
sdp = per_cpu_ptr(sp->sda, cpu);
|
||||
raw_spin_lock_irqsave_rcu_node(sdp, flags);
|
||||
if (ULONG_CMP_GE(gpseq,
|
||||
sdp->srcu_gp_seq_needed + 100))
|
||||
sdp->srcu_gp_seq_needed = gpseq;
|
||||
raw_spin_unlock_irqrestore_rcu_node(sdp, flags);
|
||||
}
|
||||
}
|
||||
|
||||
/* Callback initiation done, allow grace periods after next. */
|
||||
mutex_unlock(&sp->srcu_cb_mutex);
|
||||
|
||||
/* Start a new grace period if needed. */
|
||||
spin_lock_irq(&sp->gp_lock);
|
||||
raw_spin_lock_irq_rcu_node(sp);
|
||||
gpseq = rcu_seq_current(&sp->srcu_gp_seq);
|
||||
if (!rcu_seq_state(gpseq) &&
|
||||
ULONG_CMP_LT(gpseq, sp->srcu_gp_seq_needed)) {
|
||||
srcu_gp_start(sp);
|
||||
spin_unlock_irq(&sp->gp_lock);
|
||||
raw_spin_unlock_irq_rcu_node(sp);
|
||||
/* Throttle expedited grace periods: Should be rare! */
|
||||
srcu_reschedule(sp, rcu_seq_ctr(gpseq) & 0x3ff
|
||||
? 0 : SRCU_INTERVAL);
|
||||
} else {
|
||||
spin_unlock_irq(&sp->gp_lock);
|
||||
raw_spin_unlock_irq_rcu_node(sp);
|
||||
}
|
||||
}
|
||||
|
||||
@ -567,18 +589,18 @@ static void srcu_funnel_exp_start(struct srcu_struct *sp, struct srcu_node *snp,
|
||||
if (rcu_seq_done(&sp->srcu_gp_seq, s) ||
|
||||
ULONG_CMP_GE(READ_ONCE(snp->srcu_gp_seq_needed_exp), s))
|
||||
return;
|
||||
spin_lock_irqsave(&snp->lock, flags);
|
||||
raw_spin_lock_irqsave_rcu_node(snp, flags);
|
||||
if (ULONG_CMP_GE(snp->srcu_gp_seq_needed_exp, s)) {
|
||||
spin_unlock_irqrestore(&snp->lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
return;
|
||||
}
|
||||
WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
|
||||
spin_unlock_irqrestore(&snp->lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
}
|
||||
spin_lock_irqsave(&sp->gp_lock, flags);
|
||||
raw_spin_lock_irqsave_rcu_node(sp, flags);
|
||||
if (!ULONG_CMP_LT(sp->srcu_gp_seq_needed_exp, s))
|
||||
sp->srcu_gp_seq_needed_exp = s;
|
||||
spin_unlock_irqrestore(&sp->gp_lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(sp, flags);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -600,14 +622,13 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
|
||||
for (; snp != NULL; snp = snp->srcu_parent) {
|
||||
if (rcu_seq_done(&sp->srcu_gp_seq, s) && snp != sdp->mynode)
|
||||
return; /* GP already done and CBs recorded. */
|
||||
spin_lock_irqsave(&snp->lock, flags);
|
||||
raw_spin_lock_irqsave_rcu_node(snp, flags);
|
||||
if (ULONG_CMP_GE(snp->srcu_have_cbs[idx], s)) {
|
||||
snp_seq = snp->srcu_have_cbs[idx];
|
||||
if (snp == sdp->mynode && snp_seq == s)
|
||||
snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
|
||||
spin_unlock_irqrestore(&snp->lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
if (snp == sdp->mynode && snp_seq != s) {
|
||||
smp_mb(); /* CBs after GP! */
|
||||
srcu_schedule_cbs_sdp(sdp, do_norm
|
||||
? SRCU_INTERVAL
|
||||
: 0);
|
||||
@ -622,11 +643,11 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
|
||||
snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
|
||||
if (!do_norm && ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, s))
|
||||
snp->srcu_gp_seq_needed_exp = s;
|
||||
spin_unlock_irqrestore(&snp->lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(snp, flags);
|
||||
}
|
||||
|
||||
/* Top of tree, must ensure the grace period will be started. */
|
||||
spin_lock_irqsave(&sp->gp_lock, flags);
|
||||
raw_spin_lock_irqsave_rcu_node(sp, flags);
|
||||
if (ULONG_CMP_LT(sp->srcu_gp_seq_needed, s)) {
|
||||
/*
|
||||
* Record need for grace period s. Pair with load
|
||||
@ -645,7 +666,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
|
||||
queue_delayed_work(system_power_efficient_wq, &sp->work,
|
||||
srcu_get_delay(sp));
|
||||
}
|
||||
spin_unlock_irqrestore(&sp->gp_lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(sp, flags);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -671,6 +692,16 @@ static bool try_check_zero(struct srcu_struct *sp, int idx, int trycount)
|
||||
*/
|
||||
static void srcu_flip(struct srcu_struct *sp)
|
||||
{
|
||||
/*
|
||||
* Ensure that if this updater saw a given reader's increment
|
||||
* from __srcu_read_lock(), that reader was using an old value
|
||||
* of ->srcu_idx. Also ensure that if a given reader sees the
|
||||
* new value of ->srcu_idx, this updater's earlier scans cannot
|
||||
* have seen that reader's increments (which is OK, because this
|
||||
* grace period need not wait on that reader).
|
||||
*/
|
||||
smp_mb(); /* E */ /* Pairs with B and C. */
|
||||
|
||||
WRITE_ONCE(sp->srcu_idx, sp->srcu_idx + 1);
|
||||
|
||||
/*
|
||||
@ -744,6 +775,13 @@ static bool srcu_might_be_idle(struct srcu_struct *sp)
|
||||
return true; /* With reasonable probability, idle! */
|
||||
}
|
||||
|
||||
/*
|
||||
* SRCU callback function to leak a callback.
|
||||
*/
|
||||
static void srcu_leak_callback(struct rcu_head *rhp)
|
||||
{
|
||||
}
|
||||
|
||||
/*
|
||||
* Enqueue an SRCU callback on the srcu_data structure associated with
|
||||
* the current CPU and the specified srcu_struct structure, initiating
|
||||
@ -782,10 +820,16 @@ void __call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
|
||||
struct srcu_data *sdp;
|
||||
|
||||
check_init_srcu_struct(sp);
|
||||
if (debug_rcu_head_queue(rhp)) {
|
||||
/* Probable double call_srcu(), so leak the callback. */
|
||||
WRITE_ONCE(rhp->func, srcu_leak_callback);
|
||||
WARN_ONCE(1, "call_srcu(): Leaked duplicate callback\n");
|
||||
return;
|
||||
}
|
||||
rhp->func = func;
|
||||
local_irq_save(flags);
|
||||
sdp = this_cpu_ptr(sp->sda);
|
||||
spin_lock(&sdp->lock);
|
||||
raw_spin_lock_rcu_node(sdp);
|
||||
rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp, false);
|
||||
rcu_segcblist_advance(&sdp->srcu_cblist,
|
||||
rcu_seq_current(&sp->srcu_gp_seq));
|
||||
@ -799,13 +843,30 @@ void __call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
|
||||
sdp->srcu_gp_seq_needed_exp = s;
|
||||
needexp = true;
|
||||
}
|
||||
spin_unlock_irqrestore(&sdp->lock, flags);
|
||||
raw_spin_unlock_irqrestore_rcu_node(sdp, flags);
|
||||
if (needgp)
|
||||
srcu_funnel_gp_start(sp, sdp, s, do_norm);
|
||||
else if (needexp)
|
||||
srcu_funnel_exp_start(sp, sdp->mynode, s);
|
||||
}
|
||||
|
||||
/**
|
||||
* call_srcu() - Queue a callback for invocation after an SRCU grace period
|
||||
 * @sp: srcu_struct on which to queue the callback
|
||||
* @head: structure to be used for queueing the SRCU callback.
|
||||
* @func: function to be invoked after the SRCU grace period
|
||||
*
|
||||
* The callback function will be invoked some time after a full SRCU
|
||||
* grace period elapses, in other words after all pre-existing SRCU
|
||||
* read-side critical sections have completed. However, the callback
|
||||
* function might well execute concurrently with other SRCU read-side
|
||||
* critical sections that started after call_srcu() was invoked. SRCU
|
||||
* read-side critical sections are delimited by srcu_read_lock() and
|
||||
* srcu_read_unlock(), and may be nested.
|
||||
*
|
||||
* The callback will be invoked from process context, but must nevertheless
|
||||
* be fast and must not block.
|
||||
*/
|
||||
void call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
|
||||
rcu_callback_t func)
|
||||
{
|
||||
@ -953,13 +1014,16 @@ void srcu_barrier(struct srcu_struct *sp)
|
||||
*/
|
||||
for_each_possible_cpu(cpu) {
|
||||
sdp = per_cpu_ptr(sp->sda, cpu);
|
||||
spin_lock_irq(&sdp->lock);
|
||||
raw_spin_lock_irq_rcu_node(sdp);
|
||||
atomic_inc(&sp->srcu_barrier_cpu_cnt);
|
||||
sdp->srcu_barrier_head.func = srcu_barrier_cb;
|
||||
debug_rcu_head_queue(&sdp->srcu_barrier_head);
|
||||
if (!rcu_segcblist_entrain(&sdp->srcu_cblist,
|
||||
&sdp->srcu_barrier_head, 0))
|
||||
&sdp->srcu_barrier_head, 0)) {
|
||||
debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
|
||||
atomic_dec(&sp->srcu_barrier_cpu_cnt);
|
||||
spin_unlock_irq(&sdp->lock);
|
||||
}
|
||||
raw_spin_unlock_irq_rcu_node(sdp);
|
||||
}
|
||||
|
||||
/* Remove the initial count, at which point reaching zero can happen. */
|
||||
@ -1008,17 +1072,17 @@ static void srcu_advance_state(struct srcu_struct *sp)
|
||||
*/
|
||||
idx = rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq)); /* ^^^ */
|
||||
if (idx == SRCU_STATE_IDLE) {
|
||||
spin_lock_irq(&sp->gp_lock);
|
||||
raw_spin_lock_irq_rcu_node(sp);
|
||||
if (ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed)) {
|
||||
WARN_ON_ONCE(rcu_seq_state(sp->srcu_gp_seq));
|
||||
spin_unlock_irq(&sp->gp_lock);
|
||||
raw_spin_unlock_irq_rcu_node(sp);
|
||||
mutex_unlock(&sp->srcu_gp_mutex);
|
||||
return;
|
||||
}
|
||||
idx = rcu_seq_state(READ_ONCE(sp->srcu_gp_seq));
|
||||
if (idx == SRCU_STATE_IDLE)
|
||||
srcu_gp_start(sp);
|
||||
spin_unlock_irq(&sp->gp_lock);
|
||||
raw_spin_unlock_irq_rcu_node(sp);
|
||||
if (idx != SRCU_STATE_IDLE) {
|
||||
mutex_unlock(&sp->srcu_gp_mutex);
|
||||
return; /* Someone else started the grace period. */
|
||||
@ -1067,22 +1131,22 @@ static void srcu_invoke_callbacks(struct work_struct *work)
|
||||
sdp = container_of(work, struct srcu_data, work.work);
|
||||
sp = sdp->sp;
|
||||
rcu_cblist_init(&ready_cbs);
|
||||
spin_lock_irq(&sdp->lock);
|
||||
smp_mb(); /* Old grace periods before callback invocation! */
|
||||
raw_spin_lock_irq_rcu_node(sdp);
|
||||
rcu_segcblist_advance(&sdp->srcu_cblist,
|
||||
rcu_seq_current(&sp->srcu_gp_seq));
|
||||
if (sdp->srcu_cblist_invoking ||
|
||||
!rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) {
|
||||
spin_unlock_irq(&sdp->lock);
|
||||
raw_spin_unlock_irq_rcu_node(sdp);
|
||||
return; /* Someone else on the job or nothing to do. */
|
||||
}
|
||||
|
||||
/* We are on the job! Extract and invoke ready callbacks. */
|
||||
sdp->srcu_cblist_invoking = true;
|
||||
rcu_segcblist_extract_done_cbs(&sdp->srcu_cblist, &ready_cbs);
|
||||
spin_unlock_irq(&sdp->lock);
|
||||
raw_spin_unlock_irq_rcu_node(sdp);
|
||||
rhp = rcu_cblist_dequeue(&ready_cbs);
|
||||
for (; rhp != NULL; rhp = rcu_cblist_dequeue(&ready_cbs)) {
|
||||
debug_rcu_head_unqueue(rhp);
|
||||
local_bh_disable();
|
||||
rhp->func(rhp);
|
||||
local_bh_enable();
|
||||
@ -1092,13 +1156,13 @@ static void srcu_invoke_callbacks(struct work_struct *work)
|
||||
* Update counts, accelerate new callbacks, and if needed,
|
||||
* schedule another round of callback invocation.
|
||||
*/
|
||||
spin_lock_irq(&sdp->lock);
|
||||
raw_spin_lock_irq_rcu_node(sdp);
|
||||
rcu_segcblist_insert_count(&sdp->srcu_cblist, &ready_cbs);
|
||||
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
|
||||
rcu_seq_snap(&sp->srcu_gp_seq));
|
||||
sdp->srcu_cblist_invoking = false;
|
||||
more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
|
||||
spin_unlock_irq(&sdp->lock);
|
||||
raw_spin_unlock_irq_rcu_node(sdp);
|
||||
if (more)
|
||||
srcu_schedule_cbs_sdp(sdp, 0);
|
||||
}
|
||||
@ -1111,7 +1175,7 @@ static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay)
|
||||
{
|
||||
bool pushgp = true;
|
||||
|
||||
spin_lock_irq(&sp->gp_lock);
|
||||
raw_spin_lock_irq_rcu_node(sp);
|
||||
if (ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed)) {
|
||||
if (!WARN_ON_ONCE(rcu_seq_state(sp->srcu_gp_seq))) {
|
||||
/* All requests fulfilled, time to go idle. */
|
||||
@ -1121,7 +1185,7 @@ static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay)
|
||||
/* Outstanding request and no GP. Start one. */
|
||||
srcu_gp_start(sp);
|
||||
}
|
||||
spin_unlock_irq(&sp->gp_lock);
|
||||
raw_spin_unlock_irq_rcu_node(sp);
|
||||
|
||||
if (pushgp)
|
||||
queue_delayed_work(system_power_efficient_wq, &sp->work, delay);
|
||||
@ -1152,3 +1216,12 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
|
||||
*gpnum = rcu_seq_ctr(sp->srcu_gp_seq_needed);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
|
||||
|
||||
static int __init srcu_bootup_announce(void)
{
	pr_info("Hierarchical SRCU implementation.\n");
	if (exp_holdoff != DEFAULT_SRCU_EXP_HOLDOFF)
		pr_info("\tNon-default auto-expedite holdoff of %lu ns.\n", exp_holdoff);
	return 0;
}
early_initcall(srcu_bootup_announce);
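For scale (an illustrative reading, not new semantics): the default holdoff of 25 * 1000 ns is 25 microseconds, the window after a recent grace period during which synchronize_srcu() declines to auto-expedite. Assuming the usual prefix that built-in module parameters take from their source file, this could be widened at boot time with, for example, srcutree.exp_holdoff=100000 (100 microseconds), which the boot message above would then report as non-default.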
|
||||
|
@ -35,15 +35,26 @@
|
||||
#include <linux/time.h>
|
||||
#include <linux/cpu.h>
|
||||
#include <linux/prefetch.h>
|
||||
#include <linux/trace_events.h>
|
||||
|
||||
#include "rcu.h"
|
||||
|
||||
/* Forward declarations for tiny_plugin.h. */
|
||||
struct rcu_ctrlblk;
|
||||
static void __call_rcu(struct rcu_head *head,
|
||||
rcu_callback_t func,
|
||||
struct rcu_ctrlblk *rcp);
|
||||
/* Global control variables for rcupdate callback mechanism. */
|
||||
struct rcu_ctrlblk {
|
||||
struct rcu_head *rcucblist; /* List of pending callbacks (CBs). */
|
||||
struct rcu_head **donetail; /* ->next pointer of last "done" CB. */
|
||||
struct rcu_head **curtail; /* ->next pointer of last CB. */
|
||||
};
|
||||
|
||||
/* Definition for rcupdate control block. */
|
||||
static struct rcu_ctrlblk rcu_sched_ctrlblk = {
|
||||
.donetail = &rcu_sched_ctrlblk.rcucblist,
|
||||
.curtail = &rcu_sched_ctrlblk.rcucblist,
|
||||
};
|
||||
|
||||
static struct rcu_ctrlblk rcu_bh_ctrlblk = {
|
||||
.donetail = &rcu_bh_ctrlblk.rcucblist,
|
||||
.curtail = &rcu_bh_ctrlblk.rcucblist,
|
||||
};
|
||||
|
||||
#include "tiny_plugin.h"
|
||||
|
||||
@ -59,19 +70,6 @@ void rcu_barrier_sched(void)
|
||||
}
|
||||
EXPORT_SYMBOL(rcu_barrier_sched);
|
||||
|
||||
#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE)
|
||||
|
||||
/*
|
||||
* Test whether RCU thinks that the current CPU is idle.
|
||||
*/
|
||||
bool notrace __rcu_is_watching(void)
|
||||
{
|
||||
return true;
|
||||
}
|
||||
EXPORT_SYMBOL(__rcu_is_watching);
|
||||
|
||||
#endif /* defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) */
|
||||
|
||||
/*
|
||||
* Helper function for rcu_sched_qs() and rcu_bh_qs().
|
||||
* Also irqs are disabled to avoid confusion due to interrupt handlers
|
||||
@ -79,7 +77,6 @@ EXPORT_SYMBOL(__rcu_is_watching);
|
||||
*/
|
||||
static int rcu_qsctr_help(struct rcu_ctrlblk *rcp)
|
||||
{
|
||||
RCU_TRACE(reset_cpu_stall_ticks(rcp);)
|
||||
if (rcp->donetail != rcp->curtail) {
|
||||
rcp->donetail = rcp->curtail;
|
||||
return 1;
|
||||
@ -125,7 +122,6 @@ void rcu_bh_qs(void)
|
||||
*/
|
||||
void rcu_check_callbacks(int user)
|
||||
{
|
||||
RCU_TRACE(check_cpu_stalls();)
|
||||
if (user)
|
||||
rcu_sched_qs();
|
||||
else if (!in_softirq())
|
||||
@ -140,10 +136,8 @@ void rcu_check_callbacks(int user)
|
||||
*/
|
||||
static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
|
||||
{
|
||||
const char *rn = NULL;
|
||||
struct rcu_head *next, *list;
|
||||
unsigned long flags;
|
||||
RCU_TRACE(int cb_count = 0;)
|
||||
|
||||
/* Move the ready-to-invoke callbacks to a local list. */
|
||||
local_irq_save(flags);
|
||||
@ -152,7 +146,6 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
|
||||
local_irq_restore(flags);
|
||||
return;
|
||||
}
|
||||
RCU_TRACE(trace_rcu_batch_start(rcp->name, 0, rcp->qlen, -1);)
|
||||
list = rcp->rcucblist;
|
||||
rcp->rcucblist = *rcp->donetail;
|
||||
*rcp->donetail = NULL;
|
||||
@ -162,22 +155,15 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
|
||||
local_irq_restore(flags);
|
||||
|
||||
/* Invoke the callbacks on the local list. */
|
||||
RCU_TRACE(rn = rcp->name;)
|
||||
while (list) {
|
||||
next = list->next;
|
||||
prefetch(next);
|
||||
debug_rcu_head_unqueue(list);
|
||||
local_bh_disable();
|
||||
__rcu_reclaim(rn, list);
|
||||
__rcu_reclaim("", list);
|
||||
local_bh_enable();
|
||||
list = next;
|
||||
RCU_TRACE(cb_count++;)
|
||||
}
|
||||
RCU_TRACE(rcu_trace_sub_qlen(rcp, cb_count);)
|
||||
RCU_TRACE(trace_rcu_batch_end(rcp->name,
|
||||
cb_count, 0, need_resched(),
|
||||
is_idle_task(current),
|
||||
false));
|
||||
}
|
||||
|
||||
static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused)
|
||||
@ -221,7 +207,6 @@ static void __call_rcu(struct rcu_head *head,
|
||||
local_irq_save(flags);
|
||||
*rcp->curtail = head;
|
||||
rcp->curtail = &head->next;
|
||||
RCU_TRACE(rcp->qlen++;)
|
||||
local_irq_restore(flags);
|
||||
|
||||
if (unlikely(is_idle_task(current))) {
|
||||
@ -254,8 +239,5 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
|
||||
void __init rcu_init(void)
|
||||
{
|
||||
open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
|
||||
RCU_TRACE(reset_cpu_stall_ticks(&rcu_sched_ctrlblk);)
|
||||
RCU_TRACE(reset_cpu_stall_ticks(&rcu_bh_ctrlblk);)
|
||||
|
||||
rcu_early_boot_tests();
|
||||
}
|
||||
|
@ -22,36 +22,6 @@
|
||||
* Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
||||
*/
|
||||
|
||||
#include <linux/kthread.h>
|
||||
#include <linux/init.h>
|
||||
#include <linux/debugfs.h>
|
||||
#include <linux/seq_file.h>
|
||||
|
||||
/* Global control variables for rcupdate callback mechanism. */
|
||||
struct rcu_ctrlblk {
|
||||
struct rcu_head *rcucblist; /* List of pending callbacks (CBs). */
|
||||
struct rcu_head **donetail; /* ->next pointer of last "done" CB. */
|
||||
struct rcu_head **curtail; /* ->next pointer of last CB. */
|
||||
RCU_TRACE(long qlen); /* Number of pending CBs. */
|
||||
RCU_TRACE(unsigned long gp_start); /* Start time for stalls. */
|
||||
RCU_TRACE(unsigned long ticks_this_gp); /* Statistic for stalls. */
|
||||
RCU_TRACE(unsigned long jiffies_stall); /* Jiffies at next stall. */
|
||||
RCU_TRACE(const char *name); /* Name of RCU type. */
|
||||
};
|
||||
|
||||
/* Definition for rcupdate control block. */
|
||||
static struct rcu_ctrlblk rcu_sched_ctrlblk = {
|
||||
.donetail = &rcu_sched_ctrlblk.rcucblist,
|
||||
.curtail = &rcu_sched_ctrlblk.rcucblist,
|
||||
RCU_TRACE(.name = "rcu_sched")
|
||||
};
|
||||
|
||||
static struct rcu_ctrlblk rcu_bh_ctrlblk = {
|
||||
.donetail = &rcu_bh_ctrlblk.rcucblist,
|
||||
.curtail = &rcu_bh_ctrlblk.rcucblist,
|
||||
RCU_TRACE(.name = "rcu_bh")
|
||||
};
|
||||
|
||||
#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU)
|
||||
#include <linux/kernel_stat.h>
|
||||
|
||||
@ -75,96 +45,3 @@ void __init rcu_scheduler_starting(void)
|
||||
}
|
||||
|
||||
#endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU) */
|
||||
|
||||
#ifdef CONFIG_RCU_TRACE
|
||||
|
||||
static void rcu_trace_sub_qlen(struct rcu_ctrlblk *rcp, int n)
|
||||
{
|
||||
unsigned long flags;
|
||||
|
||||
local_irq_save(flags);
|
||||
rcp->qlen -= n;
|
||||
local_irq_restore(flags);
|
||||
}
|
||||
|
||||
/*
|
||||
* Dump statistics for TINY_RCU, such as they are.
|
||||
*/
|
||||
static int show_tiny_stats(struct seq_file *m, void *unused)
|
||||
{
|
||||
seq_printf(m, "rcu_sched: qlen: %ld\n", rcu_sched_ctrlblk.qlen);
|
||||
seq_printf(m, "rcu_bh: qlen: %ld\n", rcu_bh_ctrlblk.qlen);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int show_tiny_stats_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return single_open(file, show_tiny_stats, NULL);
|
||||
}
|
||||
|
||||
static const struct file_operations show_tiny_stats_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = show_tiny_stats_open,
|
||||
.read = seq_read,
|
||||
.llseek = seq_lseek,
|
||||
.release = single_release,
|
||||
};
|
||||
|
||||
static struct dentry *rcudir;
|
||||
|
||||
static int __init rcutiny_trace_init(void)
|
||||
{
|
||||
struct dentry *retval;
|
||||
|
||||
rcudir = debugfs_create_dir("rcu", NULL);
|
||||
if (!rcudir)
|
||||
goto free_out;
|
||||
retval = debugfs_create_file("rcudata", 0444, rcudir,
|
||||
NULL, &show_tiny_stats_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
return 0;
|
||||
free_out:
|
||||
debugfs_remove_recursive(rcudir);
|
||||
return 1;
|
||||
}
|
||||
device_initcall(rcutiny_trace_init);
|
||||
|
||||
static void check_cpu_stall(struct rcu_ctrlblk *rcp)
|
||||
{
|
||||
unsigned long j;
|
||||
unsigned long js;
|
||||
|
||||
if (rcu_cpu_stall_suppress)
|
||||
return;
|
||||
rcp->ticks_this_gp++;
|
||||
j = jiffies;
|
||||
js = READ_ONCE(rcp->jiffies_stall);
|
||||
if (rcp->rcucblist && ULONG_CMP_GE(j, js)) {
|
||||
pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
|
||||
rcp->name, rcp->ticks_this_gp, DYNTICK_TASK_EXIT_IDLE,
|
||||
jiffies - rcp->gp_start, rcp->qlen);
|
||||
dump_stack();
|
||||
WRITE_ONCE(rcp->jiffies_stall,
|
||||
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
|
||||
} else if (ULONG_CMP_GE(j, js)) {
|
||||
WRITE_ONCE(rcp->jiffies_stall,
|
||||
jiffies + rcu_jiffies_till_stall_check());
|
||||
}
|
||||
}
|
||||
|
||||
static void reset_cpu_stall_ticks(struct rcu_ctrlblk *rcp)
|
||||
{
|
||||
rcp->ticks_this_gp = 0;
|
||||
rcp->gp_start = jiffies;
|
||||
WRITE_ONCE(rcp->jiffies_stall,
|
||||
jiffies + rcu_jiffies_till_stall_check());
|
||||
}
|
||||
|
||||
static void check_cpu_stalls(void)
|
||||
{
|
||||
RCU_TRACE(check_cpu_stall(&rcu_bh_ctrlblk);)
|
||||
RCU_TRACE(check_cpu_stall(&rcu_sched_ctrlblk);)
|
||||
}
|
||||
|
||||
#endif /* #ifdef CONFIG_RCU_TRACE */
|
||||
|
@ -168,35 +168,17 @@ static void rcu_report_exp_rdp(struct rcu_state *rsp,
|
||||
static void sync_sched_exp_online_cleanup(int cpu);
|
||||
|
||||
/* rcuc/rcub kthread realtime priority */
|
||||
#ifdef CONFIG_RCU_KTHREAD_PRIO
|
||||
static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
|
||||
#else /* #ifdef CONFIG_RCU_KTHREAD_PRIO */
|
||||
static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
|
||||
#endif /* #else #ifdef CONFIG_RCU_KTHREAD_PRIO */
|
||||
module_param(kthread_prio, int, 0644);
|
||||
|
||||
/* Delay in jiffies for grace-period initialization delays, debug only. */
|
||||
|
||||
#ifdef CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT
|
||||
static int gp_preinit_delay = CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT_DELAY;
|
||||
module_param(gp_preinit_delay, int, 0644);
|
||||
#else /* #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT */
|
||||
static const int gp_preinit_delay;
|
||||
#endif /* #else #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT */
|
||||
|
||||
#ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT
|
||||
static int gp_init_delay = CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY;
|
||||
module_param(gp_init_delay, int, 0644);
|
||||
#else /* #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT */
|
||||
static const int gp_init_delay;
|
||||
#endif /* #else #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_INIT */
|
||||
|
||||
#ifdef CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP
|
||||
static int gp_cleanup_delay = CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY;
|
||||
module_param(gp_cleanup_delay, int, 0644);
|
||||
#else /* #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP */
|
||||
static const int gp_cleanup_delay;
|
||||
#endif /* #else #ifdef CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP */
|
||||
static int gp_preinit_delay;
|
||||
module_param(gp_preinit_delay, int, 0444);
|
||||
static int gp_init_delay;
|
||||
module_param(gp_init_delay, int, 0444);
|
||||
static int gp_cleanup_delay;
|
||||
module_param(gp_cleanup_delay, int, 0444);
|
||||
|
||||
/*
|
||||
* Number of grace periods between delays, normalized by the duration of
|
||||
@ -250,6 +232,7 @@ static int rcu_gp_in_progress(struct rcu_state *rsp)
|
||||
*/
|
||||
void rcu_sched_qs(void)
|
||||
{
|
||||
RCU_LOCKDEP_WARN(preemptible(), "rcu_sched_qs() invoked with preemption enabled!!!");
|
||||
if (!__this_cpu_read(rcu_sched_data.cpu_no_qs.s))
|
||||
return;
|
||||
trace_rcu_grace_period(TPS("rcu_sched"),
|
||||
@ -265,6 +248,7 @@ void rcu_sched_qs(void)
|
||||
|
||||
void rcu_bh_qs(void)
|
||||
{
|
||||
RCU_LOCKDEP_WARN(preemptible(), "rcu_bh_qs() invoked with preemption enabled!!!");
|
||||
if (__this_cpu_read(rcu_bh_data.cpu_no_qs.s)) {
|
||||
trace_rcu_grace_period(TPS("rcu_bh"),
|
||||
__this_cpu_read(rcu_bh_data.gpnum),
|
||||
@ -286,10 +270,6 @@ void rcu_bh_qs(void)
|
||||
static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
|
||||
.dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
|
||||
.dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR),
|
||||
#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
|
||||
.dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE,
|
||||
.dynticks_idle = ATOMIC_INIT(1),
|
||||
#endif /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
|
||||
};
|
||||
|
||||
/*
|
||||
@ -478,7 +458,7 @@ void rcu_note_context_switch(bool preempt)
|
||||
barrier(); /* Avoid RCU read-side critical sections leaking down. */
|
||||
trace_rcu_utilization(TPS("Start context switch"));
|
||||
rcu_sched_qs();
|
||||
rcu_preempt_note_context_switch();
|
||||
rcu_preempt_note_context_switch(preempt);
|
||||
/* Load rcu_urgent_qs before other flags. */
|
||||
if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)))
|
||||
goto out;
|
||||
@ -534,9 +514,12 @@ void rcu_all_qs(void)
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(rcu_all_qs);
|
||||
|
||||
static long blimit = 10; /* Maximum callbacks per rcu_do_batch. */
|
||||
static long qhimark = 10000; /* If this many pending, ignore blimit. */
|
||||
static long qlowmark = 100; /* Once only this many pending, use blimit. */
|
||||
#define DEFAULT_RCU_BLIMIT 10 /* Maximum callbacks per rcu_do_batch. */
|
||||
static long blimit = DEFAULT_RCU_BLIMIT;
|
||||
#define DEFAULT_RCU_QHIMARK 10000 /* If this many pending, ignore blimit. */
|
||||
static long qhimark = DEFAULT_RCU_QHIMARK;
|
||||
#define DEFAULT_RCU_QLOMARK 100 /* Once only this many pending, use blimit. */
|
||||
static long qlowmark = DEFAULT_RCU_QLOMARK;
|
||||
|
||||
module_param(blimit, long, 0444);
|
||||
module_param(qhimark, long, 0444);
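A worked reading of the defaults above: each CPU invokes at most DEFAULT_RCU_BLIMIT (10) callbacks per rcu_do_batch() pass; once its backlog exceeds DEFAULT_RCU_QHIMARK (10,000) pending callbacks the limit is ignored, and the limit is honored again only after the backlog falls back to DEFAULT_RCU_QLOMARK (100) or fewer.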
|
||||
@ -559,10 +542,7 @@ module_param(jiffies_till_sched_qs, ulong, 0644);
|
||||
|
||||
static bool rcu_start_gp_advanced(struct rcu_state *rsp, struct rcu_node *rnp,
|
||||
struct rcu_data *rdp);
|
||||
static void force_qs_rnp(struct rcu_state *rsp,
|
||||
int (*f)(struct rcu_data *rsp, bool *isidle,
|
||||
unsigned long *maxj),
|
||||
bool *isidle, unsigned long *maxj);
|
||||
static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp));
|
||||
static void force_quiescent_state(struct rcu_state *rsp);
|
||||
static int rcu_pending(void);
|
||||
|
||||
@ -757,6 +737,7 @@ static int rcu_future_needs_gp(struct rcu_state *rsp)
|
||||
int idx = (READ_ONCE(rnp->completed) + 1) & 0x1;
|
||||
int *fp = &rnp->need_future_gp[idx];
|
||||
|
||||
RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_future_needs_gp() invoked with irqs enabled!!!");
|
||||
return READ_ONCE(*fp);
|
||||
}
|
||||
|
||||
@ -768,6 +749,7 @@ static int rcu_future_needs_gp(struct rcu_state *rsp)
|
||||
static bool
|
||||
cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
|
||||
{
|
||||
RCU_LOCKDEP_WARN(!irqs_disabled(), "cpu_needs_another_gp() invoked with irqs enabled!!!");
|
||||
if (rcu_gp_in_progress(rsp))
|
||||
return false; /* No, a grace period is already in progress. */
|
||||
if (rcu_future_needs_gp(rsp))
|
||||
@ -794,6 +776,7 @@ static void rcu_eqs_enter_common(bool user)
|
||||
struct rcu_data *rdp;
|
||||
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
|
||||
|
||||
RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_eqs_enter_common() invoked with irqs enabled!!!");
|
||||
trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
|
||||
if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
|
||||
!user && !is_idle_task(current)) {
|
||||
@ -864,7 +847,6 @@ void rcu_idle_enter(void)
|
||||
|
||||
local_irq_save(flags);
|
||||
rcu_eqs_enter(false);
|
||||
rcu_sysidle_enter(0);
|
||||
local_irq_restore(flags);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(rcu_idle_enter);
|
||||
@ -914,7 +896,6 @@ void rcu_irq_exit(void)
|
||||
trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nesting, rdtp->dynticks_nesting - 1);
|
||||
rdtp->dynticks_nesting--;
|
||||
}
|
||||
rcu_sysidle_enter(1);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -967,6 +948,7 @@ static void rcu_eqs_exit(bool user)
|
||||
struct rcu_dynticks *rdtp;
|
||||
long long oldval;
|
||||
|
||||
RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_eqs_exit() invoked with irqs enabled!!!");
|
||||
rdtp = this_cpu_ptr(&rcu_dynticks);
|
||||
oldval = rdtp->dynticks_nesting;
|
||||
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
|
||||
@ -995,7 +977,6 @@ void rcu_idle_exit(void)
|
||||
|
||||
local_irq_save(flags);
|
||||
rcu_eqs_exit(false);
|
||||
rcu_sysidle_exit(0);
|
||||
local_irq_restore(flags);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(rcu_idle_exit);
|
||||
@ -1047,7 +1028,6 @@ void rcu_irq_enter(void)
|
||||
trace_rcu_dyntick(TPS("++="), oldval, rdtp->dynticks_nesting);
|
||||
else
|
||||
rcu_eqs_exit_common(oldval, true);
|
||||
rcu_sysidle_exit(1);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -1129,23 +1109,12 @@ void rcu_nmi_exit(void)
|
||||
rcu_dynticks_eqs_enter();
|
||||
}
|
||||
|
||||
/**
|
||||
* __rcu_is_watching - are RCU read-side critical sections safe?
|
||||
*
|
||||
* Return true if RCU is watching the running CPU, which means that
|
||||
* this CPU can safely enter RCU read-side critical sections. Unlike
|
||||
* rcu_is_watching(), the caller of __rcu_is_watching() must have at
|
||||
* least disabled preemption.
|
||||
*/
|
||||
bool notrace __rcu_is_watching(void)
|
||||
{
|
||||
return !rcu_dynticks_curr_cpu_in_eqs();
|
||||
}
|
||||
|
||||
/**
|
||||
* rcu_is_watching - see if RCU thinks that the current CPU is idle
|
||||
*
|
||||
* If the current CPU is in its idle loop and is neither in an interrupt
|
||||
* Return true if RCU is watching the running CPU, which means that this
|
||||
* CPU can safely enter RCU read-side critical sections. In other words,
|
||||
* if the current CPU is in its idle loop and is neither in an interrupt
|
||||
* or NMI handler, return true.
|
||||
*/
|
||||
bool notrace rcu_is_watching(void)
|
||||
@ -1153,7 +1122,7 @@ bool notrace rcu_is_watching(void)
|
||||
bool ret;
|
||||
|
||||
preempt_disable_notrace();
|
||||
ret = __rcu_is_watching();
|
||||
ret = !rcu_dynticks_curr_cpu_in_eqs();
|
||||
preempt_enable_notrace();
|
||||
return ret;
|
||||
}
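A hedged usage sketch (not part of this patch): code that can be reached from the idle loop or from architecture entry/exit paths sometimes guards its RCU usage this way; the function name is an assumption for illustration:

	static void example_maybe_trace(void)
	{
		if (!rcu_is_watching())
			return;			/* RCU idle here: skip rcu_read_lock() */
		rcu_read_lock();
		/* ... touch RCU-protected data ... */
		rcu_read_unlock();
	}
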
|
||||
@ -1237,11 +1206,9 @@ static int rcu_is_cpu_rrupt_from_idle(void)
|
||||
* credit them with an implicit quiescent state. Return 1 if this CPU
|
||||
* is in dynticks idle mode, which is an extended quiescent state.
|
||||
*/
|
||||
static int dyntick_save_progress_counter(struct rcu_data *rdp,
|
||||
bool *isidle, unsigned long *maxj)
|
||||
static int dyntick_save_progress_counter(struct rcu_data *rdp)
|
||||
{
|
||||
rdp->dynticks_snap = rcu_dynticks_snap(rdp->dynticks);
|
||||
rcu_sysidle_check_cpu(rdp, isidle, maxj);
|
||||
if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) {
|
||||
trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, TPS("dti"));
|
||||
if (ULONG_CMP_LT(READ_ONCE(rdp->gpnum) + ULONG_MAX / 4,
|
||||
@ -1258,8 +1225,7 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp,
|
||||
* idle state since the last call to dyntick_save_progress_counter()
|
||||
* for this same CPU, or by virtue of having been offline.
|
||||
*/
|
||||
static int rcu_implicit_dynticks_qs(struct rcu_data *rdp,
|
||||
bool *isidle, unsigned long *maxj)
|
||||
static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
|
||||
{
|
||||
unsigned long jtsq;
|
||||
bool *rnhqp;
|
||||
@ -1674,6 +1640,8 @@ void rcu_cpu_stall_reset(void)
|
||||
static unsigned long rcu_cbs_completed(struct rcu_state *rsp,
|
||||
struct rcu_node *rnp)
|
||||
{
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
|
||||
/*
|
||||
* If RCU is idle, we just wait for the next grace period.
|
||||
* But we can only be sure that RCU is idle if we are looking
|
||||
@ -1719,6 +1687,8 @@ rcu_start_future_gp(struct rcu_node *rnp, struct rcu_data *rdp,
|
||||
bool ret = false;
|
||||
struct rcu_node *rnp_root = rcu_get_root(rdp->rsp);
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
|
||||
/*
|
||||
* Pick up grace-period number for new callbacks. If this
|
||||
* grace period is already marked as needed, return to the caller.
|
||||
@ -1845,6 +1815,8 @@ static bool rcu_accelerate_cbs(struct rcu_state *rsp, struct rcu_node *rnp,
|
||||
{
|
||||
bool ret = false;
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
|
||||
/* If no pending (not yet ready to invoke) callbacks, nothing to do. */
|
||||
if (!rcu_segcblist_pend_cbs(&rdp->cblist))
|
||||
return false;
|
||||
@ -1883,6 +1855,8 @@ static bool rcu_accelerate_cbs(struct rcu_state *rsp, struct rcu_node *rnp,
|
||||
static bool rcu_advance_cbs(struct rcu_state *rsp, struct rcu_node *rnp,
|
||||
struct rcu_data *rdp)
|
||||
{
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
|
||||
/* If no pending (not yet ready to invoke) callbacks, nothing to do. */
|
||||
if (!rcu_segcblist_pend_cbs(&rdp->cblist))
|
||||
return false;
|
||||
@ -1909,6 +1883,8 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
|
||||
bool ret;
|
||||
bool need_gp;
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
|
||||
/* Handle the ends of any preceding grace periods first. */
|
||||
if (rdp->completed == rnp->completed &&
|
||||
!unlikely(READ_ONCE(rdp->gpwrap))) {
|
||||
@ -2115,25 +2091,16 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
|
||||
*/
|
||||
static void rcu_gp_fqs(struct rcu_state *rsp, bool first_time)
|
||||
{
|
||||
bool isidle = false;
|
||||
unsigned long maxj;
|
||||
struct rcu_node *rnp = rcu_get_root(rsp);
|
||||
|
||||
WRITE_ONCE(rsp->gp_activity, jiffies);
|
||||
rsp->n_force_qs++;
|
||||
if (first_time) {
|
||||
/* Collect dyntick-idle snapshots. */
|
||||
if (is_sysidle_rcu_state(rsp)) {
|
||||
isidle = true;
|
||||
maxj = jiffies - ULONG_MAX / 4;
|
||||
}
|
||||
force_qs_rnp(rsp, dyntick_save_progress_counter,
|
||||
&isidle, &maxj);
|
||||
rcu_sysidle_report_gp(rsp, isidle, maxj);
|
||||
force_qs_rnp(rsp, dyntick_save_progress_counter);
|
||||
} else {
|
||||
/* Handle dyntick-idle and offline CPUs. */
|
||||
isidle = true;
|
||||
force_qs_rnp(rsp, rcu_implicit_dynticks_qs, &isidle, &maxj);
|
||||
force_qs_rnp(rsp, rcu_implicit_dynticks_qs);
|
||||
}
|
||||
/* Clear flag to prevent immediate re-entry. */
|
||||
if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) {
|
||||
@ -2341,6 +2308,7 @@ static bool
|
||||
rcu_start_gp_advanced(struct rcu_state *rsp, struct rcu_node *rnp,
|
||||
struct rcu_data *rdp)
|
||||
{
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
if (!rsp->gp_kthread || !cpu_needs_another_gp(rsp, rdp)) {
|
||||
/*
|
||||
* Either we have not yet spawned the grace-period
|
||||
@ -2402,6 +2370,7 @@ static bool rcu_start_gp(struct rcu_state *rsp)
|
||||
static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
|
||||
__releases(rcu_get_root(rsp)->lock)
|
||||
{
|
||||
lockdep_assert_held(&rcu_get_root(rsp)->lock);
|
||||
WARN_ON_ONCE(!rcu_gp_in_progress(rsp));
|
||||
WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS);
|
||||
raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(rsp), flags);
|
||||
@ -2426,6 +2395,8 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
|
||||
unsigned long oldmask = 0;
|
||||
struct rcu_node *rnp_c;
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
|
||||
/* Walk up the rcu_node hierarchy. */
|
||||
for (;;) {
|
||||
if (!(rnp->qsmask & mask) || rnp->gpnum != gps) {
|
||||
@ -2486,6 +2457,7 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
|
||||
unsigned long mask;
|
||||
struct rcu_node *rnp_p;
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
if (rcu_state_p == &rcu_sched_state || rsp != rcu_state_p ||
|
||||
rnp->qsmask != 0 || rcu_preempt_blocked_readers_cgp(rnp)) {
|
||||
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
|
||||
@ -2599,6 +2571,8 @@ static void
|
||||
rcu_send_cbs_to_orphanage(int cpu, struct rcu_state *rsp,
|
||||
struct rcu_node *rnp, struct rcu_data *rdp)
|
||||
{
|
||||
lockdep_assert_held(&rsp->orphan_lock);
|
||||
|
||||
/* No-CBs CPUs do not have orphanable callbacks. */
|
||||
if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) || rcu_is_nocb_cpu(rdp->cpu))
|
||||
return;
|
||||
@ -2639,6 +2613,8 @@ static void rcu_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags)
|
||||
{
|
||||
struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
|
||||
|
||||
lockdep_assert_held(&rsp->orphan_lock);
|
||||
|
||||
/* No-CBs CPUs are handled specially. */
|
||||
if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) ||
|
||||
rcu_nocb_adopt_orphan_cbs(rsp, rdp, flags))
|
||||
@ -2705,6 +2681,7 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
|
||||
long mask;
|
||||
struct rcu_node *rnp = rnp_leaf;
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) ||
|
||||
rnp->qsmaskinit || rcu_preempt_has_tasks(rnp))
|
||||
return;
|
||||
@ -2895,10 +2872,7 @@ void rcu_check_callbacks(int user)
|
||||
*
|
||||
* The caller must have suppressed start of new grace periods.
|
||||
*/
|
||||
static void force_qs_rnp(struct rcu_state *rsp,
|
||||
int (*f)(struct rcu_data *rsp, bool *isidle,
|
||||
unsigned long *maxj),
|
||||
bool *isidle, unsigned long *maxj)
|
||||
static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp))
|
||||
{
|
||||
int cpu;
|
||||
unsigned long flags;
|
||||
@ -2937,7 +2911,7 @@ static void force_qs_rnp(struct rcu_state *rsp,
|
||||
for_each_leaf_node_possible_cpu(rnp, cpu) {
|
||||
unsigned long bit = leaf_node_cpu_bit(rnp, cpu);
|
||||
if ((rnp->qsmask & bit) != 0) {
|
||||
if (f(per_cpu_ptr(rsp->rda, cpu), isidle, maxj))
|
||||
if (f(per_cpu_ptr(rsp->rda, cpu)))
|
||||
mask |= bit;
|
||||
}
|
||||
}
|
||||
@ -3143,9 +3117,14 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func,
|
||||
WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));
|
||||
|
||||
if (debug_rcu_head_queue(head)) {
|
||||
/* Probable double call_rcu(), so leak the callback. */
|
||||
/*
|
||||
* Probable double call_rcu(), so leak the callback.
|
||||
* Use rcu:rcu_callback trace event to find the previous
|
||||
* time callback was passed to __call_rcu().
|
||||
*/
|
||||
WARN_ONCE(1, "__call_rcu(): Double-freed CB %p->%pF()!!!\n",
|
||||
head, head->func);
|
||||
WRITE_ONCE(head->func, rcu_leak_callback);
|
||||
WARN_ONCE(1, "__call_rcu(): Leaked duplicate callback\n");
|
||||
return;
|
||||
}
|
||||
head->func = func;
|
||||
@ -3194,8 +3173,24 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func,
|
||||
local_irq_restore(flags);
|
||||
}
|
||||
|
||||
/*
|
||||
* Queue an RCU-sched callback for invocation after a grace period.
|
||||
/**
 * call_rcu_sched() - Queue an RCU callback for invocation after a sched grace period.
 * @head: structure to be used for queueing the RCU updates.
 * @func: actual callback function to be invoked after the grace period
 *
 * The callback function will be invoked some time after a full grace
 * period elapses, in other words after all currently executing RCU
 * read-side critical sections have completed. call_rcu_sched() assumes
 * that the read-side critical sections end on enabling of preemption
 * or on voluntary preemption.
 * RCU read-side critical sections are delimited by:
 *  - rcu_read_lock_sched() and rcu_read_unlock_sched(), OR
 *  - anything that disables preemption.
 *
 * These may be nested.
 *
 * See the description of call_rcu() for more detailed information on
 * memory ordering guarantees.
 */
|
||||
void call_rcu_sched(struct rcu_head *head, rcu_callback_t func)
|
||||
{
|
||||
@ -3203,8 +3198,26 @@ void call_rcu_sched(struct rcu_head *head, rcu_callback_t func)
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(call_rcu_sched);
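For reference, a minimal sketch (not part of this patch) of the usual call_rcu_sched() pattern: embed an rcu_head in the protected structure and free it from the callback via container_of(). All names other than the RCU APIs are assumptions for illustration:

	#include <linux/rcupdate.h>
	#include <linux/slab.h>

	struct example {
		int payload;
		struct rcu_head rh;		/* used only by the callback below */
	};

	static void example_free_cb(struct rcu_head *rhp)
	{
		kfree(container_of(rhp, struct example, rh));
	}

	static void example_retire(struct example *p)
	{
		/* Freed only after all preemption-disabled readers have finished. */
		call_rcu_sched(&p->rh, example_free_cb);
	}
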
|
||||
|
||||
/*
|
||||
* Queue an RCU callback for invocation after a quicker grace period.
|
||||
/**
|
||||
 * call_rcu_bh() - Queue an RCU callback for invocation after a quicker grace period.
|
||||
* @head: structure to be used for queueing the RCU updates.
|
||||
* @func: actual callback function to be invoked after the grace period
|
||||
*
|
||||
* The callback function will be invoked some time after a full grace
|
||||
* period elapses, in other words after all currently executing RCU
|
||||
* read-side critical sections have completed. call_rcu_bh() assumes
|
||||
* that the read-side critical sections end on completion of a softirq
|
||||
* handler. This means that read-side critical sections in process
|
||||
* context must not be interrupted by softirqs. This interface is to be
|
||||
* used when most of the read-side critical sections are in softirq context.
|
||||
* RCU read-side critical sections are delimited by:
|
||||
* - rcu_read_lock() and rcu_read_unlock(), if in interrupt context.
|
||||
* OR
|
||||
* - rcu_read_lock_bh() and rcu_read_unlock_bh(), if in process context.
|
||||
* These may be nested.
|
||||
*
|
||||
* See the description of call_rcu() for more detailed information on
|
||||
* memory ordering guarantees.
|
||||
*/
|
||||
void call_rcu_bh(struct rcu_head *head, rcu_callback_t func)
|
||||
{
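
As a hedged illustration of the softirq-oriented contract described above (again not taken from this patch, with hypothetical names), a process-context reader blocks softirqs with rcu_read_lock_bh() so that the quicker BH grace period waits for it:

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct stats {
	unsigned long packets;
	struct rcu_head rh;
};

static struct stats __rcu *cur_stats;

/* Process-context reader: softirqs disabled, so the BH grace period waits for us. */
static unsigned long stats_read(void)
{
	unsigned long n = 0;
	struct stats *s;

	rcu_read_lock_bh();
	s = rcu_dereference_bh(cur_stats);
	if (s)
		n = s->packets;
	rcu_read_unlock_bh();
	return n;
}

static void stats_free(struct rcu_head *rh)
{
	kfree(container_of(rh, struct stats, rh));
}

/* Updater (single updater assumed): publish a fresh structure, defer the free. */
static void stats_reset(struct stats *fresh)
{
	struct stats *old = rcu_dereference_protected(cur_stats, 1);

	rcu_assign_pointer(cur_stats, fresh);
	if (old)
		call_rcu_bh(&old->rh, stats_free);
}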
|
||||
@ -3280,12 +3293,6 @@ static inline int rcu_blocking_is_gp(void)
|
||||
* to have executed a full memory barrier during the execution of
|
||||
* synchronize_sched() -- even if CPU A and CPU B are the same CPU (but
|
||||
* again only if the system has more than one CPU).
|
||||
*
|
||||
* This primitive provides the guarantees made by the (now removed)
|
||||
* synchronize_kernel() API. In contrast, synchronize_rcu() only
|
||||
* guarantees that rcu_read_lock() sections will have completed.
|
||||
* In "classic RCU", these two guarantees happen to be one and
|
||||
* the same, but can differ in realtime RCU implementations.
|
||||
*/
|
||||
void synchronize_sched(void)
|
||||
{
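
For contrast with the callback-based sketches above, the blocking form unpublishes the item and waits in synchronize_sched() before freeing. This is a sketch under the assumption of a single updater, not code from this change.

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct foo {
	int val;
};

static struct foo __rcu *foo_ptr;

/* Single updater assumed, hence the constant-true protection condition. */
static void foo_retire(void)
{
	struct foo *oldp;

	oldp = rcu_dereference_protected(foo_ptr, 1);
	RCU_INIT_POINTER(foo_ptr, NULL);
	synchronize_sched();	/* every pre-existing preempt-disabled region is done */
	kfree(oldp);
}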
|
||||
@ -3578,8 +3585,14 @@ static void rcu_barrier_func(void *type)
|
||||
struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
|
||||
|
||||
_rcu_barrier_trace(rsp, "IRQ", -1, rsp->barrier_sequence);
|
||||
atomic_inc(&rsp->barrier_cpu_count);
|
||||
rsp->call(&rdp->barrier_head, rcu_barrier_callback);
|
||||
rdp->barrier_head.func = rcu_barrier_callback;
|
||||
debug_rcu_head_queue(&rdp->barrier_head);
|
||||
if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head, 0)) {
|
||||
atomic_inc(&rsp->barrier_cpu_count);
|
||||
} else {
|
||||
debug_rcu_head_unqueue(&rdp->barrier_head);
|
||||
_rcu_barrier_trace(rsp, "IRQNQ", -1, rsp->barrier_sequence);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
@ -3698,6 +3711,7 @@ static void rcu_init_new_rnp(struct rcu_node *rnp_leaf)
|
||||
long mask;
|
||||
struct rcu_node *rnp = rnp_leaf;
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
for (;;) {
|
||||
mask = rnp->grpmask;
|
||||
rnp = rnp->parent;
|
||||
@ -3753,7 +3767,6 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
|
||||
!init_nocb_callback_list(rdp))
|
||||
rcu_segcblist_init(&rdp->cblist); /* Re-enable callbacks. */
|
||||
rdp->dynticks->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
|
||||
rcu_sysidle_init_percpu_data(rdp->dynticks);
|
||||
rcu_dynticks_eqs_online();
|
||||
raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
|
||||
|
||||
|
@ -45,14 +45,6 @@ struct rcu_dynticks {
|
||||
bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */
|
||||
unsigned long rcu_qs_ctr; /* Light universal quiescent state ctr. */
|
||||
bool rcu_urgent_qs; /* GP old need light quiescent state. */
|
||||
#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
|
||||
long long dynticks_idle_nesting;
|
||||
/* irq/process nesting level from idle. */
|
||||
atomic_t dynticks_idle; /* Even value for idle, else odd. */
|
||||
/* "Idle" excludes userspace execution. */
|
||||
unsigned long dynticks_idle_jiffies;
|
||||
/* End of last non-NMI non-idle period. */
|
||||
#endif /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
|
||||
#ifdef CONFIG_RCU_FAST_NO_HZ
|
||||
bool all_lazy; /* Are all CPU's CBs lazy? */
|
||||
unsigned long nonlazy_posted;
|
||||
@ -160,19 +152,6 @@ struct rcu_node {
|
||||
/* Number of tasks boosted for expedited GP. */
|
||||
unsigned long n_normal_boosts;
|
||||
/* Number of tasks boosted for normal GP. */
|
||||
unsigned long n_balk_blkd_tasks;
|
||||
/* Refused to boost: no blocked tasks. */
|
||||
unsigned long n_balk_exp_gp_tasks;
|
||||
/* Refused to boost: nothing blocking GP. */
|
||||
unsigned long n_balk_boost_tasks;
|
||||
/* Refused to boost: already boosting. */
|
||||
unsigned long n_balk_notblocked;
|
||||
/* Refused to boost: RCU RS CS still running. */
|
||||
unsigned long n_balk_notyet;
|
||||
/* Refused to boost: not yet time. */
|
||||
unsigned long n_balk_nos;
|
||||
/* Refused to boost: not sure why, though. */
|
||||
/* This can happen due to race conditions. */
|
||||
#ifdef CONFIG_RCU_NOCB_CPU
|
||||
struct swait_queue_head nocb_gp_wq[2];
|
||||
/* Place for rcu_nocb_kthread() to wait GP. */
|
||||
@ -312,9 +291,9 @@ struct rcu_data {
|
||||
};
|
||||
|
||||
/* Values for nocb_defer_wakeup field in struct rcu_data. */
|
||||
#define RCU_NOGP_WAKE_NOT 0
|
||||
#define RCU_NOGP_WAKE 1
|
||||
#define RCU_NOGP_WAKE_FORCE 2
|
||||
#define RCU_NOCB_WAKE_NOT 0
|
||||
#define RCU_NOCB_WAKE 1
|
||||
#define RCU_NOCB_WAKE_FORCE 2
|
||||
|
||||
#define RCU_JIFFIES_TILL_FORCE_QS (1 + (HZ > 250) + (HZ > 500))
|
||||
/* For jiffies_till_first_fqs and */
|
||||
@ -477,7 +456,7 @@ DECLARE_PER_CPU(char, rcu_cpu_has_work);
|
||||
|
||||
/* Forward declarations for rcutree_plugin.h */
|
||||
static void rcu_bootup_announce(void);
|
||||
static void rcu_preempt_note_context_switch(void);
|
||||
static void rcu_preempt_note_context_switch(bool preempt);
|
||||
static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp);
|
||||
#ifdef CONFIG_HOTPLUG_CPU
|
||||
static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
|
||||
@ -529,15 +508,7 @@ static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp);
|
||||
#endif /* #ifdef CONFIG_RCU_NOCB_CPU */
|
||||
static void __maybe_unused rcu_kick_nohz_cpu(int cpu);
|
||||
static bool init_nocb_callback_list(struct rcu_data *rdp);
|
||||
static void rcu_sysidle_enter(int irq);
|
||||
static void rcu_sysidle_exit(int irq);
|
||||
static void rcu_sysidle_check_cpu(struct rcu_data *rdp, bool *isidle,
|
||||
unsigned long *maxj);
|
||||
static bool is_sysidle_rcu_state(struct rcu_state *rsp);
|
||||
static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle,
|
||||
unsigned long maxj);
|
||||
static void rcu_bind_gp_kthread(void);
|
||||
static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp);
|
||||
static bool rcu_nohz_full_cpu(struct rcu_state *rsp);
|
||||
static void rcu_dynticks_task_enter(void);
|
||||
static void rcu_dynticks_task_exit(void);
|
||||
@ -551,75 +522,3 @@ void srcu_offline_cpu(unsigned int cpu) { }
|
||||
#endif /* #else #ifdef CONFIG_SRCU */
|
||||
|
||||
#endif /* #ifndef RCU_TREE_NONCORE */
|
||||
|
||||
#ifdef CONFIG_RCU_TRACE
|
||||
/* Read out queue lengths for tracing. */
|
||||
static inline void rcu_nocb_q_lengths(struct rcu_data *rdp, long *ql, long *qll)
|
||||
{
|
||||
#ifdef CONFIG_RCU_NOCB_CPU
|
||||
*ql = atomic_long_read(&rdp->nocb_q_count);
|
||||
*qll = atomic_long_read(&rdp->nocb_q_count_lazy);
|
||||
#else /* #ifdef CONFIG_RCU_NOCB_CPU */
|
||||
*ql = 0;
|
||||
*qll = 0;
|
||||
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
|
||||
}
|
||||
#endif /* #ifdef CONFIG_RCU_TRACE */
|
||||
|
||||
/*
|
||||
* Wrappers for the rcu_node::lock acquire and release.
|
||||
*
|
||||
* Because the rcu_nodes form a tree, the tree traversal locking will observe
|
||||
* different lock values, this in turn means that an UNLOCK of one level
|
||||
* followed by a LOCK of another level does not imply a full memory barrier;
|
||||
* and most importantly transitivity is lost.
|
||||
*
|
||||
* In order to restore full ordering between tree levels, augment the regular
|
||||
* lock acquire functions with smp_mb__after_unlock_lock().
|
||||
*
|
||||
* As ->lock of struct rcu_node is a __private field, therefore one should use
|
||||
* these wrappers rather than directly call raw_spin_{lock,unlock}* on ->lock.
|
||||
*/
|
||||
static inline void raw_spin_lock_rcu_node(struct rcu_node *rnp)
|
||||
{
|
||||
raw_spin_lock(&ACCESS_PRIVATE(rnp, lock));
|
||||
smp_mb__after_unlock_lock();
|
||||
}
|
||||
|
||||
static inline void raw_spin_unlock_rcu_node(struct rcu_node *rnp)
|
||||
{
|
||||
raw_spin_unlock(&ACCESS_PRIVATE(rnp, lock));
|
||||
}
|
||||
|
||||
static inline void raw_spin_lock_irq_rcu_node(struct rcu_node *rnp)
|
||||
{
|
||||
raw_spin_lock_irq(&ACCESS_PRIVATE(rnp, lock));
|
||||
smp_mb__after_unlock_lock();
|
||||
}
|
||||
|
||||
static inline void raw_spin_unlock_irq_rcu_node(struct rcu_node *rnp)
|
||||
{
|
||||
raw_spin_unlock_irq(&ACCESS_PRIVATE(rnp, lock));
|
||||
}
|
||||
|
||||
#define raw_spin_lock_irqsave_rcu_node(rnp, flags) \
|
||||
do { \
|
||||
typecheck(unsigned long, flags); \
|
||||
raw_spin_lock_irqsave(&ACCESS_PRIVATE(rnp, lock), flags); \
|
||||
smp_mb__after_unlock_lock(); \
|
||||
} while (0)
|
||||
|
||||
#define raw_spin_unlock_irqrestore_rcu_node(rnp, flags) \
|
||||
do { \
|
||||
typecheck(unsigned long, flags); \
|
||||
raw_spin_unlock_irqrestore(&ACCESS_PRIVATE(rnp, lock), flags); \
|
||||
} while (0)
|
||||
|
||||
static inline bool raw_spin_trylock_rcu_node(struct rcu_node *rnp)
|
||||
{
|
||||
bool locked = raw_spin_trylock(&ACCESS_PRIVATE(rnp, lock));
|
||||
|
||||
if (locked)
|
||||
smp_mb__after_unlock_lock();
|
||||
return locked;
|
||||
}
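
A rough sketch of the usage pattern the comment above mandates, paraphrasing how kernel/rcu/tree.c scans the tree; it is illustrative only and assumes the surrounding tree.h definitions such as rcu_for_each_leaf_node().

/* Illustrative only: iterate the leaf rcu_node structures via the wrappers. */
static void example_leaf_scan(struct rcu_state *rsp)
{
	unsigned long flags;
	struct rcu_node *rnp;

	rcu_for_each_leaf_node(rsp, rnp) {
		raw_spin_lock_irqsave_rcu_node(rnp, flags);
		/*
		 * Fields such as ->qsmask may be examined here; the
		 * smp_mb__after_unlock_lock() in the wrapper restores
		 * transitivity across levels of the tree.
		 */
		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
	}
}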
|
||||
|
@ -147,7 +147,7 @@ static void __maybe_unused sync_exp_reset_tree(struct rcu_state *rsp)
|
||||
*
|
||||
* Caller must hold the rcu_state's exp_mutex.
|
||||
*/
|
||||
static int sync_rcu_preempt_exp_done(struct rcu_node *rnp)
|
||||
static bool sync_rcu_preempt_exp_done(struct rcu_node *rnp)
|
||||
{
|
||||
return rnp->exp_tasks == NULL &&
|
||||
READ_ONCE(rnp->expmask) == 0;
|
||||
|
@ -70,7 +70,7 @@ static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */
|
||||
static void __init rcu_bootup_announce_oddness(void)
|
||||
{
|
||||
if (IS_ENABLED(CONFIG_RCU_TRACE))
|
||||
pr_info("\tRCU debugfs-based tracing is enabled.\n");
|
||||
pr_info("\tRCU event tracing is enabled.\n");
|
||||
if ((IS_ENABLED(CONFIG_64BIT) && RCU_FANOUT != 64) ||
|
||||
(!IS_ENABLED(CONFIG_64BIT) && RCU_FANOUT != 32))
|
||||
pr_info("\tCONFIG_RCU_FANOUT set to non-default value of %d\n",
|
||||
@ -90,8 +90,32 @@ static void __init rcu_bootup_announce_oddness(void)
|
||||
pr_info("\tBoot-time adjustment of leaf fanout to %d.\n", rcu_fanout_leaf);
|
||||
if (nr_cpu_ids != NR_CPUS)
|
||||
pr_info("\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%d.\n", NR_CPUS, nr_cpu_ids);
|
||||
if (IS_ENABLED(CONFIG_RCU_BOOST))
|
||||
pr_info("\tRCU kthread priority: %d.\n", kthread_prio);
|
||||
#ifdef CONFIG_RCU_BOOST
|
||||
pr_info("\tRCU priority boosting: priority %d delay %d ms.\n", kthread_prio, CONFIG_RCU_BOOST_DELAY);
|
||||
#endif
|
||||
if (blimit != DEFAULT_RCU_BLIMIT)
|
||||
pr_info("\tBoot-time adjustment of callback invocation limit to %ld.\n", blimit);
|
||||
if (qhimark != DEFAULT_RCU_QHIMARK)
|
||||
pr_info("\tBoot-time adjustment of callback high-water mark to %ld.\n", qhimark);
|
||||
if (qlowmark != DEFAULT_RCU_QLOMARK)
|
||||
pr_info("\tBoot-time adjustment of callback low-water mark to %ld.\n", qlowmark);
|
||||
if (jiffies_till_first_fqs != ULONG_MAX)
|
||||
pr_info("\tBoot-time adjustment of first FQS scan delay to %ld jiffies.\n", jiffies_till_first_fqs);
|
||||
if (jiffies_till_next_fqs != ULONG_MAX)
|
||||
pr_info("\tBoot-time adjustment of subsequent FQS scan delay to %ld jiffies.\n", jiffies_till_next_fqs);
|
||||
if (rcu_kick_kthreads)
|
||||
pr_info("\tKick kthreads if too-long grace period.\n");
|
||||
if (IS_ENABLED(CONFIG_DEBUG_OBJECTS_RCU_HEAD))
|
||||
pr_info("\tRCU callback double-/use-after-free debug enabled.\n");
|
||||
if (gp_preinit_delay)
|
||||
pr_info("\tRCU debug GP pre-init slowdown %d jiffies.\n", gp_preinit_delay);
|
||||
if (gp_init_delay)
|
||||
pr_info("\tRCU debug GP init slowdown %d jiffies.\n", gp_init_delay);
|
||||
if (gp_cleanup_delay)
|
||||
pr_info("\tRCU debug GP cleanup slowdown %d jiffies.\n", gp_cleanup_delay);
|
||||
if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG))
|
||||
pr_info("\tRCU debug extended QS entry/exit.\n");
|
||||
rcupdate_announce_bootup_oddness();
|
||||
}
|
||||
|
||||
#ifdef CONFIG_PREEMPT_RCU
|
||||
@ -155,6 +179,8 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
|
||||
(rnp->expmask & rdp->grpmask ? RCU_EXP_BLKD : 0);
|
||||
struct task_struct *t = current;
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
|
||||
/*
|
||||
* Decide where to queue the newly blocked task. In theory,
|
||||
* this could be an if-statement. In practice, when I tried
|
||||
@ -263,6 +289,7 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
|
||||
*/
|
||||
static void rcu_preempt_qs(void)
|
||||
{
|
||||
RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_qs() invoked with preemption enabled!!!\n");
|
||||
if (__this_cpu_read(rcu_data_p->cpu_no_qs.s)) {
|
||||
trace_rcu_grace_period(TPS("rcu_preempt"),
|
||||
__this_cpu_read(rcu_data_p->gpnum),
|
||||
@ -286,12 +313,14 @@ static void rcu_preempt_qs(void)
|
||||
*
|
||||
* Caller must disable interrupts.
|
||||
*/
|
||||
static void rcu_preempt_note_context_switch(void)
|
||||
static void rcu_preempt_note_context_switch(bool preempt)
|
||||
{
|
||||
struct task_struct *t = current;
|
||||
struct rcu_data *rdp;
|
||||
struct rcu_node *rnp;
|
||||
|
||||
RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_preempt_note_context_switch() invoked with interrupts enabled!!!\n");
|
||||
WARN_ON_ONCE(!preempt && t->rcu_read_lock_nesting > 0);
|
||||
if (t->rcu_read_lock_nesting > 0 &&
|
||||
!t->rcu_read_unlock_special.b.blocked) {
|
||||
|
||||
@ -607,6 +636,7 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
|
||||
*/
|
||||
static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
|
||||
{
|
||||
RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n");
|
||||
WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp));
|
||||
if (rcu_preempt_has_tasks(rnp))
|
||||
rnp->gp_tasks = rnp->blkd_tasks.next;
|
||||
@ -643,8 +673,37 @@ static void rcu_preempt_do_callbacks(void)
|
||||
|
||||
#endif /* #ifdef CONFIG_RCU_BOOST */
|
||||
|
||||
/*
|
||||
* Queue a preemptible-RCU callback for invocation after a grace period.
|
||||
/**
|
||||
* call_rcu() - Queue an RCU callback for invocation after a grace period.
|
||||
* @head: structure to be used for queueing the RCU updates.
|
||||
* @func: actual callback function to be invoked after the grace period
|
||||
*
|
||||
* The callback function will be invoked some time after a full grace
|
||||
* period elapses, in other words after all pre-existing RCU read-side
|
||||
* critical sections have completed. However, the callback function
|
||||
* might well execute concurrently with RCU read-side critical sections
|
||||
* that started after call_rcu() was invoked. RCU read-side critical
|
||||
* sections are delimited by rcu_read_lock() and rcu_read_unlock(),
|
||||
* and may be nested.
|
||||
*
|
||||
* Note that all CPUs must agree that the grace period extended beyond
|
||||
* all pre-existing RCU read-side critical sections. On systems with more
|
||||
* than one CPU, this means that when "func()" is invoked, each CPU is
|
||||
* guaranteed to have executed a full memory barrier since the end of its
|
||||
* last RCU read-side critical section whose beginning preceded the call
|
||||
* to call_rcu(). It also means that each CPU executing an RCU read-side
|
||||
* critical section that continues beyond the start of "func()" must have
|
||||
* executed a memory barrier after the call_rcu() but before the beginning
|
||||
* of that RCU read-side critical section. Note that these guarantees
|
||||
* include CPUs that are offline, idle, or executing in user mode, as
|
||||
* well as CPUs that are executing in the kernel.
|
||||
*
|
||||
* Furthermore, if CPU A invoked call_rcu() and CPU B invoked the
|
||||
* resulting RCU callback function "func()", then both CPU A and CPU B are
|
||||
* guaranteed to execute a full memory barrier during the time interval
|
||||
* between the call to call_rcu() and the invocation of "func()" -- even
|
||||
* if CPU A and CPU B are the same CPU (but again only if the system has
|
||||
* more than one CPU).
|
||||
*/
|
||||
void call_rcu(struct rcu_head *head, rcu_callback_t func)
|
||||
{
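
A minimal sketch of the classic pattern this kernel-doc describes, using an RCU-protected list; every name below (struct entry, entries_lock, and so on) is made up for illustration and is not part of this change.

#include <linux/rculist.h>
#include <linux/spinlock.h>
#include <linux/slab.h>

struct entry {
	int key;
	struct list_head node;
	struct rcu_head rh;
};

static LIST_HEAD(entries);
static DEFINE_SPINLOCK(entries_lock);

/* Reader: may run concurrently with the deletion below. */
static bool entry_exists(int key)
{
	struct entry *e;
	bool found = false;

	rcu_read_lock();
	list_for_each_entry_rcu(e, &entries, node) {
		if (e->key == key) {
			found = true;
			break;
		}
	}
	rcu_read_unlock();
	return found;
}

static void entry_free(struct rcu_head *rh)
{
	kfree(container_of(rh, struct entry, rh));
}

/* Updater: unlink now, free only after a grace period has elapsed. */
static void entry_del(struct entry *e)
{
	spin_lock(&entries_lock);
	list_del_rcu(&e->node);
	spin_unlock(&entries_lock);
	call_rcu(&e->rh, entry_free);
}

Readers traverse lock-free under rcu_read_lock(); the updater unlinks immediately but defers the kfree() to entry_free(), which call_rcu() invokes only after all pre-existing readers have finished.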
|
||||
@ -663,8 +722,13 @@ EXPORT_SYMBOL_GPL(call_rcu);
|
||||
* synchronize_rcu() was waiting. RCU read-side critical sections are
|
||||
* delimited by rcu_read_lock() and rcu_read_unlock(), and may be nested.
|
||||
*
|
||||
* See the description of synchronize_sched() for more detailed information
|
||||
* on memory ordering guarantees.
|
||||
* See the description of synchronize_sched() for more detailed
|
||||
* information on memory-ordering guarantees. However, please note
|
||||
* that -only- the memory-ordering guarantees apply. For example,
|
||||
* synchronize_rcu() is -not- guaranteed to wait on things like code
|
||||
* protected by preempt_disable(); instead, synchronize_rcu() is -only-
|
||||
* guaranteed to wait on RCU read-side critical sections, that is, sections
|
||||
* of code protected by rcu_read_lock().
|
||||
*/
|
||||
void synchronize_rcu(void)
|
||||
{
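
To underline the sharpened guarantee above, here is a hedged synchronous variant of the list deletion sketched earlier (same hypothetical struct entry and entries_lock): it waits only for rcu_read_lock() readers, not for regions that merely disable preemption.

/*
 * Synchronous removal: block until every pre-existing rcu_read_lock()
 * reader has finished, then free immediately.
 */
static void entry_del_sync(struct entry *e)
{
	spin_lock(&entries_lock);
	list_del_rcu(&e->node);
	spin_unlock(&entries_lock);
	synchronize_rcu();	/* waits on rcu_read_lock() sections only */
	kfree(e);
}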
|
||||
@ -738,7 +802,7 @@ static void __init rcu_bootup_announce(void)
|
||||
* Because preemptible RCU does not exist, we never have to check for
|
||||
* CPUs being in quiescent states.
|
||||
*/
|
||||
static void rcu_preempt_note_context_switch(void)
|
||||
static void rcu_preempt_note_context_switch(bool preempt)
|
||||
{
|
||||
}
|
||||
|
||||
@ -835,33 +899,6 @@ void exit_rcu(void)
|
||||
|
||||
#include "../locking/rtmutex_common.h"
|
||||
|
||||
#ifdef CONFIG_RCU_TRACE
|
||||
|
||||
static void rcu_initiate_boost_trace(struct rcu_node *rnp)
|
||||
{
|
||||
if (!rcu_preempt_has_tasks(rnp))
|
||||
rnp->n_balk_blkd_tasks++;
|
||||
else if (rnp->exp_tasks == NULL && rnp->gp_tasks == NULL)
|
||||
rnp->n_balk_exp_gp_tasks++;
|
||||
else if (rnp->gp_tasks != NULL && rnp->boost_tasks != NULL)
|
||||
rnp->n_balk_boost_tasks++;
|
||||
else if (rnp->gp_tasks != NULL && rnp->qsmask != 0)
|
||||
rnp->n_balk_notblocked++;
|
||||
else if (rnp->gp_tasks != NULL &&
|
||||
ULONG_CMP_LT(jiffies, rnp->boost_time))
|
||||
rnp->n_balk_notyet++;
|
||||
else
|
||||
rnp->n_balk_nos++;
|
||||
}
|
||||
|
||||
#else /* #ifdef CONFIG_RCU_TRACE */
|
||||
|
||||
static void rcu_initiate_boost_trace(struct rcu_node *rnp)
|
||||
{
|
||||
}
|
||||
|
||||
#endif /* #else #ifdef CONFIG_RCU_TRACE */
|
||||
|
||||
static void rcu_wake_cond(struct task_struct *t, int status)
|
||||
{
|
||||
/*
|
||||
@ -992,8 +1029,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
|
||||
{
|
||||
struct task_struct *t;
|
||||
|
||||
lockdep_assert_held(&rnp->lock);
|
||||
if (!rcu_preempt_blocked_readers_cgp(rnp) && rnp->exp_tasks == NULL) {
|
||||
rnp->n_balk_exp_gp_tasks++;
|
||||
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
|
||||
return;
|
||||
}
|
||||
@ -1009,7 +1046,6 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
|
||||
if (t)
|
||||
rcu_wake_cond(t, rnp->boost_kthread_status);
|
||||
} else {
|
||||
rcu_initiate_boost_trace(rnp);
|
||||
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
|
||||
}
|
||||
}
|
||||
@ -1260,8 +1296,7 @@ static void rcu_prepare_kthreads(int cpu)
|
||||
int rcu_needs_cpu(u64 basemono, u64 *nextevt)
|
||||
{
|
||||
*nextevt = KTIME_MAX;
|
||||
return IS_ENABLED(CONFIG_RCU_NOCB_CPU_ALL)
|
||||
? 0 : rcu_cpu_has_callbacks(NULL);
|
||||
return rcu_cpu_has_callbacks(NULL);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -1372,10 +1407,7 @@ int rcu_needs_cpu(u64 basemono, u64 *nextevt)
|
||||
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
|
||||
unsigned long dj;
|
||||
|
||||
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_ALL)) {
|
||||
*nextevt = KTIME_MAX;
|
||||
return 0;
|
||||
}
|
||||
RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_needs_cpu() invoked with irqs enabled!!!");
|
||||
|
||||
/* Snapshot to detect later posting of non-lazy callback. */
|
||||
rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted;
|
||||
@ -1424,8 +1456,8 @@ static void rcu_prepare_for_idle(void)
|
||||
struct rcu_state *rsp;
|
||||
int tne;
|
||||
|
||||
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_ALL) ||
|
||||
rcu_is_nocb_cpu(smp_processor_id()))
|
||||
RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_prepare_for_idle() invoked with irqs enabled!!!");
|
||||
if (rcu_is_nocb_cpu(smp_processor_id()))
|
||||
return;
|
||||
|
||||
/* Handle nohz enablement switches conservatively. */
|
||||
@ -1479,8 +1511,8 @@ static void rcu_prepare_for_idle(void)
|
||||
*/
|
||||
static void rcu_cleanup_after_idle(void)
|
||||
{
|
||||
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_ALL) ||
|
||||
rcu_is_nocb_cpu(smp_processor_id()))
|
||||
RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_cleanup_after_idle() invoked with irqs enabled!!!");
|
||||
if (rcu_is_nocb_cpu(smp_processor_id()))
|
||||
return;
|
||||
if (rcu_try_advance_all_cbs())
|
||||
invoke_rcu_core();
|
||||
@ -1747,7 +1779,6 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
|
||||
init_swait_queue_head(&rnp->nocb_gp_wq[1]);
|
||||
}
|
||||
|
||||
#ifndef CONFIG_RCU_NOCB_CPU_ALL
|
||||
/* Is the specified CPU a no-CBs CPU? */
|
||||
bool rcu_is_nocb_cpu(int cpu)
|
||||
{
|
||||
@ -1755,7 +1786,6 @@ bool rcu_is_nocb_cpu(int cpu)
|
||||
return cpumask_test_cpu(cpu, rcu_nocb_mask);
|
||||
return false;
|
||||
}
|
||||
#endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
|
||||
|
||||
/*
|
||||
* Kick the leader kthread for this NOCB group.
|
||||
@ -1769,6 +1799,7 @@ static void wake_nocb_leader(struct rcu_data *rdp, bool force)
|
||||
if (READ_ONCE(rdp_leader->nocb_leader_sleep) || force) {
|
||||
/* Prior smp_mb__after_atomic() orders against prior enqueue. */
|
||||
WRITE_ONCE(rdp_leader->nocb_leader_sleep, false);
|
||||
smp_mb(); /* ->nocb_leader_sleep before swake_up(). */
|
||||
swake_up(&rdp_leader->nocb_wq);
|
||||
}
|
||||
}
|
||||
@ -1860,7 +1891,7 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp,
|
||||
trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
|
||||
TPS("WakeEmpty"));
|
||||
} else {
|
||||
WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOGP_WAKE);
|
||||
WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE);
|
||||
/* Store ->nocb_defer_wakeup before ->rcu_urgent_qs. */
|
||||
smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true);
|
||||
trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
|
||||
@ -1874,7 +1905,7 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp,
|
||||
trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
|
||||
TPS("WakeOvf"));
|
||||
} else {
|
||||
WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOGP_WAKE_FORCE);
|
||||
WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_FORCE);
|
||||
/* Store ->nocb_defer_wakeup before ->rcu_urgent_qs. */
|
||||
smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true);
|
||||
trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
|
||||
@ -2023,6 +2054,7 @@ wait_again:
|
||||
* nocb_gp_head, where they await a grace period.
|
||||
*/
|
||||
gotcbs = false;
|
||||
smp_mb(); /* wakeup before ->nocb_head reads. */
|
||||
for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_follower) {
|
||||
rdp->nocb_gp_head = READ_ONCE(rdp->nocb_head);
|
||||
if (!rdp->nocb_gp_head)
|
||||
@ -2201,8 +2233,8 @@ static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
|
||||
if (!rcu_nocb_need_deferred_wakeup(rdp))
|
||||
return;
|
||||
ndw = READ_ONCE(rdp->nocb_defer_wakeup);
|
||||
WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOGP_WAKE_NOT);
|
||||
wake_nocb_leader(rdp, ndw == RCU_NOGP_WAKE_FORCE);
|
||||
WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
|
||||
wake_nocb_leader(rdp, ndw == RCU_NOCB_WAKE_FORCE);
|
||||
trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("DeferredWake"));
|
||||
}
|
||||
|
||||
@ -2212,10 +2244,6 @@ void __init rcu_init_nohz(void)
|
||||
bool need_rcu_nocb_mask = true;
|
||||
struct rcu_state *rsp;
|
||||
|
||||
#ifdef CONFIG_RCU_NOCB_CPU_NONE
|
||||
need_rcu_nocb_mask = false;
|
||||
#endif /* #ifndef CONFIG_RCU_NOCB_CPU_NONE */
|
||||
|
||||
#if defined(CONFIG_NO_HZ_FULL)
|
||||
if (tick_nohz_full_running && cpumask_weight(tick_nohz_full_mask))
|
||||
need_rcu_nocb_mask = true;
|
||||
@ -2231,14 +2259,6 @@ void __init rcu_init_nohz(void)
|
||||
if (!have_rcu_nocb_mask)
|
||||
return;
|
||||
|
||||
#ifdef CONFIG_RCU_NOCB_CPU_ZERO
|
||||
pr_info("\tOffload RCU callbacks from CPU 0\n");
|
||||
cpumask_set_cpu(0, rcu_nocb_mask);
|
||||
#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ZERO */
|
||||
#ifdef CONFIG_RCU_NOCB_CPU_ALL
|
||||
pr_info("\tOffload RCU callbacks from all CPUs\n");
|
||||
cpumask_copy(rcu_nocb_mask, cpu_possible_mask);
|
||||
#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
|
||||
#if defined(CONFIG_NO_HZ_FULL)
|
||||
if (tick_nohz_full_running)
|
||||
cpumask_or(rcu_nocb_mask, rcu_nocb_mask, tick_nohz_full_mask);
|
||||
@ -2491,421 +2511,6 @@ static void __maybe_unused rcu_kick_nohz_cpu(int cpu)
|
||||
#endif /* #ifdef CONFIG_NO_HZ_FULL */
|
||||
}
|
||||
|
||||
|
||||
#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
|
||||
|
||||
static int full_sysidle_state; /* Current system-idle state. */
|
||||
#define RCU_SYSIDLE_NOT 0 /* Some CPU is not idle. */
|
||||
#define RCU_SYSIDLE_SHORT 1 /* All CPUs idle for brief period. */
|
||||
#define RCU_SYSIDLE_LONG 2 /* All CPUs idle for long enough. */
|
||||
#define RCU_SYSIDLE_FULL 3 /* All CPUs idle, ready for sysidle. */
|
||||
#define RCU_SYSIDLE_FULL_NOTED 4 /* Actually entered sysidle state. */
|
||||
|
||||
/*
|
||||
* Invoked to note exit from irq or task transition to idle. Note that
|
||||
* usermode execution does -not- count as idle here! After all, we want
|
||||
* to detect full-system idle states, not RCU quiescent states and grace
|
||||
* periods. The caller must have disabled interrupts.
|
||||
*/
|
||||
static void rcu_sysidle_enter(int irq)
|
||||
{
|
||||
unsigned long j;
|
||||
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
|
||||
|
||||
/* If there are no nohz_full= CPUs, no need to track this. */
|
||||
if (!tick_nohz_full_enabled())
|
||||
return;
|
||||
|
||||
/* Adjust nesting, check for fully idle. */
|
||||
if (irq) {
|
||||
rdtp->dynticks_idle_nesting--;
|
||||
WARN_ON_ONCE(rdtp->dynticks_idle_nesting < 0);
|
||||
if (rdtp->dynticks_idle_nesting != 0)
|
||||
return; /* Still not fully idle. */
|
||||
} else {
|
||||
if ((rdtp->dynticks_idle_nesting & DYNTICK_TASK_NEST_MASK) ==
|
||||
DYNTICK_TASK_NEST_VALUE) {
|
||||
rdtp->dynticks_idle_nesting = 0;
|
||||
} else {
|
||||
rdtp->dynticks_idle_nesting -= DYNTICK_TASK_NEST_VALUE;
|
||||
WARN_ON_ONCE(rdtp->dynticks_idle_nesting < 0);
|
||||
return; /* Still not fully idle. */
|
||||
}
|
||||
}
|
||||
|
||||
/* Record start of fully idle period. */
|
||||
j = jiffies;
|
||||
WRITE_ONCE(rdtp->dynticks_idle_jiffies, j);
|
||||
smp_mb__before_atomic();
|
||||
atomic_inc(&rdtp->dynticks_idle);
|
||||
smp_mb__after_atomic();
|
||||
WARN_ON_ONCE(atomic_read(&rdtp->dynticks_idle) & 0x1);
|
||||
}
|
||||
|
||||
/*
|
||||
* Unconditionally force exit from full system-idle state. This is
|
||||
* invoked when a normal CPU exits idle, but must be called separately
|
||||
* for the timekeeping CPU (tick_do_timer_cpu). The reason for this
|
||||
* is that the timekeeping CPU is permitted to take scheduling-clock
|
||||
* interrupts while the system is in system-idle state, and of course
|
||||
* rcu_sysidle_exit() has no way of distinguishing a scheduling-clock
|
||||
* interrupt from any other type of interrupt.
|
||||
*/
|
||||
void rcu_sysidle_force_exit(void)
|
||||
{
|
||||
int oldstate = READ_ONCE(full_sysidle_state);
|
||||
int newoldstate;
|
||||
|
||||
/*
|
||||
* Each pass through the following loop attempts to exit full
|
||||
* system-idle state. If contention proves to be a problem,
|
||||
* a trylock-based contention tree could be used here.
|
||||
*/
|
||||
while (oldstate > RCU_SYSIDLE_SHORT) {
|
||||
newoldstate = cmpxchg(&full_sysidle_state,
|
||||
oldstate, RCU_SYSIDLE_NOT);
|
||||
if (oldstate == newoldstate &&
|
||||
oldstate == RCU_SYSIDLE_FULL_NOTED) {
|
||||
rcu_kick_nohz_cpu(tick_do_timer_cpu);
|
||||
return; /* We cleared it, done! */
|
||||
}
|
||||
oldstate = newoldstate;
|
||||
}
|
||||
smp_mb(); /* Order initial oldstate fetch vs. later non-idle work. */
|
||||
}
|
||||
|
||||
/*
|
||||
* Invoked to note entry to irq or task transition from idle. Note that
|
||||
* usermode execution does -not- count as idle here! The caller must
|
||||
* have disabled interrupts.
|
||||
*/
|
||||
static void rcu_sysidle_exit(int irq)
|
||||
{
|
||||
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
|
||||
|
||||
/* If there are no nohz_full= CPUs, no need to track this. */
|
||||
if (!tick_nohz_full_enabled())
|
||||
return;
|
||||
|
||||
/* Adjust nesting, check for already non-idle. */
|
||||
if (irq) {
|
||||
rdtp->dynticks_idle_nesting++;
|
||||
WARN_ON_ONCE(rdtp->dynticks_idle_nesting <= 0);
|
||||
if (rdtp->dynticks_idle_nesting != 1)
|
||||
return; /* Already non-idle. */
|
||||
} else {
|
||||
/*
|
||||
* Allow for irq misnesting. Yes, it really is possible
|
||||
* to enter an irq handler then never leave it, and maybe
|
||||
* also vice versa. Handle both possibilities.
|
||||
*/
|
||||
if (rdtp->dynticks_idle_nesting & DYNTICK_TASK_NEST_MASK) {
|
||||
rdtp->dynticks_idle_nesting += DYNTICK_TASK_NEST_VALUE;
|
||||
WARN_ON_ONCE(rdtp->dynticks_idle_nesting <= 0);
|
||||
return; /* Already non-idle. */
|
||||
} else {
|
||||
rdtp->dynticks_idle_nesting = DYNTICK_TASK_EXIT_IDLE;
|
||||
}
|
||||
}
|
||||
|
||||
/* Record end of idle period. */
|
||||
smp_mb__before_atomic();
|
||||
atomic_inc(&rdtp->dynticks_idle);
|
||||
smp_mb__after_atomic();
|
||||
WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks_idle) & 0x1));
|
||||
|
||||
/*
|
||||
* If we are the timekeeping CPU, we are permitted to be non-idle
|
||||
* during a system-idle state. This must be the case, because
|
||||
* the timekeeping CPU has to take scheduling-clock interrupts
|
||||
* during the time that the system is transitioning to full
|
||||
* system-idle state. This means that the timekeeping CPU must
|
||||
* invoke rcu_sysidle_force_exit() directly if it does anything
|
||||
* more than take a scheduling-clock interrupt.
|
||||
*/
|
||||
if (smp_processor_id() == tick_do_timer_cpu)
|
||||
return;
|
||||
|
||||
/* Update system-idle state: We are clearly no longer fully idle! */
|
||||
rcu_sysidle_force_exit();
|
||||
}
|
||||
|
||||
/*
|
||||
* Check to see if the current CPU is idle. Note that usermode execution
|
||||
* does not count as idle. The caller must have disabled interrupts,
|
||||
* and must be running on tick_do_timer_cpu.
|
||||
*/
|
||||
static void rcu_sysidle_check_cpu(struct rcu_data *rdp, bool *isidle,
|
||||
unsigned long *maxj)
|
||||
{
|
||||
int cur;
|
||||
unsigned long j;
|
||||
struct rcu_dynticks *rdtp = rdp->dynticks;
|
||||
|
||||
/* If there are no nohz_full= CPUs, don't check system-wide idleness. */
|
||||
if (!tick_nohz_full_enabled())
|
||||
return;
|
||||
|
||||
/*
|
||||
* If some other CPU has already reported non-idle, if this is
|
||||
* not the flavor of RCU that tracks sysidle state, or if this
|
||||
* is an offline or the timekeeping CPU, nothing to do.
|
||||
*/
|
||||
if (!*isidle || rdp->rsp != rcu_state_p ||
|
||||
cpu_is_offline(rdp->cpu) || rdp->cpu == tick_do_timer_cpu)
|
||||
return;
|
||||
/* Verify affinity of current kthread. */
|
||||
WARN_ON_ONCE(smp_processor_id() != tick_do_timer_cpu);
|
||||
|
||||
/* Pick up current idle and NMI-nesting counter and check. */
|
||||
cur = atomic_read(&rdtp->dynticks_idle);
|
||||
if (cur & 0x1) {
|
||||
*isidle = false; /* We are not idle! */
|
||||
return;
|
||||
}
|
||||
smp_mb(); /* Read counters before timestamps. */
|
||||
|
||||
/* Pick up timestamps. */
|
||||
j = READ_ONCE(rdtp->dynticks_idle_jiffies);
|
||||
/* If this CPU entered idle more recently, update maxj timestamp. */
|
||||
if (ULONG_CMP_LT(*maxj, j))
|
||||
*maxj = j;
|
||||
}
|
||||
|
||||
/*
|
||||
* Is this the flavor of RCU that is handling full-system idle?
|
||||
*/
|
||||
static bool is_sysidle_rcu_state(struct rcu_state *rsp)
|
||||
{
|
||||
return rsp == rcu_state_p;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return a delay in jiffies based on the number of CPUs, rcu_node
|
||||
* leaf fanout, and jiffies tick rate. The idea is to allow larger
|
||||
* systems more time to transition to full-idle state in order to
|
||||
* avoid the cache thrashing that otherwise occur on the state variable.
|
||||
* Really small systems (less than a couple of tens of CPUs) should
|
||||
* instead use a single global atomically incremented counter, and later
|
||||
* versions of this will automatically reconfigure themselves accordingly.
|
||||
*/
|
||||
static unsigned long rcu_sysidle_delay(void)
|
||||
{
|
||||
if (nr_cpu_ids <= CONFIG_NO_HZ_FULL_SYSIDLE_SMALL)
|
||||
return 0;
|
||||
return DIV_ROUND_UP(nr_cpu_ids * HZ, rcu_fanout_leaf * 1000);
|
||||
}
|
||||
|
||||
/*
|
||||
* Advance the full-system-idle state. This is invoked when all of
|
||||
* the non-timekeeping CPUs are idle.
|
||||
*/
|
||||
static void rcu_sysidle(unsigned long j)
|
||||
{
|
||||
/* Check the current state. */
|
||||
switch (READ_ONCE(full_sysidle_state)) {
|
||||
case RCU_SYSIDLE_NOT:
|
||||
|
||||
/* First time all are idle, so note a short idle period. */
|
||||
WRITE_ONCE(full_sysidle_state, RCU_SYSIDLE_SHORT);
|
||||
break;
|
||||
|
||||
case RCU_SYSIDLE_SHORT:
|
||||
|
||||
/*
|
||||
* Idle for a bit, time to advance to next state?
|
||||
* cmpxchg failure means race with non-idle, let them win.
|
||||
*/
|
||||
if (ULONG_CMP_GE(jiffies, j + rcu_sysidle_delay()))
|
||||
(void)cmpxchg(&full_sysidle_state,
|
||||
RCU_SYSIDLE_SHORT, RCU_SYSIDLE_LONG);
|
||||
break;
|
||||
|
||||
case RCU_SYSIDLE_LONG:
|
||||
|
||||
/*
|
||||
* Do an additional check pass before advancing to full.
|
||||
* cmpxchg failure means race with non-idle, let them win.
|
||||
*/
|
||||
if (ULONG_CMP_GE(jiffies, j + rcu_sysidle_delay()))
|
||||
(void)cmpxchg(&full_sysidle_state,
|
||||
RCU_SYSIDLE_LONG, RCU_SYSIDLE_FULL);
|
||||
break;
|
||||
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Found a non-idle non-timekeeping CPU, so kick the system-idle state
|
||||
* back to the beginning.
|
||||
*/
|
||||
static void rcu_sysidle_cancel(void)
|
||||
{
|
||||
smp_mb();
|
||||
if (full_sysidle_state > RCU_SYSIDLE_SHORT)
|
||||
WRITE_ONCE(full_sysidle_state, RCU_SYSIDLE_NOT);
|
||||
}
|
||||
|
||||
/*
|
||||
* Update the sysidle state based on the results of a force-quiescent-state
|
||||
* scan of the CPUs' dyntick-idle state.
|
||||
*/
|
||||
static void rcu_sysidle_report(struct rcu_state *rsp, int isidle,
|
||||
unsigned long maxj, bool gpkt)
|
||||
{
|
||||
if (rsp != rcu_state_p)
|
||||
return; /* Wrong flavor, ignore. */
|
||||
if (gpkt && nr_cpu_ids <= CONFIG_NO_HZ_FULL_SYSIDLE_SMALL)
|
||||
return; /* Running state machine from timekeeping CPU. */
|
||||
if (isidle)
|
||||
rcu_sysidle(maxj); /* More idle! */
|
||||
else
|
||||
rcu_sysidle_cancel(); /* Idle is over. */
|
||||
}
|
||||
|
||||
/*
|
||||
* Wrapper for rcu_sysidle_report() when called from the grace-period
|
||||
* kthread's context.
|
||||
*/
|
||||
static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle,
|
||||
unsigned long maxj)
|
||||
{
|
||||
/* If there are no nohz_full= CPUs, no need to track this. */
|
||||
if (!tick_nohz_full_enabled())
|
||||
return;
|
||||
|
||||
rcu_sysidle_report(rsp, isidle, maxj, true);
|
||||
}
|
||||
|
||||
/* Callback and function for forcing an RCU grace period. */
|
||||
struct rcu_sysidle_head {
|
||||
struct rcu_head rh;
|
||||
int inuse;
|
||||
};
|
||||
|
||||
static void rcu_sysidle_cb(struct rcu_head *rhp)
|
||||
{
|
||||
struct rcu_sysidle_head *rshp;
|
||||
|
||||
/*
|
||||
* The following memory barrier is needed to replace the
|
||||
* memory barriers that would normally be in the memory
|
||||
* allocator.
|
||||
*/
|
||||
smp_mb(); /* grace period precedes setting inuse. */
|
||||
|
||||
rshp = container_of(rhp, struct rcu_sysidle_head, rh);
|
||||
WRITE_ONCE(rshp->inuse, 0);
|
||||
}
|
||||
|
||||
/*
|
||||
* Check to see if the system is fully idle, other than the timekeeping CPU.
|
||||
* The caller must have disabled interrupts. This is not intended to be
|
||||
* called unless tick_nohz_full_enabled().
|
||||
*/
|
||||
bool rcu_sys_is_idle(void)
|
||||
{
|
||||
static struct rcu_sysidle_head rsh;
|
||||
int rss = READ_ONCE(full_sysidle_state);
|
||||
|
||||
if (WARN_ON_ONCE(smp_processor_id() != tick_do_timer_cpu))
|
||||
return false;
|
||||
|
||||
/* Handle small-system case by doing a full scan of CPUs. */
|
||||
if (nr_cpu_ids <= CONFIG_NO_HZ_FULL_SYSIDLE_SMALL) {
|
||||
int oldrss = rss - 1;
|
||||
|
||||
/*
|
||||
* One pass to advance to each state up to _FULL.
|
||||
* Give up if any pass fails to advance the state.
|
||||
*/
|
||||
while (rss < RCU_SYSIDLE_FULL && oldrss < rss) {
|
||||
int cpu;
|
||||
bool isidle = true;
|
||||
unsigned long maxj = jiffies - ULONG_MAX / 4;
|
||||
struct rcu_data *rdp;
|
||||
|
||||
/* Scan all the CPUs looking for nonidle CPUs. */
|
||||
for_each_possible_cpu(cpu) {
|
||||
rdp = per_cpu_ptr(rcu_state_p->rda, cpu);
|
||||
rcu_sysidle_check_cpu(rdp, &isidle, &maxj);
|
||||
if (!isidle)
|
||||
break;
|
||||
}
|
||||
rcu_sysidle_report(rcu_state_p, isidle, maxj, false);
|
||||
oldrss = rss;
|
||||
rss = READ_ONCE(full_sysidle_state);
|
||||
}
|
||||
}
|
||||
|
||||
/* If this is the first observation of an idle period, record it. */
|
||||
if (rss == RCU_SYSIDLE_FULL) {
|
||||
rss = cmpxchg(&full_sysidle_state,
|
||||
RCU_SYSIDLE_FULL, RCU_SYSIDLE_FULL_NOTED);
|
||||
return rss == RCU_SYSIDLE_FULL;
|
||||
}
|
||||
|
||||
smp_mb(); /* ensure rss load happens before later caller actions. */
|
||||
|
||||
/* If already fully idle, tell the caller (in case of races). */
|
||||
if (rss == RCU_SYSIDLE_FULL_NOTED)
|
||||
return true;
|
||||
|
||||
/*
|
||||
* If we aren't there yet, and a grace period is not in flight,
|
||||
* initiate a grace period. Either way, tell the caller that
|
||||
* we are not there yet. We use an xchg() rather than an assignment
|
||||
* to make up for the memory barriers that would otherwise be
|
||||
* provided by the memory allocator.
|
||||
*/
|
||||
if (nr_cpu_ids > CONFIG_NO_HZ_FULL_SYSIDLE_SMALL &&
|
||||
!rcu_gp_in_progress(rcu_state_p) &&
|
||||
!rsh.inuse && xchg(&rsh.inuse, 1) == 0)
|
||||
call_rcu(&rsh.rh, rcu_sysidle_cb);
|
||||
return false;
|
||||
}
|
||||
|
||||
/*
|
||||
* Initialize dynticks sysidle state for CPUs coming online.
|
||||
*/
|
||||
static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
|
||||
{
|
||||
rdtp->dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE;
|
||||
}
|
||||
|
||||
#else /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
|
||||
|
||||
static void rcu_sysidle_enter(int irq)
|
||||
{
|
||||
}
|
||||
|
||||
static void rcu_sysidle_exit(int irq)
|
||||
{
|
||||
}
|
||||
|
||||
static void rcu_sysidle_check_cpu(struct rcu_data *rdp, bool *isidle,
|
||||
unsigned long *maxj)
|
||||
{
|
||||
}
|
||||
|
||||
static bool is_sysidle_rcu_state(struct rcu_state *rsp)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle,
|
||||
unsigned long maxj)
|
||||
{
|
||||
}
|
||||
|
||||
static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
|
||||
{
|
||||
}
|
||||
|
||||
#endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
|
||||
|
||||
/*
|
||||
* Is this CPU a NO_HZ_FULL CPU that should ignore RCU so that the
|
||||
* grace-period kthread will do force_quiescent_state() processing?
|
||||
@ -2936,13 +2541,7 @@ static void rcu_bind_gp_kthread(void)
|
||||
|
||||
if (!tick_nohz_full_enabled())
|
||||
return;
|
||||
#ifdef CONFIG_NO_HZ_FULL_SYSIDLE
|
||||
cpu = tick_do_timer_cpu;
|
||||
if (cpu >= 0 && cpu < nr_cpu_ids)
|
||||
set_cpus_allowed_ptr(current, cpumask_of(cpu));
|
||||
#else /* #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
|
||||
housekeeping_affine(current);
|
||||
#endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
|
||||
}
|
||||
|
||||
/* Record the current task on dyntick-idle entry. */
|
||||
|
@ -1,494 +0,0 @@
|
||||
/*
|
||||
* Read-Copy Update tracing for hierarchical implementation.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
* (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright IBM Corporation, 2008
|
||||
* Author: Paul E. McKenney
|
||||
*
|
||||
* Papers: http://www.rdrop.com/users/paulmck/RCU
|
||||
*
|
||||
* For detailed explanation of Read-Copy Update mechanism see -
|
||||
* Documentation/RCU
|
||||
*
|
||||
*/
|
||||
#include <linux/types.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/init.h>
|
||||
#include <linux/spinlock.h>
|
||||
#include <linux/smp.h>
|
||||
#include <linux/rcupdate.h>
|
||||
#include <linux/interrupt.h>
|
||||
#include <linux/sched.h>
|
||||
#include <linux/atomic.h>
|
||||
#include <linux/bitops.h>
|
||||
#include <linux/completion.h>
|
||||
#include <linux/percpu.h>
|
||||
#include <linux/notifier.h>
|
||||
#include <linux/cpu.h>
|
||||
#include <linux/mutex.h>
|
||||
#include <linux/debugfs.h>
|
||||
#include <linux/seq_file.h>
|
||||
#include <linux/prefetch.h>
|
||||
|
||||
#define RCU_TREE_NONCORE
|
||||
#include "tree.h"
|
||||
#include "rcu.h"
|
||||
|
||||
static int r_open(struct inode *inode, struct file *file,
|
||||
const struct seq_operations *op)
|
||||
{
|
||||
int ret = seq_open(file, op);
|
||||
if (!ret) {
|
||||
struct seq_file *m = (struct seq_file *)file->private_data;
|
||||
m->private = inode->i_private;
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void *r_start(struct seq_file *m, loff_t *pos)
|
||||
{
|
||||
struct rcu_state *rsp = (struct rcu_state *)m->private;
|
||||
*pos = cpumask_next(*pos - 1, cpu_possible_mask);
|
||||
if ((*pos) < nr_cpu_ids)
|
||||
return per_cpu_ptr(rsp->rda, *pos);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static void *r_next(struct seq_file *m, void *v, loff_t *pos)
|
||||
{
|
||||
(*pos)++;
|
||||
return r_start(m, pos);
|
||||
}
|
||||
|
||||
static void r_stop(struct seq_file *m, void *v)
|
||||
{
|
||||
}
|
||||
|
||||
static int show_rcubarrier(struct seq_file *m, void *v)
|
||||
{
|
||||
struct rcu_state *rsp = (struct rcu_state *)m->private;
|
||||
seq_printf(m, "bcc: %d bseq: %lu\n",
|
||||
atomic_read(&rsp->barrier_cpu_count),
|
||||
rsp->barrier_sequence);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int rcubarrier_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return single_open(file, show_rcubarrier, inode->i_private);
|
||||
}
|
||||
|
||||
static const struct file_operations rcubarrier_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = rcubarrier_open,
|
||||
.read = seq_read,
|
||||
.llseek = no_llseek,
|
||||
.release = single_release,
|
||||
};
|
||||
|
||||
#ifdef CONFIG_RCU_BOOST
|
||||
|
||||
static char convert_kthread_status(unsigned int kthread_status)
|
||||
{
|
||||
if (kthread_status > RCU_KTHREAD_MAX)
|
||||
return '?';
|
||||
return "SRWOY"[kthread_status];
|
||||
}
|
||||
|
||||
#endif /* #ifdef CONFIG_RCU_BOOST */
|
||||
|
||||
static void print_one_rcu_data(struct seq_file *m, struct rcu_data *rdp)
|
||||
{
|
||||
long ql, qll;
|
||||
|
||||
if (!rdp->beenonline)
|
||||
return;
|
||||
seq_printf(m, "%3d%cc=%ld g=%ld cnq=%d/%d:%d",
|
||||
rdp->cpu,
|
||||
cpu_is_offline(rdp->cpu) ? '!' : ' ',
|
||||
ulong2long(rdp->completed), ulong2long(rdp->gpnum),
|
||||
rdp->cpu_no_qs.b.norm,
|
||||
rdp->rcu_qs_ctr_snap == per_cpu(rdp->dynticks->rcu_qs_ctr, rdp->cpu),
|
||||
rdp->core_needs_qs);
|
||||
seq_printf(m, " dt=%d/%llx/%d df=%lu",
|
||||
rcu_dynticks_snap(rdp->dynticks),
|
||||
rdp->dynticks->dynticks_nesting,
|
||||
rdp->dynticks->dynticks_nmi_nesting,
|
||||
rdp->dynticks_fqs);
|
||||
seq_printf(m, " of=%lu", rdp->offline_fqs);
|
||||
rcu_nocb_q_lengths(rdp, &ql, &qll);
|
||||
qll += rcu_segcblist_n_lazy_cbs(&rdp->cblist);
|
||||
ql += rcu_segcblist_n_cbs(&rdp->cblist);
|
||||
seq_printf(m, " ql=%ld/%ld qs=%c%c%c%c",
|
||||
qll, ql,
|
||||
".N"[!rcu_segcblist_segempty(&rdp->cblist, RCU_NEXT_TAIL)],
|
||||
".R"[!rcu_segcblist_segempty(&rdp->cblist,
|
||||
RCU_NEXT_READY_TAIL)],
|
||||
".W"[!rcu_segcblist_segempty(&rdp->cblist, RCU_WAIT_TAIL)],
|
||||
".D"[!rcu_segcblist_segempty(&rdp->cblist, RCU_DONE_TAIL)]);
|
||||
#ifdef CONFIG_RCU_BOOST
|
||||
seq_printf(m, " kt=%d/%c ktl=%x",
|
||||
per_cpu(rcu_cpu_has_work, rdp->cpu),
|
||||
convert_kthread_status(per_cpu(rcu_cpu_kthread_status,
|
||||
rdp->cpu)),
|
||||
per_cpu(rcu_cpu_kthread_loops, rdp->cpu) & 0xffff);
|
||||
#endif /* #ifdef CONFIG_RCU_BOOST */
|
||||
seq_printf(m, " b=%ld", rdp->blimit);
|
||||
seq_printf(m, " ci=%lu nci=%lu co=%lu ca=%lu\n",
|
||||
rdp->n_cbs_invoked, rdp->n_nocbs_invoked,
|
||||
rdp->n_cbs_orphaned, rdp->n_cbs_adopted);
|
||||
}
|
||||
|
||||
static int show_rcudata(struct seq_file *m, void *v)
|
||||
{
|
||||
print_one_rcu_data(m, (struct rcu_data *)v);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static const struct seq_operations rcudate_op = {
|
||||
.start = r_start,
|
||||
.next = r_next,
|
||||
.stop = r_stop,
|
||||
.show = show_rcudata,
|
||||
};
|
||||
|
||||
static int rcudata_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return r_open(inode, file, &rcudate_op);
|
||||
}
|
||||
|
||||
static const struct file_operations rcudata_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = rcudata_open,
|
||||
.read = seq_read,
|
||||
.llseek = no_llseek,
|
||||
.release = seq_release,
|
||||
};
|
||||
|
||||
static int show_rcuexp(struct seq_file *m, void *v)
|
||||
{
|
||||
int cpu;
|
||||
struct rcu_state *rsp = (struct rcu_state *)m->private;
|
||||
struct rcu_data *rdp;
|
||||
unsigned long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
|
||||
|
||||
for_each_possible_cpu(cpu) {
|
||||
rdp = per_cpu_ptr(rsp->rda, cpu);
|
||||
s0 += atomic_long_read(&rdp->exp_workdone0);
|
||||
s1 += atomic_long_read(&rdp->exp_workdone1);
|
||||
s2 += atomic_long_read(&rdp->exp_workdone2);
|
||||
s3 += atomic_long_read(&rdp->exp_workdone3);
|
||||
}
|
||||
seq_printf(m, "s=%lu wd0=%lu wd1=%lu wd2=%lu wd3=%lu enq=%d sc=%lu\n",
|
||||
rsp->expedited_sequence, s0, s1, s2, s3,
|
||||
atomic_read(&rsp->expedited_need_qs),
|
||||
rsp->expedited_sequence / 2);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int rcuexp_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return single_open(file, show_rcuexp, inode->i_private);
|
||||
}
|
||||
|
||||
static const struct file_operations rcuexp_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = rcuexp_open,
|
||||
.read = seq_read,
|
||||
.llseek = no_llseek,
|
||||
.release = single_release,
|
||||
};
|
||||
|
||||
#ifdef CONFIG_RCU_BOOST
|
||||
|
||||
static void print_one_rcu_node_boost(struct seq_file *m, struct rcu_node *rnp)
|
||||
{
|
||||
seq_printf(m, "%d:%d tasks=%c%c%c%c kt=%c ntb=%lu neb=%lu nnb=%lu ",
|
||||
rnp->grplo, rnp->grphi,
|
||||
"T."[list_empty(&rnp->blkd_tasks)],
|
||||
"N."[!rnp->gp_tasks],
|
||||
"E."[!rnp->exp_tasks],
|
||||
"B."[!rnp->boost_tasks],
|
||||
convert_kthread_status(rnp->boost_kthread_status),
|
||||
rnp->n_tasks_boosted, rnp->n_exp_boosts,
|
||||
rnp->n_normal_boosts);
|
||||
seq_printf(m, "j=%04x bt=%04x\n",
|
||||
(int)(jiffies & 0xffff),
|
||||
(int)(rnp->boost_time & 0xffff));
|
||||
seq_printf(m, " balk: nt=%lu egt=%lu bt=%lu nb=%lu ny=%lu nos=%lu\n",
|
||||
rnp->n_balk_blkd_tasks,
|
||||
rnp->n_balk_exp_gp_tasks,
|
||||
rnp->n_balk_boost_tasks,
|
||||
rnp->n_balk_notblocked,
|
||||
rnp->n_balk_notyet,
|
||||
rnp->n_balk_nos);
|
||||
}
|
||||
|
||||
static int show_rcu_node_boost(struct seq_file *m, void *unused)
|
||||
{
|
||||
struct rcu_node *rnp;
|
||||
|
||||
rcu_for_each_leaf_node(&rcu_preempt_state, rnp)
|
||||
print_one_rcu_node_boost(m, rnp);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int rcu_node_boost_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return single_open(file, show_rcu_node_boost, NULL);
|
||||
}
|
||||
|
||||
static const struct file_operations rcu_node_boost_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = rcu_node_boost_open,
|
||||
.read = seq_read,
|
||||
.llseek = no_llseek,
|
||||
.release = single_release,
|
||||
};
|
||||
|
||||
#endif /* #ifdef CONFIG_RCU_BOOST */
|
||||
|
||||
static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
|
||||
{
|
||||
unsigned long gpnum;
|
||||
int level = 0;
|
||||
struct rcu_node *rnp;
|
||||
|
||||
gpnum = rsp->gpnum;
|
||||
seq_printf(m, "c=%ld g=%ld s=%d jfq=%ld j=%x ",
|
||||
ulong2long(rsp->completed), ulong2long(gpnum),
|
||||
rsp->gp_state,
|
||||
(long)(rsp->jiffies_force_qs - jiffies),
|
||||
(int)(jiffies & 0xffff));
|
||||
seq_printf(m, "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
|
||||
rsp->n_force_qs, rsp->n_force_qs_ngp,
|
||||
rsp->n_force_qs - rsp->n_force_qs_ngp,
|
||||
READ_ONCE(rsp->n_force_qs_lh),
|
||||
rsp->orphan_done.len_lazy,
|
||||
rsp->orphan_done.len);
|
||||
for (rnp = &rsp->node[0]; rnp - &rsp->node[0] < rcu_num_nodes; rnp++) {
|
||||
if (rnp->level != level) {
|
||||
seq_puts(m, "\n");
|
||||
level = rnp->level;
|
||||
}
|
||||
seq_printf(m, "%lx/%lx->%lx %c%c>%c %d:%d ^%d ",
|
||||
rnp->qsmask, rnp->qsmaskinit, rnp->qsmaskinitnext,
|
||||
".G"[rnp->gp_tasks != NULL],
|
||||
".E"[rnp->exp_tasks != NULL],
|
||||
".T"[!list_empty(&rnp->blkd_tasks)],
|
||||
rnp->grplo, rnp->grphi, rnp->grpnum);
|
||||
}
|
||||
seq_puts(m, "\n");
|
||||
}
|
||||
|
||||
static int show_rcuhier(struct seq_file *m, void *v)
|
||||
{
|
||||
struct rcu_state *rsp = (struct rcu_state *)m->private;
|
||||
print_one_rcu_state(m, rsp);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int rcuhier_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return single_open(file, show_rcuhier, inode->i_private);
|
||||
}
|
||||
|
||||
static const struct file_operations rcuhier_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = rcuhier_open,
|
||||
.read = seq_read,
|
||||
.llseek = no_llseek,
|
||||
.release = single_release,
|
||||
};
|
||||
|
||||
static void show_one_rcugp(struct seq_file *m, struct rcu_state *rsp)
|
||||
{
|
||||
unsigned long flags;
|
||||
unsigned long completed;
|
||||
unsigned long gpnum;
|
||||
unsigned long gpage;
|
||||
unsigned long gpmax;
|
||||
struct rcu_node *rnp = &rsp->node[0];
|
||||
|
||||
raw_spin_lock_irqsave_rcu_node(rnp, flags);
|
||||
completed = READ_ONCE(rsp->completed);
|
||||
gpnum = READ_ONCE(rsp->gpnum);
|
||||
if (completed == gpnum)
|
||||
gpage = 0;
|
||||
else
|
||||
gpage = jiffies - rsp->gp_start;
|
||||
gpmax = rsp->gp_max;
|
||||
raw_spin_unlock_irqrestore(&rnp->lock, flags);
|
||||
seq_printf(m, "completed=%ld gpnum=%ld age=%ld max=%ld\n",
|
||||
ulong2long(completed), ulong2long(gpnum), gpage, gpmax);
|
||||
}
|
||||
|
||||
static int show_rcugp(struct seq_file *m, void *v)
|
||||
{
|
||||
struct rcu_state *rsp = (struct rcu_state *)m->private;
|
||||
show_one_rcugp(m, rsp);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int rcugp_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return single_open(file, show_rcugp, inode->i_private);
|
||||
}
|
||||
|
||||
static const struct file_operations rcugp_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = rcugp_open,
|
||||
.read = seq_read,
|
||||
.llseek = no_llseek,
|
||||
.release = single_release,
|
||||
};
|
||||
|
||||
static void print_one_rcu_pending(struct seq_file *m, struct rcu_data *rdp)
|
||||
{
|
||||
if (!rdp->beenonline)
|
||||
return;
|
||||
seq_printf(m, "%3d%cnp=%ld ",
|
||||
rdp->cpu,
|
||||
cpu_is_offline(rdp->cpu) ? '!' : ' ',
|
||||
rdp->n_rcu_pending);
|
||||
seq_printf(m, "qsp=%ld rpq=%ld cbr=%ld cng=%ld ",
|
||||
rdp->n_rp_core_needs_qs,
|
||||
rdp->n_rp_report_qs,
|
||||
rdp->n_rp_cb_ready,
|
||||
rdp->n_rp_cpu_needs_gp);
|
||||
seq_printf(m, "gpc=%ld gps=%ld nn=%ld ndw%ld\n",
|
||||
rdp->n_rp_gp_completed,
|
||||
rdp->n_rp_gp_started,
|
||||
rdp->n_rp_nocb_defer_wakeup,
|
||||
rdp->n_rp_need_nothing);
|
||||
}
|
||||
|
||||
static int show_rcu_pending(struct seq_file *m, void *v)
|
||||
{
|
||||
print_one_rcu_pending(m, (struct rcu_data *)v);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static const struct seq_operations rcu_pending_op = {
|
||||
.start = r_start,
|
||||
.next = r_next,
|
||||
.stop = r_stop,
|
||||
.show = show_rcu_pending,
|
||||
};
|
||||
|
||||
static int rcu_pending_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return r_open(inode, file, &rcu_pending_op);
|
||||
}
|
||||
|
||||
static const struct file_operations rcu_pending_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = rcu_pending_open,
|
||||
.read = seq_read,
|
||||
.llseek = no_llseek,
|
||||
.release = seq_release,
|
||||
};
|
||||
|
||||
static int show_rcutorture(struct seq_file *m, void *unused)
|
||||
{
|
||||
seq_printf(m, "rcutorture test sequence: %lu %s\n",
|
||||
rcutorture_testseq >> 1,
|
||||
(rcutorture_testseq & 0x1) ? "(test in progress)" : "");
|
||||
seq_printf(m, "rcutorture update version number: %lu\n",
|
||||
rcutorture_vernum);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int rcutorture_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return single_open(file, show_rcutorture, NULL);
|
||||
}
|
||||
|
||||
static const struct file_operations rcutorture_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = rcutorture_open,
|
||||
.read = seq_read,
|
||||
.llseek = seq_lseek,
|
||||
.release = single_release,
|
||||
};
|
||||
|
||||
static struct dentry *rcudir;
|
||||
|
||||
static int __init rcutree_trace_init(void)
|
||||
{
|
||||
struct rcu_state *rsp;
|
||||
struct dentry *retval;
|
||||
struct dentry *rspdir;
|
||||
|
||||
rcudir = debugfs_create_dir("rcu", NULL);
|
||||
if (!rcudir)
|
||||
goto free_out;
|
||||
|
||||
for_each_rcu_flavor(rsp) {
|
||||
rspdir = debugfs_create_dir(rsp->name, rcudir);
|
||||
if (!rspdir)
|
||||
goto free_out;
|
||||
|
||||
retval = debugfs_create_file("rcudata", 0444,
|
||||
rspdir, rsp, &rcudata_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
|
||||
retval = debugfs_create_file("rcuexp", 0444,
|
||||
rspdir, rsp, &rcuexp_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
|
||||
retval = debugfs_create_file("rcu_pending", 0444,
|
||||
rspdir, rsp, &rcu_pending_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
|
||||
retval = debugfs_create_file("rcubarrier", 0444,
|
||||
rspdir, rsp, &rcubarrier_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
|
||||
#ifdef CONFIG_RCU_BOOST
|
||||
if (rsp == &rcu_preempt_state) {
|
||||
retval = debugfs_create_file("rcuboost", 0444,
|
||||
rspdir, NULL, &rcu_node_boost_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
}
|
||||
#endif
|
||||
|
||||
retval = debugfs_create_file("rcugp", 0444,
|
||||
rspdir, rsp, &rcugp_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
|
||||
retval = debugfs_create_file("rcuhier", 0444,
|
||||
rspdir, rsp, &rcuhier_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
}
|
||||
|
||||
retval = debugfs_create_file("rcutorture", 0444, rcudir,
|
||||
NULL, &rcutorture_fops);
|
||||
if (!retval)
|
||||
goto free_out;
|
||||
return 0;
|
||||
free_out:
|
||||
debugfs_remove_recursive(rcudir);
|
||||
return 1;
|
||||
}
|
||||
device_initcall(rcutree_trace_init);
|
@ -62,7 +62,9 @@
|
||||
#define MODULE_PARAM_PREFIX "rcupdate."
|
||||
|
||||
#ifndef CONFIG_TINY_RCU
|
||||
extern int rcu_expedited; /* from sysctl */
|
||||
module_param(rcu_expedited, int, 0);
|
||||
extern int rcu_normal; /* from sysctl */
|
||||
module_param(rcu_normal, int, 0);
|
||||
static int rcu_normal_after_boot;
|
||||
module_param(rcu_normal_after_boot, int, 0);
|
||||
@ -379,6 +381,7 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
|
||||
struct rcu_synchronize *rs_array)
|
||||
{
|
||||
int i;
|
||||
int j;
|
||||
|
||||
/* Initialize and register callbacks for each flavor specified. */
|
||||
for (i = 0; i < n; i++) {
|
||||
@ -390,7 +393,11 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
|
||||
}
|
||||
init_rcu_head_on_stack(&rs_array[i].head);
|
||||
init_completion(&rs_array[i].completion);
|
||||
(crcu_array[i])(&rs_array[i].head, wakeme_after_rcu);
|
||||
for (j = 0; j < i; j++)
|
||||
if (crcu_array[j] == crcu_array[i])
|
||||
break;
|
||||
if (j == i)
|
||||
(crcu_array[i])(&rs_array[i].head, wakeme_after_rcu);
|
||||
}
|
||||
|
||||
/* Wait for all callbacks to be invoked. */
|
||||
@ -399,7 +406,11 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
|
||||
(crcu_array[i] == call_rcu ||
|
||||
crcu_array[i] == call_rcu_bh))
|
||||
continue;
|
||||
wait_for_completion(&rs_array[i].completion);
|
||||
for (j = 0; j < i; j++)
|
||||
if (crcu_array[j] == crcu_array[i])
|
||||
break;
|
||||
if (j == i)
|
||||
wait_for_completion(&rs_array[i].completion);
|
||||
destroy_rcu_head_on_stack(&rs_array[i].head);
|
||||
}
|
||||
}
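The two hunks above teach __wait_rcu_gp() to skip duplicate entries in crcu_array: a callback is posted, and later waited for, only on the first occurrence of each grace-period flavor. Callers such as synchronize_rcu_mult() can otherwise pass the same function twice, for example call_rcu and call_rcu_sched on !CONFIG_PREEMPT kernels, where the former aliases the latter. A minimal userspace sketch of the skip-duplicates idiom follows; flavor_fn, register_unique() and the flavor_a()/flavor_b() helpers are illustrative names only, not kernel APIs.

#include <stdio.h>

typedef void (*flavor_fn)(void);

static void flavor_a(void) { puts("register flavor A"); }
static void flavor_b(void) { puts("register flavor B"); }

/* Act on each entry only the first time its function pointer is seen. */
static void register_unique(flavor_fn *array, int n)
{
        int i, j;

        for (i = 0; i < n; i++) {
                for (j = 0; j < i; j++)
                        if (array[j] == array[i])
                                break;          /* already handled earlier */
                if (j == i)
                        array[i]();             /* first occurrence only */
        }
}

int main(void)
{
        /* flavor_a appears twice, as call_rcu can alias call_rcu_sched. */
        flavor_fn flavors[] = { flavor_a, flavor_b, flavor_a };

        register_unique(flavors, 3);
        return 0;
}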
|
||||
@ -560,15 +571,30 @@ static DEFINE_RAW_SPINLOCK(rcu_tasks_cbs_lock);
|
||||
DEFINE_SRCU(tasks_rcu_exit_srcu);
|
||||
|
||||
/* Control stall timeouts. Disable with <= 0, otherwise jiffies till stall. */
|
||||
static int rcu_task_stall_timeout __read_mostly = HZ * 60 * 10;
|
||||
#define RCU_TASK_STALL_TIMEOUT (HZ * 60 * 10)
|
||||
static int rcu_task_stall_timeout __read_mostly = RCU_TASK_STALL_TIMEOUT;
|
||||
module_param(rcu_task_stall_timeout, int, 0644);
|
||||
|
||||
static void rcu_spawn_tasks_kthread(void);
|
||||
static struct task_struct *rcu_tasks_kthread_ptr;
|
||||
|
||||
/*
|
||||
* Post an RCU-tasks callback. First call must be from process context
|
||||
* after the scheduler is fully operational.
|
||||
/**
|
||||
* call_rcu_tasks() - Queue an RCU callback for invocation after a task-based grace period
|
||||
* @rhp: structure to be used for queueing the RCU updates.
|
||||
* @func: actual callback function to be invoked after the grace period
|
||||
*
|
||||
* The callback function will be invoked some time after a full grace
|
||||
* period elapses, in other words after all currently executing RCU
|
||||
* read-side critical sections have completed. call_rcu_tasks() assumes
|
||||
* that the read-side critical sections end at a voluntary context
|
||||
* switch (not a preemption!), entry into idle, or transition to usermode
|
||||
* execution. As such, there are no read-side primitives analogous to
|
||||
* rcu_read_lock() and rcu_read_unlock() because this primitive is intended
|
||||
* to determine that all tasks have passed through a safe state, not so
|
||||
* much for data-structure synchronization.
|
||||
*
|
||||
* See the description of call_rcu() for more detailed information on
|
||||
* memory ordering guarantees.
|
||||
*/
|
||||
void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
|
||||
{
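To illustrate the interface documented above: a caller embeds an rcu_head in its own structure and passes call_rcu_tasks() a callback that releases the structure once every task has passed through a voluntary context switch, idle, or usermode execution. The fragment below is a hedged sketch, not code from this series; struct foo, foo_free_cb() and foo_retire() are made-up names, while call_rcu_tasks(), struct rcu_head, container_of() and kfree() are the real kernel interfaces.

#include <linux/kernel.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct foo {
        int data;
        struct rcu_head rh;
};

static void foo_free_cb(struct rcu_head *rhp)
{
        struct foo *fp = container_of(rhp, struct foo, rh);

        kfree(fp);      /* runs only after a tasks-RCU grace period */
}

static void foo_retire(struct foo *fp)
{
        /* Asynchronously free fp once all tasks have reached a safe state. */
        call_rcu_tasks(&fp->rh, foo_free_cb);
}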
|
||||
@ -851,6 +877,23 @@ static void rcu_spawn_tasks_kthread(void)
|
||||
|
||||
#endif /* #ifdef CONFIG_TASKS_RCU */
|
||||
|
||||
#ifndef CONFIG_TINY_RCU
|
||||
|
||||
/*
|
||||
* Print any non-default Tasks RCU settings.
|
||||
*/
|
||||
static void __init rcu_tasks_bootup_oddness(void)
|
||||
{
|
||||
#ifdef CONFIG_TASKS_RCU
|
||||
if (rcu_task_stall_timeout != RCU_TASK_STALL_TIMEOUT)
|
||||
pr_info("\tTasks-RCU CPU stall warnings timeout set to %d (rcu_task_stall_timeout).\n", rcu_task_stall_timeout);
|
||||
else
|
||||
pr_info("\tTasks RCU enabled.\n");
|
||||
#endif /* #ifdef CONFIG_TASKS_RCU */
|
||||
}
|
||||
|
||||
#endif /* #ifndef CONFIG_TINY_RCU */
|
||||
|
||||
#ifdef CONFIG_PROVE_RCU
|
||||
|
||||
/*
|
||||
@ -935,3 +978,25 @@ late_initcall(rcu_verify_early_boot_tests);
|
||||
#else
|
||||
void rcu_early_boot_tests(void) {}
|
||||
#endif /* CONFIG_PROVE_RCU */
|
||||
|
||||
#ifndef CONFIG_TINY_RCU
|
||||
|
||||
/*
|
||||
* Print any significant non-default boot-time settings.
|
||||
*/
|
||||
void __init rcupdate_announce_bootup_oddness(void)
|
||||
{
|
||||
if (rcu_normal)
|
||||
pr_info("\tNo expedited grace period (rcu_normal).\n");
|
||||
else if (rcu_normal_after_boot)
|
||||
pr_info("\tNo expedited grace period (rcu_normal_after_boot).\n");
|
||||
else if (rcu_expedited)
|
||||
pr_info("\tAll grace periods are expedited (rcu_expedited).\n");
|
||||
if (rcu_cpu_stall_suppress)
|
||||
pr_info("\tRCU CPU stall warnings suppressed (rcu_cpu_stall_suppress).\n");
|
||||
if (rcu_cpu_stall_timeout != CONFIG_RCU_CPU_STALL_TIMEOUT)
|
||||
pr_info("\tRCU CPU stall warnings timeout set to %d (rcu_cpu_stall_timeout).\n", rcu_cpu_stall_timeout);
|
||||
rcu_tasks_bootup_oddness();
|
||||
}
|
||||
|
||||
#endif /* #ifndef CONFIG_TINY_RCU */
|
||||
|
@ -5874,15 +5874,9 @@ int sched_cpu_deactivate(unsigned int cpu)
|
||||
* users of this state to go away such that all new such users will
|
||||
* observe it.
|
||||
*
|
||||
* For CONFIG_PREEMPT we have preemptible RCU and its sync_rcu() might
|
||||
* not imply sync_sched(), so wait for both.
|
||||
*
|
||||
* Do the sync before parking the smpboot threads to take care of the RCU-boost case.
|
||||
*/
|
||||
if (IS_ENABLED(CONFIG_PREEMPT))
|
||||
synchronize_rcu_mult(call_rcu, call_rcu_sched);
|
||||
else
|
||||
synchronize_rcu();
|
||||
synchronize_rcu_mult(call_rcu, call_rcu_sched);
|
||||
|
||||
if (!sched_smp_initialized)
|
||||
return 0;
|
||||
|
@ -126,56 +126,6 @@ config NO_HZ_FULL_ALL
|
||||
Note the boot CPU will still be kept outside the range to
|
||||
handle the timekeeping duty.
|
||||
|
||||
config NO_HZ_FULL_SYSIDLE
|
||||
bool "Detect full-system idle state for full dynticks system"
|
||||
depends on NO_HZ_FULL
|
||||
default n
|
||||
help
|
||||
At least one CPU must keep the scheduling-clock tick running for
|
||||
timekeeping purposes whenever there is a non-idle CPU, where
|
||||
"non-idle" also includes dynticks CPUs as long as they are
|
||||
running non-idle tasks. Because the underlying adaptive-tick
|
||||
support cannot distinguish between all CPUs being idle and
|
||||
all CPUs each running a single task in dynticks mode, the
|
||||
underlying support simply ensures that there is always a CPU
|
||||
handling the scheduling-clock tick, whether or not all CPUs
|
||||
are idle. This Kconfig option enables scalable detection of
|
||||
the all-CPUs-idle state, thus allowing the scheduling-clock
|
||||
tick to be disabled when all CPUs are idle. Note that scalable
|
||||
detection of the all-CPUs-idle state means that larger systems
|
||||
will be slower to declare the all-CPUs-idle state.
|
||||
|
||||
Say Y if you would like to help debug all-CPUs-idle detection.
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config NO_HZ_FULL_SYSIDLE_SMALL
|
||||
int "Number of CPUs above which large-system approach is used"
|
||||
depends on NO_HZ_FULL_SYSIDLE
|
||||
range 1 NR_CPUS
|
||||
default 8
|
||||
help
|
||||
The full-system idle detection mechanism takes a lazy approach
|
||||
on large systems, as is required to attain decent scalability.
|
||||
However, on smaller systems, scalability is not anywhere near as
|
||||
large a concern as is energy efficiency. The sysidle subsystem
|
||||
therefore uses a fast but non-scalable algorithm for small
|
||||
systems and a lazier but scalable algorithm for large systems.
|
||||
This Kconfig parameter defines the number of CPUs in the largest
|
||||
system that will be considered to be "small".
|
||||
|
||||
The default value will be fine in most cases. Battery-powered
|
||||
systems that (1) enable NO_HZ_FULL_SYSIDLE, (2) have larger
|
||||
numbers of CPUs, and (3) are suffering from battery-lifetime
|
||||
problems due to long sysidle latencies might wish to experiment
|
||||
with larger values for this Kconfig parameter. On the other
|
||||
hand, they might be even better served by disabling NO_HZ_FULL
|
||||
entirely, given that NO_HZ_FULL is intended for HPC and
|
||||
real-time workloads that at present do not tend to be run on
|
||||
battery-powered systems.
|
||||
|
||||
Take the default if you are unsure.
|
||||
|
||||
config NO_HZ
|
||||
bool "Old Idle dynticks config"
|
||||
depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
|
||||
|
@ -1301,189 +1301,7 @@ config DEBUG_CREDENTIALS
|
||||
|
||||
If unsure, say N.
|
||||
|
||||
menu "RCU Debugging"
|
||||
|
||||
config PROVE_RCU
|
||||
def_bool PROVE_LOCKING
|
||||
|
||||
config PROVE_RCU_REPEATEDLY
|
||||
bool "RCU debugging: don't disable PROVE_RCU on first splat"
|
||||
depends on PROVE_RCU
|
||||
default n
|
||||
help
|
||||
By itself, PROVE_RCU will disable checking upon issuing the
|
||||
first warning (or "splat"). This feature prevents such
|
||||
disabling, allowing multiple RCU-lockdep warnings to be printed
|
||||
on a single reboot.
|
||||
|
||||
Say Y to allow multiple RCU-lockdep warnings per boot.
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config SPARSE_RCU_POINTER
|
||||
bool "RCU debugging: sparse-based checks for pointer usage"
|
||||
default n
|
||||
help
|
||||
This feature enables the __rcu sparse annotation for
|
||||
RCU-protected pointers. This annotation will cause sparse
|
||||
to flag any non-RCU use of annotated pointers. This can be
|
||||
helpful when debugging RCU usage. Please note that this feature
|
||||
is not intended to enforce code cleanliness; it is instead merely
|
||||
a debugging aid.
|
||||
|
||||
Say Y to make sparse flag questionable use of RCU-protected pointers
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config TORTURE_TEST
|
||||
tristate
|
||||
default n
|
||||
|
||||
config RCU_PERF_TEST
|
||||
tristate "performance tests for RCU"
|
||||
depends on DEBUG_KERNEL
|
||||
select TORTURE_TEST
|
||||
select SRCU
|
||||
select TASKS_RCU
|
||||
default n
|
||||
help
|
||||
This option provides a kernel module that runs performance
|
||||
tests on the RCU infrastructure. The kernel module may be built
|
||||
after the fact on the running kernel to be tested, if desired.
|
||||
|
||||
Say Y here if you want RCU performance tests to be built into
|
||||
the kernel.
|
||||
Say M if you want the RCU performance tests to build as a module.
|
||||
Say N if you are unsure.
|
||||
|
||||
config RCU_TORTURE_TEST
|
||||
tristate "torture tests for RCU"
|
||||
depends on DEBUG_KERNEL
|
||||
select TORTURE_TEST
|
||||
select SRCU
|
||||
select TASKS_RCU
|
||||
default n
|
||||
help
|
||||
This option provides a kernel module that runs torture tests
|
||||
on the RCU infrastructure. The kernel module may be built
|
||||
after the fact on the running kernel to be tested, if desired.
|
||||
|
||||
Say Y here if you want RCU torture tests to be built into
|
||||
the kernel.
|
||||
Say M if you want the RCU torture tests to build as a module.
|
||||
Say N if you are unsure.
|
||||
|
||||
config RCU_TORTURE_TEST_SLOW_PREINIT
|
||||
bool "Slow down RCU grace-period pre-initialization to expose races"
|
||||
depends on RCU_TORTURE_TEST
|
||||
help
|
||||
This option delays grace-period pre-initialization (the
|
||||
propagation of CPU-hotplug changes up the rcu_node combining
|
||||
tree) for a few jiffies between initializing each pair of
|
||||
consecutive rcu_node structures. This helps to expose races
|
||||
involving grace-period pre-initialization, in other words, it
|
||||
makes your kernel less stable. It can also greatly increase
|
||||
grace-period latency, especially on systems with large numbers
|
||||
of CPUs. This is useful when torture-testing RCU, but in
|
||||
almost no other circumstance.
|
||||
|
||||
Say Y here if you want your system to crash and hang more often.
|
||||
Say N if you want a sane system.
|
||||
|
||||
config RCU_TORTURE_TEST_SLOW_PREINIT_DELAY
|
||||
int "How much to slow down RCU grace-period pre-initialization"
|
||||
range 0 5
|
||||
default 3
|
||||
depends on RCU_TORTURE_TEST_SLOW_PREINIT
|
||||
help
|
||||
This option specifies the number of jiffies to wait between
|
||||
each rcu_node structure pre-initialization step.
|
||||
|
||||
config RCU_TORTURE_TEST_SLOW_INIT
|
||||
bool "Slow down RCU grace-period initialization to expose races"
|
||||
depends on RCU_TORTURE_TEST
|
||||
help
|
||||
This option delays grace-period initialization for a few
|
||||
jiffies between initializing each pair of consecutive
|
||||
rcu_node structures. This helps to expose races involving
|
||||
grace-period initialization, in other words, it makes your
|
||||
kernel less stable. It can also greatly increase grace-period
|
||||
latency, especially on systems with large numbers of CPUs.
|
||||
This is useful when torture-testing RCU, but in almost no
|
||||
other circumstance.
|
||||
|
||||
Say Y here if you want your system to crash and hang more often.
|
||||
Say N if you want a sane system.
|
||||
|
||||
config RCU_TORTURE_TEST_SLOW_INIT_DELAY
|
||||
int "How much to slow down RCU grace-period initialization"
|
||||
range 0 5
|
||||
default 3
|
||||
depends on RCU_TORTURE_TEST_SLOW_INIT
|
||||
help
|
||||
This option specifies the number of jiffies to wait between
|
||||
each rcu_node structure initialization.
|
||||
|
||||
config RCU_TORTURE_TEST_SLOW_CLEANUP
|
||||
bool "Slow down RCU grace-period cleanup to expose races"
|
||||
depends on RCU_TORTURE_TEST
|
||||
help
|
||||
This option delays grace-period cleanup for a few jiffies
|
||||
between cleaning up each pair of consecutive rcu_node
|
||||
structures. This helps to expose races involving grace-period
|
||||
cleanup, in other words, it makes your kernel less stable.
|
||||
It can also greatly increase grace-period latency, especially
|
||||
on systems with large numbers of CPUs. This is useful when
|
||||
torture-testing RCU, but in almost no other circumstance.
|
||||
|
||||
Say Y here if you want your system to crash and hang more often.
|
||||
Say N if you want a sane system.
|
||||
|
||||
config RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY
|
||||
int "How much to slow down RCU grace-period cleanup"
|
||||
range 0 5
|
||||
default 3
|
||||
depends on RCU_TORTURE_TEST_SLOW_CLEANUP
|
||||
help
|
||||
This option specifies the number of jiffies to wait between
|
||||
each rcu_node structure cleanup operation.
|
||||
|
||||
config RCU_CPU_STALL_TIMEOUT
|
||||
int "RCU CPU stall timeout in seconds"
|
||||
depends on RCU_STALL_COMMON
|
||||
range 3 300
|
||||
default 21
|
||||
help
|
||||
If a given RCU grace period extends more than the specified
|
||||
number of seconds, a CPU stall warning is printed. If the
|
||||
RCU grace period persists, additional CPU stall warnings are
|
||||
printed at more widely spaced intervals.
|
||||
|
||||
config RCU_TRACE
|
||||
bool "Enable tracing for RCU"
|
||||
depends on DEBUG_KERNEL
|
||||
default y if TREE_RCU
|
||||
select TRACE_CLOCK
|
||||
help
|
||||
This option provides tracing in RCU which presents stats
|
||||
in debugfs for debugging RCU implementation. It also enables
|
||||
additional tracepoints for ftrace-style event tracing.
|
||||
|
||||
Say Y here if you want to enable RCU tracing
|
||||
Say N if you are unsure.
|
||||
|
||||
config RCU_EQS_DEBUG
|
||||
bool "Provide debugging asserts for adding NO_HZ support to an arch"
|
||||
depends on DEBUG_KERNEL
|
||||
help
|
||||
This option provides consistency checks in RCU's handling of
|
||||
NO_HZ. These checks have proven quite helpful in detecting
|
||||
bugs in arch-specific NO_HZ code.
|
||||
|
||||
Say N here if you need ultimate kernel/user switch latencies
|
||||
Say Y if you are unsure
|
||||
|
||||
endmenu # "RCU Debugging"
|
||||
source "kernel/rcu/Kconfig.debug"
|
||||
|
||||
config DEBUG_WQ_FORCE_RR_CPU
|
||||
bool "Force round-robin CPU selection for unbound work items"
|
||||
|
@ -25,9 +25,6 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
|
||||
earlycpio.o seq_buf.o siphash.o \
|
||||
nmi_backtrace.o nodemask.o win_minmax.o
|
||||
|
||||
CFLAGS_radix-tree.o += -DCONFIG_SPARSE_RCU_POINTER
|
||||
CFLAGS_idr.o += -DCONFIG_SPARSE_RCU_POINTER
|
||||
|
||||
lib-$(CONFIG_MMU) += ioremap.o
|
||||
lib-$(CONFIG_SMP) += cpumask.o
|
||||
lib-$(CONFIG_DMA_NOOP_OPS) += dma-noop.o
|
||||
|
@ -5533,23 +5533,6 @@ sub process {
|
||||
}
|
||||
}
|
||||
|
||||
# Check for expedited grace periods that interrupt non-idle non-nohz
|
||||
# online CPUs. These expedited grace periods can therefore degrade real-time response
|
||||
# if used carelessly, and should be avoided where not absolutely
|
||||
# needed. It is always OK to use synchronize_rcu_expedited() and
|
||||
# synchronize_sched_expedited() at boot time (before real-time applications
|
||||
# start) and in error situations where real-time response is compromised in
|
||||
# any case. Note that synchronize_srcu_expedited() does -not- interrupt
|
||||
# other CPUs, so don't warn on uses of synchronize_srcu_expedited().
|
||||
# Of course, nothing comes for free, and srcu_read_lock() and
|
||||
# srcu_read_unlock() do contain full memory barriers in payment for
|
||||
# synchronize_srcu_expedited() non-interruption properties.
|
||||
if ($line =~ /\b(synchronize_rcu_expedited|synchronize_sched_expedited)\(/) {
|
||||
WARN("EXPEDITED_RCU_GRACE_PERIOD",
|
||||
"expedited RCU grace periods should be avoided where they can degrade real-time response\n" . $herecurr);
|
||||
|
||||
}
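To make the guidance behind this check concrete, an ordinary removal path can use a normal grace period, which, unlike synchronize_rcu_expedited(), does not interrupt busy CPUs. The sketch below is illustrative only; struct gadget and gadget_remove() are hypothetical names, not part of this series.

#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct gadget {
        struct list_head node;
        int id;
};

static void gadget_remove(struct gadget *g)
{
        list_del_rcu(&g->node);
        synchronize_rcu();      /* normal grace period: no IPIs to busy CPUs */
        kfree(g);
}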
|
||||
|
||||
# check of hardware specific defines
|
||||
if ($line =~ m@^.\s*\#\s*if.*\b(__i386__|__powerpc64__|__sun__|__s390x__)\b@ && $realfile !~ m@include/asm-@) {
|
||||
CHK("ARCH_DEFINES",
|
||||
|
@ -27,7 +27,7 @@ cat $1 > $T/.config
|
||||
|
||||
cat $2 | sed -e 's/\(.*\)=n/# \1 is not set/' -e 's/^#CHECK#//' |
|
||||
awk '
|
||||
BEGIN {
|
||||
{
|
||||
print "if grep -q \"" $0 "\" < '"$T/.config"'";
|
||||
print "then";
|
||||
print "\t:";
|
||||
|
@ -45,7 +45,7 @@ T=/tmp/test-linux.sh.$$
|
||||
trap 'rm -rf $T' 0
|
||||
mkdir $T
|
||||
|
||||
grep -v 'CONFIG_[A-Z]*_TORTURE_TEST' < ${config_template} > $T/config
|
||||
grep -v 'CONFIG_[A-Z]*_TORTURE_TEST=' < ${config_template} > $T/config
|
||||
cat << ___EOF___ >> $T/config
|
||||
CONFIG_INITRAMFS_SOURCE="$TORTURE_INITRD"
|
||||
CONFIG_VIRTIO_PCI=y
|
||||
|
@ -296,10 +296,7 @@ if test -d .git
|
||||
then
|
||||
git status >> $resdir/$ds/testid.txt
|
||||
git rev-parse HEAD >> $resdir/$ds/testid.txt
|
||||
if ! git diff HEAD > $T/git-diff 2>&1
|
||||
then
|
||||
cp $T/git-diff $resdir/$ds
|
||||
fi
|
||||
git diff HEAD >> $resdir/$ds/testid.txt
|
||||
fi
|
||||
___EOF___
|
||||
awk < $T/cfgcpu.pack \
|
||||
|
@ -9,6 +9,8 @@ TREE08
|
||||
TREE09
|
||||
SRCU-N
|
||||
SRCU-P
|
||||
SRCU-t
|
||||
SRCU-u
|
||||
TINY01
|
||||
TINY02
|
||||
TASKS01
|
||||
|
@ -0,0 +1 @@
|
||||
rcutorture.torture_type=srcud
|
@ -5,4 +5,4 @@ CONFIG_HOTPLUG_CPU=y
|
||||
CONFIG_PREEMPT_NONE=y
|
||||
CONFIG_PREEMPT_VOLUNTARY=n
|
||||
CONFIG_PREEMPT=n
|
||||
CONFIG_RCU_EXPERT=y
|
||||
#CHECK#CONFIG_RCU_EXPERT=n
|
||||
|
@ -2,7 +2,11 @@ CONFIG_RCU_TRACE=n
|
||||
CONFIG_SMP=y
|
||||
CONFIG_NR_CPUS=8
|
||||
CONFIG_HOTPLUG_CPU=y
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_FANOUT=2
|
||||
CONFIG_RCU_FANOUT_LEAF=2
|
||||
CONFIG_PREEMPT_NONE=n
|
||||
CONFIG_PREEMPT_VOLUNTARY=n
|
||||
CONFIG_PREEMPT=y
|
||||
#CHECK#CONFIG_RCU_EXPERT=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=y
|
||||
CONFIG_PROVE_LOCKING=y
|
||||
|
tools/testing/selftests/rcutorture/configs/rcu/SRCU-t (new file)
@ -0,0 +1,10 @@
|
||||
CONFIG_SMP=n
|
||||
CONFIG_PREEMPT_NONE=y
|
||||
CONFIG_PREEMPT_VOLUNTARY=n
|
||||
CONFIG_PREEMPT=n
|
||||
#CHECK#CONFIG_TINY_SRCU=y
|
||||
CONFIG_RCU_TRACE=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_DEBUG_ATOMIC_SLEEP=y
|
||||
#CHECK#CONFIG_PREEMPT_COUNT=y
|
@ -0,0 +1 @@
|
||||
rcutorture.torture_type=srcu
|
tools/testing/selftests/rcutorture/configs/rcu/SRCU-u (new file)
@ -0,0 +1,9 @@
|
||||
CONFIG_SMP=n
|
||||
CONFIG_PREEMPT_NONE=y
|
||||
CONFIG_PREEMPT_VOLUNTARY=n
|
||||
CONFIG_PREEMPT=n
|
||||
#CHECK#CONFIG_TINY_SRCU=y
|
||||
CONFIG_RCU_TRACE=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_PREEMPT_COUNT=n
|
@ -0,0 +1 @@
|
||||
rcutorture.torture_type=srcud
|
@ -6,10 +6,9 @@ CONFIG_PREEMPT=n
|
||||
CONFIG_HZ_PERIODIC=y
|
||||
CONFIG_NO_HZ_IDLE=n
|
||||
CONFIG_NO_HZ_FULL=n
|
||||
CONFIG_RCU_TRACE=y
|
||||
CONFIG_PROVE_LOCKING=y
|
||||
CONFIG_PROVE_RCU_REPEATEDLY=y
|
||||
#CHECK#CONFIG_PROVE_RCU=y
|
||||
CONFIG_DEBUG_LOCK_ALLOC=y
|
||||
CONFIG_DEBUG_OBJECTS=y
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
|
||||
CONFIG_PREEMPT_COUNT=y
|
||||
CONFIG_DEBUG_ATOMIC_SLEEP=y
|
||||
|
@ -10,12 +10,9 @@ CONFIG_RCU_FAST_NO_HZ=y
|
||||
CONFIG_RCU_TRACE=y
|
||||
CONFIG_HOTPLUG_CPU=y
|
||||
CONFIG_MAXSMP=y
|
||||
CONFIG_CPUMASK_OFFSTACK=y
|
||||
CONFIG_RCU_NOCB_CPU=y
|
||||
CONFIG_RCU_NOCB_CPU_ZERO=y
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_RCU_BOOST=n
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
|
||||
|
@ -1 +1,5 @@
|
||||
rcutorture.torture_type=rcu_bh maxcpus=8
|
||||
rcutree.gp_preinit_delay=3
|
||||
rcutree.gp_init_delay=3
|
||||
rcutree.gp_cleanup_delay=3
|
||||
rcu_nocbs=0
|
||||
|
@ -18,9 +18,6 @@ CONFIG_RCU_NOCB_CPU=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=y
|
||||
CONFIG_PROVE_LOCKING=n
|
||||
CONFIG_RCU_BOOST=n
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
|
||||
CONFIG_DEBUG_OBJECTS=y
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
|
||||
|
@ -14,9 +14,5 @@ CONFIG_RCU_FANOUT_LEAF=2
|
||||
CONFIG_RCU_NOCB_CPU=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_RCU_BOOST=y
|
||||
CONFIG_RCU_KTHREAD_PRIO=2
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
|
||||
|
@ -1 +1,5 @@
|
||||
rcutorture.onoff_interval=1 rcutorture.onoff_holdoff=30
|
||||
rcutree.gp_preinit_delay=3
|
||||
rcutree.gp_init_delay=3
|
||||
rcutree.gp_cleanup_delay=3
|
||||
rcutree.kthread_prio=2
|
||||
|
@ -15,11 +15,7 @@ CONFIG_SUSPEND=n
|
||||
CONFIG_HIBERNATION=n
|
||||
CONFIG_RCU_FANOUT=4
|
||||
CONFIG_RCU_FANOUT_LEAF=3
|
||||
CONFIG_RCU_NOCB_CPU=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
|
||||
CONFIG_RCU_EQS_DEBUG=y
|
||||
|
@ -13,12 +13,8 @@ CONFIG_HOTPLUG_CPU=y
|
||||
CONFIG_RCU_FANOUT=6
|
||||
CONFIG_RCU_FANOUT_LEAF=6
|
||||
CONFIG_RCU_NOCB_CPU=y
|
||||
CONFIG_RCU_NOCB_CPU_NONE=y
|
||||
CONFIG_DEBUG_LOCK_ALLOC=y
|
||||
CONFIG_PROVE_LOCKING=y
|
||||
#CHECK#CONFIG_PROVE_RCU=y
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
|
||||
|
@ -1,2 +1,5 @@
|
||||
rcutorture.torture_type=sched
|
||||
rcupdate.rcu_self_test_sched=1
|
||||
rcutree.gp_preinit_delay=3
|
||||
rcutree.gp_init_delay=3
|
||||
rcutree.gp_cleanup_delay=3
|
||||
|
@ -18,8 +18,6 @@ CONFIG_RCU_NOCB_CPU=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=y
|
||||
CONFIG_PROVE_LOCKING=y
|
||||
#CHECK#CONFIG_PROVE_RCU=y
|
||||
CONFIG_DEBUG_OBJECTS=y
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
|
||||
|
@ -2,3 +2,6 @@ rcupdate.rcu_self_test=1
|
||||
rcupdate.rcu_self_test_bh=1
|
||||
rcupdate.rcu_self_test_sched=1
|
||||
rcutree.rcu_fanout_exact=1
|
||||
rcutree.gp_preinit_delay=3
|
||||
rcutree.gp_init_delay=3
|
||||
rcutree.gp_cleanup_delay=3
|
||||
|
@ -1,6 +1,5 @@
|
||||
CONFIG_SMP=y
|
||||
CONFIG_NR_CPUS=16
|
||||
CONFIG_CPUMASK_OFFSTACK=y
|
||||
CONFIG_PREEMPT_NONE=y
|
||||
CONFIG_PREEMPT_VOLUNTARY=n
|
||||
CONFIG_PREEMPT=n
|
||||
@ -9,16 +8,11 @@ CONFIG_HZ_PERIODIC=n
|
||||
CONFIG_NO_HZ_IDLE=n
|
||||
CONFIG_NO_HZ_FULL=y
|
||||
CONFIG_NO_HZ_FULL_ALL=n
|
||||
CONFIG_NO_HZ_FULL_SYSIDLE=y
|
||||
CONFIG_RCU_FAST_NO_HZ=n
|
||||
CONFIG_RCU_TRACE=y
|
||||
CONFIG_HOTPLUG_CPU=y
|
||||
CONFIG_RCU_FANOUT=2
|
||||
CONFIG_RCU_FANOUT_LEAF=2
|
||||
CONFIG_RCU_NOCB_CPU=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
|
||||
|
@ -15,7 +15,6 @@ CONFIG_HIBERNATION=n
|
||||
CONFIG_RCU_FANOUT=3
|
||||
CONFIG_RCU_FANOUT_LEAF=2
|
||||
CONFIG_RCU_NOCB_CPU=y
|
||||
CONFIG_RCU_NOCB_CPU_ALL=y
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_PROVE_LOCKING=n
|
||||
CONFIG_RCU_BOOST=n
|
||||
|
@ -1,21 +0,0 @@
|
||||
CONFIG_SMP=y
|
||||
CONFIG_NR_CPUS=16
|
||||
CONFIG_PREEMPT_NONE=n
|
||||
CONFIG_PREEMPT_VOLUNTARY=n
|
||||
CONFIG_PREEMPT=y
|
||||
#CHECK#CONFIG_PREEMPT_RCU=y
|
||||
CONFIG_HZ_PERIODIC=n
|
||||
CONFIG_NO_HZ_IDLE=y
|
||||
CONFIG_NO_HZ_FULL=n
|
||||
CONFIG_RCU_FAST_NO_HZ=n
|
||||
CONFIG_RCU_TRACE=y
|
||||
CONFIG_HOTPLUG_CPU=n
|
||||
CONFIG_SUSPEND=n
|
||||
CONFIG_HIBERNATION=n
|
||||
CONFIG_RCU_FANOUT=3
|
||||
CONFIG_RCU_FANOUT_LEAF=2
|
||||
CONFIG_RCU_NOCB_CPU=y
|
||||
CONFIG_RCU_NOCB_CPU_ALL=y
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_RCU_BOOST=n
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
@ -2,3 +2,4 @@ rcutorture.torture_type=sched
|
||||
rcupdate.rcu_self_test=1
|
||||
rcupdate.rcu_self_test_sched=1
|
||||
rcutree.rcu_fanout_exact=1
|
||||
rcu_nocbs=0-7
|
||||
|
@ -1,21 +1,16 @@
|
||||
CONFIG_SMP=y
|
||||
CONFIG_NR_CPUS=8
|
||||
CONFIG_PREEMPT_NONE=n
|
||||
CONFIG_SMP=n
|
||||
CONFIG_PREEMPT_NONE=y
|
||||
CONFIG_PREEMPT_VOLUNTARY=n
|
||||
CONFIG_PREEMPT=y
|
||||
#CHECK#CONFIG_PREEMPT_RCU=y
|
||||
CONFIG_PREEMPT=n
|
||||
#CHECK#CONFIG_TINY_RCU=y
|
||||
CONFIG_HZ_PERIODIC=n
|
||||
CONFIG_NO_HZ_IDLE=y
|
||||
CONFIG_NO_HZ_FULL=n
|
||||
CONFIG_RCU_FAST_NO_HZ=n
|
||||
CONFIG_RCU_TRACE=y
|
||||
CONFIG_HOTPLUG_CPU=n
|
||||
CONFIG_SUSPEND=n
|
||||
CONFIG_HIBERNATION=n
|
||||
CONFIG_RCU_FANOUT=3
|
||||
CONFIG_RCU_FANOUT_LEAF=3
|
||||
CONFIG_RCU_NOCB_CPU=n
|
||||
CONFIG_DEBUG_LOCK_ALLOC=y
|
||||
CONFIG_DEBUG_LOCK_ALLOC=n
|
||||
CONFIG_PROVE_LOCKING=n
|
||||
CONFIG_RCU_BOOST=n
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
|
||||
CONFIG_RCU_EXPERT=y
|
||||
CONFIG_RCU_TRACE=y
|
@ -7,7 +7,6 @@ CONFIG_HZ_PERIODIC=n
|
||||
CONFIG_NO_HZ_IDLE=y
|
||||
CONFIG_NO_HZ_FULL=n
|
||||
CONFIG_RCU_FAST_NO_HZ=n
|
||||
CONFIG_RCU_TRACE=n
|
||||
CONFIG_HOTPLUG_CPU=n
|
||||
CONFIG_SUSPEND=n
|
||||
CONFIG_HIBERNATION=n
|
||||
|
@ -8,7 +8,6 @@ CONFIG_HZ_PERIODIC=n
|
||||
CONFIG_NO_HZ_IDLE=y
|
||||
CONFIG_NO_HZ_FULL=n
|
||||
CONFIG_RCU_FAST_NO_HZ=n
|
||||
CONFIG_RCU_TRACE=n
|
||||
CONFIG_HOTPLUG_CPU=n
|
||||
CONFIG_SUSPEND=n
|
||||
CONFIG_HIBERNATION=n
|
||||
|
@ -18,7 +18,6 @@ CONFIG_PROVE_RCU
|
||||
|
||||
In common code tested by TREE_RCU test cases.
|
||||
|
||||
CONFIG_NO_HZ_FULL_SYSIDLE
|
||||
CONFIG_RCU_NOCB_CPU
|
||||
|
||||
Meaningless for TINY_RCU.
|
||||
|
@ -9,28 +9,20 @@ CONFIG_DEBUG_OBJECTS_RCU_HEAD -- Do one.
|
||||
CONFIG_HOTPLUG_CPU -- Do half. (Every second.)
|
||||
CONFIG_HZ_PERIODIC -- Do one.
|
||||
CONFIG_NO_HZ_IDLE -- Do those not otherwise specified. (Groups of two.)
|
||||
CONFIG_NO_HZ_FULL -- Do two, one with CONFIG_NO_HZ_FULL_SYSIDLE.
|
||||
CONFIG_NO_HZ_FULL_SYSIDLE -- Do one.
|
||||
CONFIG_NO_HZ_FULL -- Do two, one with partial CPU enablement.
|
||||
CONFIG_PREEMPT -- Do half. (First three and #8.)
|
||||
CONFIG_PROVE_LOCKING -- Do several, covering CONFIG_DEBUG_LOCK_ALLOC=y and not.
|
||||
CONFIG_PROVE_RCU -- Hardwired to CONFIG_PROVE_LOCKING.
|
||||
CONFIG_PROVE_RCU_REPEATEDLY -- Do one.
|
||||
CONFIG_RCU_BOOST -- one of PREEMPT_RCU.
|
||||
CONFIG_RCU_KTHREAD_PRIO -- set to 2 for _BOOST testing.
|
||||
CONFIG_RCU_FANOUT -- Cover hierarchy, but overlap with others.
|
||||
CONFIG_RCU_FANOUT_LEAF -- Do one non-default.
|
||||
CONFIG_RCU_FAST_NO_HZ -- Do one, but not with CONFIG_RCU_NOCB_CPU_ALL.
|
||||
CONFIG_RCU_NOCB_CPU -- Do three, see below.
|
||||
CONFIG_RCU_NOCB_CPU_ALL -- Do one.
|
||||
CONFIG_RCU_NOCB_CPU_NONE -- Do one.
|
||||
CONFIG_RCU_NOCB_CPU_ZERO -- Do one.
|
||||
CONFIG_RCU_FAST_NO_HZ -- Do one, but not with all nohz_full CPUs.
|
||||
CONFIG_RCU_NOCB_CPU -- Do three, one with no rcu_nocbs CPUs, one with
|
||||
rcu_nocbs=0, and one with all rcu_nocbs CPUs.
|
||||
CONFIG_RCU_TRACE -- Do half.
|
||||
CONFIG_SMP -- Need one !SMP for PREEMPT_RCU.
|
||||
CONFIG_RCU_EXPERT=n -- Do a few, but these have to be vanilla configurations.
|
||||
CONFIG_RCU_EQS_DEBUG -- Do at least one for CONFIG_NO_HZ_FULL and not.
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP -- Do for all but a couple TREE scenarios.
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT -- Do for all but a couple TREE scenarios.
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT -- Do for all but a couple TREE scenarios.
|
||||
|
||||
RCU-bh: Do one with PREEMPT and one with !PREEMPT.
|
||||
RCU-sched: Do one with PREEMPT but not BOOST.
|
||||
@ -52,10 +44,6 @@ CONFIG_64BIT
|
||||
|
||||
Used only to check CONFIG_RCU_FANOUT value, inspection suffices.
|
||||
|
||||
CONFIG_NO_HZ_FULL_SYSIDLE_SMALL
|
||||
|
||||
Defer until Frederic uses this.
|
||||
|
||||
CONFIG_PREEMPT_COUNT
|
||||
CONFIG_PREEMPT_RCU
|
||||
|
||||
@ -78,30 +66,16 @@ CONFIG_RCU_TORTURE_TEST_RUNNABLE
|
||||
|
||||
Always used in KVM testing.
|
||||
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT_DELAY
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY
|
||||
CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY
|
||||
|
||||
Inspection suffices, ignore.
|
||||
|
||||
CONFIG_PREEMPT_RCU
|
||||
CONFIG_TREE_RCU
|
||||
CONFIG_TINY_RCU
|
||||
|
||||
These are controlled by CONFIG_PREEMPT and/or CONFIG_SMP.
|
||||
|
||||
CONFIG_SPARSE_RCU_POINTER
|
||||
|
||||
Makes sense only for sparse runs, not for kernel builds.
|
||||
|
||||
CONFIG_SRCU
|
||||
CONFIG_TASKS_RCU
|
||||
|
||||
Selected by CONFIG_RCU_TORTURE_TEST, so cannot disable.
|
||||
|
||||
CONFIG_RCU_TRACE
|
||||
|
||||
Implied by CONFIG_RCU_TRACE for Tree RCU.
|
||||
|
||||
|
||||
boot parameters ignored: TBD
|
||||
|
@ -1,4 +1,4 @@
|
||||
#!/bin/awk -f
|
||||
#!/usr/bin/awk -f
|
||||
|
||||
# Modify SRCU for formal verification. The first argument should be srcu.h and
|
||||
# the second should be srcu.c. Outputs modified srcu.h and srcu.c into the
|
||||
|