RCU pull request for v6.12
This pull request contains the following branches:

context_tracking.15.08.24a: Rename context tracking state related
  symbols and remove references to "dynticks" in various context
  tracking state variables and related helpers; force
  context_tracking_enabled_this_cpu() to be inlined to avoid leaving a
  noinstr section.

csd.lock.15.08.24a: Enhance CSD-lock diagnostic reports; add an API
  to provide an indication of ongoing CSD-lock stall.

nocb.09.09.24a: Update and simplify RCU nocb code to handle
  (de-)offloading of callbacks only for offline CPUs; fix RT throttling
  hrtimer being armed from offline CPU.

rcutorture.14.08.24a: Remove redundant rcu_torture_ops get_gp_completed
  fields; add SRCU ->same_gp_state and ->get_comp_state functions; add
  generic test for NUM_ACTIVE_*RCU_POLL* for testing RCU and SRCU polled
  grace periods; add CFcommon.arch for arch-specific Kconfig options;
  print number of update types in rcu_torture_write_types(); add
  rcutree.nohz_full_patience_delay testing to the TREE07 scenario; add
  a stall_cpu_repeat module parameter to test repeated CPU stalls; add
  argument to limit number of CPUs a guest OS can use in torture.sh.

rcustall.09.09.24a: Abbreviate RCU CPU stall warnings during CSD-lock
  stalls; Allow dump_cpu_task() to be called without disabling
  preemption; defer printing stall-warning backtrace when holding
  rcu_node lock.

srcu.12.08.24a: Make SRCU gp seq wrap-around faster; add KCSAN checks
  for concurrent updates to ->srcu_n_exp_nodelay and ->reschedule_count
  which are used in heuristics governing auto-expediting of normal SRCU
  grace periods and grace-period-state-machine delays; mark idle
  SRCU-barrier callbacks to help identify stuck SRCU-barrier callback.

rcu.tasks.14.08.24a: Remove RCU Tasks Rude asynchronous APIs as they
  are no longer used; stop testing RCU Tasks Rude asynchronous APIs; fix
  access to non-existent percpu regions; check processor-ID assumptions
  during chosen CPU calculation for callback enqueuing; update
  description of rtp->tasks_gp_seq grace-period sequence number; add
  rcu_barrier_cb_is_done() to identify whether a given rcu_barrier
  callback is stuck; mark idle Tasks-RCU-barrier callbacks; add
  *torture_stats_print() functions to print detailed diagnostics for
  Tasks-RCU variants; capture start time of rcu_barrier_tasks*()
  operation to help distinguish a hung barrier operation from a long
  series of barrier operations.

rcu_scaling_tests.15.08.24a:
  refscale: Add a TINY scenario to support tests of Tiny RCU and Tiny
  SRCU; Optimize process_durations() operation;
  rcuscale: Dump stacks of stalled rcu_scale_writer() instances; dump
  grace-period statistics when rcu_scale_writer() stalls; mark idle
  RCU-barrier callbacks to identify stuck RCU-barrier callbacks; print
  detailed grace-period and barrier diagnostics on rcu_scale_writer()
  hangs for Tasks-RCU variants; warn if async module parameter is
  specified for RCU implementations that do not have async primitives
  such as RCU Tasks Rude; make all writer tasks report upon hang;
  tolerate repeated GFP_KERNEL failure in rcu_scale_writer(); use
  special allocator for rcu_scale_writer(); NULL out top-level pointers
  to heap memory to avoid double-free bugs on modprobe failures;
  maintain per-task instead of per-CPU callbacks count to avoid any
  issues with migration of either tasks or callbacks; constify struct
  ref_scale_ops.

fixes.12.08.24a: Use system_unbound_wq for kfree_rcu work to avoid
  disturbing isolated CPUs.

misc.11.08.24a: Warn on unexpected rcu_state.srs_done_tail state;
  Better define "atomic" for list_replace_rcu() and hlist_replace_rcu()
  routines; annotate struct kvfree_rcu_bulk_data with __counted_by().
-----BEGIN PGP SIGNATURE-----

iHUEABYIAB0WIQSi2tPIQIc2VEtjarIAHS7/6Z0wpQUCZt8+8wAKCRAAHS7/6Z0w
pTqoAPwPN//tlEoJx2PRs6t0q+nD1YNvnZawPaRmdzgdM8zJogD+PiSN+XhqRr80
jzyvMDU4Aa0wjUNP3XsCoaCxo7L/lQk=
=bZ9z
-----END PGP SIGNATURE-----

Merge tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU updates from Neeraj Upadhyay:
 "Context tracking:
   - rename context tracking state related symbols and remove references
     to "dynticks" in various context tracking state variables and
     related helpers
   - force context_tracking_enabled_this_cpu() to be inlined to avoid
     leaving a noinstr section

  CSD lock:
   - enhance CSD-lock diagnostic reports
   - add an API to provide an indication of ongoing CSD-lock stall

  nocb:
   - update and simplify RCU nocb code to handle (de-)offloading of
     callbacks only for offline CPUs
   - fix RT throttling hrtimer being armed from offline CPU

  rcutorture:
   - remove redundant rcu_torture_ops get_gp_completed fields
   - add SRCU ->same_gp_state and ->get_comp_state functions
   - add generic test for NUM_ACTIVE_*RCU_POLL* for testing RCU and SRCU
     polled grace periods
   - add CFcommon.arch for arch-specific Kconfig options
   - print number of update types in rcu_torture_write_types()
   - add rcutree.nohz_full_patience_delay testing to the TREE07 scenario
   - add a stall_cpu_repeat module parameter to test repeated CPU stalls
   - add argument to limit number of CPUs a guest OS can use in
     torture.sh

  rcustall:
   - abbreviate RCU CPU stall warnings during CSD-lock stalls
   - Allow dump_cpu_task() to be called without disabling preemption
   - defer printing stall-warning backtrace when holding rcu_node lock

  srcu:
   - make SRCU gp seq wrap-around faster
   - add KCSAN checks for concurrent updates to ->srcu_n_exp_nodelay and
     ->reschedule_count which are used in heuristics governing
     auto-expediting of normal SRCU grace periods and
     grace-period-state-machine delays
   - mark idle SRCU-barrier callbacks to help identify stuck
     SRCU-barrier callback

  rcu tasks:
   - remove RCU Tasks Rude asynchronous APIs as they are no longer used
   - stop testing RCU Tasks Rude asynchronous APIs
   - fix access to non-existent percpu regions
   - check processor-ID assumptions during chosen CPU calculation for
     callback enqueuing
   - update description of rtp->tasks_gp_seq grace-period sequence
     number
   - add rcu_barrier_cb_is_done() to identify whether a given
     rcu_barrier callback is stuck
   - mark idle Tasks-RCU-barrier callbacks
   - add *torture_stats_print() functions to print detailed diagnostics
     for Tasks-RCU variants
   - capture start time of rcu_barrier_tasks*() operation to help
     distinguish a hung barrier operation from a long series of barrier
     operations

  refscale:
   - add a TINY scenario to support tests of Tiny RCU and Tiny SRCU
   - optimize process_durations() operation

  rcuscale:
   - dump stacks of stalled rcu_scale_writer() instances and
     grace-period statistics when rcu_scale_writer() stalls
   - mark idle RCU-barrier callbacks to identify stuck RCU-barrier
     callbacks
   - print detailed grace-period and barrier diagnostics on
     rcu_scale_writer() hangs for Tasks-RCU variants
   - warn if async module parameter is specified for RCU implementations
     that do not have async primitives such as RCU Tasks Rude
   - make all writer tasks report upon hang
   - tolerate repeated GFP_KERNEL failure in rcu_scale_writer()
   - use special allocator for rcu_scale_writer()
   - NULL out top-level pointers to heap memory to avoid double-free
     bugs on modprobe failures
   - maintain per-task instead of per-CPU callbacks count to avoid any
     issues with migration of either tasks or callbacks
   - constify struct ref_scale_ops

  Fixes:
   - use system_unbound_wq for kfree_rcu work to avoid disturbing
     isolated CPUs

  Misc:
   - warn on unexpected rcu_state.srs_done_tail state
   - better define "atomic" for list_replace_rcu() and
     hlist_replace_rcu() routines
   - annotate struct kvfree_rcu_bulk_data with __counted_by()"

* tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (90 commits)
  rcu: Defer printing stall-warning backtrace when holding rcu_node lock
  rcu/nocb: Remove superfluous memory barrier after bypass enqueue
  rcu/nocb: Conditionally wake up rcuo if not already waiting on GP
  rcu/nocb: Fix RT throttling hrtimer armed from offline CPU
  rcu/nocb: Simplify (de-)offloading state machine
  context_tracking: Tag context_tracking_enabled_this_cpu() __always_inline
  context_tracking, rcu: Rename rcu_dyntick trace event into rcu_watching
  rcu: Update stray documentation references to rcu_dynticks_eqs_{enter, exit}()
  rcu: Rename rcu_momentary_dyntick_idle() into rcu_momentary_eqs()
  rcu: Rename rcu_implicit_dynticks_qs() into rcu_watching_snap_recheck()
  rcu: Rename dyntick_save_progress_counter() into rcu_watching_snap_save()
  rcu: Rename struct rcu_data .exp_dynticks_snap into .exp_watching_snap
  rcu: Rename struct rcu_data .dynticks_snap into .watching_snap
  rcu: Rename rcu_dynticks_zero_in_eqs() into rcu_watching_zero_in_eqs()
  rcu: Rename rcu_dynticks_in_eqs_since() into rcu_watching_snap_stopped_since()
  rcu: Rename rcu_dynticks_in_eqs() into rcu_watching_snap_in_eqs()
  rcu: Rename rcu_dynticks_eqs_online() into rcu_watching_online()
  context_tracking, rcu: Rename rcu_dynticks_curr_cpu_in_eqs() into rcu_is_watching_curr_cpu()
  context_tracking, rcu: Rename rcu_dynticks_task*() into rcu_task*()
  refscale: Constify struct ref_scale_ops
  ...
@@ -921,10 +921,10 @@ This portion of the ``rcu_data`` structure is declared as follows:

  ::

- 1 int dynticks_snap;
+ 1 int watching_snap;
  2 unsigned long dynticks_fqs;

- The ``->dynticks_snap`` field is used to take a snapshot of the
+ The ``->watching_snap`` field is used to take a snapshot of the
  corresponding CPU's dyntick-idle state when forcing quiescent states,
  and is therefore accessed from other CPUs. Finally, the
  ``->dynticks_fqs`` field is used to count the number of times this CPU
@@ -935,8 +935,8 @@ This portion of the rcu_data structure is declared as follows:

  ::

- 1 long dynticks_nesting;
- 2 long dynticks_nmi_nesting;
+ 1 long nesting;
+ 2 long nmi_nesting;
  3 atomic_t dynticks;
  4 bool rcu_need_heavy_qs;
  5 bool rcu_urgent_qs;
@@ -945,14 +945,14 @@ These fields in the rcu_data structure maintain the per-CPU dyntick-idle
  state for the corresponding CPU. The fields may be accessed only from
  the corresponding CPU (and from tracing) unless otherwise stated.

- The ``->dynticks_nesting`` field counts the nesting depth of process
+ The ``->nesting`` field counts the nesting depth of process
  execution, so that in normal circumstances this counter has value zero
  or one. NMIs, irqs, and tracers are counted by the
- ``->dynticks_nmi_nesting`` field. Because NMIs cannot be masked, changes
+ ``->nmi_nesting`` field. Because NMIs cannot be masked, changes
  to this variable have to be undertaken carefully using an algorithm
  provided by Andy Lutomirski. The initial transition from idle adds one,
  and nested transitions add two, so that a nesting level of five is
- represented by a ``->dynticks_nmi_nesting`` value of nine. This counter
+ represented by a ``->nmi_nesting`` value of nine. This counter
  can therefore be thought of as counting the number of reasons why this
  CPU cannot be permitted to enter dyntick-idle mode, aside from
  process-level transitions.
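The ``->nmi_nesting`` counting rule quoted in the hunk above (add one on the initial transition from idle, add two on each nested transition) can be modeled in a few lines of userspace C. This is only a sketch of the arithmetic, not the kernel's implementation: every function name below is hypothetical, and the real code must additionally tolerate an NMI arriving in the middle of an update.

```c
#include <assert.h>

/*
 * Userspace model of the ->nmi_nesting counting rule.
 * All names are hypothetical; this is not kernel code.
 */

/* Account for one irq/NMI/tracer entry and return the new counter value. */
long nmi_nesting_enter(long nmi_nesting)
{
	assert(nmi_nesting >= 0);
	/* The initial transition from idle adds one, nested ones add two. */
	return nmi_nesting + (nmi_nesting == 0 ? 1 : 2);
}

/* Undo one entry: the outermost exit removes one, nested exits remove two. */
long nmi_nesting_exit(long nmi_nesting)
{
	assert(nmi_nesting > 0);
	return nmi_nesting - (nmi_nesting == 1 ? 1 : 2);
}

/* Counter value after 'depth' nested entries, starting from idle. */
long nmi_nesting_for_depth(int depth)
{
	long v = 0;

	for (int i = 0; i < depth; i++)
		v = nmi_nesting_enter(v);
	return v;
}
```

In this model a nesting level of five yields a counter value of nine, as the documentation text states, and five matching exits bring the counter back to zero.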
@@ -960,12 +960,12 @@ process-level transitions.
  However, it turns out that when running in non-idle kernel context, the
  Linux kernel is fully capable of entering interrupt handlers that never
  exit and perhaps also vice versa. Therefore, whenever the
- ``->dynticks_nesting`` field is incremented up from zero, the
- ``->dynticks_nmi_nesting`` field is set to a large positive number, and
- whenever the ``->dynticks_nesting`` field is decremented down to zero,
- the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that
+ ``->nesting`` field is incremented up from zero, the
+ ``->nmi_nesting`` field is set to a large positive number, and
+ whenever the ``->nesting`` field is decremented down to zero,
+ the ``->nmi_nesting`` field is set to zero. Assuming that
  the number of misnested interrupts is not sufficient to overflow the
- counter, this approach corrects the ``->dynticks_nmi_nesting`` field
+ counter, this approach corrects the ``->nmi_nesting`` field
  every time the corresponding CPU enters the idle loop from process
  context.

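The correction described in the hunk above, where process-level transitions overwrite ``->nmi_nesting`` so that misnested interrupts cannot leave it permanently wrong, might be sketched as follows. This is a userspace model under stated assumptions: the structure, the function names, and the particular "large positive number" are all hypothetical, chosen only to illustrate the rule in the text.

```c
#include <assert.h>
#include <limits.h>

/*
 * Assumption: any value far above a realistic irq nesting depth works
 * as the "large positive number"; this constant is illustrative only.
 */
#define SKETCH_NMI_NESTING_NONIDLE ((LONG_MAX / 2) + 1)

/* Hypothetical stand-in for the two per-CPU nesting counters. */
struct ct_sketch {
	long nesting;     /* process-level nesting depth */
	long nmi_nesting; /* irq/NMI/tracer nesting */
};

/* Process-level transition away from idle. */
void sketch_process_enter(struct ct_sketch *ct)
{
	assert(ct->nesting >= 0);
	/* Incremented up from zero: force ->nmi_nesting to a large value. */
	if (ct->nesting++ == 0)
		ct->nmi_nesting = SKETCH_NMI_NESTING_NONIDLE;
}

/* Process-level transition towards idle. */
void sketch_process_exit(struct ct_sketch *ct)
{
	assert(ct->nesting > 0);
	/* Decremented down to zero: discard any misnesting residue. */
	if (--ct->nesting == 0)
		ct->nmi_nesting = 0;
}
```

If an interrupt handler is entered but never exits while the CPU is non-idle, the stray increment is simply thrown away the next time ``sketch_process_exit()`` takes the process-level counter to zero.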
@@ -992,8 +992,8 @@ code.
  +-----------------------------------------------------------------------+
  | **Quick Quiz**: |
  +-----------------------------------------------------------------------+
- | Why not simply combine the ``->dynticks_nesting`` and |
- | ``->dynticks_nmi_nesting`` counters into a single counter that just |
+ | Why not simply combine the ``->nesting`` and |
+ | ``->nmi_nesting`` counters into a single counter that just |
  | counts the number of reasons that the corresponding CPU is non-idle? |
  +-----------------------------------------------------------------------+
  | **Answer**: |
@@ -147,10 +147,10 @@ RCU read-side critical sections preceding and following the current
  idle sojourn.
  This case is handled by calls to the strongly ordered
  ``atomic_add_return()`` read-modify-write atomic operation that
- is invoked within ``rcu_dynticks_eqs_enter()`` at idle-entry
- time and within ``rcu_dynticks_eqs_exit()`` at idle-exit time.
- The grace-period kthread invokes first ``ct_dynticks_cpu_acquire()``
- (preceded by a full memory barrier) and ``rcu_dynticks_in_eqs_since()``
+ is invoked within ``ct_kernel_exit_state()`` at idle-entry
+ time and within ``ct_kernel_enter_state()`` at idle-exit time.
+ The grace-period kthread invokes first ``ct_rcu_watching_cpu_acquire()``
+ (preceded by a full memory barrier) and ``rcu_watching_snap_stopped_since()``
  (both of which rely on acquire semantics) to detect idle CPUs.

  +-----------------------------------------------------------------------+
@@ -528,7 +528,7 @@
  font-style="normal"
  y="-8652.5312"
  x="2466.7822"
- xml:space="preserve">dyntick_save_progress_counter()</text>
+ xml:space="preserve">rcu_watching_snap_save()</text>
  <text
  style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"
  id="text202-7-2-7-2-0"
@@ -537,7 +537,7 @@
  font-style="normal"
  y="-8368.1475"
  x="2463.3262"
- xml:space="preserve">rcu_implicit_dynticks_qs()</text>
+ xml:space="preserve">rcu_watching_snap_recheck()</text>
  </g>
  <g
  id="g4504"
@@ -607,7 +607,7 @@
  font-weight="bold"
  font-size="192"
  id="text202-7-5-3-27-6"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text>
  <text
  xml:space="preserve"
  x="3745.7725"
@@ -638,7 +638,7 @@
  font-weight="bold"
  font-size="192"
  id="text202-7-5-3-27-6-1"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text>
  <text
  xml:space="preserve"
  x="3745.7725"
@@ -844,7 +844,7 @@
  font-style="normal"
  y="1547.8876"
  x="4417.6396"
- xml:space="preserve">dyntick_save_progress_counter()</text>
+ xml:space="preserve">rcu_watching_snap_save()</text>
  <g
  style="fill:none;stroke-width:0.025in"
  transform="translate(6501.9719,-10685.904)"
@@ -899,7 +899,7 @@
  font-style="normal"
  y="1858.8729"
  x="4414.1836"
- xml:space="preserve">rcu_implicit_dynticks_qs()</text>
+ xml:space="preserve">rcu_watching_snap_recheck()</text>
  <text
  xml:space="preserve"
  x="14659.87"
@@ -977,7 +977,7 @@
  font-weight="bold"
  font-size="192"
  id="text202-7-5-3-27-6"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text>
  <text
  xml:space="preserve"
  x="3745.7725"
@@ -1008,7 +1008,7 @@
  font-weight="bold"
  font-size="192"
  id="text202-7-5-3-27-6-1"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text>
  <text
  xml:space="preserve"
  x="3745.7725"
@@ -2974,7 +2974,7 @@
  font-style="normal"
  y="38114.047"
  x="-334.33856"
- xml:space="preserve">dyntick_save_progress_counter()</text>
+ xml:space="preserve">rcu_watching_snap_save()</text>
  <g
  style="fill:none;stroke-width:0.025in"
  transform="translate(1749.9916,25880.249)"
@@ -3029,7 +3029,7 @@
  font-style="normal"
  y="38425.035"
  x="-337.79462"
- xml:space="preserve">rcu_implicit_dynticks_qs()</text>
+ xml:space="preserve">rcu_watching_snap_recheck()</text>
  <text
  xml:space="preserve"
  x="9907.8887"
@@ -3107,7 +3107,7 @@
  font-weight="bold"
  font-size="192"
  id="text202-7-5-3-27-6"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text>
  <text
  xml:space="preserve"
  x="3745.7725"
@@ -3138,7 +3138,7 @@
  font-weight="bold"
  font-size="192"
  id="text202-7-5-3-27-6-1"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text>
  <text
  xml:space="preserve"
  x="3745.7725"
@@ -516,7 +516,7 @@
  font-style="normal"
  y="-8652.5312"
  x="2466.7822"
- xml:space="preserve">dyntick_save_progress_counter()</text>
+ xml:space="preserve">rcu_watching_snap_save()</text>
  <text
  style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"
  id="text202-7-2-7-2-0"
@@ -525,7 +525,7 @@
  font-style="normal"
  y="-8368.1475"
  x="2463.3262"
- xml:space="preserve">rcu_implicit_dynticks_qs()</text>
+ xml:space="preserve">rcu_watching_snap_recheck()</text>
  <text
  sodipodi:linespacing="125%"
  style="font-size:192px;font-style:normal;font-weight:bold;line-height:125%;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"
@@ -2649,8 +2649,7 @@ those that are idle from RCU's perspective) and then Tasks Rude RCU can
  be removed from the kernel.

  The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
- consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(),
- and rcu_barrier_tasks_rude().
+ consisting solely of synchronize_rcu_tasks_rude().

  Tasks Trace RCU
  ~~~~~~~~~~~~~~~
@@ -194,14 +194,13 @@ over a rather long period of time, but improvements are always welcome!
  when publicizing a pointer to a structure that can
  be traversed by an RCU read-side critical section.

- 5. If any of call_rcu(), call_srcu(), call_rcu_tasks(),
- call_rcu_tasks_rude(), or call_rcu_tasks_trace() is used,
- the callback function may be invoked from softirq context,
- and in any case with bottom halves disabled. In particular,
- this callback function cannot block. If you need the callback
- to block, run that code in a workqueue handler scheduled from
- the callback. The queue_rcu_work() function does this for you
- in the case of call_rcu().
+ 5. If any of call_rcu(), call_srcu(), call_rcu_tasks(), or
+ call_rcu_tasks_trace() is used, the callback function may be
+ invoked from softirq context, and in any case with bottom halves
+ disabled. In particular, this callback function cannot block.
+ If you need the callback to block, run that code in a workqueue
+ handler scheduled from the callback. The queue_rcu_work()
+ function does this for you in the case of call_rcu().

  6. Since synchronize_rcu() can block, it cannot be called
  from any sort of irq context. The same rule applies
@@ -254,10 +253,10 @@ over a rather long period of time, but improvements are always welcome!
  corresponding readers must use rcu_read_lock_trace()
  and rcu_read_unlock_trace().

- c. If an updater uses call_rcu_tasks_rude() or
- synchronize_rcu_tasks_rude(), then the corresponding
- readers must use anything that disables preemption,
- for example, preempt_disable() and preempt_enable().
+ c. If an updater uses synchronize_rcu_tasks_rude(),
+ then the corresponding readers must use anything that
+ disables preemption, for example, preempt_disable()
+ and preempt_enable().

  Mixing things up will result in confusion and broken kernels, and
  has even resulted in an exploitable security issue. Therefore,
@@ -326,11 +325,9 @@ over a rather long period of time, but improvements are always welcome!
  d. Periodically invoke rcu_barrier(), permitting a limited
  number of updates per grace period.

- The same cautions apply to call_srcu(), call_rcu_tasks(),
- call_rcu_tasks_rude(), and call_rcu_tasks_trace(). This is
- why there is an srcu_barrier(), rcu_barrier_tasks(),
- rcu_barrier_tasks_rude(), and rcu_barrier_tasks_rude(),
- respectively.
+ The same cautions apply to call_srcu(), call_rcu_tasks(), and
+ call_rcu_tasks_trace(). This is why there is an srcu_barrier(),
+ rcu_barrier_tasks(), and rcu_barrier_tasks_trace(), respectively.

  Note that although these primitives do take action to avoid
  memory exhaustion when any given CPU has too many callbacks,
@@ -383,17 +380,17 @@ over a rather long period of time, but improvements are always welcome!
  must use whatever locking or other synchronization is required
  to safely access and/or modify that data structure.

- Do not assume that RCU callbacks will be executed on
- the same CPU that executed the corresponding call_rcu(),
- call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), or
- call_rcu_tasks_trace(). For example, if a given CPU goes offline
- while having an RCU callback pending, then that RCU callback
- will execute on some surviving CPU. (If this was not the case,
- a self-spawning RCU callback would prevent the victim CPU from
- ever going offline.) Furthermore, CPUs designated by rcu_nocbs=
- might well *always* have their RCU callbacks executed on some
- other CPUs, in fact, for some real-time workloads, this is the
- whole point of using the rcu_nocbs= kernel boot parameter.
+ Do not assume that RCU callbacks will be executed on the same
+ CPU that executed the corresponding call_rcu(), call_srcu(),
+ call_rcu_tasks(), or call_rcu_tasks_trace(). For example, if
+ a given CPU goes offline while having an RCU callback pending,
+ then that RCU callback will execute on some surviving CPU.
+ (If this was not the case, a self-spawning RCU callback would
+ prevent the victim CPU from ever going offline.) Furthermore,
+ CPUs designated by rcu_nocbs= might well *always* have their
+ RCU callbacks executed on some other CPUs, in fact, for some
+ real-time workloads, this is the whole point of using the
+ rcu_nocbs= kernel boot parameter.

  In addition, do not assume that callbacks queued in a given order
  will be invoked in that order, even if they all are queued on the
@@ -507,9 +504,9 @@ over a rather long period of time, but improvements are always welcome!
  These debugging aids can help you find problems that are
  otherwise extremely difficult to spot.

- 17. If you pass a callback function defined within a module to one of
- call_rcu(), call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(),
- or call_rcu_tasks_trace(), then it is necessary to wait for all
+ 17. If you pass a callback function defined within a module
+ to one of call_rcu(), call_srcu(), call_rcu_tasks(), or
+ call_rcu_tasks_trace(), then it is necessary to wait for all
  pending callbacks to be invoked before unloading that module.
  Note that it is absolutely *not* sufficient to wait for a grace
  period! For example, synchronize_rcu() implementation is *not*
@@ -522,7 +519,6 @@ over a rather long period of time, but improvements are always welcome!
  - call_rcu() -> rcu_barrier()
  - call_srcu() -> srcu_barrier()
  - call_rcu_tasks() -> rcu_barrier_tasks()
- - call_rcu_tasks_rude() -> rcu_barrier_tasks_rude()
  - call_rcu_tasks_trace() -> rcu_barrier_tasks_trace()

  However, these barrier functions are absolutely *not* guaranteed
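The pairing list in the hunk above, as it stands after this change, can be captured as a small lookup table. The following is a userspace sketch for illustration only: the table and the ``barrier_for()`` helper are hypothetical names, and the strings merely record which barrier primitive the checklist pairs with each callback-queuing primitive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record pairing a queuing primitive with its barrier. */
struct rcu_flavor_pair {
	const char *enqueue; /* callback-queuing primitive */
	const char *barrier; /* barrier that waits for its callbacks */
};

/* Pairings taken from the checklist after the Tasks Rude entry is dropped. */
static const struct rcu_flavor_pair rcu_flavor_pairs[] = {
	{ "call_rcu",             "rcu_barrier" },
	{ "call_srcu",            "srcu_barrier" },
	{ "call_rcu_tasks",       "rcu_barrier_tasks" },
	{ "call_rcu_tasks_trace", "rcu_barrier_tasks_trace" },
};

/* Return the matching barrier name, or NULL if no pairing is listed. */
const char *barrier_for(const char *enqueue)
{
	size_t n = sizeof(rcu_flavor_pairs) / sizeof(rcu_flavor_pairs[0]);

	assert(enqueue != NULL);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(rcu_flavor_pairs[i].enqueue, enqueue) == 0)
			return rcu_flavor_pairs[i].barrier;
	}
	return NULL;
}
```

A lookup for ``call_rcu_tasks_rude`` returns NULL in this sketch, reflecting the removal of the asynchronous Tasks Rude API in this pull request.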
@@ -539,7 +535,6 @@ over a rather long period of time, but improvements are always welcome!
  - Either synchronize_srcu() or synchronize_srcu_expedited(),
  together with and srcu_barrier()
  - synchronize_rcu_tasks() and rcu_barrier_tasks()
- - synchronize_tasks_rude() and rcu_barrier_tasks_rude()
  - synchronize_tasks_trace() and rcu_barrier_tasks_trace()

  If necessary, you can use something like workqueues to execute
@@ -1103,7 +1103,7 @@ RCU-Tasks-Rude::

  Critical sections       Grace period            Barrier

- N/A                     call_rcu_tasks_rude     rcu_barrier_tasks_rude
+ N/A                                             N/A
                          synchronize_rcu_tasks_rude


@@ -4969,6 +4969,10 @@
  Set maximum number of finished RCU callbacks to
  process in one batch.

+ rcutree.csd_lock_suppress_rcu_stall= [KNL]
+ Do only a one-line RCU CPU stall warning when
+ there is an ongoing too-long CSD-lock wait.
+
  rcutree.do_rcu_barrier= [KNL]
  Request a call to rcu_barrier(). This is
  throttled so that userspace tests can safely
@@ -5416,7 +5420,13 @@
  Time to wait (s) after boot before inducing stall.

  rcutorture.stall_cpu_irqsoff= [KNL]
- Disable interrupts while stalling if set.
+ Disable interrupts while stalling if set, but only
+ on the first stall in the set.
+
+ rcutorture.stall_cpu_repeat= [KNL]
+ Number of times to repeat the stall sequence,
+ so that rcutorture.stall_cpu_repeat=3 will result
+ in four stall sequences.

  rcutorture.stall_gp_kthread= [KNL]
  Duration (s) of forced sleep within RCU
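The arithmetic behind the new ``rcutorture.stall_cpu_repeat=`` text in the hunk above (a value of 3 gives four stall sequences) is easy to get off by one, so here is a tiny userspace sketch of it. Both helper names are hypothetical, and the second helper encodes an assumed reading of "only on the first stall in the set", namely that only stall index zero runs with interrupts disabled; rcutorture itself may structure this differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Total stall sequences: the initial one plus stall_cpu_repeat repeats. */
int stall_sequences(int stall_cpu_repeat)
{
	assert(stall_cpu_repeat >= 0);
	return stall_cpu_repeat + 1;
}

/*
 * Assumed interpretation: with stall_cpu_irqsoff set, only the first
 * stall of the set (index 0) is performed with interrupts disabled.
 */
bool stall_runs_irqsoff(bool stall_cpu_irqsoff, int stall_index)
{
	assert(stall_index >= 0);
	return stall_cpu_irqsoff && stall_index == 0;
}
```

Under these assumptions, booting with ``rcutorture.stall_cpu_repeat=3`` corresponds to ``stall_sequences(3)``, which evaluates to four.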
@@ -5604,14 +5614,6 @@
  of zero will disable batching. Batching is
  always disabled for synchronize_rcu_tasks().

- rcupdate.rcu_tasks_rude_lazy_ms= [KNL]
- Set timeout in milliseconds RCU Tasks
- Rude asynchronous callback batching for
- call_rcu_tasks_rude(). A negative value
- will take the default. A value of zero will
- disable batching. Batching is always disabled
- for synchronize_rcu_tasks_rude().
-
  rcupdate.rcu_tasks_trace_lazy_ms= [KNL]
  Set timeout in milliseconds RCU Tasks
  Trace asynchronous callback batching for
@@ -862,7 +862,7 @@ config HAVE_CONTEXT_TRACKING_USER_OFFSTACK
  Architecture neither relies on exception_enter()/exception_exit()
  nor on schedule_user(). Also preempt_schedule_notrace() and
  preempt_schedule_irq() can't be called in a preemptible section
- while context tracking is CONTEXT_USER. This feature reflects a sane
+ while context tracking is CT_STATE_USER. This feature reflects a sane
  entry implementation where the following requirements are met on
  critical entry code, ie: before user_exit() or after user_enter():

@@ -103,7 +103,7 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs)
  static __always_inline void __enter_from_user_mode(void)
  {
  lockdep_hardirqs_off(CALLER_ADDR0);
- CT_WARN_ON(ct_state() != CONTEXT_USER);
+ CT_WARN_ON(ct_state() != CT_STATE_USER);
  user_exit_irqoff();
  trace_hardirqs_off_finish();
  mte_disable_tco_entry(current);
@@ -177,7 +177,7 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs)

  if (user_mode(regs)) {
  kuap_lock();
- CT_WARN_ON(ct_state() != CONTEXT_USER);
+ CT_WARN_ON(ct_state() != CT_STATE_USER);
  user_exit_irqoff();

  account_cpu_user_entry();
@@ -189,8 +189,8 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs)
  * so avoid recursion.
  */
  if (TRAP(regs) != INTERRUPT_PROGRAM)
- CT_WARN_ON(ct_state() != CONTEXT_KERNEL &&
- ct_state() != CONTEXT_IDLE);
+ CT_WARN_ON(ct_state() != CT_STATE_KERNEL &&
+ ct_state() != CT_STATE_IDLE);
  INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
  INT_SOFT_MASK_BUG_ON(regs, arch_irq_disabled_regs(regs) &&
  search_kernel_restart_table(regs->nip));
@@ -266,7 +266,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
  unsigned long ret = 0;
  bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv;

- CT_WARN_ON(ct_state() == CONTEXT_USER);
+ CT_WARN_ON(ct_state() == CT_STATE_USER);

  kuap_assert_locked();

@@ -344,7 +344,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)

  BUG_ON(regs_is_unrecoverable(regs));
  BUG_ON(arch_irq_disabled_regs(regs));
- CT_WARN_ON(ct_state() == CONTEXT_USER);
+ CT_WARN_ON(ct_state() == CT_STATE_USER);

  /*
  * We don't need to restore AMR on the way back to userspace for KUAP.
@@ -386,7 +386,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
  if (!IS_ENABLED(CONFIG_PPC_BOOK3E_64) &&
  TRAP(regs) != INTERRUPT_PROGRAM &&
  TRAP(regs) != INTERRUPT_PERFMON)
- CT_WARN_ON(ct_state() == CONTEXT_USER);
+ CT_WARN_ON(ct_state() == CT_STATE_USER);

  kuap = kuap_get_and_assert_locked();

@@ -27,7 +27,7 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)

  trace_hardirqs_off(); /* finish reconciling */

- CT_WARN_ON(ct_state() == CONTEXT_KERNEL);
+ CT_WARN_ON(ct_state() == CT_STATE_KERNEL);
  user_exit_irqoff();

  BUG_ON(regs_is_unrecoverable(regs));
@ -150,7 +150,7 @@ early_param("ia32_emulation", ia32_emulation_override_cmdline);
|
||||
#endif
|
||||
|
||||
/*
|
||||
* Invoke a 32-bit syscall. Called with IRQs on in CONTEXT_KERNEL.
|
||||
* Invoke a 32-bit syscall. Called with IRQs on in CT_STATE_KERNEL.
|
||||
*/
|
||||
static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs, int nr)
|
||||
{
|
||||
|
@@ -26,26 +26,26 @@ extern void user_exit_callable(void);
 static inline void user_enter(void)
 {
 	if (context_tracking_enabled())
-		ct_user_enter(CONTEXT_USER);
+		ct_user_enter(CT_STATE_USER);
 
 }
 static inline void user_exit(void)
 {
 	if (context_tracking_enabled())
-		ct_user_exit(CONTEXT_USER);
+		ct_user_exit(CT_STATE_USER);
 }
 
 /* Called with interrupts disabled. */
 static __always_inline void user_enter_irqoff(void)
 {
 	if (context_tracking_enabled())
-		__ct_user_enter(CONTEXT_USER);
+		__ct_user_enter(CT_STATE_USER);
 
 }
 static __always_inline void user_exit_irqoff(void)
 {
 	if (context_tracking_enabled())
-		__ct_user_exit(CONTEXT_USER);
+		__ct_user_exit(CT_STATE_USER);
 }
 
 static inline enum ctx_state exception_enter(void)
@@ -57,7 +57,7 @@ static inline enum ctx_state exception_enter(void)
 		return 0;
 
 	prev_ctx = __ct_state();
-	if (prev_ctx != CONTEXT_KERNEL)
+	if (prev_ctx != CT_STATE_KERNEL)
 		ct_user_exit(prev_ctx);
 
 	return prev_ctx;
@@ -67,7 +67,7 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 {
 	if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK) &&
 	    context_tracking_enabled()) {
-		if (prev_ctx != CONTEXT_KERNEL)
+		if (prev_ctx != CT_STATE_KERNEL)
 			ct_user_enter(prev_ctx);
 	}
 }
@@ -75,7 +75,7 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 static __always_inline bool context_tracking_guest_enter(void)
 {
 	if (context_tracking_enabled())
-		__ct_user_enter(CONTEXT_GUEST);
+		__ct_user_enter(CT_STATE_GUEST);
 
 	return context_tracking_enabled_this_cpu();
 }
@@ -83,7 +83,7 @@ static __always_inline bool context_tracking_guest_enter(void)
 static __always_inline bool context_tracking_guest_exit(void)
 {
 	if (context_tracking_enabled())
-		__ct_user_exit(CONTEXT_GUEST);
+		__ct_user_exit(CT_STATE_GUEST);
 
 	return context_tracking_enabled_this_cpu();
 }
@@ -115,13 +115,17 @@ extern void ct_idle_enter(void);
 extern void ct_idle_exit(void);
 
 /*
- * Is the current CPU in an extended quiescent state?
+ * Is RCU watching the current CPU (IOW, it is not in an extended quiescent state)?
+ *
+ * Note that this returns the actual boolean data (watching / not watching),
+ * whereas ct_rcu_watching() returns the RCU_WATCHING subvariable of
+ * context_tracking.state.
  *
  * No ordering, as we are sampling CPU-local information.
  */
-static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
+static __always_inline bool rcu_is_watching_curr_cpu(void)
 {
-	return !(raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & RCU_DYNTICKS_IDX);
+	return raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_RCU_WATCHING;
 }
 
 /*
@@ -142,9 +146,9 @@ static __always_inline bool warn_rcu_enter(void)
 	 * lots of the actual reporting also relies on RCU.
 	 */
 	preempt_disable_notrace();
-	if (rcu_dynticks_curr_cpu_in_eqs()) {
+	if (!rcu_is_watching_curr_cpu()) {
 		ret = true;
-		ct_state_inc(RCU_DYNTICKS_IDX);
+		ct_state_inc(CT_RCU_WATCHING);
 	}
 
 	return ret;
@@ -153,7 +157,7 @@ static __always_inline bool warn_rcu_enter(void)
 static __always_inline void warn_rcu_exit(bool rcu)
 {
 	if (rcu)
-		ct_state_inc(RCU_DYNTICKS_IDX);
+		ct_state_inc(CT_RCU_WATCHING);
 	preempt_enable_notrace();
 }
 

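The pairing of warn_rcu_enter() and warn_rcu_exit() in the hunks above may be easier to follow with a small userspace model. The sketch below is only an illustration: the `model_*` names and the single global word are assumptions, and the real code works on per-CPU data with preemption disabled. It also assumes ct_state_inc() returns the updated value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Mirrors the renamed constants: the watching bit sits just above the state bits. */
#define CT_STATE_MAX     4
#define CT_RCU_WATCHING  CT_STATE_MAX

/* Hypothetical stand-in for one CPU's context_tracking.state. */
static atomic_int model_state = CT_RCU_WATCHING;

/* Models rcu_is_watching_curr_cpu(): true when the watching bit is set. */
static bool model_is_watching(void)
{
	return atomic_load(&model_state) & CT_RCU_WATCHING;
}

/* Models ct_state_inc(): add to the word and return the new value (assumed). */
static int model_state_inc(int incby)
{
	return atomic_fetch_add(&model_state, incby) + incby;
}

/* Models warn_rcu_enter(): make RCU watch only if it is not already watching. */
static bool model_warn_rcu_enter(void)
{
	bool ret = false;

	if (!model_is_watching()) {
		ret = true;
		model_state_inc(CT_RCU_WATCHING);
	}
	return ret;
}

/* Models warn_rcu_exit(): undo the transition only if enter performed it. */
static void model_warn_rcu_exit(bool rcu)
{
	if (rcu)
		model_state_inc(CT_RCU_WATCHING);
}
```

Each increment by CT_RCU_WATCHING flips the watching bit and carries into the higher bits, so the word never returns to an earlier value across an enter/exit pair.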
@@ -7,22 +7,22 @@
 #include <linux/context_tracking_irq.h>
 
 /* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
-#define DYNTICK_IRQ_NONIDLE ((LONG_MAX / 2) + 1)
+#define CT_NESTING_IRQ_NONIDLE ((LONG_MAX / 2) + 1)
 
 enum ctx_state {
-	CONTEXT_DISABLED = -1, /* returned by ct_state() if unknown */
-	CONTEXT_KERNEL = 0,
-	CONTEXT_IDLE = 1,
-	CONTEXT_USER = 2,
-	CONTEXT_GUEST = 3,
-	CONTEXT_MAX = 4,
+	CT_STATE_DISABLED = -1, /* returned by ct_state() if unknown */
+	CT_STATE_KERNEL = 0,
+	CT_STATE_IDLE = 1,
+	CT_STATE_USER = 2,
+	CT_STATE_GUEST = 3,
+	CT_STATE_MAX = 4,
 };
 
-/* Even value for idle, else odd. */
-#define RCU_DYNTICKS_IDX CONTEXT_MAX
+/* Odd value for watching, else even. */
+#define CT_RCU_WATCHING CT_STATE_MAX
 
-#define CT_STATE_MASK (CONTEXT_MAX - 1)
-#define CT_DYNTICKS_MASK (~CT_STATE_MASK)
+#define CT_STATE_MASK (CT_STATE_MAX - 1)
+#define CT_RCU_WATCHING_MASK (~CT_STATE_MASK)
 
 struct context_tracking {
 #ifdef CONFIG_CONTEXT_TRACKING_USER
@@ -39,8 +39,8 @@ struct context_tracking {
 	atomic_t state;
 #endif
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
-	long dynticks_nesting; /* Track process nesting level. */
-	long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */
+	long nesting; /* Track process nesting level. */
+	long nmi_nesting; /* Track irq/NMI nesting level. */
 #endif
 };
 
@@ -56,47 +56,47 @@ static __always_inline int __ct_state(void)
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
-static __always_inline int ct_dynticks(void)
+static __always_inline int ct_rcu_watching(void)
 {
-	return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_DYNTICKS_MASK;
+	return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_RCU_WATCHING_MASK;
 }
 
-static __always_inline int ct_dynticks_cpu(int cpu)
+static __always_inline int ct_rcu_watching_cpu(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 
-	return atomic_read(&ct->state) & CT_DYNTICKS_MASK;
+	return atomic_read(&ct->state) & CT_RCU_WATCHING_MASK;
 }
 
-static __always_inline int ct_dynticks_cpu_acquire(int cpu)
+static __always_inline int ct_rcu_watching_cpu_acquire(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 
-	return atomic_read_acquire(&ct->state) & CT_DYNTICKS_MASK;
+	return atomic_read_acquire(&ct->state) & CT_RCU_WATCHING_MASK;
 }
 
-static __always_inline long ct_dynticks_nesting(void)
+static __always_inline long ct_nesting(void)
 {
-	return __this_cpu_read(context_tracking.dynticks_nesting);
+	return __this_cpu_read(context_tracking.nesting);
 }
 
-static __always_inline long ct_dynticks_nesting_cpu(int cpu)
+static __always_inline long ct_nesting_cpu(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 
-	return ct->dynticks_nesting;
+	return ct->nesting;
 }
 
-static __always_inline long ct_dynticks_nmi_nesting(void)
+static __always_inline long ct_nmi_nesting(void)
 {
-	return __this_cpu_read(context_tracking.dynticks_nmi_nesting);
+	return __this_cpu_read(context_tracking.nmi_nesting);
 }
 
-static __always_inline long ct_dynticks_nmi_nesting_cpu(int cpu)
+static __always_inline long ct_nmi_nesting_cpu(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 
-	return ct->dynticks_nmi_nesting;
+	return ct->nmi_nesting;
 }
 #endif /* #ifdef CONFIG_CONTEXT_TRACKING_IDLE */
 
@@ -113,7 +113,7 @@ static __always_inline bool context_tracking_enabled_cpu(int cpu)
 	return context_tracking_enabled() && per_cpu(context_tracking.active, cpu);
 }
 
-static inline bool context_tracking_enabled_this_cpu(void)
+static __always_inline bool context_tracking_enabled_this_cpu(void)
 {
 	return context_tracking_enabled() && __this_cpu_read(context_tracking.active);
 }
@@ -123,14 +123,14 @@ static inline bool context_tracking_enabled_this_cpu(void)
  *
  * Returns the current cpu's context tracking state if context tracking
  * is enabled. If context tracking is disabled, returns
- * CONTEXT_DISABLED. This should be used primarily for debugging.
+ * CT_STATE_DISABLED. This should be used primarily for debugging.
  */
 static __always_inline int ct_state(void)
 {
 	int ret;
 
 	if (!context_tracking_enabled())
-		return CONTEXT_DISABLED;
+		return CT_STATE_DISABLED;
 
 	preempt_disable();
 	ret = __ct_state();

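As a reading aid for the renamed masks, here is a minimal userspace sketch of how a single state word packs both the context state and the RCU-watching counter. The enum and macros are copied from the hunk above. The `decode_*` and `model_idle_*` helpers are hypothetical names; they only mirror the offsets that ct_idle_enter() and ct_idle_exit() pass elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>

enum ctx_state {
	CT_STATE_DISABLED = -1,
	CT_STATE_KERNEL = 0,
	CT_STATE_IDLE = 1,
	CT_STATE_USER = 2,
	CT_STATE_GUEST = 3,
	CT_STATE_MAX = 4,
};

#define CT_RCU_WATCHING       CT_STATE_MAX
#define CT_STATE_MASK         (CT_STATE_MAX - 1)
#define CT_RCU_WATCHING_MASK  (~CT_STATE_MASK)

/* Low two bits: which context the CPU is in. */
static int decode_ctx_state(int word)
{
	return word & CT_STATE_MASK;
}

/* Remaining bits: the counter part, as ct_rcu_watching() masks it. */
static int decode_rcu_watching(int word)
{
	return word & CT_RCU_WATCHING_MASK;
}

/* Lowest counter bit: set (odd count) means RCU is watching. */
static bool decode_is_watching(int word)
{
	return word & CT_RCU_WATCHING;
}

/* Mirrors the offset used by ct_idle_enter(). */
static int model_idle_enter(int word)
{
	return word + CT_RCU_WATCHING + CT_STATE_IDLE;
}

/* Mirrors the offset used by ct_idle_exit(). */
static int model_idle_exit(int word)
{
	return word + CT_RCU_WATCHING - CT_STATE_IDLE;
}
```

A single addition therefore changes the context state and flips the watching bit at the same time, which is why the offsets are written as sums such as `CT_RCU_WATCHING + CT_STATE_IDLE`.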
@@ -108,7 +108,7 @@ static __always_inline void enter_from_user_mode(struct pt_regs *regs)
 	arch_enter_from_user_mode(regs);
 	lockdep_hardirqs_off(CALLER_ADDR0);
 
-	CT_WARN_ON(__ct_state() != CONTEXT_USER);
+	CT_WARN_ON(__ct_state() != CT_STATE_USER);
 	user_exit_irqoff();
 
 	instrumentation_begin();

@@ -185,11 +185,7 @@ struct rcu_cblist {
  * ----------------------------------------------------------------------------
  */
 #define SEGCBLIST_ENABLED BIT(0)
-#define SEGCBLIST_RCU_CORE BIT(1)
-#define SEGCBLIST_LOCKING BIT(2)
-#define SEGCBLIST_KTHREAD_CB BIT(3)
-#define SEGCBLIST_KTHREAD_GP BIT(4)
-#define SEGCBLIST_OFFLOADED BIT(5)
+#define SEGCBLIST_OFFLOADED BIT(1)
 
 struct rcu_segcblist {
 	struct rcu_head *head;

@@ -191,7 +191,10 @@ static inline void hlist_del_init_rcu(struct hlist_node *n)
  * @old : the element to be replaced
  * @new : the new element to insert
  *
- * The @old entry will be replaced with the @new entry atomically.
+ * The @old entry will be replaced with the @new entry atomically from
+ * the perspective of concurrent readers. It is the caller's responsibility
+ * to synchronize with concurrent updaters, if any.
+ *
  * Note: @old should not be empty.
  */
 static inline void list_replace_rcu(struct list_head *old,
@@ -519,7 +522,9 @@ static inline void hlist_del_rcu(struct hlist_node *n)
  * @old : the element to be replaced
  * @new : the new element to insert
  *
- * The @old entry will be replaced with the @new entry atomically.
+ * The @old entry will be replaced with the @new entry atomically from
+ * the perspective of concurrent readers. It is the caller's responsibility
+ * to synchronize with concurrent updaters, if any.
  */
 static inline void hlist_replace_rcu(struct hlist_node *old,
 				     struct hlist_node *new)

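The reworded kernel-doc separates two guarantees: readers see either the old or the new entry, while updaters must be serialized by the caller. The sketch below illustrates that split in userspace C11. It is not the kernel implementation: it uses a hypothetical singly linked `struct node`, C11 atomics in place of rcu_assign_pointer(), and an `atomic_flag` as a stand-in for whatever lock the caller would hold.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical singly linked node; the kernel's list_head is doubly linked. */
struct node {
	int value;
	struct node *_Atomic next;
};

/* Hypothetical updater-side lock: the caller, not the replace helper, takes it. */
static atomic_flag updater_lock = ATOMIC_FLAG_INIT;

static void updaters_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&updater_lock, memory_order_acquire))
		;	/* spin until the previous updater is done */
}

static void updaters_unlock(void)
{
	atomic_flag_clear_explicit(&updater_lock, memory_order_release);
}

static void node_init(struct node *n, int value, struct node *next)
{
	n->value = value;
	atomic_init(&n->next, next);
}

/* Swap prev's successor 'old' for 'new' with one published pointer store. */
static void replace_after(struct node *prev, struct node *old, struct node *new)
{
	atomic_store_explicit(&new->next,
			      atomic_load_explicit(&old->next, memory_order_relaxed),
			      memory_order_relaxed);
	/* Release: a reader that sees 'new' also sees its initialized fields. */
	atomic_store_explicit(&prev->next, new, memory_order_release);
	/* 'old' must stay valid until all pre-existing readers are done with it. */
}

/* Reader side: takes no lock and sees either the old or the new node. */
static int sum_list(struct node *head)
{
	int sum = 0;

	for (struct node *n = atomic_load_explicit(&head->next, memory_order_acquire);
	     n != NULL;
	     n = atomic_load_explicit(&n->next, memory_order_acquire))
		sum += n->value;
	return sum;
}
```

A caller would wrap `replace_after()` in `updaters_lock()` and `updaters_unlock()`; readers call `sum_list()` at any time without taking the lock.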
@@ -34,10 +34,12 @@
 #define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
 #define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))
 
+#define RCU_SEQ_CTR_SHIFT 2
+#define RCU_SEQ_STATE_MASK ((1 << RCU_SEQ_CTR_SHIFT) - 1)
+
 /* Exported common interfaces */
 void call_rcu(struct rcu_head *head, rcu_callback_t func);
 void rcu_barrier_tasks(void);
-void rcu_barrier_tasks_rude(void);
 void synchronize_rcu(void);
 
 struct rcu_gp_oldstate;
@@ -144,11 +146,18 @@ void rcu_init_nohz(void);
 int rcu_nocb_cpu_offload(int cpu);
 int rcu_nocb_cpu_deoffload(int cpu);
 void rcu_nocb_flush_deferred_wakeup(void);
+
+#define RCU_NOCB_LOCKDEP_WARN(c, s) RCU_LOCKDEP_WARN(c, s)
+
 #else /* #ifdef CONFIG_RCU_NOCB_CPU */
+
 static inline void rcu_init_nohz(void) { }
 static inline int rcu_nocb_cpu_offload(int cpu) { return -EINVAL; }
 static inline int rcu_nocb_cpu_deoffload(int cpu) { return 0; }
 static inline void rcu_nocb_flush_deferred_wakeup(void) { }
+
+#define RCU_NOCB_LOCKDEP_WARN(c, s)
+
 #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
 
 /*
@@ -165,6 +174,7 @@ static inline void rcu_nocb_flush_deferred_wakeup(void) { }
 	} while (0)
 void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
 void synchronize_rcu_tasks(void);
+void rcu_tasks_torture_stats_print(char *tt, char *tf);
 # else
 # define rcu_tasks_classic_qs(t, preempt) do { } while (0)
 # define call_rcu_tasks call_rcu
@@ -191,6 +201,7 @@ void rcu_tasks_trace_qs_blkd(struct task_struct *t);
 			rcu_tasks_trace_qs_blkd(t); \
 		} \
 	} while (0)
+void rcu_tasks_trace_torture_stats_print(char *tt, char *tf);
 # else
 # define rcu_tasks_trace_qs(t) do { } while (0)
 # endif
@@ -202,8 +213,8 @@ do { \
 } while (0)
 
 # ifdef CONFIG_TASKS_RUDE_RCU
-void call_rcu_tasks_rude(struct rcu_head *head, rcu_callback_t func);
 void synchronize_rcu_tasks_rude(void);
+void rcu_tasks_rude_torture_stats_print(char *tt, char *tf);
 # endif
 
 #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false)

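The first hunk of this file places RCU_SEQ_CTR_SHIFT and RCU_SEQ_STATE_MASK next to the wrap-safe comparison macros. A small userspace sketch of how these pieces fit together follows. The macros are copied from the hunk; the `seq_*` helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define ULONG_CMP_GE(a, b)  (ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b)  (ULONG_MAX / 2 < (a) - (b))

#define RCU_SEQ_CTR_SHIFT   2
#define RCU_SEQ_STATE_MASK  ((1 << RCU_SEQ_CTR_SHIFT) - 1)

/* Upper bits of a sequence value: the grace-period counter. */
static unsigned long seq_ctr(unsigned long s)
{
	return s >> RCU_SEQ_CTR_SHIFT;
}

/* Low RCU_SEQ_CTR_SHIFT bits: the state field. */
static int seq_state(unsigned long s)
{
	return s & RCU_SEQ_STATE_MASK;
}

/* True if 'a' is at or after 'b', even when the counter has wrapped. */
static bool seq_at_or_after(unsigned long a, unsigned long b)
{
	return ULONG_CMP_GE(a, b);
}
```

The comparison works on the unsigned difference, so a value that has just wrapped past zero still compares as later than one taken shortly before the wrap.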
@@ -158,7 +158,7 @@ void rcu_scheduler_starting(void);
 static inline void rcu_end_inkernel_boot(void) { }
 static inline bool rcu_inkernel_boot_has_ended(void) { return true; }
 static inline bool rcu_is_watching(void) { return true; }
-static inline void rcu_momentary_dyntick_idle(void) { }
+static inline void rcu_momentary_eqs(void) { }
 static inline void kfree_rcu_scheduler_running(void) { }
 static inline bool rcu_gp_might_be_stalled(void) { return false; }
 

@@ -37,7 +37,7 @@ void synchronize_rcu_expedited(void);
 void kvfree_call_rcu(struct rcu_head *head, void *ptr);
 
 void rcu_barrier(void);
-void rcu_momentary_dyntick_idle(void);
+void rcu_momentary_eqs(void);
 void kfree_rcu_scheduler_running(void);
 bool rcu_gp_might_be_stalled(void);
 

@@ -294,4 +294,10 @@ int smpcfd_prepare_cpu(unsigned int cpu);
 int smpcfd_dead_cpu(unsigned int cpu);
 int smpcfd_dying_cpu(unsigned int cpu);
 
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+bool csd_lock_is_stuck(void);
+#else
+static inline bool csd_lock_is_stuck(void) { return false; }
+#endif
+
 #endif /* __LINUX_SMP_H */

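The stub above lets callers test for a CSD-lock stall without their own #ifdef. The following userspace sketch shows that pattern only; `stall_report_lines()` is a hypothetical caller invented for illustration and is not how the RCU stall-warning code is written.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the header: a real function when the option is set, else a stub. */
#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
bool csd_lock_is_stuck(void);
#else
static inline bool csd_lock_is_stuck(void) { return false; }
#endif

/*
 * Hypothetical caller: emit a one-line report while a CSD-lock stall is
 * in progress, otherwise emit the full report.
 */
static int stall_report_lines(int full_lines)
{
	return csd_lock_is_stuck() ? 1 : full_lines;
}
```

With the option disabled the stub is constant false, so the compiler can drop the abbreviated branch entirely.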
@@ -129,10 +129,23 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1 1
 #define SRCU_STATE_SCAN2 2
 
+/*
+ * Values for initializing gp sequence fields. Higher values allow wrap arounds to
+ * occur earlier.
+ * The second value with state is useful in the case of static initialization of
+ * srcu_usage where srcu_gp_seq_needed is expected to have some state value in its
+ * lower bits (or else it will appear to be already initialized within
+ * the call check_init_srcu_struct()).
+ */
+#define SRCU_GP_SEQ_INITIAL_VAL ((0UL - 100UL) << RCU_SEQ_CTR_SHIFT)
+#define SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE (SRCU_GP_SEQ_INITIAL_VAL - 1)
+
 #define __SRCU_USAGE_INIT(name) \
 { \
 	.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
-	.srcu_gp_seq_needed = -1UL, \
+	.srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL, \
+	.srcu_gp_seq_needed = SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE, \
+	.srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL, \
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
 }
 

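The arithmetic behind the two initial values can be checked in userspace. The sketch below copies the macros from the hunk above and adds one hypothetical helper, `ctr_steps_until_wrap()`. It assumes each grace period advances the counter field by one, which is how the comment's "wrap arounds ... occur earlier" is read here.

```c
#include <assert.h>

#define RCU_SEQ_CTR_SHIFT   2
#define RCU_SEQ_STATE_MASK  ((1 << RCU_SEQ_CTR_SHIFT) - 1)

#define SRCU_GP_SEQ_INITIAL_VAL ((0UL - 100UL) << RCU_SEQ_CTR_SHIFT)
#define SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE (SRCU_GP_SEQ_INITIAL_VAL - 1)

/*
 * Hypothetical helper: how many counter increments remain before the
 * counter field wraps to zero, ignoring the low state bits.
 */
static unsigned long ctr_steps_until_wrap(unsigned long seq)
{
	unsigned long ctr_only = seq & ~(unsigned long)RCU_SEQ_STATE_MASK;

	return (0UL - ctr_only) >> RCU_SEQ_CTR_SHIFT;
}
```

Under that assumption the counter wraps after 100 increments. The "with state" variant has nonzero low bits, which matches the comment's requirement that srcu_gp_seq_needed carry some state value when statically initialized.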
@@ -466,40 +466,40 @@ TRACE_EVENT(rcu_stall_warning,
 /*
  * Tracepoint for dyntick-idle entry/exit events. These take 2 strings
  * as argument:
- * polarity: "Start", "End", "StillNonIdle" for entering, exiting or still not
- * being in dyntick-idle mode.
+ * polarity: "Start", "End", "StillWatching" for entering, exiting or still not
+ * being in EQS mode.
  * context: "USER" or "IDLE" or "IRQ".
- * NMIs nested in IRQs are inferred with dynticks_nesting > 1 in IRQ context.
+ * NMIs nested in IRQs are inferred with nesting > 1 in IRQ context.
  *
  * These events also take a pair of numbers, which indicate the nesting
  * depth before and after the event of interest, and a third number that is
- * the ->dynticks counter. Note that task-related and interrupt-related
+ * the RCU_WATCHING counter. Note that task-related and interrupt-related
  * events use two separate counters, and that the "++=" and "--=" events
  * for irq/NMI will change the counter by two, otherwise by one.
  */
-TRACE_EVENT_RCU(rcu_dyntick,
+TRACE_EVENT_RCU(rcu_watching,
 
-	TP_PROTO(const char *polarity, long oldnesting, long newnesting, int dynticks),
+	TP_PROTO(const char *polarity, long oldnesting, long newnesting, int counter),
 
-	TP_ARGS(polarity, oldnesting, newnesting, dynticks),
+	TP_ARGS(polarity, oldnesting, newnesting, counter),
 
 	TP_STRUCT__entry(
 		__field(const char *, polarity)
 		__field(long, oldnesting)
 		__field(long, newnesting)
-		__field(int, dynticks)
+		__field(int, counter)
 	),
 
 	TP_fast_assign(
 		__entry->polarity = polarity;
 		__entry->oldnesting = oldnesting;
 		__entry->newnesting = newnesting;
-		__entry->dynticks = dynticks;
+		__entry->counter = counter;
 	),
 
 	TP_printk("%s %lx %lx %#3x", __entry->polarity,
 		  __entry->oldnesting, __entry->newnesting,
-		  __entry->dynticks & 0xfff)
+		  __entry->counter & 0xfff)
 );
 
 /*

@@ -28,34 +28,34 @@
 
 DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
-	.dynticks_nesting = 1,
-	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
+	.nesting = 1,
+	.nmi_nesting = CT_NESTING_IRQ_NONIDLE,
 #endif
-	.state = ATOMIC_INIT(RCU_DYNTICKS_IDX),
+	.state = ATOMIC_INIT(CT_RCU_WATCHING),
 };
 EXPORT_SYMBOL_GPL(context_tracking);
 
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
 #define TPS(x) tracepoint_string(x)
 
-/* Record the current task on dyntick-idle entry. */
-static __always_inline void rcu_dynticks_task_enter(void)
+/* Record the current task on exiting RCU-tasks (dyntick-idle entry). */
+static __always_inline void rcu_task_exit(void)
 {
 #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
 	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
 #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
 }
 
-/* Record no current task on dyntick-idle exit. */
-static __always_inline void rcu_dynticks_task_exit(void)
+/* Record no current task on entering RCU-tasks (dyntick-idle exit). */
+static __always_inline void rcu_task_enter(void)
 {
 #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
 	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
 #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
 }
 
-/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
-static __always_inline void rcu_dynticks_task_trace_enter(void)
+/* Turn on heavyweight RCU tasks trace readers on kernel exit. */
+static __always_inline void rcu_task_trace_heavyweight_enter(void)
 {
 #ifdef CONFIG_TASKS_TRACE_RCU
 	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
@@ -63,8 +63,8 @@ static __always_inline void rcu_dynticks_task_trace_enter(void)
 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
 }
 
-/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
-static __always_inline void rcu_dynticks_task_trace_exit(void)
+/* Turn off heavyweight RCU tasks trace readers on kernel entry. */
+static __always_inline void rcu_task_trace_heavyweight_exit(void)
 {
 #ifdef CONFIG_TASKS_TRACE_RCU
 	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
@@ -87,10 +87,10 @@ static noinstr void ct_kernel_exit_state(int offset)
 	 * critical sections, and we also must force ordering with the
 	 * next idle sojourn.
 	 */
-	rcu_dynticks_task_trace_enter(); // Before ->dynticks update!
+	rcu_task_trace_heavyweight_enter(); // Before CT state update!
 	seq = ct_state_inc(offset);
 	// RCU is no longer watching. Better be in extended quiescent state!
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & RCU_DYNTICKS_IDX));
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
 }
 
 /*
@@ -109,15 +109,15 @@ static noinstr void ct_kernel_enter_state(int offset)
 	 */
 	seq = ct_state_inc(offset);
 	// RCU is now watching. Better not be in an extended quiescent state!
-	rcu_dynticks_task_trace_exit(); // After ->dynticks update!
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & RCU_DYNTICKS_IDX));
+	rcu_task_trace_heavyweight_exit(); // After CT state update!
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & CT_RCU_WATCHING));
 }
 
 /*
  * Enter an RCU extended quiescent state, which can be either the
  * idle loop or adaptive-tickless usermode execution.
  *
- * We crowbar the ->dynticks_nmi_nesting field to zero to allow for
+ * We crowbar the ->nmi_nesting field to zero to allow for
  * the possibility of usermode upcalls having messed up our count
  * of interrupt nesting level during the prior busy period.
  */
@@ -125,19 +125,19 @@ static void noinstr ct_kernel_exit(bool user, int offset)
 {
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
-	WARN_ON_ONCE(ct_dynticks_nmi_nesting() != DYNTICK_IRQ_NONIDLE);
-	WRITE_ONCE(ct->dynticks_nmi_nesting, 0);
+	WARN_ON_ONCE(ct_nmi_nesting() != CT_NESTING_IRQ_NONIDLE);
+	WRITE_ONCE(ct->nmi_nesting, 0);
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     ct_dynticks_nesting() == 0);
-	if (ct_dynticks_nesting() != 1) {
+		     ct_nesting() == 0);
+	if (ct_nesting() != 1) {
 		// RCU will still be watching, so just do accounting and leave.
-		ct->dynticks_nesting--;
+		ct->nesting--;
 		return;
 	}
 
 	instrumentation_begin();
 	lockdep_assert_irqs_disabled();
-	trace_rcu_dyntick(TPS("Start"), ct_dynticks_nesting(), 0, ct_dynticks());
+	trace_rcu_watching(TPS("End"), ct_nesting(), 0, ct_rcu_watching());
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	rcu_preempt_deferred_qs(current);
 
@@ -145,18 +145,18 @@ static void noinstr ct_kernel_exit(bool user, int offset)
 	instrument_atomic_write(&ct->state, sizeof(ct->state));
 
 	instrumentation_end();
-	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
+	WRITE_ONCE(ct->nesting, 0); /* Avoid irq-access tearing. */
 	// RCU is watching here ...
 	ct_kernel_exit_state(offset);
 	// ... but is no longer watching here.
-	rcu_dynticks_task_enter();
+	rcu_task_exit();
 }
 
 /*
  * Exit an RCU extended quiescent state, which can be either the
  * idle loop or adaptive-tickless usermode execution.
  *
- * We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to
+ * We crowbar the ->nmi_nesting field to CT_NESTING_IRQ_NONIDLE to
  * allow for the possibility of usermode upcalls messing up our count of
  * interrupt nesting level during the busy period that is just now starting.
  */
@@ -166,14 +166,14 @@ static void noinstr ct_kernel_enter(bool user, int offset)
 	long oldval;
 
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
-	oldval = ct_dynticks_nesting();
+	oldval = ct_nesting();
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
 	if (oldval) {
 		// RCU was already watching, so just do accounting and leave.
-		ct->dynticks_nesting++;
+		ct->nesting++;
 		return;
 	}
-	rcu_dynticks_task_exit();
+	rcu_task_enter();
 	// RCU is not watching here ...
 	ct_kernel_enter_state(offset);
 	// ... but is watching here.
@@ -182,11 +182,11 @@ static void noinstr ct_kernel_enter(bool user, int offset)
 	// instrumentation for the noinstr ct_kernel_enter_state()
 	instrument_atomic_write(&ct->state, sizeof(ct->state));
 
-	trace_rcu_dyntick(TPS("End"), ct_dynticks_nesting(), 1, ct_dynticks());
+	trace_rcu_watching(TPS("Start"), ct_nesting(), 1, ct_rcu_watching());
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
-	WRITE_ONCE(ct->dynticks_nesting, 1);
-	WARN_ON_ONCE(ct_dynticks_nmi_nesting());
-	WRITE_ONCE(ct->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
+	WRITE_ONCE(ct->nesting, 1);
+	WARN_ON_ONCE(ct_nmi_nesting());
+	WRITE_ONCE(ct->nmi_nesting, CT_NESTING_IRQ_NONIDLE);
 	instrumentation_end();
 }
 
@@ -194,7 +194,7 @@ static void noinstr ct_kernel_enter(bool user, int offset)
  * ct_nmi_exit - inform RCU of exit from NMI context
  *
  * If we are returning from the outermost NMI handler that interrupted an
- * RCU-idle period, update ct->state and ct->dynticks_nmi_nesting
+ * RCU-idle period, update ct->state and ct->nmi_nesting
  * to let the RCU grace-period handling know that the CPU is back to
  * being RCU-idle.
  *
@@ -207,47 +207,47 @@ void noinstr ct_nmi_exit(void)
 
 	instrumentation_begin();
 	/*
-	 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
+	 * Check for ->nmi_nesting underflow and bad CT state.
 	 * (We are exiting an NMI handler, so RCU better be paying attention
 	 * to us!)
 	 */
-	WARN_ON_ONCE(ct_dynticks_nmi_nesting() <= 0);
-	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
+	WARN_ON_ONCE(ct_nmi_nesting() <= 0);
+	WARN_ON_ONCE(!rcu_is_watching_curr_cpu());
 
 	/*
 	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
 	 * leave it in non-RCU-idle state.
 	 */
-	if (ct_dynticks_nmi_nesting() != 1) {
-		trace_rcu_dyntick(TPS("--="), ct_dynticks_nmi_nesting(), ct_dynticks_nmi_nesting() - 2,
-				  ct_dynticks());
-		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
-			   ct_dynticks_nmi_nesting() - 2);
+	if (ct_nmi_nesting() != 1) {
+		trace_rcu_watching(TPS("--="), ct_nmi_nesting(), ct_nmi_nesting() - 2,
+				   ct_rcu_watching());
+		WRITE_ONCE(ct->nmi_nesting, /* No store tearing. */
+			   ct_nmi_nesting() - 2);
 		instrumentation_end();
 		return;
 	}
 
 	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
-	trace_rcu_dyntick(TPS("Startirq"), ct_dynticks_nmi_nesting(), 0, ct_dynticks());
-	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
+	trace_rcu_watching(TPS("Endirq"), ct_nmi_nesting(), 0, ct_rcu_watching());
+	WRITE_ONCE(ct->nmi_nesting, 0); /* Avoid store tearing. */
 
 	// instrumentation for the noinstr ct_kernel_exit_state()
 	instrument_atomic_write(&ct->state, sizeof(ct->state));
 	instrumentation_end();
 
 	// RCU is watching here ...
-	ct_kernel_exit_state(RCU_DYNTICKS_IDX);
+	ct_kernel_exit_state(CT_RCU_WATCHING);
 	// ... but is no longer watching here.
 
 	if (!in_nmi())
-		rcu_dynticks_task_enter();
+		rcu_task_exit();
 }
 
 /**
  * ct_nmi_enter - inform RCU of entry to NMI context
  *
  * If the CPU was idle from RCU's viewpoint, update ct->state and
- * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
+ * ct->nmi_nesting to let the RCU grace-period handling know
  * that the CPU is active. This implementation permits nested NMIs, as
  * long as the nesting level does not overflow an int. (You will probably
  * run out of stack space first.)
@@ -261,27 +261,27 @@ void noinstr ct_nmi_enter(void)
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
 	/* Complain about underflow. */
-	WARN_ON_ONCE(ct_dynticks_nmi_nesting() < 0);
+	WARN_ON_ONCE(ct_nmi_nesting() < 0);
 
 	/*
-	 * If idle from RCU viewpoint, atomically increment ->dynticks
-	 * to mark non-idle and increment ->dynticks_nmi_nesting by one.
-	 * Otherwise, increment ->dynticks_nmi_nesting by two. This means
-	 * if ->dynticks_nmi_nesting is equal to one, we are guaranteed
+	 * If idle from RCU viewpoint, atomically increment CT state
+	 * to mark non-idle and increment ->nmi_nesting by one.
+	 * Otherwise, increment ->nmi_nesting by two. This means
+	 * if ->nmi_nesting is equal to one, we are guaranteed
 	 * to be in the outermost NMI handler that interrupted an RCU-idle
 	 * period (observation due to Andy Lutomirski).
 	 */
-	if (rcu_dynticks_curr_cpu_in_eqs()) {
+	if (!rcu_is_watching_curr_cpu()) {
 
 		if (!in_nmi())
-			rcu_dynticks_task_exit();
+			rcu_task_enter();
 
 		// RCU is not watching here ...
-		ct_kernel_enter_state(RCU_DYNTICKS_IDX);
+		ct_kernel_enter_state(CT_RCU_WATCHING);
 		// ... but is watching here.
 
 		instrumentation_begin();
-		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
+		// instrumentation for the noinstr rcu_is_watching_curr_cpu()
 		instrument_atomic_read(&ct->state, sizeof(ct->state));
 		// instrumentation for the noinstr ct_kernel_enter_state()
 		instrument_atomic_write(&ct->state, sizeof(ct->state));
@@ -294,12 +294,12 @@ void noinstr ct_nmi_enter(void)
 		instrumentation_begin();
 	}
 
-	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
-			  ct_dynticks_nmi_nesting(),
-			  ct_dynticks_nmi_nesting() + incby, ct_dynticks());
+	trace_rcu_watching(incby == 1 ? TPS("Startirq") : TPS("++="),
+			   ct_nmi_nesting(),
+			   ct_nmi_nesting() + incby, ct_rcu_watching());
 	instrumentation_end();
-	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
-		   ct_dynticks_nmi_nesting() + incby);
+	WRITE_ONCE(ct->nmi_nesting, /* Prevent store tearing. */
+		   ct_nmi_nesting() + incby);
 	barrier();
 }
 
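The "increment by one or by two" rule described in the ct_nmi_enter() comment can be traced with a small model. The sketch below is a userspace illustration only: `struct nmi_model` and the `model_*` functions are hypothetical, and they leave out the atomic state word, tracing, and instrumentation that the real functions handle.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define CT_NESTING_IRQ_NONIDLE ((LONG_MAX / 2) + 1)

/* Hypothetical per-CPU model: is RCU watching, and how deep are irqs/NMIs nested? */
struct nmi_model {
	bool watching;
	long nmi_nesting;
};

/* Models the counting rule in ct_nmi_enter(). */
static void model_nmi_enter(struct nmi_model *m)
{
	long incby = 2;

	if (!m->watching) {
		/* Interrupted an RCU-idle period: start watching, count by one. */
		m->watching = true;
		incby = 1;
	}
	m->nmi_nesting += incby;
}

/* Models the counting rule in ct_nmi_exit(). */
static void model_nmi_exit(struct nmi_model *m)
{
	assert(m->nmi_nesting > 0 && m->watching);

	if (m->nmi_nesting != 1) {
		/* Not the outermost handler of an idle period: keep watching. */
		m->nmi_nesting -= 2;
		return;
	}
	/* Outermost handler that interrupted an RCU-idle CPU: restore idleness. */
	m->nmi_nesting = 0;
	m->watching = false;
}
```

Because only an entry from an RCU-idle period adds one, a nesting value of exactly 1 on exit identifies the outermost handler that interrupted an idle period, as the comment credits to Andy Lutomirski.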
@ -317,7 +317,7 @@ void noinstr ct_nmi_enter(void)
|
||||
void noinstr ct_idle_enter(void)
|
||||
{
|
||||
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
|
||||
ct_kernel_exit(false, RCU_DYNTICKS_IDX + CONTEXT_IDLE);
|
||||
ct_kernel_exit(false, CT_RCU_WATCHING + CT_STATE_IDLE);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(ct_idle_enter);
|
||||
|
||||
@@ -335,7 +335,7 @@ void noinstr ct_idle_exit(void)
 	unsigned long flags;
 
 	raw_local_irq_save(flags);
-	ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE);
+	ct_kernel_enter(false, CT_RCU_WATCHING - CT_STATE_IDLE);
 	raw_local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(ct_idle_exit);
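The ct_idle_enter() and ct_idle_exit() hunks above pass `CT_RCU_WATCHING + CT_STATE_IDLE` and `CT_RCU_WATCHING - CT_STATE_IDLE` as a single offset. The user-space sketch below illustrates why one addition can both switch the context and toggle the RCU-watching indication. The bit layout, the `SK_*` constants, and the `sk_*` helpers are assumptions made for illustration only, not the kernel's definitions, and a plain `int` stands in for the kernel's atomic state word.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative layout only (assumed, not taken from the kernel headers):
 * the low two bits hold the context, and everything from bit 2 upward
 * counts RCU-watching transitions, so bit 2 alternates on each one.
 */
enum sk_ctx { SK_KERNEL = 0, SK_IDLE = 1, SK_USER = 2 };
#define SK_STATE_MASK   0x3
#define SK_RCU_WATCHING 0x4

static int sk_ctx_of(int state)
{
	return state & SK_STATE_MASK;
}

static bool sk_watching(int state)
{
	return state & SK_RCU_WATCHING;
}

/* Leave kernel context for @ctx: one add sets the context and flips watching. */
static int sk_kernel_exit(int state, enum sk_ctx ctx)
{
	assert(sk_ctx_of(state) == SK_KERNEL);
	return state + SK_RCU_WATCHING + ctx;
}

/* Return to kernel context from @ctx: one add clears the context and flips watching. */
static int sk_kernel_enter(int state, enum sk_ctx ctx)
{
	assert(sk_ctx_of(state) == (int)ctx);
	return state + SK_RCU_WATCHING - ctx;
}
```

Starting from a watching kernel state, `sk_kernel_exit(state, SK_IDLE)` yields an idle, non-watching state, and `sk_kernel_enter()` with the same context reverses both changes while the upper bits keep counting.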
@ -485,7 +485,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
|
||||
* user_exit() or ct_irq_enter(). Let's remove RCU's dependency
|
||||
* on the tick.
|
||||
*/
|
||||
if (state == CONTEXT_USER) {
|
||||
if (state == CT_STATE_USER) {
|
||||
instrumentation_begin();
|
||||
trace_user_enter(0);
|
||||
vtime_user_enter(current);
|
||||
@ -504,7 +504,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
|
||||
* CPU doesn't need to maintain the tick for RCU maintenance purposes
|
||||
* when the CPU runs in userspace.
|
||||
*/
|
||||
ct_kernel_exit(true, RCU_DYNTICKS_IDX + state);
|
||||
ct_kernel_exit(true, CT_RCU_WATCHING + state);
|
||||
|
||||
/*
|
||||
* Special case if we only track user <-> kernel transitions for tickless
|
||||
@ -534,7 +534,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
|
||||
/*
|
||||
* Tracking for vtime and RCU EQS. Make sure we don't race
|
||||
* with NMIs. OTOH we don't care about ordering here since
|
||||
* RCU only requires RCU_DYNTICKS_IDX increments to be fully
|
||||
* RCU only requires CT_RCU_WATCHING increments to be fully
|
||||
* ordered.
|
||||
*/
|
||||
raw_atomic_add(state, &ct->state);
|
||||
@ -620,8 +620,8 @@ void noinstr __ct_user_exit(enum ctx_state state)
|
||||
* Exit RCU idle mode while entering the kernel because it can
|
||||
* run a RCU read side critical section anytime.
|
||||
*/
|
||||
ct_kernel_enter(true, RCU_DYNTICKS_IDX - state);
|
||||
if (state == CONTEXT_USER) {
|
||||
ct_kernel_enter(true, CT_RCU_WATCHING - state);
|
||||
if (state == CT_STATE_USER) {
|
||||
instrumentation_begin();
|
||||
vtime_user_exit(current);
|
||||
trace_user_exit(0);
|
||||
@ -634,17 +634,17 @@ void noinstr __ct_user_exit(enum ctx_state state)
|
||||
* In this we case we don't care about any concurrency/ordering.
|
||||
*/
|
||||
if (!IS_ENABLED(CONFIG_CONTEXT_TRACKING_IDLE))
|
||||
raw_atomic_set(&ct->state, CONTEXT_KERNEL);
|
||||
raw_atomic_set(&ct->state, CT_STATE_KERNEL);
|
||||
|
||||
} else {
|
||||
if (!IS_ENABLED(CONFIG_CONTEXT_TRACKING_IDLE)) {
|
||||
/* Tracking for vtime only, no concurrent RCU EQS accounting */
|
||||
raw_atomic_set(&ct->state, CONTEXT_KERNEL);
|
||||
raw_atomic_set(&ct->state, CT_STATE_KERNEL);
|
||||
} else {
|
||||
/*
|
||||
* Tracking for vtime and RCU EQS. Make sure we don't race
|
||||
* with NMIs. OTOH we don't care about ordering here since
|
||||
* RCU only requires RCU_DYNTICKS_IDX increments to be fully
|
||||
* RCU only requires CT_RCU_WATCHING increments to be fully
|
||||
* ordered.
|
||||
*/
|
||||
raw_atomic_sub(state, &ct->state);
|
||||
|
@ -182,7 +182,7 @@ static void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
|
||||
unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
|
||||
unsigned long nr = syscall_get_nr(current, regs);
|
||||
|
||||
CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
|
||||
CT_WARN_ON(ct_state() != CT_STATE_KERNEL);
|
||||
|
||||
if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
|
||||
if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr))
|
||||
|
@ -54,9 +54,6 @@
|
||||
* grace-period sequence number.
|
||||
*/
|
||||
|
||||
#define RCU_SEQ_CTR_SHIFT 2
|
||||
#define RCU_SEQ_STATE_MASK ((1 << RCU_SEQ_CTR_SHIFT) - 1)
|
||||
|
||||
/* Low-order bit definition for polled grace-period APIs. */
|
||||
#define RCU_GET_STATE_COMPLETED 0x1
|
||||
|
||||
@@ -255,6 +252,11 @@ static inline void debug_rcu_head_callback(struct rcu_head *rhp)
 		kmem_dump_obj(rhp);
 }
 
+static inline bool rcu_barrier_cb_is_done(struct rcu_head *rhp)
+{
+	return rhp->next == rhp;
+}
+
 extern int rcu_cpu_stall_suppress_at_boot;
 
 static inline bool rcu_stall_is_suppressed_at_boot(void)
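The rcu_barrier_cb_is_done() helper added in the hunk above treats a callback head whose `->next` points at itself as idle. This hunk does not show where the self-pointer is set, so the sketch below assumes a convention in which the head is pointed at itself whenever it is not queued. The `sk_cb_*` names and the simple singly linked pending list are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_cb_head {
	struct sk_cb_head *next;
};

/* Mark a head idle by pointing it at itself (assumed convention). */
static void sk_cb_mark_idle(struct sk_cb_head *h)
{
	h->next = h;
}

/* Same test as the helper in the hunk above: self-pointer means idle. */
static bool sk_cb_is_done(const struct sk_cb_head *h)
{
	return h->next == h;
}

/* Queue an idle head; linking it into the list overwrites the self-pointer. */
static void sk_cb_enqueue(struct sk_cb_head **list, struct sk_cb_head *h)
{
	assert(sk_cb_is_done(h));
	h->next = *list;
	*list = h;
}

/* Remove the first head, if any, and mark it idle again. */
static struct sk_cb_head *sk_cb_dequeue(struct sk_cb_head **list)
{
	struct sk_cb_head *h = *list;

	if (h) {
		*list = h->next;
		sk_cb_mark_idle(h);
	}
	return h;
}
```

The appeal of the self-pointer is that it needs no extra field: a queued head can never point at itself, so a diagnostic pass can tell a stuck callback from an idle one by inspecting the head alone.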
@ -606,7 +608,7 @@ void srcutorture_get_gp_data(struct srcu_struct *sp, int *flags,
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_TINY_RCU
|
||||
static inline bool rcu_dynticks_zero_in_eqs(int cpu, int *vp) { return false; }
|
||||
static inline bool rcu_watching_zero_in_eqs(int cpu, int *vp) { return false; }
|
||||
static inline unsigned long rcu_get_gp_seq(void) { return 0; }
|
||||
static inline unsigned long rcu_exp_batches_completed(void) { return 0; }
|
||||
static inline unsigned long
|
||||
@ -619,7 +621,7 @@ static inline void rcu_fwd_progress_check(unsigned long j) { }
|
||||
static inline void rcu_gp_slow_register(atomic_t *rgssp) { }
|
||||
static inline void rcu_gp_slow_unregister(atomic_t *rgssp) { }
|
||||
#else /* #ifdef CONFIG_TINY_RCU */
|
||||
bool rcu_dynticks_zero_in_eqs(int cpu, int *vp);
|
||||
bool rcu_watching_zero_in_eqs(int cpu, int *vp);
|
||||
unsigned long rcu_get_gp_seq(void);
|
||||
unsigned long rcu_exp_batches_completed(void);
|
||||
unsigned long srcu_batches_completed(struct srcu_struct *sp);
|
||||
|
@ -260,17 +260,6 @@ void rcu_segcblist_disable(struct rcu_segcblist *rsclp)
|
||||
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_ENABLED);
|
||||
}
|
||||
|
||||
/*
|
||||
* Mark the specified rcu_segcblist structure as offloaded (or not)
|
||||
*/
|
||||
void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
|
||||
{
|
||||
if (offload)
|
||||
rcu_segcblist_set_flags(rsclp, SEGCBLIST_LOCKING | SEGCBLIST_OFFLOADED);
|
||||
else
|
||||
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_OFFLOADED);
|
||||
}
|
||||
|
||||
/*
|
||||
* Does the specified rcu_segcblist structure contain callbacks that
|
||||
* are ready to be invoked?
|
||||
|
@@ -89,16 +89,7 @@ static inline bool rcu_segcblist_is_enabled(struct rcu_segcblist *rsclp)
 static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp)
 {
 	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
-	    rcu_segcblist_test_flags(rsclp, SEGCBLIST_LOCKING))
-		return true;
-
-	return false;
-}
-
-static inline bool rcu_segcblist_completely_offloaded(struct rcu_segcblist *rsclp)
-{
-	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
-	    !rcu_segcblist_test_flags(rsclp, SEGCBLIST_RCU_CORE))
+	    rcu_segcblist_test_flags(rsclp, SEGCBLIST_OFFLOADED))
 		return true;
 
 	return false;
||||
|
@ -39,6 +39,7 @@
|
||||
#include <linux/torture.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <linux/rcupdate_trace.h>
|
||||
#include <linux/sched/debug.h>
|
||||
|
||||
#include "rcu.h"
|
||||
|
||||
@ -104,6 +105,20 @@ static char *scale_type = "rcu";
|
||||
module_param(scale_type, charp, 0444);
|
||||
MODULE_PARM_DESC(scale_type, "Type of RCU to scalability-test (rcu, srcu, ...)");
|
||||
|
||||
// Structure definitions for custom fixed-per-task allocator.
|
||||
struct writer_mblock {
|
||||
struct rcu_head wmb_rh;
|
||||
struct llist_node wmb_node;
|
||||
struct writer_freelist *wmb_wfl;
|
||||
};
|
||||
|
||||
struct writer_freelist {
|
||||
struct llist_head ws_lhg;
|
||||
atomic_t ws_inflight;
|
||||
struct llist_head ____cacheline_internodealigned_in_smp ws_lhp;
|
||||
struct writer_mblock *ws_mblocks;
|
||||
};
|
||||
|
||||
static int nrealreaders;
|
||||
static int nrealwriters;
|
||||
static struct task_struct **writer_tasks;
|
||||
@ -111,6 +126,8 @@ static struct task_struct **reader_tasks;
|
||||
static struct task_struct *shutdown_task;
|
||||
|
||||
static u64 **writer_durations;
|
||||
static bool *writer_done;
|
||||
static struct writer_freelist *writer_freelists;
|
||||
static int *writer_n_durations;
|
||||
static atomic_t n_rcu_scale_reader_started;
|
||||
static atomic_t n_rcu_scale_writer_started;
|
||||
@ -120,7 +137,6 @@ static u64 t_rcu_scale_writer_started;
|
||||
static u64 t_rcu_scale_writer_finished;
|
||||
static unsigned long b_rcu_gp_test_started;
|
||||
static unsigned long b_rcu_gp_test_finished;
|
||||
static DEFINE_PER_CPU(atomic_t, n_async_inflight);
|
||||
|
||||
#define MAX_MEAS 10000
|
||||
#define MIN_MEAS 100
|
||||
@ -143,6 +159,7 @@ struct rcu_scale_ops {
|
||||
void (*sync)(void);
|
||||
void (*exp_sync)(void);
|
||||
struct task_struct *(*rso_gp_kthread)(void);
|
||||
void (*stats)(void);
|
||||
const char *name;
|
||||
};
|
||||
|
||||
@ -224,6 +241,11 @@ static void srcu_scale_synchronize(void)
|
||||
synchronize_srcu(srcu_ctlp);
|
||||
}
|
||||
|
||||
static void srcu_scale_stats(void)
|
||||
{
|
||||
srcu_torture_stats_print(srcu_ctlp, scale_type, SCALE_FLAG);
|
||||
}
|
||||
|
||||
static void srcu_scale_synchronize_expedited(void)
|
||||
{
|
||||
synchronize_srcu_expedited(srcu_ctlp);
|
||||
@ -241,6 +263,7 @@ static struct rcu_scale_ops srcu_ops = {
|
||||
.gp_barrier = srcu_rcu_barrier,
|
||||
.sync = srcu_scale_synchronize,
|
||||
.exp_sync = srcu_scale_synchronize_expedited,
|
||||
.stats = srcu_scale_stats,
|
||||
.name = "srcu"
|
||||
};
|
||||
|
||||
@ -270,6 +293,7 @@ static struct rcu_scale_ops srcud_ops = {
|
||||
.gp_barrier = srcu_rcu_barrier,
|
||||
.sync = srcu_scale_synchronize,
|
||||
.exp_sync = srcu_scale_synchronize_expedited,
|
||||
.stats = srcu_scale_stats,
|
||||
.name = "srcud"
|
||||
};
|
||||
|
||||
@ -288,6 +312,11 @@ static void tasks_scale_read_unlock(int idx)
|
||||
{
|
||||
}
|
||||
|
||||
static void rcu_tasks_scale_stats(void)
|
||||
{
|
||||
rcu_tasks_torture_stats_print(scale_type, SCALE_FLAG);
|
||||
}
|
||||
|
||||
static struct rcu_scale_ops tasks_ops = {
|
||||
.ptype = RCU_TASKS_FLAVOR,
|
||||
.init = rcu_sync_scale_init,
|
||||
@ -300,6 +329,7 @@ static struct rcu_scale_ops tasks_ops = {
|
||||
.sync = synchronize_rcu_tasks,
|
||||
.exp_sync = synchronize_rcu_tasks,
|
||||
.rso_gp_kthread = get_rcu_tasks_gp_kthread,
|
||||
.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_scale_stats,
|
||||
.name = "tasks"
|
||||
};
|
||||
|
||||
@ -326,6 +356,11 @@ static void tasks_rude_scale_read_unlock(int idx)
|
||||
{
|
||||
}
|
||||
|
||||
static void rcu_tasks_rude_scale_stats(void)
|
||||
{
|
||||
rcu_tasks_rude_torture_stats_print(scale_type, SCALE_FLAG);
|
||||
}
|
||||
|
||||
static struct rcu_scale_ops tasks_rude_ops = {
|
||||
.ptype = RCU_TASKS_RUDE_FLAVOR,
|
||||
.init = rcu_sync_scale_init,
|
||||
@ -333,11 +368,10 @@ static struct rcu_scale_ops tasks_rude_ops = {
|
||||
.readunlock = tasks_rude_scale_read_unlock,
|
||||
.get_gp_seq = rcu_no_completed,
|
||||
.gp_diff = rcu_seq_diff,
|
||||
.async = call_rcu_tasks_rude,
|
||||
.gp_barrier = rcu_barrier_tasks_rude,
|
||||
.sync = synchronize_rcu_tasks_rude,
|
||||
.exp_sync = synchronize_rcu_tasks_rude,
|
||||
.rso_gp_kthread = get_rcu_tasks_rude_gp_kthread,
|
||||
.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_rude_scale_stats,
|
||||
.name = "tasks-rude"
|
||||
};
|
||||
|
||||
@ -366,6 +400,11 @@ static void tasks_trace_scale_read_unlock(int idx)
|
||||
rcu_read_unlock_trace();
|
||||
}
|
||||
|
||||
static void rcu_tasks_trace_scale_stats(void)
|
||||
{
|
||||
rcu_tasks_trace_torture_stats_print(scale_type, SCALE_FLAG);
|
||||
}
|
||||
|
||||
static struct rcu_scale_ops tasks_tracing_ops = {
|
||||
.ptype = RCU_TASKS_FLAVOR,
|
||||
.init = rcu_sync_scale_init,
|
||||
@ -378,6 +417,7 @@ static struct rcu_scale_ops tasks_tracing_ops = {
|
||||
.sync = synchronize_rcu_tasks_trace,
|
||||
.exp_sync = synchronize_rcu_tasks_trace,
|
||||
.rso_gp_kthread = get_rcu_tasks_trace_gp_kthread,
|
||||
.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_trace_scale_stats,
|
||||
.name = "tasks-tracing"
|
||||
};
|
||||
|
||||
@@ -437,13 +477,53 @@ rcu_scale_reader(void *arg)
 	return 0;
 }
 
+/*
+ * Allocate a writer_mblock structure for the specified rcu_scale_writer
+ * task.
+ */
+static struct writer_mblock *rcu_scale_alloc(long me)
+{
+	struct llist_node *llnp;
+	struct writer_freelist *wflp;
+	struct writer_mblock *wmbp;
+
+	if (WARN_ON_ONCE(!writer_freelists))
+		return NULL;
+	wflp = &writer_freelists[me];
+	if (llist_empty(&wflp->ws_lhp)) {
+		// ->ws_lhp is private to its rcu_scale_writer task.
+		wmbp = container_of(llist_del_all(&wflp->ws_lhg), struct writer_mblock, wmb_node);
+		wflp->ws_lhp.first = &wmbp->wmb_node;
+	}
+	llnp = llist_del_first(&wflp->ws_lhp);
+	if (!llnp)
+		return NULL;
+	return container_of(llnp, struct writer_mblock, wmb_node);
+}
+
+/*
+ * Free a writer_mblock structure to its rcu_scale_writer task.
+ */
+static void rcu_scale_free(struct writer_mblock *wmbp)
+{
+	struct writer_freelist *wflp;
+
+	if (!wmbp)
+		return;
+	wflp = wmbp->wmb_wfl;
+	llist_add(&wmbp->wmb_node, &wflp->ws_lhg);
+}
+
 /*
  * Callback function for asynchronous grace periods from rcu_scale_writer().
  */
 static void rcu_scale_async_cb(struct rcu_head *rhp)
 {
-	atomic_dec(this_cpu_ptr(&n_async_inflight));
-	kfree(rhp);
+	struct writer_mblock *wmbp = container_of(rhp, struct writer_mblock, wmb_rh);
+	struct writer_freelist *wflp = wmbp->wmb_wfl;
+
+	atomic_dec(&wflp->ws_inflight);
+	rcu_scale_free(wmbp);
 }
 
 /*
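The rcu_scale_alloc() and rcu_scale_free() functions in the hunk above pair a private list, touched only by the owning writer task, with a shared list that any context may push onto. A minimal user-space sketch of that two-list idea follows. It uses C11 atomics in place of the kernel's llist primitives, and every `sk_*` name is hypothetical, so it should be read as an illustration of the scheme rather than as the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sk_mblock {
	struct sk_mblock *next;
};

struct sk_freelist {
	_Atomic(struct sk_mblock *) shared;	/* pushed to from any thread */
	struct sk_mblock *private_head;		/* touched only by the owner */
};

/* Thread all @n preallocated blocks onto the owner's private list. */
static void sk_fl_init(struct sk_freelist *fl, struct sk_mblock *blocks, size_t n)
{
	size_t i;

	assert(n > 0);
	atomic_init(&fl->shared, NULL);
	fl->private_head = NULL;
	for (i = 0; i < n; i++) {
		blocks[i].next = fl->private_head;
		fl->private_head = &blocks[i];
	}
}

/* Return a block from any thread: lock-free push onto the shared list. */
static void sk_fl_free(struct sk_freelist *fl, struct sk_mblock *mb)
{
	struct sk_mblock *old;

	if (!mb)
		return;
	old = atomic_load(&fl->shared);
	do {
		mb->next = old;
	} while (!atomic_compare_exchange_weak(&fl->shared, &old, mb));
}

/* Owner only: refill the private list from the shared one when empty. */
static struct sk_mblock *sk_fl_alloc(struct sk_freelist *fl)
{
	struct sk_mblock *mb;

	if (!fl->private_head)
		fl->private_head = atomic_exchange(&fl->shared, NULL);
	mb = fl->private_head;
	if (mb)
		fl->private_head = mb->next;
	return mb;
}
```

Because only the owner pops, and it takes the whole shared list in one exchange, the pop side needs no compare-and-swap loop. Allocation returns NULL when every block is still in flight, which matches the bounded, preallocated pool that the hunk sizes by `gp_async_max`.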
@@ -456,12 +536,14 @@ rcu_scale_writer(void *arg)
 	int i_max;
 	unsigned long jdone;
 	long me = (long)arg;
-	struct rcu_head *rhp = NULL;
+	bool selfreport = false;
 	bool started = false, done = false, alldone = false;
 	u64 t;
 	DEFINE_TORTURE_RANDOM(tr);
 	u64 *wdp;
 	u64 *wdpp = writer_durations[me];
+	struct writer_freelist *wflp = &writer_freelists[me];
+	struct writer_mblock *wmbp = NULL;
 
 	VERBOSE_SCALEOUT_STRING("rcu_scale_writer task started");
 	WARN_ON(!wdpp);
@@ -493,30 +575,34 @@ rcu_scale_writer(void *arg)
 
 	jdone = jiffies + minruntime * HZ;
 	do {
+		bool gp_succeeded = false;
+
 		if (writer_holdoff)
 			udelay(writer_holdoff);
 		if (writer_holdoff_jiffies)
 			schedule_timeout_idle(torture_random(&tr) % writer_holdoff_jiffies + 1);
 		wdp = &wdpp[i];
 		*wdp = ktime_get_mono_fast_ns();
-		if (gp_async) {
-retry:
-			if (!rhp)
-				rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
-			if (rhp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) {
-				atomic_inc(this_cpu_ptr(&n_async_inflight));
-				cur_ops->async(rhp, rcu_scale_async_cb);
-				rhp = NULL;
+		if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) {
+			if (!wmbp)
+				wmbp = rcu_scale_alloc(me);
+			if (wmbp && atomic_read(&wflp->ws_inflight) < gp_async_max) {
+				atomic_inc(&wflp->ws_inflight);
+				cur_ops->async(&wmbp->wmb_rh, rcu_scale_async_cb);
+				wmbp = NULL;
+				gp_succeeded = true;
 			} else if (!kthread_should_stop()) {
 				cur_ops->gp_barrier();
-				goto retry;
 			} else {
-				kfree(rhp); /* Because we are stopping. */
+				rcu_scale_free(wmbp); /* Because we are stopping. */
+				wmbp = NULL;
 			}
 		} else if (gp_exp) {
 			cur_ops->exp_sync();
+			gp_succeeded = true;
 		} else {
 			cur_ops->sync();
+			gp_succeeded = true;
 		}
 		t = ktime_get_mono_fast_ns();
 		*wdp = t - *wdp;
@@ -526,6 +612,7 @@ retry:
 			started = true;
 		if (!done && i >= MIN_MEAS && time_after(jiffies, jdone)) {
 			done = true;
+			WRITE_ONCE(writer_done[me], true);
 			sched_set_normal(current, 0);
 			pr_alert("%s%s rcu_scale_writer %ld has %d measurements\n",
 				 scale_type, SCALE_FLAG, me, MIN_MEAS);
@@ -551,11 +638,32 @@ retry:
 		if (done && !alldone &&
 		    atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters)
 			alldone = true;
-		if (started && !alldone && i < MAX_MEAS - 1)
+		if (done && !alldone && time_after(jiffies, jdone + HZ * 60)) {
+			static atomic_t dumped;
+			int i;
+
+			if (!atomic_xchg(&dumped, 1)) {
+				for (i = 0; i < nrealwriters; i++) {
+					if (writer_done[i])
+						continue;
+					pr_info("%s: Task %ld flags writer %d:\n", __func__, me, i);
+					sched_show_task(writer_tasks[i]);
+				}
+				if (cur_ops->stats)
+					cur_ops->stats();
+			}
+		}
+		if (!selfreport && time_after(jiffies, jdone + HZ * (70 + me))) {
+			pr_info("%s: Writer %ld self-report: started %d done %d/%d->%d i %d jdone %lu.\n",
+				__func__, me, started, done, writer_done[me], atomic_read(&n_rcu_scale_writer_finished), i, jiffies - jdone);
+			selfreport = true;
+		}
+		if (gp_succeeded && started && !alldone && i < MAX_MEAS - 1)
 			i++;
 		rcu_scale_wait_shutdown();
 	} while (!torture_must_stop());
-	if (gp_async) {
+	if (gp_async && cur_ops->async) {
+		rcu_scale_free(wmbp);
 		cur_ops->gp_barrier();
 	}
 	writer_n_durations[me] = i_max + 1;
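The hang-diagnostic code in the hunk above uses `atomic_xchg(&dumped, 1)` so that only one of the finished writers dumps the stacks of the unfinished ones. A user-space sketch of that claim-once pattern follows, using C11 `atomic_exchange()` as a stand-in for the kernel primitive. The `sk_*` names and the counter that replaces the real diagnostics are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int sk_dumped;	/* zero until the first caller claims the dump */
static int sk_dumps_emitted;	/* written only by the single claiming caller */

/* Returns true for exactly one caller, however many threads race here. */
static bool sk_claim_dump(void)
{
	return atomic_exchange(&sk_dumped, 1) == 0;
}

/* Called by every task that notices the hang; only the first one reports. */
static void sk_report_hang(void)
{
	if (!sk_claim_dump())
		return;
	sk_dumps_emitted++;	/* placeholder for the real diagnostic output */
}
```

An exchange is enough here because the flag is never reset: the caller that reads back zero owns the report and every later caller sees one.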
@ -713,6 +821,7 @@ kfree_scale_cleanup(void)
|
||||
torture_stop_kthread(kfree_scale_thread,
|
||||
kfree_reader_tasks[i]);
|
||||
kfree(kfree_reader_tasks);
|
||||
kfree_reader_tasks = NULL;
|
||||
}
|
||||
|
||||
torture_cleanup_end();
|
||||
@ -881,6 +990,7 @@ rcu_scale_cleanup(void)
|
||||
torture_stop_kthread(rcu_scale_reader,
|
||||
reader_tasks[i]);
|
||||
kfree(reader_tasks);
|
||||
reader_tasks = NULL;
|
||||
}
|
||||
|
||||
if (writer_tasks) {
|
||||
@ -919,10 +1029,33 @@ rcu_scale_cleanup(void)
|
||||
schedule_timeout_uninterruptible(1);
|
||||
}
|
||||
kfree(writer_durations[i]);
|
||||
if (writer_freelists) {
|
||||
int ctr = 0;
|
||||
struct llist_node *llnp;
|
||||
struct writer_freelist *wflp = &writer_freelists[i];
|
||||
|
||||
if (wflp->ws_mblocks) {
|
||||
llist_for_each(llnp, wflp->ws_lhg.first)
|
||||
ctr++;
|
||||
llist_for_each(llnp, wflp->ws_lhp.first)
|
||||
ctr++;
|
||||
WARN_ONCE(ctr != gp_async_max,
|
||||
"%s: ctr = %d gp_async_max = %d\n",
|
||||
__func__, ctr, gp_async_max);
|
||||
kfree(wflp->ws_mblocks);
|
||||
}
|
||||
}
|
||||
}
|
||||
kfree(writer_tasks);
|
||||
writer_tasks = NULL;
|
||||
kfree(writer_durations);
|
||||
writer_durations = NULL;
|
||||
kfree(writer_n_durations);
|
||||
writer_n_durations = NULL;
|
||||
kfree(writer_done);
|
||||
writer_done = NULL;
|
||||
kfree(writer_freelists);
|
||||
writer_freelists = NULL;
|
||||
}
|
||||
|
||||
/* Do torture-type-specific cleanup operations. */
|
||||
@ -949,8 +1082,9 @@ rcu_scale_shutdown(void *arg)
|
||||
static int __init
|
||||
rcu_scale_init(void)
|
||||
{
|
||||
long i;
|
||||
int firsterr = 0;
|
||||
long i;
|
||||
long j;
|
||||
static struct rcu_scale_ops *scale_ops[] = {
|
||||
&rcu_ops, &srcu_ops, &srcud_ops, TASKS_OPS TASKS_RUDE_OPS TASKS_TRACING_OPS
|
||||
};
|
||||
@ -1017,14 +1151,22 @@ rcu_scale_init(void)
|
||||
}
|
||||
while (atomic_read(&n_rcu_scale_reader_started) < nrealreaders)
|
||||
schedule_timeout_uninterruptible(1);
|
||||
writer_tasks = kcalloc(nrealwriters, sizeof(reader_tasks[0]),
|
||||
GFP_KERNEL);
|
||||
writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations),
|
||||
GFP_KERNEL);
|
||||
writer_n_durations =
|
||||
kcalloc(nrealwriters, sizeof(*writer_n_durations),
|
||||
GFP_KERNEL);
|
||||
if (!writer_tasks || !writer_durations || !writer_n_durations) {
|
||||
writer_tasks = kcalloc(nrealwriters, sizeof(writer_tasks[0]), GFP_KERNEL);
|
||||
writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), GFP_KERNEL);
|
||||
writer_n_durations = kcalloc(nrealwriters, sizeof(*writer_n_durations), GFP_KERNEL);
|
||||
writer_done = kcalloc(nrealwriters, sizeof(writer_done[0]), GFP_KERNEL);
|
||||
if (gp_async) {
|
||||
if (gp_async_max <= 0) {
|
||||
pr_warn("%s: gp_async_max = %d must be greater than zero.\n",
|
||||
__func__, gp_async_max);
|
||||
WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST));
|
||||
firsterr = -EINVAL;
|
||||
goto unwind;
|
||||
}
|
||||
writer_freelists = kcalloc(nrealwriters, sizeof(writer_freelists[0]), GFP_KERNEL);
|
||||
}
|
||||
if (!writer_tasks || !writer_durations || !writer_n_durations || !writer_done ||
|
||||
(gp_async && !writer_freelists)) {
|
||||
SCALEOUT_ERRSTRING("out of memory");
|
||||
firsterr = -ENOMEM;
|
||||
goto unwind;
|
||||
@ -1037,6 +1179,24 @@ rcu_scale_init(void)
|
||||
firsterr = -ENOMEM;
|
||||
goto unwind;
|
||||
}
|
||||
if (writer_freelists) {
|
||||
struct writer_freelist *wflp = &writer_freelists[i];
|
||||
|
||||
init_llist_head(&wflp->ws_lhg);
|
||||
init_llist_head(&wflp->ws_lhp);
|
||||
wflp->ws_mblocks = kcalloc(gp_async_max, sizeof(wflp->ws_mblocks[0]),
|
||||
GFP_KERNEL);
|
||||
if (!wflp->ws_mblocks) {
|
||||
firsterr = -ENOMEM;
|
||||
goto unwind;
|
||||
}
|
||||
for (j = 0; j < gp_async_max; j++) {
|
||||
struct writer_mblock *wmbp = &wflp->ws_mblocks[j];
|
||||
|
||||
wmbp->wmb_wfl = wflp;
|
||||
llist_add(&wmbp->wmb_node, &wflp->ws_lhp);
|
||||
}
|
||||
}
|
||||
firsterr = torture_create_kthread(rcu_scale_writer, (void *)i,
|
||||
writer_tasks[i]);
|
||||
if (torture_init_error(firsterr))
|
||||
|
@ -115,6 +115,7 @@ torture_param(int, stall_cpu_holdoff, 10, "Time to wait before starting stall (s
|
||||
torture_param(bool, stall_no_softlockup, false, "Avoid softlockup warning during cpu stall.");
|
||||
torture_param(int, stall_cpu_irqsoff, 0, "Disable interrupts while stalling.");
|
||||
torture_param(int, stall_cpu_block, 0, "Sleep while stalling.");
|
||||
torture_param(int, stall_cpu_repeat, 0, "Number of additional stalls after the first one.");
|
||||
torture_param(int, stall_gp_kthread, 0, "Grace-period kthread stall duration (s).");
|
||||
torture_param(int, stat_interval, 60, "Number of seconds between stats printk()s");
|
||||
torture_param(int, stutter, 5, "Number of seconds to run/halt test");
|
||||
@ -366,8 +367,6 @@ struct rcu_torture_ops {
|
||||
bool (*same_gp_state_full)(struct rcu_gp_oldstate *rgosp1, struct rcu_gp_oldstate *rgosp2);
|
||||
unsigned long (*get_gp_state)(void);
|
||||
void (*get_gp_state_full)(struct rcu_gp_oldstate *rgosp);
|
||||
unsigned long (*get_gp_completed)(void);
|
||||
void (*get_gp_completed_full)(struct rcu_gp_oldstate *rgosp);
|
||||
unsigned long (*start_gp_poll)(void);
|
||||
void (*start_gp_poll_full)(struct rcu_gp_oldstate *rgosp);
|
||||
bool (*poll_gp_state)(unsigned long oldstate);
|
||||
@ -375,6 +374,8 @@ struct rcu_torture_ops {
|
||||
bool (*poll_need_2gp)(bool poll, bool poll_full);
|
||||
void (*cond_sync)(unsigned long oldstate);
|
||||
void (*cond_sync_full)(struct rcu_gp_oldstate *rgosp);
|
||||
int poll_active;
|
||||
int poll_active_full;
|
||||
call_rcu_func_t call;
|
||||
void (*cb_barrier)(void);
|
||||
void (*fqs)(void);
|
||||
@ -553,8 +554,6 @@ static struct rcu_torture_ops rcu_ops = {
|
||||
.get_comp_state_full = get_completed_synchronize_rcu_full,
|
||||
.get_gp_state = get_state_synchronize_rcu,
|
||||
.get_gp_state_full = get_state_synchronize_rcu_full,
|
||||
.get_gp_completed = get_completed_synchronize_rcu,
|
||||
.get_gp_completed_full = get_completed_synchronize_rcu_full,
|
||||
.start_gp_poll = start_poll_synchronize_rcu,
|
||||
.start_gp_poll_full = start_poll_synchronize_rcu_full,
|
||||
.poll_gp_state = poll_state_synchronize_rcu,
|
||||
@ -562,6 +561,8 @@ static struct rcu_torture_ops rcu_ops = {
|
||||
.poll_need_2gp = rcu_poll_need_2gp,
|
||||
.cond_sync = cond_synchronize_rcu,
|
||||
.cond_sync_full = cond_synchronize_rcu_full,
|
||||
.poll_active = NUM_ACTIVE_RCU_POLL_OLDSTATE,
|
||||
.poll_active_full = NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE,
|
||||
.get_gp_state_exp = get_state_synchronize_rcu,
|
||||
.start_gp_poll_exp = start_poll_synchronize_rcu_expedited,
|
||||
.start_gp_poll_exp_full = start_poll_synchronize_rcu_expedited_full,
|
||||
@ -740,9 +741,12 @@ static struct rcu_torture_ops srcu_ops = {
|
||||
.deferred_free = srcu_torture_deferred_free,
|
||||
.sync = srcu_torture_synchronize,
|
||||
.exp_sync = srcu_torture_synchronize_expedited,
|
||||
.same_gp_state = same_state_synchronize_srcu,
|
||||
.get_comp_state = get_completed_synchronize_srcu,
|
||||
.get_gp_state = srcu_torture_get_gp_state,
|
||||
.start_gp_poll = srcu_torture_start_gp_poll,
|
||||
.poll_gp_state = srcu_torture_poll_gp_state,
|
||||
.poll_active = NUM_ACTIVE_SRCU_POLL_OLDSTATE,
|
||||
.call = srcu_torture_call,
|
||||
.cb_barrier = srcu_torture_barrier,
|
||||
.stats = srcu_torture_stats,
|
||||
@ -780,9 +784,12 @@ static struct rcu_torture_ops srcud_ops = {
|
||||
.deferred_free = srcu_torture_deferred_free,
|
||||
.sync = srcu_torture_synchronize,
|
||||
.exp_sync = srcu_torture_synchronize_expedited,
|
||||
.same_gp_state = same_state_synchronize_srcu,
|
||||
.get_comp_state = get_completed_synchronize_srcu,
|
||||
.get_gp_state = srcu_torture_get_gp_state,
|
||||
.start_gp_poll = srcu_torture_start_gp_poll,
|
||||
.poll_gp_state = srcu_torture_poll_gp_state,
|
||||
.poll_active = NUM_ACTIVE_SRCU_POLL_OLDSTATE,
|
||||
.call = srcu_torture_call,
|
||||
.cb_barrier = srcu_torture_barrier,
|
||||
.stats = srcu_torture_stats,
|
||||
@ -915,11 +922,6 @@ static struct rcu_torture_ops tasks_ops = {
|
||||
* Definitions for rude RCU-tasks torture testing.
|
||||
*/
|
||||
|
||||
static void rcu_tasks_rude_torture_deferred_free(struct rcu_torture *p)
|
||||
{
|
||||
call_rcu_tasks_rude(&p->rtort_rcu, rcu_torture_cb);
|
||||
}
|
||||
|
||||
static struct rcu_torture_ops tasks_rude_ops = {
|
||||
.ttype = RCU_TASKS_RUDE_FLAVOR,
|
||||
.init = rcu_sync_torture_init,
|
||||
@ -927,11 +929,8 @@ static struct rcu_torture_ops tasks_rude_ops = {
|
||||
.read_delay = rcu_read_delay, /* just reuse rcu's version. */
|
||||
.readunlock = rcu_torture_read_unlock_trivial,
|
||||
.get_gp_seq = rcu_no_completed,
|
||||
.deferred_free = rcu_tasks_rude_torture_deferred_free,
|
||||
.sync = synchronize_rcu_tasks_rude,
|
||||
.exp_sync = synchronize_rcu_tasks_rude,
|
||||
.call = call_rcu_tasks_rude,
|
||||
.cb_barrier = rcu_barrier_tasks_rude,
|
||||
.gp_kthread_dbg = show_rcu_tasks_rude_gp_kthread,
|
||||
.get_gp_data = rcu_tasks_rude_get_gp_data,
|
||||
.cbflood_max = 50000,
|
||||
@ -1318,6 +1317,7 @@ static void rcu_torture_write_types(void)
|
||||
} else if (gp_sync && !cur_ops->sync) {
|
||||
pr_alert("%s: gp_sync without primitives.\n", __func__);
|
||||
}
|
||||
pr_alert("%s: Testing %d update types.\n", __func__, nsynctypes);
|
||||
}
|
||||
|
||||
/*
|
||||
@@ -1374,17 +1374,20 @@ rcu_torture_writer(void *arg)
 	int i;
 	int idx;
 	int oldnice = task_nice(current);
-	struct rcu_gp_oldstate rgo[NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE];
+	struct rcu_gp_oldstate *rgo = NULL;
+	int rgo_size = 0;
 	struct rcu_torture *rp;
 	struct rcu_torture *old_rp;
 	static DEFINE_TORTURE_RANDOM(rand);
 	unsigned long stallsdone = jiffies;
 	bool stutter_waited;
-	unsigned long ulo[NUM_ACTIVE_RCU_POLL_OLDSTATE];
+	unsigned long *ulo = NULL;
+	int ulo_size = 0;
 
 	// If a new stall test is added, this must be adjusted.
 	if (stall_cpu_holdoff + stall_gp_kthread + stall_cpu)
-		stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) * HZ;
+		stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) *
+			      HZ * (stall_cpu_repeat + 1);
 	VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
 	if (!can_expedite)
 		pr_alert("%s" TORTURE_FLAG
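The hunk above scales the `stallsdone` window by `stall_cpu_repeat + 1`, so the writer allows time for the first stall plus each repeat. The arithmetic can be sketched as a stand-alone function. `sk_stall_budget()` and its `hz` parameter are hypothetical, with `hz` standing in for the kernel's `HZ`.

```c
/*
 * Jiffies to reserve for stall testing. Returns zero when no stall test
 * is requested, mirroring the guard in the hunk above.
 */
static unsigned long sk_stall_budget(int holdoff, int gp_kthread, int stall_cpu,
				     int repeat, unsigned long hz)
{
	if (holdoff + gp_kthread + stall_cpu == 0)
		return 0;
	return (unsigned long)(holdoff + gp_kthread + stall_cpu + 60) * hz * (repeat + 1);
}
```

For example, a 10-second holdoff, a 20-second CPU stall, and two repeats at 1000 ticks per second reserve (10 + 20 + 60) * 1000 * 3 = 270000 jiffies, three times the single-stall budget.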
@@ -1401,6 +1404,16 @@ rcu_torture_writer(void *arg)
 		torture_kthread_stopping("rcu_torture_writer");
 		return 0;
 	}
+	if (cur_ops->poll_active > 0) {
+		ulo = kzalloc(cur_ops->poll_active * sizeof(ulo[0]), GFP_KERNEL);
+		if (!WARN_ON(!ulo))
+			ulo_size = cur_ops->poll_active;
+	}
+	if (cur_ops->poll_active_full > 0) {
+		rgo = kzalloc(cur_ops->poll_active_full * sizeof(rgo[0]), GFP_KERNEL);
+		if (!WARN_ON(!rgo))
+			rgo_size = cur_ops->poll_active_full;
+	}
 
 	do {
 		rcu_torture_writer_state = RTWS_FIXED_DELAY;
@ -1437,8 +1450,8 @@ rcu_torture_writer(void *arg)
|
||||
rcu_torture_writer_state_getname(),
|
||||
rcu_torture_writer_state,
|
||||
cookie, cur_ops->get_gp_state());
|
||||
if (cur_ops->get_gp_completed) {
|
||||
cookie = cur_ops->get_gp_completed();
|
||||
if (cur_ops->get_comp_state) {
|
||||
cookie = cur_ops->get_comp_state();
|
||||
WARN_ON_ONCE(!cur_ops->poll_gp_state(cookie));
|
||||
}
|
||||
cur_ops->readunlock(idx);
|
||||
@ -1452,8 +1465,8 @@ rcu_torture_writer(void *arg)
|
||||
rcu_torture_writer_state_getname(),
|
||||
rcu_torture_writer_state,
|
||||
cpumask_pr_args(cpu_online_mask));
|
||||
if (cur_ops->get_gp_completed_full) {
|
||||
cur_ops->get_gp_completed_full(&cookie_full);
|
||||
if (cur_ops->get_comp_state_full) {
|
||||
cur_ops->get_comp_state_full(&cookie_full);
|
||||
WARN_ON_ONCE(!cur_ops->poll_gp_state_full(&cookie_full));
|
||||
}
|
||||
cur_ops->readunlock(idx);
|
||||
@ -1502,19 +1515,19 @@ rcu_torture_writer(void *arg)
|
||||
break;
|
||||
case RTWS_POLL_GET:
|
||||
rcu_torture_writer_state = RTWS_POLL_GET;
|
||||
for (i = 0; i < ARRAY_SIZE(ulo); i++)
|
||||
for (i = 0; i < ulo_size; i++)
|
||||
ulo[i] = cur_ops->get_comp_state();
|
||||
gp_snap = cur_ops->start_gp_poll();
|
||||
rcu_torture_writer_state = RTWS_POLL_WAIT;
|
||||
while (!cur_ops->poll_gp_state(gp_snap)) {
|
||||
gp_snap1 = cur_ops->get_gp_state();
|
||||
for (i = 0; i < ARRAY_SIZE(ulo); i++)
|
||||
for (i = 0; i < ulo_size; i++)
|
||||
if (cur_ops->poll_gp_state(ulo[i]) ||
|
||||
cur_ops->same_gp_state(ulo[i], gp_snap1)) {
|
||||
ulo[i] = gp_snap1;
|
||||
break;
|
||||
}
|
||||
WARN_ON_ONCE(i >= ARRAY_SIZE(ulo));
|
||||
WARN_ON_ONCE(ulo_size > 0 && i >= ulo_size);
|
||||
torture_hrtimeout_jiffies(torture_random(&rand) % 16,
|
||||
&rand);
|
||||
}
|
||||
@ -1522,20 +1535,20 @@ rcu_torture_writer(void *arg)
|
||||
break;
|
||||
case RTWS_POLL_GET_FULL:
|
||||
rcu_torture_writer_state = RTWS_POLL_GET_FULL;
|
||||
for (i = 0; i < ARRAY_SIZE(rgo); i++)
|
||||
for (i = 0; i < rgo_size; i++)
|
||||
cur_ops->get_comp_state_full(&rgo[i]);
|
||||
cur_ops->start_gp_poll_full(&gp_snap_full);
|
||||
rcu_torture_writer_state = RTWS_POLL_WAIT_FULL;
|
||||
while (!cur_ops->poll_gp_state_full(&gp_snap_full)) {
|
||||
cur_ops->get_gp_state_full(&gp_snap1_full);
|
||||
for (i = 0; i < ARRAY_SIZE(rgo); i++)
|
||||
for (i = 0; i < rgo_size; i++)
|
||||
if (cur_ops->poll_gp_state_full(&rgo[i]) ||
|
||||
cur_ops->same_gp_state_full(&rgo[i],
|
||||
&gp_snap1_full)) {
|
||||
rgo[i] = gp_snap1_full;
|
||||
break;
|
||||
}
|
||||
WARN_ON_ONCE(i >= ARRAY_SIZE(rgo));
|
||||
WARN_ON_ONCE(rgo_size > 0 && i >= rgo_size);
|
||||
torture_hrtimeout_jiffies(torture_random(&rand) % 16,
|
||||
&rand);
|
||||
}
|
||||
@ -1617,6 +1630,8 @@ rcu_torture_writer(void *arg)
|
||||
pr_alert("%s" TORTURE_FLAG
|
||||
" Dynamic grace-period expediting was disabled.\n",
|
||||
torture_type);
|
||||
kfree(ulo);
|
||||
kfree(rgo);
|
||||
rcu_torture_writer_state = RTWS_STOPPING;
|
||||
torture_kthread_stopping("rcu_torture_writer");
|
||||
return 0;
|
||||
@ -2370,7 +2385,7 @@ rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
|
||||
"test_boost=%d/%d test_boost_interval=%d "
|
||||
"test_boost_duration=%d shutdown_secs=%d "
|
||||
"stall_cpu=%d stall_cpu_holdoff=%d stall_cpu_irqsoff=%d "
|
||||
"stall_cpu_block=%d "
|
||||
"stall_cpu_block=%d stall_cpu_repeat=%d "
|
||||
"n_barrier_cbs=%d "
|
||||
"onoff_interval=%d onoff_holdoff=%d "
|
||||
"read_exit_delay=%d read_exit_burst=%d "
|
||||
@ -2382,7 +2397,7 @@ rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
|
||||
test_boost, cur_ops->can_boost,
|
||||
test_boost_interval, test_boost_duration, shutdown_secs,
|
||||
stall_cpu, stall_cpu_holdoff, stall_cpu_irqsoff,
|
||||
stall_cpu_block,
|
||||
stall_cpu_block, stall_cpu_repeat,
|
||||
n_barrier_cbs,
|
||||
onoff_interval, onoff_holdoff,
|
||||
read_exit_delay, read_exit_burst,
|
||||
@@ -2460,19 +2475,11 @@ static struct notifier_block rcu_torture_stall_block = {
  * induces a CPU stall for the time specified by stall_cpu. If a new
  * stall test is added, stallsdone in rcu_torture_writer() must be adjusted.
  */
-static int rcu_torture_stall(void *args)
+static void rcu_torture_stall_one(int rep, int irqsoff)
 {
 	int idx;
-	int ret;
 	unsigned long stop_at;
 
-	VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
-	if (rcu_cpu_stall_notifiers) {
-		ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block);
-		if (ret)
-			pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n",
-				__func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : "");
-	}
 	if (stall_cpu_holdoff > 0) {
 		VERBOSE_TOROUT_STRING("rcu_torture_stall begin holdoff");
 		schedule_timeout_interruptible(stall_cpu_holdoff * HZ);
@@ -2492,12 +2499,12 @@ static int rcu_torture_stall(void *args)
 		stop_at = ktime_get_seconds() + stall_cpu;
 		/* RCU CPU stall is expected behavior in following code. */
 		idx = cur_ops->readlock();
-		if (stall_cpu_irqsoff)
+		if (irqsoff)
 			local_irq_disable();
 		else if (!stall_cpu_block)
 			preempt_disable();
-		pr_alert("%s start on CPU %d.\n",
-			  __func__, raw_smp_processor_id());
+		pr_alert("%s start stall episode %d on CPU %d.\n",
+			  __func__, rep + 1, raw_smp_processor_id());
 		while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(), stop_at) &&
 		       !kthread_should_stop())
 			if (stall_cpu_block) {
@@ -2509,12 +2516,42 @@ static int rcu_torture_stall(void *args)
 			} else if (stall_no_softlockup) {
 				touch_softlockup_watchdog();
 			}
-		if (stall_cpu_irqsoff)
+		if (irqsoff)
 			local_irq_enable();
 		else if (!stall_cpu_block)
 			preempt_enable();
 		cur_ops->readunlock(idx);
 	}
+}
+
+/*
+ * CPU-stall kthread. Invokes rcu_torture_stall_one() once, and then as many
+ * additional times as specified by the stall_cpu_repeat module parameter.
+ * Note that stall_cpu_irqsoff is ignored on the second and subsequent
+ * stall.
+ */
+static int rcu_torture_stall(void *args)
+{
+	int i;
+	int repeat = stall_cpu_repeat;
+	int ret;
+
+	VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
+	if (repeat < 0) {
+		repeat = 0;
+		WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST));
+	}
+	if (rcu_cpu_stall_notifiers) {
+		ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block);
+		if (ret)
+			pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n",
+				__func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : "");
+	}
+	for (i = 0; i <= repeat; i++) {
+		if (kthread_should_stop())
+			break;
+		rcu_torture_stall_one(i, i == 0 ? stall_cpu_irqsoff : 0);
+	}
 	pr_alert("%s end.\n", __func__);
 	if (rcu_cpu_stall_notifiers && !ret) {
 		ret = rcu_stall_chain_notifier_unregister(&rcu_torture_stall_block);
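The control flow of the new rcu_torture_stall() loop can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct stall_episode` and `plan_stall_episodes()` are hypothetical names that do not exist in the kernel, and the sketch records what each episode would do rather than stalling anything.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one stall episode, for illustration only. */
struct stall_episode {
	int rep;      /* zero-based episode index */
	int irqsoff;  /* whether this episode would disable interrupts */
};

/*
 * Mirror the loop in rcu_torture_stall(): clamp a negative repeat count
 * to zero, plan 1 + repeat episodes, and honor the irqsoff request only
 * for the first episode.  Returns the number of entries written to out[].
 */
static size_t plan_stall_episodes(int repeat, int irqsoff,
				  struct stall_episode *out, size_t max)
{
	size_t n = 0;
	int i;

	assert(out != NULL);
	if (repeat < 0)
		repeat = 0;
	for (i = 0; i <= repeat && n < max; i++) {
		out[n].rep = i;
		out[n].irqsoff = (i == 0) ? irqsoff : 0;
		n++;
	}
	return n;
}
```

With repeat set to 2 and irqsoff set to 1, the sketch plans three episodes and only the first has irqsoff set, which matches the comment that stall_cpu_irqsoff is ignored on the second and subsequent stall.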
@@ -2680,7 +2717,7 @@ static unsigned long rcu_torture_fwd_prog_cbfree(struct rcu_fwd *rfp)
 		rcu_torture_fwd_prog_cond_resched(freed);
 		if (tick_nohz_full_enabled()) {
 			local_irq_save(flags);
-			rcu_momentary_dyntick_idle();
+			rcu_momentary_eqs();
 			local_irq_restore(flags);
 		}
 	}
@@ -2830,7 +2867,7 @@ static void rcu_torture_fwd_prog_cr(struct rcu_fwd *rfp)
 		rcu_torture_fwd_prog_cond_resched(n_launders + n_max_cbs);
 		if (tick_nohz_full_enabled()) {
 			local_irq_save(flags);
-			rcu_momentary_dyntick_idle();
+			rcu_momentary_eqs();
 			local_irq_restore(flags);
 		}
 	}

@@ -28,6 +28,7 @@
 #include <linux/rcupdate_trace.h>
 #include <linux/reboot.h>
 #include <linux/sched.h>
+#include <linux/seq_buf.h>
 #include <linux/spinlock.h>
 #include <linux/smp.h>
 #include <linux/stat.h>
@@ -134,7 +135,7 @@ struct ref_scale_ops {
 	const char *name;
 };
 
-static struct ref_scale_ops *cur_ops;
+static const struct ref_scale_ops *cur_ops;
 
 static void un_delay(const int udl, const int ndl)
 {
@@ -170,7 +171,7 @@ static bool rcu_sync_scale_init(void)
 	return true;
 }
 
-static struct ref_scale_ops rcu_ops = {
+static const struct ref_scale_ops rcu_ops = {
 	.init = rcu_sync_scale_init,
 	.readsection = ref_rcu_read_section,
 	.delaysection = ref_rcu_delay_section,
@@ -204,7 +205,7 @@ static void srcu_ref_scale_delay_section(const int nloops, const int udl, const
 	}
 }
 
-static struct ref_scale_ops srcu_ops = {
+static const struct ref_scale_ops srcu_ops = {
 	.init = rcu_sync_scale_init,
 	.readsection = srcu_ref_scale_read_section,
 	.delaysection = srcu_ref_scale_delay_section,
@@ -231,7 +232,7 @@ static void rcu_tasks_ref_scale_delay_section(const int nloops, const int udl, c
 		un_delay(udl, ndl);
 }
 
-static struct ref_scale_ops rcu_tasks_ops = {
+static const struct ref_scale_ops rcu_tasks_ops = {
 	.init = rcu_sync_scale_init,
 	.readsection = rcu_tasks_ref_scale_read_section,
 	.delaysection = rcu_tasks_ref_scale_delay_section,
@@ -270,7 +271,7 @@ static void rcu_trace_ref_scale_delay_section(const int nloops, const int udl, c
 	}
 }
 
-static struct ref_scale_ops rcu_trace_ops = {
+static const struct ref_scale_ops rcu_trace_ops = {
 	.init = rcu_sync_scale_init,
 	.readsection = rcu_trace_ref_scale_read_section,
 	.delaysection = rcu_trace_ref_scale_delay_section,
@@ -309,7 +310,7 @@ static void ref_refcnt_delay_section(const int nloops, const int udl, const int
 	}
 }
 
-static struct ref_scale_ops refcnt_ops = {
+static const struct ref_scale_ops refcnt_ops = {
 	.init = rcu_sync_scale_init,
 	.readsection = ref_refcnt_section,
 	.delaysection = ref_refcnt_delay_section,
@@ -346,7 +347,7 @@ static void ref_rwlock_delay_section(const int nloops, const int udl, const int
 	}
 }
 
-static struct ref_scale_ops rwlock_ops = {
+static const struct ref_scale_ops rwlock_ops = {
 	.init = ref_rwlock_init,
 	.readsection = ref_rwlock_section,
 	.delaysection = ref_rwlock_delay_section,
@@ -383,7 +384,7 @@ static void ref_rwsem_delay_section(const int nloops, const int udl, const int n
 	}
 }
 
-static struct ref_scale_ops rwsem_ops = {
+static const struct ref_scale_ops rwsem_ops = {
 	.init = ref_rwsem_init,
 	.readsection = ref_rwsem_section,
 	.delaysection = ref_rwsem_delay_section,
@@ -418,7 +419,7 @@ static void ref_lock_delay_section(const int nloops, const int udl, const int nd
 	preempt_enable();
 }
 
-static struct ref_scale_ops lock_ops = {
+static const struct ref_scale_ops lock_ops = {
 	.readsection = ref_lock_section,
 	.delaysection = ref_lock_delay_section,
 	.name = "lock"
@@ -453,7 +454,7 @@ static void ref_lock_irq_delay_section(const int nloops, const int udl, const in
 	preempt_enable();
 }
 
-static struct ref_scale_ops lock_irq_ops = {
+static const struct ref_scale_ops lock_irq_ops = {
 	.readsection = ref_lock_irq_section,
 	.delaysection = ref_lock_irq_delay_section,
 	.name = "lock-irq"
@@ -489,7 +490,7 @@ static void ref_acqrel_delay_section(const int nloops, const int udl, const int
 	preempt_enable();
 }
 
-static struct ref_scale_ops acqrel_ops = {
+static const struct ref_scale_ops acqrel_ops = {
 	.readsection = ref_acqrel_section,
 	.delaysection = ref_acqrel_delay_section,
 	.name = "acqrel"
@@ -523,7 +524,7 @@ static void ref_clock_delay_section(const int nloops, const int udl, const int n
 	stopopts = x;
 }
 
-static struct ref_scale_ops clock_ops = {
+static const struct ref_scale_ops clock_ops = {
 	.readsection = ref_clock_section,
 	.delaysection = ref_clock_delay_section,
 	.name = "clock"
@@ -555,7 +556,7 @@ static void ref_jiffies_delay_section(const int nloops, const int udl, const int
 	stopopts = x;
 }
 
-static struct ref_scale_ops jiffies_ops = {
+static const struct ref_scale_ops jiffies_ops = {
 	.readsection = ref_jiffies_section,
 	.delaysection = ref_jiffies_delay_section,
 	.name = "jiffies"
@@ -705,9 +706,9 @@ static void refscale_typesafe_ctor(void *rtsp_in)
 	preempt_enable();
 }
 
-static struct ref_scale_ops typesafe_ref_ops;
-static struct ref_scale_ops typesafe_lock_ops;
-static struct ref_scale_ops typesafe_seqlock_ops;
+static const struct ref_scale_ops typesafe_ref_ops;
+static const struct ref_scale_ops typesafe_lock_ops;
+static const struct ref_scale_ops typesafe_seqlock_ops;
 
 // Initialize for a typesafe test.
 static bool typesafe_init(void)
@@ -768,7 +769,7 @@ static void typesafe_cleanup(void)
 }
 
 // The typesafe_init() function distinguishes these structures by address.
-static struct ref_scale_ops typesafe_ref_ops = {
+static const struct ref_scale_ops typesafe_ref_ops = {
 	.init = typesafe_init,
 	.cleanup = typesafe_cleanup,
 	.readsection = typesafe_read_section,
@@ -776,7 +777,7 @@ static struct ref_scale_ops typesafe_ref_ops = {
 	.name = "typesafe_ref"
 };
 
-static struct ref_scale_ops typesafe_lock_ops = {
+static const struct ref_scale_ops typesafe_lock_ops = {
 	.init = typesafe_init,
 	.cleanup = typesafe_cleanup,
 	.readsection = typesafe_read_section,
@@ -784,7 +785,7 @@ static struct ref_scale_ops typesafe_lock_ops = {
 	.name = "typesafe_lock"
 };
 
-static struct ref_scale_ops typesafe_seqlock_ops = {
+static const struct ref_scale_ops typesafe_seqlock_ops = {
 	.init = typesafe_init,
 	.cleanup = typesafe_cleanup,
 	.readsection = typesafe_read_section,
@@ -891,32 +892,34 @@ static u64 process_durations(int n)
 {
 	int i;
 	struct reader_task *rt;
-	char buf1[64];
+	struct seq_buf s;
 	char *buf;
 	u64 sum = 0;
 
 	buf = kmalloc(800 + 64, GFP_KERNEL);
 	if (!buf)
 		return 0;
-	buf[0] = 0;
-	sprintf(buf, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
-		exp_idx);
+	seq_buf_init(&s, buf, 800 + 64);
+
+	seq_buf_printf(&s, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
+		       exp_idx);
 
 	for (i = 0; i < n && !torture_must_stop(); i++) {
 		rt = &(reader_tasks[i]);
-		sprintf(buf1, "%d: %llu\t", i, rt->last_duration_ns);
 
 		if (i % 5 == 0)
-			strcat(buf, "\n");
-		if (strlen(buf) >= 800) {
-			pr_alert("%s", buf);
-			buf[0] = 0;
+			seq_buf_putc(&s, '\n');
+
+		if (seq_buf_used(&s) >= 800) {
+			pr_alert("%s", seq_buf_str(&s));
+			seq_buf_clear(&s);
 		}
-		strcat(buf, buf1);
+
+		seq_buf_printf(&s, "%d: %llu\t", i, rt->last_duration_ns);
 
 		sum += rt->last_duration_ns;
 	}
-	pr_alert("%s\n", buf);
+	pr_alert("%s\n", seq_buf_str(&s));
 
 	kfree(buf);
 	return sum;
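The process_durations() change replaces repeated strlen()/strcat() scans with a buffer that tracks its own length. A minimal userspace sketch of that pattern follows. It assumes nothing about the kernel's seq_buf internals: `struct mini_buf`, `mini_buf_init()`, `mini_buf_printf()`, and `format_durations()` are hypothetical stand-ins, and a flush is merely counted where the kernel code would call pr_alert().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct seq_buf. */
struct mini_buf {
	char *data;
	size_t size;   /* capacity, including the terminating NUL */
	size_t len;    /* bytes used, excluding the terminating NUL */
};

static void mini_buf_init(struct mini_buf *b, char *data, size_t size)
{
	assert(size > 0);
	b->data = data;
	b->size = size;
	b->len = 0;
	b->data[0] = '\0';
}

/* Append formatted text, truncating instead of overrunning the buffer. */
static void mini_buf_printf(struct mini_buf *b, const char *fmt, ...)
{
	va_list ap;
	int n;
	size_t room = b->size - b->len;   /* always at least 1 */

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, room, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	b->len += ((size_t)n < room) ? (size_t)n : room - 1;
}

/*
 * Format n durations, counting a flush whenever the used length reaches
 * the threshold.  Returns the number of flushes, including the final one.
 */
static int format_durations(const unsigned long long *ns, int n,
			    char *storage, size_t size, size_t threshold)
{
	struct mini_buf b;
	int flushes = 0;
	int i;

	mini_buf_init(&b, storage, size);
	for (i = 0; i < n; i++) {
		if (i % 5 == 0)
			mini_buf_printf(&b, "\n");
		if (b.len >= threshold) {
			flushes++;   /* the kernel code prints the buffer here */
			mini_buf_init(&b, storage, size);
		}
		mini_buf_printf(&b, "%d: %llu\t", i, ns[i]);
	}
	return flushes + 1;   /* final print of whatever remains */
}
```

Because the used length is kept in the structure, each append and each threshold check is constant-time bookkeeping rather than a rescan of the accumulated string, which is presumably the optimization the pull request summary refers to.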
@@ -1023,7 +1026,7 @@ end:
 }
 
 static void
-ref_scale_print_module_parms(struct ref_scale_ops *cur_ops, const char *tag)
+ref_scale_print_module_parms(const struct ref_scale_ops *cur_ops, const char *tag)
 {
 	pr_alert("%s" SCALE_FLAG
 		 "--- %s: verbose=%d verbose_batched=%d shutdown=%d holdoff=%d lookup_instances=%ld loops=%ld nreaders=%d nruns=%d readdelay=%d\n", scale_type, tag,
@@ -1078,7 +1081,7 @@ ref_scale_init(void)
 {
 	long i;
 	int firsterr = 0;
-	static struct ref_scale_ops *scale_ops[] = {
+	static const struct ref_scale_ops *scale_ops[] = {
 		&rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops,
 		&rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops, &jiffies_ops,
 		&typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops,

@@ -137,6 +137,7 @@ static void init_srcu_struct_data(struct srcu_struct *ssp)
 		sdp->srcu_cblist_invoking = false;
 		sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq;
 		sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq;
+		sdp->srcu_barrier_head.next = &sdp->srcu_barrier_head;
 		sdp->mynode = NULL;
 		sdp->cpu = cpu;
 		INIT_WORK(&sdp->work, srcu_invoke_callbacks);
@@ -247,7 +248,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
-	ssp->srcu_sup->srcu_gp_seq = 0;
+	ssp->srcu_sup->srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL;
 	ssp->srcu_sup->srcu_barrier_seq = 0;
 	mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
@@ -258,7 +259,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	if (!ssp->sda)
 		goto err_free_sup;
 	init_srcu_struct_data(ssp);
-	ssp->srcu_sup->srcu_gp_seq_needed_exp = 0;
+	ssp->srcu_sup->srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL;
 	ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC))
@@ -266,7 +267,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
 	}
 	ssp->srcu_sup->srcu_ssp = ssp;
-	smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
+	smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed,
+			SRCU_GP_SEQ_INITIAL_VAL); /* Init done. */
 	return 0;
 
 err_free_sda:
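Starting the SRCU sequence counters at a nonzero value makes counter wrap-around happen sooner, so code that compares sequence numbers needs wrap-tolerant comparisons. The sketch below illustrates why. The definition of SRCU_GP_SEQ_INITIAL_VAL is not shown in this diff, so `DEMO_SEQ_INITIAL_VAL` is a made-up stand-in (100 steps before wrap), and `demo_seq_wrapped()` is a hypothetical helper. The two comparison macros follow the form of the kernel's ULONG_CMP_GE() and ULONG_CMP_LT().

```c
#include <assert.h>
#include <limits.h>

/* Wrap-tolerant comparisons on unsigned long sequence numbers. */
#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))

/* Made-up initial value for this sketch only: 100 steps before wrap. */
#define DEMO_SEQ_INITIAL_VAL	(0UL - 100UL)

/*
 * Advance a sequence counter by steps and report whether the raw value
 * wrapped past zero.  The wrap-tolerant comparison still orders the end
 * value at or after the start value.
 */
static int demo_seq_wrapped(unsigned long start, unsigned long steps)
{
	unsigned long end = start + steps;   /* unsigned overflow is well defined */

	assert(steps <= ULONG_MAX / 2);
	assert(ULONG_CMP_GE(end, start));
	return end < start;
}
```

A plain `<` comparison misorders the two values once the counter wraps, whereas the subtraction-based macros keep working as long as the two values are less than half the counter range apart.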
@@ -628,6 +630,7 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 		if (time_after(j, gpstart))
 			jbase += j - gpstart;
 		if (!jbase) {
+			ASSERT_EXCLUSIVE_WRITER(sup->srcu_n_exp_nodelay);
 			WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1);
 			if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
 				jbase = 1;
@@ -1560,6 +1563,7 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
 	struct srcu_data *sdp;
 	struct srcu_struct *ssp;
 
+	rhp->next = rhp; // Mark the callback as having been invoked.
 	sdp = container_of(rhp, struct srcu_data, srcu_barrier_head);
 	ssp = sdp->ssp;
 	if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
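The self-pointing `->next` convention used to mark idle barrier callbacks can be sketched in userspace as follows. All names here are hypothetical (`struct demo_head` stands in for struct rcu_head), and the body of `demo_cb_is_done()` is an assumption about what a check such as rcu_barrier_cb_is_done() would need to test, since that helper's definition does not appear in this part of the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct rcu_head. */
struct demo_head {
	struct demo_head *next;
	void (*func)(struct demo_head *);
};

/* Initialization: a self-pointing ->next means "idle or already invoked". */
static void demo_head_init_idle(struct demo_head *h)
{
	assert(h != NULL);
	h->next = h;
	h->func = NULL;
}

/* Queuing overwrites ->next, here with NULL as a one-element list tail. */
static void demo_head_queue(struct demo_head *h, void (*func)(struct demo_head *))
{
	h->func = func;
	h->next = NULL;
}

/* Callback body: re-mark the head as idle before doing anything else. */
static void demo_barrier_cb(struct demo_head *h)
{
	h->next = h;
}

/* Assumed shape of the "is done" check: the head points to itself. */
static int demo_cb_is_done(const struct demo_head *h)
{
	return h->next == h;
}
```

A diagnostic pass can then walk the per-CPU heads and report every one that does not point to itself as a barrier callback that is still pending.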
@@ -1818,6 +1822,7 @@ static void process_srcu(struct work_struct *work)
 	} else {
 		j = jiffies;
 		if (READ_ONCE(sup->reschedule_jiffies) == j) {
+			ASSERT_EXCLUSIVE_WRITER(sup->reschedule_count);
 			WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1);
 			if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay)
 				curdelay = 1;

@@ -34,6 +34,7 @@ typedef void (*postgp_func_t)(struct rcu_tasks *rtp);
  * @rtp_blkd_tasks: List of tasks blocked as readers.
  * @rtp_exit_list: List of tasks in the latter portion of do_exit().
  * @cpu: CPU number corresponding to this entry.
+ * @index: Index of this CPU in rtpcp_array of the rcu_tasks structure.
  * @rtpp: Pointer to the rcu_tasks structure.
  */
 struct rcu_tasks_percpu {
@@ -49,6 +50,7 @@ struct rcu_tasks_percpu {
 	struct list_head rtp_blkd_tasks;
 	struct list_head rtp_exit_list;
 	int cpu;
+	int index;
 	struct rcu_tasks *rtpp;
 };
 
@@ -63,7 +65,7 @@ struct rcu_tasks_percpu {
  * @init_fract: Initial backoff sleep interval.
  * @gp_jiffies: Time of last @gp_state transition.
  * @gp_start: Most recent grace-period start in jiffies.
- * @tasks_gp_seq: Number of grace periods completed since boot.
+ * @tasks_gp_seq: Number of grace periods completed since boot in upper bits.
  * @n_ipis: Number of IPIs sent to encourage grace periods to end.
  * @n_ipis_fails: Number of IPI-send failures.
  * @kthread_ptr: This flavor's grace-period/callback-invocation kthread.
@@ -76,6 +78,7 @@ struct rcu_tasks_percpu {
  * @call_func: This flavor's call_rcu()-equivalent function.
  * @wait_state: Task state for synchronous grace-period waits (default TASK_UNINTERRUPTIBLE).
  * @rtpcpu: This flavor's rcu_tasks_percpu structure.
+ * @rtpcp_array: Array of pointers to rcu_tasks_percpu structure of CPUs in cpu_possible_mask.
  * @percpu_enqueue_shift: Shift down CPU ID this much when enqueuing callbacks.
  * @percpu_enqueue_lim: Number of per-CPU callback queues in use for enqueuing.
  * @percpu_dequeue_lim: Number of per-CPU callback queues in use for dequeuing.
@@ -84,6 +87,7 @@ struct rcu_tasks_percpu {
  * @barrier_q_count: Number of queues being waited on.
  * @barrier_q_completion: Barrier wait/wakeup mechanism.
  * @barrier_q_seq: Sequence number for barrier operations.
+ * @barrier_q_start: Most recent barrier start in jiffies.
  * @name: This flavor's textual name.
  * @kname: This flavor's kthread name.
  */
@@ -110,6 +114,7 @@ struct rcu_tasks {
 	call_rcu_func_t call_func;
 	unsigned int wait_state;
 	struct rcu_tasks_percpu __percpu *rtpcpu;
+	struct rcu_tasks_percpu **rtpcp_array;
 	int percpu_enqueue_shift;
 	int percpu_enqueue_lim;
 	int percpu_dequeue_lim;
@@ -118,6 +123,7 @@ struct rcu_tasks {
 	atomic_t barrier_q_count;
 	struct completion barrier_q_completion;
 	unsigned long barrier_q_seq;
+	unsigned long barrier_q_start;
 	char *name;
 	char *kname;
 };
@@ -182,6 +188,8 @@ module_param(rcu_task_collapse_lim, int, 0444);
 static int rcu_task_lazy_lim __read_mostly = 32;
 module_param(rcu_task_lazy_lim, int, 0444);
 
+static int rcu_task_cpu_ids;
+
 /* RCU tasks grace-period state for debugging. */
 #define RTGS_INIT 0
 #define RTGS_WAIT_WAIT_CBS 1
@@ -245,6 +253,8 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
 	int cpu;
 	int lim;
 	int shift;
+	int maxcpu;
+	int index = 0;
 
 	if (rcu_task_enqueue_lim < 0) {
 		rcu_task_enqueue_lim = 1;
@@ -254,14 +264,9 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
 	}
 	lim = rcu_task_enqueue_lim;
 
-	if (lim > nr_cpu_ids)
-		lim = nr_cpu_ids;
-	shift = ilog2(nr_cpu_ids / lim);
-	if (((nr_cpu_ids - 1) >> shift) >= lim)
-		shift++;
-	WRITE_ONCE(rtp->percpu_enqueue_shift, shift);
-	WRITE_ONCE(rtp->percpu_dequeue_lim, lim);
-	smp_store_release(&rtp->percpu_enqueue_lim, lim);
+	rtp->rtpcp_array = kcalloc(num_possible_cpus(), sizeof(struct rcu_tasks_percpu *), GFP_KERNEL);
+	BUG_ON(!rtp->rtpcp_array);
+
 	for_each_possible_cpu(cpu) {
 		struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
 
@@ -273,14 +278,30 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
 		INIT_WORK(&rtpcp->rtp_work, rcu_tasks_invoke_cbs_wq);
 		rtpcp->cpu = cpu;
 		rtpcp->rtpp = rtp;
+		rtpcp->index = index;
+		rtp->rtpcp_array[index] = rtpcp;
+		index++;
 		if (!rtpcp->rtp_blkd_tasks.next)
 			INIT_LIST_HEAD(&rtpcp->rtp_blkd_tasks);
 		if (!rtpcp->rtp_exit_list.next)
 			INIT_LIST_HEAD(&rtpcp->rtp_exit_list);
+		rtpcp->barrier_q_head.next = &rtpcp->barrier_q_head;
+		maxcpu = cpu;
 	}
 
-	pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d.\n", rtp->name,
-			data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim), rcu_task_cb_adjust);
+	rcu_task_cpu_ids = maxcpu + 1;
+	if (lim > rcu_task_cpu_ids)
+		lim = rcu_task_cpu_ids;
+	shift = ilog2(rcu_task_cpu_ids / lim);
+	if (((rcu_task_cpu_ids - 1) >> shift) >= lim)
+		shift++;
+	WRITE_ONCE(rtp->percpu_enqueue_shift, shift);
+	WRITE_ONCE(rtp->percpu_dequeue_lim, lim);
+	smp_store_release(&rtp->percpu_enqueue_lim, lim);
+
+	pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d rcu_task_cpu_ids=%d.\n",
+			rtp->name, data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim),
+			rcu_task_cb_adjust, rcu_task_cpu_ids);
 }
 
 // Compute wakeup time for lazy callback timer.
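The shift and limit arithmetic that cblist_init_generic() now performs against rcu_task_cpu_ids can be checked in isolation. The following userspace sketch uses hypothetical helper names (`demo_ilog2()` stands in for the kernel's ilog2(), and `demo_enqueue_shift()` is not a kernel function). It reproduces only the arithmetic, not the memory-ordering stores.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's ilog2(): floor(log2(v)) for v > 0. */
static int demo_ilog2(unsigned int v)
{
	int r = -1;

	assert(v > 0);
	while (v) {
		v >>= 1;
		r++;
	}
	return r;
}

/*
 * Clamp *lim to cpu_ids and return a shift such that
 * (cpu >> shift) < *lim for every cpu in [0, cpu_ids).
 */
static int demo_enqueue_shift(int cpu_ids, int *lim)
{
	int shift;

	assert(cpu_ids > 0 && *lim > 0);
	if (*lim > cpu_ids)
		*lim = cpu_ids;
	shift = demo_ilog2((unsigned int)(cpu_ids / *lim));
	if (((cpu_ids - 1) >> shift) >= *lim)
		shift++;
	return shift;
}
```

For example, with six CPU IDs and a limit of four queues, the initial shift of 0 would map CPU 5 to queue 5, so the shift is bumped to 1 and the CPUs fold onto queues 0 through 2.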
@@ -339,6 +360,7 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
 	rcu_read_lock();
 	ideal_cpu = smp_processor_id() >> READ_ONCE(rtp->percpu_enqueue_shift);
 	chosen_cpu = cpumask_next(ideal_cpu - 1, cpu_possible_mask);
+	WARN_ON_ONCE(chosen_cpu >= rcu_task_cpu_ids);
 	rtpcp = per_cpu_ptr(rtp->rtpcpu, chosen_cpu);
 	if (!raw_spin_trylock_rcu_node(rtpcp)) { // irqs already disabled.
 		raw_spin_lock_rcu_node(rtpcp); // irqs already disabled.
@@ -348,7 +370,7 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
 			rtpcp->rtp_n_lock_retries = 0;
 		}
 		if (rcu_task_cb_adjust && ++rtpcp->rtp_n_lock_retries > rcu_task_contend_lim &&
-		    READ_ONCE(rtp->percpu_enqueue_lim) != nr_cpu_ids)
+		    READ_ONCE(rtp->percpu_enqueue_lim) != rcu_task_cpu_ids)
 			needadjust = true; // Defer adjustment to avoid deadlock.
 	}
 	// Queuing callbacks before initialization not yet supported.
@@ -368,10 +390,10 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
 	raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
 	if (unlikely(needadjust)) {
 		raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
-		if (rtp->percpu_enqueue_lim != nr_cpu_ids) {
+		if (rtp->percpu_enqueue_lim != rcu_task_cpu_ids) {
 			WRITE_ONCE(rtp->percpu_enqueue_shift, 0);
-			WRITE_ONCE(rtp->percpu_dequeue_lim, nr_cpu_ids);
-			smp_store_release(&rtp->percpu_enqueue_lim, nr_cpu_ids);
+			WRITE_ONCE(rtp->percpu_dequeue_lim, rcu_task_cpu_ids);
+			smp_store_release(&rtp->percpu_enqueue_lim, rcu_task_cpu_ids);
 			pr_info("Switching %s to per-CPU callback queuing.\n", rtp->name);
 		}
 		raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
@@ -388,6 +410,7 @@ static void rcu_barrier_tasks_generic_cb(struct rcu_head *rhp)
 	struct rcu_tasks *rtp;
 	struct rcu_tasks_percpu *rtpcp;
 
+	rhp->next = rhp; // Mark the callback as having been invoked.
 	rtpcp = container_of(rhp, struct rcu_tasks_percpu, barrier_q_head);
 	rtp = rtpcp->rtpp;
 	if (atomic_dec_and_test(&rtp->barrier_q_count))
@@ -396,7 +419,7 @@ static void rcu_barrier_tasks_generic_cb(struct rcu_head *rhp)
 
 // Wait for all in-flight callbacks for the specified RCU Tasks flavor.
 // Operates in a manner similar to rcu_barrier().
-static void rcu_barrier_tasks_generic(struct rcu_tasks *rtp)
+static void __maybe_unused rcu_barrier_tasks_generic(struct rcu_tasks *rtp)
 {
 	int cpu;
 	unsigned long flags;
@@ -409,6 +432,7 @@ static void rcu_barrier_tasks_generic(struct rcu_tasks *rtp)
 		mutex_unlock(&rtp->barrier_q_mutex);
 		return;
 	}
+	rtp->barrier_q_start = jiffies;
 	rcu_seq_start(&rtp->barrier_q_seq);
 	init_completion(&rtp->barrier_q_completion);
 	atomic_set(&rtp->barrier_q_count, 2);
@@ -444,6 +468,8 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
 
 	dequeue_limit = smp_load_acquire(&rtp->percpu_dequeue_lim);
 	for (cpu = 0; cpu < dequeue_limit; cpu++) {
+		if (!cpu_possible(cpu))
+			continue;
 		struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
 
 		/* Advance and accelerate any new callbacks. */
@@ -481,7 +507,7 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
 	if (rcu_task_cb_adjust && ncbs <= rcu_task_collapse_lim) {
 		raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
 		if (rtp->percpu_enqueue_lim > 1) {
-			WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids));
+			WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(rcu_task_cpu_ids));
 			smp_store_release(&rtp->percpu_enqueue_lim, 1);
 			rtp->percpu_dequeue_gpseq = get_state_synchronize_rcu();
 			gpdone = false;
@@ -496,7 +522,9 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
 			pr_info("Completing switch %s to CPU-0 callback queuing.\n", rtp->name);
 		}
 		if (rtp->percpu_dequeue_lim == 1) {
-			for (cpu = rtp->percpu_dequeue_lim; cpu < nr_cpu_ids; cpu++) {
+			for (cpu = rtp->percpu_dequeue_lim; cpu < rcu_task_cpu_ids; cpu++) {
+				if (!cpu_possible(cpu))
+					continue;
 				struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
 
 				WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist));
@@ -511,30 +539,32 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
 // Advance callbacks and invoke any that are ready.
 static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu *rtpcp)
 {
-	int cpu;
-	int cpunext;
 	int cpuwq;
 	unsigned long flags;
 	int len;
+	int index;
 	struct rcu_head *rhp;
 	struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
 	struct rcu_tasks_percpu *rtpcp_next;
 
-	cpu = rtpcp->cpu;
-	cpunext = cpu * 2 + 1;
-	if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
-		rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext);
-		cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND;
-		queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
-		cpunext++;
-		if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
-			rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext);
-			cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND;
+	index = rtpcp->index * 2 + 1;
+	if (index < num_possible_cpus()) {
+		rtpcp_next = rtp->rtpcp_array[index];
+		if (rtpcp_next->cpu < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
+			cpuwq = rcu_cpu_beenfullyonline(rtpcp_next->cpu) ? rtpcp_next->cpu : WORK_CPU_UNBOUND;
 			queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
+			index++;
+			if (index < num_possible_cpus()) {
+				rtpcp_next = rtp->rtpcp_array[index];
+				if (rtpcp_next->cpu < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
+					cpuwq = rcu_cpu_beenfullyonline(rtpcp_next->cpu) ? rtpcp_next->cpu : WORK_CPU_UNBOUND;
+					queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
+				}
+			}
 		}
 	}
 
-	if (rcu_segcblist_empty(&rtpcp->cblist) || !cpu_possible(cpu))
+	if (rcu_segcblist_empty(&rtpcp->cblist))
 		return;
 	raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
 	rcu_segcblist_advance(&rtpcp->cblist, rcu_seq_current(&rtp->tasks_gp_seq));
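The rewritten fan-out in rcu_tasks_invoke_cbs() walks an implicit binary tree over dense array indexes rather than over raw CPU numbers, so gaps in cpu_possible_mask no longer lead to per-CPU regions that do not exist. A userspace sketch of that lookup follows. It is illustrative only: `demo_children()` and the `cpu_of[]` array (standing in for rtpcp_array[i]->cpu) are hypothetical, and the sketch collects CPU numbers where the kernel code queues work.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given the dense index of one entry, collect the CPU numbers of its up
 * to two children in the implicit binary tree (index * 2 + 1 and
 * index * 2 + 2).  As in the patch, the second child is considered only
 * if the first one was accepted, and a child is accepted only if its CPU
 * number is below dequeue_lim.  Returns the number written to out[].
 */
static size_t demo_children(const int *cpu_of, size_t n_possible,
			    size_t index, int dequeue_lim, int out[2])
{
	size_t n = 0;
	size_t child = index * 2 + 1;

	assert(cpu_of != NULL && out != NULL);
	if (child < n_possible && cpu_of[child] < dequeue_lim) {
		out[n++] = cpu_of[child];
		child++;
		if (child < n_possible && cpu_of[child] < dequeue_lim)
			out[n++] = cpu_of[child];
	}
	return n;
}
```

With possible CPUs 0, 2, 4, and 6, entry 0 fans out to CPUs 2 and 4 and entry 1 fans out to CPU 6, whereas the old arithmetic on raw CPU numbers would have computed children 1 and 2 for CPU 0 and touched the nonexistent CPU 1.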
@@ -687,9 +717,7 @@ static void __init rcu_tasks_bootup_oddness(void)
 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
 }
 
-#endif /* #ifndef CONFIG_TINY_RCU */
 
-#ifndef CONFIG_TINY_RCU
 /* Dump out rcutorture-relevant state common to all RCU-tasks flavors. */
 static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
 {
@@ -723,6 +751,53 @@ static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
 		rtp->lazy_jiffies,
 		s);
 }
+
+/* Dump out more rcutorture-relevant state common to all RCU-tasks flavors. */
+static void rcu_tasks_torture_stats_print_generic(struct rcu_tasks *rtp, char *tt,
+						  char *tf, char *tst)
+{
+	cpumask_var_t cm;
+	int cpu;
+	bool gotcb = false;
+	unsigned long j = jiffies;
+
+	pr_alert("%s%s Tasks%s RCU g%ld gp_start %lu gp_jiffies %lu gp_state %d (%s).\n",
+		 tt, tf, tst, data_race(rtp->tasks_gp_seq),
+		 j - data_race(rtp->gp_start), j - data_race(rtp->gp_jiffies),
+		 data_race(rtp->gp_state), tasks_gp_state_getname(rtp));
+	pr_alert("\tEnqueue shift %d limit %d Dequeue limit %d gpseq %lu.\n",
+		 data_race(rtp->percpu_enqueue_shift),
+		 data_race(rtp->percpu_enqueue_lim),
+		 data_race(rtp->percpu_dequeue_lim),
+		 data_race(rtp->percpu_dequeue_gpseq));
+	(void)zalloc_cpumask_var(&cm, GFP_KERNEL);
+	pr_alert("\tCallback counts:");
+	for_each_possible_cpu(cpu) {
+		long n;
+		struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
+
+		if (cpumask_available(cm) && !rcu_barrier_cb_is_done(&rtpcp->barrier_q_head))
+			cpumask_set_cpu(cpu, cm);
+		n = rcu_segcblist_n_cbs(&rtpcp->cblist);
+		if (!n)
+			continue;
+		pr_cont(" %d:%ld", cpu, n);
+		gotcb = true;
+	}
+	if (gotcb)
+		pr_cont(".\n");
+	else
+		pr_cont(" (none).\n");
+	pr_alert("\tBarrier seq %lu start %lu count %d holdout CPUs ",
+		 data_race(rtp->barrier_q_seq), j - data_race(rtp->barrier_q_start),
+		 atomic_read(&rtp->barrier_q_count));
+	if (cpumask_available(cm) && !cpumask_empty(cm))
+		pr_cont(" %*pbl.\n", cpumask_pr_args(cm));
+	else
+		pr_cont("(none).\n");
+	free_cpumask_var(cm);
+}
+
 #endif // #ifndef CONFIG_TINY_RCU
 
 static void exit_tasks_rcu_finish_trace(struct task_struct *t);
@@ -1174,6 +1249,12 @@ void show_rcu_tasks_classic_gp_kthread(void)
 	show_rcu_tasks_generic_gp_kthread(&rcu_tasks, "");
 }
 EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread);
+
+void rcu_tasks_torture_stats_print(char *tt, char *tf)
+{
+	rcu_tasks_torture_stats_print_generic(&rcu_tasks, tt, tf, "");
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_torture_stats_print);
 #endif // !defined(CONFIG_TINY_RCU)
 
 struct task_struct *get_rcu_tasks_gp_kthread(void)
@ -1244,13 +1325,12 @@ void exit_tasks_rcu_finish(void) { exit_tasks_rcu_finish_trace(current); }
|
||||
|
||||
////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// "Rude" variant of Tasks RCU, inspired by Steve Rostedt's trick of
|
||||
// passing an empty function to schedule_on_each_cpu(). This approach
|
||||
// provides an asynchronous call_rcu_tasks_rude() API and batching of
|
||||
// concurrent calls to the synchronous synchronize_rcu_tasks_rude() API.
|
||||
// This invokes schedule_on_each_cpu() in order to send IPIs far and wide
|
||||
// and induces otherwise unnecessary context switches on all online CPUs,
|
||||
// whether idle or not.
|
||||
// "Rude" variant of Tasks RCU, inspired by Steve Rostedt's
|
||||
// trick of passing an empty function to schedule_on_each_cpu().
|
||||
// This approach provides batching of concurrent calls to the synchronous
|
||||
// synchronize_rcu_tasks_rude() API. This invokes schedule_on_each_cpu()
|
||||
// in order to send IPIs far and wide and induces otherwise unnecessary
|
||||
// context switches on all online CPUs, whether idle or not.
|
||||
//
|
||||
// Callback handling is provided by the rcu_tasks_kthread() function.
|
||||
//
|
||||
@@ -1268,11 +1348,11 @@ static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
 	schedule_on_each_cpu(rcu_tasks_be_rude);
 }
 
-void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);
+static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);
 DEFINE_RCU_TASKS(rcu_tasks_rude, rcu_tasks_rude_wait_gp, call_rcu_tasks_rude,
 		 "RCU Tasks Rude");
 
-/**
+/*
  * call_rcu_tasks_rude() - Queue a callback rude task-based grace period
  * @rhp: structure to be used for queueing the RCU updates.
  * @func: actual callback function to be invoked after the grace period
@@ -1289,12 +1369,14 @@ DEFINE_RCU_TASKS(rcu_tasks_rude, rcu_tasks_rude_wait_gp, call_rcu_tasks_rude,
  *
  * See the description of call_rcu() for more detailed information on
  * memory ordering guarantees.
+ *
+ * This is no longer exported, and is instead reserved for use by
+ * synchronize_rcu_tasks_rude().
  */
-void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func)
+static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func)
 {
 	call_rcu_tasks_generic(rhp, func, &rcu_tasks_rude);
 }
-EXPORT_SYMBOL_GPL(call_rcu_tasks_rude);
 
 /**
  * synchronize_rcu_tasks_rude - wait for a rude rcu-tasks grace period
@@ -1320,26 +1402,9 @@ void synchronize_rcu_tasks_rude(void)
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_rude);
 
-/**
- * rcu_barrier_tasks_rude - Wait for in-flight call_rcu_tasks_rude() callbacks.
- *
- * Although the current implementation is guaranteed to wait, it is not
- * obligated to, for example, if there are no pending callbacks.
- */
-void rcu_barrier_tasks_rude(void)
-{
-	rcu_barrier_tasks_generic(&rcu_tasks_rude);
-}
-EXPORT_SYMBOL_GPL(rcu_barrier_tasks_rude);
-
-int rcu_tasks_rude_lazy_ms = -1;
-module_param(rcu_tasks_rude_lazy_ms, int, 0444);
-
 static int __init rcu_spawn_tasks_rude_kthread(void)
 {
 	rcu_tasks_rude.gp_sleep = HZ / 10;
-	if (rcu_tasks_rude_lazy_ms >= 0)
-		rcu_tasks_rude.lazy_jiffies = msecs_to_jiffies(rcu_tasks_rude_lazy_ms);
 	rcu_spawn_tasks_kthread_generic(&rcu_tasks_rude);
 	return 0;
 }
@@ -1350,6 +1415,12 @@ void show_rcu_tasks_rude_gp_kthread(void)
 	show_rcu_tasks_generic_gp_kthread(&rcu_tasks_rude, "");
 }
 EXPORT_SYMBOL_GPL(show_rcu_tasks_rude_gp_kthread);
+
+void rcu_tasks_rude_torture_stats_print(char *tt, char *tf)
+{
+	rcu_tasks_torture_stats_print_generic(&rcu_tasks_rude, tt, tf, "");
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_rude_torture_stats_print);
 #endif // !defined(CONFIG_TINY_RCU)
 
 struct task_struct *get_rcu_tasks_rude_gp_kthread(void)
@@ -1613,7 +1684,7 @@ static int trc_inspect_reader(struct task_struct *t, void *bhp_in)
 		// However, we cannot safely change its state.
 		n_heavy_reader_attempts++;
 		// Check for "running" idle tasks on offline CPUs.
-		if (!rcu_dynticks_zero_in_eqs(cpu, &t->trc_reader_nesting))
+		if (!rcu_watching_zero_in_eqs(cpu, &t->trc_reader_nesting))
 			return -EINVAL; // No quiescent state, do it the hard way.
 		n_heavy_reader_updates++;
 		nesting = 0;
@@ -2027,6 +2098,12 @@ void show_rcu_tasks_trace_gp_kthread(void)
 	show_rcu_tasks_generic_gp_kthread(&rcu_tasks_trace, buf);
 }
 EXPORT_SYMBOL_GPL(show_rcu_tasks_trace_gp_kthread);
+
+void rcu_tasks_trace_torture_stats_print(char *tt, char *tf)
+{
+	rcu_tasks_torture_stats_print_generic(&rcu_tasks_trace, tt, tf, "");
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_trace_torture_stats_print);
 #endif // !defined(CONFIG_TINY_RCU)
 
 struct task_struct *get_rcu_tasks_trace_gp_kthread(void)
@@ -2069,11 +2146,6 @@ static struct rcu_tasks_test_desc tests[] = {
 		/* If not defined, the test is skipped. */
 		.notrun = IS_ENABLED(CONFIG_TASKS_RCU),
 	},
-	{
-		.name = "call_rcu_tasks_rude()",
-		/* If not defined, the test is skipped. */
-		.notrun = IS_ENABLED(CONFIG_TASKS_RUDE_RCU),
-	},
 	{
 		.name = "call_rcu_tasks_trace()",
 		/* If not defined, the test is skipped. */
@@ -2081,6 +2153,7 @@ static struct rcu_tasks_test_desc tests[] = {
 	}
 };
 
+#if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
 static void test_rcu_tasks_callback(struct rcu_head *rhp)
 {
 	struct rcu_tasks_test_desc *rttd =
@@ -2090,6 +2163,7 @@ static void test_rcu_tasks_callback(struct rcu_head *rhp)
 
 	rttd->notrun = false;
 }
+#endif // #if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
 
 static void rcu_tasks_initiate_self_tests(void)
 {
@@ -2102,16 +2176,14 @@ static void rcu_tasks_initiate_self_tests(void)
 
 #ifdef CONFIG_TASKS_RUDE_RCU
 	pr_info("Running RCU Tasks Rude wait API self tests\n");
-	tests[1].runstart = jiffies;
 	synchronize_rcu_tasks_rude();
-	call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback);
 #endif
 
 #ifdef CONFIG_TASKS_TRACE_RCU
 	pr_info("Running RCU Tasks Trace wait API self tests\n");
-	tests[2].runstart = jiffies;
+	tests[1].runstart = jiffies;
 	synchronize_rcu_tasks_trace();
-	call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback);
+	call_rcu_tasks_trace(&tests[1].rh, test_rcu_tasks_callback);
 #endif
 }
 
@@ -79,9 +79,6 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *);
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
 	.gpwrap = true,
-#ifdef CONFIG_RCU_NOCB_CPU
-	.cblist.flags = SEGCBLIST_RCU_CORE,
-#endif
 };
 static struct rcu_state rcu_state = {
 	.level = { &rcu_state.node[0] },
@@ -97,6 +94,9 @@ static struct rcu_state rcu_state = {
 	.srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
 		rcu_sr_normal_gp_cleanup_work),
 	.srs_cleanups_pending = ATOMIC_INIT(0),
+#ifdef CONFIG_RCU_NOCB_CPU
+	.nocb_mutex = __MUTEX_INITIALIZER(rcu_state.nocb_mutex),
+#endif
 };
 
 /* Dump rcu_node combining tree at boot to verify correct setup. */
@@ -283,37 +283,45 @@ void rcu_softirq_qs(void)
 }
 
 /*
- * Reset the current CPU's ->dynticks counter to indicate that the
+ * Reset the current CPU's RCU_WATCHING counter to indicate that the
  * newly onlined CPU is no longer in an extended quiescent state.
  * This will either leave the counter unchanged, or increment it
  * to the next non-quiescent value.
  *
  * The non-atomic test/increment sequence works because the upper bits
- * of the ->dynticks counter are manipulated only by the corresponding CPU,
+ * of the ->state variable are manipulated only by the corresponding CPU,
  * or when the corresponding CPU is offline.
  */
-static void rcu_dynticks_eqs_online(void)
+static void rcu_watching_online(void)
 {
-	if (ct_dynticks() & RCU_DYNTICKS_IDX)
+	if (ct_rcu_watching() & CT_RCU_WATCHING)
 		return;
-	ct_state_inc(RCU_DYNTICKS_IDX);
+	ct_state_inc(CT_RCU_WATCHING);
 }
 
 /*
- * Return true if the snapshot returned from rcu_dynticks_snap()
+ * Return true if the snapshot returned from ct_rcu_watching()
  * indicates that RCU is in an extended quiescent state.
  */
-static bool rcu_dynticks_in_eqs(int snap)
+static bool rcu_watching_snap_in_eqs(int snap)
 {
-	return !(snap & RCU_DYNTICKS_IDX);
+	return !(snap & CT_RCU_WATCHING);
 }
 
-/*
- * Return true if the CPU corresponding to the specified rcu_data
- * structure has spent some time in an extended quiescent state since
- * rcu_dynticks_snap() returned the specified snapshot.
+/**
+ * rcu_watching_snap_stopped_since() - Has RCU stopped watching a given CPU
+ * since the specified @snap?
+ *
+ * @rdp: The rcu_data corresponding to the CPU for which to check EQS.
+ * @snap: rcu_watching snapshot taken when the CPU wasn't in an EQS.
+ *
+ * Returns true if the CPU corresponding to @rdp has spent some time in an
+ * extended quiescent state since @snap. Note that this doesn't check if it
+ * /still/ is in an EQS, just that it went through one since @snap.
+ *
+ * This is meant to be used in a loop waiting for a CPU to go through an EQS.
  */
-static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
+static bool rcu_watching_snap_stopped_since(struct rcu_data *rdp, int snap)
 {
 	/*
 	 * The first failing snapshot is already ordered against the accesses
@@ -323,26 +331,29 @@ static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
 	 * performed by the remote CPU prior to entering idle and therefore can
 	 * rely solely on acquire semantics.
 	 */
-	return snap != ct_dynticks_cpu_acquire(rdp->cpu);
+	if (WARN_ON_ONCE(rcu_watching_snap_in_eqs(snap)))
+		return true;
+
+	return snap != ct_rcu_watching_cpu_acquire(rdp->cpu);
 }
 
 /*
  * Return true if the referenced integer is zero while the specified
  * CPU remains within a single extended quiescent state.
  */
-bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
+bool rcu_watching_zero_in_eqs(int cpu, int *vp)
 {
 	int snap;
 
 	// If not quiescent, force back to earlier extended quiescent state.
-	snap = ct_dynticks_cpu(cpu) & ~RCU_DYNTICKS_IDX;
-	smp_rmb(); // Order ->dynticks and *vp reads.
+	snap = ct_rcu_watching_cpu(cpu) & ~CT_RCU_WATCHING;
+	smp_rmb(); // Order CT state and *vp reads.
 	if (READ_ONCE(*vp))
 		return false; // Non-zero, so report failure;
-	smp_rmb(); // Order *vp read and ->dynticks re-read.
+	smp_rmb(); // Order *vp read and CT state re-read.
 
 	// If still in the same extended quiescent state, we are good!
-	return snap == ct_dynticks_cpu(cpu);
+	return snap == ct_rcu_watching_cpu(cpu);
 }
 
 /*
@@ -356,17 +367,17 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
  *
  * The caller must have disabled interrupts and must not be idle.
  */
-notrace void rcu_momentary_dyntick_idle(void)
+notrace void rcu_momentary_eqs(void)
 {
 	int seq;
 
 	raw_cpu_write(rcu_data.rcu_need_heavy_qs, false);
-	seq = ct_state_inc(2 * RCU_DYNTICKS_IDX);
+	seq = ct_state_inc(2 * CT_RCU_WATCHING);
 	/* It is illegal to call this from idle state. */
-	WARN_ON_ONCE(!(seq & RCU_DYNTICKS_IDX));
+	WARN_ON_ONCE(!(seq & CT_RCU_WATCHING));
 	rcu_preempt_deferred_qs(current);
 }
-EXPORT_SYMBOL_GPL(rcu_momentary_dyntick_idle);
+EXPORT_SYMBOL_GPL(rcu_momentary_eqs);
 
 /**
  * rcu_is_cpu_rrupt_from_idle - see if 'interrupted' from idle
@@ -388,13 +399,13 @@ static int rcu_is_cpu_rrupt_from_idle(void)
 	lockdep_assert_irqs_disabled();
 
 	/* Check for counter underflows */
-	RCU_LOCKDEP_WARN(ct_dynticks_nesting() < 0,
-			 "RCU dynticks_nesting counter underflow!");
-	RCU_LOCKDEP_WARN(ct_dynticks_nmi_nesting() <= 0,
-			 "RCU dynticks_nmi_nesting counter underflow/zero!");
+	RCU_LOCKDEP_WARN(ct_nesting() < 0,
+			 "RCU nesting counter underflow!");
+	RCU_LOCKDEP_WARN(ct_nmi_nesting() <= 0,
+			 "RCU nmi_nesting counter underflow/zero!");
 
 	/* Are we at first interrupt nesting level? */
-	nesting = ct_dynticks_nmi_nesting();
+	nesting = ct_nmi_nesting();
 	if (nesting > 1)
 		return false;
 
@@ -404,7 +415,7 @@ static int rcu_is_cpu_rrupt_from_idle(void)
 	WARN_ON_ONCE(!nesting && !is_idle_task(current));
 
 	/* Does CPU appear to be idle from an RCU standpoint? */
-	return ct_dynticks_nesting() == 0;
+	return ct_nesting() == 0;
 }
 
 #define DEFAULT_RCU_BLIMIT (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) ? 1000 : 10)
@@ -596,12 +607,12 @@ void rcu_irq_exit_check_preempt(void)
 {
 	lockdep_assert_irqs_disabled();
 
-	RCU_LOCKDEP_WARN(ct_dynticks_nesting() <= 0,
-			 "RCU dynticks_nesting counter underflow/zero!");
-	RCU_LOCKDEP_WARN(ct_dynticks_nmi_nesting() !=
-			 DYNTICK_IRQ_NONIDLE,
-			 "Bad RCU dynticks_nmi_nesting counter\n");
-	RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
+	RCU_LOCKDEP_WARN(ct_nesting() <= 0,
+			 "RCU nesting counter underflow/zero!");
+	RCU_LOCKDEP_WARN(ct_nmi_nesting() !=
+			 CT_NESTING_IRQ_NONIDLE,
+			 "Bad RCU nmi_nesting counter\n");
+	RCU_LOCKDEP_WARN(!rcu_is_watching_curr_cpu(),
 			 "RCU in extended quiescent state!");
 }
 #endif /* #ifdef CONFIG_PROVE_RCU */
@@ -641,7 +652,7 @@ void __rcu_irq_enter_check_tick(void)
 	if (in_nmi())
 		return;
 
-	RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
+	RCU_LOCKDEP_WARN(!rcu_is_watching_curr_cpu(),
 			 "Illegal rcu_irq_enter_check_tick() from extended quiescent state");
 
 	if (!tick_nohz_full_cpu(rdp->cpu) ||
@@ -723,7 +734,7 @@ notrace bool rcu_is_watching(void)
 	bool ret;
 
 	preempt_disable_notrace();
-	ret = !rcu_dynticks_curr_cpu_in_eqs();
+	ret = rcu_is_watching_curr_cpu();
 	preempt_enable_notrace();
 	return ret;
 }
@@ -765,11 +776,11 @@ static void rcu_gpnum_ovf(struct rcu_node *rnp, struct rcu_data *rdp)
 }
 
 /*
- * Snapshot the specified CPU's dynticks counter so that we can later
+ * Snapshot the specified CPU's RCU_WATCHING counter so that we can later
  * credit them with an implicit quiescent state. Return 1 if this CPU
  * is in dynticks idle mode, which is an extended quiescent state.
  */
-static int dyntick_save_progress_counter(struct rcu_data *rdp)
+static int rcu_watching_snap_save(struct rcu_data *rdp)
 {
 	/*
 	 * Full ordering between remote CPU's post idle accesses and updater's
@@ -782,8 +793,8 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp)
 	 * Ordering between remote CPU's pre idle accesses and post grace period
 	 * updater's accesses is enforced by the below acquire semantic.
 	 */
-	rdp->dynticks_snap = ct_dynticks_cpu_acquire(rdp->cpu);
-	if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) {
+	rdp->watching_snap = ct_rcu_watching_cpu_acquire(rdp->cpu);
+	if (rcu_watching_snap_in_eqs(rdp->watching_snap)) {
 		trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti"));
 		rcu_gpnum_ovf(rdp->mynode, rdp);
 		return 1;
@@ -794,14 +805,14 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp)
 /*
  * Returns positive if the specified CPU has passed through a quiescent state
  * by virtue of being in or having passed through an dynticks idle state since
- * the last call to dyntick_save_progress_counter() for this same CPU, or by
+ * the last call to rcu_watching_snap_save() for this same CPU, or by
  * virtue of having been offline.
  *
  * Returns negative if the specified CPU needs a force resched.
  *
  * Returns zero otherwise.
  */
-static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
+static int rcu_watching_snap_recheck(struct rcu_data *rdp)
 {
 	unsigned long jtsq;
 	int ret = 0;
@@ -815,7 +826,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 	 * read-side critical section that started before the beginning
 	 * of the current RCU grace period.
 	 */
-	if (rcu_dynticks_in_eqs_since(rdp, rdp->dynticks_snap)) {
+	if (rcu_watching_snap_stopped_since(rdp, rdp->watching_snap)) {
 		trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti"));
 		rcu_gpnum_ovf(rnp, rdp);
 		return 1;
@@ -1649,7 +1660,7 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
 	 * the done tail list manipulations are protected here.
 	 */
 	done = smp_load_acquire(&rcu_state.srs_done_tail);
-	if (!done)
+	if (WARN_ON_ONCE(!done))
 		return;
 
 	WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
@@ -1984,10 +1995,10 @@ static void rcu_gp_fqs(bool first_time)
 
 	if (first_time) {
 		/* Collect dyntick-idle snapshots. */
-		force_qs_rnp(dyntick_save_progress_counter);
+		force_qs_rnp(rcu_watching_snap_save);
 	} else {
 		/* Handle dyntick-idle and offline CPUs. */
-		force_qs_rnp(rcu_implicit_dynticks_qs);
+		force_qs_rnp(rcu_watching_snap_recheck);
 	}
 	/* Clear flag to prevent immediate re-entry. */
 	if (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) {
@@ -2383,7 +2394,6 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
 {
 	unsigned long flags;
 	unsigned long mask;
-	bool needacc = false;
 	struct rcu_node *rnp;
 
 	WARN_ON_ONCE(rdp->cpu != smp_processor_id());
@@ -2420,23 +2430,11 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
 			 * to return true. So complain, but don't awaken.
 			 */
 			WARN_ON_ONCE(rcu_accelerate_cbs(rnp, rdp));
-		} else if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
-			/*
-			 * ...but NOCB kthreads may miss or delay callbacks acceleration
-			 * if in the middle of a (de-)offloading process.
-			 */
-			needacc = true;
 		}
 
 		rcu_disable_urgency_upon_qs(rdp);
 		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
 		/* ^^^ Released rnp->lock */
-
-		if (needacc) {
-			rcu_nocb_lock_irqsave(rdp, flags);
-			rcu_accelerate_cbs_unlocked(rnp, rdp);
-			rcu_nocb_unlock_irqrestore(rdp, flags);
-		}
 	}
 }
 
@@ -2791,24 +2789,6 @@ static __latent_entropy void rcu_core(void)
 	unsigned long flags;
 	struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
 	struct rcu_node *rnp = rdp->mynode;
-	/*
-	 * On RT rcu_core() can be preempted when IRQs aren't disabled.
-	 * Therefore this function can race with concurrent NOCB (de-)offloading
-	 * on this CPU and the below condition must be considered volatile.
-	 * However if we race with:
-	 *
-	 * _ Offloading:   In the worst case we accelerate or process callbacks
-	 *                 concurrently with NOCB kthreads. We are guaranteed to
-	 *                 call rcu_nocb_lock() if that happens.
-	 *
-	 * _ Deoffloading: In the worst case we miss callbacks acceleration or
-	 *                 processing. This is fine because the early stage
-	 *                 of deoffloading invokes rcu_core() after setting
-	 *                 SEGCBLIST_RCU_CORE. So we guarantee that we'll process
-	 *                 what could have been dismissed without the need to wait
-	 *                 for the next rcu_pending() check in the next jiffy.
-	 */
-	const bool do_batch = !rcu_segcblist_completely_offloaded(&rdp->cblist);
 
 	if (cpu_is_offline(smp_processor_id()))
 		return;
@@ -2828,17 +2808,17 @@ static __latent_entropy void rcu_core(void)
 
 	/* No grace period and unregistered callbacks? */
 	if (!rcu_gp_in_progress() &&
-	    rcu_segcblist_is_enabled(&rdp->cblist) && do_batch) {
-		rcu_nocb_lock_irqsave(rdp, flags);
+	    rcu_segcblist_is_enabled(&rdp->cblist) && !rcu_rdp_is_offloaded(rdp)) {
+		local_irq_save(flags);
 		if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
 			rcu_accelerate_cbs_unlocked(rnp, rdp);
-		rcu_nocb_unlock_irqrestore(rdp, flags);
+		local_irq_restore(flags);
 	}
 
 	rcu_check_gp_start_stall(rnp, rdp, rcu_jiffies_till_stall_check());
 
 	/* If there are callbacks ready, invoke them. */
-	if (do_batch && rcu_segcblist_ready_cbs(&rdp->cblist) &&
+	if (!rcu_rdp_is_offloaded(rdp) && rcu_segcblist_ready_cbs(&rdp->cblist) &&
 	    likely(READ_ONCE(rcu_scheduler_fully_active))) {
 		rcu_do_batch(rdp);
 		/* Re-invoke RCU core processing if there are callbacks remaining. */
@@ -3227,7 +3207,7 @@ struct kvfree_rcu_bulk_data {
 	struct list_head list;
 	struct rcu_gp_oldstate gp_snap;
 	unsigned long nr_records;
-	void *records[];
+	void *records[] __counted_by(nr_records);
 };
 
 /*
@@ -3539,10 +3519,10 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
 	if (delayed_work_pending(&krcp->monitor_work)) {
 		delay_left = krcp->monitor_work.timer.expires - jiffies;
 		if (delay < delay_left)
-			mod_delayed_work(system_wq, &krcp->monitor_work, delay);
+			mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
 		return;
 	}
-	queue_delayed_work(system_wq, &krcp->monitor_work, delay);
+	queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
 }
 
 static void
@@ -3634,7 +3614,7 @@ static void kfree_rcu_monitor(struct work_struct *work)
 			// be that the work is in the pending state when
 			// channels have been detached following by each
 			// other.
-			queue_rcu_work(system_wq, &krwp->rcu_work);
+			queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
 		}
 	}
 
@@ -3704,7 +3684,7 @@ run_page_cache_worker(struct kfree_rcu_cpu *krcp)
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
 			!atomic_xchg(&krcp->work_in_progress, 1)) {
 		if (atomic_read(&krcp->backoff_page_cache_fill)) {
-			queue_delayed_work(system_wq,
+			queue_delayed_work(system_unbound_wq,
 				&krcp->page_cache_work,
 					msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
 		} else {
@@ -3767,7 +3747,8 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 	}
 
 	// Finally insert and update the GP for this page.
-	bnode->records[bnode->nr_records++] = ptr;
+	bnode->nr_records++;
+	bnode->records[bnode->nr_records - 1] = ptr;
 	get_state_synchronize_rcu_full(&bnode->gp_snap);
 	atomic_inc(&(*krcp)->bulk_count[idx]);
 
@@ -4403,6 +4384,7 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
 {
 	unsigned long __maybe_unused s = rcu_state.barrier_sequence;
 
+	rhp->next = rhp; // Mark the callback as having been invoked.
 	if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) {
 		rcu_barrier_trace(TPS("LastCB"), -1, s);
 		complete(&rcu_state.barrier_completion);
@@ -4804,8 +4786,8 @@ rcu_boot_init_percpu_data(int cpu)
 	/* Set up local state, ensuring consistent view of global state. */
 	rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
 	INIT_WORK(&rdp->strict_work, strict_work_handler);
-	WARN_ON_ONCE(ct->dynticks_nesting != 1);
-	WARN_ON_ONCE(rcu_dynticks_in_eqs(ct_dynticks_cpu(cpu)));
+	WARN_ON_ONCE(ct->nesting != 1);
+	WARN_ON_ONCE(rcu_watching_snap_in_eqs(ct_rcu_watching_cpu(cpu)));
 	rdp->barrier_seq_snap = rcu_state.barrier_sequence;
 	rdp->rcu_ofl_gp_seq = rcu_state.gp_seq;
 	rdp->rcu_ofl_gp_state = RCU_GP_CLEANED;
@@ -4898,7 +4880,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	rdp->qlen_last_fqs_check = 0;
 	rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
 	rdp->blimit = blimit;
-	ct->dynticks_nesting = 1; /* CPU not up, no tearing. */
+	ct->nesting = 1; /* CPU not up, no tearing. */
 	raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
 
 	/*
@@ -5058,7 +5040,7 @@ void rcutree_report_cpu_starting(unsigned int cpu)
 	rnp = rdp->mynode;
 	mask = rdp->grpmask;
 	arch_spin_lock(&rcu_state.ofl_lock);
-	rcu_dynticks_eqs_online();
+	rcu_watching_online();
 	raw_spin_lock(&rcu_state.barrier_lock);
 	raw_spin_lock_rcu_node(rnp);
 	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
@@ -5424,6 +5406,8 @@ static void __init rcu_init_one(void)
 		while (i > rnp->grphi)
 			rnp++;
 		per_cpu_ptr(&rcu_data, i)->mynode = rnp;
+		per_cpu_ptr(&rcu_data, i)->barrier_head.next =
+			&per_cpu_ptr(&rcu_data, i)->barrier_head;
 		rcu_boot_init_percpu_data(i);
 	}
 }
@@ -206,7 +206,7 @@ struct rcu_data {
 	long blimit; /* Upper limit on a processed batch */
 
 	/* 3) dynticks interface. */
-	int dynticks_snap; /* Per-GP tracking for dynticks. */
+	int watching_snap; /* Per-GP tracking for dynticks. */
 	bool rcu_need_heavy_qs; /* GP old, so heavy quiescent state! */
 	bool rcu_urgent_qs; /* GP old need light quiescent state. */
 	bool rcu_forced_tick; /* Forced tick to provide QS. */
@@ -215,7 +215,7 @@ struct rcu_data {
 	/* 4) rcu_barrier(), OOM callbacks, and expediting. */
 	unsigned long barrier_seq_snap; /* Snap of rcu_state.barrier_sequence. */
 	struct rcu_head barrier_head;
-	int exp_dynticks_snap; /* Double-check need for IPI. */
+	int exp_watching_snap; /* Double-check need for IPI. */
 
 	/* 5) Callback offloading. */
 #ifdef CONFIG_RCU_NOCB_CPU
@@ -411,7 +411,6 @@ struct rcu_state {
 	arch_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp;
 						/* Synchronize offline with */
 						/* GP pre-initialization. */
-	int nocb_is_setup; /* nocb is setup from boot */
 
 	/* synchronize_rcu() part. */
 	struct llist_head srs_next; /* request a GP users. */
@@ -420,6 +419,11 @@ struct rcu_state {
 	struct sr_wait_node srs_wait_nodes[SR_NORMAL_GP_WAIT_HEAD_MAX];
 	struct work_struct srs_cleanup_work;
 	atomic_t srs_cleanups_pending; /* srs inflight worker cleanups. */
+
+#ifdef CONFIG_RCU_NOCB_CPU
+	struct mutex nocb_mutex; /* Guards (de-)offloading */
+	int nocb_is_setup; /* nocb is setup from boot */
+#endif
 };
 
 /* Values for rcu_state structure's gp_flags field. */
@@ -377,11 +377,11 @@ static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
 			 * post grace period updater's accesses is enforced by the
 			 * below acquire semantic.
 			 */
-			snap = ct_dynticks_cpu_acquire(cpu);
-			if (rcu_dynticks_in_eqs(snap))
+			snap = ct_rcu_watching_cpu_acquire(cpu);
+			if (rcu_watching_snap_in_eqs(snap))
 				mask_ofl_test |= mask;
 			else
-				rdp->exp_dynticks_snap = snap;
+				rdp->exp_watching_snap = snap;
 		}
 	}
 	mask_ofl_ipi = rnp->expmask & ~mask_ofl_test;
@@ -401,7 +401,7 @@ static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
 		unsigned long mask = rdp->grpmask;
 
 retry_ipi:
-		if (rcu_dynticks_in_eqs_since(rdp, rdp->exp_dynticks_snap)) {
+		if (rcu_watching_snap_stopped_since(rdp, rdp->exp_watching_snap)) {
 			mask_ofl_test |= mask;
 			continue;
 		}
@@ -543,6 +543,67 @@ static bool synchronize_rcu_expedited_wait_once(long tlimit)
 	return false;
 }
 
+/*
+ * Print out an expedited RCU CPU stall warning message.
+ */
+static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigned long j)
+{
+	int cpu;
+	unsigned long mask;
+	int ndetected;
+	struct rcu_node *rnp;
+	struct rcu_node *rnp_root = rcu_get_root();
+
+	if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) {
+		pr_err("INFO: %s detected expedited stalls, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name);
+		return;
+	}
+	pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", rcu_state.name);
+	ndetected = 0;
+	rcu_for_each_leaf_node(rnp) {
+		ndetected += rcu_print_task_exp_stall(rnp);
+		for_each_leaf_node_possible_cpu(rnp, cpu) {
+			struct rcu_data *rdp;
+
+			mask = leaf_node_cpu_bit(rnp, cpu);
+			if (!(READ_ONCE(rnp->expmask) & mask))
+				continue;
+			ndetected++;
+			rdp = per_cpu_ptr(&rcu_data, cpu);
+			pr_cont(" %d-%c%c%c%c", cpu,
+				"O."[!!cpu_online(cpu)],
+				"o."[!!(rdp->grpmask & rnp->expmaskinit)],
+				"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
+				"D."[!!data_race(rdp->cpu_no_qs.b.exp)]);
+		}
+	}
+	pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
+		j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
+		".T"[!!data_race(rnp_root->exp_tasks)]);
+	if (ndetected) {
+		pr_err("blocking rcu_node structures (internal RCU debug):");
+		rcu_for_each_node_breadth_first(rnp) {
+			if (rnp == rnp_root)
+				continue; /* printed unconditionally */
+			if (sync_rcu_exp_done_unlocked(rnp))
+				continue;
+			pr_cont(" l=%u:%d-%d:%#lx/%c",
+				rnp->level, rnp->grplo, rnp->grphi, data_race(rnp->expmask),
+				".T"[!!data_race(rnp->exp_tasks)]);
+		}
+		pr_cont("\n");
+	}
+	rcu_for_each_leaf_node(rnp) {
+		for_each_leaf_node_possible_cpu(rnp, cpu) {
+			mask = leaf_node_cpu_bit(rnp, cpu);
+			if (!(READ_ONCE(rnp->expmask) & mask))
+				continue;
+			dump_cpu_task(cpu);
+		}
+		rcu_exp_print_detail_task_stall_rnp(rnp);
+	}
+}
+
 /*
  * Wait for the expedited grace period to elapse, issuing any needed
  * RCU CPU stall warnings along the way.
@@ -554,10 +615,8 @@ static void synchronize_rcu_expedited_wait(void)
 	unsigned long jiffies_stall;
 	unsigned long jiffies_start;
 	unsigned long mask;
-	int ndetected;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
-	struct rcu_node *rnp_root = rcu_get_root();
 	unsigned long flags;
 
 	trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait"));
@ -597,55 +656,7 @@ static void synchronize_rcu_expedited_wait(void)
|
||||
j = jiffies;
|
||||
rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_EXP, (void *)(j - jiffies_start));
|
||||
trace_rcu_stall_warning(rcu_state.name, TPS("ExpeditedStall"));
|
||||
pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {",
|
||||
rcu_state.name);
|
||||
ndetected = 0;
|
||||
rcu_for_each_leaf_node(rnp) {
|
||||
ndetected += rcu_print_task_exp_stall(rnp);
|
||||
for_each_leaf_node_possible_cpu(rnp, cpu) {
|
||||
struct rcu_data *rdp;
|
||||
|
||||
mask = leaf_node_cpu_bit(rnp, cpu);
|
||||
if (!(READ_ONCE(rnp->expmask) & mask))
|
||||
continue;
|
||||
ndetected++;
|
||||
rdp = per_cpu_ptr(&rcu_data, cpu);
|
||||
pr_cont(" %d-%c%c%c%c", cpu,
|
||||
"O."[!!cpu_online(cpu)],
|
||||
"o."[!!(rdp->grpmask & rnp->expmaskinit)],
|
||||
"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
|
||||
"D."[!!data_race(rdp->cpu_no_qs.b.exp)]);
|
||||
}
|
||||
}
|
||||
pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
|
||||
j - jiffies_start, rcu_state.expedited_sequence,
|
||||
data_race(rnp_root->expmask),
|
||||
".T"[!!data_race(rnp_root->exp_tasks)]);
|
||||
if (ndetected) {
|
||||
pr_err("blocking rcu_node structures (internal RCU debug):");
|
||||
rcu_for_each_node_breadth_first(rnp) {
|
||||
if (rnp == rnp_root)
|
||||
continue; /* printed unconditionally */
|
||||
if (sync_rcu_exp_done_unlocked(rnp))
|
||||
continue;
|
||||
pr_cont(" l=%u:%d-%d:%#lx/%c",
|
||||
rnp->level, rnp->grplo, rnp->grphi,
|
||||
data_race(rnp->expmask),
|
||||
".T"[!!data_race(rnp->exp_tasks)]);
|
||||
}
|
||||
pr_cont("\n");
|
||||
}
|
||||
rcu_for_each_leaf_node(rnp) {
|
||||
for_each_leaf_node_possible_cpu(rnp, cpu) {
|
||||
mask = leaf_node_cpu_bit(rnp, cpu);
|
||||
if (!(READ_ONCE(rnp->expmask) & mask))
|
||||
continue;
|
||||
preempt_disable(); // For smp_processor_id() in dump_cpu_task().
|
||||
dump_cpu_task(cpu);
|
||||
preempt_enable();
|
||||
}
|
||||
rcu_exp_print_detail_task_stall_rnp(rnp);
|
||||
}
|
||||
synchronize_rcu_expedited_stall(jiffies_start, j);
|
||||
jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3;
|
||||
|
||||
nbcon_cpu_emergency_exit();
|
||||
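The stall report above picks each status character by indexing a two-character string literal with a flag normalized to 0 or 1, for example `"O."[!!cpu_online(cpu)]`. A minimal user-space sketch of that idiom follows; `online_char()` and `tasks_char()` are hypothetical helper names used only for illustration, not kernel functions.

```c
#include <stdbool.h>

/*
 * "O."[!!x] indexes the string literal "O." with 0 or 1:
 * index 0 ('O') when x is false, index 1 ('.') when x is true.
 * An offline CPU is therefore flagged 'O' and an online one '.'.
 */
static char online_char(bool online)
{
	return "O."[!!online];
}

/*
 * ".T"[!!p] is the mirror image: '.' when p is NULL,
 * 'T' when p is non-NULL (blocked tasks are present).
 */
static char tasks_char(const void *exp_tasks)
{
	return ".T"[!!exp_tasks];
}
```

The double negation matters: it collapses any nonzero value or non-NULL pointer to exactly 1, so the index can never run past the second character.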
|
@@ -16,10 +16,6 @@
 #ifdef CONFIG_RCU_NOCB_CPU
 static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */
 static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */
-static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
-{
-	return lockdep_is_held(&rdp->nocb_lock);
-}
 
 static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
 {
@@ -220,7 +216,7 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 	if (needwake) {
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
-		wake_up_process(rdp_gp->nocb_gp_kthread);
+		swake_up_one_online(&rdp_gp->nocb_gp_wq);
 	}
 
 	return needwake;
@@ -413,14 +409,6 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 		return false;
 	}
 
-	// In the process of (de-)offloading: no bypassing, but
-	// locking.
-	if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
-		rcu_nocb_lock(rdp);
-		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
-		return false; /* Not offloaded, no bypassing. */
-	}
-
 	// Don't use ->nocb_bypass during early boot.
 	if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) {
 		rcu_nocb_lock(rdp);
@@ -505,7 +493,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
 	}
 	rcu_nocb_bypass_unlock(rdp);
-	smp_mb(); /* Order enqueue before wake. */
+
 	// A wake up of the grace period kthread or timer adjustment
 	// needs to be done only if:
 	// 1. Bypass list was fully empty before (this is the first
@@ -616,37 +604,33 @@ static void call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *head,
 	}
 }
 
-static int nocb_gp_toggle_rdp(struct rcu_data *rdp)
+static void nocb_gp_toggle_rdp(struct rcu_data *rdp_gp, struct rcu_data *rdp)
 {
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
-	int ret;
 
-	rcu_nocb_lock_irqsave(rdp, flags);
-	if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED) &&
-	    !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) {
+	/*
+	 * Locking orders future de-offloaded callbacks enqueue against previous
+	 * handling of this rdp. Ie: Make sure rcuog is done with this rdp before
+	 * deoffloaded callbacks can be enqueued.
+	 */
+	raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+	if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
 		/*
 		 * Offloading. Set our flag and notify the offload worker.
 		 * We will handle this rdp until it ever gets de-offloaded.
 		 */
-		rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_GP);
-		ret = 1;
-	} else if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED) &&
-		   rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) {
+		list_add_tail(&rdp->nocb_entry_rdp, &rdp_gp->nocb_head_rdp);
+		rcu_segcblist_set_flags(cblist, SEGCBLIST_OFFLOADED);
+	} else {
 		/*
 		 * De-offloading. Clear our flag and notify the de-offload worker.
 		 * We will ignore this rdp until it ever gets re-offloaded.
 		 */
-		rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP);
-		ret = 0;
-	} else {
-		WARN_ON_ONCE(1);
-		ret = -1;
+		list_del(&rdp->nocb_entry_rdp);
+		rcu_segcblist_clear_flags(cblist, SEGCBLIST_OFFLOADED);
 	}
-
-	rcu_nocb_unlock_irqrestore(rdp, flags);
-
-	return ret;
+	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 }
 
 static void nocb_gp_sleep(struct rcu_data *my_rdp, int cpu)
@@ -853,14 +837,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 	}
 
 	if (rdp_toggling) {
-		int ret;
-
-		ret = nocb_gp_toggle_rdp(rdp_toggling);
-		if (ret == 1)
-			list_add_tail(&rdp_toggling->nocb_entry_rdp, &my_rdp->nocb_head_rdp);
-		else if (ret == 0)
-			list_del(&rdp_toggling->nocb_entry_rdp);
-
+		nocb_gp_toggle_rdp(my_rdp, rdp_toggling);
 		swake_up_one(&rdp_toggling->nocb_state_wq);
 	}
 
@@ -917,7 +894,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 	WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
 
 	local_irq_save(flags);
-	rcu_momentary_dyntick_idle();
+	rcu_momentary_eqs();
 	local_irq_restore(flags);
 	/*
 	 * Disable BH to provide the expected environment. Also, when
@@ -1030,16 +1007,11 @@ void rcu_nocb_flush_deferred_wakeup(void)
 }
 EXPORT_SYMBOL_GPL(rcu_nocb_flush_deferred_wakeup);
 
-static int rdp_offload_toggle(struct rcu_data *rdp,
-			      bool offload, unsigned long flags)
-	__releases(rdp->nocb_lock)
+static int rcu_nocb_queue_toggle_rdp(struct rcu_data *rdp)
 {
-	struct rcu_segcblist *cblist = &rdp->cblist;
 	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
 	bool wake_gp = false;
-
-	rcu_segcblist_offload(cblist, offload);
-	rcu_nocb_unlock_irqrestore(rdp, flags);
+	unsigned long flags;
 
 	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
 	// Queue this rdp for add/del to/from the list to iterate on rcuog
@ -1053,89 +1025,74 @@ static int rdp_offload_toggle(struct rcu_data *rdp,
|
||||
return wake_gp;
|
||||
}
|
||||
|
||||
static long rcu_nocb_rdp_deoffload(void *arg)
|
||||
static bool rcu_nocb_rdp_deoffload_wait_cond(struct rcu_data *rdp)
|
||||
{
|
||||
unsigned long flags;
|
||||
bool ret;
|
||||
|
||||
/*
|
||||
* Locking makes sure rcuog is done handling this rdp before deoffloaded
|
||||
* enqueue can happen. Also it keeps the SEGCBLIST_OFFLOADED flag stable
|
||||
* while the ->nocb_lock is held.
|
||||
*/
|
||||
raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
|
||||
ret = !rcu_segcblist_test_flags(&rdp->cblist, SEGCBLIST_OFFLOADED);
|
||||
raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
|
||||
{
|
||||
struct rcu_data *rdp = arg;
|
||||
struct rcu_segcblist *cblist = &rdp->cblist;
|
||||
unsigned long flags;
|
||||
int wake_gp;
|
||||
struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
|
||||
|
||||
/*
|
||||
* rcu_nocb_rdp_deoffload() may be called directly if
|
||||
* rcuog/o[p] spawn failed, because at this time the rdp->cpu
|
||||
* is not online yet.
|
||||
*/
|
||||
WARN_ON_ONCE((rdp->cpu != raw_smp_processor_id()) && cpu_online(rdp->cpu));
|
||||
/* CPU must be offline, unless it's early boot */
|
||||
WARN_ON_ONCE(cpu_online(rdp->cpu) && rdp->cpu != raw_smp_processor_id());
|
||||
|
||||
pr_info("De-offloading %d\n", rdp->cpu);
|
||||
|
||||
/* Flush all callbacks from segcblist and bypass */
|
||||
rcu_barrier();
|
||||
|
||||
/*
|
||||
* Make sure the rcuoc kthread isn't in the middle of a nocb locked
|
||||
* sequence while offloading is deactivated, along with nocb locking.
|
||||
*/
|
||||
if (rdp->nocb_cb_kthread)
|
||||
kthread_park(rdp->nocb_cb_kthread);
|
||||
|
||||
rcu_nocb_lock_irqsave(rdp, flags);
|
||||
/*
|
||||
* Flush once and for all now. This suffices because we are
|
||||
* running on the target CPU holding ->nocb_lock (thus having
|
||||
* interrupts disabled), and because rdp_offload_toggle()
|
||||
* invokes rcu_segcblist_offload(), which clears SEGCBLIST_OFFLOADED.
|
||||
* Thus future calls to rcu_segcblist_completely_offloaded() will
|
||||
* return false, which means that future calls to rcu_nocb_try_bypass()
|
||||
* will refuse to put anything into the bypass.
|
||||
*/
|
||||
WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
|
||||
/*
|
||||
* Start with invoking rcu_core() early. This way if the current thread
|
||||
* happens to preempt an ongoing call to rcu_core() in the middle,
|
||||
* leaving some work dismissed because rcu_core() still thinks the rdp is
|
||||
* completely offloaded, we are guaranteed a nearby future instance of
|
||||
* rcu_core() to catch up.
|
||||
*/
|
||||
rcu_segcblist_set_flags(cblist, SEGCBLIST_RCU_CORE);
|
||||
invoke_rcu_core();
|
||||
wake_gp = rdp_offload_toggle(rdp, false, flags);
|
||||
WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
|
||||
WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
|
||||
rcu_nocb_unlock_irqrestore(rdp, flags);
|
||||
|
||||
wake_gp = rcu_nocb_queue_toggle_rdp(rdp);
|
||||
|
||||
mutex_lock(&rdp_gp->nocb_gp_kthread_mutex);
|
||||
|
||||
if (rdp_gp->nocb_gp_kthread) {
|
||||
if (wake_gp)
|
||||
wake_up_process(rdp_gp->nocb_gp_kthread);
|
||||
|
||||
swait_event_exclusive(rdp->nocb_state_wq,
|
||||
!rcu_segcblist_test_flags(cblist,
|
||||
SEGCBLIST_KTHREAD_GP));
|
||||
if (rdp->nocb_cb_kthread)
|
||||
kthread_park(rdp->nocb_cb_kthread);
|
||||
rcu_nocb_rdp_deoffload_wait_cond(rdp));
|
||||
} else {
|
||||
/*
|
||||
* No kthread to clear the flags for us or remove the rdp from the nocb list
|
||||
* to iterate. Do it here instead. Locking doesn't look stricly necessary
|
||||
* but we stick to paranoia in this rare path.
|
||||
*/
|
||||
rcu_nocb_lock_irqsave(rdp, flags);
|
||||
rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP);
|
||||
rcu_nocb_unlock_irqrestore(rdp, flags);
|
||||
raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
|
||||
rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_OFFLOADED);
|
||||
raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
|
||||
|
||||
list_del(&rdp->nocb_entry_rdp);
|
||||
}
|
||||
|
||||
mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
|
||||
|
||||
/*
|
||||
* Lock one last time to acquire latest callback updates from kthreads
|
||||
* so we can later handle callbacks locally without locking.
|
||||
*/
|
||||
rcu_nocb_lock_irqsave(rdp, flags);
|
||||
/*
|
||||
* Theoretically we could clear SEGCBLIST_LOCKING after the nocb
|
||||
* lock is released but how about being paranoid for once?
|
||||
*/
|
||||
rcu_segcblist_clear_flags(cblist, SEGCBLIST_LOCKING);
|
||||
/*
|
||||
* Without SEGCBLIST_LOCKING, we can't use
|
||||
* rcu_nocb_unlock_irqrestore() anymore.
|
||||
*/
|
||||
raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
|
||||
|
||||
/* Sanity check */
|
||||
WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
|
||||
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
@@ -1145,33 +1102,42 @@ int rcu_nocb_cpu_deoffload(int cpu)
 	int ret = 0;
 
 	cpus_read_lock();
-	mutex_lock(&rcu_state.barrier_mutex);
+	mutex_lock(&rcu_state.nocb_mutex);
 	if (rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu)) {
-			ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
+		if (!cpu_online(cpu)) {
+			ret = rcu_nocb_rdp_deoffload(rdp);
 			if (!ret)
 				cpumask_clear_cpu(cpu, rcu_nocb_mask);
 		} else {
-			pr_info("NOCB: Cannot CB-deoffload offline CPU %d\n", rdp->cpu);
+			pr_info("NOCB: Cannot CB-deoffload online CPU %d\n", rdp->cpu);
 			ret = -EINVAL;
 		}
 	}
-	mutex_unlock(&rcu_state.barrier_mutex);
+	mutex_unlock(&rcu_state.nocb_mutex);
 	cpus_read_unlock();
 
 	return ret;
 }
 EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);
 
-static long rcu_nocb_rdp_offload(void *arg)
+static bool rcu_nocb_rdp_offload_wait_cond(struct rcu_data *rdp)
 {
-	struct rcu_data *rdp = arg;
-	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
+	bool ret;
+
+	raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+	ret = rcu_segcblist_test_flags(&rdp->cblist, SEGCBLIST_OFFLOADED);
+	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
+
+	return ret;
+}
+
+static int rcu_nocb_rdp_offload(struct rcu_data *rdp)
+{
 	int wake_gp;
 	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
 
-	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
+	WARN_ON_ONCE(cpu_online(rdp->cpu));
 	/*
 	 * For now we only support re-offload, ie: the rdp must have been
 	 * offloaded on boot first.
@@ -1184,44 +1150,17 @@ static long rcu_nocb_rdp_offload(void *arg)
 
 	pr_info("Offloading %d\n", rdp->cpu);
 
-	/*
-	 * Can't use rcu_nocb_lock_irqsave() before SEGCBLIST_LOCKING
-	 * is set.
-	 */
-	raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+	WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+	WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
 
-	/*
-	 * We didn't take the nocb lock while working on the
-	 * rdp->cblist with SEGCBLIST_LOCKING cleared (pure softirq/rcuc mode).
-	 * Every modifications that have been done previously on
-	 * rdp->cblist must be visible remotely by the nocb kthreads
-	 * upon wake up after reading the cblist flags.
-	 *
-	 * The layout against nocb_lock enforces that ordering:
-	 *
-	 *  __rcu_nocb_rdp_offload()   nocb_cb_wait()/nocb_gp_wait()
-	 * -------------------------   ----------------------------
-	 *      WRITE callbacks           rcu_nocb_lock()
-	 *      rcu_nocb_lock()           READ flags
-	 *      WRITE flags               READ callbacks
-	 *      rcu_nocb_unlock()         rcu_nocb_unlock()
-	 */
-	wake_gp = rdp_offload_toggle(rdp, true, flags);
+	wake_gp = rcu_nocb_queue_toggle_rdp(rdp);
 	if (wake_gp)
 		wake_up_process(rdp_gp->nocb_gp_kthread);
 
-	kthread_unpark(rdp->nocb_cb_kthread);
-
 	swait_event_exclusive(rdp->nocb_state_wq,
-			      rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP));
+			      rcu_nocb_rdp_offload_wait_cond(rdp));
 
-	/*
-	 * All kthreads are ready to work, we can finally relieve rcu_core() and
-	 * enable nocb bypass.
-	 */
-	rcu_nocb_lock_irqsave(rdp, flags);
-	rcu_segcblist_clear_flags(cblist, SEGCBLIST_RCU_CORE);
-	rcu_nocb_unlock_irqrestore(rdp, flags);
+	kthread_unpark(rdp->nocb_cb_kthread);
 
 	return 0;
 }
@@ -1232,18 +1171,18 @@ int rcu_nocb_cpu_offload(int cpu)
 	int ret = 0;
 
 	cpus_read_lock();
-	mutex_lock(&rcu_state.barrier_mutex);
+	mutex_lock(&rcu_state.nocb_mutex);
 	if (!rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu)) {
-			ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
+		if (!cpu_online(cpu)) {
+			ret = rcu_nocb_rdp_offload(rdp);
 			if (!ret)
 				cpumask_set_cpu(cpu, rcu_nocb_mask);
 		} else {
-			pr_info("NOCB: Cannot CB-offload offline CPU %d\n", rdp->cpu);
+			pr_info("NOCB: Cannot CB-offload online CPU %d\n", rdp->cpu);
 			ret = -EINVAL;
 		}
 	}
-	mutex_unlock(&rcu_state.barrier_mutex);
+	mutex_unlock(&rcu_state.nocb_mutex);
 	cpus_read_unlock();
 
 	return ret;
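After this change both rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() act only on an offline CPU and return -EINVAL for an online one, while a request that matches the current state is a no-op returning 0. The sketch below models just that decision in user space; `struct cpu_model` and `model_set_offload()` are hypothetical names, and the locking, kthread handling, and cpumask updates of the real code are deliberately left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU state the real code consults. */
struct cpu_model {
	bool online;	/* models cpu_online(cpu) */
	bool offloaded;	/* models rcu_rdp_is_offloaded(rdp) */
};

/*
 * Request the offloaded state "offload" for CPU model "c".
 * Returns 0 if the state already matches or was changed,
 * -EINVAL if a change was needed but the CPU is online.
 */
static int model_set_offload(struct cpu_model *c, bool offload)
{
	if (c->offloaded == offload)
		return 0;		/* nothing to do */
	if (c->online)
		return -EINVAL;		/* only offline CPUs may be toggled */
	c->offloaded = offload;
	return 0;
}
```

In this model, a caller that wants to toggle an online CPU first has to take it offline, toggle, and bring it back online.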
@@ -1261,7 +1200,7 @@ lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 		return 0;
 
 	/* Protect rcu_nocb_mask against concurrent (de-)offloading. */
-	if (!mutex_trylock(&rcu_state.barrier_mutex))
+	if (!mutex_trylock(&rcu_state.nocb_mutex))
 		return 0;
 
 	/* Snapshot count of all CPUs */
@@ -1271,7 +1210,7 @@ lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 		count += READ_ONCE(rdp->lazy_len);
 	}
 
-	mutex_unlock(&rcu_state.barrier_mutex);
+	mutex_unlock(&rcu_state.nocb_mutex);
 
 	return count ? count : SHRINK_EMPTY;
 }
@@ -1289,9 +1228,9 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 	 * Protect against concurrent (de-)offloading. Otherwise nocb locking
 	 * may be ignored or imbalanced.
 	 */
-	if (!mutex_trylock(&rcu_state.barrier_mutex)) {
+	if (!mutex_trylock(&rcu_state.nocb_mutex)) {
 		/*
-		 * But really don't insist if barrier_mutex is contended since we
+		 * But really don't insist if nocb_mutex is contended since we
 		 * can't guarantee that it will never engage in a dependency
 		 * chain involving memory allocation. The lock is seldom contended
 		 * anyway.
@@ -1330,7 +1269,7 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 			break;
 	}
 
-	mutex_unlock(&rcu_state.barrier_mutex);
+	mutex_unlock(&rcu_state.nocb_mutex);
 
 	return count ? count : SHRINK_STOP;
 }
@@ -1396,9 +1335,7 @@ void __init rcu_init_nohz(void)
 		rdp = per_cpu_ptr(&rcu_data, cpu);
 		if (rcu_segcblist_empty(&rdp->cblist))
 			rcu_segcblist_init(&rdp->cblist);
-		rcu_segcblist_offload(&rdp->cblist, true);
-		rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP);
-		rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_RCU_CORE);
+		rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_OFFLOADED);
 	}
 	rcu_organize_nocb_kthreads();
 }
@@ -1446,7 +1383,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 				"rcuog/%d", rdp_gp->cpu);
 		if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) {
 			mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
-			goto end;
+			goto err;
 		}
 		WRITE_ONCE(rdp_gp->nocb_gp_kthread, t);
 		if (kthread_prio)
@@ -1458,7 +1395,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 	t = kthread_create(rcu_nocb_cb_kthread, rdp,
 			   "rcuo%c/%d", rcu_state.abbr, cpu);
 	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
-		goto end;
+		goto err;
 
 	if (rcu_rdp_is_offloaded(rdp))
 		wake_up_process(t);
@@ -1471,13 +1408,21 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 	WRITE_ONCE(rdp->nocb_cb_kthread, t);
 	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
 	return;
-end:
-	mutex_lock(&rcu_state.barrier_mutex);
+
+err:
+	/*
+	 * No need to protect against concurrent rcu_barrier()
+	 * because the number of callbacks should be 0 for a non-boot CPU,
+	 * therefore rcu_barrier() shouldn't even try to grab the nocb_lock.
+	 * But hold nocb_mutex to avoid nocb_lock imbalance from shrinker.
+	 */
+	WARN_ON_ONCE(system_state > SYSTEM_BOOTING && rcu_segcblist_n_cbs(&rdp->cblist));
+	mutex_lock(&rcu_state.nocb_mutex);
 	if (rcu_rdp_is_offloaded(rdp)) {
 		rcu_nocb_rdp_deoffload(rdp);
 		cpumask_clear_cpu(cpu, rcu_nocb_mask);
 	}
-	mutex_unlock(&rcu_state.barrier_mutex);
+	mutex_unlock(&rcu_state.nocb_mutex);
 }
 
 /* How many CB CPU IDs per GP kthread? Default of -1 for sqrt(nr_cpu_ids). */
@@ -1653,16 +1598,6 @@ static void show_rcu_nocb_state(struct rcu_data *rdp)
 
 #else /* #ifdef CONFIG_RCU_NOCB_CPU */
 
-static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
-{
-	return 0;
-}
-
-static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
-{
-	return false;
-}
-
 /* No ->nocb_lock to acquire. */
 static void rcu_nocb_lock(struct rcu_data *rdp)
 {
|
@@ -24,10 +24,11 @@ static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
 	 * timers have their own means of synchronization against the
 	 * offloaded state updaters.
 	 */
-	RCU_LOCKDEP_WARN(
+	RCU_NOCB_LOCKDEP_WARN(
 		!(lockdep_is_held(&rcu_state.barrier_mutex) ||
 		  (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) ||
-		  rcu_lockdep_is_held_nocb(rdp) ||
+		  lockdep_is_held(&rdp->nocb_lock) ||
+		  lockdep_is_held(&rcu_state.nocb_mutex) ||
 		  (!(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible()) &&
 		   rdp == this_cpu_ptr(&rcu_data)) ||
 		  rcu_current_is_nocb_kthread(rdp)),
@@ -869,7 +870,7 @@ static void rcu_qs(void)
 
 /*
  * Register an urgently needed quiescent state. If there is an
- * emergency, invoke rcu_momentary_dyntick_idle() to do a heavy-weight
+ * emergency, invoke rcu_momentary_eqs() to do a heavy-weight
  * dyntick-idle quiescent state visible to other CPUs, which will in
  * some cases serve for expedited as well as normal grace periods.
  * Either way, register a lightweight quiescent state.
@@ -889,7 +890,7 @@ void rcu_all_qs(void)
 	this_cpu_write(rcu_data.rcu_urgent_qs, false);
 	if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) {
 		local_irq_save(flags);
-		rcu_momentary_dyntick_idle();
+		rcu_momentary_eqs();
 		local_irq_restore(flags);
 	}
 	rcu_qs();
@@ -909,7 +910,7 @@ void rcu_note_context_switch(bool preempt)
 		goto out;
 	this_cpu_write(rcu_data.rcu_urgent_qs, false);
 	if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs)))
-		rcu_momentary_dyntick_idle();
+		rcu_momentary_eqs();
 out:
 	rcu_tasks_qs(current, preempt);
 	trace_rcu_utilization(TPS("End context switch"));
|
@@ -10,6 +10,7 @@
 #include <linux/console.h>
 #include <linux/kvm_para.h>
 #include <linux/rcu_notifier.h>
+#include <linux/smp.h>
 
 //////////////////////////////////////////////////////////////////////////////
 //
@@ -371,6 +372,7 @@ static void rcu_dump_cpu_stacks(void)
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp) {
+		printk_deferred_enter();
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 		for_each_leaf_node_possible_cpu(rnp, cpu)
 			if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
@@ -380,6 +382,7 @@ static void rcu_dump_cpu_stacks(void)
 					dump_cpu_task(cpu);
 			}
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+		printk_deferred_exit();
 	}
 }
 
@@ -502,7 +505,7 @@ static void print_cpu_stall_info(int cpu)
 	}
 	delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
 	falsepositive = rcu_is_gp_kthread_starving(NULL) &&
-			rcu_dynticks_in_eqs(ct_dynticks_cpu(cpu));
+			rcu_watching_snap_in_eqs(ct_rcu_watching_cpu(cpu));
 	rcuc_starved = rcu_is_rcuc_kthread_starving(rdp, &j);
 	if (rcuc_starved)
 		// Print signed value, as negative values indicate a probable bug.
@@ -516,8 +519,8 @@ static void print_cpu_stall_info(int cpu)
 			rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
 				"!."[!delta],
 	       ticks_value, ticks_title,
-	       ct_dynticks_cpu(cpu) & 0xffff,
-	       ct_dynticks_nesting_cpu(cpu), ct_dynticks_nmi_nesting_cpu(cpu),
+	       ct_rcu_watching_cpu(cpu) & 0xffff,
+	       ct_nesting_cpu(cpu), ct_nmi_nesting_cpu(cpu),
 	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
 	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
 	       rcuc_starved ? buf : "",
@@ -728,6 +731,9 @@ static void print_cpu_stall(unsigned long gps)
 		set_preempt_need_resched();
 }
 
+static bool csd_lock_suppress_rcu_stall;
+module_param(csd_lock_suppress_rcu_stall, bool, 0644);
+
 static void check_cpu_stall(struct rcu_data *rdp)
 {
 	bool self_detected;
@@ -800,7 +806,9 @@ static void check_cpu_stall(struct rcu_data *rdp)
 			return;
 
 		rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_NORM, (void *)j - gps);
-		if (self_detected) {
+		if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) {
+			pr_err("INFO: %s detected stall, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name);
+		} else if (self_detected) {
 			/* We haven't checked in, so go dump stack. */
 			print_cpu_stall(gps);
 		} else {
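The check_cpu_stall() hunk above adds a third outcome ahead of the two existing ones: when the csd_lock_suppress_rcu_stall module parameter is set and csd_lock_is_stuck() reports an ongoing CSD-lock stall, only a one-line notice is printed. A small user-space sketch of that branch order follows; `enum stall_report` and `choose_stall_report()` are hypothetical names, and the final branch stands for the `else` arm whose body is not shown in the hunk.

```c
#include <stdbool.h>

/* Hypothetical labels for the three arms of the if/else-if/else chain. */
enum stall_report {
	STALL_REPORT_ABBREVIATED,	/* one-line pr_err() only */
	STALL_REPORT_SELF,		/* print_cpu_stall() arm */
	STALL_REPORT_OTHER,		/* remaining else arm */
};

/*
 * Mirror the branch order of the hunk: the suppression check wins
 * over self-detection, so a stuck CSD lock abbreviates the report
 * no matter which CPU noticed the stall.
 */
static enum stall_report choose_stall_report(bool suppress_param,
					     bool csd_lock_stuck,
					     bool self_detected)
{
	if (suppress_param && csd_lock_stuck)
		return STALL_REPORT_ABBREVIATED;
	if (self_detected)
		return STALL_REPORT_SELF;
	return STALL_REPORT_OTHER;
}
```

With the parameter left at its default of false, the function reduces to the original two-way choice on `self_detected`.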
|
@@ -5762,7 +5762,7 @@ static inline void schedule_debug(struct task_struct *prev, bool preempt)
 		preempt_count_set(PREEMPT_DISABLED);
 	}
 	rcu_sleep_check();
-	SCHED_WARN_ON(ct_state() == CONTEXT_USER);
+	SCHED_WARN_ON(ct_state() == CT_STATE_USER);
 
 	profile_hit(SCHED_PROFILING, __builtin_return_address(0));
 
@@ -6658,7 +6658,7 @@ asmlinkage __visible void __sched schedule_user(void)
 	 * we find a better solution.
 	 *
 	 * NB: There are buggy callers of this function. Ideally we
-	 * should warn if prev_state != CONTEXT_USER, but that will trigger
+	 * should warn if prev_state != CT_STATE_USER, but that will trigger
 	 * too frequently to make sense yet.
 	 */
 	enum ctx_state prev_state = exception_enter();
@@ -9752,7 +9752,7 @@ struct cgroup_subsys cpu_cgrp_subsys = {
 
 void dump_cpu_task(int cpu)
 {
-	if (cpu == smp_processor_id() && in_hardirq()) {
+	if (in_hardirq() && cpu == smp_processor_id()) {
 		struct pt_regs *regs;
 
 		regs = get_irq_regs();
|
kernel/smp.c (38 lines changed)
@@ -208,12 +208,25 @@ static int csd_lock_wait_getcpu(call_single_data_t *csd)
 	return -1;
 }
 
+static atomic_t n_csd_lock_stuck;
+
+/**
+ * csd_lock_is_stuck - Has a CSD-lock acquisition been stuck too long?
+ *
+ * Returns @true if a CSD-lock acquisition is stuck and has been stuck
+ * long enough for a "non-responsive CSD lock" message to be printed.
+ */
+bool csd_lock_is_stuck(void)
+{
+	return !!atomic_read(&n_csd_lock_stuck);
+}
+
 /*
  * Complain if too much time spent waiting. Note that only
  * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
  * so waiting on other types gets much less information.
  */
-static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
+static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id, unsigned long *nmessages)
 {
 	int cpu = -1;
 	int cpux;
@@ -229,15 +242,26 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
 		cpu = csd_lock_wait_getcpu(csd);
 		pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
 			 *bug_id, raw_smp_processor_id(), cpu);
+		atomic_dec(&n_csd_lock_stuck);
 		return true;
 	}
 
 	ts2 = sched_clock();
 	/* How long since we last checked for a stuck CSD lock.*/
 	ts_delta = ts2 - *ts1;
-	if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0))
+	if (likely(ts_delta <= csd_lock_timeout_ns * (*nmessages + 1) *
+			       (!*nmessages ? 1 : (ilog2(num_online_cpus()) / 2 + 1)) ||
+		   csd_lock_timeout_ns == 0))
 		return false;
 
+	if (ts0 > ts2) {
+		/* Our own sched_clock went backward; don't blame another CPU. */
+		ts_delta = ts0 - ts2;
+		pr_alert("sched_clock on CPU %d went backward by %llu ns\n", raw_smp_processor_id(), ts_delta);
+		*ts1 = ts2;
+		return false;
+	}
+
 	firsttime = !*bug_id;
 	if (firsttime)
 		*bug_id = atomic_inc_return(&csd_bug_count);
@@ -249,9 +273,12 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
 		cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
 	/* How long since this CSD lock was stuck. */
 	ts_delta = ts2 - ts0;
-	pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
-		 firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts_delta,
+	pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
+		 firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
 		 cpu, csd->func, csd->info);
+	(*nmessages)++;
+	if (firsttime)
+		atomic_inc(&n_csd_lock_stuck);
 	/*
 	 * If the CSD lock is still stuck after 5 minutes, it is unlikely
 	 * to become unstuck. Use a signed comparison to avoid triggering
@@ -290,12 +317,13 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
  */
 static void __csd_lock_wait(call_single_data_t *csd)
 {
+	unsigned long nmessages = 0;
 	int bug_id = 0;
 	u64 ts0, ts1;
 
 	ts1 = ts0 = sched_clock();
 	for (;;) {
-		if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id))
+		if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages))
 			break;
 		cpu_relax();
 	}
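The new test in csd_lock_wait_toolong() stretches the interval between repeated reports of the same stuck lock: the first report still fires once the elapsed time exceeds csd_lock_timeout_ns, and each later one waits (nmessages + 1) times that, multiplied again by ilog2(num_online_cpus()) / 2 + 1. The block below is a user-space sketch of that arithmetic only; `ilog2_u32()` and `csd_report_threshold_ns()` are hypothetical names, and the timeout used in the note after the block is an assumed value chosen for illustration.

```c
#include <stdint.h>

/* Floor of log2(n) for n >= 1; a stand-in for the kernel's ilog2(). */
static unsigned int ilog2_u32(uint32_t n)
{
	unsigned int log = 0;

	while (n >>= 1)
		log++;
	return log;
}

/*
 * Elapsed time (ns) since the previous check point that must be
 * exceeded before the next report is printed. The kernel handles a
 * zero timeout (reports disabled) with a separate test, which this
 * sketch does not model.
 */
static uint64_t csd_report_threshold_ns(uint64_t timeout_ns,
					unsigned long nmessages,
					uint32_t online_cpus)
{
	/* No CPU-count scaling until the first message has been printed. */
	uint64_t cpu_factor = nmessages ? ilog2_u32(online_cpus) / 2 + 1 : 1;

	return timeout_ns * (nmessages + 1) * cpu_factor;
}
```

Assuming a 5-second timeout and 64 online CPUs, the sketch yields 5 s before the first report, 40 s before the second, and 60 s before the third, so large systems see far fewer repeated "Continued" messages.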
|
@@ -251,7 +251,7 @@ static int multi_cpu_stop(void *data)
 			 */
 			touch_nmi_watchdog();
 		}
-		rcu_momentary_dyntick_idle();
+		rcu_momentary_eqs();
 	} while (curstate != MULTI_STOP_EXIT);
 
 	local_irq_restore(flags);
|
@@ -1541,7 +1541,7 @@ static int run_osnoise(void)
 		 * This will eventually cause unwarranted noise as PREEMPT_RCU
 		 * will force preemption as the means of ending the current
 		 * grace period. We avoid this problem by calling
-		 * rcu_momentary_dyntick_idle(), which performs a zero duration
+		 * rcu_momentary_eqs(), which performs a zero duration
 		 * EQS allowing PREEMPT_RCU to end the current grace period.
 		 * This call shouldn't be wrapped inside an RCU critical
 		 * section.
@@ -1553,7 +1553,7 @@ static int run_osnoise(void)
 		if (!disable_irq)
 			local_irq_disable();
 
-		rcu_momentary_dyntick_idle();
+		rcu_momentary_eqs();
 
 		if (!disable_irq)
 			local_irq_enable();
|
@ -1614,6 +1614,7 @@ config SCF_TORTURE_TEST
|
||||
config CSD_LOCK_WAIT_DEBUG
|
||||
bool "Debugging for csd_lock_wait(), called from smp_call_function*()"
|
||||
depends on DEBUG_KERNEL
|
||||
depends on SMP
|
||||
depends on 64BIT
|
||||
default n
|
||||
help
|
||||
|
@@ -21,12 +21,10 @@ fi
 bpftrace -e 'kprobe:kvfree_call_rcu,
 	kprobe:call_rcu,
 	kprobe:call_rcu_tasks,
-	kprobe:call_rcu_tasks_rude,
 	kprobe:call_rcu_tasks_trace,
 	kprobe:call_srcu,
 	kprobe:rcu_barrier,
 	kprobe:rcu_barrier_tasks,
-	kprobe:rcu_barrier_tasks_rude,
 	kprobe:rcu_barrier_tasks_trace,
 	kprobe:srcu_barrier,
 	kprobe:synchronize_rcu,
|
@@ -68,6 +68,8 @@ config_override_param "--gdb options" KcList "$TORTURE_KCONFIG_GDB_ARG"
 config_override_param "--kasan options" KcList "$TORTURE_KCONFIG_KASAN_ARG"
 config_override_param "--kcsan options" KcList "$TORTURE_KCONFIG_KCSAN_ARG"
 config_override_param "--kconfig argument" KcList "$TORTURE_KCONFIG_ARG"
+config_override_param "$config_dir/CFcommon.$(uname -m)" KcList \
+		      "`cat $config_dir/CFcommon.$(uname -m) 2> /dev/null`"
 cp $T/KcList $resdir/ConfigFragment
 
 base_resdir=`echo $resdir | sed -e 's/\.[0-9]\+$//'`
|
@@ -19,10 +19,10 @@ PATH=${RCUTORTURE}/bin:$PATH; export PATH
 
 TORTURE_ALLOTED_CPUS="`identify_qemu_vcpus`"
 MAKE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS*2))
-HALF_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2))
-if test "$HALF_ALLOTED_CPUS" -lt 1
+SCALE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2))
+if test "$SCALE_ALLOTED_CPUS" -lt 1
 then
-	HALF_ALLOTED_CPUS=1
+	SCALE_ALLOTED_CPUS=1
 fi
 VERBOSE_BATCH_CPUS=$((TORTURE_ALLOTED_CPUS/16))
 if test "$VERBOSE_BATCH_CPUS" -lt 2
@@ -90,6 +90,7 @@ usage () {
 	echo " --do-scftorture / --do-no-scftorture / --no-scftorture"
 	echo " --do-srcu-lockdep / --do-no-srcu-lockdep / --no-srcu-lockdep"
 	echo " --duration [ <minutes> | <hours>h | <days>d ]"
+	echo " --guest-cpu-limit N"
 	echo " --kcsan-kmake-arg kernel-make-arguments"
 	exit 1
 }
@@ -203,6 +204,21 @@ do
 		duration_base=$(($ts*mult))
 		shift
 		;;
+	--guest-cpu-limit|--guest-cpu-lim)
+		checkarg --guest-cpu-limit "(number)" "$#" "$2" '^[0-9]*$' '^--'
+		if (("$2" <= "$TORTURE_ALLOTED_CPUS" / 2))
+		then
+			SCALE_ALLOTED_CPUS="$2"
+			VERBOSE_BATCH_CPUS="$((SCALE_ALLOTED_CPUS/8))"
+			if (("$VERBOSE_BATCH_CPUS" < 2))
+			then
+				VERBOSE_BATCH_CPUS=0
+			fi
+		else
+			echo "Ignoring value of $2 for --guest-cpu-limit which is greater than (("$TORTURE_ALLOTED_CPUS" / 2))."
+		fi
+		shift
+		;;
 	--kcsan-kmake-arg|--kcsan-kmake-args)
 		checkarg --kcsan-kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
 		kcsan_kmake_args="`echo "$kcsan_kmake_args $2" | sed -e 's/^ *//' -e 's/ *$//'`"
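Note on the --guest-cpu-limit branch above: a limit is honored only when it is at most half of the allotted CPUs; the verbose batch size is then recomputed as one eighth of the new limit, and reset to 0 if that comes out below 2. A larger limit is ignored with a message and the existing values are kept. A minimal sketch of that decision, where `clamp_guest_cpus` and its argument order are hypothetical and do not appear in torture.sh:

```bash
#!/bin/bash
# Hypothetical helper, not part of torture.sh.
# Arguments: requested limit, allotted CPUs, current scale value,
# current verbose batch value.  Prints "<scale> <batch>" on stdout.
clamp_guest_cpus () {
	local limit="$1" allotted="$2" scale="$3" batch="$4"

	if ((limit <= allotted / 2))
	then
		# Accept the limit and derive the batch size from it.
		scale="$limit"
		batch=$((scale / 8))
		if ((batch < 2))
		then
			batch=0
		fi
	else
		# Too large: report on stderr and keep the current values.
		echo "Ignoring --guest-cpu-limit value $limit" >&2
	fi
	echo "$scale $batch"
}

clamp_guest_cpus 16 64 32 4	# accepted: prints "16 2"
clamp_guest_cpus 4 64 32 4	# accepted, batch reset: prints "4 0"
```
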
@@ -425,9 +441,9 @@ fi
 if test "$do_scftorture" = "yes"
 then
 	# Scale memory based on the number of CPUs.
-	scfmem=$((3+HALF_ALLOTED_CPUS/16))
-	torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
-	torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
+	scfmem=$((3+SCALE_ALLOTED_CPUS/16))
+	torture_bootargs="scftorture.nthreads=$SCALE_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
+	torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
 fi
 
 if test "$do_rt" = "yes"
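Note on the scfmem line above: the guest memory size is plain shell integer arithmetic, a 3 GB base plus 1 GB for every full 16 CPUs, so a smaller CPU limit also shrinks the guest's memory. A minimal sketch, where `scf_mem_gb` is a hypothetical helper name that does not appear in torture.sh:

```bash
#!/bin/bash
# Hypothetical helper, not part of torture.sh: memory in GB for a
# scftorture guest with the given number of CPUs (integer division).
scf_mem_gb () {
	echo $((3 + $1 / 16))
}

scf_mem_gb 8	# prints 3
scf_mem_gb 16	# prints 4
scf_mem_gb 64	# prints 7
```
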
@@ -471,8 +487,8 @@ for prim in $primlist
 do
 	if test -n "$firsttime"
 	then
-		torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
-		torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
+		torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$SCALE_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
+		torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
 		mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
 		if test -f "$T/last-resdir-kasan"
 		then
@@ -520,8 +536,8 @@ for prim in $primlist
 do
 	if test -n "$firsttime"
 	then
-		torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
-		torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
+		torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$SCALE_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
+		torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --trust-make
 		mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
 		if test -f "$T/last-resdir-kasan"
 		then
@@ -559,7 +575,7 @@ do_kcsan="$do_kcsan_save"
 if test "$do_kvfree" = "yes"
 then
 	torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
-	torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
+	torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory 2G --trust-make
 fi
 
 if test "$do_clocksourcewd" = "yes"

@@ -1,7 +1,5 @@
 CONFIG_RCU_TORTURE_TEST=y
 CONFIG_PRINTK_TIME=y
-CONFIG_HYPERVISOR_GUEST=y
 CONFIG_PARAVIRT=y
-CONFIG_KVM_GUEST=y
 CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n
 CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n

@@ -0,0 +1,2 @@
+CONFIG_HYPERVISOR_GUEST=y
+CONFIG_KVM_GUEST=y

@@ -0,0 +1 @@
+CONFIG_KVM_GUEST=y

@@ -0,0 +1,2 @@
+CONFIG_HYPERVISOR_GUEST=y
+CONFIG_KVM_GUEST=y

@@ -2,3 +2,4 @@ nohz_full=2-9
 rcutorture.stall_cpu=14
 rcutorture.stall_cpu_holdoff=90
 rcutorture.fwd_progress=0
+rcutree.nohz_full_patience_delay=1000

tools/testing/selftests/rcutorture/configs/refscale/TINY (new file, 20 lines)
@@ -0,0 +1,20 @@
+CONFIG_SMP=n
+CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=n
+CONFIG_PREEMPT_DYNAMIC=n
+#CHECK#CONFIG_PREEMPT_RCU=n
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_KPROBES=n
+CONFIG_FTRACE=n