Merge branch 'for-mingo-rcu' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu

Pull RCU updates from Paul E. McKenney:

- Documentation updates.

- Miscellaneous fixes.

- kfree_rcu() updates: Addition of mem_dump_obj() to provide allocator return
  addresses to more easily locate bugs.  This has a couple of RCU-related commits,
  but is mostly MM.  Was pulled in with akpm's agreement.

- Per-callback-batch tracking of numbers of callbacks,
  which enables better debugging information and smarter
  reactions to large numbers of callbacks.

- The first round of changes to allow CPUs to be runtime switched from and to
  callback-offloaded state.

- CONFIG_PREEMPT_RT-related changes.

- RCU CPU stall warning updates.
- Addition of polling grace-period APIs for SRCU.

- Torture-test and torture-test scripting updates, including a "torture everything"
  script that runs rcutorture, locktorture, scftorture, rcuscale, and refscale.
  Plus does an allmodconfig build.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit is contained in:
Ingo Molnar 2021-02-12 12:50:09 +01:00
commit 85e853c5ec
61 changed files with 2996 additions and 728 deletions

View File

@ -38,7 +38,7 @@ sections.
RCU-preempt Expedited Grace Periods
===================================
``CONFIG_PREEMPT=y`` kernels implement RCU-preempt.
``CONFIG_PREEMPTION=y`` kernels implement RCU-preempt.
The overall flow of the handling of a given CPU by an RCU-preempt
expedited grace period is shown in the following diagram:
@ -112,7 +112,7 @@ things.
RCU-sched Expedited Grace Periods
---------------------------------
``CONFIG_PREEMPT=n`` kernels implement RCU-sched. The overall flow of
``CONFIG_PREEMPTION=n`` kernels implement RCU-sched. The overall flow of
the handling of a given CPU by an RCU-sched expedited grace period is
shown in the following diagram:

File diff suppressed because it is too large Load Diff

View File

@ -70,7 +70,7 @@ over a rather long period of time, but improvements are always welcome!
is less readable and prevents lockdep from detecting locking issues.
Letting RCU-protected pointers "leak" out of an RCU read-side
critical section is every bid as bad as letting them leak out
critical section is every bit as bad as letting them leak out
from under a lock. Unless, of course, you have arranged some
other means of protection, such as a lock or a reference count
-before- letting them out of the RCU read-side critical section.
@ -129,9 +129,7 @@ over a rather long period of time, but improvements are always welcome!
accesses. The rcu_dereference() primitive ensures that
the CPU picks up the pointer before it picks up the data
that the pointer points to. This really is necessary
on Alpha CPUs. If you don't believe me, see:
http://www.openvms.compaq.com/wizard/wiz_2637.html
on Alpha CPUs.
The rcu_dereference() primitive is also an excellent
documentation aid, letting the person reading the
@ -214,9 +212,9 @@ over a rather long period of time, but improvements are always welcome!
the rest of the system.
7. As of v4.20, a given kernel implements only one RCU flavor,
which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y.
which is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y.
If the updater uses call_rcu() or synchronize_rcu(),
then the corresponding readers my use rcu_read_lock() and
then the corresponding readers may use rcu_read_lock() and
rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(),
or any pair of primitives that disables and re-enables preemption,
for example, rcu_read_lock_sched() and rcu_read_unlock_sched().

View File

@ -9,7 +9,7 @@ RCU (read-copy update) is a synchronization mechanism that can be thought
of as a replacement for read-writer locking (among other things), but with
very low-overhead readers that are immune to deadlock, priority inversion,
and unbounded latency. RCU read-side critical sections are delimited
by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPT
by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPTION
kernels, generate no code whatsoever.
This means that RCU writers are unaware of the presence of concurrent
@ -329,10 +329,10 @@ Answer: This cannot happen. The reason is that on_each_cpu() has its last
to smp_call_function() and further to smp_call_function_on_cpu(),
causing this latter to spin until the cross-CPU invocation of
rcu_barrier_func() has completed. This by itself would prevent
a grace period from completing on non-CONFIG_PREEMPT kernels,
a grace period from completing on non-CONFIG_PREEMPTION kernels,
since each CPU must undergo a context switch (or other quiescent
state) before the grace period can complete. However, this is
of no use in CONFIG_PREEMPT kernels.
of no use in CONFIG_PREEMPTION kernels.
Therefore, on_each_cpu() disables preemption across its call
to smp_call_function() and also across the local call to

View File

@ -25,7 +25,7 @@ warnings:
- A CPU looping with bottom halves disabled.
- For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel
without invoking schedule(). If the looping in the kernel is
really expected and desirable behavior, you might need to add
some calls to cond_resched().
@ -44,7 +44,7 @@ warnings:
result in the ``rcu_.*kthread starved for`` console-log message,
which will include additional debugging information.
- A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
- A CPU-bound real-time task in a CONFIG_PREEMPTION kernel, which might
happen to preempt a low-priority task in the middle of an RCU
read-side critical section. This is especially damaging if
that low-priority task is not permitted to run on any other CPU,
@ -92,7 +92,9 @@ warnings:
buggy timer hardware through bugs in the interrupt or exception
path (whether hardware, firmware, or software) through bugs
in Linux's timer subsystem through bugs in the scheduler, and,
yes, even including bugs in RCU itself.
yes, even including bugs in RCU itself. It can also result in
the ``rcu_.*timer wakeup didn't happen for`` console-log message,
which will include additional debugging information.
- A bug in the RCU implementation.
@ -292,6 +294,25 @@ kthread is waiting for a short timeout, the "state" precedes value of the
task_struct ->state field, and the "cpu" indicates that the grace-period
kthread last ran on CPU 5.
If the relevant grace-period kthread does not wake from FQS wait in a
reasonable time, then the following additional line is printed::
kthread timer wakeup didn't happen for 23804 jiffies! g7076 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
The "23804" indicates that kthread's timer expired more than 23 thousand
jiffies ago. The rest of the line has meaning similar to the kthread
starvation case.
Additionally, the following line is printed::
Possible timer handling issue on cpu=4 timer-softirq=11142
Here "cpu" indicates that the grace-period kthread last ran on CPU 4,
where it queued the fqs timer. The number following the "timer-softirq"
is the current ``TIMER_SOFTIRQ`` count on cpu 4. If this value does not
change on successive RCU CPU stall warnings, there is further reason to
suspect a timer problem.
Multiple Warnings From One Stall
================================

View File

@ -683,7 +683,7 @@ Quick Quiz #1:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This section presents a "toy" RCU implementation that is based on
"classic RCU". It is also short on performance (but only for updates) and
on features such as hotplug CPU and the ability to run in CONFIG_PREEMPT
on features such as hotplug CPU and the ability to run in CONFIG_PREEMPTION
kernels. The definitions of rcu_dereference() and rcu_assign_pointer()
are the same as those shown in the preceding section, so they are omitted.
::
@ -739,7 +739,7 @@ Quick Quiz #2:
Quick Quiz #3:
If it is illegal to block in an RCU read-side
critical section, what the heck do you do in
PREEMPT_RT, where normal spinlocks can block???
CONFIG_PREEMPT_RT, where normal spinlocks can block???
:ref:`Answers to Quick Quiz <8_whatisRCU>`
@ -1093,7 +1093,7 @@ Quick Quiz #2:
overhead is **negative**.
Answer:
Imagine a single-CPU system with a non-CONFIG_PREEMPT
Imagine a single-CPU system with a non-CONFIG_PREEMPTION
kernel where a routing table is used by process-context
code, but can be updated by irq-context code (for example,
by an "ICMP REDIRECT" packet). The usual way of handling
@ -1120,10 +1120,10 @@ Answer:
Quick Quiz #3:
If it is illegal to block in an RCU read-side
critical section, what the heck do you do in
PREEMPT_RT, where normal spinlocks can block???
CONFIG_PREEMPT_RT, where normal spinlocks can block???
Answer:
Just as PREEMPT_RT permits preemption of spinlock
Just as CONFIG_PREEMPT_RT permits preemption of spinlock
critical sections, it permits preemption of RCU
read-side critical sections. It also permits
spinlocks blocking while in RCU read-side critical

View File

@ -4092,6 +4092,10 @@
value, meaning that RCU_SOFTIRQ is used by default.
Specify rcutree.use_softirq=0 to use rcuc kthreads.
But note that CONFIG_PREEMPT_RT=y kernels disable
this kernel boot parameter, forcibly setting it
to zero.
rcutree.rcu_fanout_exact= [KNL]
Disable autobalancing of the rcu_node combining
tree. This is used by rcutorture, and might
@ -4179,12 +4183,6 @@
Set wakeup interval for idle CPUs that have
RCU callbacks (RCU_FAST_NO_HZ=y).
rcutree.rcu_idle_lazy_gp_delay= [KNL]
Set wakeup interval for idle CPUs that have
only "lazy" RCU callbacks (RCU_FAST_NO_HZ=y).
Lazy RCU callbacks are those which RCU can
prove do nothing more than free memory.
rcutree.rcu_kick_kthreads= [KNL]
Cause the grace-period kthread to get an extra
wake_up() if it sleeps three times longer than
@ -4338,6 +4336,14 @@
stress RCU, they don't participate in the actual
test, hence the "fake".
rcutorture.nocbs_nthreads= [KNL]
Set number of RCU callback-offload togglers.
Zero (the default) disables toggling.
rcutorture.nocbs_toggle= [KNL]
Set the delay in milliseconds between successive
callback-offload toggling attempts.
rcutorture.nreaders= [KNL]
Set number of RCU readers. The value -1 selects
N-1, where N is the number of CPUs. A value
@ -4470,6 +4476,13 @@
only normal grace-period primitives. No effect
on CONFIG_TINY_RCU kernels.
But note that CONFIG_PREEMPT_RT=y kernels enables
this kernel boot parameter, forcibly setting
it to the value one, that is, converting any
post-boot attempt at an expedited RCU grace
period to instead use normal non-expedited
grace-period processing.
rcupdate.rcu_task_ipi_delay= [KNL]
Set time in jiffies during which RCU tasks will
avoid sending IPIs, starting with the beginning
@ -4557,6 +4570,12 @@
refscale.verbose= [KNL]
Enable additional printk() statements.
refscale.verbose_batched= [KNL]
Batch the additional printk() statements. If zero
(the default) or negative, print everything. Otherwise,
print every Nth verbose statement, where N is the value
specified.
relax_domain_level=
[KNL, SMP] Set scheduler's default relax_domain_level.
See Documentation/admin-guide/cgroup-v1/cpusets.rst.
@ -5331,6 +5350,14 @@
are running concurrently, especially on systems
with rotating-rust storage.
torture.verbose_sleep_frequency= [KNL]
Specifies how many verbose printk()s should be
emitted between each sleep. The default of zero
disables verbose-printk() sleeping.
torture.verbose_sleep_duration= [KNL]
Duration of each verbose-printk() sleep in jiffies.
tp720= [HW,PS2]
tpm_suspend_pcr=[HW,TPM]

View File

@ -111,6 +111,8 @@ static inline void cpu_maps_update_done(void)
#endif /* CONFIG_SMP */
extern struct bus_type cpu_subsys;
extern int lockdep_is_cpus_held(void);
#ifdef CONFIG_HOTPLUG_CPU
extern void cpus_write_lock(void);
extern void cpus_write_unlock(void);

View File

@ -901,7 +901,7 @@ static inline void hlist_add_before(struct hlist_node *n,
}
/**
* hlist_add_behing - add a new entry after the one specified
* hlist_add_behind - add a new entry after the one specified
* @n: new entry to be added
* @prev: hlist node to add it after, which must be non-NULL
*/

View File

@ -3177,5 +3177,7 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
extern int sysctl_nr_trim_pages;
void mem_dump_obj(void *object);
#endif /* __KERNEL__ */
#endif /* _LINUX_MM_H */

View File

@ -63,6 +63,122 @@ struct rcu_cblist {
#define RCU_NEXT_TAIL 3
#define RCU_CBLIST_NSEGS 4
/*
* ==NOCB Offloading state machine==
*
*
* ----------------------------------------------------------------------------
* | SEGCBLIST_SOFTIRQ_ONLY |
* | |
* | Callbacks processed by rcu_core() from softirqs or local |
* | rcuc kthread, without holding nocb_lock. |
* ----------------------------------------------------------------------------
* |
* v
* ----------------------------------------------------------------------------
* | SEGCBLIST_OFFLOADED |
* | |
* | Callbacks processed by rcu_core() from softirqs or local |
* | rcuc kthread, while holding nocb_lock. Waking up CB and GP kthreads, |
* | allowing nocb_timer to be armed. |
* ----------------------------------------------------------------------------
* |
* v
* -----------------------------------
* | |
* v v
* --------------------------------------- ----------------------------------|
* | SEGCBLIST_OFFLOADED | | | SEGCBLIST_OFFLOADED | |
* | SEGCBLIST_KTHREAD_CB | | SEGCBLIST_KTHREAD_GP |
* | | | |
* | | | |
* | CB kthread woke up and | | GP kthread woke up and |
* | acknowledged SEGCBLIST_OFFLOADED. | | acknowledged SEGCBLIST_OFFLOADED|
* | Processes callbacks concurrently | | |
* | with rcu_core(), holding | | |
* | nocb_lock. | | |
* --------------------------------------- -----------------------------------
* | |
* -----------------------------------
* |
* v
* |--------------------------------------------------------------------------|
* | SEGCBLIST_OFFLOADED | |
* | SEGCBLIST_KTHREAD_CB | |
* | SEGCBLIST_KTHREAD_GP |
* | |
* | Kthreads handle callbacks holding nocb_lock, local rcu_core() stops |
* | handling callbacks. |
* ----------------------------------------------------------------------------
*/
/*
* ==NOCB De-Offloading state machine==
*
*
* |--------------------------------------------------------------------------|
* | SEGCBLIST_OFFLOADED | |
* | SEGCBLIST_KTHREAD_CB | |
* | SEGCBLIST_KTHREAD_GP |
* | |
* | CB/GP kthreads handle callbacks holding nocb_lock, local rcu_core() |
* | ignores callbacks. |
* ----------------------------------------------------------------------------
* |
* v
* |--------------------------------------------------------------------------|
* | SEGCBLIST_KTHREAD_CB | |
* | SEGCBLIST_KTHREAD_GP |
* | |
* | CB/GP kthreads and local rcu_core() handle callbacks concurrently |
* | holding nocb_lock. Wake up CB and GP kthreads if necessary. |
* ----------------------------------------------------------------------------
* |
* v
* -----------------------------------
* | |
* v v
* ---------------------------------------------------------------------------|
* | |
* | SEGCBLIST_KTHREAD_CB | SEGCBLIST_KTHREAD_GP |
* | | |
* | GP kthread woke up and | CB kthread woke up and |
* | acknowledged the fact that | acknowledged the fact that |
* | SEGCBLIST_OFFLOADED got cleared. | SEGCBLIST_OFFLOADED got cleared. |
* | | The CB kthread goes to sleep |
* | The callbacks from the target CPU | until it ever gets re-offloaded. |
* | will be ignored from the GP kthread | |
* | loop. | |
* ----------------------------------------------------------------------------
* | |
* -----------------------------------
* |
* v
* ----------------------------------------------------------------------------
* | 0 |
* | |
* | Callbacks processed by rcu_core() from softirqs or local |
* | rcuc kthread, while holding nocb_lock. Forbid nocb_timer to be armed. |
* | Flush pending nocb_timer. Flush nocb bypass callbacks. |
* ----------------------------------------------------------------------------
* |
* v
* ----------------------------------------------------------------------------
* | SEGCBLIST_SOFTIRQ_ONLY |
* | |
* | Callbacks processed by rcu_core() from softirqs or local |
* | rcuc kthread, without holding nocb_lock. |
* ----------------------------------------------------------------------------
*/
#define SEGCBLIST_ENABLED BIT(0)
#define SEGCBLIST_SOFTIRQ_ONLY BIT(1)
#define SEGCBLIST_KTHREAD_CB BIT(2)
#define SEGCBLIST_KTHREAD_GP BIT(3)
#define SEGCBLIST_OFFLOADED BIT(4)
struct rcu_segcblist {
struct rcu_head *head;
struct rcu_head **tails[RCU_CBLIST_NSEGS];
@ -72,8 +188,8 @@ struct rcu_segcblist {
#else
long len;
#endif
u8 enabled;
u8 offloaded;
long seglen[RCU_CBLIST_NSEGS];
u8 flags;
};
#define RCU_SEGCBLIST_INITIALIZER(n) \

View File

@ -33,6 +33,8 @@
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))
#define ulong2long(a) (*(long *)(&(a)))
#define USHORT_CMP_GE(a, b) (USHRT_MAX / 2 >= (unsigned short)((a) - (b)))
#define USHORT_CMP_LT(a, b) (USHRT_MAX / 2 < (unsigned short)((a) - (b)))
/* Exported common interfaces */
void call_rcu(struct rcu_head *head, rcu_callback_t func);
@ -110,8 +112,12 @@ static inline void rcu_user_exit(void) { }
#ifdef CONFIG_RCU_NOCB_CPU
void rcu_init_nohz(void);
int rcu_nocb_cpu_offload(int cpu);
int rcu_nocb_cpu_deoffload(int cpu);
#else /* #ifdef CONFIG_RCU_NOCB_CPU */
static inline void rcu_init_nohz(void) { }
static inline int rcu_nocb_cpu_offload(int cpu) { return -EINVAL; }
static inline int rcu_nocb_cpu_deoffload(int cpu) { return 0; }
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
/**
@ -846,19 +852,11 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
*/
#define __is_kvfree_rcu_offset(offset) ((offset) < 4096)
/*
* Helper macro for kfree_rcu() to prevent argument-expansion eyestrain.
*/
#define __kvfree_rcu(head, offset) \
do { \
BUILD_BUG_ON(!__is_kvfree_rcu_offset(offset)); \
kvfree_call_rcu(head, (rcu_callback_t)(unsigned long)(offset)); \
} while (0)
/**
* kfree_rcu() - kfree an object after a grace period.
* @ptr: pointer to kfree
* @rhf: the name of the struct rcu_head within the type of @ptr.
* @ptr: pointer to kfree for both single- and double-argument invocations.
* @rhf: the name of the struct rcu_head within the type of @ptr,
* but only for double-argument invocations.
*
* Many rcu callbacks functions just call kfree() on the base structure.
* These functions are trivial, but their size adds up, and furthermore
@ -871,7 +869,7 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
* Because the functions are not allowed in the low-order 4096 bytes of
* kernel virtual memory, offsets up to 4095 bytes can be accommodated.
* If the offset is larger than 4095 bytes, a compile-time error will
* be generated in __kvfree_rcu(). If this error is triggered, you can
* be generated in kvfree_rcu_arg_2(). If this error is triggered, you can
* either fall back to use of call_rcu() or rearrange the structure to
* position the rcu_head structure into the first 4096 bytes.
*
@ -881,13 +879,7 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
* The BUILD_BUG_ON check must not involve any function calls, hence the
* checks are done in macros here.
*/
#define kfree_rcu(ptr, rhf) \
do { \
typeof (ptr) ___p = (ptr); \
\
if (___p) \
__kvfree_rcu(&((___p)->rhf), offsetof(typeof(*(ptr)), rhf)); \
} while (0)
#define kfree_rcu kvfree_rcu
/**
* kvfree_rcu() - kvfree an object after a grace period.
@ -919,7 +911,17 @@ do { \
kvfree_rcu_arg_2, kvfree_rcu_arg_1)(__VA_ARGS__)
#define KVFREE_GET_MACRO(_1, _2, NAME, ...) NAME
#define kvfree_rcu_arg_2(ptr, rhf) kfree_rcu(ptr, rhf)
#define kvfree_rcu_arg_2(ptr, rhf) \
do { \
typeof (ptr) ___p = (ptr); \
\
if (___p) { \
BUILD_BUG_ON(!__is_kvfree_rcu_offset(offsetof(typeof(*(ptr)), rhf))); \
kvfree_call_rcu(&((___p)->rhf), (rcu_callback_t)(unsigned long) \
(offsetof(typeof(*(ptr)), rhf))); \
} \
} while (0)
#define kvfree_rcu_arg_1(ptr) \
do { \
typeof(ptr) ___p = (ptr); \

View File

@ -186,6 +186,8 @@ void kfree(const void *);
void kfree_sensitive(const void *);
size_t __ksize(const void *);
size_t ksize(const void *);
bool kmem_valid_obj(void *object);
void kmem_dump_obj(void *object);
#ifdef CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR
void __check_heap_object(const void *ptr, unsigned long n, struct page *page,

View File

@ -60,6 +60,9 @@ void cleanup_srcu_struct(struct srcu_struct *ssp);
int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp);
void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
void synchronize_srcu(struct srcu_struct *ssp);
unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp);
unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp);
bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie);
#ifdef CONFIG_DEBUG_LOCK_ALLOC

View File

@ -15,7 +15,8 @@
struct srcu_struct {
short srcu_lock_nesting[2]; /* srcu_read_lock() nesting depth. */
short srcu_idx; /* Current reader array element. */
unsigned short srcu_idx; /* Current reader array element in bit 0x2. */
unsigned short srcu_idx_max; /* Furthest future srcu_idx request. */
u8 srcu_gp_running; /* GP workqueue running? */
u8 srcu_gp_waiting; /* GP waiting for readers? */
struct swait_queue_head srcu_wq;
@ -59,7 +60,7 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
{
int idx;
idx = READ_ONCE(ssp->srcu_idx);
idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1;
WRITE_ONCE(ssp->srcu_lock_nesting[idx], ssp->srcu_lock_nesting[idx] + 1);
return idx;
}
@ -80,7 +81,7 @@ static inline void srcu_torture_stats_print(struct srcu_struct *ssp,
{
int idx;
idx = READ_ONCE(ssp->srcu_idx) & 0x1;
idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1;
pr_alert("%s%s Tiny SRCU per-CPU(idx=%d): (%hd,%hd)\n",
tt, tf, idx,
READ_ONCE(ssp->srcu_lock_nesting[!idx]),

View File

@ -192,6 +192,8 @@ extern int try_to_del_timer_sync(struct timer_list *timer);
#define del_singleshot_timer_sync(t) del_timer_sync(t)
extern bool timer_curr_running(struct timer_list *timer);
extern void init_timers(void);
struct hrtimer;
extern enum hrtimer_restart it_real_fn(struct hrtimer *);

View File

@ -32,11 +32,27 @@
#define TOROUT_STRING(s) \
pr_alert("%s" TORTURE_FLAG " %s\n", torture_type, s)
#define VERBOSE_TOROUT_STRING(s) \
do { if (verbose) pr_alert("%s" TORTURE_FLAG " %s\n", torture_type, s); } while (0)
do { \
if (verbose) { \
verbose_torout_sleep(); \
pr_alert("%s" TORTURE_FLAG " %s\n", torture_type, s); \
} \
} while (0)
#define VERBOSE_TOROUT_ERRSTRING(s) \
do { if (verbose) pr_alert("%s" TORTURE_FLAG "!!! %s\n", torture_type, s); } while (0)
do { \
if (verbose) { \
verbose_torout_sleep(); \
pr_alert("%s" TORTURE_FLAG "!!! %s\n", torture_type, s); \
} \
} while (0)
void verbose_torout_sleep(void);
/* Definitions for online/offline exerciser. */
#ifdef CONFIG_HOTPLUG_CPU
int torture_num_online_cpus(void);
#else /* #ifdef CONFIG_HOTPLUG_CPU */
static inline int torture_num_online_cpus(void) { return 1; }
#endif /* #else #ifdef CONFIG_HOTPLUG_CPU */
typedef void torture_ofl_func(void);
bool torture_offline(int cpu, long *n_onl_attempts, long *n_onl_successes,
unsigned long *sum_offl, int *min_onl, int *max_onl);
@ -61,6 +77,13 @@ static inline void torture_random_init(struct torture_random_state *trsp)
trsp->trs_count = 0;
}
/* Definitions for high-resolution-timer sleeps. */
int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, struct torture_random_state *trsp);
int torture_hrtimeout_us(u32 baset_us, u32 fuzzt_ns, struct torture_random_state *trsp);
int torture_hrtimeout_ms(u32 baset_ms, u32 fuzzt_us, struct torture_random_state *trsp);
int torture_hrtimeout_jiffies(u32 baset_j, struct torture_random_state *trsp);
int torture_hrtimeout_s(u32 baset_s, u32 fuzzt_ms, struct torture_random_state *trsp);
/* Task shuffler, which causes CPUs to occasionally go idle. */
void torture_shuffle_task_register(struct task_struct *tp);
int torture_shuffle_init(long shuffint);

View File

@ -241,4 +241,10 @@ pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
int register_vmap_purge_notifier(struct notifier_block *nb);
int unregister_vmap_purge_notifier(struct notifier_block *nb);
#ifdef CONFIG_MMU
bool vmalloc_dump_obj(void *object);
#else
static inline bool vmalloc_dump_obj(void *object) { return false; }
#endif
#endif /* _LINUX_VMALLOC_H */

View File

@ -505,6 +505,32 @@ TRACE_EVENT_RCU(rcu_callback,
__entry->qlen)
);
TRACE_EVENT_RCU(rcu_segcb_stats,
TP_PROTO(struct rcu_segcblist *rs, const char *ctx),
TP_ARGS(rs, ctx),
TP_STRUCT__entry(
__field(const char *, ctx)
__array(unsigned long, gp_seq, RCU_CBLIST_NSEGS)
__array(long, seglen, RCU_CBLIST_NSEGS)
),
TP_fast_assign(
__entry->ctx = ctx;
memcpy(__entry->seglen, rs->seglen, RCU_CBLIST_NSEGS * sizeof(long));
memcpy(__entry->gp_seq, rs->gp_seq, RCU_CBLIST_NSEGS * sizeof(unsigned long));
),
TP_printk("%s seglen: (DONE=%ld, WAIT=%ld, NEXT_READY=%ld, NEXT=%ld) "
"gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
__entry->seglen[0], __entry->seglen[1], __entry->seglen[2], __entry->seglen[3],
__entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
);
/*
* Tracepoint for the registration of a single RCU callback of the special
* kvfree() form. The first argument is the RCU type, the second argument

View File

@ -330,6 +330,13 @@ void lockdep_assert_cpus_held(void)
percpu_rwsem_assert_held(&cpu_hotplug_lock);
}
#ifdef CONFIG_LOCKDEP
int lockdep_is_cpus_held(void)
{
return percpu_rwsem_is_held(&cpu_hotplug_lock);
}
#endif
static void lockdep_acquire_cpus_lock(void)
{
rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 0, _THIS_IP_);

View File

@ -27,7 +27,6 @@
#include <linux/moduleparam.h>
#include <linux/delay.h>
#include <linux/slab.h>
#include <linux/percpu-rwsem.h>
#include <linux/torture.h>
#include <linux/reboot.h>

View File

@ -95,6 +95,7 @@ config TASKS_RUDE_RCU
config TASKS_TRACE_RCU
def_bool 0
select IRQ_WORK
help
This option enables a task-based RCU implementation that uses
explicit rcu_read_lock_trace() read-side markers, and allows
@ -188,8 +189,8 @@ config RCU_FAST_NO_HZ
config RCU_BOOST
bool "Enable RCU priority boosting"
depends on RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT
default n
depends on (RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT) || PREEMPT_RT
default y if PREEMPT_RT
help
This option boosts the priority of preempted RCU readers that
block the current preemptible RCU grace period for too long.

View File

@ -378,7 +378,11 @@ do { \
smp_mb__after_unlock_lock(); \
} while (0)
#define raw_spin_unlock_rcu_node(p) raw_spin_unlock(&ACCESS_PRIVATE(p, lock))
#define raw_spin_unlock_rcu_node(p) \
do { \
lockdep_assert_irqs_disabled(); \
raw_spin_unlock(&ACCESS_PRIVATE(p, lock)); \
} while (0)
#define raw_spin_lock_irq_rcu_node(p) \
do { \
@ -387,7 +391,10 @@ do { \
} while (0)
#define raw_spin_unlock_irq_rcu_node(p) \
raw_spin_unlock_irq(&ACCESS_PRIVATE(p, lock))
do { \
lockdep_assert_irqs_disabled(); \
raw_spin_unlock_irq(&ACCESS_PRIVATE(p, lock)); \
} while (0)
#define raw_spin_lock_irqsave_rcu_node(p, flags) \
do { \
@ -396,7 +403,10 @@ do { \
} while (0)
#define raw_spin_unlock_irqrestore_rcu_node(p, flags) \
raw_spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags)
do { \
lockdep_assert_irqs_disabled(); \
raw_spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags); \
} while (0)
#define raw_spin_trylock_rcu_node(p) \
({ \

View File

@ -7,10 +7,10 @@
* Authors: Paul E. McKenney <paulmck@linux.ibm.com>
*/
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/cpu.h>
#include <linux/interrupt.h>
#include <linux/rcupdate.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include "rcu_segcblist.h"
@ -88,23 +88,135 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
#endif
}
/* Get the length of a segment of the rcu_segcblist structure. */
static long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg)
{
return READ_ONCE(rsclp->seglen[seg]);
}
/* Return number of callbacks in segmented callback list by summing seglen. */
long rcu_segcblist_n_segment_cbs(struct rcu_segcblist *rsclp)
{
long len = 0;
int i;
for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
len += rcu_segcblist_get_seglen(rsclp, i);
return len;
}
/* Set the length of a segment of the rcu_segcblist structure. */
static void rcu_segcblist_set_seglen(struct rcu_segcblist *rsclp, int seg, long v)
{
WRITE_ONCE(rsclp->seglen[seg], v);
}
/* Increase the numeric length of a segment by a specified amount. */
static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
{
WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
}
/* Move from's segment length to to's segment. */
static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
{
long len;
if (from == to)
return;
len = rcu_segcblist_get_seglen(rsclp, from);
if (!len)
return;
rcu_segcblist_add_seglen(rsclp, to, len);
rcu_segcblist_set_seglen(rsclp, from, 0);
}
/* Increment segment's length. */
static void rcu_segcblist_inc_seglen(struct rcu_segcblist *rsclp, int seg)
{
rcu_segcblist_add_seglen(rsclp, seg, 1);
}
/*
* Increase the numeric length of an rcu_segcblist structure by the
* specified amount, which can be negative. This can cause the ->len
* field to disagree with the actual number of callbacks on the structure.
* This increase is fully ordered with respect to the callers accesses
* both before and after.
*
* So why on earth is a memory barrier required both before and after
* the update to the ->len field???
*
* The reason is that rcu_barrier() locklessly samples each CPU's ->len
* field, and if a given CPU's field is zero, avoids IPIing that CPU.
* This can of course race with both queuing and invoking of callbacks.
* Failing to correctly handle either of these races could result in
* rcu_barrier() failing to IPI a CPU that actually had callbacks queued
* which rcu_barrier() was obligated to wait on. And if rcu_barrier()
* failed to wait on such a callback, unloading certain kernel modules
* would result in calls to functions whose code was no longer present in
* the kernel, for but one example.
*
* Therefore, ->len transitions from 1->0 and 0->1 have to be carefully
* ordered with respect with both list modifications and the rcu_barrier().
*
* The queuing case is CASE 1 and the invoking case is CASE 2.
*
* CASE 1: Suppose that CPU 0 has no callbacks queued, but invokes
* call_rcu() just as CPU 1 invokes rcu_barrier(). CPU 0's ->len field
* will transition from 0->1, which is one of the transitions that must
* be handled carefully. Without the full memory barriers after the ->len
* update and at the beginning of rcu_barrier(), the following could happen:
*
* CPU 0 CPU 1
*
* call_rcu().
* rcu_barrier() sees ->len as 0.
* set ->len = 1.
* rcu_barrier() does nothing.
* module is unloaded.
* callback invokes unloaded function!
*
* With the full barriers, any case where rcu_barrier() sees ->len as 0 will
* have unambiguously preceded the return from the racing call_rcu(), which
* means that this call_rcu() invocation is OK to not wait on. After all,
* you are supposed to make sure that any problematic call_rcu() invocations
* happen before the rcu_barrier().
*
*
* CASE 2: Suppose that CPU 0 is invoking its last callback just as
* CPU 1 invokes rcu_barrier(). CPU 0's ->len field will transition from
* 1->0, which is one of the transitions that must be handled carefully.
* Without the full memory barriers before the ->len update and at the
* end of rcu_barrier(), the following could happen:
*
* CPU 0 CPU 1
*
* start invoking last callback
* set ->len = 0 (reordered)
* rcu_barrier() sees ->len as 0
* rcu_barrier() does nothing.
* module is unloaded
* callback executing after unloaded!
*
* With the full barriers, any case where rcu_barrier() sees ->len as 0
* will be fully ordered after the completion of the callback function,
* so that the module unloading operation is completely safe.
*
*/
static void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
{
#ifdef CONFIG_RCU_NOCB_CPU
smp_mb__before_atomic(); /* Up to the caller! */
smp_mb__before_atomic(); // Read header comment above.
atomic_long_add(v, &rsclp->len);
smp_mb__after_atomic(); /* Up to the caller! */
smp_mb__after_atomic(); // Read header comment above.
#else
smp_mb(); /* Up to the caller! */
smp_mb(); // Read header comment above.
WRITE_ONCE(rsclp->len, rsclp->len + v);
smp_mb(); /* Up to the caller! */
smp_mb(); // Read header comment above.
#endif
}
@ -119,26 +231,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
rcu_segcblist_add_len(rsclp, 1);
}
/*
* Exchange the numeric length of the specified rcu_segcblist structure
* with the specified value. This can cause the ->len field to disagree
* with the actual number of callbacks on the structure. This exchange is
* fully ordered with respect to the callers accesses both before and after.
*/
static long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
{
#ifdef CONFIG_RCU_NOCB_CPU
return atomic_long_xchg(&rsclp->len, v);
#else
long ret = rsclp->len;
smp_mb(); /* Up to the caller! */
WRITE_ONCE(rsclp->len, v);
smp_mb(); /* Up to the caller! */
return ret;
#endif
}
/*
* Initialize an rcu_segcblist structure.
*/
@ -149,10 +241,12 @@ void rcu_segcblist_init(struct rcu_segcblist *rsclp)
BUILD_BUG_ON(RCU_NEXT_TAIL + 1 != ARRAY_SIZE(rsclp->gp_seq));
BUILD_BUG_ON(ARRAY_SIZE(rsclp->tails) != ARRAY_SIZE(rsclp->gp_seq));
rsclp->head = NULL;
for (i = 0; i < RCU_CBLIST_NSEGS; i++)
for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
rsclp->tails[i] = &rsclp->head;
rcu_segcblist_set_seglen(rsclp, i, 0);
}
rcu_segcblist_set_len(rsclp, 0);
rsclp->enabled = 1;
rcu_segcblist_set_flags(rsclp, SEGCBLIST_ENABLED);
}
/*
@ -163,16 +257,21 @@ void rcu_segcblist_disable(struct rcu_segcblist *rsclp)
{
WARN_ON_ONCE(!rcu_segcblist_empty(rsclp));
WARN_ON_ONCE(rcu_segcblist_n_cbs(rsclp));
rsclp->enabled = 0;
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_ENABLED);
}
/*
* Mark the specified rcu_segcblist structure as offloaded. This
* structure must be empty.
*/
void rcu_segcblist_offload(struct rcu_segcblist *rsclp)
void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
{
rsclp->offloaded = 1;
if (offload) {
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_SOFTIRQ_ONLY);
rcu_segcblist_set_flags(rsclp, SEGCBLIST_OFFLOADED);
} else {
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_OFFLOADED);
}
}
/*
@ -245,7 +344,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
struct rcu_head *rhp)
{
rcu_segcblist_inc_len(rsclp);
smp_mb(); /* Ensure counts are updated before callback is enqueued. */
rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
rhp->next = NULL;
WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], &rhp->next);
@ -274,27 +373,13 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
if (rsclp->tails[i] != rsclp->tails[i - 1])
break;
rcu_segcblist_inc_seglen(rsclp, i);
WRITE_ONCE(*rsclp->tails[i], rhp);
for (; i <= RCU_NEXT_TAIL; i++)
WRITE_ONCE(rsclp->tails[i], &rhp->next);
return true;
}
/*
* Extract only the counts from the specified rcu_segcblist structure,
* and place them in the specified rcu_cblist structure. This function
* supports both callback orphaning and invocation, hence the separation
* of counts and callbacks. (Callbacks ready for invocation must be
* orphaned and adopted separately from pending callbacks, but counts
* apply to all callbacks. Locking must be used to make sure that
* both orphaned-callbacks lists are consistent.)
*/
void rcu_segcblist_extract_count(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp)
{
rclp->len = rcu_segcblist_xchg_len(rsclp, 0);
}
/*
* Extract only those callbacks ready to be invoked from the specified
* rcu_segcblist structure and place them in the specified rcu_cblist
@ -307,6 +392,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
if (!rcu_segcblist_ready_cbs(rsclp))
return; /* Nothing to do. */
rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_DONE_TAIL);
*rclp->tail = rsclp->head;
WRITE_ONCE(rsclp->head, *rsclp->tails[RCU_DONE_TAIL]);
WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
@ -314,6 +400,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
for (i = RCU_CBLIST_NSEGS - 1; i >= RCU_DONE_TAIL; i--)
if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
WRITE_ONCE(rsclp->tails[i], &rsclp->head);
rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
}
/*
@ -330,11 +417,15 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
if (!rcu_segcblist_pend_cbs(rsclp))
return; /* Nothing to do. */
rclp->len = 0;
*rclp->tail = *rsclp->tails[RCU_DONE_TAIL];
rclp->tail = rsclp->tails[RCU_NEXT_TAIL];
WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++)
for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++) {
rclp->len += rcu_segcblist_get_seglen(rsclp, i);
WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
rcu_segcblist_set_seglen(rsclp, i, 0);
}
}
/*
@ -345,7 +436,6 @@ void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp)
{
rcu_segcblist_add_len(rsclp, rclp->len);
rclp->len = 0;
}
/*
@ -359,6 +449,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
if (!rclp->head)
return; /* No callbacks to move. */
rcu_segcblist_add_seglen(rsclp, RCU_DONE_TAIL, rclp->len);
*rclp->tail = rsclp->head;
WRITE_ONCE(rsclp->head, rclp->head);
for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
@ -379,6 +470,8 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
{
if (!rclp->head)
return; /* Nothing to do. */
rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
}
@ -403,6 +496,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
break;
WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
}
/* If no callbacks moved, nothing more need be done. */
@ -423,6 +517,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
break; /* No more callbacks. */
WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
rcu_segcblist_move_seglen(rsclp, i, j);
rsclp->gp_seq[j] = rsclp->gp_seq[i];
}
}
@ -444,7 +539,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
*/
bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
{
int i;
int i, j;
WARN_ON_ONCE(!rcu_segcblist_is_enabled(rsclp));
if (rcu_segcblist_restempty(rsclp, RCU_DONE_TAIL))
@ -487,6 +582,10 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
if (rcu_segcblist_restempty(rsclp, i) || ++i >= RCU_NEXT_TAIL)
return false;
/* Accounting: everything below i is about to get merged into i. */
for (j = i + 1; j <= RCU_NEXT_TAIL; j++)
rcu_segcblist_move_seglen(rsclp, j, i);
/*
* Merge all later callbacks, including newly arrived callbacks,
* into the segment located by the for-loop above. Assign "seq"
@ -514,13 +613,24 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
struct rcu_cblist donecbs;
struct rcu_cblist pendcbs;
lockdep_assert_cpus_held();
rcu_cblist_init(&donecbs);
rcu_cblist_init(&pendcbs);
rcu_segcblist_extract_count(src_rsclp, &donecbs);
rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
/*
* No need smp_mb() before setting length to 0, because CPU hotplug
* lock excludes rcu_barrier.
*/
rcu_segcblist_set_len(src_rsclp, 0);
rcu_segcblist_insert_count(dst_rsclp, &donecbs);
rcu_segcblist_insert_count(dst_rsclp, &pendcbs);
rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
rcu_segcblist_init(src_rsclp);
}

View File

@ -15,6 +15,9 @@ static inline long rcu_cblist_n_cbs(struct rcu_cblist *rclp)
return READ_ONCE(rclp->len);
}
/* Return number of callbacks in segmented callback list by summing seglen. */
long rcu_segcblist_n_segment_cbs(struct rcu_segcblist *rsclp);
void rcu_cblist_init(struct rcu_cblist *rclp);
void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp);
void rcu_cblist_flush_enqueue(struct rcu_cblist *drclp,
@ -50,19 +53,51 @@ static inline long rcu_segcblist_n_cbs(struct rcu_segcblist *rsclp)
#endif
}
static inline void rcu_segcblist_set_flags(struct rcu_segcblist *rsclp,
int flags)
{
rsclp->flags |= flags;
}
static inline void rcu_segcblist_clear_flags(struct rcu_segcblist *rsclp,
int flags)
{
rsclp->flags &= ~flags;
}
static inline bool rcu_segcblist_test_flags(struct rcu_segcblist *rsclp,
int flags)
{
return READ_ONCE(rsclp->flags) & flags;
}
/*
* Is the specified rcu_segcblist enabled, for example, not corresponding
* to an offline CPU?
*/
static inline bool rcu_segcblist_is_enabled(struct rcu_segcblist *rsclp)
{
return rsclp->enabled;
return rcu_segcblist_test_flags(rsclp, SEGCBLIST_ENABLED);
}
/* Is the specified rcu_segcblist offloaded? */
/* Is the specified rcu_segcblist offloaded, or is SEGCBLIST_SOFTIRQ_ONLY set? */
static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp)
{
return IS_ENABLED(CONFIG_RCU_NOCB_CPU) && rsclp->offloaded;
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
!rcu_segcblist_test_flags(rsclp, SEGCBLIST_SOFTIRQ_ONLY))
return true;
return false;
}
static inline bool rcu_segcblist_completely_offloaded(struct rcu_segcblist *rsclp)
{
int flags = SEGCBLIST_KTHREAD_CB | SEGCBLIST_KTHREAD_GP | SEGCBLIST_OFFLOADED;
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) && (rsclp->flags & flags) == flags)
return true;
return false;
}
/*
@ -75,10 +110,22 @@ static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
return !READ_ONCE(*READ_ONCE(rsclp->tails[seg]));
}
/*
* Is the specified segment of the specified rcu_segcblist structure
* empty of callbacks?
*/
static inline bool rcu_segcblist_segempty(struct rcu_segcblist *rsclp, int seg)
{
if (seg == RCU_DONE_TAIL)
return &rsclp->head == rsclp->tails[RCU_DONE_TAIL];
return rsclp->tails[seg - 1] == rsclp->tails[seg];
}
void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
void rcu_segcblist_init(struct rcu_segcblist *rsclp);
void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload);
bool rcu_segcblist_ready_cbs(struct rcu_segcblist *rsclp);
bool rcu_segcblist_pend_cbs(struct rcu_segcblist *rsclp);
struct rcu_head *rcu_segcblist_first_cb(struct rcu_segcblist *rsclp);
@ -88,8 +135,6 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
struct rcu_head *rhp);
bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
struct rcu_head *rhp);
void rcu_segcblist_extract_count(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp);
void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp);
void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,

View File

@ -85,6 +85,7 @@ torture_param(bool, gp_cond, false, "Use conditional/async GP wait primitives");
torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
torture_param(bool, gp_normal, false,
"Use normal (non-expedited) GP wait primitives");
torture_param(bool, gp_poll, false, "Use polling GP wait primitives");
torture_param(bool, gp_sync, false, "Use synchronous GP wait primitives");
torture_param(int, irqreader, 1, "Allow RCU readers from irq handlers");
torture_param(int, leakpointer, 0, "Leak pointer dereferences from readers");
@ -97,6 +98,8 @@ torture_param(int, object_debug, 0,
torture_param(int, onoff_holdoff, 0, "Time after boot before CPU hotplugs (s)");
torture_param(int, onoff_interval, 0,
"Time between CPU hotplugs (jiffies), 0=disable");
torture_param(int, nocbs_nthreads, 0, "Number of NOCB toggle threads, 0 to disable");
torture_param(int, nocbs_toggle, 1000, "Time between toggling nocb state (ms)");
torture_param(int, read_exit_delay, 13,
"Delay between read-then-exit episodes (s)");
torture_param(int, read_exit_burst, 16,
@ -127,10 +130,12 @@ static char *torture_type = "rcu";
module_param(torture_type, charp, 0444);
MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, srcu, ...)");
static int nrealnocbers;
static int nrealreaders;
static struct task_struct *writer_task;
static struct task_struct **fakewriter_tasks;
static struct task_struct **reader_tasks;
static struct task_struct **nocb_tasks;
static struct task_struct *stats_task;
static struct task_struct *fqs_task;
static struct task_struct *boost_tasks[NR_CPUS];
@ -142,11 +147,22 @@ static struct task_struct *read_exit_task;
#define RCU_TORTURE_PIPE_LEN 10
// Mailbox-like structure to check RCU global memory ordering.
struct rcu_torture_reader_check {
unsigned long rtc_myloops;
int rtc_chkrdr;
unsigned long rtc_chkloops;
int rtc_ready;
struct rcu_torture_reader_check *rtc_assigner;
} ____cacheline_internodealigned_in_smp;
// Update-side data structure used to check RCU readers.
struct rcu_torture {
struct rcu_head rtort_rcu;
int rtort_pipe_count;
struct list_head rtort_free;
int rtort_mbtest;
struct rcu_torture_reader_check *rtort_chkp;
};
static LIST_HEAD(rcu_torture_freelist);
@ -157,10 +173,13 @@ static DEFINE_SPINLOCK(rcu_torture_lock);
static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], rcu_torture_count);
static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], rcu_torture_batch);
static atomic_t rcu_torture_wcount[RCU_TORTURE_PIPE_LEN + 1];
static struct rcu_torture_reader_check *rcu_torture_reader_mbchk;
static atomic_t n_rcu_torture_alloc;
static atomic_t n_rcu_torture_alloc_fail;
static atomic_t n_rcu_torture_free;
static atomic_t n_rcu_torture_mberror;
static atomic_t n_rcu_torture_mbchk_fail;
static atomic_t n_rcu_torture_mbchk_tries;
static atomic_t n_rcu_torture_error;
static long n_rcu_torture_barrier_error;
static long n_rcu_torture_boost_ktrerror;
@ -174,6 +193,8 @@ static unsigned long n_read_exits;
static struct list_head rcu_torture_removed;
static unsigned long shutdown_jiffies;
static unsigned long start_gp_seq;
static atomic_long_t n_nocb_offload;
static atomic_long_t n_nocb_deoffload;
static int rcu_torture_writer_state;
#define RTWS_FIXED_DELAY 0
@ -183,9 +204,11 @@ static int rcu_torture_writer_state;
#define RTWS_EXP_SYNC 4
#define RTWS_COND_GET 5
#define RTWS_COND_SYNC 6
#define RTWS_SYNC 7
#define RTWS_STUTTER 8
#define RTWS_STOPPING 9
#define RTWS_POLL_GET 7
#define RTWS_POLL_WAIT 8
#define RTWS_SYNC 9
#define RTWS_STUTTER 10
#define RTWS_STOPPING 11
static const char * const rcu_torture_writer_state_names[] = {
"RTWS_FIXED_DELAY",
"RTWS_DELAY",
@ -194,6 +217,8 @@ static const char * const rcu_torture_writer_state_names[] = {
"RTWS_EXP_SYNC",
"RTWS_COND_GET",
"RTWS_COND_SYNC",
"RTWS_POLL_GET",
"RTWS_POLL_WAIT",
"RTWS_SYNC",
"RTWS_STUTTER",
"RTWS_STOPPING",
@ -311,7 +336,9 @@ struct rcu_torture_ops {
void (*deferred_free)(struct rcu_torture *p);
void (*sync)(void);
void (*exp_sync)(void);
unsigned long (*get_state)(void);
unsigned long (*get_gp_state)(void);
unsigned long (*start_gp_poll)(void);
bool (*poll_gp_state)(unsigned long oldstate);
void (*cond_sync)(unsigned long oldstate);
call_rcu_func_t call;
void (*cb_barrier)(void);
@ -386,7 +413,12 @@ static bool
rcu_torture_pipe_update_one(struct rcu_torture *rp)
{
int i;
struct rcu_torture_reader_check *rtrcp = READ_ONCE(rp->rtort_chkp);
if (rtrcp) {
WRITE_ONCE(rp->rtort_chkp, NULL);
smp_store_release(&rtrcp->rtc_ready, 1); // Pair with smp_load_acquire().
}
i = READ_ONCE(rp->rtort_pipe_count);
if (i > RCU_TORTURE_PIPE_LEN)
i = RCU_TORTURE_PIPE_LEN;
@ -461,7 +493,7 @@ static struct rcu_torture_ops rcu_ops = {
.deferred_free = rcu_torture_deferred_free,
.sync = synchronize_rcu,
.exp_sync = synchronize_rcu_expedited,
.get_state = get_state_synchronize_rcu,
.get_gp_state = get_state_synchronize_rcu,
.cond_sync = cond_synchronize_rcu,
.call = call_rcu,
.cb_barrier = rcu_barrier,
@ -570,6 +602,21 @@ static void srcu_torture_synchronize(void)
synchronize_srcu(srcu_ctlp);
}
static unsigned long srcu_torture_get_gp_state(void)
{
return get_state_synchronize_srcu(srcu_ctlp);
}
static unsigned long srcu_torture_start_gp_poll(void)
{
return start_poll_synchronize_srcu(srcu_ctlp);
}
static bool srcu_torture_poll_gp_state(unsigned long oldstate)
{
return poll_state_synchronize_srcu(srcu_ctlp, oldstate);
}
static void srcu_torture_call(struct rcu_head *head,
rcu_callback_t func)
{
@ -601,6 +648,9 @@ static struct rcu_torture_ops srcu_ops = {
.deferred_free = srcu_torture_deferred_free,
.sync = srcu_torture_synchronize,
.exp_sync = srcu_torture_synchronize_expedited,
.get_gp_state = srcu_torture_get_gp_state,
.start_gp_poll = srcu_torture_start_gp_poll,
.poll_gp_state = srcu_torture_poll_gp_state,
.call = srcu_torture_call,
.cb_barrier = srcu_torture_barrier,
.stats = srcu_torture_stats,
@ -1018,42 +1068,26 @@ rcu_torture_fqs(void *arg)
return 0;
}
/*
* RCU torture writer kthread. Repeatedly substitutes a new structure
* for that pointed to by rcu_torture_current, freeing the old structure
* after a series of grace periods (the "pipeline").
*/
static int
rcu_torture_writer(void *arg)
{
bool can_expedite = !rcu_gp_is_expedited() && !rcu_gp_is_normal();
int expediting = 0;
unsigned long gp_snap;
bool gp_cond1 = gp_cond, gp_exp1 = gp_exp, gp_normal1 = gp_normal;
bool gp_sync1 = gp_sync;
int i;
int oldnice = task_nice(current);
struct rcu_torture *rp;
struct rcu_torture *old_rp;
static DEFINE_TORTURE_RANDOM(rand);
bool stutter_waited;
int synctype[] = { RTWS_DEF_FREE, RTWS_EXP_SYNC,
RTWS_COND_GET, RTWS_SYNC };
int nsynctypes = 0;
// Used by writers to randomly choose from the available grace-period
// primitives. The only purpose of the initialization is to size the array.
static int synctype[] = { RTWS_DEF_FREE, RTWS_EXP_SYNC, RTWS_COND_GET, RTWS_POLL_GET, RTWS_SYNC };
static int nsynctypes;
VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
if (!can_expedite)
pr_alert("%s" TORTURE_FLAG
" GP expediting controlled from boot/sysfs for %s.\n",
torture_type, cur_ops->name);
/*
* Determine which grace-period primitives are available.
*/
static void rcu_torture_write_types(void)
{
bool gp_cond1 = gp_cond, gp_exp1 = gp_exp, gp_normal1 = gp_normal;
bool gp_poll1 = gp_poll, gp_sync1 = gp_sync;
/* Initialize synctype[] array. If none set, take default. */
if (!gp_cond1 && !gp_exp1 && !gp_normal1 && !gp_sync1)
gp_cond1 = gp_exp1 = gp_normal1 = gp_sync1 = true;
if (gp_cond1 && cur_ops->get_state && cur_ops->cond_sync) {
if (!gp_cond1 && !gp_exp1 && !gp_normal1 && !gp_poll1 && !gp_sync1)
gp_cond1 = gp_exp1 = gp_normal1 = gp_poll1 = gp_sync1 = true;
if (gp_cond1 && cur_ops->get_gp_state && cur_ops->cond_sync) {
synctype[nsynctypes++] = RTWS_COND_GET;
pr_info("%s: Testing conditional GPs.\n", __func__);
} else if (gp_cond && (!cur_ops->get_state || !cur_ops->cond_sync)) {
} else if (gp_cond && (!cur_ops->get_gp_state || !cur_ops->cond_sync)) {
pr_alert("%s: gp_cond without primitives.\n", __func__);
}
if (gp_exp1 && cur_ops->exp_sync) {
@ -1068,12 +1102,46 @@ rcu_torture_writer(void *arg)
} else if (gp_normal && !cur_ops->deferred_free) {
pr_alert("%s: gp_normal without primitives.\n", __func__);
}
if (gp_poll1 && cur_ops->start_gp_poll && cur_ops->poll_gp_state) {
synctype[nsynctypes++] = RTWS_POLL_GET;
pr_info("%s: Testing polling GPs.\n", __func__);
} else if (gp_poll && (!cur_ops->start_gp_poll || !cur_ops->poll_gp_state)) {
pr_alert("%s: gp_poll without primitives.\n", __func__);
}
if (gp_sync1 && cur_ops->sync) {
synctype[nsynctypes++] = RTWS_SYNC;
pr_info("%s: Testing normal GPs.\n", __func__);
} else if (gp_sync && !cur_ops->sync) {
pr_alert("%s: gp_sync without primitives.\n", __func__);
}
}
/*
* RCU torture writer kthread. Repeatedly substitutes a new structure
* for that pointed to by rcu_torture_current, freeing the old structure
* after a series of grace periods (the "pipeline").
*/
static int
rcu_torture_writer(void *arg)
{
bool boot_ended;
bool can_expedite = !rcu_gp_is_expedited() && !rcu_gp_is_normal();
unsigned long cookie;
int expediting = 0;
unsigned long gp_snap;
int i;
int idx;
int oldnice = task_nice(current);
struct rcu_torture *rp;
struct rcu_torture *old_rp;
static DEFINE_TORTURE_RANDOM(rand);
bool stutter_waited;
VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
if (!can_expedite)
pr_alert("%s" TORTURE_FLAG
" GP expediting controlled from boot/sysfs for %s.\n",
torture_type, cur_ops->name);
if (WARN_ONCE(nsynctypes == 0,
"rcu_torture_writer: No update-side primitives.\n")) {
/*
@ -1087,7 +1155,7 @@ rcu_torture_writer(void *arg)
do {
rcu_torture_writer_state = RTWS_FIXED_DELAY;
schedule_timeout_uninterruptible(1);
torture_hrtimeout_us(500, 1000, &rand);
rp = rcu_torture_alloc();
if (rp == NULL)
continue;
@ -1107,6 +1175,18 @@ rcu_torture_writer(void *arg)
atomic_inc(&rcu_torture_wcount[i]);
WRITE_ONCE(old_rp->rtort_pipe_count,
old_rp->rtort_pipe_count + 1);
if (cur_ops->get_gp_state && cur_ops->poll_gp_state) {
idx = cur_ops->readlock();
cookie = cur_ops->get_gp_state();
WARN_ONCE(rcu_torture_writer_state != RTWS_DEF_FREE &&
cur_ops->poll_gp_state(cookie),
"%s: Cookie check 1 failed %s(%d) %lu->%lu\n",
__func__,
rcu_torture_writer_state_getname(),
rcu_torture_writer_state,
cookie, cur_ops->get_gp_state());
cur_ops->readunlock(idx);
}
switch (synctype[torture_random(&rand) % nsynctypes]) {
case RTWS_DEF_FREE:
rcu_torture_writer_state = RTWS_DEF_FREE;
@ -1119,15 +1199,21 @@ rcu_torture_writer(void *arg)
break;
case RTWS_COND_GET:
rcu_torture_writer_state = RTWS_COND_GET;
gp_snap = cur_ops->get_state();
i = torture_random(&rand) % 16;
if (i != 0)
schedule_timeout_interruptible(i);
udelay(torture_random(&rand) % 1000);
gp_snap = cur_ops->get_gp_state();
torture_hrtimeout_jiffies(torture_random(&rand) % 16, &rand);
rcu_torture_writer_state = RTWS_COND_SYNC;
cur_ops->cond_sync(gp_snap);
rcu_torture_pipe_update(old_rp);
break;
case RTWS_POLL_GET:
rcu_torture_writer_state = RTWS_POLL_GET;
gp_snap = cur_ops->start_gp_poll();
rcu_torture_writer_state = RTWS_POLL_WAIT;
while (!cur_ops->poll_gp_state(gp_snap))
torture_hrtimeout_jiffies(torture_random(&rand) % 16,
&rand);
rcu_torture_pipe_update(old_rp);
break;
case RTWS_SYNC:
rcu_torture_writer_state = RTWS_SYNC;
cur_ops->sync();
@ -1137,6 +1223,14 @@ rcu_torture_writer(void *arg)
WARN_ON_ONCE(1);
break;
}
if (cur_ops->get_gp_state && cur_ops->poll_gp_state)
WARN_ONCE(rcu_torture_writer_state != RTWS_DEF_FREE &&
!cur_ops->poll_gp_state(cookie),
"%s: Cookie check 2 failed %s(%d) %lu->%lu\n",
__func__,
rcu_torture_writer_state_getname(),
rcu_torture_writer_state,
cookie, cur_ops->get_gp_state());
}
WRITE_ONCE(rcu_torture_current_version,
rcu_torture_current_version + 1);
@ -1155,12 +1249,13 @@ rcu_torture_writer(void *arg)
!rcu_gp_is_normal();
}
rcu_torture_writer_state = RTWS_STUTTER;
boot_ended = rcu_inkernel_boot_has_ended();
stutter_waited = stutter_wait("rcu_torture_writer");
if (stutter_waited &&
!READ_ONCE(rcu_fwd_cb_nodelay) &&
!cur_ops->slow_gps &&
!torture_must_stop() &&
rcu_inkernel_boot_has_ended())
boot_ended)
for (i = 0; i < ARRAY_SIZE(rcu_tortures); i++)
if (list_empty(&rcu_tortures[i].rtort_free) &&
rcu_access_pointer(rcu_torture_current) !=
@ -1194,26 +1289,43 @@ rcu_torture_writer(void *arg)
static int
rcu_torture_fakewriter(void *arg)
{
unsigned long gp_snap;
DEFINE_TORTURE_RANDOM(rand);
VERBOSE_TOROUT_STRING("rcu_torture_fakewriter task started");
set_user_nice(current, MAX_NICE);
do {
schedule_timeout_uninterruptible(1 + torture_random(&rand)%10);
udelay(torture_random(&rand) & 0x3ff);
torture_hrtimeout_jiffies(torture_random(&rand) % 10, &rand);
if (cur_ops->cb_barrier != NULL &&
torture_random(&rand) % (nfakewriters * 8) == 0) {
cur_ops->cb_barrier();
} else if (gp_normal == gp_exp) {
if (cur_ops->sync && torture_random(&rand) & 0x80)
cur_ops->sync();
else if (cur_ops->exp_sync)
} else {
switch (synctype[torture_random(&rand) % nsynctypes]) {
case RTWS_DEF_FREE:
break;
case RTWS_EXP_SYNC:
cur_ops->exp_sync();
} else if (gp_normal && cur_ops->sync) {
cur_ops->sync();
} else if (cur_ops->exp_sync) {
cur_ops->exp_sync();
break;
case RTWS_COND_GET:
gp_snap = cur_ops->get_gp_state();
torture_hrtimeout_jiffies(torture_random(&rand) % 16, &rand);
cur_ops->cond_sync(gp_snap);
break;
case RTWS_POLL_GET:
gp_snap = cur_ops->start_gp_poll();
while (!cur_ops->poll_gp_state(gp_snap)) {
torture_hrtimeout_jiffies(torture_random(&rand) % 16,
&rand);
}
break;
case RTWS_SYNC:
cur_ops->sync();
break;
default:
WARN_ON_ONCE(1);
break;
}
}
stutter_wait("rcu_torture_fakewriter");
} while (!torture_must_stop());
@ -1227,6 +1339,62 @@ static void rcu_torture_timer_cb(struct rcu_head *rhp)
kfree(rhp);
}
// Set up and carry out testing of RCU's global memory ordering
static void rcu_torture_reader_do_mbchk(long myid, struct rcu_torture *rtp,
struct torture_random_state *trsp)
{
unsigned long loops;
int noc = torture_num_online_cpus();
int rdrchked;
int rdrchker;
struct rcu_torture_reader_check *rtrcp; // Me.
struct rcu_torture_reader_check *rtrcp_assigner; // Assigned us to do checking.
struct rcu_torture_reader_check *rtrcp_chked; // Reader being checked.
struct rcu_torture_reader_check *rtrcp_chker; // Reader doing checking when not me.
if (myid < 0)
return; // Don't try this from timer handlers.
// Increment my counter.
rtrcp = &rcu_torture_reader_mbchk[myid];
WRITE_ONCE(rtrcp->rtc_myloops, rtrcp->rtc_myloops + 1);
// Attempt to assign someone else some checking work.
rdrchked = torture_random(trsp) % nrealreaders;
rtrcp_chked = &rcu_torture_reader_mbchk[rdrchked];
rdrchker = torture_random(trsp) % nrealreaders;
rtrcp_chker = &rcu_torture_reader_mbchk[rdrchker];
if (rdrchked != myid && rdrchked != rdrchker && noc >= rdrchked && noc >= rdrchker &&
smp_load_acquire(&rtrcp->rtc_chkrdr) < 0 && // Pairs with smp_store_release below.
!READ_ONCE(rtp->rtort_chkp) &&
!smp_load_acquire(&rtrcp_chker->rtc_assigner)) { // Pairs with smp_store_release below.
rtrcp->rtc_chkloops = READ_ONCE(rtrcp_chked->rtc_myloops);
WARN_ON_ONCE(rtrcp->rtc_chkrdr >= 0);
rtrcp->rtc_chkrdr = rdrchked;
WARN_ON_ONCE(rtrcp->rtc_ready); // This gets set after the grace period ends.
if (cmpxchg_relaxed(&rtrcp_chker->rtc_assigner, NULL, rtrcp) ||
cmpxchg_relaxed(&rtp->rtort_chkp, NULL, rtrcp))
(void)cmpxchg_relaxed(&rtrcp_chker->rtc_assigner, rtrcp, NULL); // Back out.
}
// If assigned some completed work, do it!
rtrcp_assigner = READ_ONCE(rtrcp->rtc_assigner);
if (!rtrcp_assigner || !smp_load_acquire(&rtrcp_assigner->rtc_ready))
return; // No work or work not yet ready.
rdrchked = rtrcp_assigner->rtc_chkrdr;
if (WARN_ON_ONCE(rdrchked < 0))
return;
rtrcp_chked = &rcu_torture_reader_mbchk[rdrchked];
loops = READ_ONCE(rtrcp_chked->rtc_myloops);
atomic_inc(&n_rcu_torture_mbchk_tries);
if (ULONG_CMP_LT(loops, rtrcp_assigner->rtc_chkloops))
atomic_inc(&n_rcu_torture_mbchk_fail);
rtrcp_assigner->rtc_chkloops = loops + ULONG_MAX / 2;
rtrcp_assigner->rtc_ready = 0;
smp_store_release(&rtrcp->rtc_assigner, NULL); // Someone else can assign us work.
smp_store_release(&rtrcp_assigner->rtc_chkrdr, -1); // Assigner can again assign.
}
/*
* Do one extension of an RCU read-side critical section using the
* current reader state in readstate (set to zero for initial entry
@ -1362,8 +1530,9 @@ rcutorture_loop_extend(int *readstate, struct torture_random_state *trsp,
* no data to read. Can be invoked both from process context and
* from a timer handler.
*/
static bool rcu_torture_one_read(struct torture_random_state *trsp)
static bool rcu_torture_one_read(struct torture_random_state *trsp, long myid)
{
unsigned long cookie;
int i;
unsigned long started;
unsigned long completed;
@ -1379,6 +1548,8 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp)
WARN_ON_ONCE(!rcu_is_watching());
newstate = rcutorture_extend_mask(readstate, trsp);
rcutorture_one_extend(&readstate, newstate, trsp, rtrsp++);
if (cur_ops->get_gp_state && cur_ops->poll_gp_state)
cookie = cur_ops->get_gp_state();
started = cur_ops->get_gp_seq();
ts = rcu_trace_clock_local();
p = rcu_dereference_check(rcu_torture_current,
@ -1394,6 +1565,7 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp)
}
if (p->rtort_mbtest == 0)
atomic_inc(&n_rcu_torture_mberror);
rcu_torture_reader_do_mbchk(myid, p, trsp);
rtrsp = rcutorture_loop_extend(&readstate, trsp, rtrsp);
preempt_disable();
pipe_count = READ_ONCE(p->rtort_pipe_count);
@ -1415,6 +1587,13 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp)
}
__this_cpu_inc(rcu_torture_batch[completed]);
preempt_enable();
if (cur_ops->get_gp_state && cur_ops->poll_gp_state)
WARN_ONCE(cur_ops->poll_gp_state(cookie),
"%s: Cookie check 3 failed %s(%d) %lu->%lu\n",
__func__,
rcu_torture_writer_state_getname(),
rcu_torture_writer_state,
cookie, cur_ops->get_gp_state());
rcutorture_one_extend(&readstate, 0, trsp, rtrsp);
WARN_ON_ONCE(readstate & RCUTORTURE_RDR_MASK);
// This next splat is expected behavior if leakpointer, especially
@ -1443,7 +1622,7 @@ static DEFINE_TORTURE_RANDOM_PERCPU(rcu_torture_timer_rand);
static void rcu_torture_timer(struct timer_list *unused)
{
atomic_long_inc(&n_rcu_torture_timers);
(void)rcu_torture_one_read(this_cpu_ptr(&rcu_torture_timer_rand));
(void)rcu_torture_one_read(this_cpu_ptr(&rcu_torture_timer_rand), -1);
/* Test call_rcu() invocation from interrupt handler. */
if (cur_ops->call) {
@ -1479,13 +1658,13 @@ rcu_torture_reader(void *arg)
if (!timer_pending(&t))
mod_timer(&t, jiffies + 1);
}
if (!rcu_torture_one_read(&rand) && !torture_must_stop())
if (!rcu_torture_one_read(&rand, myid) && !torture_must_stop())
schedule_timeout_interruptible(HZ);
if (time_after(jiffies, lastsleep) && !torture_must_stop()) {
schedule_timeout_interruptible(1);
torture_hrtimeout_us(500, 1000, &rand);
lastsleep = jiffies + 10;
}
while (num_online_cpus() < mynumonline && !torture_must_stop())
while (torture_num_online_cpus() < mynumonline && !torture_must_stop())
schedule_timeout_interruptible(HZ / 5);
stutter_wait("rcu_torture_reader");
} while (!torture_must_stop());
@ -1498,6 +1677,53 @@ rcu_torture_reader(void *arg)
return 0;
}
/*
* Randomly Toggle CPUs' callback-offload state. This uses hrtimers to
* increase race probabilities and fuzzes the interval between toggling.
*/
static int rcu_nocb_toggle(void *arg)
{
int cpu;
int maxcpu = -1;
int oldnice = task_nice(current);
long r;
DEFINE_TORTURE_RANDOM(rand);
ktime_t toggle_delay;
unsigned long toggle_fuzz;
ktime_t toggle_interval = ms_to_ktime(nocbs_toggle);
VERBOSE_TOROUT_STRING("rcu_nocb_toggle task started");
while (!rcu_inkernel_boot_has_ended())
schedule_timeout_interruptible(HZ / 10);
for_each_online_cpu(cpu)
maxcpu = cpu;
WARN_ON(maxcpu < 0);
if (toggle_interval > ULONG_MAX)
toggle_fuzz = ULONG_MAX >> 3;
else
toggle_fuzz = toggle_interval >> 3;
if (toggle_fuzz <= 0)
toggle_fuzz = NSEC_PER_USEC;
do {
r = torture_random(&rand);
cpu = (r >> 4) % (maxcpu + 1);
if (r & 0x1) {
rcu_nocb_cpu_offload(cpu);
atomic_long_inc(&n_nocb_offload);
} else {
rcu_nocb_cpu_deoffload(cpu);
atomic_long_inc(&n_nocb_deoffload);
}
toggle_delay = torture_random(&rand) % toggle_fuzz + toggle_interval;
set_current_state(TASK_INTERRUPTIBLE);
schedule_hrtimeout(&toggle_delay, HRTIMER_MODE_REL);
if (stutter_wait("rcu_nocb_toggle"))
sched_set_normal(current, oldnice);
} while (!torture_must_stop());
torture_kthread_stopping("rcu_nocb_toggle");
return 0;
}
/*
* Print torture statistics. Caller must ensure that there is only
* one call to this function at a given time!!! This is normally
@ -1539,8 +1765,9 @@ rcu_torture_stats_print(void)
atomic_read(&n_rcu_torture_alloc),
atomic_read(&n_rcu_torture_alloc_fail),
atomic_read(&n_rcu_torture_free));
pr_cont("rtmbe: %d rtbe: %ld rtbke: %ld rtbre: %ld ",
pr_cont("rtmbe: %d rtmbkf: %d/%d rtbe: %ld rtbke: %ld rtbre: %ld ",
atomic_read(&n_rcu_torture_mberror),
atomic_read(&n_rcu_torture_mbchk_fail), atomic_read(&n_rcu_torture_mbchk_tries),
n_rcu_torture_barrier_error,
n_rcu_torture_boost_ktrerror,
n_rcu_torture_boost_rterror);
@ -1553,16 +1780,20 @@ rcu_torture_stats_print(void)
data_race(n_barrier_successes),
data_race(n_barrier_attempts),
data_race(n_rcu_torture_barrier_error));
pr_cont("read-exits: %ld\n", data_race(n_read_exits));
pr_cont("read-exits: %ld ", data_race(n_read_exits)); // Statistic.
pr_cont("nocb-toggles: %ld:%ld\n",
atomic_long_read(&n_nocb_offload), atomic_long_read(&n_nocb_deoffload));
pr_alert("%s%s ", torture_type, TORTURE_FLAG);
if (atomic_read(&n_rcu_torture_mberror) ||
atomic_read(&n_rcu_torture_mbchk_fail) ||
n_rcu_torture_barrier_error || n_rcu_torture_boost_ktrerror ||
n_rcu_torture_boost_rterror || n_rcu_torture_boost_failure ||
i > 1) {
pr_cont("%s", "!!! ");
atomic_inc(&n_rcu_torture_error);
WARN_ON_ONCE(atomic_read(&n_rcu_torture_mberror));
WARN_ON_ONCE(atomic_read(&n_rcu_torture_mbchk_fail));
WARN_ON_ONCE(n_rcu_torture_barrier_error); // rcu_barrier()
WARN_ON_ONCE(n_rcu_torture_boost_ktrerror); // no boost kthread
WARN_ON_ONCE(n_rcu_torture_boost_rterror); // can't set RT prio
@ -1647,7 +1878,8 @@ rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
"stall_cpu_block=%d "
"n_barrier_cbs=%d "
"onoff_interval=%d onoff_holdoff=%d "
"read_exit_delay=%d read_exit_burst=%d\n",
"read_exit_delay=%d read_exit_burst=%d "
"nocbs_nthreads=%d nocbs_toggle=%d\n",
torture_type, tag, nrealreaders, nfakewriters,
stat_interval, verbose, test_no_idle_hz, shuffle_interval,
stutter, irqreader, fqs_duration, fqs_holdoff, fqs_stutter,
@ -1657,7 +1889,8 @@ rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
stall_cpu_block,
n_barrier_cbs,
onoff_interval, onoff_holdoff,
read_exit_delay, read_exit_burst);
read_exit_delay, read_exit_burst,
nocbs_nthreads, nocbs_toggle);
}
static int rcutorture_booster_cleanup(unsigned int cpu)
@ -2392,7 +2625,7 @@ static int rcu_torture_read_exit_child(void *trsp_in)
// Minimize time between reading and exiting.
while (!kthread_should_stop())
schedule_timeout_uninterruptible(1);
(void)rcu_torture_one_read(trsp);
(void)rcu_torture_one_read(trsp, -1);
return 0;
}
@ -2500,6 +2733,13 @@ rcu_torture_cleanup(void)
torture_stop_kthread(rcu_torture_stall, stall_task);
torture_stop_kthread(rcu_torture_writer, writer_task);
if (nocb_tasks) {
for (i = 0; i < nrealnocbers; i++)
torture_stop_kthread(rcu_nocb_toggle, nocb_tasks[i]);
kfree(nocb_tasks);
nocb_tasks = NULL;
}
if (reader_tasks) {
for (i = 0; i < nrealreaders; i++)
torture_stop_kthread(rcu_torture_reader,
@ -2507,6 +2747,8 @@ rcu_torture_cleanup(void)
kfree(reader_tasks);
reader_tasks = NULL;
}
kfree(rcu_torture_reader_mbchk);
rcu_torture_reader_mbchk = NULL;
if (fakewriter_tasks) {
for (i = 0; i < nfakewriters; i++)
@ -2604,6 +2846,7 @@ static void rcu_test_debug_objects(void)
#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
struct rcu_head rh1;
struct rcu_head rh2;
struct rcu_head *rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
init_rcu_head_on_stack(&rh1);
init_rcu_head_on_stack(&rh2);
@ -2616,6 +2859,10 @@ static void rcu_test_debug_objects(void)
local_irq_disable(); /* Make it harder to start a new grace period. */
call_rcu(&rh2, rcu_torture_leak_cb);
call_rcu(&rh2, rcu_torture_err_cb); /* Duplicate callback. */
if (rhp) {
call_rcu(rhp, rcu_torture_leak_cb);
call_rcu(rhp, rcu_torture_err_cb); /* Another duplicate callback. */
}
local_irq_enable();
rcu_read_unlock();
preempt_enable();
@ -2710,6 +2957,8 @@ rcu_torture_init(void)
atomic_set(&n_rcu_torture_alloc_fail, 0);
atomic_set(&n_rcu_torture_free, 0);
atomic_set(&n_rcu_torture_mberror, 0);
atomic_set(&n_rcu_torture_mbchk_fail, 0);
atomic_set(&n_rcu_torture_mbchk_tries, 0);
atomic_set(&n_rcu_torture_error, 0);
n_rcu_torture_barrier_error = 0;
n_rcu_torture_boost_ktrerror = 0;
@ -2729,6 +2978,7 @@ rcu_torture_init(void)
/* Start up the kthreads. */
rcu_torture_write_types();
firsterr = torture_create_kthread(rcu_torture_writer, NULL,
writer_task);
if (firsterr)
@ -2751,17 +3001,40 @@ rcu_torture_init(void)
}
reader_tasks = kcalloc(nrealreaders, sizeof(reader_tasks[0]),
GFP_KERNEL);
if (reader_tasks == NULL) {
rcu_torture_reader_mbchk = kcalloc(nrealreaders, sizeof(*rcu_torture_reader_mbchk),
GFP_KERNEL);
if (!reader_tasks || !rcu_torture_reader_mbchk) {
VERBOSE_TOROUT_ERRSTRING("out of memory");
firsterr = -ENOMEM;
goto unwind;
}
for (i = 0; i < nrealreaders; i++) {
rcu_torture_reader_mbchk[i].rtc_chkrdr = -1;
firsterr = torture_create_kthread(rcu_torture_reader, (void *)i,
reader_tasks[i]);
if (firsterr)
goto unwind;
}
nrealnocbers = nocbs_nthreads;
if (WARN_ON(nrealnocbers < 0))
nrealnocbers = 1;
if (WARN_ON(nocbs_toggle < 0))
nocbs_toggle = HZ;
if (nrealnocbers > 0) {
nocb_tasks = kcalloc(nrealnocbers, sizeof(nocb_tasks[0]), GFP_KERNEL);
if (nocb_tasks == NULL) {
VERBOSE_TOROUT_ERRSTRING("out of memory");
firsterr = -ENOMEM;
goto unwind;
}
} else {
nocb_tasks = NULL;
}
for (i = 0; i < nrealnocbers; i++) {
firsterr = torture_create_kthread(rcu_nocb_toggle, NULL, nocb_tasks[i]);
if (firsterr)
goto unwind;
}
if (stat_interval > 0) {
firsterr = torture_create_kthread(rcu_torture_stats, NULL,
stats_task);

View File

@ -46,6 +46,18 @@
#define VERBOSE_SCALEOUT(s, x...) \
do { if (verbose) pr_alert("%s" SCALE_FLAG s, scale_type, ## x); } while (0)
static atomic_t verbose_batch_ctr;
#define VERBOSE_SCALEOUT_BATCH(s, x...) \
do { \
if (verbose && \
(verbose_batched <= 0 || \
!(atomic_inc_return(&verbose_batch_ctr) % verbose_batched))) { \
schedule_timeout_uninterruptible(1); \
pr_alert("%s" SCALE_FLAG s, scale_type, ## x); \
} \
} while (0)
#define VERBOSE_SCALEOUT_ERRSTRING(s, x...) \
do { if (verbose) pr_alert("%s" SCALE_FLAG "!!! " s, scale_type, ## x); } while (0)
@ -57,6 +69,7 @@ module_param(scale_type, charp, 0444);
MODULE_PARM_DESC(scale_type, "Type of test (rcu, srcu, refcnt, rwsem, rwlock.");
torture_param(int, verbose, 0, "Enable verbose debugging printk()s");
torture_param(int, verbose_batched, 0, "Batch verbose debugging printk()s");
// Wait until there are multiple CPUs before starting test.
torture_param(int, holdoff, IS_BUILTIN(CONFIG_RCU_REF_SCALE_TEST) ? 10 : 0,
@ -368,14 +381,14 @@ ref_scale_reader(void *arg)
u64 start;
s64 duration;
VERBOSE_SCALEOUT("ref_scale_reader %ld: task started", me);
VERBOSE_SCALEOUT_BATCH("ref_scale_reader %ld: task started", me);
set_cpus_allowed_ptr(current, cpumask_of(me % nr_cpu_ids));
set_user_nice(current, MAX_NICE);
atomic_inc(&n_init);
if (holdoff)
schedule_timeout_interruptible(holdoff * HZ);
repeat:
VERBOSE_SCALEOUT("ref_scale_reader %ld: waiting to start next experiment on cpu %d", me, smp_processor_id());
VERBOSE_SCALEOUT_BATCH("ref_scale_reader %ld: waiting to start next experiment on cpu %d", me, smp_processor_id());
// Wait for signal that this reader can start.
wait_event(rt->wq, (atomic_read(&nreaders_exp) && smp_load_acquire(&rt->start_reader)) ||
@ -392,7 +405,7 @@ repeat:
while (atomic_read_acquire(&n_started))
cpu_relax();
VERBOSE_SCALEOUT("ref_scale_reader %ld: experiment %d started", me, exp_idx);
VERBOSE_SCALEOUT_BATCH("ref_scale_reader %ld: experiment %d started", me, exp_idx);
// To reduce noise, do an initial cache-warming invocation, check
@ -421,8 +434,8 @@ repeat:
if (atomic_dec_and_test(&nreaders_exp))
wake_up(&main_wq);
VERBOSE_SCALEOUT("ref_scale_reader %ld: experiment %d ended, (readers remaining=%d)",
me, exp_idx, atomic_read(&nreaders_exp));
VERBOSE_SCALEOUT_BATCH("ref_scale_reader %ld: experiment %d ended, (readers remaining=%d)",
me, exp_idx, atomic_read(&nreaders_exp));
if (!torture_must_stop())
goto repeat;

View File

@ -34,6 +34,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp)
ssp->srcu_gp_running = false;
ssp->srcu_gp_waiting = false;
ssp->srcu_idx = 0;
ssp->srcu_idx_max = 0;
INIT_WORK(&ssp->srcu_work, srcu_drive_gp);
INIT_LIST_HEAD(&ssp->srcu_work.entry);
return 0;
@ -84,6 +85,8 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
WARN_ON(ssp->srcu_gp_waiting);
WARN_ON(ssp->srcu_cb_head);
WARN_ON(&ssp->srcu_cb_head != ssp->srcu_cb_tail);
WARN_ON(ssp->srcu_idx != ssp->srcu_idx_max);
WARN_ON(ssp->srcu_idx & 0x1);
}
EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
@ -114,7 +117,7 @@ void srcu_drive_gp(struct work_struct *wp)
struct srcu_struct *ssp;
ssp = container_of(wp, struct srcu_struct, srcu_work);
if (ssp->srcu_gp_running || !READ_ONCE(ssp->srcu_cb_head))
if (ssp->srcu_gp_running || USHORT_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
return; /* Already running or nothing to do. */
/* Remove recently arrived callbacks and wait for readers. */
@ -124,11 +127,12 @@ void srcu_drive_gp(struct work_struct *wp)
ssp->srcu_cb_head = NULL;
ssp->srcu_cb_tail = &ssp->srcu_cb_head;
local_irq_enable();
idx = ssp->srcu_idx;
WRITE_ONCE(ssp->srcu_idx, !ssp->srcu_idx);
idx = (ssp->srcu_idx & 0x2) / 2;
WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
WRITE_ONCE(ssp->srcu_gp_waiting, true); /* srcu_read_unlock() wakes! */
swait_event_exclusive(ssp->srcu_wq, !READ_ONCE(ssp->srcu_lock_nesting[idx]));
WRITE_ONCE(ssp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */
WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
/* Invoke the callbacks we removed above. */
while (lh) {
@ -146,11 +150,27 @@ void srcu_drive_gp(struct work_struct *wp)
* straighten that out.
*/
WRITE_ONCE(ssp->srcu_gp_running, false);
if (READ_ONCE(ssp->srcu_cb_head))
if (USHORT_CMP_LT(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
schedule_work(&ssp->srcu_work);
}
EXPORT_SYMBOL_GPL(srcu_drive_gp);
static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
{
unsigned short cookie;
cookie = get_state_synchronize_srcu(ssp);
if (USHORT_CMP_GE(READ_ONCE(ssp->srcu_idx_max), cookie))
return;
WRITE_ONCE(ssp->srcu_idx_max, cookie);
if (!READ_ONCE(ssp->srcu_gp_running)) {
if (likely(srcu_init_done))
schedule_work(&ssp->srcu_work);
else if (list_empty(&ssp->srcu_work.entry))
list_add(&ssp->srcu_work.entry, &srcu_boot_list);
}
}
/*
* Enqueue an SRCU callback on the specified srcu_struct structure,
* initiating grace-period processing if it is not already running.
@ -166,12 +186,7 @@ void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
*ssp->srcu_cb_tail = rhp;
ssp->srcu_cb_tail = &rhp->next;
local_irq_restore(flags);
if (!READ_ONCE(ssp->srcu_gp_running)) {
if (likely(srcu_init_done))
schedule_work(&ssp->srcu_work);
else if (list_empty(&ssp->srcu_work.entry))
list_add(&ssp->srcu_work.entry, &srcu_boot_list);
}
srcu_gp_start_if_needed(ssp);
}
EXPORT_SYMBOL_GPL(call_srcu);
@ -190,6 +205,48 @@ void synchronize_srcu(struct srcu_struct *ssp)
}
EXPORT_SYMBOL_GPL(synchronize_srcu);
/*
* get_state_synchronize_srcu - Provide an end-of-grace-period cookie
*/
unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp)
{
unsigned long ret;
barrier();
ret = (READ_ONCE(ssp->srcu_idx) + 3) & ~0x1;
barrier();
return ret & USHRT_MAX;
}
EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
/*
* start_poll_synchronize_srcu - Provide cookie and start grace period
*
* The difference between this and get_state_synchronize_srcu() is that
* this function ensures that the poll_state_synchronize_srcu() will
* eventually return the value true.
*/
unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)
{
unsigned long ret = get_state_synchronize_srcu(ssp);
srcu_gp_start_if_needed(ssp);
return ret;
}
EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
/*
* poll_state_synchronize_srcu - Has cookie's grace period ended?
*/
bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
{
bool ret = USHORT_CMP_GE(READ_ONCE(ssp->srcu_idx), cookie);
barrier();
return ret;
}
EXPORT_SYMBOL_GPL(poll_state_synchronize_srcu);
/* Lockdep diagnostics. */
void __init rcu_scheduler_starting(void)
{

View File

@ -807,6 +807,46 @@ static void srcu_leak_callback(struct rcu_head *rhp)
{
}
/*
* Start an SRCU grace period, and also queue the callback if non-NULL.
*/
static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
struct rcu_head *rhp, bool do_norm)
{
unsigned long flags;
int idx;
bool needexp = false;
bool needgp = false;
unsigned long s;
struct srcu_data *sdp;
check_init_srcu_struct(ssp);
idx = srcu_read_lock(ssp);
sdp = raw_cpu_ptr(ssp->sda);
spin_lock_irqsave_rcu_node(sdp, flags);
if (rhp)
rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
rcu_segcblist_advance(&sdp->srcu_cblist,
rcu_seq_current(&ssp->srcu_gp_seq));
s = rcu_seq_snap(&ssp->srcu_gp_seq);
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
sdp->srcu_gp_seq_needed = s;
needgp = true;
}
if (!do_norm && ULONG_CMP_LT(sdp->srcu_gp_seq_needed_exp, s)) {
sdp->srcu_gp_seq_needed_exp = s;
needexp = true;
}
spin_unlock_irqrestore_rcu_node(sdp, flags);
if (needgp)
srcu_funnel_gp_start(ssp, sdp, s, do_norm);
else if (needexp)
srcu_funnel_exp_start(ssp, sdp->mynode, s);
srcu_read_unlock(ssp, idx);
return s;
}
/*
* Enqueue an SRCU callback on the srcu_data structure associated with
* the current CPU and the specified srcu_struct structure, initiating
@ -838,14 +878,6 @@ static void srcu_leak_callback(struct rcu_head *rhp)
static void __call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
rcu_callback_t func, bool do_norm)
{
unsigned long flags;
int idx;
bool needexp = false;
bool needgp = false;
unsigned long s;
struct srcu_data *sdp;
check_init_srcu_struct(ssp);
if (debug_rcu_head_queue(rhp)) {
/* Probable double call_srcu(), so leak the callback. */
WRITE_ONCE(rhp->func, srcu_leak_callback);
@ -853,28 +885,7 @@ static void __call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
return;
}
rhp->func = func;
idx = srcu_read_lock(ssp);
sdp = raw_cpu_ptr(ssp->sda);
spin_lock_irqsave_rcu_node(sdp, flags);
rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
rcu_segcblist_advance(&sdp->srcu_cblist,
rcu_seq_current(&ssp->srcu_gp_seq));
s = rcu_seq_snap(&ssp->srcu_gp_seq);
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
sdp->srcu_gp_seq_needed = s;
needgp = true;
}
if (!do_norm && ULONG_CMP_LT(sdp->srcu_gp_seq_needed_exp, s)) {
sdp->srcu_gp_seq_needed_exp = s;
needexp = true;
}
spin_unlock_irqrestore_rcu_node(sdp, flags);
if (needgp)
srcu_funnel_gp_start(ssp, sdp, s, do_norm);
else if (needexp)
srcu_funnel_exp_start(ssp, sdp->mynode, s);
srcu_read_unlock(ssp, idx);
(void)srcu_gp_start_if_needed(ssp, rhp, do_norm);
}
/**
@ -1003,6 +1014,77 @@ void synchronize_srcu(struct srcu_struct *ssp)
}
EXPORT_SYMBOL_GPL(synchronize_srcu);
/**
* get_state_synchronize_srcu - Provide an end-of-grace-period cookie
* @ssp: srcu_struct to provide cookie for.
*
* This function returns a cookie that can be passed to
* poll_state_synchronize_srcu(), which will return true if a full grace
* period has elapsed in the meantime. It is the caller's responsibility
* to make sure that grace period happens, for example, by invoking
* call_srcu() after return from get_state_synchronize_srcu().
*/
unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp)
{
// Any prior manipulation of SRCU-protected data must happen
// before the load from ->srcu_gp_seq.
smp_mb();
return rcu_seq_snap(&ssp->srcu_gp_seq);
}
EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
/**
* start_poll_synchronize_srcu - Provide cookie and start grace period
* @ssp: srcu_struct to provide cookie for.
*
* This function returns a cookie that can be passed to
* poll_state_synchronize_srcu(), which will return true if a full grace
* period has elapsed in the meantime. Unlike get_state_synchronize_srcu(),
* this function also ensures that any needed SRCU grace period will be
* started. This convenience does come at a cost in terms of CPU overhead.
*/
unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)
{
return srcu_gp_start_if_needed(ssp, NULL, true);
}
EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
/**
* poll_state_synchronize_srcu - Has cookie's grace period ended?
* @ssp: srcu_struct to provide cookie for.
* @cookie: Return value from get_state_synchronize_srcu() or start_poll_synchronize_srcu().
*
* This function takes the cookie that was returned from either
* get_state_synchronize_srcu() or start_poll_synchronize_srcu(), and
* returns @true if an SRCU grace period elapsed since the time that the
* cookie was created.
*
* Because cookies are finite in size, wrapping/overflow is possible.
* This is more pronounced on 32-bit systems where cookies are 32 bits,
* where in theory wrapping could happen in about 14 hours assuming
* 25-microsecond expedited SRCU grace periods. However, a more likely
* overflow lower bound is on the order of 24 days in the case of
* one-millisecond SRCU grace periods. Of course, wrapping in a 64-bit
* system requires geologic timespans, as in more than seven million years
* even for expedited SRCU grace periods.
*
* Wrapping/overflow is much more of an issue for CONFIG_SMP=n systems
* that also have CONFIG_PREEMPTION=n, which selects Tiny SRCU. This uses
* a 16-bit cookie, which rcutorture routinely wraps in a matter of a
* few minutes. If this proves to be a problem, this counter will be
* expanded to the same size as for Tree SRCU.
*/
bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
{
if (!rcu_seq_done(&ssp->srcu_gp_seq, cookie))
return false;
// Ensure that the end of the SRCU grace period happens before
// any subsequent code that the caller might execute.
smp_mb(); // ^^^
return true;
}
EXPORT_SYMBOL_GPL(poll_state_synchronize_srcu);
/*
* Callback function for srcu_barrier() use.
*/
@ -1160,6 +1242,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
*/
static void srcu_invoke_callbacks(struct work_struct *work)
{
long len;
bool more;
struct rcu_cblist ready_cbs;
struct rcu_head *rhp;
@ -1182,6 +1265,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
/* We are on the job! Extract and invoke ready callbacks. */
sdp->srcu_cblist_invoking = true;
rcu_segcblist_extract_done_cbs(&sdp->srcu_cblist, &ready_cbs);
len = ready_cbs.len;
spin_unlock_irq_rcu_node(sdp);
rhp = rcu_cblist_dequeue(&ready_cbs);
for (; rhp != NULL; rhp = rcu_cblist_dequeue(&ready_cbs)) {
@ -1190,13 +1274,14 @@ static void srcu_invoke_callbacks(struct work_struct *work)
rhp->func(rhp);
local_bh_enable();
}
WARN_ON_ONCE(ready_cbs.len);
/*
* Update counts, accelerate new callbacks, and if needed,
* schedule another round of callback invocation.
*/
spin_lock_irq_rcu_node(sdp);
rcu_segcblist_insert_count(&sdp->srcu_cblist, &ready_cbs);
rcu_segcblist_add_len(&sdp->srcu_cblist, -len);
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
rcu_seq_snap(&ssp->srcu_gp_seq));
sdp->srcu_cblist_invoking = false;

View File

@ -1224,6 +1224,82 @@ void show_rcu_tasks_gp_kthreads(void)
}
#endif /* #ifndef CONFIG_TINY_RCU */
#ifdef CONFIG_PROVE_RCU
struct rcu_tasks_test_desc {
struct rcu_head rh;
const char *name;
bool notrun;
};
static struct rcu_tasks_test_desc tests[] = {
{
.name = "call_rcu_tasks()",
/* If not defined, the test is skipped. */
.notrun = !IS_ENABLED(CONFIG_TASKS_RCU),
},
{
.name = "call_rcu_tasks_rude()",
/* If not defined, the test is skipped. */
.notrun = !IS_ENABLED(CONFIG_TASKS_RUDE_RCU),
},
{
.name = "call_rcu_tasks_trace()",
/* If not defined, the test is skipped. */
.notrun = !IS_ENABLED(CONFIG_TASKS_TRACE_RCU)
}
};
static void test_rcu_tasks_callback(struct rcu_head *rhp)
{
struct rcu_tasks_test_desc *rttd =
container_of(rhp, struct rcu_tasks_test_desc, rh);
pr_info("Callback from %s invoked.\n", rttd->name);
rttd->notrun = true;
}
static void rcu_tasks_initiate_self_tests(void)
{
pr_info("Running RCU-tasks wait API self tests\n");
#ifdef CONFIG_TASKS_RCU
synchronize_rcu_tasks();
call_rcu_tasks(&tests[0].rh, test_rcu_tasks_callback);
#endif
#ifdef CONFIG_TASKS_RUDE_RCU
synchronize_rcu_tasks_rude();
call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback);
#endif
#ifdef CONFIG_TASKS_TRACE_RCU
synchronize_rcu_tasks_trace();
call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback);
#endif
}
static int rcu_tasks_verify_self_tests(void)
{
int ret = 0;
int i;
for (i = 0; i < ARRAY_SIZE(tests); i++) {
if (!tests[i].notrun) { // still hanging.
pr_err("%s has been failed.\n", tests[i].name);
ret = -1;
}
}
if (ret)
WARN_ON(1);
return ret;
}
late_initcall(rcu_tasks_verify_self_tests);
#else /* #ifdef CONFIG_PROVE_RCU */
static void rcu_tasks_initiate_self_tests(void) { }
#endif /* #else #ifdef CONFIG_PROVE_RCU */
void __init rcu_init_tasks_generic(void)
{
#ifdef CONFIG_TASKS_RCU
@ -1237,6 +1313,9 @@ void __init rcu_init_tasks_generic(void)
#ifdef CONFIG_TASKS_TRACE_RCU
rcu_spawn_tasks_trace_kthread();
#endif
// Run the self-tests.
rcu_tasks_initiate_self_tests();
}
#else /* #ifdef CONFIG_TASKS_RCU_GENERIC */

View File

@ -83,6 +83,9 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
.dynticks_nesting = 1,
.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
.dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR),
#ifdef CONFIG_RCU_NOCB_CPU
.cblist.flags = SEGCBLIST_SOFTIRQ_ONLY,
#endif
};
static struct rcu_state rcu_state = {
.level = { &rcu_state.node[0] },
@ -100,8 +103,10 @@ static struct rcu_state rcu_state = {
static bool dump_tree;
module_param(dump_tree, bool, 0444);
/* By default, use RCU_SOFTIRQ instead of rcuc kthreads. */
static bool use_softirq = true;
static bool use_softirq = !IS_ENABLED(CONFIG_PREEMPT_RT);
#ifndef CONFIG_PREEMPT_RT
module_param(use_softirq, bool, 0444);
#endif
/* Control rcu_node-tree auto-balancing at boot time. */
static bool rcu_fanout_exact;
module_param(rcu_fanout_exact, bool, 0444);
@ -1495,6 +1500,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
if (!rcu_segcblist_pend_cbs(&rdp->cblist))
return false;
trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCbPreAcc"));
/*
* Callbacks are often registered with incomplete grace-period
* information. Something about the fact that getting exact
@ -1515,6 +1522,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
else
trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccReadyCB"));
trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCbPostAcc"));
return ret;
}
@ -1765,7 +1774,7 @@ static bool rcu_gp_init(void)
* go offline later. Please also refer to "Hotplug CPU" section
* of RCU's Requirements documentation.
*/
rcu_state.gp_state = RCU_GP_ONOFF;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_ONOFF);
rcu_for_each_leaf_node(rnp) {
smp_mb(); // Pair with barriers used when updating ->ofl_seq to odd values.
firstseq = READ_ONCE(rnp->ofl_seq);
@ -1831,7 +1840,7 @@ static bool rcu_gp_init(void)
* The grace period cannot complete until the initialization
* process finishes, because this kthread handles both.
*/
rcu_state.gp_state = RCU_GP_INIT;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_INIT);
rcu_for_each_node_breadth_first(rnp) {
rcu_gp_slow(gp_init_delay);
raw_spin_lock_irqsave_rcu_node(rnp, flags);
@ -1930,17 +1939,22 @@ static void rcu_gp_fqs_loop(void)
ret = 0;
for (;;) {
if (!ret) {
rcu_state.jiffies_force_qs = jiffies + j;
WRITE_ONCE(rcu_state.jiffies_force_qs, jiffies + j);
/*
* jiffies_force_qs before RCU_GP_WAIT_FQS state
* update; required for stall checks.
*/
smp_wmb();
WRITE_ONCE(rcu_state.jiffies_kick_kthreads,
jiffies + (j ? 3 * j : 2));
}
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq,
TPS("fqswait"));
rcu_state.gp_state = RCU_GP_WAIT_FQS;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_WAIT_FQS);
ret = swait_event_idle_timeout_exclusive(
rcu_state.gp_wq, rcu_gp_fqs_check_wake(&gf), j);
rcu_gp_torture_wait();
rcu_state.gp_state = RCU_GP_DOING_FQS;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_DOING_FQS);
/* Locking provides needed memory barriers. */
/* If grace period done, leave loop. */
if (!READ_ONCE(rnp->qsmask) &&
@ -2054,7 +2068,7 @@ static void rcu_gp_cleanup(void)
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("end"));
rcu_seq_end(&rcu_state.gp_seq);
ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq);
rcu_state.gp_state = RCU_GP_IDLE;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_IDLE);
/* Check for GP requests since above loop. */
rdp = this_cpu_ptr(&rcu_data);
if (!needgp && ULONG_CMP_LT(rnp->gp_seq, rnp->gp_seq_needed)) {
@ -2093,12 +2107,12 @@ static int __noreturn rcu_gp_kthread(void *unused)
for (;;) {
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq,
TPS("reqwait"));
rcu_state.gp_state = RCU_GP_WAIT_GPS;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_WAIT_GPS);
swait_event_idle_exclusive(rcu_state.gp_wq,
READ_ONCE(rcu_state.gp_flags) &
RCU_GP_FLAG_INIT);
rcu_gp_torture_wait();
rcu_state.gp_state = RCU_GP_DONE_GPS;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_DONE_GPS);
/* Locking provides needed memory barrier. */
if (rcu_gp_init())
break;
@ -2113,9 +2127,9 @@ static int __noreturn rcu_gp_kthread(void *unused)
rcu_gp_fqs_loop();
/* Handle grace-period end. */
rcu_state.gp_state = RCU_GP_CLEANUP;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANUP);
rcu_gp_cleanup();
rcu_state.gp_state = RCU_GP_CLEANED;
WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANED);
}
}
@ -2430,11 +2444,12 @@ int rcutree_dead_cpu(unsigned int cpu)
static void rcu_do_batch(struct rcu_data *rdp)
{
int div;
bool __maybe_unused empty;
unsigned long flags;
const bool offloaded = rcu_segcblist_is_offloaded(&rdp->cblist);
struct rcu_head *rhp;
struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
long bl, count;
long bl, count = 0;
long pending, tlimit = 0;
/* If no callbacks are ready, just return. */
@ -2471,14 +2486,18 @@ static void rcu_do_batch(struct rcu_data *rdp)
rcu_segcblist_extract_done_cbs(&rdp->cblist, &rcl);
if (offloaded)
rdp->qlen_last_fqs_check = rcu_segcblist_n_cbs(&rdp->cblist);
trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCbDequeued"));
rcu_nocb_unlock_irqrestore(rdp, flags);
/* Invoke callbacks. */
tick_dep_set_task(current, TICK_DEP_BIT_RCU);
rhp = rcu_cblist_dequeue(&rcl);
for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
rcu_callback_t f;
count++;
debug_rcu_head_unqueue(rhp);
rcu_lock_acquire(&rcu_callback_map);
@ -2492,21 +2511,19 @@ static void rcu_do_batch(struct rcu_data *rdp)
/*
* Stop only if limit reached and CPU has something to do.
* Note: The rcl structure counts down from zero.
*/
if (-rcl.len >= bl && !offloaded &&
if (count >= bl && !offloaded &&
(need_resched() ||
(!is_idle_task(current) && !rcu_is_callbacks_kthread())))
break;
if (unlikely(tlimit)) {
/* only call local_clock() every 32 callbacks */
if (likely((-rcl.len & 31) || local_clock() < tlimit))
if (likely((count & 31) || local_clock() < tlimit))
continue;
/* Exceeded the time limit, so leave. */
break;
}
if (offloaded) {
WARN_ON_ONCE(in_serving_softirq());
if (!in_serving_softirq()) {
local_bh_enable();
lockdep_assert_irqs_enabled();
cond_resched_tasks_rcu_qs();
@ -2517,15 +2534,13 @@ static void rcu_do_batch(struct rcu_data *rdp)
local_irq_save(flags);
rcu_nocb_lock(rdp);
count = -rcl.len;
rdp->n_cbs_invoked += count;
trace_rcu_batch_end(rcu_state.name, count, !!rcl.head, need_resched(),
is_idle_task(current), rcu_is_callbacks_kthread());
/* Update counts and requeue any remaining callbacks. */
rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
smp_mb(); /* List handling before counting for rcu_barrier(). */
rcu_segcblist_insert_count(&rdp->cblist, &rcl);
rcu_segcblist_add_len(&rdp->cblist, -count);
/* Reinstate batch limit if we have worked down the excess. */
count = rcu_segcblist_n_cbs(&rdp->cblist);
@ -2543,9 +2558,12 @@ static void rcu_do_batch(struct rcu_data *rdp)
* The following usually indicates a double call_rcu(). To track
* this down, try building with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y.
*/
WARN_ON_ONCE(count == 0 && !rcu_segcblist_empty(&rdp->cblist));
empty = rcu_segcblist_empty(&rdp->cblist);
WARN_ON_ONCE(count == 0 && !empty);
WARN_ON_ONCE(!IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
count != 0 && rcu_segcblist_empty(&rdp->cblist));
count != 0 && empty);
WARN_ON_ONCE(count == 0 && rcu_segcblist_n_segment_cbs(&rdp->cblist) != 0);
WARN_ON_ONCE(!empty && rcu_segcblist_n_segment_cbs(&rdp->cblist) == 0);
rcu_nocb_unlock_irqrestore(rdp, flags);
@ -2566,6 +2584,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
void rcu_sched_clock_irq(int user)
{
trace_rcu_utilization(TPS("Start scheduler-tick"));
lockdep_assert_irqs_disabled();
raw_cpu_inc(rcu_data.ticks_this_gp);
/* The load-acquire pairs with the store-release setting to true. */
if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) {
@ -2579,6 +2598,7 @@ void rcu_sched_clock_irq(int user)
rcu_flavor_sched_clock_irq(user);
if (rcu_pending(user))
invoke_rcu_core();
lockdep_assert_irqs_disabled();
trace_rcu_utilization(TPS("End scheduler-tick"));
}
@ -2688,7 +2708,7 @@ static __latent_entropy void rcu_core(void)
unsigned long flags;
struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rdp->mynode;
const bool offloaded = rcu_segcblist_is_offloaded(&rdp->cblist);
const bool do_batch = !rcu_segcblist_completely_offloaded(&rdp->cblist);
if (cpu_is_offline(smp_processor_id()))
return;
@ -2708,17 +2728,17 @@ static __latent_entropy void rcu_core(void)
/* No grace period and unregistered callbacks? */
if (!rcu_gp_in_progress() &&
rcu_segcblist_is_enabled(&rdp->cblist) && !offloaded) {
local_irq_save(flags);
rcu_segcblist_is_enabled(&rdp->cblist) && do_batch) {
rcu_nocb_lock_irqsave(rdp, flags);
if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
rcu_accelerate_cbs_unlocked(rnp, rdp);
local_irq_restore(flags);
rcu_nocb_unlock_irqrestore(rdp, flags);
}
rcu_check_gp_start_stall(rnp, rdp, rcu_jiffies_till_stall_check());
/* If there are callbacks ready, invoke them. */
if (!offloaded && rcu_segcblist_ready_cbs(&rdp->cblist) &&
if (do_batch && rcu_segcblist_ready_cbs(&rdp->cblist) &&
likely(READ_ONCE(rcu_scheduler_fully_active)))
rcu_do_batch(rdp);
@ -2941,6 +2961,7 @@ static void check_cb_ovld(struct rcu_data *rdp)
static void
__call_rcu(struct rcu_head *head, rcu_callback_t func)
{
static atomic_t doublefrees;
unsigned long flags;
struct rcu_data *rdp;
bool was_alldone;
@ -2954,8 +2975,10 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
* Use rcu:rcu_callback trace event to find the previous
* time callback was passed to __call_rcu().
*/
WARN_ONCE(1, "__call_rcu(): Double-freed CB %p->%pS()!!!\n",
head, head->func);
if (atomic_inc_return(&doublefrees) < 4) {
pr_err("%s(): Double-freed CB %p->%pS()!!! ", __func__, head, head->func);
mem_dump_obj(head);
}
WRITE_ONCE(head->func, rcu_leak_callback);
return;
}
@ -2989,6 +3012,8 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
trace_rcu_callback(rcu_state.name, head,
rcu_segcblist_n_cbs(&rdp->cblist));
trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
/* Go handle any RCU core processing required. */
if (unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
__call_rcu_nocb_wake(rdp, was_alldone, flags); /* unlocks */
@ -3498,6 +3523,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
goto unlock_return;
}
kasan_record_aux_stack(ptr);
success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
if (!success) {
run_page_cache_worker(krcp);
@ -3747,6 +3773,8 @@ static int rcu_pending(int user)
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rdp->mynode;
lockdep_assert_irqs_disabled();
/* Check for CPU stalls, if enabled. */
check_cpu_stall(rdp);
@ -4001,12 +4029,18 @@ int rcutree_prepare_cpu(unsigned int cpu)
rdp->qlen_last_fqs_check = 0;
rdp->n_force_qs_snap = rcu_state.n_force_qs;
rdp->blimit = blimit;
if (rcu_segcblist_empty(&rdp->cblist) && /* No early-boot CBs? */
!rcu_segcblist_is_offloaded(&rdp->cblist))
rcu_segcblist_init(&rdp->cblist); /* Re-enable callbacks. */
rdp->dynticks_nesting = 1; /* CPU not up, no tearing. */
rcu_dynticks_eqs_online();
raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
/*
* Lock in case the CB/GP kthreads are still around handling
* old callbacks (longer term we should flush all callbacks
* before completing CPU offline)
*/
rcu_nocb_lock(rdp);
if (rcu_segcblist_empty(&rdp->cblist)) /* No early-boot CBs? */
rcu_segcblist_init(&rdp->cblist); /* Re-enable callbacks. */
rcu_nocb_unlock(rdp);
/*
* Add CPU to leaf rcu_node pending-online bitmask. Any needed
@ -4159,6 +4193,9 @@ void rcu_report_dead(unsigned int cpu)
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */
// Do any dangling deferred wakeups.
do_nocb_deferred_wakeup(rdp);
/* QS for any half-done expedited grace period. */
preempt_disable();
rcu_report_exp_rdp(this_cpu_ptr(&rcu_data));

View File

@ -201,6 +201,7 @@ struct rcu_data {
/* 5) Callback offloading. */
#ifdef CONFIG_RCU_NOCB_CPU
struct swait_queue_head nocb_cb_wq; /* For nocb kthreads to sleep on. */
struct swait_queue_head nocb_state_wq; /* For offloading state changes */
struct task_struct *nocb_gp_kthread;
raw_spinlock_t nocb_lock; /* Guard following pair of fields. */
atomic_t nocb_lock_contended; /* Contention experienced. */
@ -256,6 +257,7 @@ struct rcu_data {
};
/* Values for nocb_defer_wakeup field in struct rcu_data. */
#define RCU_NOCB_WAKE_OFF -1
#define RCU_NOCB_WAKE_NOT 0
#define RCU_NOCB_WAKE 1
#define RCU_NOCB_WAKE_FORCE 2

View File

@ -545,7 +545,7 @@ static void synchronize_rcu_expedited_wait(void)
data_race(rnp_root->expmask),
".T"[!!data_race(rnp_root->exp_tasks)]);
if (ndetected) {
pr_err("blocking rcu_node structures:");
pr_err("blocking rcu_node structures (internal RCU debug):");
rcu_for_each_node_breadth_first(rnp) {
if (rnp == rnp_root)
continue; /* printed unconditionally */

View File

@ -682,6 +682,7 @@ static void rcu_flavor_sched_clock_irq(int user)
{
struct task_struct *t = current;
lockdep_assert_irqs_disabled();
if (user || rcu_is_cpu_rrupt_from_idle()) {
rcu_note_voluntary_context_switch(current);
}
@ -1665,6 +1666,8 @@ static void wake_nocb_gp(struct rcu_data *rdp, bool force,
static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
const char *reason)
{
if (rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_OFF)
return;
if (rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT)
mod_timer(&rdp->nocb_timer, jiffies + 1);
if (rdp->nocb_defer_wakeup < waketype)
@ -1928,6 +1931,52 @@ static void do_nocb_bypass_wakeup_timer(struct timer_list *t)
__call_rcu_nocb_wake(rdp, true, flags);
}
/*
* Check if we ignore this rdp.
*
* We check that without holding the nocb lock but
* we make sure not to miss a freshly offloaded rdp
* with the current ordering:
*
* rdp_offload_toggle() nocb_gp_enabled_cb()
* ------------------------- ----------------------------
* WRITE flags LOCK nocb_gp_lock
* LOCK nocb_gp_lock READ/WRITE nocb_gp_sleep
* READ/WRITE nocb_gp_sleep UNLOCK nocb_gp_lock
* UNLOCK nocb_gp_lock READ flags
*/
static inline bool nocb_gp_enabled_cb(struct rcu_data *rdp)
{
u8 flags = SEGCBLIST_OFFLOADED | SEGCBLIST_KTHREAD_GP;
return rcu_segcblist_test_flags(&rdp->cblist, flags);
}
static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_state)
{
struct rcu_segcblist *cblist = &rdp->cblist;
if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) {
rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_GP);
if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
*needwake_state = true;
}
return true;
}
/*
* De-offloading. Clear our flag and notify the de-offload worker.
* We will ignore this rdp until it ever gets re-offloaded.
*/
WARN_ON_ONCE(!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP));
rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP);
if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
*needwake_state = true;
return false;
}
/*
* No-CBs GP kthreads come here to wait for additional callbacks to show up
* or for grace periods to end.
@ -1956,8 +2005,18 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
*/
WARN_ON_ONCE(my_rdp->nocb_gp_rdp != my_rdp);
for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_cb_rdp) {
bool needwake_state = false;
if (!nocb_gp_enabled_cb(rdp))
continue;
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
rcu_nocb_lock_irqsave(rdp, flags);
if (!nocb_gp_update_state(rdp, &needwake_state)) {
rcu_nocb_unlock_irqrestore(rdp, flags);
if (needwake_state)
swake_up_one(&rdp->nocb_state_wq);
continue;
}
bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
if (bypass_ncbs &&
(time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
@ -1967,6 +2026,8 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
} else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) {
rcu_nocb_unlock_irqrestore(rdp, flags);
if (needwake_state)
swake_up_one(&rdp->nocb_state_wq);
continue; /* No callbacks here, try next. */
}
if (bypass_ncbs) {
@ -2018,6 +2079,8 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
}
if (needwake_gp)
rcu_gp_kthread_wake();
if (needwake_state)
swake_up_one(&rdp->nocb_state_wq);
}
my_rdp->nocb_gp_bypass = bypass;
@ -2081,14 +2144,27 @@ static int rcu_nocb_gp_kthread(void *arg)
return 0;
}
static inline bool nocb_cb_can_run(struct rcu_data *rdp)
{
u8 flags = SEGCBLIST_OFFLOADED | SEGCBLIST_KTHREAD_CB;
return rcu_segcblist_test_flags(&rdp->cblist, flags);
}
static inline bool nocb_cb_wait_cond(struct rcu_data *rdp)
{
return nocb_cb_can_run(rdp) && !READ_ONCE(rdp->nocb_cb_sleep);
}
/*
* Invoke any ready callbacks from the corresponding no-CBs CPU,
* then, if there are no more, wait for more to appear.
*/
static void nocb_cb_wait(struct rcu_data *rdp)
{
struct rcu_segcblist *cblist = &rdp->cblist;
unsigned long cur_gp_seq;
unsigned long flags;
bool needwake_state = false;
bool needwake_gp = false;
struct rcu_node *rnp = rdp->mynode;
@ -2100,32 +2176,55 @@ static void nocb_cb_wait(struct rcu_data *rdp)
local_bh_enable();
lockdep_assert_irqs_enabled();
rcu_nocb_lock_irqsave(rdp, flags);
if (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
if (rcu_segcblist_nextgp(cblist, &cur_gp_seq) &&
rcu_seq_done(&rnp->gp_seq, cur_gp_seq) &&
raw_spin_trylock_rcu_node(rnp)) { /* irqs already disabled. */
needwake_gp = rcu_advance_cbs(rdp->mynode, rdp);
raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
}
if (rcu_segcblist_ready_cbs(&rdp->cblist)) {
rcu_nocb_unlock_irqrestore(rdp, flags);
if (needwake_gp)
rcu_gp_kthread_wake();
return;
WRITE_ONCE(rdp->nocb_cb_sleep, true);
if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB)) {
rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_CB);
if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP))
needwake_state = true;
}
if (rcu_segcblist_ready_cbs(cblist))
WRITE_ONCE(rdp->nocb_cb_sleep, false);
} else {
/*
* De-offloading. Clear our flag and notify the de-offload worker.
* We won't touch the callbacks and keep sleeping until we ever
* get re-offloaded.
*/
WARN_ON_ONCE(!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB));
rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_CB);
if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP))
needwake_state = true;
}
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("CBSleep"));
WRITE_ONCE(rdp->nocb_cb_sleep, true);
if (rdp->nocb_cb_sleep)
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("CBSleep"));
rcu_nocb_unlock_irqrestore(rdp, flags);
if (needwake_gp)
rcu_gp_kthread_wake();
swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
!READ_ONCE(rdp->nocb_cb_sleep));
if (!smp_load_acquire(&rdp->nocb_cb_sleep)) { /* VVV */
/* ^^^ Ensure CB invocation follows _sleep test. */
return;
}
WARN_ON(signal_pending(current));
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
if (needwake_state)
swake_up_one(&rdp->nocb_state_wq);
do {
swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
nocb_cb_wait_cond(rdp));
// VVV Ensure CB invocation follows _sleep test.
if (smp_load_acquire(&rdp->nocb_cb_sleep)) { // ^^^
WARN_ON(signal_pending(current));
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
}
} while (!nocb_cb_can_run(rdp));
}
/*
@ -2148,7 +2247,7 @@ static int rcu_nocb_cb_kthread(void *arg)
/* Is a deferred wakeup of rcu_nocb_kthread() required? */
static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp)
{
return READ_ONCE(rdp->nocb_defer_wakeup);
return READ_ONCE(rdp->nocb_defer_wakeup) > RCU_NOCB_WAKE_NOT;
}
/* Do a deferred wakeup of rcu_nocb_kthread(). */
@ -2187,6 +2286,195 @@ static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
do_nocb_deferred_wakeup_common(rdp);
}
static int rdp_offload_toggle(struct rcu_data *rdp,
bool offload, unsigned long flags)
__releases(rdp->nocb_lock)
{
struct rcu_segcblist *cblist = &rdp->cblist;
struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
bool wake_gp = false;
rcu_segcblist_offload(cblist, offload);
if (rdp->nocb_cb_sleep)
rdp->nocb_cb_sleep = false;
rcu_nocb_unlock_irqrestore(rdp, flags);
/*
* Ignore former value of nocb_cb_sleep and force wake up as it could
* have been spuriously set to false already.
*/
swake_up_one(&rdp->nocb_cb_wq);
raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
if (rdp_gp->nocb_gp_sleep) {
rdp_gp->nocb_gp_sleep = false;
wake_gp = true;
}
raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
if (wake_gp)
wake_up_process(rdp_gp->nocb_gp_kthread);
return 0;
}
static int __rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
{
struct rcu_segcblist *cblist = &rdp->cblist;
unsigned long flags;
int ret;
pr_info("De-offloading %d\n", rdp->cpu);
rcu_nocb_lock_irqsave(rdp, flags);
/*
* If there are still pending work offloaded, the offline
* CPU won't help much handling them.
*/
if (cpu_is_offline(rdp->cpu) && !rcu_segcblist_empty(&rdp->cblist)) {
rcu_nocb_unlock_irqrestore(rdp, flags);
return -EBUSY;
}
ret = rdp_offload_toggle(rdp, false, flags);
swait_event_exclusive(rdp->nocb_state_wq,
!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
SEGCBLIST_KTHREAD_GP));
rcu_nocb_lock_irqsave(rdp, flags);
/* Make sure nocb timer won't stay around */
WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_OFF);
rcu_nocb_unlock_irqrestore(rdp, flags);
del_timer_sync(&rdp->nocb_timer);
/*
* Flush bypass. While IRQs are disabled and once we set
* SEGCBLIST_SOFTIRQ_ONLY, no callback is supposed to be
* enqueued on bypass.
*/
rcu_nocb_lock_irqsave(rdp, flags);
rcu_nocb_flush_bypass(rdp, NULL, jiffies);
rcu_segcblist_set_flags(cblist, SEGCBLIST_SOFTIRQ_ONLY);
/*
* With SEGCBLIST_SOFTIRQ_ONLY, we can't use
* rcu_nocb_unlock_irqrestore() anymore. Theoretically we
* could set SEGCBLIST_SOFTIRQ_ONLY with cb unlocked and IRQs
* disabled now, but let's be paranoid.
*/
raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
return ret;
}
static long rcu_nocb_rdp_deoffload(void *arg)
{
struct rcu_data *rdp = arg;
WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
return __rcu_nocb_rdp_deoffload(rdp);
}
int rcu_nocb_cpu_deoffload(int cpu)
{
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
int ret = 0;
if (rdp == rdp->nocb_gp_rdp) {
pr_info("Can't deoffload an rdp GP leader (yet)\n");
return -EINVAL;
}
mutex_lock(&rcu_state.barrier_mutex);
cpus_read_lock();
if (rcu_segcblist_is_offloaded(&rdp->cblist)) {
if (cpu_online(cpu))
ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
else
ret = __rcu_nocb_rdp_deoffload(rdp);
if (!ret)
cpumask_clear_cpu(cpu, rcu_nocb_mask);
}
cpus_read_unlock();
mutex_unlock(&rcu_state.barrier_mutex);
return ret;
}
EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);
static int __rcu_nocb_rdp_offload(struct rcu_data *rdp)
{
struct rcu_segcblist *cblist = &rdp->cblist;
unsigned long flags;
int ret;
/*
* For now we only support re-offload, ie: the rdp must have been
* offloaded on boot first.
*/
if (!rdp->nocb_gp_rdp)
return -EINVAL;
pr_info("Offloading %d\n", rdp->cpu);
/*
* Can't use rcu_nocb_lock_irqsave() while we are in
* SEGCBLIST_SOFTIRQ_ONLY mode.
*/
raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
/* Re-enable nocb timer */
WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
/*
* We didn't take the nocb lock while working on the
* rdp->cblist in SEGCBLIST_SOFTIRQ_ONLY mode.
* Every modifications that have been done previously on
* rdp->cblist must be visible remotely by the nocb kthreads
* upon wake up after reading the cblist flags.
*
* The layout against nocb_lock enforces that ordering:
*
* __rcu_nocb_rdp_offload() nocb_cb_wait()/nocb_gp_wait()
* ------------------------- ----------------------------
* WRITE callbacks rcu_nocb_lock()
* rcu_nocb_lock() READ flags
* WRITE flags READ callbacks
* rcu_nocb_unlock() rcu_nocb_unlock()
*/
ret = rdp_offload_toggle(rdp, true, flags);
swait_event_exclusive(rdp->nocb_state_wq,
rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB) &&
rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP));
return ret;
}
static long rcu_nocb_rdp_offload(void *arg)
{
struct rcu_data *rdp = arg;
WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
return __rcu_nocb_rdp_offload(rdp);
}
int rcu_nocb_cpu_offload(int cpu)
{
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
int ret = 0;
mutex_lock(&rcu_state.barrier_mutex);
cpus_read_lock();
if (!rcu_segcblist_is_offloaded(&rdp->cblist)) {
if (cpu_online(cpu))
ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
else
ret = __rcu_nocb_rdp_offload(rdp);
if (!ret)
cpumask_set_cpu(cpu, rcu_nocb_mask);
}
cpus_read_unlock();
mutex_unlock(&rcu_state.barrier_mutex);
return ret;
}
EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);
void __init rcu_init_nohz(void)
{
int cpu;
@ -2229,7 +2517,9 @@ void __init rcu_init_nohz(void)
rdp = per_cpu_ptr(&rcu_data, cpu);
if (rcu_segcblist_empty(&rdp->cblist))
rcu_segcblist_init(&rdp->cblist);
rcu_segcblist_offload(&rdp->cblist);
rcu_segcblist_offload(&rdp->cblist, true);
rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_CB);
rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP);
}
rcu_organize_nocb_kthreads();
}
@ -2239,6 +2529,7 @@ static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
{
init_swait_queue_head(&rdp->nocb_cb_wq);
init_swait_queue_head(&rdp->nocb_gp_wq);
init_swait_queue_head(&rdp->nocb_state_wq);
raw_spin_lock_init(&rdp->nocb_lock);
raw_spin_lock_init(&rdp->nocb_bypass_lock);
raw_spin_lock_init(&rdp->nocb_gp_lock);
@ -2381,6 +2672,19 @@ void rcu_bind_current_to_nocb(void)
}
EXPORT_SYMBOL_GPL(rcu_bind_current_to_nocb);
// The ->on_cpu field is available only in CONFIG_SMP=y, so...
#ifdef CONFIG_SMP
static char *show_rcu_should_be_on_cpu(struct task_struct *tsp)
{
return tsp && tsp->state == TASK_RUNNING && !tsp->on_cpu ? "!" : "";
}
#else // #ifdef CONFIG_SMP
static char *show_rcu_should_be_on_cpu(struct task_struct *tsp)
{
return "";
}
#endif // #else #ifdef CONFIG_SMP
/*
* Dump out nocb grace-period kthread state for the specified rcu_data
* structure.
@ -2389,7 +2693,7 @@ static void show_rcu_nocb_gp_state(struct rcu_data *rdp)
{
struct rcu_node *rnp = rdp->mynode;
pr_info("nocb GP %d %c%c%c%c%c%c %c[%c%c] %c%c:%ld rnp %d:%d %lu\n",
pr_info("nocb GP %d %c%c%c%c%c%c %c[%c%c] %c%c:%ld rnp %d:%d %lu %c CPU %d%s\n",
rdp->cpu,
"kK"[!!rdp->nocb_gp_kthread],
"lL"[raw_spin_is_locked(&rdp->nocb_gp_lock)],
@ -2403,12 +2707,17 @@ static void show_rcu_nocb_gp_state(struct rcu_data *rdp)
".B"[!!rdp->nocb_gp_bypass],
".G"[!!rdp->nocb_gp_gp],
(long)rdp->nocb_gp_seq,
rnp->grplo, rnp->grphi, READ_ONCE(rdp->nocb_gp_loops));
rnp->grplo, rnp->grphi, READ_ONCE(rdp->nocb_gp_loops),
rdp->nocb_gp_kthread ? task_state_to_char(rdp->nocb_gp_kthread) : '.',
rdp->nocb_cb_kthread ? (int)task_cpu(rdp->nocb_gp_kthread) : -1,
show_rcu_should_be_on_cpu(rdp->nocb_cb_kthread));
}
/* Dump out nocb kthread state for the specified rcu_data structure. */
static void show_rcu_nocb_state(struct rcu_data *rdp)
{
char bufw[20];
char bufr[20];
struct rcu_segcblist *rsclp = &rdp->cblist;
bool waslocked;
bool wastimer;
@ -2417,8 +2726,11 @@ static void show_rcu_nocb_state(struct rcu_data *rdp)
if (rdp->nocb_gp_rdp == rdp)
show_rcu_nocb_gp_state(rdp);
pr_info(" CB %d->%d %c%c%c%c%c%c F%ld L%ld C%d %c%c%c%c%c q%ld\n",
sprintf(bufw, "%ld", rsclp->gp_seq[RCU_WAIT_TAIL]);
sprintf(bufr, "%ld", rsclp->gp_seq[RCU_NEXT_READY_TAIL]);
pr_info(" CB %d^%d->%d %c%c%c%c%c%c F%ld L%ld C%d %c%c%s%c%s%c%c q%ld %c CPU %d%s\n",
rdp->cpu, rdp->nocb_gp_rdp->cpu,
rdp->nocb_next_cb_rdp ? rdp->nocb_next_cb_rdp->cpu : -1,
"kK"[!!rdp->nocb_cb_kthread],
"bB"[raw_spin_is_locked(&rdp->nocb_bypass_lock)],
"cC"[!!atomic_read(&rdp->nocb_lock_contended)],
@ -2429,11 +2741,16 @@ static void show_rcu_nocb_state(struct rcu_data *rdp)
jiffies - rdp->nocb_nobypass_last,
rdp->nocb_nobypass_count,
".D"[rcu_segcblist_ready_cbs(rsclp)],
".W"[!rcu_segcblist_restempty(rsclp, RCU_DONE_TAIL)],
".R"[!rcu_segcblist_restempty(rsclp, RCU_WAIT_TAIL)],
".N"[!rcu_segcblist_restempty(rsclp, RCU_NEXT_READY_TAIL)],
".W"[!rcu_segcblist_segempty(rsclp, RCU_WAIT_TAIL)],
rcu_segcblist_segempty(rsclp, RCU_WAIT_TAIL) ? "" : bufw,
".R"[!rcu_segcblist_segempty(rsclp, RCU_NEXT_READY_TAIL)],
rcu_segcblist_segempty(rsclp, RCU_NEXT_READY_TAIL) ? "" : bufr,
".N"[!rcu_segcblist_segempty(rsclp, RCU_NEXT_TAIL)],
".B"[!!rcu_cblist_n_cbs(&rdp->nocb_bypass)],
rcu_segcblist_n_cbs(&rdp->cblist));
rcu_segcblist_n_cbs(&rdp->cblist),
rdp->nocb_cb_kthread ? task_state_to_char(rdp->nocb_cb_kthread) : '.',
rdp->nocb_cb_kthread ? (int)task_cpu(rdp->nocb_gp_kthread) : -1,
show_rcu_should_be_on_cpu(rdp->nocb_cb_kthread));
/* It is OK for GP kthreads to have GP state. */
if (rdp->nocb_gp_rdp == rdp)

View File

@ -266,6 +266,7 @@ static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
struct task_struct *t;
struct task_struct *ts[8];
lockdep_assert_irqs_disabled();
if (!rcu_preempt_blocked_readers_cgp(rnp))
return 0;
pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
@ -290,6 +291,7 @@ static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
".q"[rscr.rs.b.need_qs],
".e"[rscr.rs.b.exp_hint],
".l"[rscr.on_blkd_list]);
lockdep_assert_irqs_disabled();
put_task_struct(t);
ndetected++;
}
@ -333,9 +335,12 @@ static void rcu_dump_cpu_stacks(void)
rcu_for_each_leaf_node(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu))
if (!trigger_single_cpu_backtrace(cpu))
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
if (cpu_is_offline(cpu))
pr_err("Offline CPU %d blocking current GP.\n", cpu);
else if (!trigger_single_cpu_backtrace(cpu))
dump_cpu_task(cpu);
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
}
@ -449,25 +454,66 @@ static void print_cpu_stall_info(int cpu)
/* Complain about starvation of grace-period kthread. */
static void rcu_check_gp_kthread_starvation(void)
{
int cpu;
struct task_struct *gpk = rcu_state.gp_kthread;
unsigned long j;
if (rcu_is_gp_kthread_starving(&j)) {
cpu = gpk ? task_cpu(gpk) : -1;
pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
rcu_state.name, j,
(long)rcu_seq_current(&rcu_state.gp_seq),
data_race(rcu_state.gp_flags),
gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1);
gpk ? gpk->state : ~0, cpu);
if (gpk) {
pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
pr_err("RCU grace-period kthread stack dump:\n");
sched_show_task(gpk);
if (cpu >= 0) {
if (cpu_is_offline(cpu)) {
pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
} else {
pr_err("Stack dump where RCU GP kthread last ran:\n");
if (!trigger_single_cpu_backtrace(cpu))
dump_cpu_task(cpu);
}
}
wake_up_process(gpk);
}
}
}
/* Complain about missing wakeups from expired fqs wait timer */
static void rcu_check_gp_kthread_expired_fqs_timer(void)
{
struct task_struct *gpk = rcu_state.gp_kthread;
short gp_state;
unsigned long jiffies_fqs;
int cpu;
/*
* Order reads of .gp_state and .jiffies_force_qs.
* Matching smp_wmb() is present in rcu_gp_fqs_loop().
*/
gp_state = smp_load_acquire(&rcu_state.gp_state);
jiffies_fqs = READ_ONCE(rcu_state.jiffies_force_qs);
if (gp_state == RCU_GP_WAIT_FQS &&
time_after(jiffies, jiffies_fqs + RCU_STALL_MIGHT_MIN) &&
gpk && !READ_ONCE(gpk->on_rq)) {
cpu = task_cpu(gpk);
pr_err("%s kthread timer wakeup didn't happen for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx\n",
rcu_state.name, (jiffies - jiffies_fqs),
(long)rcu_seq_current(&rcu_state.gp_seq),
data_race(rcu_state.gp_flags),
gp_state_getname(RCU_GP_WAIT_FQS), RCU_GP_WAIT_FQS,
gpk->state);
pr_err("\tPossible timer handling issue on cpu=%d timer-softirq=%u\n",
cpu, kstat_softirqs_cpu(TIMER_SOFTIRQ, cpu));
}
}
static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
{
int cpu;
@ -478,6 +524,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
struct rcu_node *rnp;
long totqlen = 0;
lockdep_assert_irqs_disabled();
/* Kick and suppress, if so configured. */
rcu_stall_kick_kthreads();
if (rcu_stall_is_suppressed())
@ -499,6 +547,7 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
}
}
ndetected += rcu_print_task_stall(rnp, flags); // Releases rnp->lock.
lockdep_assert_irqs_disabled();
}
for_each_possible_cpu(cpu)
@ -529,6 +578,7 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
WRITE_ONCE(rcu_state.jiffies_stall,
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
rcu_check_gp_kthread_expired_fqs_timer();
rcu_check_gp_kthread_starvation();
panic_on_rcu_stall();
@ -544,6 +594,8 @@ static void print_cpu_stall(unsigned long gps)
struct rcu_node *rnp = rcu_get_root();
long totqlen = 0;
lockdep_assert_irqs_disabled();
/* Kick and suppress, if so configured. */
rcu_stall_kick_kthreads();
if (rcu_stall_is_suppressed())
@ -564,6 +616,7 @@ static void print_cpu_stall(unsigned long gps)
jiffies - gps,
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
rcu_check_gp_kthread_expired_fqs_timer();
rcu_check_gp_kthread_starvation();
rcu_dump_cpu_stacks();
@ -598,6 +651,7 @@ static void check_cpu_stall(struct rcu_data *rdp)
unsigned long js;
struct rcu_node *rnp;
lockdep_assert_irqs_disabled();
if ((rcu_stall_is_suppressed() && !READ_ONCE(rcu_kick_kthreads)) ||
!rcu_gp_in_progress())
return;

View File

@ -56,8 +56,10 @@
#ifndef CONFIG_TINY_RCU
module_param(rcu_expedited, int, 0);
module_param(rcu_normal, int, 0);
static int rcu_normal_after_boot;
static int rcu_normal_after_boot = IS_ENABLED(CONFIG_PREEMPT_RT);
#ifndef CONFIG_PREEMPT_RT
module_param(rcu_normal_after_boot, int, 0);
#endif
#endif /* #ifndef CONFIG_TINY_RCU */
#ifdef CONFIG_DEBUG_LOCK_ALLOC

View File

@ -398,6 +398,7 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
static int scftorture_invoker(void *arg)
{
int cpu;
int curcpu;
DEFINE_TORTURE_RANDOM(rand);
struct scf_statistics *scfp = (struct scf_statistics *)arg;
bool was_offline = false;
@ -412,7 +413,10 @@ static int scftorture_invoker(void *arg)
VERBOSE_SCFTORTOUT("scftorture_invoker %d: Waiting for all SCF torturers from cpu %d", scfp->cpu, smp_processor_id());
// Make sure that the CPU is affinitized appropriately during testing.
WARN_ON_ONCE(smp_processor_id() != scfp->cpu);
curcpu = smp_processor_id();
WARN_ONCE(curcpu != scfp->cpu % nr_cpu_ids,
"%s: Wanted CPU %d, running on %d, nr_cpu_ids = %d\n",
__func__, scfp->cpu, curcpu, nr_cpu_ids);
if (!atomic_dec_return(&n_started))
while (atomic_read_acquire(&n_started)) {

View File

@ -3478,7 +3478,7 @@ out:
/**
* try_invoke_on_locked_down_task - Invoke a function on task in fixed state
* @p: Process for which the function is to be invoked.
* @p: Process for which the function is to be invoked, can be @current.
* @func: Function to invoke.
* @arg: Argument to function.
*
@ -3496,12 +3496,11 @@ out:
*/
bool try_invoke_on_locked_down_task(struct task_struct *p, bool (*func)(struct task_struct *t, void *arg), void *arg)
{
bool ret = false;
struct rq_flags rf;
bool ret = false;
struct rq *rq;
lockdep_assert_irqs_enabled();
raw_spin_lock_irq(&p->pi_lock);
raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
if (p->on_rq) {
rq = __task_rq_lock(p, &rf);
if (task_rq(p) == rq)
@ -3518,7 +3517,7 @@ bool try_invoke_on_locked_down_task(struct task_struct *p, bool (*func)(struct t
ret = func(p, arg);
}
}
raw_spin_unlock_irq(&p->pi_lock);
raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
return ret;
}

View File

@ -1237,6 +1237,20 @@ int try_to_del_timer_sync(struct timer_list *timer)
}
EXPORT_SYMBOL(try_to_del_timer_sync);
bool timer_curr_running(struct timer_list *timer)
{
int i;
for (i = 0; i < NR_BASES; i++) {
struct timer_base *base = this_cpu_ptr(&timer_bases[i]);
if (base->running_timer == timer)
return true;
}
return false;
}
#ifdef CONFIG_PREEMPT_RT
static __init void timer_base_init_expiry_lock(struct timer_base *base)
{

View File

@ -48,6 +48,12 @@ module_param(disable_onoff_at_boot, bool, 0444);
static bool ftrace_dump_at_shutdown;
module_param(ftrace_dump_at_shutdown, bool, 0444);
static int verbose_sleep_frequency;
module_param(verbose_sleep_frequency, int, 0444);
static int verbose_sleep_duration = 1;
module_param(verbose_sleep_duration, int, 0444);
static char *torture_type;
static int verbose;
@ -58,6 +64,95 @@ static int verbose;
static int fullstop = FULLSTOP_RMMOD;
static DEFINE_MUTEX(fullstop_mutex);
static atomic_t verbose_sleep_counter;
/*
* Sleep if needed from VERBOSE_TOROUT*().
*/
void verbose_torout_sleep(void)
{
if (verbose_sleep_frequency > 0 &&
verbose_sleep_duration > 0 &&
!(atomic_inc_return(&verbose_sleep_counter) % verbose_sleep_frequency))
schedule_timeout_uninterruptible(verbose_sleep_duration);
}
EXPORT_SYMBOL_GPL(verbose_torout_sleep);
/*
* Schedule a high-resolution-timer sleep in nanoseconds, with a 32-bit
* nanosecond random fuzz. This function and its friends desynchronize
* testing from the timer wheel.
*/
int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, struct torture_random_state *trsp)
{
ktime_t hto = baset_ns;
if (trsp)
hto += (torture_random(trsp) >> 3) % fuzzt_ns;
set_current_state(TASK_UNINTERRUPTIBLE);
return schedule_hrtimeout(&hto, HRTIMER_MODE_REL);
}
EXPORT_SYMBOL_GPL(torture_hrtimeout_ns);
/*
* Schedule a high-resolution-timer sleep in microseconds, with a 32-bit
* nanosecond (not microsecond!) random fuzz.
*/
int torture_hrtimeout_us(u32 baset_us, u32 fuzzt_ns, struct torture_random_state *trsp)
{
ktime_t baset_ns = baset_us * NSEC_PER_USEC;
return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp);
}
EXPORT_SYMBOL_GPL(torture_hrtimeout_us);
/*
* Schedule a high-resolution-timer sleep in milliseconds, with a 32-bit
* microsecond (not millisecond!) random fuzz.
*/
int torture_hrtimeout_ms(u32 baset_ms, u32 fuzzt_us, struct torture_random_state *trsp)
{
ktime_t baset_ns = baset_ms * NSEC_PER_MSEC;
u32 fuzzt_ns;
if ((u32)~0U / NSEC_PER_USEC < fuzzt_us)
fuzzt_ns = (u32)~0U;
else
fuzzt_ns = fuzzt_us * NSEC_PER_USEC;
return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp);
}
EXPORT_SYMBOL_GPL(torture_hrtimeout_ms);
/*
* Schedule a high-resolution-timer sleep in jiffies, with an
* implied one-jiffy random fuzz. This is intended to replace calls to
* schedule_timeout_interruptible() and friends.
*/
int torture_hrtimeout_jiffies(u32 baset_j, struct torture_random_state *trsp)
{
ktime_t baset_ns = jiffies_to_nsecs(baset_j);
return torture_hrtimeout_ns(baset_ns, jiffies_to_nsecs(1), trsp);
}
EXPORT_SYMBOL_GPL(torture_hrtimeout_jiffies);
/*
* Schedule a high-resolution-timer sleep in milliseconds, with a 32-bit
* millisecond (not second!) random fuzz.
*/
int torture_hrtimeout_s(u32 baset_s, u32 fuzzt_ms, struct torture_random_state *trsp)
{
ktime_t baset_ns = baset_s * NSEC_PER_SEC;
u32 fuzzt_ns;
if ((u32)~0U / NSEC_PER_MSEC < fuzzt_ms)
fuzzt_ns = (u32)~0U;
else
fuzzt_ns = fuzzt_ms * NSEC_PER_MSEC;
return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp);
}
EXPORT_SYMBOL_GPL(torture_hrtimeout_s);
#ifdef CONFIG_HOTPLUG_CPU
/*
@ -80,6 +175,19 @@ static unsigned long sum_online;
static int min_online = -1;
static int max_online;
static int torture_online_cpus = NR_CPUS;
/*
* Some torture testing leverages confusion as to the number of online
* CPUs. This function returns the torture-testing view of this number,
* which allows torture tests to load-balance appropriately.
*/
int torture_num_online_cpus(void)
{
return READ_ONCE(torture_online_cpus);
}
EXPORT_SYMBOL_GPL(torture_num_online_cpus);
/*
* Attempt to take a CPU offline. Return false if the CPU is already
* offline or if it is not subject to CPU-hotplug operations. The
@ -134,6 +242,8 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
*min_offl = delta;
if (*max_offl < delta)
*max_offl = delta;
WRITE_ONCE(torture_online_cpus, torture_online_cpus - 1);
WARN_ON_ONCE(torture_online_cpus <= 0);
}
return true;
@ -190,12 +300,33 @@ bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes,
*min_onl = delta;
if (*max_onl < delta)
*max_onl = delta;
WRITE_ONCE(torture_online_cpus, torture_online_cpus + 1);
}
return true;
}
EXPORT_SYMBOL_GPL(torture_online);
/*
* Get everything online at the beginning and ends of tests.
*/
static void torture_online_all(char *phase)
{
int cpu;
int ret;
for_each_possible_cpu(cpu) {
if (cpu_online(cpu))
continue;
ret = add_cpu(cpu);
if (ret && verbose) {
pr_alert("%s" TORTURE_FLAG
"%s: %s online %d: errno %d\n",
__func__, phase, torture_type, cpu, ret);
}
}
}
/*
* Execute random CPU-hotplug operations at the interval specified
* by the onoff_interval.
@ -206,25 +337,12 @@ torture_onoff(void *arg)
int cpu;
int maxcpu = -1;
DEFINE_TORTURE_RANDOM(rand);
int ret;
VERBOSE_TOROUT_STRING("torture_onoff task started");
for_each_online_cpu(cpu)
maxcpu = cpu;
WARN_ON(maxcpu < 0);
if (!IS_MODULE(CONFIG_TORTURE_TEST)) {
for_each_possible_cpu(cpu) {
if (cpu_online(cpu))
continue;
ret = add_cpu(cpu);
if (ret && verbose) {
pr_alert("%s" TORTURE_FLAG
"%s: Initial online %d: errno %d\n",
__func__, torture_type, cpu, ret);
}
}
}
torture_online_all("Initial");
if (maxcpu == 0) {
VERBOSE_TOROUT_STRING("Only one CPU, so CPU-hotplug testing is disabled");
goto stop;
@ -252,6 +370,7 @@ torture_onoff(void *arg)
stop:
torture_kthread_stopping("torture_onoff");
torture_online_all("Final");
return 0;
}
@ -602,7 +721,6 @@ static int stutter_gap;
*/
bool stutter_wait(const char *title)
{
ktime_t delay;
unsigned int i = 0;
bool ret = false;
int spt;
@ -618,11 +736,8 @@ bool stutter_wait(const char *title)
schedule_timeout_interruptible(1);
} else if (spt == 2) {
while (READ_ONCE(stutter_pause_test)) {
if (!(i++ & 0xffff)) {
set_current_state(TASK_INTERRUPTIBLE);
delay = 10 * NSEC_PER_USEC;
schedule_hrtimeout(&delay, HRTIMER_MODE_REL);
}
if (!(i++ & 0xffff))
torture_hrtimeout_us(10, 0, NULL);
cond_resched();
}
} else {
@ -640,7 +755,6 @@ EXPORT_SYMBOL_GPL(stutter_wait);
*/
static int torture_stutter(void *arg)
{
ktime_t delay;
DEFINE_TORTURE_RANDOM(rand);
int wtime;
@ -651,20 +765,15 @@ static int torture_stutter(void *arg)
if (stutter > 2) {
WRITE_ONCE(stutter_pause_test, 1);
wtime = stutter - 3;
delay = ktime_divns(NSEC_PER_SEC * wtime, HZ);
delay += (torture_random(&rand) >> 3) % NSEC_PER_MSEC;
set_current_state(TASK_INTERRUPTIBLE);
schedule_hrtimeout(&delay, HRTIMER_MODE_REL);
torture_hrtimeout_jiffies(wtime, &rand);
wtime = 2;
}
WRITE_ONCE(stutter_pause_test, 2);
delay = ktime_divns(NSEC_PER_SEC * wtime, HZ);
set_current_state(TASK_INTERRUPTIBLE);
schedule_hrtimeout(&delay, HRTIMER_MODE_REL);
torture_hrtimeout_jiffies(wtime, NULL);
}
WRITE_ONCE(stutter_pause_test, 0);
if (!torture_must_stop())
schedule_timeout_interruptible(stutter_gap);
torture_hrtimeout_jiffies(stutter_gap, NULL);
torture_shutdown_absorb("torture_stutter");
} while (!torture_must_stop());
torture_kthread_stopping("torture_stutter");

View File

@ -5,6 +5,7 @@
#include <linux/sched.h>
#include <linux/wait.h>
#include <linux/slab.h>
#include <linux/mm.h>
#include <linux/percpu-refcount.h>
/*
@ -168,6 +169,7 @@ static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
struct percpu_ref_data, rcu);
struct percpu_ref *ref = data->ref;
unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
static atomic_t underflows;
unsigned long count = 0;
int cpu;
@ -191,9 +193,13 @@ static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
*/
atomic_long_add((long)count - PERCPU_COUNT_BIAS, &data->count);
WARN_ONCE(atomic_long_read(&data->count) <= 0,
"percpu ref (%ps) <= 0 (%ld) after switching to atomic",
data->release, atomic_long_read(&data->count));
if (WARN_ONCE(atomic_long_read(&data->count) <= 0,
"percpu ref (%ps) <= 0 (%ld) after switching to atomic",
data->release, atomic_long_read(&data->count)) &&
atomic_inc_return(&underflows) < 4) {
pr_err("%s(): percpu_ref underflow", __func__);
mem_dump_obj(data);
}
/* @ref is viewed as dead on all CPUs, send out switch confirmation */
percpu_ref_call_confirm_rcu(rcu);

View File

@ -3635,6 +3635,26 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
EXPORT_SYMBOL(__kmalloc_node_track_caller);
#endif /* CONFIG_NUMA */
void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
{
struct kmem_cache *cachep;
unsigned int objnr;
void *objp;
kpp->kp_ptr = object;
kpp->kp_page = page;
cachep = page->slab_cache;
kpp->kp_slab_cache = cachep;
objp = object - obj_offset(cachep);
kpp->kp_data_offset = obj_offset(cachep);
page = virt_to_head_page(objp);
objnr = obj_to_index(cachep, page, objp);
objp = index_to_obj(cachep, page, objnr);
kpp->kp_objp = objp;
if (DEBUG && cachep->flags & SLAB_STORE_USER)
kpp->kp_ret = *dbg_userword(cachep, objp);
}
/**
* __do_kmalloc - allocate memory
* @size: how many bytes of memory are required.

View File

@ -615,4 +615,16 @@ static inline bool slab_want_init_on_free(struct kmem_cache *c)
return false;
}
#define KS_ADDRS_COUNT 16
struct kmem_obj_info {
void *kp_ptr;
struct page *kp_page;
void *kp_objp;
unsigned long kp_data_offset;
struct kmem_cache *kp_slab_cache;
void *kp_ret;
void *kp_stack[KS_ADDRS_COUNT];
};
void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page);
#endif /* MM_SLAB_H */

View File

@ -537,6 +537,81 @@ bool slab_is_available(void)
return slab_state >= UP;
}
/**
* kmem_valid_obj - does the pointer reference a valid slab object?
* @object: pointer to query.
*
* Return: %true if the pointer is to a not-yet-freed object from
* kmalloc() or kmem_cache_alloc(), either %true or %false if the pointer
* is to an already-freed object, and %false otherwise.
*/
bool kmem_valid_obj(void *object)
{
struct page *page;
/* Some arches consider ZERO_SIZE_PTR to be a valid address. */
if (object < (void *)PAGE_SIZE || !virt_addr_valid(object))
return false;
page = virt_to_head_page(object);
return PageSlab(page);
}
/**
* kmem_dump_obj - Print available slab provenance information
* @object: slab object for which to find provenance information.
*
* This function uses pr_cont(), so that the caller is expected to have
* printed out whatever preamble is appropriate. The provenance information
* depends on the type of object and on how much debugging is enabled.
* For a slab-cache object, the fact that it is a slab object is printed,
* and, if available, the slab name, return address, and stack trace from
* the allocation of that object.
*
* This function will splat if passed a pointer to a non-slab object.
* If you are not sure what type of object you have, you should instead
* use mem_dump_obj().
*/
void kmem_dump_obj(void *object)
{
char *cp = IS_ENABLED(CONFIG_MMU) ? "" : "/vmalloc";
int i;
struct page *page;
unsigned long ptroffset;
struct kmem_obj_info kp = { };
if (WARN_ON_ONCE(!virt_addr_valid(object)))
return;
page = virt_to_head_page(object);
if (WARN_ON_ONCE(!PageSlab(page))) {
pr_cont(" non-slab memory.\n");
return;
}
kmem_obj_info(&kp, object, page);
if (kp.kp_slab_cache)
pr_cont(" slab%s %s", cp, kp.kp_slab_cache->name);
else
pr_cont(" slab%s", cp);
if (kp.kp_objp)
pr_cont(" start %px", kp.kp_objp);
if (kp.kp_data_offset)
pr_cont(" data offset %lu", kp.kp_data_offset);
if (kp.kp_objp) {
ptroffset = ((char *)object - (char *)kp.kp_objp) - kp.kp_data_offset;
pr_cont(" pointer offset %lu", ptroffset);
}
if (kp.kp_slab_cache && kp.kp_slab_cache->usersize)
pr_cont(" size %u", kp.kp_slab_cache->usersize);
if (kp.kp_ret)
pr_cont(" allocated at %pS\n", kp.kp_ret);
else
pr_cont("\n");
for (i = 0; i < ARRAY_SIZE(kp.kp_stack); i++) {
if (!kp.kp_stack[i])
break;
pr_info(" %pS\n", kp.kp_stack[i]);
}
}
#ifndef CONFIG_SLOB
/* Create a cache during boot when no slab services are available yet */
void __init create_boot_cache(struct kmem_cache *s, const char *name,

View File

@ -461,6 +461,12 @@ out:
spin_unlock_irqrestore(&slob_lock, flags);
}
void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
{
kpp->kp_ptr = object;
kpp->kp_page = page;
}
/*
* End of slob allocator proper. Begin kmem_cache_alloc and kmalloc frontend.
*/

View File

@ -3933,6 +3933,46 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
return 0;
}
void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
{
void *base;
int __maybe_unused i;
unsigned int objnr;
void *objp;
void *objp0;
struct kmem_cache *s = page->slab_cache;
struct track __maybe_unused *trackp;
kpp->kp_ptr = object;
kpp->kp_page = page;
kpp->kp_slab_cache = s;
base = page_address(page);
objp0 = kasan_reset_tag(object);
#ifdef CONFIG_SLUB_DEBUG
objp = restore_red_left(s, objp0);
#else
objp = objp0;
#endif
objnr = obj_to_index(s, page, objp);
kpp->kp_data_offset = (unsigned long)((char *)objp0 - (char *)objp);
objp = base + s->size * objnr;
kpp->kp_objp = objp;
if (WARN_ON_ONCE(objp < base || objp >= base + page->objects * s->size || (objp - base) % s->size) ||
!(s->flags & SLAB_STORE_USER))
return;
#ifdef CONFIG_SLUB_DEBUG
trackp = get_track(s, objp, TRACK_ALLOC);
kpp->kp_ret = (void *)trackp->addr;
#ifdef CONFIG_STACKTRACE
for (i = 0; i < KS_ADDRS_COUNT && i < TRACK_ADDRS_COUNT; i++) {
kpp->kp_stack[i] = (void *)trackp->addrs[i];
if (!kpp->kp_stack[i])
break;
}
#endif
#endif
}
/********************************************************************
* Kmalloc subsystem
*******************************************************************/

View File

@ -982,3 +982,34 @@ int __weak memcmp_pages(struct page *page1, struct page *page2)
kunmap_atomic(addr1);
return ret;
}
/**
* mem_dump_obj - Print available provenance information
* @object: object for which to find provenance information.
*
* This function uses pr_cont(), so that the caller is expected to have
* printed out whatever preamble is appropriate. The provenance information
* depends on the type of object and on how much debugging is enabled.
* For example, for a slab-cache object, the slab name is printed, and,
* if available, the return address and stack trace from the allocation
* of that object.
*/
void mem_dump_obj(void *object)
{
if (kmem_valid_obj(object)) {
kmem_dump_obj(object);
return;
}
if (vmalloc_dump_obj(object))
return;
if (!virt_addr_valid(object)) {
if (object == NULL)
pr_cont(" NULL pointer.\n");
else if (object == ZERO_SIZE_PTR)
pr_cont(" zero-size pointer.\n");
else
pr_cont(" non-paged memory.\n");
return;
}
pr_cont(" non-slab/vmalloc memory.\n");
}

View File

@ -3450,6 +3450,19 @@ void pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
}
#endif /* CONFIG_SMP */
bool vmalloc_dump_obj(void *object)
{
struct vm_struct *vm;
void *objp = (void *)PAGE_ALIGN((unsigned long)object);
vm = find_vm_area(objp);
if (!vm)
return false;
pr_cont(" %u-page vmalloc region starting at %#lx allocated at %pS\n",
vm->nr_pages, (unsigned long)vm->addr, vm->caller);
return true;
}
#ifdef CONFIG_PROC_FS
static void *s_start(struct seq_file *m, loff_t *pos)
__acquires(&vmap_purge_lock)

View File

@ -0,0 +1,67 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0+
#
# Create a spreadsheet from torture-test Kconfig options and kernel boot
# parameters. Run this in the directory containing the scenario files.
#
# Usage: config2csv path.csv [ "scenario1 scenario2 ..." ]
#
# By default, this script will take the list of scenarios from the CFLIST
# file in that directory, otherwise it will consider only the scenarios
# specified on the command line. It will examine each scenario's file
# and also its .boot file, if present, and create a column in the .csv
# output file. Note that "CFLIST" is a synonym for all the scenarios in the
# CFLIST file, which allows easy comparison of those scenarios with selected
# scenarios such as BUSTED that are normally omitted from CFLIST files.
csvout=${1}
if test -z "$csvout"
then
echo "Need .csv output file as first argument."
exit 1
fi
shift
defaultconfigs="`tr '\012' ' ' < CFLIST`"
if test "$#" -eq 0
then
scenariosarg=$defaultconfigs
else
scenariosarg=$*
fi
scenarios="`echo $scenariosarg | sed -e "s/\<CFLIST\>/$defaultconfigs/g"`"
T=/tmp/config2latex.sh.$$
trap 'rm -rf $T' 0
mkdir $T
cat << '---EOF---' >> $T/p.awk
END {
---EOF---
for i in $scenarios
do
echo ' s["'$i'"] = 1;' >> $T/p.awk
grep -v '^#' < $i | grep -v '^ *$' > $T/p
if test -r $i.boot
then
tr -s ' ' '\012' < $i.boot | grep -v '^#' >> $T/p
fi
sed -e 's/^[^=]*$/&=?/' < $T/p |
sed -e 's/^\([^=]*\)=\(.*\)$/\tp["\1:'"$i"'"] = "\2";\n\tc["\1"] = 1;/' >> $T/p.awk
done
cat << '---EOF---' >> $T/p.awk
ns = asorti(s, ss);
nc = asorti(c, cs);
for (j = 1; j <= ns; j++)
printf ",\"%s\"", ss[j];
printf "\n";
for (i = 1; i <= nc; i++) {
printf "\"%s\"", cs[i];
for (j = 1; j <= ns; j++) {
printf ",\"%s\"", p[cs[i] ":" ss[j]];
}
printf "\n";
}
}
---EOF---
awk -f $T/p.awk < /dev/null > $T/p.csv
cp $T/p.csv $csvout

View File

@ -14,4 +14,5 @@ egrep 'Badness|WARNING:|Warn|BUG|===========|Call Trace:|Oops:|detected stalls o
grep -v 'ODEBUG: ' |
grep -v 'This means that this is a DEBUG kernel and it is' |
grep -v 'Warning: unable to open an initial console' |
grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr.*the init process!' |
grep -v 'NOHZ tick-stop error: Non-RCU local softirq work is pending, handler'

View File

@ -108,6 +108,39 @@ configfrag_hotplug_cpu () {
grep -q '^CONFIG_HOTPLUG_CPU=y$' "$1"
}
# get_starttime
#
# Returns a cookie identifying the current time.
get_starttime () {
awk 'BEGIN { print systime() }' < /dev/null
}
# get_starttime_duration starttime
#
# Given the return value from get_starttime, compute a human-readable
# string denoting the time since get_starttime.
get_starttime_duration () {
awk -v starttime=$1 '
BEGIN {
ts = systime() - starttime;
tm = int(ts / 60);
th = int(ts / 3600);
td = int(ts / 86400);
d = td;
h = th - td * 24;
m = tm - th * 60;
s = ts - tm * 60;
if (d >= 1)
printf "%dd %d:%02d:%02d\n", d, h, m, s
else if (h >= 1)
printf "%d:%02d:%02d\n", h, m, s
else if (m >= 1)
printf "%d:%02d.0\n", m, s
else
print s " seconds"
}' < /dev/null
}
# identify_boot_image qemu-cmd
#
# Returns the relative path to the kernel build image. This will be
@ -170,6 +203,7 @@ identify_qemu () {
# and the TORTURE_QEMU_INTERACTIVE environment variable.
identify_qemu_append () {
echo debug_boot_weak_hash
echo panic=-1
local console=ttyS0
case "$1" in
qemu-system-x86_64|qemu-system-i386)
@ -232,7 +266,7 @@ identify_qemu_args () {
# Returns the number of virtual CPUs available to the aggregate of the
# guest OSes.
identify_qemu_vcpus () {
lscpu | grep '^CPU(s):' | sed -e 's/CPU(s)://' -e 's/[ ]*//g'
getconf _NPROCESSORS_ONLN
}
# print_bug

View File

@ -39,12 +39,14 @@ done
if test -n "$files"
then
$editor $files
editorret=1
else
echo No build errors.
fi
if grep -q -e "--buildonly" < ${rundir}/log
then
echo Build-only run, no console logs to check.
exit $editorret
fi
# Find console logs with errors
@ -62,5 +64,10 @@ then
exit 1
else
echo No errors in console logs.
exit 0
if test -n "$editorret"
then
exit $editorret
else
exit 0
fi
fi

View File

@ -87,15 +87,16 @@ do
fi
done
EDITOR=echo kvm-find-errors.sh "${@: -1}" > $T 2>&1
ret=$?
builderrors="`tr ' ' '\012' < $T | grep -c '/Make.out.diags'`"
if test "$builderrors" -gt 0
then
echo $builderrors runs with build errors.
ret=1
fi
runerrors="`tr ' ' '\012' < $T | grep -c '/console.log.diags'`"
if test "$runerrors" -gt 0
then
echo $runerrors runs with runtime errors.
ret=2
fi
exit $ret

View File

@ -125,7 +125,6 @@ seconds=$4
qemu_args=$5
boot_args=$6
kstarttime=`gawk 'BEGIN { print systime() }' < /dev/null`
if test -z "$TORTURE_BUILDONLY"
then
echo ' ---' `date`: Starting kernel
@ -158,6 +157,8 @@ then
boot_args="$boot_args $TORTURE_BOOT_GDB_ARG"
fi
echo $QEMU $qemu_args -m $TORTURE_QEMU_MEM -kernel $KERNEL -append \"$qemu_append $boot_args\" $TORTURE_QEMU_GDB_ARG > $resdir/qemu-cmd
echo "# TORTURE_SHUTDOWN_GRACE=$TORTURE_SHUTDOWN_GRACE" >> $resdir/qemu-cmd
echo "# seconds=$seconds" >> $resdir/qemu-cmd
if test -n "$TORTURE_BUILDONLY"
then
@ -174,6 +175,7 @@ echo 'echo $! > $resdir/qemu_pid' >> $T/qemu-cmd
echo "NOTE: $QEMU either did not run or was interactive" > $resdir/console.log
# Attempt to run qemu
kstarttime=`gawk 'BEGIN { print systime() }' < /dev/null`
( . $T/qemu-cmd; wait `cat $resdir/qemu_pid`; echo $? > $resdir/qemu-retval ) &
commandcompleted=0
if test -z "$TORTURE_KCONFIG_GDB_ARG"
@ -209,7 +211,7 @@ do
if test -n "$TORTURE_KCONFIG_GDB_ARG"
then
:
elif test $kruntime -ge $seconds || test -f "$TORTURE_STOPFILE"
elif test $kruntime -ge $seconds || test -f "$resdir/../STOP.1"
then
break;
fi
@ -252,16 +254,16 @@ then
fi
if test $commandcompleted -eq 0 -a -n "$qemu_pid"
then
if ! test -f "$TORTURE_STOPFILE"
if ! test -f "$resdir/../STOP.1"
then
echo Grace period for qemu job at pid $qemu_pid
fi
oldline="`tail $resdir/console.log`"
while :
do
if test -f "$TORTURE_STOPFILE"
if test -f "$resdir/../STOP.1"
then
echo "PID $qemu_pid killed due to run STOP request" >> $resdir/Warnings 2>&1
echo "PID $qemu_pid killed due to run STOP.1 request" >> $resdir/Warnings 2>&1
kill -KILL $qemu_pid
break
fi

View File

@ -47,6 +47,9 @@ cpus=0
ds=`date +%Y.%m.%d-%H.%M.%S`
jitter="-1"
startdate="`date`"
starttime="`get_starttime`"
usage () {
echo "Usage: $scriptname optional arguments:"
echo " --allcpus"
@ -57,7 +60,7 @@ usage () {
echo " --cpus N"
echo " --datestamp string"
echo " --defconfig string"
echo " --dryrun sched|script"
echo " --dryrun batches|sched|script"
echo " --duration minutes | <seconds>s | <hours>h | <days>d"
echo " --gdb"
echo " --help"
@ -85,7 +88,7 @@ do
;;
--bootargs|--bootarg)
checkarg --bootargs "(list of kernel boot arguments)" "$#" "$2" '.*' '^--'
TORTURE_BOOTARGS="$2"
TORTURE_BOOTARGS="$TORTURE_BOOTARGS $2"
shift
;;
--bootimage)
@ -97,8 +100,8 @@ do
TORTURE_BUILDONLY=1
;;
--configs|--config)
checkarg --configs "(list of config files)" "$#" "$2" '^[^/]*$' '^--'
configs="$2"
checkarg --configs "(list of config files)" "$#" "$2" '^[^/]\+$' '^--'
configs="$configs $2"
shift
;;
--cpus)
@ -113,7 +116,7 @@ do
shift
;;
--datestamp)
checkarg --datestamp "(relative pathname)" "$#" "$2" '^[^/]*$' '^--'
checkarg --datestamp "(relative pathname)" "$#" "$2" '^[a-zA-Z0-9._-/]*$' '^--'
ds=$2
shift
;;
@ -123,7 +126,7 @@ do
shift
;;
--dryrun)
checkarg --dryrun "sched|script" $# "$2" 'sched\|script' '^--'
checkarg --dryrun "batches|sched|script" $# "$2" 'batches\|sched\|script' '^--'
dryrun=$2
shift
;;
@ -162,18 +165,18 @@ do
;;
--kconfig|--kconfigs)
checkarg --kconfig "(Kconfig options)" $# "$2" '^CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\( CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\)*$' '^error$'
TORTURE_KCONFIG_ARG="$2"
TORTURE_KCONFIG_ARG="`echo "$TORTURE_KCONFIG_ARG $2" | sed -e 's/^ *//' -e 's/ *$//'`"
shift
;;
--kasan)
TORTURE_KCONFIG_KASAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KASAN=y"; export TORTURE_KCONFIG_KASAN_ARG
;;
--kcsan)
TORTURE_KCONFIG_KCSAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KCSAN=y CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n CONFIG_KCSAN_REPORT_ONCE_IN_MS=100000 CONFIG_KCSAN_VERBOSE=y CONFIG_KCSAN_INTERRUPT_WATCHER=y"; export TORTURE_KCONFIG_KCSAN_ARG
TORTURE_KCONFIG_KCSAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KCSAN=y CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n CONFIG_KCSAN_REPORT_ONCE_IN_MS=100000 CONFIG_KCSAN_INTERRUPT_WATCHER=y CONFIG_KCSAN_VERBOSE=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y"; export TORTURE_KCONFIG_KCSAN_ARG
;;
--kmake-arg|--kmake-args)
checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
TORTURE_KMAKE_ARG="$2"
TORTURE_KMAKE_ARG="`echo "$TORTURE_KMAKE_ARG $2" | sed -e 's/^ *//' -e 's/ *$//'`"
shift
;;
--mac)
@ -191,7 +194,7 @@ do
;;
--qemu-args|--qemu-arg)
checkarg --qemu-args "(qemu arguments)" $# "$2" '^-' '^error'
TORTURE_QEMU_ARG="$2"
TORTURE_QEMU_ARG="`echo "$TORTURE_QEMU_ARG $2" | sed -e 's/^ *//' -e 's/ *$//'`"
shift
;;
--qemu-cmd)
@ -232,7 +235,7 @@ do
shift
done
if test -z "$TORTURE_INITRD" || tools/testing/selftests/rcutorture/bin/mkinitrd.sh
if test -n "$dryrun" || test -z "$TORTURE_INITRD" || tools/testing/selftests/rcutorture/bin/mkinitrd.sh
then
:
else
@ -283,19 +286,34 @@ then
exit 1
fi
fi
for CF1 in $configs_derep
echo 'BEGIN {' > $T/cfgcpu.awk
for CF1 in `echo $configs_derep | tr -s ' ' '\012' | sort -u`
do
if test -f "$CONFIGFRAG/$CF1"
then
cpu_count=`configNR_CPUS.sh $CONFIGFRAG/$CF1`
if echo "$TORTURE_KCONFIG_ARG" | grep -q '\<CONFIG_NR_CPUS='
then
echo "$TORTURE_KCONFIG_ARG" | tr -s ' ' | tr ' ' '\012' > $T/KCONFIG_ARG
cpu_count=`configNR_CPUS.sh $T/KCONFIG_ARG`
else
cpu_count=`configNR_CPUS.sh $CONFIGFRAG/$CF1`
fi
cpu_count=`configfrag_boot_cpus "$TORTURE_BOOTARGS" "$CONFIGFRAG/$CF1" "$cpu_count"`
cpu_count=`configfrag_boot_maxcpus "$TORTURE_BOOTARGS" "$CONFIGFRAG/$CF1" "$cpu_count"`
echo $CF1 $cpu_count >> $T/cfgcpu
echo 'scenariocpu["'"$CF1"'"] = '"$cpu_count"';' >> $T/cfgcpu.awk
else
echo "The --configs file $CF1 does not exist, terminating."
exit 1
fi
done
cat << '___EOF___' >> $T/cfgcpu.awk
}
{
for (i = 1; i <= NF; i++)
print $i, scenariocpu[$i];
}
___EOF___
echo $configs_derep | awk -f $T/cfgcpu.awk > $T/cfgcpu
sort -k2nr $T/cfgcpu -T="$T" > $T/cfgcpu.sort
# Use a greedy bin-packing algorithm, sorting the list accordingly.
@ -315,11 +333,10 @@ END {
batch = 0;
nc = -1;
# Each pass through the following loop creates on test batch
# that can be executed concurrently given ncpus. Note that a
# given test that requires more than the available CPUs will run in
# their own batch. Such tests just have to make do with what
# is available.
# Each pass through the following loop creates on test batch that
# can be executed concurrently given ncpus. Note that a given test
# that requires more than the available CPUs will run in its own
# batch. Such tests just have to make do with what is available.
while (nc != ncpus) {
batch++;
nc = ncpus;
@ -375,9 +392,9 @@ if ! test -e $resdir
then
mkdir -p "$resdir" || :
fi
mkdir $resdir/$ds
mkdir -p $resdir/$ds
TORTURE_RESDIR="$resdir/$ds"; export TORTURE_RESDIR
TORTURE_STOPFILE="$resdir/$ds/STOP"; export TORTURE_STOPFILE
TORTURE_STOPFILE="$resdir/$ds/STOP.1"; export TORTURE_STOPFILE
echo Results directory: $resdir/$ds
echo $scriptname $args
touch $resdir/$ds/log
@ -517,14 +534,19 @@ END {
dump(first, i, batchnum);
}' >> $T/script
cat << ___EOF___ >> $T/script
echo
echo
echo " --- `date` Test summary:"
echo Results directory: $resdir/$ds
kcsan-collapse.sh $resdir/$ds
kvm-recheck.sh $resdir/$ds
cat << '___EOF___' >> $T/script
echo | tee -a $TORTURE_RESDIR/log
echo | tee -a $TORTURE_RESDIR/log
echo " --- `date` Test summary:" | tee -a $TORTURE_RESDIR/log
___EOF___
cat << ___EOF___ >> $T/script
echo Results directory: $resdir/$ds | tee -a $resdir/$ds/log
kcsan-collapse.sh $resdir/$ds | tee -a $resdir/$ds/log
kvm-recheck.sh $resdir/$ds > $T/kvm-recheck.sh.out 2>&1
___EOF___
echo 'ret=$?' >> $T/script
echo "cat $T/kvm-recheck.sh.out | tee -a $resdir/$ds/log" >> $T/script
echo 'exit $ret' >> $T/script
if test "$dryrun" = script
then
@ -533,13 +555,34 @@ then
elif test "$dryrun" = sched
then
# Extract the test run schedule from the script.
egrep 'Start batch|Starting build\.' $T/script |
grep -v ">>" |
egrep 'Start batch|Starting build\.' $T/script | grep -v ">>" |
sed -e 's/:.*$//' -e 's/^echo //'
nbuilds="`grep 'Starting build\.' $T/script |
grep -v ">>" | sed -e 's/:.*$//' -e 's/^echo //' |
awk '{ print $1 }' | grep -v '\.' | wc -l`"
echo Total number of builds: $nbuilds
nbatches="`grep 'Start batch' $T/script | grep -v ">>" | wc -l`"
echo Total number of batches: $nbatches
exit 0
elif test "$dryrun" = batches
then
# Extract the tests and their batches from the script.
egrep 'Start batch|Starting build\.' $T/script | grep -v ">>" |
sed -e 's/:.*$//' -e 's/^echo //' -e 's/-ovf//' |
awk '
/^----Start/ {
batchno = $3;
next;
}
{
print batchno, $1, $2
}'
else
# Not a dryrun, so run the script.
sh $T/script
bash $T/script
ret=$?
echo " --- Done at `date` (`get_starttime_duration $starttime`) exitcode $ret" | tee -a $resdir/$ds/log
exit $ret
fi
# Tracing: trace_event=rcu:rcu_grace_period,rcu:rcu_future_grace_period,rcu:rcu_grace_period_init,rcu:rcu_nocb_wake,rcu:rcu_preempt_task,rcu:rcu_unlock_preempted_task,rcu:rcu_quiescent_state_report,rcu:rcu_fqs,rcu:rcu_callback,rcu:rcu_kfree_callback,rcu:rcu_batch_start,rcu:rcu_invoke_callback,rcu:rcu_invoke_kfree_callback,rcu:rcu_batch_end,rcu:rcu_torture_read,rcu:rcu_barrier

View File

@ -21,7 +21,7 @@ mkdir $T
. functions.sh
if grep -q CC < $F || test -n "$TORTURE_TRUST_MAKE"
if grep -q CC < $F || test -n "$TORTURE_TRUST_MAKE" || grep -qe --trust-make < `dirname $F`/../log
then
:
else

View File

@ -128,7 +128,7 @@ then
then
summary="$summary Badness: $n_badness"
fi
n_warn=`grep -v 'Warning: unable to open an initial console' $file | egrep -c 'WARNING:|Warn'`
n_warn=`grep -v 'Warning: unable to open an initial console' $file | grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process' | egrep -c 'WARNING:|Warn'`
if test "$n_warn" -ne 0
then
summary="$summary Warnings: $n_warn"

View File

@ -0,0 +1,442 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Run a series of torture tests, intended for overnight or
# longer timeframes, and also for large systems.
#
# Usage: torture.sh [ options ]
#
# Copyright (C) 2020 Facebook, Inc.
#
# Authors: Paul E. McKenney <paulmck@kernel.org>
scriptname=$0
args="$*"
KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM
PATH=${KVM}/bin:$PATH; export PATH
. functions.sh
TORTURE_ALLOTED_CPUS="`identify_qemu_vcpus`"
MAKE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS*2))
HALF_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2))
if test "$HALF_ALLOTED_CPUS" -lt 1
then
HALF_ALLOTED_CPUS=1
fi
VERBOSE_BATCH_CPUS=$((TORTURE_ALLOTED_CPUS/16))
if test "$VERBOSE_BATCH_CPUS" -lt 2
then
VERBOSE_BATCH_CPUS=0
fi
# Configurations/scenarios.
configs_rcutorture=
configs_locktorture=
configs_scftorture=
kcsan_kmake_args=
# Default compression, duration, and apportionment.
compress_kasan_vmlinux="`identify_qemu_vcpus`"
duration_base=10
duration_rcutorture_frac=7
duration_locktorture_frac=1
duration_scftorture_frac=2
# "yes" or "no" parameters
do_allmodconfig=yes
do_rcutorture=yes
do_locktorture=yes
do_scftorture=yes
do_rcuscale=yes
do_refscale=yes
do_kvfree=yes
do_kasan=yes
do_kcsan=no
# doyesno - Helper function for yes/no arguments
function doyesno () {
if test "$1" = "$2"
then
echo yes
else
echo no
fi
}
usage () {
echo "Usage: $scriptname optional arguments:"
echo " --compress-kasan-vmlinux concurrency"
echo " --configs-rcutorture \"config-file list w/ repeat factor (3*TINY01)\""
echo " --configs-locktorture \"config-file list w/ repeat factor (10*LOCK01)\""
echo " --configs-scftorture \"config-file list w/ repeat factor (2*CFLIST)\""
echo " --doall"
echo " --doallmodconfig / --do-no-allmodconfig"
echo " --do-kasan / --do-no-kasan"
echo " --do-kcsan / --do-no-kcsan"
echo " --do-kvfree / --do-no-kvfree"
echo " --do-locktorture / --do-no-locktorture"
echo " --do-none"
echo " --do-rcuscale / --do-no-rcuscale"
echo " --do-rcutorture / --do-no-rcutorture"
echo " --do-refscale / --do-no-refscale"
echo " --do-scftorture / --do-no-scftorture"
echo " --duration [ <minutes> | <hours>h | <days>d ]"
echo " --kcsan-kmake-arg kernel-make-arguments"
exit 1
}
while test $# -gt 0
do
case "$1" in
--compress-kasan-vmlinux)
checkarg --compress-kasan-vmlinux "(concurrency level)" $# "$2" '^[0-9][0-9]*$' '^error'
compress_kasan_vmlinux=$2
shift
;;
--config-rcutorture|--configs-rcutorture)
checkarg --configs-rcutorture "(list of config files)" "$#" "$2" '^[^/]\+$' '^--'
configs_rcutorture="$configs_rcutorture $2"
shift
;;
--config-locktorture|--configs-locktorture)
checkarg --configs-locktorture "(list of config files)" "$#" "$2" '^[^/]\+$' '^--'
configs_locktorture="$configs_locktorture $2"
shift
;;
--config-scftorture|--configs-scftorture)
checkarg --configs-scftorture "(list of config files)" "$#" "$2" '^[^/]\+$' '^--'
configs_scftorture="$configs_scftorture $2"
shift
;;
--doall)
do_allmodconfig=yes
do_rcutorture=yes
do_locktorture=yes
do_scftorture=yes
do_rcuscale=yes
do_refscale=yes
do_kvfree=yes
do_kasan=yes
do_kcsan=yes
;;
--do-allmodconfig|--do-no-allmodconfig)
do_allmodconfig=`doyesno "$1" --do-allmodconfig`
;;
--do-kasan|--do-no-kasan)
do_kasan=`doyesno "$1" --do-kasan`
;;
--do-kcsan|--do-no-kcsan)
do_kcsan=`doyesno "$1" --do-kcsan`
;;
--do-kvfree|--do-no-kvfree)
do_kvfree=`doyesno "$1" --do-kvfree`
;;
--do-locktorture|--do-no-locktorture)
do_locktorture=`doyesno "$1" --do-locktorture`
;;
--do-none)
do_allmodconfig=no
do_rcutorture=no
do_locktorture=no
do_scftorture=no
do_rcuscale=no
do_refscale=no
do_kvfree=no
do_kasan=no
do_kcsan=no
;;
--do-rcuscale|--do-no-rcuscale)
do_rcuscale=`doyesno "$1" --do-rcuscale`
;;
--do-rcutorture|--do-no-rcutorture)
do_rcutorture=`doyesno "$1" --do-rcutorture`
;;
--do-refscale|--do-no-refscale)
do_refscale=`doyesno "$1" --do-refscale`
;;
--do-scftorture|--do-no-scftorture)
do_scftorture=`doyesno "$1" --do-scftorture`
;;
--duration)
checkarg --duration "(minutes)" $# "$2" '^[0-9][0-9]*\(m\|h\|d\|\)$' '^error'
mult=1
if echo "$2" | grep -q 'm$'
then
mult=1
elif echo "$2" | grep -q 'h$'
then
mult=60
elif echo "$2" | grep -q 'd$'
then
mult=1440
fi
ts=`echo $2 | sed -e 's/[smhd]$//'`
duration_base=$(($ts*mult))
shift
;;
--kcsan-kmake-arg|--kcsan-kmake-args)
checkarg --kcsan-kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
kcsan_kmake_args="`echo "$kcsan_kmake_args $2" | sed -e 's/^ *//' -e 's/ *$//'`"
shift
;;
*)
echo Unknown argument $1
usage
;;
esac
shift
done
ds="`date +%Y.%m.%d-%H.%M.%S`-torture"
startdate="`date`"
starttime="`get_starttime`"
T=/tmp/torture.sh.$$
trap 'rm -rf $T' 0 2
mkdir $T
echo " --- " $scriptname $args | tee -a $T/log
echo " --- Results directory: " $ds | tee -a $T/log
# Calculate rcutorture defaults and apportion time
if test -z "$configs_rcutorture"
then
configs_rcutorture=CFLIST
fi
duration_rcutorture=$((duration_base*duration_rcutorture_frac/10))
if test "$duration_rcutorture" -eq 0
then
echo " --- Zero time for rcutorture, disabling" | tee -a $T/log
do_rcutorture=no
fi
# Calculate locktorture defaults and apportion time
if test -z "$configs_locktorture"
then
configs_locktorture=CFLIST
fi
duration_locktorture=$((duration_base*duration_locktorture_frac/10))
if test "$duration_locktorture" -eq 0
then
echo " --- Zero time for locktorture, disabling" | tee -a $T/log
do_locktorture=no
fi
# Calculate scftorture defaults and apportion time
if test -z "$configs_scftorture"
then
configs_scftorture=CFLIST
fi
duration_scftorture=$((duration_base*duration_scftorture_frac/10))
if test "$duration_scftorture" -eq 0
then
echo " --- Zero time for scftorture, disabling" | tee -a $T/log
do_scftorture=no
fi
touch $T/failures
touch $T/successes
# torture_one - Does a single kvm.sh run.
#
# Usage:
# torture_bootargs="[ kernel boot arguments ]"
# torture_one flavor [ kvm.sh arguments ]
#
# Note that "flavor" is an arbitrary string. Supply --torture if needed.
# Note that quoting is problematic. So on the command line, pass multiple
# values with multiple kvm.sh argument instances.
function torture_one {
local cur_bootargs=
local boottag=
echo " --- $curflavor:" Start `date` | tee -a $T/log
if test -n "$torture_bootargs"
then
boottag="--bootargs"
cur_bootargs="$torture_bootargs"
fi
"$@" $boottag "$cur_bootargs" --datestamp "$ds/results-$curflavor" > $T/$curflavor.out 2>&1
retcode=$?
resdir="`grep '^Results directory: ' $T/$curflavor.out | tail -1 | sed -e 's/^Results directory: //'`"
if test -z "$resdir"
then
cat $T/$curflavor.out | tee -a $T/log
echo retcode=$retcode | tee -a $T/log
fi
if test "$retcode" == 0
then
echo "$curflavor($retcode)" $resdir >> $T/successes
else
echo "$curflavor($retcode)" $resdir >> $T/failures
fi
}
# torture_set - Does a set of tortures with and without KASAN and KCSAN.
#
# Usage:
# torture_bootargs="[ kernel boot arguments ]"
# torture_set flavor [ kvm.sh arguments ]
#
# Note that "flavor" is an arbitrary string. Supply --torture if needed.
# Note that quoting is problematic. So on the command line, pass multiple
# values with multiple kvm.sh argument instances.
function torture_set {
local cur_kcsan_kmake_args=
local kcsan_kmake_tag=
local flavor=$1
shift
curflavor=$flavor
torture_one "$@"
if test "$do_kasan" = "yes"
then
curflavor=${flavor}-kasan
torture_one "$@" --kasan
fi
if test "$do_kcsan" = "yes"
then
curflavor=${flavor}-kcsan
if test -n "$kcsan_kmake_args"
then
kcsan_kmake_tag="--kmake-args"
cur_kcsan_kmake_args="$kcsan_kmake_args"
fi
torture_one $* --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" $kcsan_kmake_tag $cur_kcsan_kmake_args --kcsan
fi
}
# make allmodconfig
if test "$do_allmodconfig" = "yes"
then
echo " --- allmodconfig:" Start `date` | tee -a $T/log
amcdir="tools/testing/selftests/rcutorture/res/$ds/allmodconfig"
mkdir -p "$amcdir"
echo " --- make clean" > "$amcdir/Make.out" 2>&1
make -j$MAKE_ALLOTED_CPUS clean >> "$amcdir/Make.out" 2>&1
echo " --- make allmodconfig" >> "$amcdir/Make.out" 2>&1
make -j$MAKE_ALLOTED_CPUS allmodconfig >> "$amcdir/Make.out" 2>&1
echo " --- make " >> "$amcdir/Make.out" 2>&1
make -j$MAKE_ALLOTED_CPUS >> "$amcdir/Make.out" 2>&1
retcode="$?"
echo $retcode > "$amcdir/Make.exitcode"
if test "$retcode" == 0
then
echo "allmodconfig($retcode)" $amcdir >> $T/successes
else
echo "allmodconfig($retcode)" $amcdir >> $T/failures
fi
fi
# --torture rcu
if test "$do_rcutorture" = "yes"
then
torture_bootargs="rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot rcupdate.rcu_task_stall_timeout=30000"
torture_set "rcutorture" tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration "$duration_rcutorture" --configs "$configs_rcutorture" --trust-make
fi
if test "$do_locktorture" = "yes"
then
torture_bootargs="torture.disable_onoff_at_boot"
torture_set "locktorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture lock --allcpus --duration "$duration_locktorture" --configs "$configs_locktorture" --trust-make
fi
if test "$do_scftorture" = "yes"
then
torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot"
torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
fi
if test "$do_refscale" = yes
then
primlist="`grep '\.name[ ]*=' kernel/rcu/refscale.c | sed -e 's/^[^"]*"//' -e 's/".*$//'`"
else
primlist=
fi
for prim in $primlist
do
torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
done
if test "$do_rcuscale" = yes
then
primlist="`grep '\.name[ ]*=' kernel/rcu/rcuscale.c | sed -e 's/^[^"]*"//' -e 's/".*$//'`"
else
primlist=
fi
for prim in $primlist
do
torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
done
if test "$do_kvfree" = "yes"
then
torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
fi
echo " --- " $scriptname $args
echo " --- " Done `date` | tee -a $T/log
ret=0
nsuccesses=0
echo SUCCESSES: | tee -a $T/log
if test -s "$T/successes"
then
cat "$T/successes" | tee -a $T/log
nsuccesses="`wc -l "$T/successes" | awk '{ print $1 }'`"
fi
nfailures=0
echo FAILURES: | tee -a $T/log
if test -s "$T/failures"
then
cat "$T/failures" | tee -a $T/log
nfailures="`wc -l "$T/failures" | awk '{ print $1 }'`"
ret=2
fi
echo Started at $startdate, ended at `date`, duration `get_starttime_duration $starttime`. | tee -a $T/log
echo Summary: Successes: $nsuccesses Failures: $nfailures. | tee -a $T/log
tdir="`cat $T/successes $T/failures | head -1 | awk '{ print $NF }' | sed -e 's,/[^/]\+/*$,,'`"
if test -n "$tdir" && test $compress_kasan_vmlinux -gt 0
then
# KASAN vmlinux files can approach 1GB in size, so compress them.
echo Looking for KASAN files to compress: `date` > "$tdir/log-xz" 2>&1
find "$tdir" -type d -name '*-kasan' -print > $T/xz-todo
ncompresses=0
batchno=1
if test -s $T/xz-todo
then
echo Size before compressing: `du -sh $tdir | awk '{ print $1 }'` `date` 2>&1 | tee -a "$tdir/log-xz" | tee -a $T/log
for i in `cat $T/xz-todo`
do
echo Compressing vmlinux files in ${i}: `date` >> "$tdir/log-xz" 2>&1
for j in $i/*/vmlinux
do
xz "$j" >> "$tdir/log-xz" 2>&1 &
ncompresses=$((ncompresses+1))
if test $ncompresses -ge $compress_kasan_vmlinux
then
echo Waiting for batch $batchno of $ncompresses compressions `date` | tee -a "$tdir/log-xz" | tee -a $T/log
wait
ncompresses=0
batchno=$((batchno+1))
fi
done
done
if test $ncompresses -gt 0
then
echo Waiting for final batch $batchno of $ncompresses compressions `date` | tee -a "$tdir/log-xz" | tee -a $T/log
fi
wait
echo Size after compressing: `du -sh $tdir | awk '{ print $1 }'` `date` 2>&1 | tee -a "$tdir/log-xz" | tee -a $T/log
echo Total duration `get_starttime_duration $starttime`. | tee -a $T/log
else
echo No compression needed: `date` >> "$tdir/log-xz" 2>&1
fi
fi
if test -n "$tdir"
then
cp $T/log "$tdir"
fi
exit $ret

View File

@ -1 +1,2 @@
rcutorture.torture_type=tasks-rude
rcutree.use_softirq=0

View File

@ -1 +1,2 @@
rcutorture.torture_type=tasks
rcutree.use_softirq=0

View File

@ -2,5 +2,7 @@ maxcpus=8 nr_cpus=43
rcutree.gp_preinit_delay=3
rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3
rcu_nocbs=0
rcu_nocbs=0-1,3-7
rcutorture.nocbs_nthreads=8
rcutorture.nocbs_toggle=1000
rcutorture.fwd_progress=0