From a5052657c164107032d521f0d9e92703d78845f2 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 12 Apr 2016 08:52:49 -0700 Subject: [PATCH 01/17] locking/Documentation: Clarify relationship of barrier() to control dependencies The current documentation claims that the compiler ignores barrier(), which is not the case. Instead, the compiler carefully pays attention to barrier(), but in a creative way that still manages to destroy the control dependency. This commit sets the story straight. Reported-by: Mathieu Desnoyers Signed-off-by: Paul E. McKenney Cc: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: bobby.prani@gmail.com Cc: dhowells@redhat.com Cc: dipankar@in.ibm.com Cc: dvhart@linux.intel.com Cc: edumazet@google.com Cc: fweisbec@gmail.com Cc: jiangshanlai@gmail.com Cc: josh@joshtriplett.org Cc: oleg@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1460476375-27803-1-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 3729cbe60e41..ec1289042396 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -813,9 +813,10 @@ In summary: the same variable, then those stores must be ordered, either by preceding both of them with smp_mb() or by using smp_store_release() to carry out the stores. Please note that it is -not- sufficient - to use barrier() at beginning of each leg of the "if" statement, - as optimizing compilers do not necessarily respect barrier() - in this case. + to use barrier() at beginning of each leg of the "if" statement + because, as shown by the example above, optimizing compilers can + destroy the control dependency while respecting the letter of the + barrier() law. (*) Control dependencies require at least one run-time conditional between the prior load and the subsequent store, and this
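To see the failure mode this patch documents, recall the example in memory-barriers.txt that the new wording refers to. Paraphrased (a, b, q and the do_something() helpers are the document's illustrative names, not kernel API), the programmer writes:

	q = READ_ONCE(a);
	if (q) {
		barrier();
		WRITE_ONCE(b, 1);
		do_something();
	} else {
		barrier();
		WRITE_ONCE(b, 1);
		do_something_else();
	}

Because the store to b is identical in both legs, an optimizing compiler may merge the two stores and hoist them, together with the barrier(), above the conditional:

	q = READ_ONCE(a);
	barrier();
	WRITE_ONCE(b, 1);	/* BUG: no ordering vs. the load from a! */
	if (q)
		do_something();
	else
		do_something_else();

No access has moved across a barrier(), so the letter of the barrier() law is respected, yet the conditional branch, and with it the control dependency, is gone: the CPU is now free to reorder the store to b before the load from a. Hence the recommendation above to use smp_mb() or smp_store_release() instead.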
From 166bda7122c8e817f039bf738cf05ab3b7278732 Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Tue, 12 Apr 2016 08:52:50 -0700 Subject: [PATCH 02/17] locking/Documentation: Fix missed s/lock/acquire renames The terms 'lock'/'unlock' were changed to 'acquire'/'release' by the following commit: 2e4f5382d12a4 ("locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE") However, the commit failed to update the table of contents - fix that. Also, the dumb rename changed the section name 'Locking functions' to an actively misleading 'Acquiring functions'. Rename it to 'Lock acquisition functions' instead. Suggested-by: David Howells Signed-off-by: SeongJae Park Signed-off-by: Paul E. McKenney Cc: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: bobby.prani@gmail.com Cc: dipankar@in.ibm.com Cc: dvhart@linux.intel.com Cc: edumazet@google.com Cc: fweisbec@gmail.com Cc: jiangshanlai@gmail.com Cc: josh@joshtriplett.org Cc: mathieu.desnoyers@efficios.com Cc: oleg@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1460476375-27803-2-git-send-email-paulmck@linux.vnet.ibm.com [ Rewrote the changelog. ] Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index ec1289042396..38b1ce161afb 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -31,15 +31,15 @@ Contents: (*) Implicit kernel memory barriers. - - Locking functions. + - Lock acquisition functions. - Interrupt disabling functions. - Sleep and wake-up functions. - Miscellaneous functions. - (*) Inter-CPU locking barrier effects. + (*) Inter-CPU acquiring barrier effects. - - Locks vs memory accesses. - - Locks vs I/O accesses. + - Acquires vs memory accesses. + - Acquires vs I/O accesses. (*) Where are memory barriers needed? @@ -1859,7 +1859,7 @@ This is a variation on the mandatory write barrier that causes writes to weakly ordered I/O regions to be partially ordered. Its effects may go beyond the CPU->Hardware interface and actually affect the hardware at some level. -See the subsection "Locks vs I/O accesses" for more information. +See the subsection "Acquires vs I/O accesses" for more information. =============================== @@ -1874,8 +1874,8 @@ provide more substantial guarantees, but these may not be relied upon outside of arch specific code. -ACQUIRING FUNCTIONS ------------------- +LOCK ACQUISITION FUNCTIONS +-------------------------- The Linux kernel has a number of locking constructs: From 01e1cd6de8e75fa28c268b4dc566bc1a39486e71 Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Tue, 12 Apr 2016 08:52:51 -0700 Subject: [PATCH 03/17] locking/Documentation: Add missed subsection in TOC A 'Virtual Machine Guests' subsection was added by this commit: 6a65d26385bf487 ("asm-generic: implement virt_xxx memory barriers") but the TOC was not updated - update it. Signed-off-by: SeongJae Park Signed-off-by: Paul E. McKenney Acked-by: David Howells Cc: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: bobby.prani@gmail.com Cc: dipankar@in.ibm.com Cc: dvhart@linux.intel.com Cc: edumazet@google.com Cc: fweisbec@gmail.com Cc: jiangshanlai@gmail.com Cc: josh@joshtriplett.org Cc: mathieu.desnoyers@efficios.com Cc: oleg@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1460476375-27803-3-git-send-email-paulmck@linux.vnet.ibm.com [ Rewrote the changelog. ] Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 38b1ce161afb..718ef2564fa0 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -61,6 +61,7 @@ Contents: (*) The things CPUs get up to. - And then there's the Alpha. + - Virtual Machine Guests. (*) Example uses. From 3dbf0913f6cac722805a94f16b1e61ffc3483eaf Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Tue, 12 Apr 2016 08:52:52 -0700 Subject: [PATCH 04/17] locking/Documentation: Fix formatting inconsistencies Signed-off-by: SeongJae Park Signed-off-by: Paul E.
McKenney Acked-by: David Howells Cc: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: bobby.prani@gmail.com Cc: dipankar@in.ibm.com Cc: dvhart@linux.intel.com Cc: edumazet@google.com Cc: fweisbec@gmail.com Cc: jiangshanlai@gmail.com Cc: josh@joshtriplett.org Cc: mathieu.desnoyers@efficios.com Cc: oleg@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1460476375-27803-4-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 718ef2564fa0..1f1541862239 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -149,7 +149,7 @@ As a further example, consider this sequence of events: CPU 1 CPU 2 =============== =============== - { A == 1, B == 2, C = 3, P == &A, Q == &C } + { A == 1, B == 2, C == 3, P == &A, Q == &C } B = 4; Q = P; P = &B D = *Q; @@ -518,7 +518,7 @@ following sequence of events: CPU 1 CPU 2 =============== =============== - { A == 1, B == 2, C = 3, P == &A, Q == &C } + { A == 1, B == 2, C == 3, P == &A, Q == &C } B = 4; WRITE_ONCE(P, &B) @@ -545,7 +545,7 @@ between the address load and the data load: CPU 1 CPU 2 =============== =============== - { A == 1, B == 2, C = 3, P == &A, Q == &C } + { A == 1, B == 2, C == 3, P == &A, Q == &C } B = 4; WRITE_ONCE(P, &B); @@ -3043,7 +3043,7 @@ The Alpha defines the Linux kernel's memory barrier model. See the subsection on "Cache Coherency" above. VIRTUAL MACHINE GUESTS -------------------- +---------------------- Guests running within virtual machines might be affected by SMP effects even if the guest itself is compiled without SMP support. This is an artifact of From 0b6fa347dc08c6f757a35f3a180269b3ffc4cd28 Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Tue, 12 Apr 2016 08:52:53 -0700 Subject: [PATCH 05/17] locking/Documentation: Insert white spaces consistently The document uses two newlines between sections, one newline between an item and its detailed description, and two spaces between sentences. There are a few places that used these rules inconsistently - fix them. Signed-off-by: SeongJae Park Signed-off-by: Paul E. McKenney Acked-by: David Howells Cc: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: bobby.prani@gmail.com Cc: dipankar@in.ibm.com Cc: dvhart@linux.intel.com Cc: edumazet@google.com Cc: fweisbec@gmail.com Cc: jiangshanlai@gmail.com Cc: josh@joshtriplett.org Cc: mathieu.desnoyers@efficios.com Cc: oleg@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1460476375-27803-5-git-send-email-paulmck@linux.vnet.ibm.com [ Fixed the changelog. ] Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 43 +++++++++++++++++-------------- 1 file changed, 23 insertions(+), 20 deletions(-) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 1f1541862239..7133626a61d0 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -1733,15 +1733,15 @@ The Linux kernel has eight basic CPU memory barriers: All memory barriers except the data dependency barriers imply a compiler -barrier. Data dependencies do not impose any additional compiler ordering. +barrier. Data dependencies do not impose any additional compiler ordering. Aside: In the case of data dependencies, the compiler would be expected to issue the loads in the correct order (eg.
`a[b]` would have to load the value of b before loading a[b]), however there is no guarantee in the C specification that the compiler may not speculate the value of b (eg. is equal to 1) and load a before b (eg. tmp = a[1]; if (b != 1) -tmp = a[b]; ). There is also the problem of a compiler reloading b after -having loaded a[b], thus having a newer copy of b than a[b]. A consensus +tmp = a[b]; ). There is also the problem of a compiler reloading b after +having loaded a[b], thus having a newer copy of b than a[b]. A consensus has not yet been reached about these problems, however the READ_ONCE() macro is a good place to start looking. @@ -1796,6 +1796,7 @@ There are some more advanced barrier functions: (*) lockless_dereference(); + This can be thought of as a pointer-fetch wrapper around the smp_read_barrier_depends() data-dependency barrier. @@ -1897,7 +1898,7 @@ for each construct. These operations all imply certain barriers: Memory operations issued before the ACQUIRE may be completed after the ACQUIRE operation has completed. An smp_mb__before_spinlock(), combined with a following ACQUIRE, orders prior stores against - subsequent loads and stores. Note that this is weaker than smp_mb()! + subsequent loads and stores. Note that this is weaker than smp_mb()! The smp_mb__before_spinlock() primitive is free on many architectures. (2) RELEASE operation implication: @@ -2092,9 +2093,9 @@ or: event_indicated = 1; wake_up_process(event_daemon); -A write memory barrier is implied by wake_up() and co. if and only if they wake -something up. The barrier occurs before the task state is cleared, and so sits -between the STORE to indicate the event and the STORE to set TASK_RUNNING: +A write memory barrier is implied by wake_up() and co. if and only if they +wake something up. The barrier occurs before the task state is cleared, and so +sits between the STORE to indicate the event and the STORE to set TASK_RUNNING: CPU 1 CPU 2 =============================== =============================== @@ -2208,7 +2209,7 @@ three CPUs; then should the following sequence of events occur: Then there is no guarantee as to what order CPU 3 will see the accesses to *A through *H occur in, other than the constraints imposed by the separate locks -on the separate CPUs. It might, for example, see: +on the separate CPUs. It might, for example, see: *E, ACQUIRE M, ACQUIRE Q, *G, *C, *F, *A, *B, RELEASE Q, *D, *H, RELEASE M @@ -2488,9 +2489,9 @@ The following operations are special locking primitives: clear_bit_unlock(); __clear_bit_unlock(); -These implement ACQUIRE-class and RELEASE-class operations. These should be used in -preference to other operations when implementing locking primitives, because -their implementations can be optimised on many architectures. +These implement ACQUIRE-class and RELEASE-class operations. These should be +used in preference to other operations when implementing locking primitives, +because their implementations can be optimised on many architectures. [!] Note that special memory barrier primitives are available for these situations because on some CPUs the atomic instructions used imply full memory @@ -2570,12 +2571,12 @@ explicit barriers are used. Normally this won't be a problem because the I/O accesses done inside such sections will include synchronous load operations on strictly ordered I/O -registers that form implicit I/O barriers. If this isn't sufficient then an +registers that form implicit I/O barriers. If this isn't sufficient then an mmiowb() may need to be used explicitly. 
A similar situation may occur between an interrupt routine and two routines -running on separate CPUs that communicate with each other. If such a case is +running on separate CPUs that communicate with each other. If such a case is likely, then interrupt-disabling locks should be used to guarantee ordering. @@ -2589,8 +2590,8 @@ functions: (*) inX(), outX(): These are intended to talk to I/O space rather than memory space, but - that's primarily a CPU-specific concept. The i386 and x86_64 processors do - indeed have special I/O space access cycles and instructions, but many + that's primarily a CPU-specific concept. The i386 and x86_64 processors + do indeed have special I/O space access cycles and instructions, but many CPUs don't have such a concept. The PCI bus, amongst others, defines an I/O space concept which - on such @@ -2612,7 +2613,7 @@ functions: Whether these are guaranteed to be fully ordered and uncombined with respect to each other on the issuing CPU depends on the characteristics - defined for the memory window through which they're accessing. On later + defined for the memory window through which they're accessing. On later i386 architecture machines, for example, this is controlled by way of the MTRR registers. @@ -2637,10 +2638,10 @@ functions: (*) readX_relaxed(), writeX_relaxed() These are similar to readX() and writeX(), but provide weaker memory - ordering guarantees. Specifically, they do not guarantee ordering with + ordering guarantees. Specifically, they do not guarantee ordering with respect to normal memory accesses (e.g. DMA buffers) nor do they guarantee - ordering with respect to LOCK or UNLOCK operations. If the latter is - required, an mmiowb() barrier can be used. Note that relaxed accesses to + ordering with respect to LOCK or UNLOCK operations. If the latter is + required, an mmiowb() barrier can be used. Note that relaxed accesses to the same peripheral are guaranteed to be ordered with respect to each other. @@ -3042,6 +3043,7 @@ The Alpha defines the Linux kernel's memory barrier model. See the subsection on "Cache Coherency" above. + VIRTUAL MACHINE GUESTS ---------------------- @@ -3052,7 +3054,7 @@ barriers for this use-case would be possible but is often suboptimal. To handle this case optimally, low-level virt_mb() etc macros are available. These have the same effect as smp_mb() etc when SMP is enabled, but generate -identical code for SMP and non-SMP systems. For example, virtual machine guests +identical code for SMP and non-SMP systems. For example, virtual machine guests should use virt_mb() rather than smp_mb() when synchronizing against a (possibly SMP) host. @@ -3060,6 +3062,7 @@ These are equivalent to smp_mb() etc counterparts in all other respects, in particular, they do not control MMIO effects: to control MMIO effects, use mandatory barriers. + ============ EXAMPLE USES ============ From 787df6383caa1338a4f6640d71917bc2d8c068b1 Mon Sep 17 00:00:00 2001 From: Davidlohr Bueso Date: Tue, 12 Apr 2016 08:52:55 -0700 Subject: [PATCH 06/17] locking/Documentation: Mention smp_cond_acquire() ... do this next to smp_load_acquire() when first mentioning ACQUIRE. While this call is briefly explained and control dependencies are mentioned later, it does not hurt the reader. Signed-off-by: Davidlohr Bueso Signed-off-by: Paul E. 
McKenney Cc: Andrew Morton Cc: Davidlohr Bueso Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: bobby.prani@gmail.com Cc: dhowells@redhat.com Cc: dipankar@in.ibm.com Cc: dvhart@linux.intel.com Cc: edumazet@google.com Cc: fweisbec@gmail.com Cc: jiangshanlai@gmail.com Cc: josh@joshtriplett.org Cc: mathieu.desnoyers@efficios.com Cc: oleg@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1460476375-27803-7-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 7133626a61d0..a9454b1c73bd 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -431,8 +431,9 @@ And a couple of implicit varieties: This acts as a one-way permeable barrier. It guarantees that all memory operations after the ACQUIRE operation will appear to happen after the ACQUIRE operation with respect to the other components of the system. - ACQUIRE operations include LOCK operations and smp_load_acquire() - operations. + ACQUIRE operations include LOCK operations and both smp_load_acquire() + and smp_cond_acquire() operations. The latter builds the necessary ACQUIRE + semantics by relying on a control dependency and smp_rmb(). Memory operations that occur before an ACQUIRE operation may appear to happen after it completes.
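For reference, smp_cond_acquire() was defined in <linux/compiler.h> at the time roughly as follows (a from-memory sketch; consult the tree you are reading for the authoritative definition):

	#define smp_cond_acquire(cond)	do {		\
		while (!(cond))				\
			cpu_relax();			\
		smp_rmb(); /* ctrl + rmb := acquire */	\
	} while (0)

The spin loop provides the control dependency, which orders the loads of cond against all subsequent stores; the trailing smp_rmb() orders them against subsequent loads as well, and the combination yields the ACQUIRE semantics that the new documentation text describes.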
From 1f190931893a98ffd5d4cfdfbfc2452ad0ed3e1b Mon Sep 17 00:00:00 2001 From: Davidlohr Bueso Date: Tue, 12 Apr 2016 08:47:17 -0700 Subject: [PATCH 07/17] locking/locktorture: Fix deboosting NULL pointer dereference For the case of rtmutex torturing we will randomly call into the boost() handler, including upon module exiting when the tasks are deboosted before stopping. In such cases the task may or may not have already been boosted, and therefore the NULL being explicitly passed can occur anywhere. Currently we simply assume that the task is at a higher prio, and in consequence dereference a NULL pointer. This patch fixes the case of a rmmod locktorture exploding while pounding on the rtmutex lock (partial trace): task: ffff88081026cf80 ti: ffff880816120000 task.ti: ffff880816120000 RIP: 0010:[] [] torture_random+0x5/0x60 [torture] RSP: 0018:ffff880816123eb0 EFLAGS: 00010206 RAX: ffff88081026cf80 RBX: ffff880816bfa630 RCX: 0000000000160d1b RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000 RBP: ffff88081026cf80 R08: 000000000000001f R09: ffff88017c20ca80 R10: 0000000000000000 R11: 000000000048c316 R12: ffffffffa05d1840 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88203f880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 0000000001c0a000 CR4: 00000000000406e0 Stack: ffffffffa05d141d ffff880816bfa630 ffffffffa05d1922 ffff88081e70c2c0 ffff880816bfa630 ffffffff81095fed 0000000000000000 ffffffff8107bf60 ffff880816bfa630 ffffffff00000000 ffff880800000000 ffff880816123f08 Call Trace: [] torture_rtmutex_boost+0x1d/0x90 [locktorture] [] lock_torture_writer+0xe2/0x170 [locktorture] [] kthread+0xbd/0xe0 [] ret_from_fork+0x3f/0x70 This patch ensures that if the random state pointer is NULL and current is not boosted, we do nothing. Signed-off-by: Davidlohr Bueso Signed-off-by: Paul E. McKenney Cc: Andrew Morton Cc: Davidlohr Bueso Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: bobby.prani@gmail.com Cc: dhowells@redhat.com Cc: dipankar@in.ibm.com Cc: dvhart@linux.intel.com Cc: edumazet@google.com Cc: fweisbec@gmail.com Cc: jiangshanlai@gmail.com Cc: josh@joshtriplett.org Cc: mathieu.desnoyers@efficios.com Cc: oleg@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1460476038-27060-1-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- kernel/locking/locktorture.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index 8ef1919d63b2..9e9c5f454f5c 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -394,12 +394,12 @@ static void torture_rtmutex_boost(struct torture_random_state *trsp) if (!rt_task(current)) { /* - * (1) Boost priority once every ~50k operations. When the + * Boost priority once every ~50k operations. When the * task tries to take the lock, the rtmutex it will account * for the new priority, and do any corresponding pi-dance. */ - if (!(torture_random(trsp) % - (cxt.nrealwriters_stress * factor))) { + if (trsp && !(torture_random(trsp) % + (cxt.nrealwriters_stress * factor))) { policy = SCHED_FIFO; param.sched_priority = MAX_RT_PRIO - 1; } else /* common case, do nothing */ From c1c33b92db4fb274dfbff778ccf2459e4bebd48e Mon Sep 17 00:00:00 2001 From: Davidlohr Bueso Date: Tue, 12 Apr 2016 08:47:18 -0700 Subject: [PATCH 08/17] locking/locktorture: Fix NULL pointer dereference for cleanup paths It has been found that paths that invoke cleanups through lock_torture_cleanup() can trigger NULL pointer dereferencing bugs during the statistics printing phase. This is mainly because we should not be calling into the statistics code before we are sure things have been set up correctly. Specifically, early checks (and the need for handling this in the cleanup call) only include parameter checks and basic statistics allocation. Once we start write/read kthreads we then consider the test as started. As such, update the function in question to check for the cxt.lwsa writer stats; if they are not set, we either have a bogus parameter or an -ENOMEM situation, and therefore only need to deal with general torture calls. Reported-and-tested-by: Kefeng Wang Signed-off-by: Davidlohr Bueso Signed-off-by: Paul E. McKenney Cc: Andrew Morton Cc: Davidlohr Bueso Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: bobby.prani@gmail.com Cc: dhowells@redhat.com Cc: dipankar@in.ibm.com Cc: dvhart@linux.intel.com Cc: edumazet@google.com Cc: fweisbec@gmail.com Cc: jiangshanlai@gmail.com Cc: josh@joshtriplett.org Cc: mathieu.desnoyers@efficios.com Cc: oleg@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1460476038-27060-2-git-send-email-paulmck@linux.vnet.ibm.com [ Improved the changelog. ] Signed-off-by: Ingo Molnar --- kernel/locking/locktorture.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index 9e9c5f454f5c..d066a50dc87e 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -748,6 +748,15 @@ static void lock_torture_cleanup(void) if (torture_cleanup_begin()) return; + /* + * Indicates early cleanup, meaning that the test has not run, + * such as when passing bogus args when loading the module. As + * such, only perform the underlying torture-specific cleanups, + * and avoid anything related to locktorture.
+ */ + if (!cxt.lwsa) + goto end; + if (writer_tasks) { for (i = 0; i < cxt.nrealwriters_stress; i++) torture_stop_kthread(lock_torture_writer, @@ -776,6 +785,7 @@ static void lock_torture_cleanup(void) else lock_torture_print_module_parms(cxt.cur_ops, "End of test: SUCCESS"); +end: torture_cleanup_end(); } @@ -870,6 +880,7 @@ static int __init lock_torture_init(void) VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory"); firsterr = -ENOMEM; kfree(cxt.lwsa); + cxt.lwsa = NULL; goto unwind; } @@ -878,6 +889,7 @@ static int __init lock_torture_init(void) cxt.lrsa[i].n_lock_acquired = 0; } } + lock_torture_print_module_parms(cxt.cur_ops, "Start of test"); /* Prepare torture context. */ From c003ed928962a55eb446e78c544b1d7c4f6cb88a Mon Sep 17 00:00:00 2001 From: Denys Vlasenko Date: Fri, 8 Apr 2016 20:58:46 +0200 Subject: [PATCH 09/17] locking/lockdep: Deinline register_lock_class(), save 2328 bytes This function compiles to 1328 bytes of machine code. Three callsites. Registering a new lock class is definitely not *that* time-critical to inline it. Signed-off-by: Denys Vlasenko Cc: Andrew Morton Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1460141926-13069-5-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar --- kernel/locking/lockdep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index ed9410936a22..7cc43ef856c1 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -708,7 +708,7 @@ look_up_lock_class(struct lockdep_map *lock, unsigned int subclass) * yet. Otherwise we look it up. We cache the result in the lock object * itself, so actual lookup of the hash should be once per lock object. */ -static inline struct lock_class * +static struct lock_class * register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force) { struct lockdep_subclass_key *key; From 1d4093d3b3a70b947822cca76d6e4132767ce089 Mon Sep 17 00:00:00 2001 From: Eric Engestrom Date: Mon, 25 Apr 2016 07:36:54 +0100 Subject: [PATCH 10/17] locking/Documentation/lockdep: Fix spelling mistakes Signed-off-by: Eric Engestrom Signed-off-by: Peter Zijlstra (Intel) Cc: Jonathan Corbet Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1461566229-4717-2-git-send-email-eric@engestrom.ch Signed-off-by: Ingo Molnar --- Documentation/locking/lockdep-design.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.txt index 5001280e9d82..9de1c158d44c 100644 --- a/Documentation/locking/lockdep-design.txt +++ b/Documentation/locking/lockdep-design.txt @@ -97,7 +97,7 @@ between any two lock-classes: -> -> -The first rule comes from the fact the a hardirq-safe lock could be +The first rule comes from the fact that a hardirq-safe lock could be taken by a hardirq context, interrupting a hardirq-unsafe lock - and thus could result in a lock inversion deadlock. Likewise, a softirq-safe lock could be taken by an softirq context, interrupting a softirq-unsafe @@ -220,7 +220,7 @@ calculated, which hash is unique for every lock chain. The hash value, when the chain is validated for the first time, is then put into a hash table, which hash-table can be checked in a lockfree manner. If the locking chain occurs again later on, the hash table tells us that we -dont have to validate the chain again. 
+don't have to validate the chain again. Troubleshooting: ---------------- From e7720af5f9ac914577e2b810d5c004cdf395fd82 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Tue, 26 Apr 2016 10:22:05 -0700 Subject: [PATCH 11/17] locking/Documentation: Add disclaimer It appears people are reading this document as a requirements list for building hardware. This is not the intent of this document. Nor is it particularly suited for this purpose. The primary purpose of this document is our collective attempt to define a set of primitives that (hopefully) allow us to write correct code on the myriad of SMP platforms Linux supports. It's a definite work in progress as our understanding of these platforms, and memory ordering in general, progresses. Nor does being mentioned in this document mean we think it's a particularly good idea; the data dependency barrier required by Alpha being a prime example. Yes we have it, no you're insane to require it when building new hardware. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Paul E. McKenney Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: corbet@lwn.net Cc: dave@stgolabs.net Cc: dhowells@redhat.com Cc: linux-doc@vger.kernel.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1461691328-5429-1-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index a9454b1c73bd..fb2dd35a823a 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -4,8 +4,24 @@ By: David Howells Paul E. McKenney + Will Deacon + Peter Zijlstra -Contents: +========== +DISCLAIMER +========== + +This document is not a specification; it is intentionally (for the sake of +brevity) and unintentionally (due to being human) incomplete. This document is +meant as a guide to using the various memory barriers provided by Linux, but +in case of any doubt (and there are many) please ask. + +To repeat, this document is not a specification of what Linux expects from +hardware. + +======== +CONTENTS +======== (*) Abstract memory access model. From 8d4840e84871847ee1bae56a776907d08a9265f7 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 26 Apr 2016 10:22:06 -0700 Subject: [PATCH 12/17] locking/Documentation: State purpose of memory-barriers.txt There has been some confusion about the purpose of memory-barriers.txt, so this commit adds a statement of purpose. Signed-off-by: David Howells Signed-off-by: Paul E. McKenney Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: corbet@lwn.net Cc: dave@stgolabs.net Cc: linux-doc@vger.kernel.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1461691328-5429-2-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index fb2dd35a823a..8b11e54238bf 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -19,6 +19,22 @@ in case of any doubt (and there are many) please ask. To repeat, this document is not a specification of what Linux expects from hardware.
+The purpose of this document is twofold: + + (1) to specify the minimum functionality that one can rely on for any + particular barrier, and + + (2) to provide a guide as to how to use the barriers that are available. + +Note that an architecture can provide more than the minimum requirement +for any particular barrier, but if the architecture provides less than +that, that architecture is incorrect. + +Note also that it is possible that a barrier may be a no-op for an +architecture because the way that arch works renders an explicit barrier +unnecessary in that case. + + ======== CONTENTS ======== From 3cfe2e8bc1cf74d78df6fe5ca3a1e1805472a004 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Tue, 26 Apr 2016 10:22:07 -0700 Subject: [PATCH 13/17] locking/Documentation: Clarify that ACQUIRE applies to loads, RELEASE applies to stores For compound atomics performing both a load and a store operation, make it clear that the _acquire and _release variants refer only to the load and store portions of the compound atomic, respectively. For example, xchg_acquire is an xchg operation where the load takes on ACQUIRE semantics. Signed-off-by: Will Deacon Signed-off-by: Paul E. McKenney Acked-by: Peter Zijlstra (Intel) Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: corbet@lwn.net Cc: dave@stgolabs.net Cc: dhowells@redhat.com Cc: linux-doc@vger.kernel.org Link: http://lkml.kernel.org/r/1461691328-5429-3-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 8b11e54238bf..147ae8ec836f 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -498,6 +498,11 @@ And a couple of implicit varieties: This means that ACQUIRE acts as a minimal "acquire" operation and RELEASE acts as a minimal "release" operation. +A subset of the atomic operations described in atomic_ops.txt have ACQUIRE +and RELEASE variants in addition to fully-ordered and relaxed (no barrier +semantics) definitions. For compound atomics performing both a load and a +store, ACQUIRE semantics apply only to the load and RELEASE semantics apply +only to the store portion of the operation. Memory barriers are only required where there's a possibility of interaction between two CPUs or between a CPU and a device. If it can be guaranteed that
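To make the load/store distinction concrete, here is a minimal test-and-set lock sketched on top of these primitives (the tas_lock type and helpers are hypothetical, invented for illustration; they are not kernel code):

	struct tas_lock {
		int locked;
	};

	static void tas_lock(struct tas_lock *l)
	{
		/*
		 * xchg_acquire() is a compound atomic: it both loads the
		 * old value of l->locked and stores 1.  Only the load half
		 * carries ACQUIRE semantics, which is exactly what a lock
		 * needs: the critical section cannot move up before the
		 * load that observed the lock to be free.
		 */
		while (xchg_acquire(&l->locked, 1))
			cpu_relax();
	}

	static void tas_unlock(struct tas_lock *l)
	{
		/* The store side is where RELEASE semantics belong. */
		smp_store_release(&l->locked, 0);
	}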
From 5db4298133d99b3dfc60d6899ac9df169769c899 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 26 Apr 2016 10:22:08 -0700 Subject: [PATCH 14/17] locking/locktorture: Simplify the torture_runnable computation This commit replaces an #ifdef with IS_ENABLED(), saving five lines. Signed-off-by: Paul E. McKenney Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: corbet@lwn.net Cc: dave@stgolabs.net Cc: dhowells@redhat.com Cc: linux-doc@vger.kernel.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1461691328-5429-4-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar --- kernel/locking/locktorture.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index d066a50dc87e..f8c5af52a131 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -75,12 +75,7 @@ struct lock_stress_stats { long n_lock_acquired; }; -#if defined(MODULE) -#define LOCKTORTURE_RUNNABLE_INIT 1 -#else -#define LOCKTORTURE_RUNNABLE_INIT 0 -#endif -int torture_runnable = LOCKTORTURE_RUNNABLE_INIT; +int torture_runnable = IS_ENABLED(MODULE); module_param(torture_runnable, int, 0444); MODULE_PARM_DESC(torture_runnable, "Start locktorture at module init"); From dc209a3fd73ec96d4491bcc128c3b50b0a8e8017 Mon Sep 17 00:00:00 2001 From: Davidlohr Bueso Date: Sun, 17 Apr 2016 23:31:42 -0700 Subject: [PATCH 15/17] locking/pvqspinlock: Avoid double resetting of stats ... remove the redundant second iteration; this is most likely a copy/paste buglet. Signed-off-by: Davidlohr Bueso Signed-off-by: Peter Zijlstra (Intel) Cc: Andrew Morton Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dave@stgolabs.net Cc: waiman.long@hpe.com Link: http://lkml.kernel.org/r/1460961103-24953-2-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar --- kernel/locking/qspinlock_stat.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/kernel/locking/qspinlock_stat.h b/kernel/locking/qspinlock_stat.h index d734b7502001..72722334237a 100644 --- a/kernel/locking/qspinlock_stat.h +++ b/kernel/locking/qspinlock_stat.h @@ -191,8 +191,6 @@ static ssize_t qstat_write(struct file *file, const char __user *user_buf, for (i = 0 ; i < qstat_num; i++) WRITE_ONCE(ptr[i], 0); - for (i = 0 ; i < qstat_num; i++) - WRITE_ONCE(ptr[i], 0); } return count; } From b96bbdde19cc56f288372d25fd5ea7af04fc1271 Mon Sep 17 00:00:00 2001 From: Davidlohr Bueso Date: Tue, 19 Apr 2016 21:17:25 -0700 Subject: [PATCH 16/17] locking/pvqspinlock: Robustify init_qspinlock_stat() Specifically around the debugfs file creation calls: I have no idea if they could ever possibly fail, but this is core code (debug aside), so let's at least check the return values and report anything fishy. Signed-off-by: Davidlohr Bueso Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Waiman Long Cc: Andrew Morton Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20160420041725.GC3472@linux-uzut.site Signed-off-by: Ingo Molnar --- kernel/locking/qspinlock_stat.h | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/kernel/locking/qspinlock_stat.h b/kernel/locking/qspinlock_stat.h index 72722334237a..22e025309845 100644 --- a/kernel/locking/qspinlock_stat.h +++ b/kernel/locking/qspinlock_stat.h @@ -212,10 +212,8 @@ static int __init init_qspinlock_stat(void) struct dentry *d_qstat = debugfs_create_dir("qlockstat", NULL); int i; - if (!d_qstat) { - pr_warn("Could not create 'qlockstat' debugfs directory\n"); - return 0; - } + if (!d_qstat) + goto out; /* * Create the debugfs files @@ -225,12 +223,20 @@ static int __init init_qspinlock_stat(void) * performance.
*/ for (i = 0; i < qstat_num; i++) - debugfs_create_file(qstat_names[i], 0400, d_qstat, - (void *)(long)i, &fops_qstat); + if (!debugfs_create_file(qstat_names[i], 0400, d_qstat, + (void *)(long)i, &fops_qstat)) + goto fail_undo; + + if (!debugfs_create_file(qstat_names[qstat_reset_cnts], 0200, d_qstat, + (void *)(long)qstat_reset_cnts, &fops_qstat)) + goto fail_undo; - debugfs_create_file(qstat_names[qstat_reset_cnts], 0200, d_qstat, - (void *)(long)qstat_reset_cnts, &fops_qstat); return 0; +fail_undo: + debugfs_remove_recursive(d_qstat); +out: + pr_warn("Could not create 'qlockstat' debugfs entries\n"); + return -ENOMEM; } fs_initcall(init_qspinlock_stat); From a1cc5bcfcfca0b99f009b117785142dbdc3b87a3 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Thu, 21 Apr 2016 20:35:25 +0200 Subject: [PATCH 17/17] locking/atomics: Flip atomic_fetch_or() arguments All the atomic operations have their arguments the wrong way around; make atomic_fetch_or() consistent and flip them. Signed-off-by: Peter Zijlstra (Intel) Cc: Andrew Morton Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar --- include/linux/atomic.h | 4 ++-- kernel/time/tick-sched.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/atomic.h b/include/linux/atomic.h index 506c3531832e..e451534fe54d 100644 --- a/include/linux/atomic.h +++ b/include/linux/atomic.h @@ -560,11 +560,11 @@ static inline int atomic_dec_if_positive(atomic_t *v) /** * atomic_fetch_or - perform *p |= mask and return old value of *p - * @p: pointer to atomic_t * @mask: mask to OR on the atomic_t + * @p: pointer to atomic_t */ #ifndef atomic_fetch_or -static inline int atomic_fetch_or(atomic_t *p, int mask) +static inline int atomic_fetch_or(int mask, atomic_t *p) { int old, val = atomic_read(p); diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 58e3310c9b21..3daa49ff0719 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -262,7 +262,7 @@ static void tick_nohz_dep_set_all(atomic_t *dep, { int prev; - prev = atomic_fetch_or(dep, BIT(bit)); + prev = atomic_fetch_or(BIT(bit), dep); if (!prev) tick_nohz_full_kick_all(); } @@ -292,7 +292,7 @@ void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) ts = per_cpu_ptr(&tick_cpu_sched, cpu); - prev = atomic_fetch_or(&ts->tick_dep_mask, BIT(bit)); + prev = atomic_fetch_or(BIT(bit), &ts->tick_dep_mask); if (!prev) { preempt_disable(); /* Perf needs local kick that is NMI safe */
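The consistency being restored by the final patch is easiest to see alongside the other atomics, which take the operand first and the atomic_t pointer last, as in atomic_add(int i, atomic_t *v) and atomic_or(int i, atomic_t *v); after the flip, atomic_fetch_or() follows the same convention. A usage sketch (the flags variable and the bit number are invented for illustration):

	atomic_t flags = ATOMIC_INIT(0);

	/* Operand first, atomic_t pointer last, matching atomic_or(). */
	int prev = atomic_fetch_or(BIT(3), &flags);
	if (!prev) {
		/* We were the first to set any flag bit. */
	}

This mirrors the tick-sched call sites converted above, where the return value is used to kick other CPUs only on the first transition from an empty mask.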