linux/arch/arc
Vineet Gupta 3032f0c900 ARCv2: spinlock: remove the extra smp_mb before lock, after unlock
- ARCv2 LLSC spinlocks have smp_mb() both before and after the LLSC
   instructions, which is not required per lkmm ACQ/REL semantics.
   smp_mb() is only needed _after_ lock and _before_ unlock.
   So remove the extra barriers.
   The reason they were there was mainly historical. At the time of
   initial SMP Linux bringup on HS38 cores, I was too conservative,
   given the fluidity of both hw and sw. The last attempt to ditch the
   extra barrier showed some hackbench regression which is apparently
   not the case now (atleast for LLSC case, read on...)

 - EX based spinlocks (!CONFIG_ARC_HAS_LLSC) still needs the extra
   smp_mb(), not due to lkmm, but due to some hardware shenanigans.
   W/o that, hackbench triggers RCU stall splat so extra DMB is retained
   !LLSC based systems are not realistic Linux sstem anyways so they can
   afford to be a nit suboptimal ;-)

   | [ARCLinux]# for i in (seq 1 1 5) ; do hackbench; done
   | Running with 10 groups 400 process
   | INFO: task hackbench:158 blocked for more than 10 seconds.
   |       Not tainted 4.20.0-00005-g96b18288a88e-dirty #117
   | "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
   | hackbench       D    0   158    135 0x00000000
   |
   | Stack Trace:
   | watchdog: BUG: soft lockup - CPU#3 stuck for 59s! [hackbench:469]
   | Modules linked in:
   | Path: (null)
   | CPU: 3 PID: 469 Comm: hackbench Not tainted 4.20.0-00005-g96b18288a88e-dirty
   |
   | [ECR   ]: 0x00000000 => Check Programmer's Manual
   | [EFA   ]: 0x00000000
   | [BLINK ]: do_exit+0x4a6/0x7d0
   | [ERET  ]: _raw_write_unlock_irq+0x44/0x5c

 - And while at it, remove the extar smp_mb() from EX based
   arch_read_trylock() since the spin lock there guarantees a full
   barrier anyways

 - For LLSC case, hackbench threads improves with this patch (HAPS @ 50MHz)

   ---- before ----
   |
   | [ARCLinux]# for i in 1 2 3 4 5; do hackbench 10 thread; done
   | Running with 10 groups 400 threads
   | Time: 16.253
   | Time: 16.445
   | Time: 16.590
   | Time: 16.721
   | Time: 16.544

   ---- after ----
   |
   | [ARCLinux]# for i in 1 2 3 4 5; do hackbench 10 thread; done
   | Running with 10 groups 400 threads
   | Time: 15.638
   | Time: 15.730
   | Time: 15.870
   | Time: 15.842
   | Time: 15.729

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-03-08 11:17:49 -08:00
..
boot ARC: [plat-hsdk]: Enable AXI DW DMAC support 2019-02-25 08:52:16 -08:00
configs arc: hsdk_defconfig: Enable CONFIG_BLK_DEV_RAM 2019-02-25 12:11:01 -08:00
include ARCv2: spinlock: remove the extra smp_mb before lock, after unlock 2019-03-08 11:17:49 -08:00
kernel ARC: unaligned: relax the check for gcc supporting -mno-unaligned-access 2019-03-05 09:16:33 -08:00
lib ARCv2: Add explcit unaligned access support (and ability to disable too) 2019-02-25 12:10:58 -08:00
mm ARC: mm: do_page_fault fixes #1: relinquish mmap_sem if signal arrives while handle_mm_fault 2019-01-17 16:24:39 -08:00
oprofile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
plat-axs10x PCI: consolidate PCI config entry in drivers/pci 2018-11-23 11:45:34 +09:00
plat-eznps arc: [plat-eznps] fix printk warning in arc/plat-eznps/mtm.c 2018-07-30 11:48:49 -07:00
plat-hsdk ARCv2: support manual regfile save on interrupts 2019-02-21 11:03:18 -08:00
plat-sim ARC: [plat-sim] Include this platform unconditionally 2017-08-04 13:49:47 +05:30
plat-tb10x arc: select GPIOLIB directly 2016-04-26 14:07:59 +02:00
Kbuild
Kconfig ARCv2: Add explcit unaligned access support (and ability to disable too) 2019-02-25 12:10:58 -08:00
Kconfig.debug Kconfig: consolidate the "Kernel hacking" menu 2018-08-02 08:06:48 +09:00
Makefile ARCv2: Add explcit unaligned access support (and ability to disable too) 2019-02-25 12:10:58 -08:00