mirror of
https://github.com/torvalds/linux.git
synced 2024-11-11 06:31:49 +00:00
f5bfdc8e39
Arm64 has a more optimized spinning loop (atomic_cond_read_acquire) using wfe for spinlock that can boost performance of sibling threads by putting the current cpu to a wait state that is broken only when the monitored variable changes or an external event happens. OSQ has a more complicated spinning loop. Besides the lock value, it also checks for need_resched() and vcpu_is_preempted(). The check for need_resched() is not a problem as it is only set by the tick interrupt handler. That will be detected by the spinning cpu right after iret. The vcpu_is_preempted() check, however, is a problem as changes to the preempt state of of previous node will not affect the wait state. For ARM64, vcpu_is_preempted is not currently defined and so is a no-op. Will has indicated that he is planning to para-virtualize wfe instead of defining vcpu_is_preempted for PV support. So just add a comment in arch/arm64/include/asm/spinlock.h to indicate that vcpu_is_preempted() should not be defined as suggested. On a 2-socket 56-core 224-thread ARM64 system, a kernel mutex locking microbenchmark was run for 10s with and without the patch. The performance numbers before patch were: Running locktest with mutex [runtime = 10s, load = 1] Threads = 224, Min/Mean/Max = 316/123,143/2,121,269 Threads = 224, Total Rate = 2,757 kop/s; Percpu Rate = 12 kop/s After patch, the numbers were: Running locktest with mutex [runtime = 10s, load = 1] Threads = 224, Min/Mean/Max = 334/147,836/1,304,787 Threads = 224, Total Rate = 3,311 kop/s; Percpu Rate = 15 kop/s So there was about 20% performance improvement. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200113150735.21956-1-longman@redhat.com |
||
---|---|---|
.. | ||
lock_events_list.h | ||
lock_events.c | ||
lock_events.h | ||
lockdep_internals.h | ||
lockdep_proc.c | ||
lockdep_states.h | ||
lockdep.c | ||
locktorture.c | ||
Makefile | ||
mcs_spinlock.h | ||
mutex-debug.c | ||
mutex-debug.h | ||
mutex.c | ||
mutex.h | ||
osq_lock.c | ||
percpu-rwsem.c | ||
qrwlock.c | ||
qspinlock_paravirt.h | ||
qspinlock_stat.h | ||
qspinlock.c | ||
rtmutex_common.h | ||
rtmutex-debug.c | ||
rtmutex-debug.h | ||
rtmutex.c | ||
rtmutex.h | ||
rwsem.c | ||
rwsem.h | ||
semaphore.c | ||
spinlock_debug.c | ||
spinlock.c | ||
test-ww_mutex.c |