linux/arch
Anton Blanchard 15c2d45d17 powerpc: Add 64bit optimised memcmp
I noticed ksm spending quite a lot of time in memcmp on a large
KVM box. The current memcmp loop is very unoptimised - byte at a
time compares with no loop unrolling. We can do much much better.

Optimise the loop in a few ways:

- Unroll the byte at a time loop

- For large (at least 32 byte) comparisons that are also 8 byte
  aligned, use an unrolled modulo scheduled loop using 8 byte
  loads. This is similar to our glibc memcmp.
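
The two changes listed above can be sketched in portable C. This is an
illustration only, not the powerpc assembly from the patch: the name
memcmp_sketch, the 4x unroll factor for the byte loop and the 32-byte block
size of the fast loop are assumptions made for the example, and the modulo
scheduling of the loads is something C cannot express directly (it is left
to the compiler here).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a C sketch of the idea, not the code from the patch. */
static int memcmp_sketch(const void *s1, const void *s2, size_t n)
{
	const unsigned char *a = s1;
	const unsigned char *b = s2;

	/* Fast path: at least 32 bytes and both pointers 8-byte aligned. */
	if (n >= 32 && (((uintptr_t)a | (uintptr_t)b) & 7) == 0) {
		while (n >= 32) {
			uint64_t x[4], y[4];

			/* Four 8-byte loads per buffer per iteration. */
			memcpy(x, a, sizeof(x));
			memcpy(y, b, sizeof(y));
			if (x[0] != y[0] || x[1] != y[1] ||
			    x[2] != y[2] || x[3] != y[3])
				break;	/* byte loop below finds the exact byte */
			a += 32;
			b += 32;
			n -= 32;
		}
	}

	/* Byte-at-a-time loop, unrolled four times. */
	while (n >= 4) {
		if (a[0] != b[0])
			return a[0] - b[0];
		if (a[1] != b[1])
			return a[1] - b[1];
		if (a[2] != b[2])
			return a[2] - b[2];
		if (a[3] != b[3])
			return a[3] - b[3];
		a += 4;
		b += 4;
		n -= 4;
	}

	/* Remaining 0-3 bytes. */
	while (n--) {
		if (*a != *b)
			return *a - *b;
		a++;
		b++;
	}
	return 0;
}
```

On a mismatch inside a 32-byte block the sketch simply falls back to the byte
loop to locate the first differing byte, which keeps the result independent
of host endianness.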

A simple microbenchmark testing 10000000 iterations of an 8192 byte
memcmp was used to measure the performance:

baseline:	29.93 s

modified:	 1.70 s

Just over 17x faster (29.93 s / 1.70 s is about 17.6x).
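
The benchmark source is not part of this commit message. A minimal userspace
sketch of such a timing loop might look like the following; the helper name
bench_memcmp, the fill pattern and the use of clock() are assumptions, and it
exercises the C library memcmp rather than the kernel routine changed here.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: time `iters` calls of memcmp() over two equal
 * buffers of `len` bytes. Returns CPU seconds, or -1.0 on failure.
 */
static double bench_memcmp(size_t len, unsigned long iters)
{
	unsigned char *a, *b;
	volatile int sink = 0;
	clock_t start, end;
	unsigned long i;

	if (len == 0)
		return -1.0;

	a = malloc(len);
	b = malloc(len);
	if (!a || !b) {
		free(a);
		free(b);
		return -1.0;
	}

	/* Equal contents, so every call has to scan the whole buffer. */
	memset(a, 0x5a, len);
	memset(b, 0x5a, len);

	start = clock();
	for (i = 0; i < iters; i++) {
		/* Volatile store discourages hoisting memcmp out of the loop. */
		((volatile unsigned char *)a)[0] = 0x5a;
		sink += memcmp(a, b, len);
	}
	end = clock();

	(void)sink;
	free(a);
	free(b);
	return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling bench_memcmp(8192, 10000000UL) mirrors the buffer size and iteration
count quoted above.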

v2: Incorporated some suggestions from Segher:

- Use andi. instead of rldicl.

- Convert bdnzt eq, to bdnz. It's just duplicating the earlier compare
  and was a relic from a previous version.

- Don't use cr5, we have plans to use that CR field for fast local
  atomics.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-23 14:02:55 +11:00
..
alpha arch: Cleanup read_barrier_depends() and comments 2014-12-11 21:15:05 -05:00
arc Minor updates for ARC for 3.19 2014-12-18 16:26:41 -08:00
arm kernel: Provide READ_ONCE and ASSIGN_ONCE 2014-12-20 16:48:59 -08:00
arm64 arm64: mm: Add pgd_page to support RCU fast_gup 2014-12-23 16:39:17 +00:00
avr32 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
blackfin TTY/Serial driver patches for 3.19-rc1 2014-12-14 15:23:32 -08:00
c6x net, lib: kill arch_fast_hash library bits 2014-12-10 15:17:46 -05:00
cris CRISv32: Remove last remnants of ETRAX_SPI_MMC_BOARD 2014-12-20 00:06:13 +01:00
frv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
hexagon Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel 2014-12-19 17:57:51 -08:00
ia64 __get_cpu_var removed from rest of tree, drop reference from comments in arch/ia64 2014-12-19 17:07:27 -08:00
m32r Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
m68k Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
metag arch: Add lightweight memory barriers dma_rmb() and dma_wmb() 2014-12-11 21:15:06 -05:00
microblaze Microblaze patches for 3.19-rc1 2014-12-17 09:54:05 -08:00
mips kernel: Provide READ_ONCE and ASSIGN_ONCE 2014-12-20 16:48:59 -08:00
mn10300 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
nios2 nios2/uaccess: fix sparse errors 2014-12-17 13:53:41 +08:00
openrisc net, lib: kill arch_fast_hash library bits 2014-12-10 15:17:46 -05:00
parisc Merge branch 'parisc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux 2014-12-26 13:41:05 -08:00
powerpc powerpc: Add 64bit optimised memcmp 2015-01-23 14:02:55 +11:00
s390 kernel: Provide READ_ONCE and ASSIGN_ONCE 2014-12-20 16:48:59 -08:00
score net, lib: kill arch_fast_hash library bits 2014-12-10 15:17:46 -05:00
sh PM: Eliminate CONFIG_PM_RUNTIME 2014-12-19 22:55:06 +01:00
sparc sparc32: destroy_context() and switch_mm() needs to disable interrupts. 2014-12-18 12:47:54 -05:00
tile Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile 2014-12-16 13:54:16 -08:00
um TTY/Serial driver patches for 3.19-rc1 2014-12-14 15:23:32 -08:00
unicore32 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
x86 kvm: x86: drop severity of "generation wraparound" message 2014-12-27 21:52:28 +01:00
xtensa Xtensa fixes for 3.19: 2014-12-16 14:08:53 -08:00
.gitignore
Kconfig