linux/lib
Ard Biesheuvel 35129dde88 md/raid6: use faster multiplication for ARM NEON delta syndrome
The P/Q left side optimization in the delta syndrome simply involves
repeatedly multiplying a value by polynomial 'x' in GF(2^8). Given
that 'x * x * x * x' equals 'x^4' even in the polynomial world, we
can accelerate this substantially by performing up to 4 such operations
at once, using the NEON instructions for polynomial multiplication.

Results on a Cortex-A57 running in 64-bit mode:

  Before:
  -------
  raid6: neonx1   xor()  1680 MB/s
  raid6: neonx2   xor()  2286 MB/s
  raid6: neonx4   xor()  3162 MB/s
  raid6: neonx8   xor()  3389 MB/s

  After:
  ------
  raid6: neonx1   xor()  2281 MB/s
  raid6: neonx2   xor()  3362 MB/s
  raid6: neonx4   xor()  3787 MB/s
  raid6: neonx8   xor()  4239 MB/s

While we're at it, simplify MASK() by using a signed shift rather than
a vector compare involving a temp register.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09 18:51:57 +01:00
..
842 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-03-17 21:38:27 -07:00
fonts lib/fonts/Kconfig: keep non-Sparc fonts listed together 2017-02-27 18:43:46 -08:00
lz4 lib/lz4: remove back-compat wrappers 2017-02-24 17:46:57 -08:00
lzo lzo: check for length overrun in variable length encoding. 2014-09-28 11:08:01 +02:00
mpi mpi: Fix NULL ptr dereference in mpi_powm() [ver #3] 2016-11-25 12:57:50 +11:00
raid6 md/raid6: use faster multiplication for ARM NEON delta syndrome 2017-08-09 18:51:57 +01:00
reed_solomon
xz lib/xz: enable all filters by default in Kconfig 2014-06-04 16:54:18 -07:00
zlib_deflate zlib_deflate/deftree: remove bi_reverse() 2015-09-10 13:29:01 -07:00
zlib_inflate lib/zlib_inflate/inftrees.c: fix potential buffer overflow 2017-05-08 17:15:12 -07:00
.gitignore
argv_split.c
asn1_decoder.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-05-17 09:33:39 -07:00
assoc_array.c assoc_array: don't call compare_object() on a node 2016-04-06 14:06:48 +01:00
atomic64_test.c lib/atomic64_test.c: add a test that atomic64_inc_not_zero() returns an int 2017-07-14 15:05:13 -07:00
atomic64.c locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() 2016-06-16 10:48:32 +02:00
audit.c syscalls: implement execveat() system call 2014-12-13 12:42:51 -08:00
bcd.c
bch.c
bitmap.c bitmap: optimise bitmap_set and bitmap_clear of a single bit 2017-07-10 16:32:34 -07:00
bitrev.c ARM: 8187/1: add CONFIG_HAVE_ARCH_BITREVERSE to support rbit instruction 2014-12-22 16:43:06 +00:00
bsearch.c lib/bsearch.c: micro-optimize pivot position calculation 2017-07-10 16:32:35 -07:00
btree.c treewide: Remove old email address 2015-11-23 09:44:58 +01:00
bug.c debug: Add _ONCE() logic to report_bug() 2017-03-30 09:37:20 +02:00
build_OID_registry
bust_spinlocks.c
chacha20.c random: replace non-blocking pool with a Chacha20-based CRNG 2016-07-03 00:57:23 -04:00
check_signature.c
checksum.c ipv4: Update parameters for csum_tcpudp_magic to their original types 2016-03-13 23:55:13 -04:00
clz_ctz.c lib/clz_ctz.c: add prototype declarations in lib/clz_ctz.c 2014-04-03 16:21:12 -07:00
clz_tab.c
cmdline.c lib/cmdline.c: fix get_options() overflow while parsing ranges 2017-06-23 16:15:55 -07:00
compat_audit.c audit: Add generic compat syscall support 2014-03-20 10:11:35 -04:00
cordic.c
cpu_rmap.c sched/topology: Rename topology_thread_cpumask() to topology_sibling_cpumask() 2015-05-27 15:22:15 +02:00
cpumask.c sched/fair, cpumask: Export for_each_cpu_wrap() 2017-05-15 10:15:23 +02:00
crc4.c lib: Add crc4 module 2017-06-09 11:52:07 +02:00
crc7.c lib/crc7: Shift crc7() output left 1 bit 2014-05-16 14:26:52 -04:00
crc8.c
crc16.c
crc32.c lib: add module support to crc32 tests 2017-02-24 17:46:57 -08:00
crc32defs.h
crc32test.c lib: add module support to crc32 tests 2017-02-24 17:46:57 -08:00
crc-ccitt.c
crc-itu-t.c lib: crc-itu-t.[ch] fix 0x0x prefix in integer constants 2015-05-26 15:26:43 +02:00
crc-t10dif.c lib: introduce crc_t10dif_update() 2015-05-30 22:42:24 -07:00
ctype.c
debug_info.c kbuild: include core debug info when DEBUG_INFO_REDUCED 2015-06-11 15:08:32 +02:00
debug_locks.c
debugobjects.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
dec_and_lock.c
decompress_bunzip2.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress_inflate.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress_unlz4.c lib/decompress_unlz4: change module to work with new LZ4 module version 2017-02-24 17:46:57 -08:00
decompress_unlzma.c lib/decompress_unlzma: Do a NULL check for pointer 2015-09-10 13:29:01 -07:00
decompress_unlzo.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress_unxz.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress.c lib/decompress: set the compressor name to NULL on error 2015-07-17 16:39:54 -07:00
devres.c devres: fix devm_ioremap_*() offset parameter kerneldoc description 2017-04-24 13:53:13 -05:00
digsig.c KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload() 2017-03-02 10:09:00 +11:00
div64.c __div64_32(): make it overridable at compile time 2015-11-16 14:42:12 -05:00
dma-debug.c dmaengine updates for 4.12-rc1 2017-05-09 15:40:28 -07:00
dma-noop.c dma: Take into account dma_pfn_offset 2017-06-28 06:55:01 -07:00
dma-virt.c dma-virt: remove dma_supported and mapping_error methods 2017-06-28 06:54:41 -07:00
dump_stack.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
dynamic_debug.c dynamic_debug: add jump label support 2016-08-04 08:50:07 -04:00
dynamic_queue_limits.c lib/dynamic_queue_limits.c: simplify includes 2015-02-12 18:54:15 -08:00
earlycpio.c lib/cpio: Make find_cpio_data()'s offset arg optional 2016-06-08 11:04:19 +02:00
errseq.c lib: add errseq_t type and infrastructure for handling it 2017-07-06 07:02:24 -04:00
extable.c lib/extable.c: use bsearch() library function in search_extable() 2017-07-10 16:32:35 -07:00
fault-inject.c fault-inject: simplify access check for fail-nth 2017-07-14 15:05:13 -07:00
fdt_empty_tree.c lib: add fdt_empty_tree.c 2014-04-30 19:49:37 +01:00
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
fdt.c
find_bit.c lib/find_bit.c: micro-optimise find_next_*_bit 2017-02-24 17:46:57 -08:00
flex_array.c reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
flex_proportions.c percpu_counter: Rename __percpu_counter_add to percpu_counter_add_batch 2017-06-20 15:42:32 -04:00
gcd.c lib/GCD.c: use binary GCD algorithm instead of Euclidean 2016-05-20 17:58:30 -07:00
gen_crc32table.c lib: crc32: constify crc32 lookup table 2015-02-13 21:21:35 -08:00
genalloc.c lib/genalloc.c: start search from start of chunk 2016-10-27 18:43:43 -07:00
glob.c lib: add module support to glob tests 2017-02-24 17:46:57 -08:00
globtest.c lib: add module support to glob tests 2017-02-24 17:46:57 -08:00
hexdump.c lib/hexdump.c: truncate output in case of overflow 2015-11-06 17:50:42 -08:00
hweight.c x86/hweight: Get rid of the special calling convention 2016-06-08 15:01:02 +02:00
idr.c idr: Add missing __rcu annotations 2017-02-13 21:44:10 -05:00
inflate.c
int_sqrt.c
interval_tree_test.c lib/interval_tree_test.c: allow full tree search 2017-07-10 16:32:35 -07:00
interval_tree.c lib/interval_tree.c: simplify includes 2015-02-12 18:54:15 -08:00
iomap_copy.c lib/iomap_copy.c: add __ioread32_copy() 2016-01-20 17:09:18 -08:00
iomap.c Kconfig: rename HAS_IOPORT to HAS_IOPORT_MAP 2014-04-07 16:36:11 -07:00
iommu-common.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2015-11-05 16:34:48 -08:00
iommu-helper.c lib/iommu-helper: skip to next segment 2016-08-02 19:35:07 -04:00
ioremap.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
iov_iter.c Merge branch 'uaccess-work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-07 20:39:20 -07:00
irq_poll.c This adds a new gcc plugin named "latent_entropy". It is designed to 2016-10-15 10:03:15 -07:00
irq_regs.c
is_single_threaded.c sched/headers: Prepare to move 'init_task' and 'init_thread_union' from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
jedec_ddr_data.c
kasprintf.c lib/kasprintf.c: add sanity check to kvasprintf 2016-01-16 11:17:27 -08:00
Kconfig libnvdimm for 4.13 2017-07-07 09:44:06 -07:00
Kconfig.debug Add wait_for_random_bytes() and get_random_*_wait() functions so that 2017-07-15 12:44:02 -07:00
Kconfig.kasan mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB 2016-07-28 16:07:41 -07:00
Kconfig.kgdb lib: update location of kgdb documentation 2017-05-16 08:44:22 -03:00
Kconfig.kmemcheck
Kconfig.ubsan Kconfig: lib/Kconfig.ubsan fix reference to ubsan documentation 2016-12-14 16:04:08 -08:00
kfifo.c kfifo: use BUG_ON 2014-08-08 15:57:25 -07:00
klist.c klist: fix starting point removed bug in klist iterators 2016-02-07 22:18:47 -08:00
kobject_uevent.c kobject: support passing in variables for synthetic uevents 2017-05-25 18:30:51 +02:00
kobject.c kobject: Export kobject_get_unless_zero() 2017-03-22 20:11:35 -06:00
kstrtox.c lib/kstrtox.c: use "unsigned int" more 2017-07-10 16:32:34 -07:00
kstrtox.h
lcm.c block: fix blk_stack_limits() regression due to lcm() change 2015-03-31 09:45:50 -06:00
libcrc32c.c crypto: Work around deallocated stack frame reference gcc bug on sparc. 2017-06-08 17:36:03 +08:00
list_debug.c bug: switch data corruption check to __must_check 2017-02-24 17:46:56 -08:00
list_sort.c lib: add module support to linked list sorting tests 2017-05-08 17:15:10 -07:00
llist.c lib/llist.c: fix data race in llist_del_first 2015-11-06 17:50:42 -08:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-rtmutex.h locking/selftest: Add RT-mutex support 2017-06-08 10:35:50 +02:00
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c locking/selftest: Add RT-mutex support 2017-06-08 10:35:50 +02:00
lockref.c locking/core: Remove cpu_relax_lowlatency() users 2016-11-16 10:15:10 +01:00
lru_cache.c lru_cache: Converted lc_seq_printf_status to return void 2015-11-25 09:22:02 -07:00
Makefile kmod: add test driver to stress test the module loader 2017-07-14 15:05:13 -07:00
memory-notifier-error-inject.c
memweight.c
net_utils.c mac_pton: Use bool not int return 2014-06-25 17:45:43 -07:00
netdev-notifier-error-inject.c net: Add support for CHANGEUPPER notifier error injection 2015-12-03 11:49:23 -05:00
nlattr.c net: manual clean code which call skb_put_[data:zero] 2017-06-20 13:30:15 -04:00
nmi_backtrace.c printk: Use the main logbuf in NMI when logbuf_lock is available 2017-05-19 14:42:19 +02:00
nodemask.c include/linux/nodemask.h: create next_node_in() helper 2016-05-19 19:12:14 -07:00
notifier-error-inject.c
notifier-error-inject.h
of-reconfig-notifier-error-inject.c
oid_registry.c
once.c once: make helper generic for calling functions once 2015-10-08 05:26:36 -07:00
parman.c lib: Introduce priority array area manager 2017-02-03 16:35:42 -05:00
parser.c parser: add u64 number parser 2016-12-06 10:17:03 +02:00
pci_iomap.c libnvdimm for 4.3: 2015-09-08 14:35:59 -07:00
percpu_counter.c writeback: rework wb_[dec|inc]_stat family of functions 2017-07-12 16:26:05 -07:00
percpu_ida.c sched/headers: Prepare to remove the <linux/gfp.h> include from <linux/sched.h> 2017-03-02 08:42:34 +01:00
percpu_test.c
percpu-refcount.c percpu-refcount: support synchronous switch to atomic mode. 2017-03-22 19:18:43 -07:00
plist.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
pm-notifier-error-inject.c
prime_numbers.c lib/prime_numbers: Suppress warn on kmalloc failure 2017-01-23 09:17:12 +01:00
radix-tree.c lockdep: allow to disable reclaim lockup detection 2017-05-03 15:52:09 -07:00
random32.c This adds a new gcc plugin named "latent_entropy". It is designed to 2016-10-15 10:03:15 -07:00
ratelimit.c ratelimit: extend to print suppressed messages on release 2016-08-02 19:35:06 -04:00
rational.c
rbtree_test.c rbtree/test: test rbtree_postorder_for_each_entry_safe() 2014-01-23 16:37:03 -08:00
rbtree.c rbtree: use designated initializers 2017-02-24 17:46:57 -08:00
reciprocal_div.c reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
refcount.c locking/refcount: Create unchecked atomic_t implementation 2017-06-28 18:54:46 +02:00
rhashtable.c Add wait_for_random_bytes() and get_random_*_wait() functions so that 2017-07-15 12:44:02 -07:00
sbitmap.c sbitmap: add sbitmap_get_shallow() operation 2017-04-14 14:06:52 -06:00
scatterlist.c scatterlist: add sg_zero_buffer() helper 2017-06-15 14:30:14 +02:00
seq_buf.c tracing: Use seq_buf_used() in seq_buf_to_user() instead of len 2015-12-23 14:27:20 -05:00
sg_pool.c lib: scatterlist: move SG pool code from SCSI driver to lib/sg_pool.c 2016-04-15 16:53:14 -04:00
sg_split.c lib: scatterlist: add sg splitting function 2015-08-24 14:28:01 -06:00
sha1.c lib: EXPORT_SYMBOL sha_init 2015-03-23 22:12:08 -04:00
show_mem.c lib/show_mem.c: teach show_mem to work with the given nodemask 2017-02-22 16:41:30 -08:00
siphash.c siphash: implement HalfSipHash1-3 for hash tables 2017-01-09 13:58:57 -05:00
smp_processor_id.c sched/core: Enable might_sleep() and smp_processor_id() checks early 2017-05-23 10:01:38 +02:00
sort.c lib: add CONFIG_TEST_SORT to enable self-test of sort() 2017-02-24 17:46:57 -08:00
stackdepot.c lib/stackdepot: export save/fetch stack for drivers 2016-11-11 08:12:37 -08:00
stmp_device.c lib/stmp_device.c: replace module.h include 2015-02-12 18:54:16 -08:00
string_helpers.c string_helpers: add kstrdup_quotable_file 2016-04-21 10:47:26 +10:00
string.c include/linux/string.h: add the option of fortified string.h functions 2017-07-12 16:26:03 -07:00
strncpy_from_user.c lib: harden strncpy_from_user 2016-10-11 15:06:30 -07:00
strnlen_user.c kill strlen_user() 2017-05-15 23:40:22 -04:00
swiotlb.c swiotlb: ensure that page-sized mappings are page-aligned 2017-01-15 12:37:24 -05:00
syscall.c lib/syscall: Clear return values when no stack 2017-03-24 07:43:35 +01:00
test_bitmap.c bitmap: optimise bitmap_set and bitmap_clear of a single bit 2017-07-10 16:32:34 -07:00
test_bpf.c net: introduce __skb_put_[zero, data, u8] 2017-06-20 13:30:14 -04:00
test_firmware.c test_firmware: add test custom fallback trigger 2017-01-25 11:52:34 +01:00
test_hash.c lib/test_hash.c: fix warning in preprocessor symbol evaluation 2016-09-01 17:52:01 -07:00
test_hexdump.c test_hexdump: print statistics at the end 2016-01-20 17:09:18 -08:00
test_kasan.c kasan: report only the first error by default 2017-03-31 17:13:30 -07:00
test_kmod.c kmod: add test driver to stress test the module loader 2017-07-14 15:05:13 -07:00
test_list_sort.c lib: add module support to linked list sorting tests 2017-05-08 17:15:10 -07:00
test_module.c test: add minimal module for verification testing 2014-01-23 16:36:57 -08:00
test_parman.c lib: fix spelling mistake: "actualy" -> "actually" 2017-02-26 11:03:38 -05:00
test_printf.c mm, printk: introduce new format string for flags 2016-03-15 16:55:16 -07:00
test_rhashtable.c lib: test_rhashtable: Fix KASAN warning 2017-07-25 12:35:23 -07:00
test_siphash.c siphash: implement HalfSipHash1-3 for hash tables 2017-01-09 13:58:57 -05:00
test_sort.c Revert "lib/test_sort.c: make it explicitly non-modular" 2017-05-08 17:15:10 -07:00
test_static_key_base.c locking/static_keys: Provide a selftest 2015-08-03 11:51:12 +02:00
test_static_keys.c locking/static_keys: Avoid nested functions 2016-02-09 10:27:29 +01:00
test_sysctl.c test_sysctl: test against int proc_dointvec() array support 2017-07-12 16:26:00 -07:00
test_user_copy.c lib: remove check for AVR32 arch in test_user_copy 2017-05-01 09:36:30 +02:00
test_uuid.c uuid: fix incorrect uuid_equal conversion in test_uuid_test 2017-07-21 09:38:30 +02:00
test-kstrtox.c kstrto*: accept "-0" for signed conversion 2015-09-10 13:29:01 -07:00
test-string_helpers.c lib/test-string_helpers.c: fix and improve string_get_size() tests 2016-02-03 08:28:43 -08:00
textsearch.c lib/textsearch.c: remove textsearch_put reference from comments 2014-10-14 02:18:14 +02:00
timerqueue.c timerqueue: Use rb_entry_safe() instead of open-coding it 2017-01-20 08:03:42 +01:00
ts_bm.c
ts_fsm.c
ts_kmp.c
ubsan.c UBSAN: fix typo in format string 2016-08-02 17:31:41 -04:00
ubsan.h UBSAN: run-time undefined behavior sanity checker 2016-01-20 17:09:18 -08:00
ucs2_string.c lib/ucs2_string: Speed up ucs2_utf8size() 2016-09-09 16:08:46 +01:00
usercopy.c copy_{from,to}_user(): move kasan checks and might_fault() out-of-line 2017-06-29 22:21:20 -04:00
uuid.c uuid: hoist uuid_is_null() helper from libnvdimm 2017-06-05 16:59:05 +02:00
vsprintf.c DeviceTree for 4.13: 2017-07-07 10:37:54 -07:00
win_minmax.c lib/win_minmax: windowed min or max estimator 2016-09-21 00:22:59 -04:00