linux/arch/arm/crypto
Ard Biesheuvel 86cd97ec4b crypto: arm/chacha-neon - optimize for non-block size multiples
The current NEON based ChaCha implementation for ARM is optimized for
multiples of 4x the ChaCha block size (64 bytes). This makes sense for
block encryption, but given that ChaCha is also often used in the
context of networking, it makes sense to consider arbitrary length
inputs as well.

For example, WireGuard typically uses 1420 byte packets, and performing
ChaCha encryption involves 5 invocations of chacha_4block_xor_neon()
and 3 invocations of chacha_block_xor_neon(), where the last one also
involves a memcpy() using a buffer on the stack to process the final
chunk of 1420 % 64 == 12 bytes.

Let's optimize for this case as well, by letting chacha_4block_xor_neon()
deal with any input size between 64 and 256 bytes, using NEON permutation
instructions and overlapping loads and stores. This way, the 140 byte
tail of a 1420 byte input buffer can simply be processed in one go.

This results in the following performance improvements for 1420 byte
blocks, without significant impact on power-of-2 input sizes. (Note
that Raspberry Pi is widely used in combination with a 32-bit kernel,
even though the core is 64-bit capable)

   Cortex-A8  (BeagleBone)       :   7%
   Cortex-A15 (Calxeda Midway)   :  21%
   Cortex-A53 (Raspberry Pi 3)   :   3%
   Cortex-A72 (Raspberry Pi 4)   :  19%

Cc: Eric Biggers <ebiggers@google.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-11-13 20:38:44 +11:00
..
.gitignore SPDX patches for 5.7-rc1. 2020-04-03 13:12:26 -07:00
aes-ce-core.S crypto: arm - use Kconfig based compiler checks for crypto opcodes 2019-10-23 19:46:56 +11:00
aes-ce-glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
aes-cipher-core.S crypto: arm/aes-cipher - switch to shared AES inverse Sbox 2019-07-26 14:58:37 +10:00
aes-cipher-glue.c crypto: arm/aes-scalar - unexport en/decryption routines 2019-07-26 14:58:38 +10:00
aes-neonbs-core.S crypto: arm/aes-neonbs - avoid loading reorder argument on encryption 2020-09-25 17:48:15 +10:00
aes-neonbs-glue.c crypto: arm/aes-neonbs - fix usage of cbc(aes) fallback 2020-11-06 14:31:15 +11:00
chacha-glue.c crypto: arm/chacha-neon - optimize for non-block size multiples 2020-11-13 20:38:44 +11:00
chacha-neon-core.S crypto: arm/chacha-neon - optimize for non-block size multiples 2020-11-13 20:38:44 +11:00
chacha-scalar-core.S crypto: arm/chacha - remove dependency on generic ChaCha driver 2019-11-17 09:02:40 +08:00
crc32-ce-core.S crypto: Replace HTTP links with HTTPS ones 2020-07-23 17:34:20 +10:00
crc32-ce-glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
crct10dif-ce-core.S crypto: arm - use Kconfig based compiler checks for crypto opcodes 2019-10-23 19:46:56 +11:00
crct10dif-ce-glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
curve25519-core.S crypto: arm/curve25519 - wire up NEON implementation 2019-11-17 09:02:44 +08:00
curve25519-glue.c crypto: arm/curve25519 - include <linux/scatterlist.h> 2020-08-25 11:24:07 +10:00
ghash-ce-core.S crypto: arm/ghash-ce - define fpu before fpu registers are referenced 2020-03-06 12:28:25 +11:00
ghash-ce-glue.c crypto: arm/ghash - use variably sized key struct 2020-07-09 22:14:33 +10:00
Kconfig compiler/gcc: Raise minimum GCC version for kernel builds to 4.8 2020-04-15 21:36:20 +01:00
Makefile crypto: arm/curve25519 - wire up NEON implementation 2019-11-17 09:02:44 +08:00
nh-neon-core.S crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305 2018-11-20 14:26:56 +08:00
nhpoly1305-neon-glue.c crypto: arch/nhpoly1305 - process in explicit 4k chunks 2020-04-30 15:16:59 +10:00
poly1305-armv4.pl crypto: arm/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation 2019-11-17 09:02:42 +08:00
poly1305-core.S_shipped crypto: arm/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation 2019-11-17 09:02:42 +08:00
poly1305-glue.c crypto: arm/poly1305 - Add prototype for poly1305_blocks_neon 2020-09-04 17:57:14 +10:00
sha1_glue.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
sha1_neon_glue.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
sha1-armv4-large.S crypto: Replace HTTP links with HTTPS ones 2020-07-23 17:34:20 +10:00
sha1-armv7-neon.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sha1-ce-core.S crypto: arm - use Kconfig based compiler checks for crypto opcodes 2019-10-23 19:46:56 +11:00
sha1-ce-glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sha1.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sha2-ce-core.S crypto: arm - use Kconfig based compiler checks for crypto opcodes 2019-10-23 19:46:56 +11:00
sha2-ce-glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sha256_glue.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
sha256_glue.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sha256_neon_glue.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
sha256-armv4.pl crypto: arm/sha256-neon - avoid ADRL pseudo instruction 2020-09-25 17:48:13 +10:00
sha256-core.S_shipped crypto: arm/sha256-neon - avoid ADRL pseudo instruction 2020-09-25 17:48:13 +10:00
sha512-armv4.pl crypto: arm/sha512-neon - avoid ADRL pseudo instruction 2020-09-25 17:48:14 +10:00
sha512-core.S_shipped crypto: arm/sha512-neon - avoid ADRL pseudo instruction 2020-09-25 17:48:14 +10:00
sha512-glue.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-07-08 20:57:08 -07:00
sha512-neon-glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sha512.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00