linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-29 14:21:47 +00:00

Author	SHA1	Message	Date
Ard Biesheuvel	2cc0fedb81	crypto: x86/cast6 - switch to XTS template Now that the XTS template can wrap accelerated ECB modes, it can be used to implement CAST6 in XTS mode as well, which turns out to be at least as fast, and sometimes even faster Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:27 +11:00
Ard Biesheuvel	55a7e88f01	crypto: x86/camellia - switch to XTS template Now that the XTS template can wrap accelerated ECB modes, it can be used to implement Camellia in XTS mode as well, which turns out to be at least as fast, and sometimes even faster. Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:27 +11:00
Bhaskar Chowdhury	4d6a5a4b1e	crypto: marvell/cesa - Fix a spelling s/fautly/faultly/ in comment s/fautly/faulty/p Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:27 +11:00
Kai Ye	34932a6033	crypto: hisilicon/sec - register SEC device to uacce Register SEC device to uacce framework for user space. Signed-off-by: Kai Ye <yekai13@huawei.com> Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com> Reviewed-by: Zaibo Xu <xuzaibo@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:27 +11:00
Kai Ye	bedd04e4aa	crypto: hisilicon/hpre - register HPRE device to uacce Register HPRE device to uacce framework for user space. Signed-off-by: Kai Ye <yekai13@huawei.com> Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com> Reviewed-by: Zaibo Xu <xuzaibo@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:26 +11:00
Kai Ye	f8408d2b79	crypto: hisilicon - add ZIP device using mode parameter Add 'uacce_mode' parameter for ZIP, which can be set as 0(default) or 1. '0' means ZIP is only registered to kernel crypto, and '1' means it's registered to both kernel crypto and UACCE. Signed-off-by: Kai Ye <yekai13@huawei.com> Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com> Reviewed-by: Zaibo Xu <xuzaibo@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:26 +11:00
Kai Ye	0d61c3f144	crypto: hisilicon/qm - SVA bugfixed on Kunpeng920 Kunpeng920 SEC/HPRE/ZIP cannot support running user space SVA and kernel Crypto at the same time. Therefore, the algorithms should not be registered to Crypto as user space SVA is enabled. Signed-off-by: Kai Ye <yekai13@huawei.com> Reviewed-by: Zaibo Xu <xuzaibo@huawei.com> Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:26 +11:00
Jiri Olsa	f7f2b43eaf	crypto: bcm - Rename struct device_private to bcm_device_private Renaming 'struct device_private' to 'struct bcm_device_private', because it clashes with 'struct device_private' from 'drivers/base/base.h'. While it's not a functional problem, it's causing two distinct type hierarchies in BTF data. It also breaks build with options: CONFIG_DEBUG_INFO_BTF=y CONFIG_CRYPTO_DEV_BCM_SPU=y as reported by Qais Yousef [1]. [1] https://lore.kernel.org/lkml/20201229151352.6hzmjvu3qh6p2qgg@e107158-lin/ Fixes: `9d12ba86f8` ("crypto: brcm - Add Broadcom SPU driver") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:26 +11:00
Adam Guerin	e48767c177	crypto: qat - reduce size of mapped region Restrict size of field to what is required by the operation. This issue was detected by smatch: drivers/crypto/qat/qat_common/qat_asym_algs.c:328 qat_dh_compute_value() error: dma_map_single_attrs() '&qat_req->in.dh.in.b' too small (8 vs 64) Signed-off-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:26 +11:00
Adam Guerin	80fccf18fe	crypto: qat - change format string and cast ring size Cast ADF_SIZE_TO_RING_SIZE_IN_BYTES() so it can return a 64 bit value. This issue was detected by smatch: drivers/crypto/qat/qat_common/adf_transport_debug.c:65 adf_ring_show() warn: should '(1 << (ring->ring_size - 1)) << 7' be a 64 bit type? Signed-off-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:26 +11:00
Adam Guerin	1aaae055d4	crypto: qat - fix potential spectre issue Sanitize ring_num value coming from configuration (and potentially from user space) before it is used as index in the banks array. This issue was detected by smatch: drivers/crypto/qat/qat_common/adf_transport.c:233 adf_create_ring() warn: potential spectre issue 'bank->rings' [r] (local cap) Signed-off-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:26 +11:00
Wojciech Ziemba	0db0d797ab	crypto: qat - configure arbiter mapping based on engines enabled The hardware specific function adf_get_arbiter_mapping() modifies the static array thrd_to_arb_map to disable mappings for AEs that are disabled. This static array is used for each device of the same type. If the ae mask is not identical for all devices of the same type then the arbiter mapping returned by adf_get_arbiter_mapping() may be wrong. This patch fixes this problem by ensuring the static arbiter mapping is unchanged and the device arbiter mapping is re-calculated each time based on the static mapping. Signed-off-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:26 +11:00
Ard Biesheuvel	d6cbf4eaa4	crypto: aesni - replace function pointers with static branches Replace the function pointers in the GCM implementation with static branches, which are based on code patching, which occurs only at module load time. This avoids the severe performance penalty caused by the use of retpolines. In order to retain the ability to switch between different versions of the implementation based on the input size on cores that support AVX and AVX2, use static branches instead of static calls. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:25 +11:00
Ard Biesheuvel	83c83e6588	crypto: aesni - refactor scatterlist processing Currently, the gcm(aes-ni) driver open codes the scatterlist handling that is encapsulated by the skcipher walk API. So let's switch to that instead. Also, move the handling at the end of gcmaes_crypt_by_sg() that is dependent on whether we are encrypting or decrypting into the callers, which always do one or the other. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:25 +11:00
Ard Biesheuvel	2694e23ffd	crypto: aesni - clean up mapping of associated data The gcm(aes-ni) driver is only built for x86_64, which does not make use of highmem. So testing for PageHighMem is pointless and can be omitted. While at it, replace GFP_ATOMIC with the appropriate runtime decided value based on the context. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:25 +11:00
Ard Biesheuvel	30f2c18eb5	crypto: aesni - drop unused asm prototypes Drop some prototypes that are declared but never called. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:25 +11:00
Ard Biesheuvel	a13ed1d15b	crypto: aesni - prevent misaligned buffers on the stack The GCM mode driver uses 16 byte aligned buffers on the stack to pass the IV to the asm helpers, but unfortunately, the x86 port does not guarantee that the stack pointer is 16 byte aligned upon entry in the first place. Since the compiler is not aware of this, it will not emit the additional stack realignment sequence that is needed, and so the alignment is not guaranteed to be more than 8 bytes. So instead, allocate some padding on the stack, and realign the IV pointer by hand. Cc: <stable@vger.kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:25 +11:00
Marco Chiappero	4f1a02e75a	crypto: qat - replace CRYPTO_AES with CRYPTO_LIB_AES in Kconfig Use CRYPTO_LIB_AES in place of CRYPTO_AES in the dependences for the QAT common code. Fixes: `c0e583ab20` ("crypto: qat - add CRYPTO_AES to Kconfig dependencies") Reported-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Marco Chiappero <marco.chiappero@intel.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:25 +11:00
Herbert Xu	81064c96d8	crypto: stm32 - Fix last sparse warning in stm32_cryp_check_ctr_counter This patch changes the cast in stm32_cryp_check_ctr_counter from u32 to __be32 to match the prototype of stm32_cryp_hw_write_iv correctly. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-14 17:10:25 +11:00
Herbert Xu	622aae879c	crypto: vmx - Move extern declarations into header file This patch moves the extern algorithm declarations into a header file so that a number of compiler warnings are silenced. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-08 15:39:47 +11:00
Ard Biesheuvel	2481104fe9	crypto: x86/aes-ni-xts - rewrite and drop indirections via glue helper The AES-NI driver implements XTS via the glue helper, which consumes a struct with sets of function pointers which are invoked on chunks of input data of the appropriate size, as annotated in the struct. Let's get rid of this indirection, so that we can perform direct calls to the assembler helpers. Instead, let's adopt the arm64 strategy, i.e., provide a helper which can consume inputs of any size, provided that the penultimate, full block is passed via the last call if ciphertext stealing needs to be applied. This also allows us to enable the XTS mode for i386. Tested-by: Eric Biggers <ebiggers@google.com> # x86_64 Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-08 15:39:47 +11:00
Ard Biesheuvel	86ad60a65f	crypto: x86/aes-ni-xts - use direct calls to and 4-way stride The XTS asm helper arrangement is a bit odd: the 8-way stride helper consists of back-to-back calls to the 4-way core transforms, which are called indirectly, based on a boolean that indicates whether we are performing encryption or decryption. Given how costly indirect calls are on x86, let's switch to direct calls, and given how the 8-way stride doesn't really add anything substantial, use a 4-way stride instead, and make the asm core routine deal with any multiple of 4 blocks. Since 512 byte sectors or 4 KB blocks are the typical quantities XTS operates on, increase the stride exported to the glue helper to 512 bytes as well. As a result, the number of indirect calls is reduced from 3 per 64 bytes of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup when operating on 1 KB blocks (measured on a Intel(R) Core(TM) i7-8650U CPU) Fixes: `9697fa39ef` ("x86/retpoline/crypto: Convert crypto assembler indirect jumps") Tested-by: Eric Biggers <ebiggers@google.com> # x86_64 Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-08 15:39:47 +11:00
Rob Herring	fecff3b931	crypto: picoxcell - Remove PicoXcell driver PicoXcell has had nothing but treewide cleanups for at least the last 8 years and no signs of activity. The most recent activity is a yocto vendor kernel based on v3.0 in 2015. Cc: Jamie Iles <jamie@jamieiles.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-crypto@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 09:03:36 +11:00
Eric Biggers	1862eb0073	crypto: arm/blake2b - add NEON-accelerated BLAKE2b Add a NEON-accelerated implementation of BLAKE2b. On Cortex-A7 (which these days is the most common ARM processor that doesn't have the ARMv8 Crypto Extensions), this is over twice as fast as SHA-256, and slightly faster than SHA-1. It is also almost three times as fast as the generic implementation of BLAKE2b: Algorithm Cycles per byte (on 4096-byte messages) =================== ======================================= blake2b-256-neon 14.0 sha1-neon 16.3 blake2s-256-arm 18.8 sha1-asm 20.8 blake2s-256-generic 26.0 sha256-neon 28.9 sha256-asm 32.0 blake2b-256-generic 38.9 This implementation isn't directly based on any other implementation, but it borrows some ideas from previous NEON code I've written as well as from chacha-neon-core.S. At least on Cortex-A7, it is faster than the other NEON implementations of BLAKE2b I'm aware of (the implementation in the BLAKE2 official repository using intrinsics, and Andrew Moon's implementation which can be found in SUPERCOP). It does only one block at a time, so it performs well on short messages too. NEON-accelerated BLAKE2b is useful because there is interest in using BLAKE2b-256 for dm-verity on low-end Android devices (specifically, devices that lack the ARMv8 Crypto Extensions) to replace SHA-1. On these devices, the performance cost of upgrading to SHA-256 may be unacceptable, whereas BLAKE2b-256 would actually improve performance. Although BLAKE2b is intended for 64-bit platforms (unlike BLAKE2s which is intended for 32-bit platforms), on 32-bit ARM processors with NEON, BLAKE2b is actually faster than BLAKE2s. This is because NEON supports 64-bit operations, and because BLAKE2s's block size is too small for NEON to be helpful for it. The best I've been able to do with BLAKE2s on Cortex-A7 is 18.8 cpb with an optimized scalar implementation. (I didn't try BLAKE2sp and BLAKE3, which in theory would be faster, but they're more complex as they require running multiple hashes at once. Note that BLAKE2b already uses all the NEON bandwidth on the Cortex-A7, so I expect that any speedup from BLAKE2sp or BLAKE3 would come only from the smaller number of rounds, not from the extra parallelism.) For now this BLAKE2b implementation is only wired up to the shash API, since there is no library API for BLAKE2b yet. However, I've tried to keep things consistent with BLAKE2s, e.g. by defining blake2b_compress_arch() which is analogous to blake2s_compress_arch() and could be exported for use by the library API later if needed. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:39 +11:00
Eric Biggers	0cdc438e6e	crypto: blake2b - update file comment The file comment for blake2b_generic.c makes it sound like it's the reference implementation of BLAKE2b with only minor changes. But it's actually been changed a lot. Update the comment to make this clearer. Reviewed-by: David Sterba <dsterba@suse.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:39 +11:00
Eric Biggers	28dcca4cc0	crypto: blake2b - sync with blake2s implementation Sync the BLAKE2b code with the BLAKE2s code as much as possible: - Move a lot of code into new headers <crypto/blake2b.h> and <crypto/internal/blake2b.h>, and adjust it to be like the corresponding BLAKE2s code, i.e. like <crypto/blake2s.h> and <crypto/internal/blake2s.h>. - Rename constants, e.g. BLAKE2B__DIGEST_SIZE => BLAKE2B__HASH_SIZE. - Use a macro BLAKE2B_ALG() to define the shash_alg structs. - Export blake2b_compress_generic() for use as a fallback. This makes it much easier to add optimized implementations of BLAKE2b, as optimized implementations can use the helper functions crypto_blake2b_{setkey,init,update,final}() and blake2b_compress_generic(). The ARM implementation will use these. But this change is also helpful because it eliminates unnecessary differences between the BLAKE2b and BLAKE2s code, so that the same improvements can easily be made to both. (The two algorithms are basically identical, except for the word size and constants.) It also makes it straightforward to add a library API for BLAKE2b in the future if/when it's needed. This change does make the BLAKE2b code slightly more complicated than it needs to be, as it doesn't actually provide a library API yet. For example, __blake2b_update() doesn't really need to exist yet; it could just be inlined into crypto_blake2b_update(). But I believe this is outweighed by the benefits of keeping the code in sync. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:39 +11:00
Eric Biggers	a64bfe7ad4	wireguard: Kconfig: select CRYPTO_BLAKE2S_ARM When available, select the new implementation of BLAKE2s for 32-bit ARM. This is faster than the generic C implementation. Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:39 +11:00
Eric Biggers	5172d322d3	crypto: arm/blake2s - add ARM scalar optimized BLAKE2s Add an ARM scalar optimized implementation of BLAKE2s. NEON isn't very useful for BLAKE2s because the BLAKE2s block size is too small for NEON to help. Each NEON instruction would depend on the previous one, resulting in poor performance. With scalar instructions, on the other hand, we can take advantage of ARM's "free" rotations (like I did in chacha-scalar-core.S) to get an implementation get runs much faster than the C implementation. Performance results on Cortex-A7 in cycles per byte using the shash API: 4096-byte messages: blake2s-256-arm: 18.8 blake2s-256-generic: 26.0 500-byte messages: blake2s-256-arm: 20.3 blake2s-256-generic: 27.9 100-byte messages: blake2s-256-arm: 29.7 blake2s-256-generic: 39.2 32-byte messages: blake2s-256-arm: 50.6 blake2s-256-generic: 66.2 Except on very short messages, this is still slower than the NEON implementation of BLAKE2b which I've written; that is 14.0, 16.4, 25.8, and 76.1 cpb on 4096, 500, 100, and 32-byte messages, respectively. However, optimized BLAKE2s is useful for cases where BLAKE2s is used instead of BLAKE2b, such as WireGuard. This new implementation is added in the form of a new module blake2s-arm.ko, which is analogous to blake2s-x86_64.ko in that it provides blake2s_compress_arch() for use by the library API as well as optionally register the algorithms with the shash API. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:39 +11:00
Eric Biggers	bbda6e0f13	crypto: blake2s - include <linux/bug.h> instead of <asm/bug.h> Address the following checkpatch warning: WARNING: Use #include <linux/bug.h> instead of <asm/bug.h> Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:39 +11:00
Eric Biggers	8786841bc2	crypto: blake2s - adjust include guard naming Use the full path in the include guards for the BLAKE2s headers to avoid ambiguity and to match the convention for most files in include/crypto/. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:38 +11:00
Eric Biggers	7d87131fad	crypto: blake2s - add comment for blake2s_state fields The first three fields of 'struct blake2s_state' are used in assembly code, which isn't immediately obvious, so add a comment to this effect. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:38 +11:00
Eric Biggers	42ad8cf821	crypto: blake2s - optimize blake2s initialization If no key was provided, then don't waste time initializing the block buffer, as its initial contents won't be used. Also, make crypto_blake2s_init() and blake2s() call a single internal function __blake2s_init() which treats the key as optional, rather than conditionally calling blake2s_init() or blake2s_init_key(). This reduces the compiled code size, as previously both blake2s_init() and blake2s_init_key() were being inlined into these two callers, except when the key size passed to blake2s() was a compile-time constant. These optimizations aren't that significant for BLAKE2s. However, the equivalent optimizations will be more significant for BLAKE2b, as everything is twice as big in BLAKE2b. And it's good to keep things consistent rather than making optimizations for BLAKE2b but not BLAKE2s. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:38 +11:00
Eric Biggers	8c4a93a127	crypto: blake2s - share the "shash" API boilerplate code Add helper functions for shash implementations of BLAKE2s to include/crypto/internal/blake2s.h, taking advantage of __blake2s_update() and __blake2s_final() that were added by the previous patch to share more code between the library and shash implementations. crypto_blake2s_setkey() and crypto_blake2s_init() are usable as shash_alg::setkey and shash_alg::init directly, while crypto_blake2s_update() and crypto_blake2s_final() take an extra 'blake2s_compress_t' function pointer parameter. This allows the implementation of the compression function to be overridden, which is the only part that optimized implementations really care about. The new functions are inline functions (similar to those in sha1_base.h, sha256_base.h, and sm3_base.h) because this avoids needing to add a new module blake2s_helpers.ko, they aren't too long, and this avoids indirect calls which are expensive these days. Note that they can't go in blake2s_generic.ko, as that would require selecting CRYPTO_BLAKE2S from CRYPTO_BLAKE2S_X86, which would cause a recursive dependency. Finally, use these new helper functions in the x86 implementation of BLAKE2s. (This part should be a separate patch, but unfortunately the x86 implementation used the exact same function names like "crypto_blake2s_update()", so it had to be updated at the same time.) Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:38 +11:00
Eric Biggers	057edc9c8b	crypto: blake2s - move update and final logic to internal/blake2s.h Move most of blake2s_update() and blake2s_final() into new inline functions __blake2s_update() and __blake2s_final() in include/crypto/internal/blake2s.h so that this logic can be shared by the shash helper functions. This will avoid duplicating this logic between the library and shash implementations. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:38 +11:00
Eric Biggers	df412e7efd	crypto: blake2s - remove unneeded includes It doesn't make sense for the generic implementation of BLAKE2s to include <crypto/internal/simd.h> and <linux/jump_label.h>, as these are things that would only be useful in an architecture-specific implementation. Remove these unnecessary includes. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:38 +11:00
Eric Biggers	1aa90f4cf0	crypto: x86/blake2s - define shash_alg structs using macros The shash_alg structs for the four variants of BLAKE2s are identical except for the algorithm name, driver name, and digest size. So, avoid code duplication by using a macro to define these structs. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:38 +11:00
Eric Biggers	0d396058f9	crypto: blake2s - define shash_alg structs using macros The shash_alg structs for the four variants of BLAKE2s are identical except for the algorithm name, driver name, and digest size. So, avoid code duplication by using a macro to define these structs. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:37 +11:00
Christophe JAILLET	c4ff41b93d	hwrng: ingenic - Fix a resource leak in an error handling path In case of error, we should call 'clk_disable_unprepare()' to undo a previous 'clk_prepare_enable()' call, as already done in the remove function. Fixes: `406346d222` ("hwrng: ingenic - Add hardware TRNG for Ingenic X1830") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:37 +11:00
Matthias Brugger	256693a362	hwrng: iproc-rng200 - Move enable/disable in separate function We are calling the same code for enable and disable the block in various parts of the driver. Put that code into a new function to reduce code duplication. Signed-off-by: Matthias Brugger <mbrugger@suse.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:37 +11:00
Matthias Brugger	96a6af5403	hwrng: iproc-rng200 - Fix disable of the block. When trying to disable the block we bitwise or the control register with value zero. This is confusing as using bitwise or with value zero doesn't have any effect at all. Drop this as we already set the enable bit to zero by appling inverted RNG_RBGEN_MASK. Signed-off-by: Matthias Brugger <mbrugger@suse.com> Acked-by: Scott Branden <scott.branden@broadcom.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:37 +11:00
Ard Biesheuvel	5318d3db46	crypto: arm64/aes-ctr - improve tail handling Counter mode is a stream cipher chaining mode that is typically used with inputs that are of arbitrarily length, and so a tail block which is smaller than a full AES block is rule rather than exception. The current ctr(aes) implementation for arm64 always makes a separate call into the assembler routine to process this tail block, which is suboptimal, given that it requires reloading of the AES round keys, and prevents us from handling this tail block using the 5-way stride that we use for better performance on deep pipelines. So let's update the assembler routine so it can handle any input size, and uses NEON permutation instructions and overlapping loads and stores to handle the tail block. This results in a ~16% speedup for 1420 byte blocks on cores with deep pipelines such as ThunderX2. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:37 +11:00
Ard Biesheuvel	15deb4333c	crypto: arm64/aes-ce - really hide slower algos when faster ones are enabled Commit `69b6f2e817` ("crypto: arm64/aes-neon - limit exposed routines if faster driver is enabled") intended to hide modes from the plain NEON driver that are also implemented by the faster bit sliced NEON one if both are enabled. However, the defined() CPP function does not detect if the bit sliced NEON driver is enabled as a module. So instead, let's use IS_ENABLED() here. Fixes: `69b6f2e817` ("crypto: arm64/aes-neon - limit exposed routines if ...") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:37 +11:00
Daniele Alessandrelli	5a5a27b3e1	MAINTAINERS: Add maintainers for Keem Bay OCS HCU driver Add maintainers for the Intel Keem Bay Offload Crypto Subsystem (OCS) Hash Control Unit (HCU) crypto driver. Signed-off-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com> Acked-by: Declan Murphy <declan.murphy@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:37 +11:00
Daniele Alessandrelli	b46f803688	crypto: keembay-ocs-hcu - Add optional support for sha224 Add optional support of sha224 and hmac(sha224). Co-developed-by: Declan Murphy <declan.murphy@intel.com> Signed-off-by: Declan Murphy <declan.murphy@intel.com> Signed-off-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:37 +11:00
Daniele Alessandrelli	ae832e329a	crypto: keembay-ocs-hcu - Add HMAC support Add HMAC support to the Keem Bay OCS HCU driver, thus making it provide the following additional transformations: - hmac(sha256) - hmac(sha384) - hmac(sha512) - hmac(sm3) The Keem Bay OCS HCU hardware does not allow "context-switch" for HMAC operations, i.e., it does not support computing a partial HMAC, save its state and then continue it later. Therefore, full hardware acceleration is provided only when possible (e.g., when crypto_ahash_digest() is called); in all other cases hardware acceleration is only partial (OPAD and IPAD calculation is done in software, while hashing is hardware accelerated). Co-developed-by: Declan Murphy <declan.murphy@intel.com> Signed-off-by: Declan Murphy <declan.murphy@intel.com> Signed-off-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:36 +11:00
Declan Murphy	472b04444c	crypto: keembay - Add Keem Bay OCS HCU driver Add support for the Hashing Control Unit (HCU) included in the Offload Crypto Subsystem (OCS) of the Intel Keem Bay SoC, thus enabling hardware-accelerated hashing on the Keem Bay SoC for the following algorithms: - sha256 - sha384 - sha512 - sm3 The driver is composed of two files: - 'ocs-hcu.c' which interacts with the hardware and abstracts it by providing an API following the usual paradigm used in hashing drivers / libraries (e.g., hash_init(), hash_update(), hash_final(), etc.). NOTE: this API can block and sleep, since completions are used to wait for the HW to complete the hashing. - 'keembay-ocs-hcu-core.c' which exports the functionality provided by 'ocs-hcu.c' as a ahash crypto driver. The crypto engine is used to provide asynchronous behavior. 'keembay-ocs-hcu-core.c' also takes care of the DMA mapping of the input sg list. The driver passes crypto manager self-tests, including the extra tests (CRYPTO_MANAGER_EXTRA_TESTS=y). Signed-off-by: Declan Murphy <declan.murphy@intel.com> Co-developed-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com> Signed-off-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com> Acked-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:36 +11:00
Declan Murphy	33ff64884c	dt-bindings: crypto: Add Keem Bay OCS HCU bindings Add device-tree bindings for the Intel Keem Bay Offload Crypto Subsystem (OCS) Hashing Control Unit (HCU) crypto driver. Signed-off-by: Declan Murphy <declan.murphy@intel.com> Signed-off-by: Daniele Alessandrelli <daniele.alessandrelli@intel.com> Acked-by: Mark Gross <mgross@linux.intel.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:36 +11:00
Corentin Labbe	44122cc6ee	crypto: sun4i-ss - add SPDX header and remove blank lines This patchs fixes some remaining style issue. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:36 +11:00
Corentin Labbe	b1f578b85a	crypto: sun4i-ss - enabled stats via debugfs This patch enable to access usage stats for each algorithm. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:36 +11:00
Corentin Labbe	9bc3dd24e7	crypto: sun4i-ss - fix kmap usage With the recent kmap change, some tests which were conditional on CONFIG_DEBUG_HIGHMEM now are enabled by default. This permit to detect a problem in sun4i-ss usage of kmap. sun4i-ss uses two kmap via sg_miter (one for input, one for output), but using two kmap at the same time is hard: "the ordering has to be correct and with sg_miter that's probably hard to get right." (quoting Tlgx) So the easiest solution is to never have two sg_miter/kmap open at the same time. After each use of sg_miter, I store the current index, for being able to resume sg_miter to the right place. Fixes: `6298e94821` ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-01-03 08:41:36 +11:00

1 2 3 4 5 ...

982066 Commits