To allow accelerated implementations to fall back to the generic
routines, e.g., in contexts where a SIMD based implementation is
not allowed to run, expose the generic SHA3 init/update/final
routines to other modules.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In preparation of exposing the generic SHA3 implementation to other
versions as a fallback, simplify the code, and remove an inconsistency
in the output handling (endian swabbing rsizw words of state before
writing the output does not make sense)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The way the KECCAK transform is currently coded involves many references
into the state array using indexes that are calculated at runtime using
simple but non-trivial arithmetic. This forces the compiler to treat the
state matrix as an array in memory rather than keep it in registers,
which results in poor performance.
So instead, let's rephrase the algorithm using fixed array indexes only.
This helps the compiler keep the state matrix in registers, resulting
in the following speedup (SHA3-256 performance in cycles per byte):
before after speedup
Intel Core i7 @ 2.0 GHz (2.9 turbo) 100.6 35.7 2.8x
Cortex-A57 @ 2.0 GHz (64-bit mode) 101.6 12.7 8.0x
Cortex-A53 @ 1.0 GHz 224.4 15.8 14.2x
Cortex-A57 @ 2.0 GHz (32-bit mode) 201.8 63.0 3.2x
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ensure that the input is byte swabbed before injecting it into the
SHA3 transform. Use the get_unaligned() accessor for this so that
we don't perform unaligned access inadvertently on architectures
that do not support that.
Cc: <stable@vger.kernel.org>
Fixes: 53964b9ee6 ("crypto: sha3 - Add SHA-3 hash algorithm")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Async hash operations can use result pointer in final/finup/digest,
but not in init/update/export/import, so test it for misuse.
Signed-off-by: Kamil Konieczny <k.konieczny@partner.samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
My last bugfix added -Os on the command line, which unfortunately caused
a build regression on powerpc in some configurations.
I've done some more analysis of the original problem and found slightly
different workaround that avoids this regression and also results in
better performance on gcc-7.0: -fcode-hoisting is an optimization step
that got added in gcc-7 and that for all gcc-7 versions causes worse
performance.
This disables -fcode-hoisting on all compilers that understand the option.
For gcc-7.1 and 7.2 I found the same performance as my previous patch
(using -Os), in gcc-7.0 it was even better. On gcc-8 I could see no
change in performance from this patch. In theory, code hoisting should
not be able make things better for the AES cipher, so leaving it
disabled for gcc-8 only serves to simplify the Makefile change.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Link: https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg30418.html
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
Fixes: 148b974dee ("crypto: aes-generic - build with -Os on gcc-7+")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert salsa20-asm from the deprecated "blkcipher" API to the
"skcipher" API, in the process fixing it up to use the generic helpers.
This allows removing the salsa20_keysetup() and salsa20_ivsetup()
assembly functions, which aren't performance critical; the C versions do
just fine.
This also fixes the same bug that salsa20-generic had, where the state
array was being maintained directly in the transform context rather than
on the stack or in the request context. Thus, if multiple threads used
the same Salsa20 transform concurrently they produced the wrong results.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Export the Salsa20 constants, transform context, and initialization
functions so that they can be reused by the x86 implementation.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert salsa20-generic from the deprecated "blkcipher" API to the
"skcipher" API, in the process fixing it up to be thread-safe (as the
crypto API expects) by maintaining each request's state separately from
the transform context.
Also remove the unnecessary cra_alignmask and tighten validation of the
key size by accepting only 16 or 32 bytes, not anything in between.
These changes bring the code close to the way chacha20-generic does
things, so hopefully it will be easier to maintain in the future.
However, the way Salsa20 interprets the IV is still slightly different;
that was not changed.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
While testing other changes, I discovered that gcc-7.2.1 produces badly
optimized code for aes_encrypt/aes_decrypt. This is especially true when
CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely
large stack usage that in turn might cause kernel stack overflows:
crypto/aes_generic.c: In function 'aes_encrypt':
crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger than 2048 bytes [-Wframe-larger-than=]
crypto/aes_generic.c: In function 'aes_decrypt':
crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
I verified that this problem exists on all architectures that are
supported by gcc-7.2, though arm64 in particular is less affected than
the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
stack usage but still produce worse code than earlier versions for this
file, apparently because of optimization passes that generally provide
a substantial improvement in object code quality but understandably fail
to find any shortcuts in the AES algorithm.
Possible workarounds include
a) disabling -ftree-pre and -ftree-sra optimizations, this was an earlier
patch I tried, which reliably fixed the stack usage, but caused a
serious performance regression in some versions, as later testing
found.
b) disabling UBSAN on this file or all ciphers, as suggested by Ard
Biesheuvel. This would lead to massively better crypto performance in
UBSAN-enabled kernels and avoid the stack usage, but there is a concern
over whether we should exclude arbitrary files from UBSAN at all.
c) Forcing the optimization level in a different way. Similar to a),
but rather than deselecting specific optimization stages,
this now uses "gcc -Os" for this file, regardless of the
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE/SIZE option. This is a reliable
workaround for the stack consumption on all architecture, and I've
retested the performance results now on x86, cycles/byte (lower is
better) for cbc(aes-generic) with 256 bit keys:
-O2 -Os
gcc-6.3.1 14.9 15.1
gcc-7.0.1 14.7 15.3
gcc-7.1.1 15.3 14.7
gcc-7.2.1 16.8 15.9
gcc-8.0.0 15.5 15.6
This implements the option c) by enabling forcing -Os on all compiler
versions starting with gcc-7.1. As a workaround for PR83356, it would
only be needed for gcc-7.2+ with UBSAN enabled, but since it also shows
better performance on gcc-7.1 without UBSAN, it seems appropriate to
use the faster version here as well.
Side note: during testing, I also played with the AES code in libressl,
which had a similar performance regression from gcc-6 to gcc-7.2,
but was three times slower overall. It might be interesting to
investigate that further and possibly port the Linux implementation
into that.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
Cc: Richard Biener <rguenther@suse.de>
Cc: Jakub Jelinek <jakub@gcc.gnu.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Similar to what was done for the hash API, update the AEAD API to track
whether each transform has been keyed, and reject encryption/decryption
if a key is needed but one hasn't been set.
This isn't quite as important as the equivalent fix for the hash API
because AEADs always require a key, so are unlikely to be used without
one. Still, tracking the key will prevent accidental unkeyed use.
algif_aead also had to track the key anyway, so the new flag replaces
that and slightly simplifies the algif_aead implementation.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Similar to what was done for the hash API, update the skcipher API to
track whether each transform has been keyed, and reject
encryption/decryption if a key is needed but one hasn't been set.
This isn't as important as the equivalent fix for the hash API because
symmetric ciphers almost always require a key (the "null cipher" is the
only exception), so are unlikely to be used without one. Still,
tracking the key will prevent accidental unkeyed use. algif_skcipher
also had to track the key anyway, so the new flag replaces that and
simplifies the algif_skcipher implementation.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the crypto API prevents a keyed hash from being used without
setting the key, there's no need for GHASH to do this check itself.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, almost none of the keyed hash algorithms check whether a key
has been set before proceeding. Some algorithms are okay with this and
will effectively just use a key of all 0's or some other bogus default.
However, others will severely break, as demonstrated using
"hmac(sha3-512-generic)", the unkeyed use of which causes a kernel crash
via a (potentially exploitable) stack buffer overflow.
A while ago, this problem was solved for AF_ALG by pairing each hash
transform with a 'has_key' bool. However, there are still other places
in the kernel where userspace can specify an arbitrary hash algorithm by
name, and the kernel uses it as unkeyed hash without checking whether it
is really unkeyed. Examples of this include:
- KEYCTL_DH_COMPUTE, via the KDF extension
- dm-verity
- dm-crypt, via the ESSIV support
- dm-integrity, via the "internal hash" mode with no key given
- drbd (Distributed Replicated Block Device)
This bug is especially bad for KEYCTL_DH_COMPUTE as that requires no
privileges to call.
Fix the bug for all users by adding a flag CRYPTO_TFM_NEED_KEY to the
->crt_flags of each hash transform that indicates whether the transform
still needs to be keyed or not. Then, make the hash init, import, and
digest functions return -ENOKEY if the key is still needed.
The new flag also replaces the 'has_key' bool which algif_hash was
previously using, thereby simplifying the algif_hash implementation.
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We need to consistently enforce that keyed hashes cannot be used without
setting the key. To do this we need a reliable way to determine whether
a given hash algorithm is keyed or not. AF_ALG currently does this by
checking for the presence of a ->setkey() method. However, this is
actually slightly broken because the CRC-32 algorithms implement
->setkey() but can also be used without a key. (The CRC-32 "key" is not
actually a cryptographic key but rather represents the initial state.
If not overridden, then a default initial state is used.)
Prepare to fix this by introducing a flag CRYPTO_ALG_OPTIONAL_KEY which
indicates that the algorithm has a ->setkey() method, but it is not
required to be called. Then set it on all the CRC-32 algorithms.
The same also applies to the Adler-32 implementation in Lustre.
Also, the cryptd and mcryptd templates have to pass through the flag
from their underlying algorithm.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since Poly1305 requires a nonce per invocation, the Linux kernel
implementations of Poly1305 don't use the crypto API's keying mechanism
and instead expect the key and nonce as the first 32 bytes of the data.
But ->setkey() is still defined as a stub returning an error code. This
prevents Poly1305 from being used through AF_ALG and will also break it
completely once we start enforcing that all crypto API users (not just
AF_ALG) call ->setkey() if present.
Fix it by removing crypto_poly1305_setkey(), leaving ->setkey as NULL.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the mcryptd template is used to wrap an unkeyed hash algorithm,
don't install a ->setkey() method to the mcryptd instance. This change
is necessary for mcryptd to keep working with unkeyed hash algorithms
once we start enforcing that ->setkey() is called when present.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the cryptd template is used to wrap an unkeyed hash algorithm,
don't install a ->setkey() method to the cryptd instance. This change
is necessary for cryptd to keep working with unkeyed hash algorithms
once we start enforcing that ->setkey() is called when present.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Templates that use an shash spawn can use crypto_shash_alg_has_setkey()
to determine whether the underlying algorithm requires a key or not.
But there was no corresponding function for ahash spawns. Add it.
Note that the new function actually has to support both shash and ahash
algorithms, since the ahash API can be used with either.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There seems to be a cut-n-paste bug with the name of the buffer being
free'd, xoutbuf should be used instead of axbuf.
Detected by CoverityScan, CID#1463420 ("Copy-paste error")
Fixes: 427988d981 ("crypto: tcrypt - add multibuf aead speed test")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Trivial fix to spelling mistakes in pr_err error message text.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The user space interface allows specifying the type and mask field used
to allocate the cipher. Only a subset of the possible flags are intended
for user space. Therefore, white-list the allowed flags.
In case the user space caller uses at least one non-allowed flag, EINVAL
is returned.
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When char is signed, storing the values 0xba (186) and 0xad (173) in the
`guard` array produces signed overflow. Change the type of `guard` to
static unsigned char to correct undefined behavior and reduce function
stack usage.
Signed-off-by: Joey Pabalinas <joeypabalinas@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that nothing in poly1305-generic assumes any special alignment,
remove the cra_alignmask so that the crypto API does not have to
unnecessarily align the buffers.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the only part of poly1305-generic which is assuming special
alignment is the part where the final digest is written. Switch this
over to the unaligned access macros so that we'll be able to remove the
cra_alignmask.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There is a message posted to the crypto notifier chain when an algorithm
is unregistered, and when a template is registered or unregistered. But
nothing is listening for those messages; currently there are only
listeners for the algorithm request and registration messages.
Get rid of these unused notifications for now.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reference counters should use refcount_t rather than atomic_t, since the
refcount_t implementation can prevent overflows, reducing the
exploitability of reference leak bugs. crypto_alg.cra_refcount is a
reference counter with the usual semantics, so switch it over to
refcount_t.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The performance of some aead tfm providers is affected by
the amount of parallelism possible with the processing.
Introduce an async aead concurrent multiple buffer
processing speed test to be able to test performance of such
tfm providers.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The performance of some skcipher tfm providers is affected by
the amount of parallelism possible with the processing.
Introduce an async skcipher concurrent multiple buffer
processing speed test to be able to test performance of such
tfm providers.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The multi buffer concurrent requests ahash speed test only
supported the cycles mode. Add support for the so called
jiffies mode that test performance of bytes/sec.
We only add support for digest mode at the moment.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For multiple buffers speed tests, the number of buffers, or
requests, used actually sets the level of parallelism a tfm
provider may utilize to hide latency. The existing number
(of 8) is good for some software based providers but not
enough for many HW providers with deep FIFOs.
Add a module parameter that allows setting the number of
multiple buffers/requests used, leaving the default at 8.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AEAD speed test pretended to support decryption, however that support
was broken as decryption requires a valid auth field which the test did
not provide.
Fix this by running the encryption path once with inout/output sgls
switched to calculate the auth field prior to performing decryption
speed tests.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The multi buffer ahash speed test was allocating multiple
buffers for use with the multiple outstanding requests
it was starting but never actually using them (except
to free them), instead using a different single statically
allocated buffer for all requests.
Fix this by actually using the allocated buffers for the test.
It is noted that it may seem tempting to instead remove the
allocation and free of the multiple buffers and leave things as
they are since this is a hash test where the input is read
only. However, after consideration I believe that multiple
buffers better reflect real life scenario with regard
to data cache and TLB behaviours etc.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch remove two unused variable and some dead "code" using it.
Fixes: 92932d03c2 ("crypto: seqiv - Remove AEAD compatibility code")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch remove two unused variable and some dead "code" using it.
Fixes: 66008d4230 ("crypto: echainiv - Remove AEAD compatibility code")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The comment in gf128mul_x8_ble() was copy-and-pasted from gf128mul.h and
makes no sense in the new context. Remove it.
Cc: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since commit 499a66e6b6 ("crypto: null - Remove default null
blkcipher"), crypto_get_default_null_skcipher2() and
crypto_put_default_null_skcipher2() are the same as their non-2
equivalents. So switch callers of the "2" versions over to the original
versions and remove the "2" versions.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto_larval_lookup() is not used outside of crypto/api.c, so unexport
it and mark it 'static'.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
pcrypt is using the old way of freeing instances, where the ->free()
method specified in the 'struct crypto_template' is passed a pointer to
the 'struct crypto_instance'. But the crypto_instance is being
kfree()'d directly, which is incorrect because the memory was actually
allocated as an aead_instance, which contains the crypto_instance at a
nonzero offset. Thus, the wrong pointer was being kfree()'d.
Fix it by switching to the new way to free aead_instance's where the
->free() method is specified in the aead_instance itself.
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 0496f56065 ("crypto: pcrypt - Add support for new AEAD interface")
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This variable was increased and decreased without any protection.
Result was an occasional misscount and negative wrap around resulting
in false resource allocation failures.
Fixes: 7d2c3f54e6 ("crypto: af_alg - remove locking in async callback")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If the rfc7539 template was instantiated with a hash algorithm with
digest size larger than 16 bytes (POLY1305_DIGEST_SIZE), then the digest
overran the 'tag' buffer in 'struct chachapoly_req_ctx', corrupting the
subsequent memory, including 'cryptlen'. This caused a crash during
crypto_skcipher_decrypt().
Fix it by, when instantiating the template, requiring that the
underlying hash algorithm has the digest size expected for Poly1305.
Reproducer:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int algfd, reqfd;
struct sockaddr_alg addr = {
.salg_type = "aead",
.salg_name = "rfc7539(chacha20,sha256)",
};
unsigned char buf[32] = { 0 };
algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(algfd, (void *)&addr, sizeof(addr));
setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, sizeof(buf));
reqfd = accept(algfd, 0, 0);
write(reqfd, buf, 16);
read(reqfd, buf, 16);
}
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 71ebc4d1b2 ("crypto: chacha20poly1305 - Add a ChaCha20-Poly1305 AEAD construction, RFC7539")
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The cryptd_max_cpu_qlen module parameter is local to the source and does
not need to be in global scope, so make it static.
Cleans up sparse warning:
crypto/cryptd.c:35:14: warning: symbol 'cryptd_max_cpu_qlen' was not
declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When invoking an asynchronous cipher operation, the invocation of the
callback may be performed before the subsequent operations in the
initial code path are invoked. The callback deletes the cipher request
data structure which implies that after the invocation of the
asynchronous cipher operation, this data structure must not be accessed
any more.
The setting of the return code size with the request data structure must
therefore be moved before the invocation of the asynchronous cipher
operation.
Fixes: e870456d8e ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6a ("crypto: algif_aead - overhaul memory management")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
mcryptd_enqueue_request() grabs the per-CPU queue struct and protects
access to it with disabled preemption. Then it schedules a worker on the
same CPU. The worker in mcryptd_queue_worker() guards access to the same
per-CPU variable with disabled preemption.
If we take CPU-hotplug into account then it is possible that between
queue_work_on() and the actual invocation of the worker the CPU goes
down and the worker will be scheduled on _another_ CPU. And here the
preempt_disable() protection does not work anymore. The easiest thing is
to add a spin_lock() to guard access to the list.
Another detail: mcryptd_queue_worker() is not processing more than
MCRYPTD_BATCH invocation in a row. If there are still items left, then
it will invoke queue_work() to proceed with more later. *I* would
suggest to simply drop that check because it does not use a system
workqueue and the workqueue is already marked as "CPU_INTENSIVE". And if
preemption is required then the scheduler should do it.
However if queue_work() is used then the work item is marked as CPU
unbound. That means it will try to run on the local CPU but it may run
on another CPU as well. Especially with CONFIG_DEBUG_WQ_FORCE_RR_CPU=y.
Again, the preempt_disable() won't work here but lock which was
introduced will help.
In order to keep work-item on the local CPU (and avoid RR) I changed it
to queue_work_on().
Cc: stable@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The wait for data is a non-atomic operation that can sleep and therefore
potentially release the socket lock. The release of the socket lock
allows another thread to modify the context data structure. The waiting
operation for new data therefore must be called at the beginning of
recvmsg. This prevents a race condition where checks of the members of
the context data structure are performed by recvmsg while there is a
potential for modification of these values.
Fixes: e870456d8e ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6a ("crypto: algif_aead - overhaul memory management")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All the ChaCha20 algorithms as well as the ARM bit-sliced AES-XTS
algorithms call skcipher_walk_virt(), then access the IV (walk.iv)
before checking whether any bytes need to be processed (walk.nbytes).
But if the input is empty, then skcipher_walk_virt() doesn't set the IV,
and the algorithms crash trying to use the uninitialized IV pointer.
Fix it by setting the IV earlier in skcipher_walk_virt(). Also fix it
for the AEAD walk functions.
This isn't a perfect solution because we can't actually align the IV to
->cra_alignmask unless there are bytes to process, for one because the
temporary buffer for the aligned IV is freed by skcipher_walk_done(),
which is only called when there are bytes to process. Thus, algorithms
that require aligned IVs will still need to avoid accessing the IV when
walk.nbytes == 0. Still, many algorithms/architectures are fine with
IVs having any alignment, and even for those that aren't, a misaligned
pointer bug is much less severe than an uninitialized pointer bug.
This change also matches the behavior of the older blkcipher_walk API.
Fixes: 0cabf2af6f ("crypto: skcipher - Fix crash on zero-length input")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
->pkey_algo used to be an enum, but was changed to a string by commit
4e8ae72a75 ("X.509: Make algo identifiers text instead of enum"). But
two comparisons were not updated. Fix them to use strcmp().
This bug broke signature verification in certain configurations,
depending on whether the string constants were deduplicated or not.
Fixes: 4e8ae72a75 ("X.509: Make algo identifiers text instead of enum")
Cc: <stable@vger.kernel.org> # v4.6+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Use crypto_shash_digest() instead of crypto_shash_init() followed by
crypto_shash_finup(). (For simplicity only; they are equivalent.)
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>