Integrate with the newly added crypto engine to make the crypto hardware
engine underutilized as each block needs to be processed before the crypto
hardware can start working on the next block.
The requests from dm-crypt will be listed into engine queue and processed
by engine automatically, so remove the 'queue' and 'queue_task' things in
omap aes driver.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use BIT()/GENMASK() macros for all register definitions instead of
hand-writing bit masks.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
AES_CTRL_REG is used to configure AES mode. Before configuring
any mode we need to make sure all other modes are reset or else
driver will misbehave. So mask all modes before configuring
any AES mode.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Increasing the priority of omap-aes hw algos, in order to take
precedence over sw algos.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Algo self tests are failing for CTR mode with omap-aes driver,
giving the following error:
[ 150.053644] omap_aes_crypt: request size is not exact amount of AES blocks
[ 150.061262] alg: skcipher: encryption failed on test 5 for ctr-aes-omap: ret=22
This is because the input length is not aligned with AES_BLOCK_SIZE.
Adding support for omap-aes driver for inputs with length not aligned
with AES_BLOCK_SIZE.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For cases where total length of an input SGs is not same as
length of the input data for encryption, omap-aes driver
crashes. This happens in the case when IPsec is trying to use
omap-aes driver.
To avoid this, we copy all the pages from the input SG list
into a contiguous buffer and prepare a single element SG list
for this buffer with length as the total bytes to crypt, which is
similar thing that is done in case of unaligned lengths.
Fixes: 6242332ff2 ("crypto: omap-aes - Add support for cases of unaligned lengths")
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Modify crypto drivers to use the generic SG helper since
both of them are equivalent and the one from crypto is redundant.
See also:
468577abe3 reverted in
b2ab4a57b0
Signed-off-by: Cristian Stoica <cristian.stoica@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use SIMPLE_DEV_PM_OPS macro in order to make the code simpler.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AES driver currently assumes that pm_runtime_get_sync will always
succeed, which may not always be true, so add error handling for the
same.
This scenario was reported in the following bug:
place. https://bugzilla.kernel.org/show_bug.cgi?id=66441
Reported-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
NIST vectors for CTR mode in testmgr.h assume the entire IV as the counter. To
get correct results that match the output of these vectors, we need to set the
counter length correctly.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Local symbols used only in this file are made static.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use devm_kzalloc instead of kzalloc. With this change, there is no need to
call kfree in error/exit paths.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For cases where offset/length of on any page of the input SG is not aligned by
AES_BLOCK_SIZE, we copy all the pages from the input SG list into a contiguous
buffer and prepare a single element SG list for this buffer with length as the
total bytes to crypt.
This is requried for cases such as when an SG list of 16 bytes total size
contains 16 pages each containing 1 byte. DMA using the direct buffers of such
instances is not possible.
For this purpose, we first detect if the unaligned case and accordingly
allocate enough number of pages to satisfy the request and prepare SG lists.
We then copy data into the buffer, and copy data out of it on completion.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In cases where requesting for DMA channels fails for some reason, or channel
numbers are not provided in DT or platform data, we switch to PIO-only mode
also checking if platform provides IRQ numbers and interrupt register offsets
in DT and platform data. All dma-only paths are avoided in this mode.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We initialize the scatter gather walk lists needed for PIO mode and avoid all
DMA paths such as mapping/unmapping buffers by checking for the pio_only flag.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We add an IRQ handler that implements a state-machine for PIO-mode and data
structures for walking the scatter-gather list. The IRQ handler is called in
succession both when data is available to read or next data can be sent for
processing. This process continues till the entire in/out SG lists have been
walked. Once the SG-list has been completely walked, the IRQ handler schedules
the done_task tasklet.
Also add a useful macro that is used through out the IRQ code for a common
pattern of calculating how much an SG list has been walked. This improves code
readability and avoids checkpatch errors.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add IRQ information to pdata and helper macros. These are required
for PIO-mode support.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Intermdiate buffers were allocated, mapped and used for DMA. These are no
longer required as we use the SGs from crypto layer directly in previous
commits in the series. Also along with it, remove the logic for copying SGs
etc as they are no longer used, and all the associated variables in omap_aes_device.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Earlier functions that did a similar sync are replaced by the dma_sync_sg_*
which can operate on entire SG list.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In early version of this driver, assumptions were made such as DMA layer
requires contiguous buffers etc. Due to this, new buffers were allocated,
mapped and used for DMA. These assumptions are no longer true and DMAEngine
scatter-gather DMA doesn't have such requirements. We simply the DMA operations
by directly using the scatter-gather buffers provided by the crypto layer
instead of creating our own.
Lot of logic that handled DMA'ing only X number of bytes of the total, or as
much as fitted into a 3rd party buffer is removed and is no longer required.
Also, good performance improvement of atleast ~20% seen with encrypting a
buffer size of 8K (1800 ops/sec vs 1400 ops/sec). Improvement will be higher
for much larger blocks though such benchmarking is left as an exercise for the
reader. Also DMA usage is much more simplified and coherent with rest of the
code.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Crypto layer only passes nbytes but number of SG elements is needed for mapping
or unmapping SGs at one time using dma_map* API and also needed to pass in for
dmaengine prep function.
We call function added to scatterwalk for this purpose in omap_aes_handle_queue
to populate the values which are used later.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When DEBUG is enabled, these macros can be used to print variables in integer
and hex format, and clearly display which registers, offsets and values are
being read/written , including printing the names of the offsets and their values.
Using statement expression macros in read path as,
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Calling runtime PM API for every block causes serious perf hit to
crypto operations that are done on a long buffer.
As crypto is performed on a page boundary, encrypting large buffers can
cause a series of crypto operations divided by page. The runtime PM API
is also called those many times.
We call runtime_pm_get_sync only at beginning on the session (cra_init)
and runtime_pm_put at the end. This result in upto a 50% speedup as below.
This doesn't make the driver to keep the system awake as runtime get/put
is only called during a crypto session which completes usually quickly.
Before:
root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 13310 aes-128-cbc's in 0.01s
Doing aes-128-cbc for 3s on 64 size blocks: 13040 aes-128-cbc's in 0.04s
Doing aes-128-cbc for 3s on 256 size blocks: 9134 aes-128-cbc's in 0.03s
Doing aes-128-cbc for 3s on 1024 size blocks: 8939 aes-128-cbc's in 0.01s
Doing aes-128-cbc for 3s on 8192 size blocks: 4299 aes-128-cbc's in 0.00s
After:
root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 18911 aes-128-cbc's in 0.02s
Doing aes-128-cbc for 3s on 64 size blocks: 18878 aes-128-cbc's in 0.02s
Doing aes-128-cbc for 3s on 256 size blocks: 11878 aes-128-cbc's in 0.10s
Doing aes-128-cbc for 3s on 1024 size blocks: 11538 aes-128-cbc's in 0.05s
Doing aes-128-cbc for 3s on 8192 size blocks: 4857 aes-128-cbc's in 0.03s
While at it, also drop enter and exit pr_debugs, in related code. tracers
can be used for that.
Tested on a Beaglebone (AM335x SoC) board.
Signed-off-by: Joel A Fernandes <joelagnel@ti.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Replace calls to deprecated devm_request_and_ioremap by devm_ioremap_resource.
Found with coccicheck and this semantic patch:
scripts/coccinelle/api/devm_request_and_ioremap.cocci.
Signed-off-by: Laurent Navet <laurent.navet@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
module_platform_driver() makes the code simpler by eliminating boilerplate
code.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
After DMA is complete, the omap_aes_finish_req function is called as
a part of the done_task tasklet. During this its atomic and any calls
to pm functions should not assume they wont sleep.
The patch replaces a call to pm_runtime_put_sync (which can sleep) with
pm_runtime_put thus fixing a kernel panic observed on AM33xx SoC during
AES operation.
Tested on an AM33xx SoC device (beaglebone board).
To reproduce the problem, I used the tcrypt kernel module as:
modprobe tcrypt sec=2 mode=500
Signed-off-by: Joel A Fernandes <joelagnel@ti.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The OMAP3 and OMAP4/AM33xx versions of the AES crypto
module support the CTR algorithm in addition to ECB
and CBC that the OMAP2 version of the module supports.
So, OMAP2 and OMAP3 share a common register set but
OMAP3 supports CTR while OMAP2 doesn't. OMAP4/AM33XX
uses a different register set from OMAP2/OMAP3 and
also supports CTR.
To add this support, use the platform_data introduced
in an ealier commit to hold the list of algorithms
supported by the current module. The probe routine
will use that list to register the correct algorithms.
Note: The code being integrated is from the TI AM33xx SDK
and was written by Greg Turner <gkmturner@gmail.com> and
Herman Schuurman (current email unknown) while at TI.
CC: Greg Turner <gkmturner@gmail.com>
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for the OMAP4 version of the AES module
that is present on OMAP4 and AM33xx SoCs.
The modules have several differences including register
offsets and how DMA is triggered. To handle these
differences, a platform_data structure is defined and
contains routine pointers, register offsets, and bit
offsets within registers. OMAP2/OMAP3-specific routines
are suffixed with '_omap2' and OMAP4/AM33xx routines are
suffixed with '_omap4'.
Note: The code being integrated is from the TI AM33xx SDK
and was written by Greg Turner <gkmturner@gmail.com> and
Herman Schuurman (current email unknown) while at TI.
CC: Greg Turner <gkmturner@gmail.com>
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the dma_request_slave_channel_compat() call instead of
the dma_request_channel() call to request a DMA channel.
This allows the omap-aes driver use different DMA engines.
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add Device Tree suport to the omap-aes crypto
driver. Currently, only support for OMAP2 and
OMAP3 is being added but support for OMAP4 will
be added in a subsequent patch.
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove usage of the private OMAP DMA API.
The dmaengine API will be used instead.
CC: Russell King <rmk+kernel@arm.linux.org.uk>
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add code to use the new dmaengine API alongside
the existing DMA code that uses the private
OMAP DMA API. The API to use is chosen by
defining or undefining 'OMAP_AES_DMA_PRIVATE'.
CC: Russell King <rmk+kernel@arm.linux.org.uk>
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add suspend/resume support to the OMAP AES driver.
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the omap-aes crypto driver to use the
pm_runtime API instead of the clk API.
CC: Kevin Hilman <khilman@deeprootsystems.com>
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AES controller only needs to be reset once and that will
be done by the hwmod infrastructure, if possible. Therefore,
remove the reset code from the omap-aes driver.
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove the unnecessary pr_info() calls from omap_aes_probe()
and omap_aes_mod_init().
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Based on earlier discussions[1] we attempted to find a suitable
location for the omap DMA header in commit 2b6c4e73 (ARM: OMAP:
DMA: Move plat/dma.h to plat-omap/dma-omap.h) until the conversion
to dmaengine is complete.
Unfortunately that was before I was able to try to test compile
of the ARM multiplatform builds for omap2+, and the end result
was not very good.
So I'm creating yet another all over the place patch to cut the
last dependency for building omap2+ for ARM multiplatform. After
this, we have finally removed the driver dependencies to the
arch/arm code, except for few drivers that are being worked on.
The other option was to make the <plat-omap/dma-omap.h> path
to work, but we'd have to add some new header directory to for
multiplatform builds.
Or we would have to manually include arch/arm/plat-omap/include
again from arch/arm/Makefile for omap2+.
Neither of these alternatives sound appealing as they will
likely lead addition of various other headers exposed to the
drivers, which we want to avoid for the multiplatform kernels.
Since we already have a minimal include/linux/omap-dma.h,
let's just use that instead and add a note to it to not
use the custom omap DMA functions any longer where possible.
Note that converting omap DMA to dmaengine depends on
dmaengine supporting automatically incrementing the FIFO
address at the device end, and converting all the remaining
legacy drivers. So it's going to be few more merge windows.
[1] https://patchwork.kernel.org/patch/1519591/#
cc: Russell King <linux@arm.linux.org.uk>
cc: Kevin Hilman <khilman@ti.com>
cc: "Benoît Cousson" <b-cousson@ti.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Vinod Koul <vinod.koul@intel.com>
cc: Dan Williams <djbw@fb.com>
cc: Mauro Carvalho Chehab <mchehab@infradead.org>
cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
cc: David Woodhouse <dwmw2@infradead.org>
cc: Kyungmin Park <kyungmin.park@samsung.com>
cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
cc: Hans Verkuil <hans.verkuil@cisco.com>
cc: Vaibhav Hiremath <hvaibhav@ti.com>
cc: Lokesh Vutla <lokeshvutla@ti.com>
cc: Rusty Russell <rusty@rustcorp.com.au>
cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
cc: Afzal Mohammed <afzal@ti.com>
cc: linux-crypto@vger.kernel.org
cc: linux-media@vger.kernel.org
cc: linux-mtd@lists.infradead.org
cc: linux-usb@vger.kernel.org
cc: linux-fbdev@vger.kernel.org
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Drivers should not use cpu_is_omap or cpu_class_is_omap macros,
they should be private to the platform init code. And we'll be
removing plat/cpu.h and only have a private soc.h for the
arch/arm/*omap* code.
This patch is intended as preparation for the core omap changes
and removes the need to include plat/cpu.h from several drivers.
This is needed for the ARM common zImage support.
These changes are OK to do because:
- omap-rng.c does not need plat/cpu.h
- omap-aes.c and omap-sham.c get the proper platform_data
passed to them so they don't need extra checks in the driver
- omap-dma.c and omap-pcm.c can test the arch locally as
omap1 and omap2 cannot be compiled together because of
conflicting compiler flags
Cc: Deepak Saxena <dsaxena@plexity.net>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Venkatraman S <svenkatr@ti.com>
Cc: Chris Ball <cjb@laptop.org>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Dan Williams <djbw@fb.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Cc: Liam Girdwood <lrg@ti.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-mmc@vger.kernel.org
Cc: alsa-devel@alsa-project.org
Cc: linux-kernel@vger.kernel.org
[tony@atomide.com: mmc changes folded in to an earlier patch]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Move plat/dma.h to plat-omap/dma-omap.h as part of single
zImage work
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Initialization of cra_list is currently mixed, most ciphers initialize this
field and most shashes do not. Initialization however is not needed at all
since cra_list is initialized/overwritten in __crypto_register_alg() with
list_add(). Therefore perform cleanup to remove all unneeded initializations
of this field in 'crypto/drivers/'.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-geode@lists.infradead.org
Cc: Michal Ludvig <michal@logix.cz>
Cc: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Cc: Varun Wadekar <vwadekar@nvidia.com>
Cc: Eric Bénard <eric@eukrea.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: Kent Yoder <key@linux.vnet.ibm.com>
Acked-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The added CRYPTO_ALG_KERN_DRIVER_ONLY indicates whether a cipher
is only available via a kernel driver. If the cipher implementation
might be available by using an instruction set or by porting the
kernel code, then it must not be set.
Signed-off-by: Nikos Mavrogiannopoulos <nmav@gnutls.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
clk_get() returns a struct clk cookie to the driver and some platforms
may return NULL if they only support a single clock. clk_get() has only
failed if it returns a ERR_PTR() encoded pointer.
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-and-tested-by: Tobias Karnat <tobias.karnat@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
AES module was initialized for every DMA transaction.
That is redundant.
Now it is initialized once per request.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Key and IV should always be set before AES operation.
So no need to check if it has changed or not.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Previous version had not error handling.
Request could remain uncompleted.
Also in the case of DMA error, FLAGS_INIT is unset
and accelerator will be initialized again.
Buffer size allignment is checked.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>