Use a single cache for all sed allocations. No need to make it per
channel. This also avoids the slub_debug warnings for multiple caches
with the same name.
Switching to dmam_pool_create() to fix leaking the dma pools on
initialization failure and lets us kill ioat3_dma_remove().
Cc: Dave Jiang <dave.jiang@intel.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
There have never been any real users of MEMSET operations since they
have been introduced in January 2007 by commit 7405f74bad ("dmaengine:
refactor dmaengine around dma_async_tx_descriptor"). Therefore remove
support for them for now, it can be always brought back when needed.
[sebastian.hesselbarth@gmail.com: fix drivers/dma/mv_xor]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Acked-by: Dan Williams <djbw@fb.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Olof Johansson <olof@lixom.net>
Cc: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
v3.3 introduced 16 sources PQ operations. This also introduced super extended
descriptors to support the 16 srcs operations. This patch adds support for
the 16 sources ops and in turn adds the super extended descriptors for those
ops.
5 SED pools are created depending on the descriptor sizes. An SED can be a 64
bytes sized descriptor or larger and must be physically contiguous. A kmem
cache pool is created for allocating the software descriptor that manages the
hardware descriptor. The super extended descriptor will take place of extended
descriptor under certain operations and be "attached" to the op descriptor
during operation. This is a new feature for ioatdma v3.3.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Dan Williams <djbw@fb.com>
Acked-by: Dan Williams <djbw@fb.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
markings need to be removed.
This change removes the use of __devinit, __devexit_p, __devinitconst,
and __devexit from these drivers.
Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Viresh Kumar <viresh.linux@gmail.com>
Cc: Dan Williams <djbw@fb.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Barry Song <baohua.song@csr.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Jassi Brar <jassisinghbrar@gmail.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The alloc order can be up to 16 and 1 << 16 will over flow the 16bit
integer. Change the appropriate variables to 16bit to avoid overflow.
Reported-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Use separate locks for the descriptor prep (producer) and descriptor
cleanup (consumer) paths. Allows the producer path to run concurrently
with the cleanup path. Inspired by Documentation/circular-buffer.txt.
Cc: David Howells <dhowells@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If the calling convention of ->timer_fn() and ->cleanup_fn() are unified
across hardware versions we can drop parameters to ioat_init_channel() and
unify ioat_is_dma_complete() implementations.
Both ->timer_fn() and ->cleanup_fn() are modified to expect a struct
dma_chan pointer.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The pending == 2 case no longer exists in the driver so, we can use
ioat2_ring_pending() outside the lock to determine if there might be any
descriptors in the ring that the hardware has not seen.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Put the ioat2 and ioat3 state machines in the halted state with all
errors cleared.
The ioat1 init path is not disturbed for stability, there are no
reported ioat1 initiaization issues.
Cc: <stable@kernel.org>
Reported-by: Roland Dreier <rdreier@cisco.com>
Tested-by: Roland Dreier <rdreier@cisco.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
All the necessary fields for handling an ioat2,3 ring entry can fit into
one cacheline. Move ->len prior to ->txd in struct ioat_ring_ent, and
move allocation of these entries to a hw-cache-aligned kmem cache to
reduce the number of cachelines dirtied for descriptor management.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The cleanup routine for the raid cases imposes extra checks for handling
raid descriptors and extended descriptors. If the channel does not
support raid it can avoid this extra overhead by using the ioat2 cleanup
path.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ioat3.2 adds xor offload support for up to 8 sources. It can also
perform an xor-zero-sum operation to validate whether all given sources
sum to zero, without writing to a destination. Xor descriptors differ
from memcpy in that one operation may require multiple descriptors
depending on the number of sources. When the number of sources exceeds
5 an extended descriptor is needed. These descriptors need to be
accounted for when updating the DMA_COUNT register.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Export driver attributes for diagnostic purposes:
'ring_size': total number of descriptors available to the engine
'ring_active': number of descriptors in-flight
'capabilities': supported operation types for this channel
'version': Intel(R) QuickData specfication revision
This also allows some chattiness to be removed from the driver startup
as this information is now available via sysfs.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Up until this point the driver for Intel(R) QuickData Technology
engines, specification versions 2 and 3, were mostly identical save for
a few quirks. Version 3.2 hardware adds many new capabilities (like
raid offload support) requiring some infrastructure that is not relevant
for v2. For better code organization of the new funcionality move v3
and v3.2 support to its own file dma_v3.c, and export some routines from
the base files (dma.c and dma_v2.c) that can be reused directly.
The first new capability included in this code reorganization is support
for v3.2 memset operations.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ioat3.2 adds raid5 and raid6 offload capabilities.
Signed-off-by: Tom Picard <tom.s.picard@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Increment the allocation order of the descriptor ring every time we run
out of descriptors up to a maximum of allocation order specified by the
module parameter 'ioat_max_alloc_order'. After each idle period
decrement the allocation order to a minimum order of
'ioat_ring_alloc_order' (i.e. the default ring size, tunable as a module
parameter).
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In order to support dynamic resizing of the descriptor ring or polling
for a descriptor in the presence of a hung channel the reset handler
needs to make progress while in a non-preemptible context. The current
workqueue implementation precludes polling channel reset completion
under spin_lock().
This conversion also allows us to return to opportunistic cleanup in the
ioat2 case as the timer implementation guarantees at least one cleanup
after every descriptor is submitted. This means the worst case
completion latency becomes the timer frequency (for exceptional
circumstances), but with the benefit of avoiding busy waiting when the
lock is contended.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Mark all single use initialization routines with __devinit.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Provide some output for debugging the driver.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Replace the current linked list munged into a ring with a native ring
buffer implementation. The benefit of this approach is reduced overhead
as many parameters can be derived from ring position with simple pointer
comparisons and descriptor allocation/freeing becomes just a
manipulation of head/tail pointers.
It requires a contiguous allocation for the software descriptor
information.
Since this arrangement is significantly different from the ioat1 chain,
move ioat2,3 support into its own file and header. Common routines are
exported from driver/dma/ioat/dma.[ch].
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>