I had this in my patchset to add target reset support, but
it got dropped due to patching conflicts. This initial patch
just renames the function and users. We are actually just
dropping the session, and so this does not have anything to do
with the host exactly. It does for software iscsi because
we allocate a host per session, but for cxgb3i this makes no
sense.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the driver knows when hardware is removed like with cxgb3i,
bnx2i, qla4xxx and iser then we will want to remove the sessions/devices
that are bound to that device before removing the host.
cxgb3i and in the future bnx2i will remove the host and that will
remove all the sessions on the hba. iser can call iscsi_kill_session
when it gets an event that indicates that a hca is removed.
And when qla4xxx is hooked in to the lib (it is only hooked into
the class right now) it can call iscsi remove host like the
partial offload card drivers.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi_tcp was updating the exp_statsn (exp_statsn acknowledges
status and tells the target it is ok to let the resources for
a iscsi pdu to be reused) before it got all the data for pdu read
into OS buffers. Data corruption was occuring if something happens
to a packet and the network layer requests a retransmit, and the
initiator has told the target about the udpated exp_statsn ack,
then the target may be sending data from a buffer it has reused
for a new iscsi pdu. This fixes the problem by having the LLD
(iscsi_tcp in this case) just handle the transferring of data, and
has libiscsi handle the processing of status (libiscsi completion
processing is done after LLD data transfers are complete).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch fixes two bugs that are related.
1. Old tools did not set can_queue/cmds_max. This patch modifies
libiscsi so that when we add the host we catch this and set it
to the default.
2. iscsi_tcp thought that the scsi command that was passed to
the eh functions needed a iscsi_cmd_task allocated for it. It
only needed a mgmt task, and now it does not matter since it
all comes from the same pool and libiscsi handles this for the
drivers. ib_iser had copied iscsi_tcp's code and set can_queue
to its max - 1 to handle this. So this patch removes the max -1,
and just sets it to the max.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Let through upto the largest command of 260 defined by the scsi standard.
iscsi core supports this already. Now that the scsi-ml supports it we can
start using large commands.
[jejb:rejections fixed up]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
__FUNCTION__ is gcc-specific, use __func__
(Update diff by Mike Christie)
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The recv lock was defined so the iscsi layer could block
the recv path from processing IO during recovery. It
turns out iser just set a lock to that pointer which was pointless.
We now disconnect the transport connection before doing recovery
so we do not need the recv lock. For iscsi_tcp we still stop
the recv path incase older tools are being used.
This patch also has iscsi_itt_to_ctask user grab the session lock
and has the caller access the task with the lock or get a ref
to it in case the target is broken and sends a tmf success response
then sends data or a response for the command that was supposed to
be affected bty the tmf.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This adds two new attrs used for creating initiator ports and
binding sessions to hardware.
The session level initiatorname:
Since bnx2i does a scsi_host per host device, we need to add the
iface initiator port settings on the session, so we can create
multiple initiator ports (each with different inames) per device/scsi_host.
The current iname reflects that qla4xxx can have one iname per hba, and we are
allocating a host per session for software. The iname on the host will
remain so we can export and set the hba level qla4xxx setting.
The ifacename attr:
To bind a session to a some peice of hardware in userspace we maintain
some mappings, but during boot or iscsid restart (iscsid contains the user
space part of the driver) we need to be able to figure out which of those
host mappings abstractions maps to certain sessions. This patch adds
a ifacename attr, which userspace can set to id the host side of the
endpoint across pivot_roots and iscsid restarts.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi_tcp creates its ep in userspace using sockets because
it is virtual, so we just check if we are sent a ep and fail
if we are.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently we duplicate the list of sessions, because we were using the
test for if a session was on the host list to indicate if the session
was bound or unbound. We can instead use the target_id and fix up
the class so that drivers like bnx2i do not have to manage the target id
space.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This converts iscsi_tcp to use the iscsi_task name.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Convert iscsi_tcp to support merged tasks.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently to get a ctask from the session cmd array, you have to
know to use the itt modifier. To make this easier on LLDs and
so in the future we can easilly kill the session array and use
the host shared map instead, this patch adds a nice wrapper
to strip the itt into a session->cmds index and return a ctask.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This removes the session and conn data_size fields from the iscsi_transport.
Just pass in the value like with host allocation. This patch also makes
it so the LLD iscsi_conn data is allocated with the iscsi_cls_conn.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This finishes the host/session unbinding, by adding some helpers
to add and remove hosts and the session they manage.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i allocates a host per netdevice but will use libiscsi,
so this unbinds the session from the host in that code.
This will also be useful for the iser parent device dma settings
fixes.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
max_cmd_len and max_conn are not really used. max_cmd_len is
always 16 and can be set by the LLD. max_conn is always one
since we do not support MCS.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi offload (bnx2i and qla4xx) allocate a scsi host per hba,
so the session creation path needs a shost/host_no argument.
Software iscsi/iser will follow the same behabior as before
where it allcoates a host per session, but in the future iser
will probably look more like bnx2i where the host's parent is
the hardware (rnic for iser and for bnx2i it is the nic), because
it does not use a socket layer like how iscsi_tcp does.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
access the right scsi_in() and/or scsi_out() side of things.
also for resid
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Reviewed-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Some iscsi class messages have the dev_printk prefix and some libiscsi
and iscsi_tcp messages have "iscsi" or the module name as a prefix which
is normally pretty useless when trying to figure out which session
or connection the message is attached to. This patch adds iscsi lib
and class dev_printks so all messages have a common prefix that
can be used to figure out which object printed it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
With the sg table code, every SCSI driver is now either chain capable
or broken (or has sg_tablesize set so chaining is never activated), so
there's no need to have a check in the host template.
Also tidy up the code by moving the scatterlist size defines into the
SCSI includes and permit the last entry of the scatterlist pools not
to be a power of two.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we negotiate for X r2ts we have to use only X r2ts. We cannot
round up (we could send less though). It is ok to fail if it
is not something the driver can handle, so this patch just does
that.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi_data_rsp needs to hold the sesison lock when it calls
iscsi_update_cmdsn.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The previous patches converted iscsi_tcp to support sg chaining.
This patch sets the proper flags and sets sg_table size to
4096. This allows fs io to be capped at max_sectors, but passthrough
IO to be limited by some other part of the kernel.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
A target should never send us a itt that does not match a running
task. If it does we do not really know what is coming down after the header,
unless we evaluate the hdr and do some guessing sometimes. However,
even if we know what is coming we probably do not have buffers for it or we
cannot respond (if it is a r2t for example), so just drop the session.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi_r2t_rsp checks the incoming R2T for sanity, and if it
thinks it's fishy, it will drop it silently. In this case, we
leaked an r2t_info object. If we do this often enough, we run
into a BUG_ON some time later.
Removed r2t wrappers and update patch by Mike Christie
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Convert xmit to iscsi chunks.
from michaelc@cs.wisc.edu:
Bug fixes, more digest integration, sg chaining conversion and other
sg wrapper changes, coding style sync up, and removal of io fields,
like pdu_sent, that are not needed.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
During root boot and shutdown the target could send us nops.
At this time iscsid cannot be running, so the target will drop
the session and the boot or shutdown will hang.
To handle this and allow us to better control when to check the network
this patch moves the nop handling to the kernel.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is not need to block the session during logout. Since
we are going to fail the commands that were blocked just fail them
immediately instead.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi_pool_init simplified
iscsi_pool_init currently has a lot of duplicate kfree() calls it does
when some allocation fails. This patch simplifies the code a little by
using iscsi_pool_free to tear down the pool in case of an error.
iscsi_pool_init also returns a copy of the item array to the caller.
Not all callers use this array, so we make it optional.
Instead of allocating a second array and return that, allocate just one
array, of twice the size.
Update users of iscsi_pool_{init,free}
This patch drops the (now useless) second argument to
iscsi_pool_free, and updates all callers.
It also removes the ctask->r2ts array, which was never
used anyway. Since the items argument to iscsi_pool_init
is now optional, we can pass NULL instead.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
at libiscsi generic code
- currently code assumes a storage space of pdu header is allocated
at llds ctask and is pointed to by iscsi_cmd_task->hdr. Here I add
a hdr_max field pertaining to that storage, and an hdr_len that
accumulates the current use of the pdu-header.
- Add an iscsi_next_hdr() inline which returns the next free space
to write new Header at. Also iscsi_next_hdr() is used to retrieve
the address at which to write the header-digest.
- Add iscsi_add_hdr(length). What the user do is calls iscsi_next_hdr()
for address of the new header, than calls iscsi_add_hdr(length) with
the size of the new header. iscsi_add_hdr() will check if space is
available and update to the new size. length must be padded according
to standard.
- Add 2 padding inline helpers thanks to Olaf. Current patch does not
use them but Following patches will.
Also moved definition of ISCSI_PAD_LEN to iscsi_proto.h which had
PAD_WORD_LEN that was never used anywhere.
- Let iscsi_prep_scsi_cmd_pdu() signal an Error return since now it is
possible that it will fail.
- I was tired of yet again writing a "this is a digest" comment next to
sizeof(__u32) so I defined a new ISCSI_DIGEST_SIZE. Now I don't need
any comments. Changed all places that used sizeof(__u32) or "4" in
connection to a digest.
iscsi_tcp specific code
- At struct iscsi_tcp_cmd_task allocate maximum space allowed in
standard for all headers following the iscsi_cmd header. and mark
it so in iscsi_tcp_session_create()
- At iscsi_send_cmd_hdr() retrieve the correct headers size and
write header digest at iscsi_next_hdr().
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
- Check to see that OVERFLOW is not negative indicating
a bug.
- Unify handling of UNDERFLOW and OVERFLOW to the same
code.
- Also handle BIDI_OVERFLOW.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Rewrite recv path. Fixes:
- data digest processing and error handling.
- ahs support.
Some fixups by Mike Christie
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds logical unit reset support. This should work for ib_iser,
but I have not finished testing that driver so it is not hooked in yet.
This patch also temporarily reverts the iscsi_tcp r2t write out patch.
That code is completely rewritten in this patchset.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is a race condition in iscsi_tcp.c that may cause it to forget
that it received a R2T from the target. This race may cause a data-out
command (such as a write) to lock up. The race occurs here:
static int
iscsi_send_unsol_pdu(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
{
struct iscsi_tcp_cmd_task *tcp_ctask = ctask->dd_data;
int rc;
if (tcp_ctask->xmstate & XMSTATE_UNS_HDR) {
BUG_ON(!ctask->unsol_count);
tcp_ctask->xmstate &= ~XMSTATE_UNS_HDR; <---- RACE
...
static int
iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
{
...
tcp_ctask->xmstate |= XMSTATE_SOL_HDR_INIT; <---- RACE
...
While iscsi_xmitworker() (called from scsi_queue_work()) is preparing to
send unsolicited data, iscsi_tcp_data_recv() (called from
tcp_read_sock()) interrupts it upon receipt of a R2T from the target.
Both contexts do read-modify-write of tcp_ctask->xmstate. Usually, gcc
on x86 will make &= and |= atomic on UP (not guaranteed of course), but
in this case iscsi_send_unsol_pdu() reads the value of xmstate before
clearing the bit, which causes gcc to read xmstate into a CPU register,
test it, clear the bit, and then store it back to memory. If the recv
interrupt happens during this sequence, then the XMSTATE_SOL_HDR_INIT
bit set by the recv interrupt will be lost, and the R2T will be
forgotten.
The patch below (against 2.6.24-rc1) converts accesses of xmstate to use
set_bit, clear_bit, and test_bit instead of |= and &=. I have tested
this patch and verified that it fixes the problem. Another possible
approach would be to hold a lock during most of the rx/tx setup and
post-processing, and drop the lock only for the actual rx/tx.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch fixes the errors made in the users of the crypto layer during
the sg_init_table conversion. It also adds a few conversions that were
missing altogether.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.
Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
It was found by LSI that on setups with large amounts of memory
we were bouncing buffers when we did not need to. If the iscsi tcp
code touches the data buffer (or a helper does),
it will kmap the buffer. iscsi_tcp also does not interact with hardware,
so it does not have any hw dma restrictions. This patch sets the bounce
buffer settings for our device queue so buffers should not be bounced
because of a driver limit.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This prevents the iscsi modules from being unloaded while
there are active mounts from an iscsi target.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- remove the unnecessary map_single path.
- convert to use the new accessors for the sg lists and the
parameters.
TODO: use scsi_for_each_sg().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iSCSI must support software iscsi (iscsi_tcp, iser), hardware iscsi (qla4xxx),
and partial offload (broadcom). To be able to allow each stack or driver
or port (virtual or physical) to be able to log into the same target portal
we use the initiator tuple [[HWADDRESS | NETDEVNAME], INITIATOR_NAME] and
the target tuple [TARGETNAME, CONN_ADDRESS, CONN_PORT] to id a session.
This patch adds the netdev name, which is used by software iscsi when
it binds a session to a netdevice using the SO_BINDTODEVICE sock opt.
It cannot use HWADDRESS because if someone did vlans then the same netdevice
will have the same mac and the initiator,target id will not be unique.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: David C Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch exports the local address for the session. For
qla4xxx this is the ip of the hba's port. For software
this is the src addr of the socket.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: David C Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch should fix the file descriptor leak problem. A quick look
through the kernel shows that users of sockfd_lookup use sockfd_put to
release their handle. We were using sock_release which from the comments
and code look like it does not release the get() on the file from the
lookup.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add a slave_configure function to iSCSI TCP to remove any DMA
alignment restriction. This permits the use of direct IO from
arbitrary addresses.
Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If we got the padding, data and header in different skbs,
we were not handling the padding correctly because we attributed it
to the data's skb. This resulted in the initiator reading from
pad bytes + skb offset instead of the correct offset.
If you could not connect with the open solaris target, this
will fix the lock up problem you were hitting.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch allows us to set can_queue and cmds_per_lun from userspace
when we create the session/host. From there we can set it on a per
target basis. The patch fully converts iscsi_tcp, but only hooks
up ib_iser for cmd_per_lun since it currently has a lots of preallocations
based on can_queue.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The cmdsn allocation and pdu transmit code can race, and we can end
up sending a pdu with cmdsn 10 before a pdu with 5. The target will
then fail the connection/session. This patch fixes the problem by
delaying the cmdsn allocation until we are about to send the pdu.
This also removes the xmitmutex. We were using the connection xmitmutex
during error handling to handle races with mtask and ctask cleanup and
completion. For ctasks we now have nice refcounting and for the mtask,
if we hit the case where the mtask timesout and it is floating
around somewhere in the driver, we end up dropping the session.
And to handle session level cleanup, we use the xmit suspend bit
along with scsi_flush_queue and the session lock to make sure
that the xmit thread is not possibly transmitting a task while
we are trying to kill it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If iscsi_tcp partially sends a header, it would recalculate the
header size and readd the size of the digest (if header digests
are used).This would cause us to send sizeof(digest) extra bytes
when we sent the rest of the header.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>