For aborts, qedi needs to cleanup the FW then send the TMF from a worker
thread. While it's doing these the cmd could complete normally and the TMF
could time out. libiscsi would then complete the iscsi_task which will call
into the driver to cleanup the driver level resources while it still might
be accessing them for the cleanup/abort.
This has iscsi_eh_abort keep the iscsi_task ref if the TMF times out, so
qedi does not have to worry about if the task is being freed while in use
and does not need to get its own ref.
Link: https://lore.kernel.org/r/20210525181821.7617-18-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We set the max_active iSCSI EH works to 1, so all work is going to execute
in order by default. However, userspace can now override this in sysfs. If
max_active > 1, we can end up with the block_work on CPU1 and
iscsi_unblock_session running the unblock_work on CPU2 and the session and
target/device state will end up out of sync with each other.
This adds a flush of the block_work in iscsi_unblock_session.
Link: https://lore.kernel.org/r/20210525181821.7617-17-michael.christie@oracle.com
Fixes: 1d726aa6ef ("scsi: iscsi: Optimize work queue flush use")
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The iscsi offload drivers are setting the shost->max_id to the max number
of sessions they support. The problem is that max_id is not the max number
of targets but the highest identifier the targets can have. To use it to
limit the number of targets we need to set it to max sessions - 1, or we
can end up with a session we might not have preallocated resources for.
Link: https://lore.kernel.org/r/20210525181821.7617-15-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If we haven't done a unbind target call we can race where
iscsi_conn_teardown wakes up the EH thread and then frees the conn while
those threads are still accessing the conn ehwait.
We can only do one TMF per session so this just moves the TMF fields from
the conn to the session. We can then rely on the
iscsi_session_teardown->iscsi_remove_session->__iscsi_unbind_session call
to remove the target and it's devices, and know after that point there is
no device or scsi-ml callout trying to access the session.
Link: https://lore.kernel.org/r/20210525181821.7617-14-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If SCSI midlayer is aborting a task when we are tearing down the conn we
could free the conn while the abort thread is accessing the conn. This has
the abort handler get a ref to the conn so it won't be freed from under it.
Note: this is not needed for device/target reset because we are holding the
eh_mutex when accessing the conn.
Link: https://lore.kernel.org/r/20210525181821.7617-12-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Userspace (open-iscsi based tools at least) sets no linger on the socket to
prevent stale data from being sent. However, with the in-kernel cleanup if
userspace is not up the sockfd_put will release the socket without having
set that sockopt.
iscsid sets that opt at socket close time, but it seems ok to set this at
setup time in the kernel for all tools.
Link: https://lore.kernel.org/r/20210525181821.7617-9-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 0ab710458d ("scsi: iscsi: Perform connection failure entirely in
kernel space") has the following regressions/bugs that this patch fixes:
1. It can return cmds to upper layers like dm-multipath where that can
retry them. After they are successful the fs/app can send new I/O to the
same sectors, but we've left the cmds running in FW or in the net layer.
We need to be calling ep_disconnect if userspace is not up.
This patch only fixes the issue for offload drivers. iscsi_tcp will be
fixed in separate commit because it doesn't have a ep_disconnect call.
2. The drivers that implement ep_disconnect expect that it's called before
conn_stop. Besides crashes, if the cleanup_task callout is called before
ep_disconnect it might free up driver/card resources for session1 then they
could be allocated for session2. But because the driver's ep_disconnect is
not called it has not cleaned up the firmware so the card is still using
the resources for the original cmd.
3. The stop_conn_work_fn can run after userspace has done its recovery and
we are happily using the session. We will then end up with various bugs
depending on what is going on at the time.
We may also run stop_conn_work_fn late after userspace has called stop_conn
and ep_disconnect and is now going to call start/bind conn. If
stop_conn_work_fn runs after bind but before start, we would leave the conn
in a unbound but sort of started state where IO might be allowed even
though the drivers have been set in a state where they no longer expect
I/O.
4. Returning -EAGAIN in iscsi_if_destroy_conn if we haven't yet run the in
kernel stop_conn function is breaking userspace. We should have been doing
this for the caller.
Link: https://lore.kernel.org/r/20210525181821.7617-8-michael.christie@oracle.com
Fixes: 0ab710458d ("scsi: iscsi: Perform connection failure entirely in kernel space")
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the system is not up, we can just fail immediately since iscsid is not
going to ever answer our netlink events. We are already setting the
recovery_tmo to 0, but by passing stop_conn STOP_CONN_TERM we never will
block the session and start the recovery timer, because for that flag
userspace will do the unbind and destroy events which would remove the
devices and wake up and kill the eh.
Since the conn is dead and the system is going dowm this just has us use
STOP_CONN_RECOVER with recovery_tmo=0 so we fail immediately. However, if
the user has set the recovery_tmo=-1 we let the system hang like they
requested since they might have used that setting for specific reasons
(one known reason is for buggy cluster software).
Link: https://lore.kernel.org/r/20210525181821.7617-5-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During ep_disconnect we have been doing iscsi_suspend_tx/queue to block new
I/O but every driver except cxgbi and iscsi_tcp can still get I/O from
__iscsi_conn_send_pdu() if we haven't called iscsi_conn_failure() before
ep_disconnect. This could happen if we were terminating the session, and
the logout timed out before it was even sent to libiscsi.
Fix the issue by adding a helper which reverses the bind_conn call that
allows new I/O to be queued. Drivers implementing ep_disconnect can use this
to make sure new I/O is not queued to them when handling the disconnect.
Link: https://lore.kernel.org/r/20210525181821.7617-3-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Consider the case where a VD is deleted and the targetID of that VD is
assigned to a newly created VD. If the sequence of deletion/addition of VD
happens very quickly there is a possibility that second event (VD add)
occurs even before the driver processes the first event (VD delete). As
event processing is done in deferred context the device list remains the
same (but targetID is re-used) so driver will not learn the VD
deletion/additon. I/Os meant for the older VD will be directed to new VD
which may lead to data corruption.
Make driver detect the deleted VD as soon as possible based on the RaidMap
update and block further I/O to that device.
Link: https://lore.kernel.org/r/20210528131307.25683-4-chandrakanth.patil@broadcom.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver issues all non-ReadWrite I/Os for TYPE_ENCLOSURE devices through
the fast path with invalid dev handle. Fast path in turn directs all the
I/Os to the firmware. As firmware stopped handling those I/Os from SAS3.5
generation of controllers (Ventura generation and onwards) this will lead
to I/O failures.
Switch the driver to issue all the non-ReadWrite I/Os for TYPE_ENCLOSURE
devices directly to firmware for SAS3.5 generation of controllers and
later.
Link: https://lore.kernel.org/r/20210528131307.25683-2-chandrakanth.patil@broadcom.com
Cc: <stable@vger.kernel.org> # v5.11+
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Register driver for threaded interrupts.
By default the driver will attempt I/O completion from interrupt context
(primary handler). Since the driver tracks per reply queue outstanding
I/Os, it will schedule threaded ISR if there are any outstanding I/Os
expected on that particular reply queue.
Threaded ISR (secondary handler) will loop for I/O completion as long as
there are outstanding I/Os (speculative method using same per reply queue
outstanding counter) or it has completed some X amount of commands
(something like budget).
Link: https://lore.kernel.org/r/20210520152545.2710479-18-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Detection of firmware fault or any kind of unresponsiveness in the
controller (any admin command which times out) results in resetting the
controller. The primary reset mechanisms used are either soft reset or diag
fault reset. A reset is performed if the host sets the ResetAction field in
the HostDiagnostic register to either 001b (soft reset) or 007b (diag fault
reset). After successfully resetting the controller the driver
reinitializes the controller by going through start of the day
initialization procedure. Pending I/Os during the reset are returned back
to the SCSI midlayer for retry.
Link: https://lore.kernel.org/r/20210520152545.2710479-10-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.co
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Create operational request and reply queue pair.
The MPI3 transport interface consists of an Administrative Request Queue,
an Administrative Reply Queue, and Operational Messaging Queues. The
Operational Messaging Queues are the primary communication mechanism
between the host and the I/O Controller (IOC). Request messages, allocated
in host memory, identify I/O operations to be performed by the IOC. These
operations are queued on an Operational Request Queue by the host driver.
Reply descriptors track I/O operations as they complete. The IOC queues
these completions in an Operational Reply Queue.
To fulfil large contiguous memory requirement, driver creates multiple
segments and provide the list of segments. Each segment size should be 4K
which is a hardware requirement. An element array is contiguous or
segmented. A contiguous element array is located in contiguous physical
memory. A contiguous element array must be aligned on an element size
boundary. An element's physical address within the array may be directly
calculated from the base address, the Producer/Consumer index, and the
element size.
Expected phased identifier bit is used to find out valid entry on reply
queue. Driver sets <ephase> bit and IOC inverts the value of this bit on
each pass.
Link: https://lore.kernel.org/r/20210520152545.2710479-4-kashyap.desai@broadcom.com
Cc: sathya.prakash@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), avoid intentionally writing across
neighboring array fields.
Remove old-style 1-byte array in favor of a flexible array[1] to avoid
future false-positive cross-field memcpy() warning in:
esas2r_vda.c:
memcpy(vi->cmd.gsv.version_info, esas2r_vdaioctl_versions, ...)
The change in struct size doesn't change other structure sizes (it is
already maxed out to 256 bytes, for example here:
union {
struct atto_ioctl_vda_scsi_cmd scsi;
struct atto_ioctl_vda_flash_cmd flash;
struct atto_ioctl_vda_diag_cmd diag;
struct atto_ioctl_vda_cli_cmd cli;
struct atto_ioctl_vda_smp_cmd smp;
struct atto_ioctl_vda_cfg_cmd cfg;
struct atto_ioctl_vda_mgt_cmd mgt;
struct atto_ioctl_vda_gsv_cmd gsv;
u8 cmd_info[256];
} cmd;
No sizes are calculated using the enclosing structure, so no other
updates are needed.
Link: https://lore.kernel.org/r/20210528181337.792268-3-keescook@chromium.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The BusLogic driver has build errors on ia64 due to a name collision (in
the #included FlashPoint.c file). Rename the struct field in struct
sccb_mgr_info from si_flags to si_mflags (manager flags) to mend the build.
This is the first problem. There are 50+ others after this one:
In file included from ../include/uapi/linux/signal.h:6,
from ../include/linux/signal_types.h:10,
from ../include/linux/sched.h:29,
from ../include/linux/hardirq.h:9,
from ../include/linux/interrupt.h:11,
from ../drivers/scsi/BusLogic.c:27:
../arch/ia64/include/uapi/asm/siginfo.h:15:27: error: expected ':', ',', ';', '}' or '__attribute__' before '.' token
15 | #define si_flags _sifields._sigfault._flags
| ^
../drivers/scsi/FlashPoint.c:43:6: note: in expansion of macro 'si_flags'
43 | u16 si_flags;
| ^~~~~~~~
In file included from ../drivers/scsi/BusLogic.c:51:
../drivers/scsi/FlashPoint.c: In function 'FlashPoint_ProbeHostAdapter':
../drivers/scsi/FlashPoint.c:1076:11: error: 'struct sccb_mgr_info' has no member named '_sifields'
1076 | pCardInfo->si_flags = 0x0000;
| ^~
../drivers/scsi/FlashPoint.c:1079:12: error: 'struct sccb_mgr_info' has no member named '_sifields'
Link: https://lore.kernel.org/r/20210529234857.6870-1-rdunlap@infradead.org
Fixes: 391e2f2560 ("[SCSI] BusLogic: Port driver to 64-bit.")
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>