Conditional statements are faster than indirect calls. Hence call
scsi_done() and reset_done() directly. The changes in this patch are as
follows:
- Remove the 'done' argument from aha152x_internal_queue().
- Change ptr->scsi_done(ptr) into aha152x_scsi_done(ptr).
- Inside aha152x_scsi_done(), check the 'resetting' flag of SCp.phase
since aha152x_internal_queue() specifies the 'reset_done' function
pointer if and only if the third argument has the value 'resetting'.
Link: https://lore.kernel.org/r/20211007202923.2174984-20-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The aacraid driver invokes scmd->scsi_done(scmd) for two types of SCSI
commands:
- SCSI commands initialized by the SCSI mid-layer.
- SCSI commands initialized by aac_probe_container().
The processing sequence for SCSI commands allocated by
aac_probe_container() is as follows:
aac_probe_container()
-> _aac_probe_container(scmd, aac_probe_container_callback1)
-> scmd->SCp.ptr = aac_probe_container_callback1
-> aac_fib_send(..., _aac_probe_container1, scmd)
-> fibptr->callback = _aac_probe_container1
-> fibptr->callback_data = scmd
fibptr->callback(scmd)
-> _aac_probe_container1(scmd, fibptr)
[ ... ]
-> _aac_probe_container2(scmd, fibptr)
-> Call scmd->SCp.ptr == aac_probe_container_callback1
-> scmd->device = NULL;
The processing sequence for SCSI commands allocated by the SCSI mid-layer
if _aac_probe_container() is called is as follows:
aac_queuecommand()
-> aac_scsi_cmd()
-> _aac_probe_container(scmd, aac_probe_container_callback2)
-> scmd->SCp.ptr = aac_probe_container_callback2
-> aac_fib_send(..., _aac_probe_container1, scmd)
fibptr->callback(scmd)
-> _aac_probe_container1(scmd, fibptr)
[ ... ]
-> _aac_probe_container2(scmd, fibptr)
-> Call scmd->SCp.ptr == aac_probe_container_callback2
Preserve the existing call sequences by calling scsi_done() for commands
submitted by the mid-layer or aac_probe_container_scsi_done() for commands
submitted by aac_probe_container().
Link: https://lore.kernel.org/r/20211007202923.2174984-17-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Conditional statements are faster than indirect calls. Use a structure
member to track the SCSI command submitter such that later patches can call
scsi_done(scmd) instead of scmd->scsi_done(scmd).
The asymmetric behavior that scsi_send_eh_cmnd() sets the submission
context to the SCSI error handler and that it does not restore the
submission context to the SCSI core is retained.
Link: https://lore.kernel.org/r/20211007202923.2174984-2-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following query shows which drivers define callbacks that are called by
the power management support code in the SCSI core (scsi_pm.c):
$ git grep -nHEwA16 "$(echo $(git grep -h 'scsi_register_driver(&' |
sed 's/.*&//;s/\..*//') | sed 's/ /|/g')" |
grep '\.pm[[:blank:]]*=[[:blank:]]'
drivers/scsi/sd.c-620- .pm = &sd_pm_ops,
drivers/scsi/sr.c-100- .pm = &sr_pm_ops,
drivers/scsi/ufs/ufshcd.c-9765- .pm = &ufshcd_wl_pm_ops,
Since unconditionally runtime resuming a device during system resume is not
necessary, remove that code. Modify the SCSI disk (sd) driver such that it
follows the same approach as the UFS driver, namely to skip system suspend
and resume for devices that are runtime suspended. The CD-ROM code does not
need to be updated since its PM callbacks do not affect the device power
state.
This patch has been tested as follows:
[ shell 1 ]
cd /sys/kernel/debug/tracing
grep -E 'blk_(pre|post)_runtime|runtime_(suspend|resume)|autosuspend_delay|pm_runtime_(get|put)' available_filter_functions |
while read a b; do echo "$a"; done |
grep -v __pm_runtime_resume >set_ftrace_filter
echo function > current_tracer
echo 1 > tracing_on
cat trace_pipe
[ shell 2 ]
cd /sys/block/sr0
# Increase the event poll interval to make it easier to derive from the
# tracing output whether runtime power actions are the result of sg_inq.
echo 30000 > events_poll_msecs
cd device/power
# Enable runtime power management.
echo auto > control
echo 1000 > autosuspend_delay_ms
sleep 1
# Verify in shell 1 that sr0 has been runtime suspended
sg_inq /dev/sr0
eject /dev/sr0
sg_inq /dev/sr0
# Disable runtime power management.
echo on > control
cd /sys/block/sda/device/power
echo auto > control
echo 1000 > autosuspend_delay_ms
sleep 1
# Verify in shell 1 that sr0 has been runtime suspended
sg_inq /dev/sda
Link: https://lore.kernel.org/r/20211006215453.3318929-4-bvanassche@acm.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
tools/testing/selftests/net/ioam6.sh
7b1700e009 ("selftests: net: modify IOAM tests for undef bits")
bf77b1400a ("selftests: net: Test for the IOAM encapsulation with IPv6")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If the softreset fails in the I_T reset, libsas will then continue to issue
a controller reset to try to recover.
However a faulty disk may cause the softreset to fail, and resetting the
controller will not help this scenario. Indeed, we will just continue the
cycle of error handle handling to try to recover.
So if the softreset fails upon certain conditions, just disable the phy
associated with the disk. The user needs to handle this problem.
Link: https://lore.kernel.org/r/1634041588-74824-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When issuing a hardreset/linkreset/phy_set_linkrate from sysfs, the phy
will be disabled and re-enabled for the directly attached scenario.
It takes some time for the phy to come back up after re-enabling the phy.
If the controller becomes suspended while waiting for the phy to come back,
the phy up may be lost (along with the disk).
To solve this problem, wait for the phy up to occur with a timeout. Indeed
this is already done in hisi_sas_debug_I_T_nexus_reset() for local phys, so
just relocate the functionality to hisi_sas_control_phy().
Since the HA workqueue is drained when suspending the controller, and the
phy control function is called from the same workqueue, we can guarantee
that the controller will not be suspended during this period.
Link: https://lore.kernel.org/r/1634041588-74824-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SCSI error handler calls scsi_unjam_host() which can call the queue
function ufshcd_queuecommand() indirectly. The error handler changes the
state to UFSHCD_STATE_RESET while running, but error interrupts that
happen while the error handler is running could change the state to
UFSHCD_STATE_EH_SCHEDULED_NON_FATAL which would allow requests to go
through ufshcd_queuecommand() even though the error handler is running.
Block that hole by checking whether the error handler is in progress.
Link: https://lore.kernel.org/r/20211008084048.257498-1-adrian.hunter@intel.com
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In commit 9e67600ed6 ("scsi: iscsi: Fix race condition between login and
sync thread") we meant to add a check where before we call ->set_param() we
make sure the iscsi_cls_connection is bound. The problem is that between
versions 4 and 5 of the patch the deletion of the unchecked set_param()
call was dropped so we ended up with 2 calls. As a result we can still hit
a crash where we access the unbound connection on the first call.
This patch removes that first call.
Fixes: 9e67600ed6 ("scsi: iscsi: Fix race condition between login and sync thread")
Link: https://lore.kernel.org/r/20211010161904.60471-1-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Li Feng <fengli@smartx.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This variable is just a temporary variable, used to do an endian
conversion. The problem is that the last byte is not initialized. After
the conversion is completely done, the last byte is discarded so it doesn't
cause a problem. But static checkers and the KMSan runtime checker can
detect the uninitialized read and will complain about it.
Link: https://lore.kernel.org/r/20211006073242.GA8404@kili
Fixes: 5036f0a0ec ("[SCSI] csiostor: Fix sparse warnings.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Merge the 5.15/scsi-fixes branch into the staging tree to resolve UFS
conflict reported by sfr.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Stop the OS from re-discovering multiple LUNs for tape drive and medium
changer.
Duplicate device nodes for Ultrium tape drive and medium changer are being
created.
The Ultrium tape drive is a multi-LUN SCSI target. It presents a LUN for
the tape drive and a 2nd LUN for the medium changer. Our controller FW
lists both LUNs in the RPL results.
As a result, the smartpqi driver exposes both devices to the OS. Then the
OS does its normal device discovery via the SCSI REPORT LUNS command, which
causes it to re-discover both devices a 2nd time, which results in the
duplicate device nodes.
Link: https://lore.kernel.org/r/20210928235442.201875-10-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Send a TEST UNIT READY to HBA disks and do not present them to the OS if
0x02/0x04/0x1b (SANITIZE IN PROGRESS) is returned.
During boot-up, some OSes appear to hang when there are one or more disks
undergoing a sanitize operation.
According to SCSI SBC4 specification section 4.11.2 "Commands allowed
during SANITIZE", some SCSI commands are permitted, but read/write
operations are not.
When the OS attempts to read the disk partition table a CHECK CONDITION ASC
0x04 ASCQ 0x1b is returned which causes the OS to retry the read until
SANITIZE has completed. This can take hours.
According to document HPE Smart Storage Administrator User Guide, during
the sanitize erase operation, the drive is unusable. I.e. the expected
behavior for SANITIZE is the that disk remains offline even after SANITIZE
has completed. The customer is expected to re-enable the disk using the
management utility.
Link: https://lore.kernel.org/r/20210928235442.201875-6-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enhance check for commands queued to the controller. Add new function
pqi_nonempty_inbound_queue_count() that will wait for all I/O queued for
submission to controller across all queue groups to drain. Add helper
functions to obtain queue command counts for each queue group. These
queues should drain quickly as they are already staged to be submitted down
to the controller's IB queue.
Enhance check for outstanding command completion. Update the count of
outstanding commands while waiting. This value was not re-obtained and was
potentially causing infinite wait for all completions.
Link: https://lore.kernel.org/r/20210928235442.201875-5-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct kdump hangs when controller is locked up.
There are occasions when a controller reboot (controller soft reset) is
issued when a controller firmware crash dump is in progress.
This leads to incomplete controller firmware crash dump:
- When the controller crash dump is in progress, and a kdump is initiated,
the driver issues inbound doorbell reset to bring back the controller in
SIS mode.
- If the controller is in locked up state, the inbound doorbell reset does
not work causing controller initialization failures. This results in the
driver hanging waiting for SIS mode.
To avoid an incomplete controller crash dump, add in a controller crash
dump handshake:
- Controller will indicate start and end of the controller crash dump by
setting some register bits.
- Driver will look these bits when a kdump is initiated. If a controller
crash dump is in progress, the driver will wait for the controller crash
dump to complete before issuing the controller soft reset then complete
driver initialization.
Link: https://lore.kernel.org/r/20210928235442.201875-3-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>