Commit Graph

2383 Commits

Author SHA1 Message Date
Justin Tee
0653d40935 scsi: lpfc: Change VMID driver load time parameters to read only
VMID driver support is a load time configuration setting.  Thus, change
sysfs attributes to read only.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231207224039.35466-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13 22:17:56 -05:00
Ilpo Järvinen
420ac76610 scsi: lpfc: Use PCI_HEADER_TYPE_MFD instead of literal
Replace literal 0x80 with PCI_HEADER_TYPE_MFD.

Link: https://lore.kernel.org/r/20231124090919.23687-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-01 16:50:59 -06:00
Justin Tee
1f86b0d9c7 scsi: lpfc: Copyright updates for 14.2.0.16 patches
Update copyrights to 2023 for files modified in the 14.2.0.16 patch set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-10-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
c855e02b57 scsi: lpfc: Update lpfc version to 14.2.0.16
Update lpfc version to 14.2.0.16.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-9-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
e6af452187 scsi: lpfc: Enhance driver logging for selected discovery events
Typically, debugging discovery issues requires the ndlp reference count,
nlp flags, transport flags, and the io tag for root cause analysis.

Modify important discovery log messages to include one or more of these
attributes to aid in debugging and support.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-8-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
349b1e2c1b scsi: lpfc: Refactor and clean up mailbox command memory free
A lot of repeated clean up code exists when freeing mailbox commands in
lpfc_mem_free_all().

Introduce a lpfc_mem_free_sli_mbox() helper routine to refactor the
copy-paste code.  Additionally, reinitialize the mailbox command structure
context pointers to NULL in lpfc_sli4_mbox_cmd_free().

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-7-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
57ea41eb7f scsi: lpfc: Return early in lpfc_poll_eratt() when the driver is unloading
Add a check in lpfc_poll_eratt() when the driver is unloading.  There is no
point to check for error attention events if the driver is rmmod'ed.

If the driver is reloaded, as part of insmod initialization, then a fresh
reset is always asserted to start clean and free of error attention events.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-6-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
e07ac2d2aa scsi: lpfc: Eliminate unnecessary relocking in lpfc_check_nlp_post_devloss()
In lpfc_check_nlp_post_devloss(), retaking of the ndlp lock in the if
statement is useless because the very next line unlocks. Simply return to
avoid relocking.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-5-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:57 -05:00
Justin Tee
1dec1311b9 scsi: lpfc: Fix list_entry null check warning in lpfc_cmpl_els_plogi()
Smatch called out a warning for null checking a ptr that is assigned by
list_entry(). list_entry() does not return null and, if the list is empty,
can return an invalid ptr. Thus, the !psrp check does not execute properly.

 drivers/scsi/lpfc/lpfc_els.c:2133 lpfc_cmpl_els_plogi()
 warn: list_entry() does not return NULL 'prsp'

Replace list_entry() with list_get_first(), which does a list_empty() check
before returning the first entry.

Fixes: a3c3c0a806 ("scsi: lpfc: Validate ELS LS_ACC completion payload")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-scsi/01b7568f-4ab4-4d56-bfa6-9ecc5fc261fe@moroto.mountain/
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-4-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:57 -05:00
Justin Tee
f5779b5292 scsi: lpfc: Fix possible file string name overflow when updating firmware
Because file_name and phba->ModelName are both declared a size 80 bytes,
the extra ".grp" file extension could cause an overflow into file_name.

Define a ELX_FW_NAME_SIZE macro with value 84.  84 incorporates the 4 extra
characters from ".grp".  file_name is changed to be declared as a char and
initialized to zeros i.e. null chars.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-3-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:57 -05:00
Justin Tee
2fe4b6a677 scsi: lpfc: Correct maximum PCI function value for RAS fw logging
Currently, the ras_fwlog_func sysfs parameter allows users to input a value
greater than three when selecting a PCI function to enable RAS fw logging
feature.

The user's input is sanity checked in lpfc_sli4_ras_init(), but allowing an
input greater than three doesn't make sense because the max number of ports
per HBA is four.

Change the allowable range from [0, 7] to [0, 3].

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-2-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:57 -05:00
Linus Torvalds
6ed92e559a SCSI misc on 20231102
Updates to the usual drivers (ufs, megaraid_sas, lpfc, target, ibmvfc,
 scsi_debug) plus the usual assorted minor fixes and updates.  The
 major change this time around is a prep patch for rethreading of the
 driver reset handler API not to take a scsi_cmd structure which starts
 to reduce various drivers' dependence on scsi_cmd in error handling.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZUORLiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQ4WAQDDIhzp
 /PiJBBtt0U9ii/lYqRLrOVnN0extKEgEGO+FbwEAssKgs+5Jn/7XCgdpSrx8Co3/
 0cPXrZGxs7tFpFWLZjM=
 =AlRU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (ufs, megaraid_sas, lpfc, target, ibmvfc,
  scsi_debug) plus the usual assorted minor fixes and updates.

  The major change this time around is a prep patch for rethreading of
  the driver reset handler API not to take a scsi_cmd structure which
  starts to reduce various drivers' dependence on scsi_cmd in error
  handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (132 commits)
  scsi: ufs: core: Leave space for '\0' in utf8 desc string
  scsi: ufs: core: Conversion to bool not necessary
  scsi: ufs: core: Fix race between force complete and ISR
  scsi: megaraid: Fix up debug message in megaraid_abort_and_reset()
  scsi: aic79xx: Fix up NULL command in ahd_done()
  scsi: message: fusion: Initialize return value in mptfc_bus_reset()
  scsi: mpt3sas: Fix loop logic
  scsi: snic: Remove useless code in snic_dr_clean_pending_req()
  scsi: core: Add comment to target_destroy in scsi_host_template
  scsi: core: Clean up scsi_dev_queue_ready()
  scsi: pmcraid: Add missing scsi_device_put() in pmcraid_eh_target_reset_handler()
  scsi: target: core: Fix kernel-doc comment
  scsi: pmcraid: Fix kernel-doc comment
  scsi: core: Handle depopulation and restoration in progress
  scsi: ufs: core: Add support for parsing OPP
  scsi: ufs: core: Add OPP support for scaling clocks and regulators
  scsi: ufs: dt-bindings: common: Add OPP table
  scsi: scsi_debug: Add param to control sdev's allow_restart
  scsi: scsi_debug: Add debugfs interface to fail target reset
  scsi: scsi_debug: Add new error injection type: Reset LUN failed
  ...
2023-11-02 15:13:50 -10:00
Linus Torvalds
eb55307e67 X86 core code updates:
- Limit the hardcoded topology quirk for Hygon CPUs to those which have a
     model ID less than 4. The newer models have the topology CPUID leaf 0xB
     correctly implemented and are not affected.
 
   - Make SMT control more robust against enumeration failures
 
     SMT control was added to allow controlling SMT at boottime or
     runtime. The primary purpose was to provide a simple mechanism to
     disable SMT in the light of speculation attack vectors.
 
     It turned out that the code is sensible to enumeration failures and
     worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration
     which means the primary thread mask is not set up correctly. By chance
     a XEN/PV boot ends up with smp_num_siblings == 2, which makes the
     hotplug control stay at its default value "enabled". So the mask is
     never evaluated.
 
     The ongoing rework of the topology evaluation caused XEN/PV to end up
     with smp_num_siblings == 1, which sets the SMT control to "not
     supported" and the empty primary thread mask causes the hotplug core to
     deny the bringup of the APS.
 
     Make the decision logic more robust and take 'not supported' and 'not
     implemented' into account for the decision whether a CPU should be
     booted or not.
 
   - Fake primary thread mask for XEN/PV
 
     Pretend that all XEN/PV vCPUs are primary threads, which makes the
     usage of the primary thread mask valid on XEN/PV. That is consistent
     with because all of the topology information on XEN/PV is fake or even
     non-existent.
 
   - Encapsulate topology information in cpuinfo_x86
 
     Move the randomly scattered topology data into a separate data
     structure for readability and as a preparatory step for the topology
     evaluation overhaul.
 
   - Consolidate APIC ID data type to u32
 
     It's fixed width hardware data and not randomly u16, int, unsigned long
     or whatever developers decided to use.
 
   - Cure the abuse of cpuinfo for persisting logical IDs.
 
     Per CPU cpuinfo is used to persist the logical package and die
     IDs. That's really not the right place simply because cpuinfo is
     subject to be reinitialized when a CPU goes through an offline/online
     cycle.
 
     Use separate per CPU data for the persisting to enable the further
     topology management rework. It will be removed once the new topology
     management is in place.
 
   - Provide a debug interface for inspecting topology information
 
     Useful in general and extremly helpful for validating the topology
     management rework in terms of correctness or "bug" compatibility.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmU+yX0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoROUD/4vlvKEcpm9rbI5DzLcaq4DFHKbyEZF
 cQtzuOSM/9vTc9DHnuoNNLl9TWSYxiVYnejf3E21evfsqspYlzbTH8bId9XBCUid
 6B68AJW842M2erNuwj0b0HwF1z++zpDmBDyhGOty/KQhoM8pYOHMvntAmbzJbuso
 Dgx6BLVFcboTy6RwlfRa0EE8f9W5V+JbmG/VBDpdyCInal7VrudoVFZmWQnPIft7
 zwOJpAoehkp8OKq7geKDf79yWxu9a1sNPd62HtaVEvfHwehHqE6OaMLss1us+0vT
 SJ/D6gmRQBOwcXaZL0wL1dG7Km9Et4AisOvzhXGvTa5b2D5oljVoqJ7V7FTf5g3u
 y3aqWbeUJzERUbeJt1HoGVAKyA4GtZOvg+TNIysf6F1Z4khl9alfa9jiqjj4g1au
 zgItq/ZMBEBmJ7X4FxQUEUVBG2CDsEidyNBDRcimWQUDfBakV/iCs0suD8uu8ZOD
 K5jMx8Hi2+xFx7r1YqsfsyMBYOf/zUZw65RbNe+kI992JbJ9nhcODbnbo5MlAsyv
 vcqlK5FwXgZ4YAC8dZHU/tyTiqAW7oaOSkqKwTP5gcyNEqsjQHV//q6v+uqtjfYn
 1C4oUsRHT2vJiV9ktNJTA4GQHIYF4geGgpG8Ih2SjXsSzdGtUd3DtX1iq0YiLEOk
 eHhYsnniqsYB5g==
 =xrz8
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Thomas Gleixner:

 - Limit the hardcoded topology quirk for Hygon CPUs to those which have
   a model ID less than 4.

   The newer models have the topology CPUID leaf 0xB correctly
   implemented and are not affected.

 - Make SMT control more robust against enumeration failures

   SMT control was added to allow controlling SMT at boottime or
   runtime. The primary purpose was to provide a simple mechanism to
   disable SMT in the light of speculation attack vectors.

   It turned out that the code is sensible to enumeration failures and
   worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration
   which means the primary thread mask is not set up correctly. By
   chance a XEN/PV boot ends up with smp_num_siblings == 2, which makes
   the hotplug control stay at its default value "enabled". So the mask
   is never evaluated.

   The ongoing rework of the topology evaluation caused XEN/PV to end up
   with smp_num_siblings == 1, which sets the SMT control to "not
   supported" and the empty primary thread mask causes the hotplug core
   to deny the bringup of the APS.

   Make the decision logic more robust and take 'not supported' and 'not
   implemented' into account for the decision whether a CPU should be
   booted or not.

 - Fake primary thread mask for XEN/PV

   Pretend that all XEN/PV vCPUs are primary threads, which makes the
   usage of the primary thread mask valid on XEN/PV. That is consistent
   with because all of the topology information on XEN/PV is fake or
   even non-existent.

 - Encapsulate topology information in cpuinfo_x86

   Move the randomly scattered topology data into a separate data
   structure for readability and as a preparatory step for the topology
   evaluation overhaul.

 - Consolidate APIC ID data type to u32

   It's fixed width hardware data and not randomly u16, int, unsigned
   long or whatever developers decided to use.

 - Cure the abuse of cpuinfo for persisting logical IDs.

   Per CPU cpuinfo is used to persist the logical package and die IDs.
   That's really not the right place simply because cpuinfo is subject
   to be reinitialized when a CPU goes through an offline/online cycle.

   Use separate per CPU data for the persisting to enable the further
   topology management rework. It will be removed once the new topology
   management is in place.

 - Provide a debug interface for inspecting topology information

   Useful in general and extremly helpful for validating the topology
   management rework in terms of correctness or "bug" compatibility.

* tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too
  x86/cpu: Provide debug interface
  x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids
  x86/apic: Use u32 for wakeup_secondary_cpu[_64]()
  x86/apic: Use u32 for [gs]et_apic_id()
  x86/apic: Use u32 for phys_pkg_id()
  x86/apic: Use u32 for cpu_present_to_apicid()
  x86/apic: Use u32 for check_apicid_used()
  x86/apic: Use u32 for APIC IDs in global data
  x86/apic: Use BAD_APICID consistently
  x86/cpu: Move cpu_l[l2]c_id into topology info
  x86/cpu: Move logical package and die IDs into topology info
  x86/cpu: Remove pointless evaluation of x86_coreid_bits
  x86/cpu: Move cu_id into topology info
  x86/cpu: Move cpu_core_id into topology info
  hwmon: (fam15h_power) Use topology_core_id()
  scsi: lpfc: Use topology_core_id()
  x86/cpu: Move cpu_die_id into topology info
  x86/cpu: Move phys_proc_id into topology info
  x86/cpu: Encapsulate topology information in cpuinfo_x86
  ...
2023-10-30 17:37:47 -10:00
Martin K. Petersen
af46076d66 Merge patch series "lpfc: Update lpfc to revision 14.2.0.15"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.2.0.15

This patch set contains error handling fixes, ELS bug fixes, and
logging improvements.

The patches were cut against Martin's 6.7/scsi-queue tree.

Link: https://lore.kernel.org/r/20231009161812.97232-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 17:00:47 -04:00
Justin Tee
8a9a690b5a scsi: lpfc: Update lpfc version to 14.2.0.15
Update lpfc version to 14.2.0.15.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
41c831bbb0 scsi: lpfc: Introduce LOG_NODE_VERBOSE messaging flag
The preexisting LOG_NODE message flag frequently spams a subset of the same
log messages during normal FC driver operations.  When analyzing driver
logs, this sometimes leads to difficulty in troubleshooting.

Because LOG_IP log message flag is unused, convert it to a new
LOG_NODE_VERBOSE flag.  The LOG_NODE_VERBOSE shall specifically be used for
diagnosing issues that require precise ndlp tracking detail.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
a3c3c0a806 scsi: lpfc: Validate ELS LS_ACC completion payload
A WCQE success completion status does not guarantee valid LS_ACC receipt
for ELS commands.  So, introduce a small helper routine that validates ELS
LS_ACC frames in ELS cmpl routines.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
12e896c742 scsi: lpfc: Reject received PRLIs with only initiator fcn role for NPIV ports
Currently, NPIV ports send PRLI_ACC to all received unsolicited PRLI
requests.  For an NPIV port, there is no point to PRLI_ACC if the received
PRLI request has the initiator function bit set and the target function bit
unset.  Modify the lpfc_rcv_prli_support_check() routine to send a PRLI_RJT
in such cases.  NPIV ports are expected to send PRLI_ACC only if the Target
function bit is set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
d472a76603 scsi: lpfc: Treat IOERR_SLI_DOWN I/O completion status the same as pci offline
During receipt of a hardware error attention ACQE, IOERR_SLI_DOWN status is
set by the driver for all outstanding I/Os.

In such hardware error attention cases, we can treat the situation exactly
the same as pci_channel_offline.  Thus, add IOERR_SLI_DOWN status to the
same category as pci_channel_offline handling in lpfc_nvme_io_cmd_cmpl.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
0506814609 scsi: lpfc: Remove unnecessary zero return code assignment in lpfc_sli4_hba_setup
In order to enter the !rc if statement block in question, rc had to have
been zero to begin with.  Thus, the rc = 0 assignment is unnecessary and
can be removed.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Thomas Gleixner
09253672b5 scsi: lpfc: Use topology_core_id()
Use the provided topology helper.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230814085112.446856860@linutronix.de
2023-10-10 14:38:17 +02:00
Thomas Gleixner
02fb601d27 x86/cpu: Move phys_proc_id into topology info
Rename it to pkg_id which is the terminology used in the kernel.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230814085112.329006989@linutronix.de
2023-10-10 14:38:17 +02:00
Justin Tee
dae40be7a1 scsi: lpfc: Prevent use-after-free during rmmod with mapped NVMe rports
During rmmod, when dev_loss_tmo callback is called, an ndlp kref count is
decremented twice.  Once for SCSI transport registration and second to
remove the initial node allocation kref.  If there is also an NVMe
transport registration, another reference count decrement is expected in
lpfc_nvme_unregister_port().

Race conditions between the NVMe transport remoteport_delete and
dev_loss_tmo callbacks sometimes results in premature ndlp object release
resulting in use-after-free issues.

Fix by not dropping the ndlp object in dev_loss_tmo callback with an
outstanding NVMe transport registration.  Inversely, mark the final
NLP_DROPPED flag in lpfc_nvme_unregister_port when rmmod flag is set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230908211923.37603-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:51:16 -04:00
Justin Tee
9c3034968e scsi: lpfc: Early return after marking final NLP_DROPPED flag in dev_loss_tmo
When a dev_loss_tmo event occurs, an ndlp lock is taken before checking
nlp_flag for NLP_DROPPED.  There is an attempt to restore the ndlp lock
when exiting the if statement, but the nlp_put kref could be the final
decrement causing a use-after-free memory access on a released ndlp object.

Instead of trying to reacquire the ndlp lock after checking nlp_flag, just
return after calling nlp_put.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230908211852.37576-1-justintee8345@gmail.com
Reviewed-by: "Ewan D. Milne" <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:49:34 -04:00
Jinjie Ruan
7dcc683db3 scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()
Since debugfs_create_file() returns ERR_PTR and never NULL, use IS_ERR() to
check the return value.

Fixes: 2fcbc569b9 ("scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI")
Fixes: 4c47efc140 ("scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures")
Fixes: 6a828b0f61 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues")
Fixes: 95bfc6d8ad ("scsi: lpfc: Make FW logging dynamically configurable")
Fixes: 9f77870870 ("scsi: lpfc: Add debugfs support for cm framework buffers")
Fixes: c490850a09 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20230906030809.2847970-1-ruanjinjie@huawei.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:48:36 -04:00
James Bottomley
e03843a0f0 Merge branch 'fixes' into misc 2023-09-02 08:25:19 +01:00
Andy Shevchenko
19d7102a95 scsi: lpfc: Do not abuse UUID APIs and LPFC_COMPRESS_VMID_SIZE
The lpfc_vmid_host_uuid is not defined as uuid_t and its usage is not the
same as for uuid_t operations (like exporting or importing).  Hence replace
call to uuid_is_null() by respective memchr_inv() without abusing casting.

With that, replace LPFC_COMPRESS_VMID_SIZE with plain number and respective
sizeof() to make code robust to changes in the future, if any.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230818155452.875781-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-21 17:13:57 -04:00
Justin Tee
8eebf0e84f scsi: lpfc: Remove reftag check in DIF paths
When preparing protection DIF I/O for DMA, the driver obtains reference
tags from scsi_prot_ref_tag().  Previously, there was a wrong assumption
that an all 0xffffffff value meant error and thus the driver failed the
I/O.  This patch removes the evaluation code and accepts whatever the upper
layer returns.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230803211932.155745-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-07 21:34:08 -04:00
Justin Tee
dded1dc31a scsi: lpfc: Modify when a node should be put in device recovery mode during RSCN
Only nodes whose state is at least past a PLOGI issue and strictly less
than a PRLI issue should be put into device recovery mode upon RSCN
receipt.  Previously, the allowance of LOGO and PRLI completion states did
not make sense because those nodes should be allowed to flow through and
marked as NPort dissappeared as is normally done.  A follow up RSCN GID_FT
would recover those nodes in such cases.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230804195546.157839-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-07 21:25:47 -04:00
Justin Tee
71fe5ddac5 scsi: lpfc: Copyright updates for 14.2.0.14 patches
Update copyrights to 2023 for files modified in the 14.2.0.14 patch set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-13-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:08 -04:00
Justin Tee
cfb9b8f506 scsi: lpfc: Update lpfc version to 14.2.0.14
Update lpfc version to 14.2.0.14

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-12-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:08 -04:00
Justin Tee
81907422ca scsi: lpfc: Clean up SLI-4 sysfs resource reporting
Currently, we have dated logic to work around the differences between SLI-4
and SLI-3 resource reporting through sysfs.

Leave the SLI-3 path untouched, but for SLI4 path, retrieve resource values
from the phba->sli4_hba->max_cfg_param structure.  Max values are populated
during ACQE events right after READ_CONFIG mbox cmd is sent.  Instead of
the dated subtraction logic, used resource calculation is directly fed into
sysfs for display.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-11-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:08 -04:00
Justin Tee
d668b368ef scsi: lpfc: Refactor cpu affinity assignment paths
During initialization, a lot of the same logic is used on MSI-X vector CPU
affinity assignment.

Create a lpfc_next_present_cpu() helper routine, and apply its usage for
refactoring purposes.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-10-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Justin Tee
089ea22e37 scsi: lpfc: Abort outstanding ELS cmds when mailbox timeout error is detected
A mailbox timeout error usually indicates something has gone wrong, and a
follow up reset of the HBA is a typical recovery mechanism.  Introduce a
MBX_TMO_ERR flag to detect such cases and have lpfc_els_flush_cmd abort ELS
commands if the MBX_TMO_ERR flag condition was set.  This ensures all of
the registered SGL resources meant for ELS traffic are not leaked after an
HBA reset.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Justin Tee
9388da3037 scsi: lpfc: Make fabric zone discovery more robust when handling unsolicited LOGO
This patch provides better target rport recovery when a target rport is
running in initiator mode to discover the fabric.  Such a target will issue
a LOGO before switching back to strict target mode and changes are made to
recover the login.  Log messages are also updated accordingly.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Justin Tee
04c3200114 scsi: lpfc: Set Establish Image Pair service parameter only for Target Functions
Previously, Establish Image Pair was set in all PRLI_ACC responses
regardless if the received PRLI was from an initiator or target function.
Specific target vendors that can operate in both initiator and target mode,
may view the PRLI_ACC with Establish Image Pair set as an invalid service
parameter when operating in initiator only mode.  This causes discovery
issues later when the target switches on its target mode function.

Revise logic that determines an rport's role as an initiator or target and
set the Establish Image Pair service parameter bit only if the Target
Function bit is set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Justin Tee
90cec07f53 scsi: lpfc: Revise ndlp kref handling for dev_loss_tmo_callbk and lpfc_drop_node
The ndlp kref count implementation in lpfc_dev_loss_tmo_callbk() removes
the initial node reference when a vport is unloading.  When lpfc_cleanup()
sends a DEVICE_RM event and is in NPR state, the driver calls
lpfc_drop_node().  Subsequently, lpfc_drop_node() also removes an ndlp kref
thinking it is the initial reference.  This unintentionally introduces an
extra kref decrement on the ndlp object.

Fix by using the NLP_DROPPED node flag in lpfc_dev_loss_tmo_callbk() and
lpfc_drop_node() to coordinate the removal of the initial node reference.

In lpfc_dev_loss_tmo_callbk(), remove the SCSI transport reference provided
the node is registered in the dev_loss context because the driver cannot
call the SCSI transport in dev_loss context or afterwards.  And, have
lpfc_drop_node() not remove a reference if another thread is acting or has
already acted on it.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Justin Tee
377d7abadd scsi: lpfc: Qualify ndlp discovery state when processing RSCN
Conditionalize when to put an ndlp into recovery mode when processing
RSCNs.  As long as an ndlp state is beyond a PLOGI issue and has been
mapped to a transport layer before, the ndlp qualifies to be put into
recovery mode.  Otherwise, treat the ndlp rport normally through the
discovery engine.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Justin Tee
869ab8b8a3 scsi: lpfc: Remove extra ndlp kref decrement in FLOGI cmpl for loop topology
In lpfc_cmpl_els_flogi(), the return out: label decrements the ndlp kref
signaling that FLOGI processing on the ndlp is complete.  In loop topology
path, there is an unnecessary ndlp put because it also branches to the out:
label.  This also signals ndlp usage completion too soon.  As such, remove
the extra lpfc_nlp_put() when in loop topology.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Justin Tee
1a5cd3d073 scsi: lpfc: Simplify fcp_abort transport callback log message
The driver is reaching into a nvme_fc_cmd_iu ptr that belongs to the
transport during an abort.  This could cause an unintentional ptr
dereference into memory that the driver does not own.  Since the
nvme_fc_cmd_iu ptr was for logging purposes only, simplify the log message
such that the nvme_fc_cmd_iu reference is no longer needed.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Justin Tee
4cf7cfa8ba scsi: lpfc: Pull out fw diagnostic dump log message from driver's trace buffer
The firmware diagnostic dump log message does not need to be a part of the
driver's log trace buffer because it is an expected user triggered event.
Change LOG_TRACE_EVENT verbose flag to LOG_SLI.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23 16:17:07 -04:00
Martin K. Petersen
e96277a570 Merge branch '6.5/scsi-staging' into 6.5/scsi-fixes
Pull in the currently staged SCSI fixes for 6.5.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-11 12:15:15 -04:00
Linus Torvalds
7fcd473a64 SCSI misc on 20230708
A few late arriving patches that missed the initial pull request.
 It's mostly bug fixes (the dt-bindings is a fix for the initial pull).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZKmvwSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishResAQCPDbBh
 omMRBE+W+Vx2TgOJGjo/F+T1D2JjBhLIGpNVggEApJtgrQutAToiCU/qIP9GOTl7
 evetzh5boMMuyD2s7ak=
 =pi4v
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "A few late arriving patches that missed the initial pull request. It's
  mostly bug fixes (the dt-bindings is a fix for the initial pull)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Remove unused function declaration
  scsi: target: docs: Remove tcm_mod_builder.py
  scsi: target: iblock: Quiet bool conversion warning with pr_preempt use
  scsi: dt-bindings: ufs: qcom: Fix ICE phandle
  scsi: core: Simplify scsi_cdl_check_cmd()
  scsi: isci: Fix comment typo
  scsi: smartpqi: Replace one-element arrays with flexible-array members
  scsi: target: tcmu: Replace strlcpy() with strscpy()
  scsi: ncr53c8xx: Replace strlcpy() with strscpy()
  scsi: lpfc: Fix lpfc_name struct packing
2023-07-08 12:35:18 -07:00
Tuo Li
0e881c0a4b scsi: lpfc: Fix a possible data race in lpfc_unregister_fcf_rescan()
The variable phba->fcf.fcf_flag is often protected by the lock
phba->hbalock() when is accessed. Here is an example in
lpfc_unregister_fcf_rescan():

  spin_lock_irq(&phba->hbalock);
  phba->fcf.fcf_flag |= FCF_INIT_DISC;
  spin_unlock_irq(&phba->hbalock);

However, in the same function, phba->fcf.fcf_flag is assigned with 0
without holding the lock, and thus can cause a data race:

  phba->fcf.fcf_flag = 0;

To fix this possible data race, a lock and unlock pair is added when
accessing the variable phba->fcf.fcf_flag.

Reported-by: BassCheck <bass@buaa.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Link: https://lore.kernel.org/r/20230630024748.1035993-1-islituo@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-05 21:14:22 -04:00
Linus Torvalds
ca7ce08d6a SCSI misc on 20230629
Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi,
 lpfc, qla2xxx).  We have a couple of major core changes impacting
 other systems: Command Duration Limits, which spills into block and
 ATA and block level Persistent Reservation Operations, which touches
 block, nvme, target and dm (both of which are added with merge commits
 containing a cover letter explaining what's going on).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZJ19cSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfZpAQCQBuWR
 ELcOhsaG5KzO6xLWcH8mjsOoxffKvazZjTKXlAD5ATEv7++E250oKS3t+yfjae5I
 Lc195MlDju85ItUQgfk=
 =U9ik
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi,
  lpfc, qla2xxx).

  We have a couple of major core changes impacting other systems:

   - Command Duration Limits, which spills into block and ATA

   - block level Persistent Reservation Operations, which touches block,
     nvme, target and dm

  Both of these are added with merge commits containing a cover letter
  explaining what's going on"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits)
  scsi: core: Improve warning message in scsi_device_block()
  scsi: core: Replace scsi_target_block() with scsi_block_targets()
  scsi: core: Don't wait for quiesce in scsi_device_block()
  scsi: core: Don't wait for quiesce in scsi_stop_queue()
  scsi: core: Merge scsi_internal_device_block() and device_block()
  scsi: sg: Increase number of devices
  scsi: bsg: Increase number of devices
  scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue
  scsi: ufs: ufs-pci: Add support for Intel Arrow Lake
  scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT
  scsi: ufs: wb: Add explicit flush_threshold sysfs attribute
  scsi: ufs: ufs-qcom: Switch to the new ICE API
  scsi: ufs: dt-bindings: qcom: Add ICE phandle
  scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk
  scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk
  scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC
  scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR
  scsi: ufs: core: Remove dedicated hwq for dev command
  scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command
  scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes
  ...
2023-06-30 11:57:07 -07:00
Arnd Bergmann
00c2cae6b6 scsi: lpfc: Fix lpfc_name struct packing
clang points out that the lpfc_name structure has an 8-byte alignment
requirement on most architectures, but is embedded in a number of other
structures that are forced to be only 1-byte aligned:

drivers/scsi/lpfc/lpfc_hw.h:1516:30: error: field pe within 'struct lpfc_fdmi_reg_port_list' is less aligned than 'struct lpfc_fdmi_port_entry' and is usually due to 'struct lpfc_fdmi_reg_port_list' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
        struct lpfc_fdmi_port_entry pe;
drivers/scsi/lpfc/lpfc_hw.h:850:19: error: field portName within 'struct _ADISC' is less aligned than 'struct lpfc_name' and is usually due to 'struct _ADISC' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
drivers/scsi/lpfc/lpfc_hw.h:851:19: error: field nodeName within 'struct _ADISC' is less aligned than 'struct lpfc_name' and is usually due to 'struct _ADISC' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
drivers/scsi/lpfc/lpfc_hw.h:922:19: error: field portName within 'struct _RNID' is less aligned than 'struct lpfc_name' and is usually due to 'struct _RNID' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
drivers/scsi/lpfc/lpfc_hw.h:923:19: error: field nodeName within 'struct _RNID' is less aligned than 'struct lpfc_name' and is usually due to 'struct _RNID' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]

From the git history, I can see that all the __packed annotations were done
specifically to avoid introducing implicit padding around the lpfc_name
instances, though this was probably the wrong approach.

To improve this, only annotate the one uint64_t field inside of lpfc_name
as packed, with an explicit 4-byte alignment, as is the default already on
the 32-bit x86 ABI but not on most others. With this, the other __packed
annotations can be removed again, as this avoids the incorrect padding.

Two other structures change their layout as a result of this change:

 - struct _LOGO never gained a __packed annotation even though it has the
   same alignment problem as the others but is not used anywhere in the
   driver today.

 - struct serv_param similarly has this issue, and it is used, my guess is
   that this is only an internal structure rather than part of a binary
   interface, so the padding has no negative effect here.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230616090705.2623408-1-arnd@kernel.org
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-21 21:09:18 -04:00
Justin Tee
9cefd6e7e0 scsi: lpfc: Fix incorrect big endian type assignment in bsg loopback path
The kernel test robot reported sparse warnings regarding incorrect type
assignment for __be16 variables in bsg loopback path.

Change the flagged lines to use the be16_to_cpu() and cpu_to_be16() macros
appropriately.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230614175944.3577-1-justintee8345@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202306110819.sDIKiGgg-lkp@intel.com/
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-14 21:57:48 -04:00
Gustavo A. R. Silva
a48e2c328c scsi: lpfc: Avoid -Wstringop-overflow warning
Prevent any potential integer wrapping issue, and avoid a
-Wstringop-overflow warning by using the check_mul_overflow() helper.

drivers/scsi/lpfc/lpfc.h:
837:#define LPFC_RAS_MIN_BUFF_POST_SIZE (256 * 1024)

drivers/scsi/lpfc/lpfc_debugfs.c:
2266 size = LPFC_RAS_MIN_BUFF_POST_SIZE * phba->cfg_ras_fwlog_buffsize;

this can wrap to negative if cfg_ras_fwlog_buffsize is large
enough. And even when in practice this is not possible (due to
phba->cfg_ras_fwlog_buffsize never being larger than 4[1]), the
compiler is legitimately warning us about potentially buggy code.

Fix the following warning seen under GCC-13:
In function ‘lpfc_debugfs_ras_log_data’,
    inlined from ‘lpfc_debugfs_ras_log_open’ at drivers/scsi/lpfc/lpfc_debugfs.c:2271:15:
drivers/scsi/lpfc/lpfc_debugfs.c:2210:25: warning: ‘memcpy’ specified bound between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
 2210 |                         memcpy(buffer + copied, dmabuf->virt,
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2211 |                                size - copied - 1);
      |                                ~~~~~~~~~~~~~~~~~~

Link: https://github.com/KSPP/linux/issues/305
Link: https://lore.kernel.org/linux-hardening/CABPRKS8zyzrbsWt4B5fp7kMowAZFiMLKg5kW26uELpg1cDKY3A@mail.gmail.com/ [1]
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/ZHkseX6TiFahvxJA@work
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-07 21:20:21 -04:00
Justin Tee
bb26224ed4 scsi: lpfc: Use struct_size() helper
Prefer struct_size() over open-coded versions of idiom:

sizeof(struct-with-flex-array) + sizeof(typeof-flex-array-elements) * count

where count is the max number of items the flexible array is supposed to
contain.

Link: https://github.com/KSPP/linux/issues/160
Co-developed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230531223319.24328-1-justintee8345@gmail.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-07 21:20:21 -04:00
Justin Tee
6e8a669e61 scsi: lpfc: Fix incorrect big endian type assignments in FDMI and VMID paths
The kernel test robot reported sparse warnings regarding the improper usage
of beXX_to_cpu() macros.

Change the flagged FDMI and VMID member variables to __beXX and redo the
beXX_to_cpu() macros appropriately.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230530191405.21580-1-justintee8345@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202305261159.lTW5NYrv-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202305260751.NWFvhLY5-lkp@intel.com/
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:17:52 -04:00
Martin K. Petersen
21be4d0344 Merge patch series "lpfc: Update lpfc to revision 14.2.0.13"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.2.0.13

This patch set contains discovery bug fixes, firmware logging
improvements, clean up of CQ handling, and statistics collection
enhancements.

The patches were cut against Martin's 6.5/scsi-queue tree.

Link: https://lore.kernel.org/r/20230523183206.7728-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:15:07 -04:00
Justin Tee
b93f9eb8f4 scsi: lpfc: Copyright updates for 14.2.0.13 patches
Update copyrights to 2023 for files modified in the 14.2.0.13 patch set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-10-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:20 -04:00
Justin Tee
48abf8b4b5 scsi: lpfc: Update lpfc version to 14.2.0.13
Update lpfc version to 14.2.0.13

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:20 -04:00
Justin Tee
93190ac1d4 scsi: lpfc: Enhance congestion statistics collection
Various improvements are made for collecting congestion statistics:

 - Pre-existing logic is replaced with use of an hrtimer for increased
   reporting accuracy.

 - Congestion timestamp information is reorganized into a single struct.

 - Common statistic collection logic is refactored into a helper routine.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:20 -04:00
Justin Tee
6a84d01508 scsi: lpfc: Clean up SLI-4 CQE status handling
There is mishandling of SLI-4 CQE status values larger than what is allowed
by the LPFC_IOCB_STATUS_MASK of 4 bits.  The LPFC_IOCB_STATUS_MASK is a
leftover SLI-3 construct and serves no purpose in SLI-4 path.

Remove the LPFC_IOCB_STATUS_MASK and clean up general CQE status handling
in SLI-4 completion paths.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:20 -04:00
Justin Tee
b9951e1cff scsi: lpfc: Change firmware upgrade logging to KERN_NOTICE instead of TRACE_EVENT
A firmware upgrade does not necessitate dumping of phba->dbg_log[] to kmsg
via LOG_TRACE_EVENT.  A simple KERN_NOTICE log message should suffice to
notify the user of successful or unsuccessful firmware upgrade.  As such,
firmware upgrade log messages are updated to use KERN_NOTICE instead of
LOG_TRACE_EVENT.  Additionally, in order to notify the user of reset type
for instantiating newly downloaded firmware, lpfc_log_msg's default
KERN_LEVEL is updated to 5 or KERN_NOTICE.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:20 -04:00
Justin Tee
9914a3d033 scsi: lpfc: Revise NPIV ELS unsol rcv cmpl logic to drop ndlp based on nlp_state
When NPIV ports are zoned to devices that support both initiator and target
mode, a remote device's initiated PRLI results in unintended final kref
clean up of the device's ndlp structure.  This disrupts NPIV ports'
discovery for target devices that support both initiator and target mode.

Modify the NPIV lpfc_drop_node clause such that we allow the ndlp to live
so long as it was in NLP_STE_PLOGI_ISSUE, NLP_STE_REG_LOGIN_ISSUE, or
NLP_STE_PRLI_ISSUE nlp_state.  This allows lpfc's issued PRLI completion
routine to determine if the final kref clean up should execute rather than
a remote device's issued PRLI.

Fixes: db651ec225 ("scsi: lpfc: Correct used_rpi count when devloss tmo fires with no recovery")
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:20 -04:00
Justin Tee
73ded37869 scsi: lpfc: Account for fabric domain ctlr device loss recovery
Pre-existing device loss recovery logic via the NLP_IN_RECOV_POST_DEV_LOSS
flag only handled Fabric Port Login, Fabric Controller, Management, and
Name Server addresses.

Fabric domain controllers fall under the same category for usage of the
NLP_IN_RECOV_POST_DEV_LOSS flag.  Add a default case statement to mark an
ndlp for device loss recovery.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-4-justintee8345@gmail.com
Acked-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:20 -04:00
Justin Tee
fd57a687d4 scsi: lpfc: Clear NLP_IN_DEV_LOSS flag if already in rediscovery
In dev_loss_tmo callback routine, we early return if the ndlp is in a state
of rediscovery.  This occurs when a target proactively PLOGIs or PRLIs
after an RSCN before the dev_loss_tmo callback routine is scheduled to run.
Move clear of the NLP_IN_DEV_LOSS flag before the ndlp state check in such
cases.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:20 -04:00
Justin Tee
a4157aaf0f scsi: lpfc: Fix use-after-free rport memory access in lpfc_register_remote_port()
Due to a target port D_ID swap, it is possible for the
lpfc_register_remote_port() routine to touch post mortem fc_rport memory
when trying to access fc_rport->dd_data.

The D_ID swap causes a simultaneous call to lpfc_unregister_remote_port(),
where fc_remote_port_delete() reclaims fc_rport memory.

Remove the fc_rport->dd_data->pnode NULL assignment because the following
line reassigns ndlp->rport with an fc_rport object from
fc_remote_port_add() anyways.  The pnode nullification is superfluous.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-2-justintee8345@gmail.com
Acked-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 18:14:19 -04:00
Azeem Shaikh
73be26b12d scsi: lpfc: Replace all non-returning strlcpy() with strscpy()
strlcpy() reads the entire source buffer first.  This read may exceed the
destination size limit.  This is both inefficient and can lead to linear
read overflows if a source string is not NUL-terminated [1].  In an effort
to remove strlcpy() completely [2], replace strlcpy() here with strscpy().
No return values were used, so direct replacement is safe.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Link: https://lore.kernel.org/r/20230530155745.343032-1-azeemshaikh38@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31 17:59:06 -04:00
Gustavo A. R. Silva
e90644b0ce scsi: lpfc: Replace one-element array with flexible-array member
One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element arrays with flexible-array
members in a couple of structures, and refactor the rest of the code,
accordingly.

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines
on memcpy() and help us make progress towards globally enabling
-fstrict-flex-arrays=3 [1].

This results in no differences in binary output.

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/295
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/6c6dcab88524c14c47fd06b9332bd96162656db5.1684358315.git.gustavoars@kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22 18:02:02 -04:00
Justin Tee
fd9ffa6c74 scsi: lpfc: Update lpfc version to 14.2.0.12
Update lpfc version to 14.2.0.12.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230417191558.83100-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08 07:16:05 -04:00
Justin Tee
a7b94c1592 scsi: lpfc: Replace blk_irq_poll intr handler with threaded IRQ
It has been determined that the threaded IRQ API accomplishes effectively
the same performance metrics as blk_irq_poll.  As blk_irq_poll is mostly
scheduled by the softirqd and handled in softirq context, this is not
entirely desired from a Fibre Channel driver context.  A threaded IRQ model
fits cleaner.  This patch replaces the blk_irq_poll logic with threaded
IRQ.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230417191558.83100-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08 07:16:05 -04:00
Justin Tee
5fc849d805 scsi: lpfc: Add new RCQE status for handling DMA failures
A new RCQE status value indicating DMA failure when transferring
asynchronously received data to an RQE is introduced.  Such errors are
unexpected and handlers are updated to log KERN_ERR and dump lpfc's debug
trace buffer to kmsg.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230417191558.83100-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08 07:16:05 -04:00
Justin Tee
779d61dfb9 scsi: lpfc: Update congestion warning notification period
The CMF_SYNC_WQE command is updated to use an 8-bit field sync period.  All
related variables used to calculate congestion warning notifications are
updated to 8-bit fields accordingly.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230417191558.83100-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08 07:16:05 -04:00
Justin Tee
78e9e35004 scsi: lpfc: Match lock ordering of lpfc_cmd->buf_lock and hbalock for abort paths
The SCSI version of the abort handler routine, lpfc_abort_handler(), takes
the lpfc_cmd->buf_lock and then phba->hbalock.

Make the same change for the NVMe abort path, lpfc_nvme_fcp_abort(), to
have consistent lock ordering logic between the two abort paths.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230417191558.83100-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08 07:16:05 -04:00
Justin Tee
97f975823f scsi: lpfc: Fix double free in lpfc_cmpl_els_logo_acc() caused by lpfc_nlp_not_used()
Smatch detected a double free path because lpfc_nlp_not_used() releases an
ndlp object before reaching lpfc_nlp_put() at the end of
lpfc_cmpl_els_logo_acc().

Remove the outdated lpfc_nlp_not_used() routine.  In
lpfc_mbx_cmpl_ns_reg_login(), replace the call with lpfc_nlp_put().  In
lpfc_cmpl_els_logo_acc(), replace the call with lpfc_unreg_rpi() and keep
the lpfc_nlp_put() at the end of the routine.  If ndlp's rpi was
registered, then lpfc_unreg_rpi()'s completion routine performs the final
ndlp clean up after lpfc_nlp_put() is called from lpfc_cmpl_els_logo_acc().
Otherwise if ndlp has no rpi registered, the lpfc_nlp_put() at the end of
lpfc_cmpl_els_logo_acc() is the final ndlp clean up.

Fixes: 4430f7fd09 ("scsi: lpfc: Rework locations of ndlp reference taking")
Cc: <stable@vger.kernel.org> # v5.11+
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/all/Y3OefhyyJNKH%2Fiaf@kili/
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230417191558.83100-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08 07:16:04 -04:00
Justin Tee
84c868a702 scsi: lpfc: Fix verbose logging for SCSI commands issued to SES devices
For SES LUNs with scsi_device sector_size member set to zero, there is no
point to log an LBA.  When verbose FCP driver logging is enabled, sanity
check sector_size before calling scsi_get_lba() on a scsi_cmnd.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230417191558.83100-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08 07:16:04 -04:00
Linus Torvalds
b68ee1c613 SCSI misc on 20230426
Updates to the usual drivers (megaraid_sas, scsi_debug, lpfc, target,
 mpi3mr, hisi_sas, arcmsr).  The major core change is the
 constification of the host templates (which touches everything) along
 with other minor fixups and clean ups.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZEmJACYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishU4FAP0WYhFC
 rkbY203/+ErUuwvOKum0VwJKUowCaUD0MBwScAD+Ok/NWobmjdXUBbPUbvVkr+hE
 8B/xs9hodX+1fVJcVG0=
 =fS/j
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (megaraid_sas, scsi_debug, lpfc, target,
  mpi3mr, hisi_sas, arcmsr).

  The major core change is the constification of the host templates
  (which touches everything) along with other minor fixups and clean
  ups"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits)
  scsi: ufs: mcq: Use pointer arithmetic in ufshcd_send_command()
  scsi: ufs: mcq: Annotate ufshcd_inc_sq_tail() appropriately
  scsi: cxlflash: s/semahpore/semaphore/
  scsi: lpfc: Silence an incorrect device output
  scsi: mpi3mr: Use IRQ save variants of spinlock to protect chain frame allocation
  scsi: scsi_debug: Fix missing error code in scsi_debug_init()
  scsi: hisi_sas: Work around build failure in suspend function
  scsi: lpfc: Fix ioremap issues in lpfc_sli4_pci_mem_setup()
  scsi: mpt3sas: Fix an issue when driver is being removed
  scsi: mpt3sas: Remove HBA BIOS version in the kernel log
  scsi: target: core: Fix invalid memory access
  scsi: scsi_debug: Drop sdebug_queue
  scsi: scsi_debug: Only allow sdebug_max_queue be modified when no shosts
  scsi: scsi_debug: Use scsi_host_busy() in delay_store() and ndelay_store()
  scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in stop_all_queued()
  scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in sdebug_blk_mq_poll()
  scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd
  scsi: scsi_debug: Use scsi_block_requests() to block queues
  scsi: scsi_debug: Protect block_unblock_all_queues() with mutex
  scsi: scsi_debug: Change shost list lock to a mutex
  ...
2023-04-26 15:39:25 -07:00
Jun Chen
8bfb89f614 scsi: lpfc: Silence an incorrect device output
In lpfc_sli4_pci_mem_unset(), case LPFC_SLI_INTF_IF_TYPE_1 does not have a
break statement, resulting in an incorrect device output.

Fix this by adding a break statement before the default option.

Signed-off-by: Jun Chen <jun_c@hust.edu.cn>
Link: https://lore.kernel.org/r/20230410023724.3209455-1-jun_c@hust.edu.cn
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-04-11 21:23:48 -04:00
Shuchang Li
91a0c0c141 scsi: lpfc: Fix ioremap issues in lpfc_sli4_pci_mem_setup()
When if_type equals zero and pci_resource_start(pdev, PCI_64BIT_BAR4)
returns false, drbl_regs_memmap_p is not remapped. This passes a NULL
pointer to iounmap(), which can trigger a WARN() on certain arches.

When if_type equals six and pci_resource_start(pdev, PCI_64BIT_BAR4)
returns true, drbl_regs_memmap_p may has been remapped and
ctrl_regs_memmap_p is not remapped. This is a resource leak and passes a
NULL pointer to iounmap().

To fix these issues, we need to add null checks before iounmap(), and
change some goto labels.

Fixes: 1351e69fc6 ("scsi: lpfc: Add push-to-adapter support to sli4")
Signed-off-by: Shuchang Li <lishuchang@hust.edu.cn>
Link: https://lore.kernel.org/r/20230404072133.1022-1-lishuchang@hust.edu.cn
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-04-11 21:03:00 -04:00
Martin K. Petersen
f467b865cf Merge branch '6.3/scsi-fixes' into 6.4/scsi-staging
Pull in the fixes branch to resolve an mpi3mr conflict reported by
sfr.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-31 21:45:14 -04:00
Linus Torvalds
ef5f68cc1f SCSI fixes on 20230310
20 Fixes all in drivers except the one zone storage revalidation fix
 to sd.  The megaraid_sas fixes are more on the level of a driver
 update (enabling crash dump and increasing lun number) but I thought
 you could let this slide on -rc1 and the next most extensive update is
 a load of fixes to mpi3mr.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZAuqwyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVm/AP4t1E3w
 Jc19sSzMV6OZc5x1eTSGzso80fkTeFgVXWOmcQD9G6ymH78hS0cb7FRIwtnHcV5r
 6TwbInHF2LwNKzPhuMI=
 =BdqN
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Twenty fixes all in drivers except the one zone storage revalidation
  fix to sd.

  The megaraid_sas fixes are more on the level of a driver update
  (enabling crash dump and increasing lun number) but I thought you
  could let this slide on -rc1 and the next most extensive update is a
  load of fixes to mpi3mr"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: Fix wrong zone_write_granularity value during revalidate
  scsi: storvsc: Handle BlockSize change in Hyper-V VHD/VHDX file
  scsi: megaraid_sas: Driver version update to 07.725.01.00-rc1
  scsi: megaraid_sas: Add crash dump mode capability bit in MFI capabilities
  scsi: megaraid_sas: Update max supported LD IDs to 240
  scsi: mpi3mr: Bad drive in topology results kernel crash
  scsi: mpi3mr: NVMe command size greater than 8K fails
  scsi: mpi3mr: Return proper values for failures in firmware init path
  scsi: mpi3mr: Wait for diagnostic save during controller init
  scsi: mpi3mr: Driver unload crashes host when enhanced logging is enabled
  scsi: mpi3mr: ioctl timeout when disabling/enabling interrupt
  scsi: lpfc: Avoid usage of list iterator variable after loop
  scsi: lpfc: Check kzalloc() in lpfc_sli4_cgn_params_read()
  scsi: ufs: mcq: qcom: Clean the return path of ufs_qcom_mcq_config_resource()
  scsi: ufs: mcq: qcom: Fix passing zero to PTR_ERR
  scsi: ufs: ufs-qcom: Remove impossible check
  scsi: ufs: core: Add soft dependency on governor_simpleondemand
  scsi: hisi_sas: Check devm_add_action() return value
  scsi: qla2xxx: Add option to disable FC2 Target support
  scsi: target: iscsi: Fix an error message in iscsi_check_key()
2023-03-10 20:45:53 -08:00
Martin K. Petersen
0b31b77f28 Merge patch series "PCI/AER: Remove redundant Device Control Error Reporting Enable"
Bjorn Helgaas <helgaas@kernel.org> says:

Since f26e58bf6f ("PCI/AER: Enable error reporting when AER is native"),
which appeared in v6.0, the PCI core has enabled PCIe error reporting for
all devices during enumeration.

Remove driver code to do this and remove unnecessary includes of
<linux/aer.h> from several other drivers.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 22:01:40 -05:00
Bjorn Helgaas
e891681b1d scsi: lpfc: Drop redundant pci_enable_pcie_error_reporting()
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-8-helgaas@kernel.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 22:00:38 -05:00
Justin Tee
22871fe3b6 scsi: lpfc: Copyright updates for 14.2.0.11 patches
Update copyrights to 2023 for files modified in the 14.2.0.11 patch set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-11-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:45 -05:00
Justin Tee
13b149bbcf scsi: lpfc: Update lpfc version to 14.2.0.11
Update lpfc version to 14.2.0.11.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-10-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:45 -05:00
Justin Tee
796876fdae scsi: lpfc: Revise lpfc_error_lost_link() reason code evaluation logic
Extended status reason code errors should mask off the IOERR_PARAM_MASK
before checking strict equalities for IOERR values.

Update the lpfc_error_lost_link() routine as such.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:45 -05:00
Justin Tee
27c2bcf00a scsi: lpfc: Skip waiting for register ready bits when in unrecoverable state
During tolerance tests that force an HBA to become unresponsive, rmmod
hangs resulting in the inability to remove the driver.

The lpfc_pci_remove_one_s4() routine attempts to submit a clean up mailbox
command via the lpfc_sli4_post_sync_mbox() routine, but ends up waiting
forever for a mailbox register to set its ready bit.  Because the HBA is in
an unrecoverable and unresponsive state, the ready bit will never be set.

Create a new routine called lpfc_sli4_unrecoverable_port(), which checks a
port status register's error notification bits.

Use the lpfc_sli4_unrecoverable_port() routine in ready bit check routines
to early return error if port is deemed unrecoverable.

Also, when the lpfc_handle_eratt_s4() handler detects an unrecoverable
state, call the lpfc_sli4_offline_eratt() routine to kick off flushing
outstanding I/O.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:45 -05:00
Justin Tee
db651ec225 scsi: lpfc: Correct used_rpi count when devloss tmo fires with no recovery
A fabric controller can sometimes send an RDP request right before a link
down event.  Because of this outstanding RDP request, the driver does not
remove the last reference count on its ndlp causing a potential leak of RPI
resources when devloss tmo fires.

In lpfc_cmpl_els_rsp(), modify the NPIV clause to always allow the
lpfc_drop_node() routine to execute when not registered with SCSI
transport.

This relaxes the contraint that an NPIV ndlp must be in a specific state in
order to call lpfc_drop node.  Logic is revised such that the
lpfc_drop_node() routine is always called to ensure the last ndlp decrement
occurs.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:44 -05:00
Justin Tee
1d0f9fea5d scsi: lpfc: Defer issuing new PLOGI if received RSCN before completing REG_LOGIN
When mapped to a target with multiple virtual ports, a link bounce
sometimes results in unsuccessful rediscovery of all of the target's
virtual ports.  This is because a succession of repeat RSCNs for the
virtual target ports leaves ndlps in the REG_LOGIN state with the
NLP_REG_LOGIN_SEND flag set.  With NLP_REG_LOGIN_SEND set, during the next
PLOGI, the driver will UNREG_RPI.  When UNREG_RPI is processed, the driver
can be in the middle of PRLI_ISSUE or MAPPED state resulting in an illegal
state transition by the discovery engine and stalling.

Fix by calling the discovery state machine with DEVICE_RECOVERY event
during RSCN processing.  This will set the NLP_IGNR_REG_CMPL bit and
prevent the old REG_LOGIN state from advancing.  Then for the new PLOGI
issue, add the check for the NLP_IGNR_REG_CMPL bit to delay issuing the new
PLOGI until the queued REG_LOGIN and UNREG_LOGIN have been processed.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:44 -05:00
Justin Tee
06578ac65e scsi: lpfc: Record LOGO state with discovery engine even if aborted
A target vendor array reboot in P2P topology can sometimes result in
unsuccessful rediscovery.

Rework the lpfc_cmpl_els_logo() routine such that when the LOGO completes
as a failure because of driver abort, the LOGO state is still recorded with
the discovery state machine.

This is a small rework to set LOGO completion without forcing a device
removal state change.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:44 -05:00
Justin Tee
c0d6071aa2 scsi: lpfc: Fix lockdep warning for rx_monitor lock when unloading driver
Lockdep enabled kernels report a theoretical deadlock state where the
cmf_timer interrupt occurs while the rx_monitor ring is being destroyed.

During rmmod, the cmf_timer is cancelled prior to the
lpfc_rx_monitor_destroy_ring call.  This actually eliminates the need to
take the rx_monitor ring lock in lpfc_rx_monitor_destroy_ring.  Thus, just
remove lock/unlock of rx_monitor in lpfc_rx_monitor_destroy_ring.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:44 -05:00
Justin Tee
bf21c9bb62 scsi: lpfc: Reorder freeing of various DMA buffers and their list removal
Code sections where DMA resources are freed before list removal are
reworked to ensure item removal before being freed.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:44 -05:00
Justin Tee
c6087b82a9 scsi: lpfc: Prevent lpfc_debugfs_lockstat_write() buffer overflow
A static code analysis tool flagged the possibility of buffer overflow when
using copy_from_user() for a debugfs entry.

Currently, it is possible that copy_from_user() copies more bytes than what
would fit in the mybuf char array.  Add a min() restriction check between
sizeof(mybuf) - 1 and nbytes passed from the userspace buffer to protect
against buffer overflow.

Link: https://lore.kernel.org/r/20230301231626.9621-2-justintee8345@gmail.com
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09 21:21:44 -05:00
Jakob Koschel
2850b23e9f scsi: lpfc: Avoid usage of list iterator variable after loop
If the &epd_pool->list is empty when executing
lpfc_get_io_buf_from_expedite_pool() the function would return an invalid
pointer. Even in the case if the list is guaranteed to be populated, the
iterator variable should not be used after the loop to be more robust for
future changes.

Linus proposed to avoid any use of the list iterator variable after the
loop, in the attempt to move the list iterator variable declaration into
the macro to avoid any potential misuse after the loop [1].

Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Signed-off-by: Jakob Koschel <jkl820.git@gmail.com>
Link: https://lore.kernel.org/r/20230301-scsi-lpfc-avoid-list-iterator-after-loop-v1-1-325578ae7561@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06 18:33:12 -05:00
Justin Tee
312320b0e0 scsi: lpfc: Check kzalloc() in lpfc_sli4_cgn_params_read()
If kzalloc() fails in lpfc_sli4_cgn_params_read(), then we rely on
lpfc_read_object()'s routine to NULL check pdata.

Currently, an early return error is thrown from lpfc_read_object() to
protect us from NULL ptr dereference, but the errno code is -ENODEV.

Change the errno code to a more appropriate -ENOMEM.

Reported-by: Kang Chen <void0red@gmail.com>
Link: https://lore.kernel.org/all/20230226102338.3362585-1-void0red@gmail.com
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230228044336.5195-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06 18:33:12 -05:00
Linus Torvalds
8ca09d5fa3 cpumask: fix incorrect cpumask scanning result checks
It turns out that commit 596ff4a09b ("cpumask: re-introduce
constant-sized cpumask optimizations") exposed a number of cases of
drivers not checking the result of "cpumask_next()" and friends
correctly.

The documented correct check for "no more cpus in the cpumask" is to
check for the result being equal or larger than the number of possible
CPU ids, exactly _because_ we've always done those constant-sized
cpumask scans using a widened type before.  So the return value of a
cpumask scan should be checked with

	if (cpu >= nr_cpu_ids)
		...

because the cpumask scan did not necessarily stop exactly *at* that
maximum CPU id.

But a few cases ended up instead using checks like

	if (cpu == nr_cpumask_bits)
		...

which used that internal "widened" number of bits.  And that used to
work pretty much by accident (ok, in this case "by accident" is simply
because it matched the historical internal implementation of the cpumask
scanning, so it was more of a "intentionally using implementation
details rather than an accident").

But the extended constant-sized optimizations then did that internal
implementation differently, and now that code that did things wrong but
matched the old implementation no longer worked at all.

Which then causes subsequent odd problems due to using what ends up
being an invalid CPU ID.

Most of these cases require either unusual hardware or special uses to
hit, but the random.c one triggers quite easily.

All you really need is to have a sufficiently small CONFIG_NR_CPUS value
for the bit scanning optimization to be triggered, but not enough CPUs
to then actually fill that widened cpumask.  At that point, the cpumask
scanning will return the NR_CPUS constant, which is _not_ the same as
nr_cpumask_bits.

This just does the mindless fix with

   sed -i 's/== nr_cpumask_bits/>= nr_cpu_ids/'

to fix the incorrect uses.

The ones in the SCSI lpfc driver in particular could probably be fixed
more cleanly by just removing that repeated pattern entirely, but I am
not emptionally invested enough in that driver to care.

Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/lkml/481b19b5-83a0-4793-b4fd-194ad7b978c3@roeck-us.net/
Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/lkml/CAMuHMdUKo_Sf7TjKzcNDa8Ve+6QrK+P8nSQrSQ=6LTRmcBKNww@mail.gmail.com/
Reported-by: Vernon Yang <vernon2gm@gmail.com>
Link: https://lore.kernel.org/lkml/20230306160651.2016767-1-vernon2gm@gmail.com/
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-06 12:15:13 -08:00
Linus Torvalds
06caa75154 SCSI misc on 20230303
Updates that missed the first pull, mostly because of needing more
 soak time.  Driver updates (zfcp, ufs, mpi3mr, plus two ipr bug
 fixes), an enclosure services (ses) update (mostly bug fixes) and
 other minor bug fixes and changes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZAJf2SYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWYeAP9u6umX
 5Trq6mtfdPyhSxIhgD6AwJDgVkeApSKAHZRu0AD/dfMTQmeaJory4LD4JW+uAgVl
 yFPVz9VPlKTiaM2lwUI=
 =BKzr
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "Updates that missed the first pull, mostly because of needing more
  soak time.

  Driver updates (zfcp, ufs, mpi3mr, plus two ipr bug fixes), an
  enclosure services (ses) update (mostly bug fixes) and other minor bug
  fixes and changes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
  scsi: zfcp: Trace when request remove fails after qdio send fails
  scsi: zfcp: Change the type of all fsf request id fields and variables to u64
  scsi: zfcp: Make the type for accessing request hashtable buckets size_t
  scsi: ufs: core: Simplify ufshcd_execute_start_stop()
  scsi: ufs: core: Rely on the block layer for setting RQF_PM
  scsi: core: Extend struct scsi_exec_args
  scsi: lpfc: Fix double word in comments
  scsi: core: Remove the /proc/scsi/${proc_name} directory earlier
  scsi: core: Fix a source code comment
  scsi: cxgbi: Remove unneeded version.h include
  scsi: qedi: Remove unneeded version.h include
  scsi: mpi3mr: Remove unneeded version.h include
  scsi: mpi3mr: Fix missing mrioc->evtack_cmds initialization
  scsi: mpi3mr: Use number of bits to manage bitmap sizes
  scsi: mpi3mr: Remove unnecessary memcpy() to alltgt_info->dmi
  scsi: mpi3mr: Fix issues in mpi3mr_get_all_tgt_info()
  scsi: mpi3mr: Fix an issue found by KASAN
  scsi: mpi3mr: Replace 1-element array with flex-array
  scsi: ipr: Work around fortify-string warning
  scsi: ipr: Make ipr_probe_ioa_part2() return void
  ...
2023-03-03 14:41:50 -08:00
Linus Torvalds
8762069330 SCSI misc on 20230222
Updates to the usual drivers (ufs, lpfc, qla2xxx, libsas).  The major
 core change is a rework to remove the two helpers around
 scsi_execute_cmd and use it as the only submission interface along
 with other minor fixes and updates.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY/Z7BSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfF9AP9HzZaZ
 0XumuxjchcJHRntcIAzb9kE62SSWFxBrgZQCtAEAyhAO1zK284esgCa+arkzEC7p
 uhxpufQSGnYvbasStsU=
 =VsH1
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (ufs, lpfc, qla2xxx, libsas).

  The major core change is a rework to remove the two helpers around
  scsi_execute_cmd and use it as the only submission interface along
  with other minor fixes and updates"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (142 commits)
  scsi: ufs: core: Fix an error handling path in ufshcd_read_desc_param()
  scsi: ufs: core: Fix device management cmd timeout flow
  scsi: aic94xx: Add missing check for dma_map_single()
  scsi: smartpqi: Replace one-element array with flexible-array member
  scsi: mpt3sas: Fix a memory leak
  scsi: qla2xxx: Remove the unused variable wwn
  scsi: ufs: core: Fix kernel-doc syntax
  scsi: ufs: core: Add hibernation callbacks
  scsi: snic: Fix memory leak with using debugfs_lookup()
  scsi: ufs: core: Limit DMA alignment check
  scsi: Documentation: Correct spelling
  scsi: Documentation: Correct spelling
  scsi: target: Documentation: Correct spelling
  scsi: aacraid: Allocate cmd_priv with scsicmd
  scsi: ufs: qcom: dt-bindings: Add SM8550 compatible string
  scsi: ufs: ufs-qcom: Clear qunipro_g4_sel for HW version major 5
  scsi: ufs: qcom: fix platform_msi_domain_free_irqs() reference
  scsi: ufs: core: Enable DMA clustering
  scsi: ufs: exynos: Fix the maximum segment size
  scsi: ufs: exynos: Fix DMA alignment for PAGE_SIZE != 4096
  ...
2023-02-22 13:41:41 -08:00
Bo Liu
442336a5a9 scsi: lpfc: Fix double word in comments
Remove the repeated word "the" in comments.

[mkp: fixed additional typos in the changed lines]

Link: https://lore.kernel.org/r/20230217083046.4090-1-liubo03@inspur.com
Signed-off-by: Bo Liu <liubo03@inspur.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-02-21 22:00:51 -05:00
Muneendra
64fd2ba977 scsi: scsi_transport_fc: Add an additional flag to fc_host_fpin_rcv()
The LLDD and the stack currently process FPINs received from the fabric,
but the stack is not aware of any action taken by the driver to alleviate
congestion. The current interface between the driver and the SCSI stack is
limited to passing the notification mainly for statistics and heuristics.

The reaction to an FPIN could be handled either by the driver or by the
stack (marginal path and failover). Amend the interface to indicate if
action on an FPIN has already been reacted to by the LLDDs or not. Add an
additional flag to fc_host_fpin_rcv() to indicate if the FPIN has been
acknowledged/reacted to by the driver.

Also added a new event code FCH_EVT_LINK_FPIN_ACK to notify to the user
that the event has been acknowledged/reacted by the LLDD driver

Link: https://lore.kernel.org/r/20230209034326.882514-1-muneendra.kumar@broadcom.com
Co-developed-by: Anil Gurumurthy <agurumurthy@marvell.com>
Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com>
Co-developed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-02-21 18:03:29 -05:00
Jakub Kicinski
2870c4d6a5 net: add missing includes of linux/sched/clock.h
Number of files depend on linux/sched/clock.h getting included
by linux/skbuff.h which soon will no longer be the case.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:19:46 +00:00
Justin Tee
191b5a3877 scsi: lpfc: Copyright updates for 14.2.0.10 patches
Update copyrights to 2023 for files modified in the 14.2.0.10 patch set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-01-12 00:03:15 -05:00
Justin Tee
41cf6bbe3d scsi: lpfc: Update lpfc version to 14.2.0.10
Update lpfc version to 14.2.0.10

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-01-12 00:03:15 -05:00
Justin Tee
96fb8c34e5 scsi: lpfc: Introduce new attention types for lpfc_sli4_async_fc_evt() handler
Define new FC Link ACQE with new attention types 0x8 (Link Activation
Failure) and 0x9 (Link Reset Protocol Event).

Both attention types are meant to be informational-only type ACQEs with no
action required.

0x8 is reported for diagnostic purposes, while 0x9 is posted during a
normal link up transition when activating BB Credit Recovery feature.

As such, modify lpfc_sli4_async_fc_evt() logic to log the attention types
according to its severity and early return when informational-only
attention types are encountered.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-01-12 00:03:15 -05:00
Justin Tee
f1d2337d3e scsi: lpfc: Reinitialize internal VMID data structures after FLOGI completion
After enabling VMID, an issue LIP test was erasing fabric switch VMID
information.

Introduce a lpfc_reinit_vmid() routine, which reinitializes all VMID data
structures upon FLOGI completion in fabric topology.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-01-12 00:03:15 -05:00
Justin Tee
21681b81b9 scsi: lpfc: Fix use-after-free KFENCE violation during sysfs firmware write
During the sysfs firmware write process, a use-after-free read warning is
logged from the lpfc_wr_object() routine:

  BUG: KFENCE: use-after-free read in lpfc_wr_object+0x235/0x310 [lpfc]
  Use-after-free read at 0x0000000000cf164d (in kfence-#111):
  lpfc_wr_object+0x235/0x310 [lpfc]
  lpfc_write_firmware.cold+0x206/0x30d [lpfc]
  lpfc_sli4_request_firmware_update+0xa6/0x100 [lpfc]
  lpfc_request_firmware_upgrade_store+0x66/0xb0 [lpfc]
  kernfs_fop_write_iter+0x121/0x1b0
  new_sync_write+0x11c/0x1b0
  vfs_write+0x1ef/0x280
  ksys_write+0x5f/0xe0
  do_syscall_64+0x59/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

The driver accessed wr_object pointer data, which was initialized into
mailbox payload memory, after the mailbox object was released back to the
mailbox pool.

Fix by moving the mailbox free calls to the end of the routine ensuring
that we don't reference internal mailbox memory after release.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-01-12 00:03:15 -05:00
Justin Tee
c051f1a424 scsi: lpfc: Exit PRLI completion handling early if ndlp not in PRLI_ISSUE state
In a large SAN testing configuration, frequent target port toggle tests are
occasionally resulting in missing lun path rediscoveries.  An outstanding
PRLI can be inflight when a target RSCN dissappearance occurs, causing the
driver to retry PRLIs using invalid rpi contexts.

Fix by verifying that an ndlp's state was not restarted from PRLI_ISSUE
due to an intermediate RSCN.  If not in a valid state, early exit PRLI
completion handling.

The last follow up RSCN indicating target reappearance retriggers
PLOGI/PRLI with a valid rpi context and is expected to succeed in LUN path
rediscovery.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-01-12 00:03:14 -05:00