linux

Author	SHA1	Message	Date
Martin K. Petersen	a8f808839a	Merge branch '5.11/scsi-postmerge' into 5.11/scsi-fixes Merge two commits that had dependencies on other 5.11 trees (the block and the irq trees respectively). - We reverted a megaraid_sas change in 5.10 due to missing block layer plumbing. Now that this is in place, reinstate the change. - The hisi_sas driver had a dependency on a driver core irq change that went in through Thomas' tree. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-01-04 13:27:39 -05:00
John Garry	74a2921948	scsi: hisi_sas: Expose HW queues for v2 hw As a performance enhancement, make the completion queue interrupts managed. In addition, in commit `bf0beec060` ("blk-mq: drain I/O when all CPUs in a hctx are offline"), CPU hotplug for MQ devices using managed interrupts is made safe. So expose HW queues to blk-mq to take advantage of this. Flag Scsi_host.host_tagset is also set to ensure that the HBA is not sent more commands than it can handle. However the driver still does not use request tag for IPTT as there are many HW bugs means that special rules apply for IPTT allocation. Link: https://lore.kernel.org/r/1606905417-183214-6-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-21 22:21:05 -05:00
Linus Torvalds	60f7c503d9	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, qla2xxx, smartpqi, target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of cleanups, a major power management rework and a load of assorted minor updates. There are a few core updates (formatting fixes being the big one) but nothing major this cycle" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits) scsi: mpt3sas: Update driver version to 36.100.00.00 scsi: mpt3sas: Handle trigger page after firmware update scsi: mpt3sas: Add persistent MPI trigger page scsi: mpt3sas: Add persistent SCSI sense trigger page scsi: mpt3sas: Add persistent Event trigger page scsi: mpt3sas: Add persistent Master trigger page scsi: mpt3sas: Add persistent trigger pages support scsi: mpt3sas: Sync time periodically between driver and firmware scsi: qla2xxx: Update version to 10.02.00.104-k scsi: qla2xxx: Fix device loss on 4G and older HBAs scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry scsi: qla2xxx: Fix the call trace for flush workqueue scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines scsi: qla2xxx: Handle aborts correctly for port undergoing deletion scsi: qla2xxx: Fix N2N and NVMe connect retry failure scsi: qla2xxx: Fix FW initialization error on big endian machines scsi: qla2xxx: Fix crash during driver load on big endian machines scsi: qla2xxx: Fix compilation issue in PPC systems scsi: qla2xxx: Don't check for fw_started while posting NVMe command scsi: qla2xxx: Tear down session if FW say it is down ...	2020-12-16 13:34:31 -08:00
Xiang Chen	359db63378	scsi: hisi_sas: Select a suitable queue for internal I/Os For when managed interrupts are used (and shost->nr_hw_queues is set), a fixed queue - set per-device - is still used for internal I/Os. If all the CPUs mapped to that queue are offlined, then the completions for that queue are not serviced and any internal I/Os will time out. Fix by selecting a queue for internal I/Os from the queue mapped from the current CPU in this scenario. This is still not ideal as it does not deal with CPU hotplug for inflight internal I/Os, and needs proper support from [0]. [0] https://lore.kernel.org/linux-scsi/20200703130122.111448-1-hare@suse.de/T/#m7d77d049b18f33a24ef206af69ebb66d07440556 Link: https://lore.kernel.org/r/1607347855-59091-1-git-send-email-john.garry@huawei.com Fixes: `8d98416a55` ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-07 21:23:51 -05:00
Ahmed S. Darwish	18577cdcae	scsi: hisi_sas: Remove preemptible() hisi_sas_task_exec() uses preemptible() to see if it's safe to block. This does not work for CONFIG_PREEMPT_COUNT=n kernels in which preemptible() always returns 0. The problem is masked when enabling some of the common Kconfig.debug options (like CONFIG_DEBUG_ATOMIC_SLEEP), as they implicitly enable the preemption counter. In general, driver leaf functions should not make logic decisions based on the context they're called from. The caller should be the entity responsible for explicitly indicating context. Since hisi_sas_task_exec() already has a gfp_t flags parameter, use it as the explicit context marker. Link: https://lore.kernel.org/r/20201126132952.2287996-3-bigeasy@linutronix.de Fixes: `214e702d4b` ("scsi: hisi_sas: Adjust task reject period during host reset") Fixes: `550c0d89d5` ("scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()") Cc: Xiaofei Tan <tanxiaofei@huawei.com> Cc: Xiang Chen <chenxiang66@hisilicon.com> Cc: John Garry <john.garry@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-01 00:03:52 -05:00
Luo Jiaxing	623a4b6d5c	scsi: hisi_sas: Move debugfs code to v3 hw driver Relocate all the debugfs code for DFX to v3 hw since no other versions support it. Link: https://lore.kernel.org/r/1606207594-196362-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-11-30 23:42:16 -05:00
John Garry	fab09aaee8	scsi: hisi_sas: Stop using queue #0 always for v2 hw In commit `8d98416a55` ("scsi: hisi_sas: Switch v3 hw to MQ"), the dispatch function was changed to choose the delivery queue based on the request tag HW queue index. This heavily degrades performance for v2 hw, since the HW queues are not exposed there, and, as such, HW queue #0 is used for every command. Revert to previous behaviour for when nr_hw_queues is not set, that being to choose the HW queue based on target device index. Link: https://lore.kernel.org/r/1602750425-240341-1-git-send-email-john.garry@huawei.com Fixes: `8d98416a55` ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-10-26 17:34:27 -04:00
Linus Torvalds	55e0500eb5	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "The usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi, hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes. There are only three core changes: adding sense codes, cleaning up noretry and adding an option for limitless retries" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (226 commits) scsi: hisi_sas: Recover PHY state according to the status before reset scsi: hisi_sas: Filter out new PHY up events during suspend scsi: hisi_sas: Add device link between SCSI devices and hisi_hba scsi: hisi_sas: Add check for methods _PS0 and _PR0 scsi: hisi_sas: Add controller runtime PM support for v3 hw scsi: hisi_sas: Switch to new framework to support suspend and resume scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq() scsi: qedf: Remove redundant assignment to variable 'rc' scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store() scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed scsi: sun_esp: Use module_platform_driver to simplify the code scsi: sun3x_esp: Use module_platform_driver to simplify the code scsi: sni_53c710: Use module_platform_driver to simplify the code scsi: qlogicpti: Use module_platform_driver to simplify the code scsi: mac_esp: Use module_platform_driver to simplify the code scsi: jazz_esp: Use module_platform_driver to simplify the code scsi: mvumi: Fix error return in mvumi_io_attach() scsi: lpfc: Drop nodelist reference on error in lpfc_gen_req() scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs() ...	2020-10-14 15:15:35 -07:00
Xiang Chen	69f4ec1edb	scsi: hisi_sas: Recover PHY state according to the status before reset Currently the PHY state is set according to the state of the PHYs after reset. This is invalid as the PHYs are already re-initialized. Set PHY state according to the state before the reset instead of after. Link: https://lore.kernel.org/r/1601649038-25534-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-10-06 20:47:06 -04:00
Xiang Chen	b14a37e011	scsi: hisi_sas: Filter out new PHY up events during suspend Currently sas_resume_ha() is called while resuming the controller to wait for all suspended PHYs to come up and all the libsas events to be completed. There is a scenario which will cause task hung: For direct attach with two disks connected with two PHYs, disable phy0 before suspending the disk on phy1 and the controller, then enable phy0 and resume the controller, and task hung occurs as follows: [ 591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0] [ 593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined [ 593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume [ 593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata) [ 593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata) [ 593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200) [ 593.148350] sas: DOING DISCOVERY on port 0, pid:948 [ 593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found [ 593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0 [ 593.165663] sas: ata7: end_device-2:0: dev error handler [ 593.165730] sas: ata2: end_device-2:1: dev error handler [ 593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2 [ 593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down [ 593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata) [ 593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133 [ 593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32) [ 593.514295] ata7.00: configured for UDMA/133 [ 593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 [ 593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005 serial:S45NNA0M712225 [ 593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0 [ 593.543674] device_link_add 324 [ 593.546801] device_link_add 352 [ 593.549930] device_link_add 406 [ 593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0 [ 593.559208] device_link_add 444 [ 593.562335] device_link_add 455 [ 593.565517] scsi 2:0:2:0: Direct-Access ATA SAMSUNG MZ7LH960 404Q PQ: 0 ANSI: 5 [ 620.057464] phy-2:1: resume timeout [ 738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds. [ 738.848295] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744 [ 738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 738.862155] kworker/u256:0 D 0 8 2 0x00000028 [ 738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker [ 738.873693] Call trace: [ 738.876133] __switch_to+0xf4/0x148 [ 738.879613] __schedule+0x270/0x5d8 [ 738.883091] schedule+0x78/0x110 [ 738.886307] schedule_timeout+0x1ac/0x280 [ 738.890299] wait_for_completion+0x94/0x138 [ 738.894472] flush_workqueue+0x114/0x438 [ 738.898377] sas_porte_bytes_dmaed+0x400/0x500 [ 738.902801] sas_port_event_worker+0x28/0x40 [ 738.907053] process_one_work+0x1e8/0x360 [ 738.911046] worker_thread+0x44/0x478 [ 738.914698] kthread+0x150/0x158 [ 738.917915] ret_from_fork+0x10/0x1c [ 738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds. [ 738.928550] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744 [ 738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 738.942408] kworker/u256:1 D 0 948 2 0x00000028 [ 738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain [ 738.953766] Call trace: [ 738.956203] __switch_to+0xf4/0x148 [ 738.959678] __schedule+0x270/0x5d8 [ 738.963152] schedule+0x78/0x110 [ 738.966368] rpm_resume+0xcc/0x550 [ 738.969757] __pm_runtime_resume+0x3c/0x88 [ 738.973836] rpm_get_suppliers+0x50/0x148 [ 738.977829] __pm_runtime_set_status+0x124/0x2f0 [ 738.982427] scsi_sysfs_add_sdev+0x1a0/0x2a8 [ 738.986679] scsi_probe_and_add_lun+0x888/0xab0 [ 738.991190] __scsi_scan_target+0xec/0x520 [ 738.995268] scsi_scan_target+0x11c/0x128 [ 738.999261] sas_rphy_add+0x15c/0x1e8 [ 739.002907] sas_probe_devices+0xe4/0x150 [ 739.006899] sas_discover_domain+0x33c/0x588 [ 739.011150] process_one_work+0x1e8/0x360 [ 739.015143] worker_thread+0x44/0x478 [ 739.018789] kthread+0x150/0x158 [ 739.022003] ret_from_fork+0x10/0x1c ... If an extra phy0 up happens during resume of the SAS controller, it will emit a new libsas event (event PORTE_BYTES_DMAED and event DISCE_DISCOVER_DOMAIN). We will call function scsi_sysfs_add_sdev() in event DISCE_DISCOVER_DOMAIN, which will call __pm_runtime_set_status() to resume supplier (host controller). For runtime PM core, if device is in the resuming state, the later resume request of the device will wait for previous resume request to complete synchronously. At that point in time the state of the controller is still resuming as it waits for all libsas events to be completed, while libsas event DISCE_DISCOVER_DOMAIN is blocked as the state of the controller is resuming which causes a deadlock. To avoid the issue, filter out new PHY up events while the controller is suspended. Link: https://lore.kernel.org/r/1601649038-25534-7-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-10-06 20:47:06 -04:00
John Garry	8d98416a55	scsi: hisi_sas: Switch v3 hw to MQ Now that the block layer provides a shared tag, we can switch the driver to expose all HW queues. Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-10-06 08:33:44 -06:00
Luo Jiaxing	26f84f9bc3	scsi: hisi_sas: Code style cleanup Remove extra blank lines and add spaces around operators. Link: https://lore.kernel.org/r/1598958790-232272-9-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-09-02 22:49:08 -04:00
Xiang Chen	b601577df6	scsi: hisi_sas: Add missing newlines Newline is missing from some printk() statements. Add them. Link: https://lore.kernel.org/r/1598958790-232272-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-09-02 22:49:08 -04:00
Luo Jiaxing	981cc23e74	scsi: hisi_sas: Add BIST support for fixed code pattern Through the new debugfs interface the user can select fixed code patterns. Add two new interfaces fixed_code and fixed_code1. Link: https://lore.kernel.org/r/1598958790-232272-7-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-09-02 22:49:08 -04:00
Luo Jiaxing	2c4d582322	scsi: hisi_sas: Add BIST support for phy FFE Add BIST support for phy FFE (Feed forward equalizer) setting. The user can configure FFE through the new debugfs interface. FFE is a parameter used for link layer control. It will affect the link quality between the SAS controller and the backplane. In the BIST test, the FFE interface is provided to assist board testers in optimizing link parameters. The modification of the FFE parameter will affect the test after BIST or the normal running of the board. The user should save the initial FFE values and restore them after BIST test is complete. Link: https://lore.kernel.org/r/1598958790-232272-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-09-02 22:49:08 -04:00
Xiang Chen	847e835529	scsi: hisi_sas: Avoid accessing to SSP task for SMP I/Os hisi_sas_slot_task_free() attempts to dereference SSP task for non-ATA tasks. If the task is SMP, the code may reference the wrong structure although this may not cause any problems. To avoid this, only access to SSP task when slot->n_elem_dif is not 0 which indicates this is an SSP task. Link: https://lore.kernel.org/r/1598958790-232272-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-09-02 22:49:07 -04:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Luo Jiaxing	e16b9ed61e	scsi: hisi_sas: Do not reset phy timer to wait for stray phy up We found out that after phy up, the hardware reports another oob interrupt but did not follow a phy up interrupt: oob ready -> phy up -> DEV found -> oob read -> wait phy up -> timeout We run link reset when wait phy up timeout, and it send a normal disk into reset processing. So we made some circumvention action in the code, so that this abnormal oob interrupt will not start the timer to wait for phy up. Link: https://lore.kernel.org/r/1589552025-165012-2-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:23:57 -04:00
John Garry	11e673206f	scsi: hisi_sas: Rename hisi_sas_cq.pci_irq_mask In future we will want to use hisi_sas_cq.pci_irq_mask for non-pci interrupt masks, so rename to be more general. Link: https://lore.kernel.org/r/1579522957-4393-7-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-20 19:31:14 -05:00
Luo Jiaxing	3cd2f3c35d	scsi: hisi_sas: Modify the file permissions of trigger_dump to write only The trigger_dump file is only used to manually trigger the dump, and did not provide a read callback function for it, so its file permission setting to 600 is wrong,and should be changed to 200. Link: https://lore.kernel.org/r/1579522957-4393-5-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-20 19:31:14 -05:00
Xiang Chen	e9dc5e11c9	scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock After changing tasklet to workqueue or threaded irq, some critical resources are only used on threads (not in interrupt or bottom half of interrupt), so replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock to protect those critical resources. Link: https://lore.kernel.org/r/1579522957-4393-3-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-20 19:31:13 -05:00
Xiang Chen	81f338e970	scsi: hisi_sas: use threaded irq to process CQ interrupts Currently IRQ_EFFECTIVE_AFF_MASK is enabled for ARM_GIC and ARM_GIC3, so it only allows a single target CPU in the affinity mask to process interrupts and also interrupt thread, and the performance of using threaded irq is almost the same as tasklet. But if the config is not enabled, the interrupt thread will be allowed all the CPUs in the affinity mask. At that situation it improves the performance (about 20%). Note: IRQ_EFFECTIVE_AFF_MASK is configured differently for different architecture chip, and it seems to be better to make it be configured easily. Link: https://lore.kernel.org/r/1579522957-4393-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-20 19:31:13 -05:00
John Garry	964231aa0c	scsi: hisi_sas: Stop converting a bool into a bool The !! operator on a bool is pointless, so remove an example in hisi_sas_rescan_topology(). Link: https://lore.kernel.org/r/1573551059-107873-5-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-11-12 22:21:34 -05:00
Xiang Chen	8c39673d54	scsi: hisi_sas: Check sas_port before using it Need to check the structure sas_port before using it. Link: https://lore.kernel.org/r/1573551059-107873-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-11-12 22:21:33 -05:00
Luo Jiaxing	f873b66119	scsi: hisi_sas: Record the phy down event in debugfs The number of phy down reflects the quality of the link between SAS controller and disk. In order to allow the user to confirm the link quality of the system, we record the number of phy down for each phy. The user can check the current phy down count by reading the debugfs file corresponding to the specific phy, or clear the phy down count by writing 0 to the debugfs file. Link: https://lore.kernel.org/r/1571926105-74636-19-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:15 -04:00
Luo Jiaxing	cabe7c10c9	scsi: hisi_sas: Delete the debugfs folder of hisi_sas when the probe fails Although if the debugfs initialization fails, we will delete the debugfs folder of hisi_sas, but we did not consider the scenario where debugfs was successfully initialized, but the probe failed for other reasons. We found out that hisi_sas folder is still remain after the probe failed. When probe fail, we should delete debugfs folder to avoid the above issue. Link: https://lore.kernel.org/r/1571926105-74636-18-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:15 -04:00
Luo Jiaxing	8f6432986e	scsi: hisi_sas: Add ability to have multiple debugfs dumps We use the module parameter debugfs_dump_count to manage the upper limit of the memory block for multiple dumps. Link: https://lore.kernel.org/r/1571926105-74636-17-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:15 -04:00
Luo Jiaxing	905ab01faf	scsi: hisi_sas: Add module parameter for debugfs dump count We still only use dump index #0 however. Link: https://lore.kernel.org/r/1571926105-74636-16-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:15 -04:00
Luo Jiaxing	a70e33eae3	scsi: hisi_sas: Allocate memory for multiple dumps of debugfs We add multiple dumps for debugfs, but only allocate memory this time and only dump #0. Link: https://lore.kernel.org/r/1571926105-74636-15-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:15 -04:00
Luo Jiaxing	357e4fc7a9	scsi: hisi_sas: Add debugfs file structure for ITCT cache Create a file structure which was used to save the memory address for ITCT cache at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-14-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:14 -04:00
Luo Jiaxing	b714dd8f36	scsi: hisi_sas: Add debugfs file structure for IOST cache Create a file structure which was used to save the memory address for IOST cache at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-13-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:14 -04:00
Luo Jiaxing	0161d55f23	scsi: hisi_sas: Add debugfs file structure for ITCT Create a file structure which was used to save the memory address for ITCT at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-12-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:14 -04:00
Luo Jiaxing	e15f2e2dff	scsi: hisi_sas: Add debugfs file structure for IOST Create a file structure which was used to save the memory address for IOST at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-11-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:14 -04:00
Luo Jiaxing	1f66e1fd26	scsi: hisi_sas: Add debugfs file structure for port Create a file structure which was used to save the memory address and phy pointer for port at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it need. Link: https://lore.kernel.org/r/1571926105-74636-10-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:14 -04:00
Luo Jiaxing	c611639810	scsi: hisi_sas: Add debugfs file structure for registers Create a file structure which was used to save the memory address and hisi_hba pointer for REGS at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it need. Link: https://lore.kernel.org/r/1571926105-74636-9-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:14 -04:00
Luo Jiaxing	1b54c4db72	scsi: hisi_sas: Add debugfs file structure for DQ Create a file structure which was used to save the memory address and DQ pointer for DQ at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it need. Link: https://lore.kernel.org/r/1571926105-74636-8-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:14 -04:00
Luo Jiaxing	35ea630b2b	scsi: hisi_sas: Add debugfs file structure for CQ Create a file structure which was used to save the memory address and CQ pointer for CQ at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it need. Link: https://lore.kernel.org/r/1571926105-74636-7-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:13 -04:00
Luo Jiaxing	d28ed83b76	scsi: hisi_sas: Add timestamp for a debugfs dump It's useful to know when the dump occurred, so add a timestamp file for this. Link: https://lore.kernel.org/r/1571926105-74636-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:13 -04:00
Xiang Chen	550c0d89d5	scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec() For IOs from upper layer, preemption may be disabled as it may be called by function __blk_mq_delay_run_hw_queue which will call get_cpu() (it disables preemption). So if flags HISI_SAS_REJECT_CMD_BIT is set in function hisi_sas_task_exec(), it may disable preempt twice after down() and up() which will cause following call trace: BUG: scheduling while atomic: fio/60373/0x00000002 Call trace: dump_backtrace+0x0/0x150 show_stack+0x24/0x30 dump_stack+0xa0/0xc4 __schedule_bug+0x68/0x88 __schedule+0x4b8/0x548 schedule+0x40/0xd0 schedule_timeout+0x200/0x378 __down+0x78/0xc8 down+0x54/0x70 hisi_sas_task_exec.isra.10+0x598/0x8d8 [hisi_sas_main] hisi_sas_queue_command+0x28/0x38 [hisi_sas_main] sas_queuecommand+0x168/0x1b0 [libsas] scsi_queue_rq+0x2ac/0x980 blk_mq_dispatch_rq_list+0xb0/0x550 blk_mq_do_dispatch_sched+0x6c/0x110 blk_mq_sched_dispatch_requests+0x114/0x1d8 __blk_mq_run_hw_queue+0xb8/0x130 __blk_mq_delay_run_hw_queue+0x1c0/0x220 blk_mq_run_hw_queue+0xb0/0x128 blk_mq_sched_insert_requests+0xdc/0x208 blk_mq_flush_plug_list+0x1b4/0x3a0 blk_flush_plug_list+0xdc/0x110 blk_finish_plug+0x3c/0x50 blkdev_direct_IO+0x404/0x550 generic_file_read_iter+0x9c/0x848 blkdev_read_iter+0x50/0x78 aio_read+0xc8/0x170 io_submit_one+0x1fc/0x8d8 __arm64_sys_io_submit+0xdc/0x280 el0_svc_common.constprop.0+0xe0/0x1e0 el0_svc_handler+0x34/0x90 el0_svc+0x10/0x14 ... To solve the issue, check preemptible() to avoid disabling preempt multiple when flag HISI_SAS_REJECT_CMD_BIT is set. Link: https://lore.kernel.org/r/1571926105-74636-5-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:13 -04:00
Xiang Chen	8fa9a7bd30	scsi: hisi_sas: use wait_for_completion_timeout() when clearing ITCT When injecting 2bit ecc errors, it will cause confusion inside SAS controller which needs host reset to recover it. If a device is gone at the same times inject 2bit ecc errors, we may not receive the ITCT interrupt so it will wait for completion in clear_itct_v3_hw() all the time. And host reset will also not occur because it can't require hisi_hba->sem, so the system will be suspended. To solve the issue, use wait_for_completion_timeout() instead of wait_for_completion(), and also don't mark the gone device as SAS_PHY_UNUSED when device gone. Link: https://lore.kernel.org/r/1571926105-74636-4-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:13 -04:00
Xiang Chen	35160421b6	scsi: hisi_sas: Don't create debugfs dump folder twice Due to a merge error, we attempt to create 2x debugfs dump folders, which fails: [ 861.101914] debugfs: Directory 'dump' with parent '0000:74:02.0' already present! This breaks the dump function. To fix, remove the superfluous attempt to create the folder. Fixes: `7ec7082c57` ("scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation") Link: https://lore.kernel.org/r/1571926105-74636-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:31:13 -04:00
Martin K. Petersen	a3a8d13f62	Merge branch '5.4/scsi-fixes' into 5.5/scsi-queue The qla2xxx driver updates for 5.5 depend on the fixes queued for 5.4. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-09 21:54:04 -04:00
Colin Ian King	b1000fcca1	scsi: hisi_sas: fix spelling mistake "digial" -> "digital" There is a spelling mistake in literal string. Fix it. Link: https://lore.kernel.org/r/20190916091706.32268-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-30 22:57:53 -04:00
YueHaibing	4b6b1bb686	scsi: hisi_sas: Make three functions static Fix sparse warnings: drivers/scsi/hisi_sas/hisi_sas_main.c:3686:6: warning: symbol 'hisi_sas_debugfs_release' was not declared. Should it be static? drivers/scsi/hisi_sas/hisi_sas_main.c:3708:5: warning: symbol 'hisi_sas_debugfs_alloc' was not declared. Should it be static? drivers/scsi/hisi_sas/hisi_sas_main.c:3799:6: warning: symbol 'hisi_sas_debugfs_bist_init' was not declared. Should it be static? Link: https://lore.kernel.org/r/20190923054035.19036-1-yuehaibing@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-23 23:09:42 -04:00
Xiang Chen	e74006edd0	scsi: hisi_sas: Fix the conflict between device gone and host reset When device gone, it will check whether it is during reset, if not, it will send internal task abort. Before internal task abort returned, reset begins, and it will check whether SAS_PHY_UNUSED is set, if not, it will call hisi_sas_init_device(), but at that time domain_device may already be freed or part of it is freed, so it may referenece null pointer in hisi_sas_init_device(). It may occur as follows: thread0 thread1 hisi_sas_dev_gone() check whether in RESET(no) internal task abort reset prep soft_reset ... (part of reset_done) internal task abort failed release resource anyway clear_itct device->lldd_dev=NULL hisi_sas_reset_init_all_device check sas_dev->dev_type is SAS_PHY_UNUSED and !device set dev_type SAS_PHY_UNUSED sas_free_device hisi_sas_init_device ... Semaphore hisi_hba.sema is used to sync the processes of device gone and host reset. To solve the issue, expand the scope that semaphore protects and let them never occur together. And also some places will check whether domain_device is NULL to judge whether the device is gone. So when device gone, need to clear sas_dev->sas_device. Link: https://lore.kernel.org/r/1567774537-20003-14-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-10 22:28:57 -04:00
Xiang Chen	97b151e758	scsi: hisi_sas: Add BIST support for phy loopback Add BIST (built in self test) support for phy loopback. Through the new debugfs interface, the user can configure loopback mode/linkrate/phy id/code mode before enabling it. And also user can enable/disable BIST function. Link: https://lore.kernel.org/r/1567774537-20003-13-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-10 22:28:57 -04:00
Luo Jiaxing	7ec7082c57	scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation We extract the code of memory allocate and construct an new function for it. We think it's convenient for subsequent optimization. Link: https://lore.kernel.org/r/1567774537-20003-12-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-10 22:28:57 -04:00
Luo Jiaxing	4bc058097a	scsi: hisi_sas: Remove some unused function arguments Some function arguments are unused, so remove them. Also move the timeout print in for wait_cmds_complete_timeout_vX_hw() callsites into that same function. Link: https://lore.kernel.org/r/1567774537-20003-11-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-10 22:28:56 -04:00
Xiang Chen	435a05cf8c	scsi: hisi_sas: Assign NCQ tag for all NCQ commands Currently the NCQ tag is only assigned for FPDMA READ and FPDMA WRITE commands, and for other NCQ commands (such as FPDMA SEND), their NCQ tags are set in the delivery command to 0. So for all the NCQ commands, we also need to assign normal NCQ tag for them, so drop the command type check in hisi_sas_get_ncq_tag() [drop hisi_sas_get_ncq_tag() altogether actually], and always use the ATA command NCQ tag when appropriate. Link: https://lore.kernel.org/r/1567774537-20003-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-10 22:28:56 -04:00
Xiang Chen	b45e05aa5d	scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device When init device for SAS disks, it will send TMF IO to clear disks. At that time TMF IO is broken by some operations such as injecting controller reset from HW RAs event, the TMF IO will be timeout, and at last device will be gone. Print is as followed: hisi_sas_v3_hw 0000:74:02.0: dev[240:1] found ... hisi_sas_v3_hw 0000:74:02.0: controller resetting... hisi_sas_v3_hw 0000:74:02.0: phyup: phy7 link_rate=10(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy0 link_rate=9(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=9(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy2 link_rate=9(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy3 link_rate=9(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy6 link_rate=10(sata) hisi_sas_v3_hw 0000:74:02.0: phyup: phy5 link_rate=11 hisi_sas_v3_hw 0000:74:02.0: phyup: phy4 link_rate=11 hisi_sas_v3_hw 0000:74:02.0: controller reset complete hisi_sas_v3_hw 0000:74:02.0: abort tmf: TMF task timeout and not done hisi_sas_v3_hw 0000:74:02.0: dev[240:1] is gone sas: driver on host 0000:74:02.0 cannot handle device 5000c500a75a860d, error:5 To improve the reliability, retry TMF IO max of 3 times for SAS disks which is the same as softreset does. Link: https://lore.kernel.org/r/1567774537-20003-6-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-10 22:28:56 -04:00

1 2 3 4 5 ...

294 Commits