During device discovery we ended up calling revalidate twice and thus
requested the same parameters multiple times. This was originally necessary
due to the request_queue and gendisk needing to be instantiated to
configure the block integrity profile.
Since this dependency no longer exists, reorganize the integrity probing
code so it can be run once at the end of discovery and drop the superfluous
revalidate call. Postponing the registration step involves splitting
sd_read_protection() into two functions, one to read the device protection
type and one to configure the mode of operation.
As part of this cleanup, make the printing code a bit less verbose.
Link: https://lore.kernel.org/r/20220302053559.32147-14-martin.petersen@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit a83da8a450 ("scsi: sd: Optimal I/O size should be a multiple of
physical block size") validated the reported optimal I/O size against the
physical block size to overcome problems with devices reporting nonsensical
transfer sizes.
However, some devices claim conformity to older SCSI versions that predate
the physical block size being reported. Other devices do not report a
physical block size at all. We need to be able to validate the optimal I/O
size on those devices as well.
Many devices report an OPTIMAL TRANSFER LENGTH GRANULARITY in the same VPD
page as the OPTIMAL TRANSFER LENGTH. Use this value to validate the optimal
I/O size. Also check that the reported granularity is a multiple of the
physical block size, if supported.
Link: https://lore.kernel.org/r/33fb522e-4f61-1b76-914f-c9e6a3553c9b@gmail.com
Link: https://lore.kernel.org/r/20220302053559.32147-9-martin.petersen@oracle.com
Reported-by: Bernhard Sulzer <micraft.b@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Low-level device drivers have had the ability to limit the size of an
INQUIRY for many years. This made sense for a wide variety of legacy
devices. However, we are unnecessarily truncating the INQUIRY response for
many modern devices. This prevents us from consulting fields beyond the
first 36 bytes.
If a device reports that it supports a larger INQUIRY response, and the
device also reports that it implements SPC-4 or newer, allow the larger
INQUIRY to proceed.
Link: https://lore.kernel.org/r/20220302053559.32147-4-martin.petersen@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some devices hang when a buffer size larger than expected is passed in the
ALLOCATION LENGTH field. For REPORT SUPPORTED OPERATION CODES we currently
only request a single command descriptor at a time and therefore the actual
size of the command is known ahead of time. Limit the ALLOCATION LENGTH to
the header size plus the command length of the opcode we are asking about.
Link: https://lore.kernel.org/r/20220302053559.32147-5-martin.petersen@oracle.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We currently default to 255 bytes when fetching VPD pages during discovery.
However, we have had a few devices that are known to wedge if the requested
buffer exceeds a certain size. See commit af73623f5f ("[SCSI] sd: Reduce
buffer size for vpd request") which works around one example of this
problem in the SCSI disk driver.
With commit d188b0675b ("scsi: core: Add sysfs attributes for VPD pages
0h and 89h") we now risk triggering the same issue in the generic midlayer
code.
The problem with the ATA VPD page in particular is that the SCSI portion of
the page is trailed by 512 bytes of verbatim ATA Identify Device
information. However, not all controllers actually provide the additional
512 bytes and will lock up if one asks for more than the 64 bytes
containing the SCSI protocol fields.
Instead of picking a new, somewhat arbitrary, number of bytes for the VPD
buffer size, start fetching the 4-byte header for each page. The header
contains the size of the page as far as the device is concerned. We can use
the reported size to specify the correct allocation length when
subsequently fetching the full page.
The header validation is done by a new helper function scsi_get_vpd_size()
and both scsi_get_vpd_page() and scsi_get_vpd_buf() now rely on this to
query the page size.
In addition, scsi_get_vpd_page() is simplified to mirror the logic in
scsi_get_vpd_page(). This involves removing the Supported VPD Pages lookup
prior to attempting to query a page. There does not appear any evidence,
even in the oldest SCSI specs, that this step is required. We already rely
on scsi_get_vpd_page() throughout the stack and this function never
consulted the Supported VPD Pages. Since this has not caused any problems
it should be safe to remove the precondition from scsi_get_vpd_page().
Instrumented runs also revealed that the Supported VPD Pages lookup had
little effect since the device page index often was larger than the
supplied buffer size. As a result, inquiries frequently bypassed the index
check and went through the "If we ran off the end of the buffer, give us
the benefit of the doubt" code path which assumed the page was present
despite not being listed. The revised code takes both the page size
reported by the device as well as the size of the buffer provided by the
scsi_get_vpd_page() caller into account.
Link: https://lore.kernel.org/r/20220302053559.32147-3-martin.petersen@oracle.com
Fixes: d188b0675b ("scsi: core: Add sysfs attributes for VPD pages 0h and 89h")
Reported-by: Maciej W. Rozycki <macro@orcam.me.uk>
Tested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The bug is here:
p->target_id, p->target_lun);
The list iterator 'p' will point to a bogus position containing HEAD if the
list is empty or no element is found. This case must be checked before any
use of the iterator, otherwise it will lead to an invalid memory access.
To fix this bug, add a check. Use a new variable 'iter' as the list
iterator, and use the original variable 'p' as a dedicated pointer to point
to the found element.
Link: https://lore.kernel.org/r/20220414040231.2662-1-xiam0nd.tong@gmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some devices may return invalid or zeroed data during an UIC error
condition. In addition, reading these SFRs will clear them. This means the
subsequent error handling will not be able to see them and therefore no
error handling will be scheduled.
Skip reading these SFRs in ufshcd_dump_regs().
Link: https://lore.kernel.org/r/1648689845-33521-1-git-send-email-kwmad.kim@samsung.com
Fixes: d672475664 ("scsi: ufs: Use explicit access size in ufshcd_dump_regs")
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For SCSI hosts which enable host_tagset the NUMA node returned from
blk_mq_hw_queue_to_node() is NUMA_NO_NODE always. Then, since in
scsi_mq_setup_tags() the default we choose for the tag_set NUMA node is
NUMA_NO_NODE, we always evaluate the NUMA node as NUMA_NO_NODE in functions
like blk_mq_alloc_rq_map().
The reason we get NUMA_NO_NODE from blk_mq_hw_queue_to_node() is that the
hctx_idx passed is BLK_MQ_NO_HCTX_IDX - so we can't match against a (HW)
queue mapping index.
Improve this by defaulting the tag_set NUMA node to the same NUMA node of
the SCSI host DMA dev.
Link: https://lore.kernel.org/r/1648640315-21419-1-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the 'zone_cap_mb' kernel module parameter. This parameter defines the
zone capacity. The zone capacity must be less than or equal to the zone
size.
Report that sequential write zones and gap zones are paired in the Zoned
Block Device Characteristics VPD page (page B6h).
This patch has been tested as follows:
modprobe scsi_debug delay=0 sector_size=512 dev_size_mb=128 zbc=host-managed zone_nr_conv=16 zone_size_mb=4 zone_cap_mb=3
modprobe brd rd_nr=1 rd_size=$((1<<20))
mkfs.f2fs -m /dev/ram0 -c /dev/${scsi_debug_dev}
mount /dev/ram0 /mnt
# Run a fio job that uses /mnt
Link: https://lore.kernel.org/r/20220421183023.3462291-10-bvanassche@acm.org
Cc: Douglas Gilbert <dgilbert@interlog.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
[ bvanassche: Switched to reporting a constant zone starting LBA granularity ]
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ZBC-2 allows host-managed disks to report gap zones. This allow zoned disks
to report an offset between data zone starts that is a power of two even if
the number of logical blocks with data per zone is not a power of two.
Another new feature in ZBC-2 is support for constant zone starting LBA
offsets. For zoned disks that report a constant zone starting LBA offset,
hide the gap zones from the block layer. Report the offset between data
zone starts as zone size and report the number of logical blocks with data
per zone as the zone capacity.
Link: https://lore.kernel.org/r/20220421183023.3462291-7-bvanassche@acm.org
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
[ bvanassche: Reworked this patch ]
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>