forked from Minki/linux
a6f0788ec2
This adds a new block layer operation to zero out a range of LBAs. This allows to implement zeroing for devices that don't use either discard with a predictable zero pattern or WRITE SAME of zeroes. The prominent example of that is NVMe with the Write Zeroes command, but in the future, this should also help with improving the way zeroing discards work. For this operation, suitable entry is exported in sysfs which indicate the number of maximum bytes allowed in one write zeroes operation by the device. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@hgst.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
280 lines
11 KiB
Plaintext
280 lines
11 KiB
Plaintext
What: /sys/block/<disk>/stat
|
|
Date: February 2008
|
|
Contact: Jerome Marchand <jmarchan@redhat.com>
|
|
Description:
|
|
The /sys/block/<disk>/stat files displays the I/O
|
|
statistics of disk <disk>. They contain 11 fields:
|
|
1 - reads completed successfully
|
|
2 - reads merged
|
|
3 - sectors read
|
|
4 - time spent reading (ms)
|
|
5 - writes completed
|
|
6 - writes merged
|
|
7 - sectors written
|
|
8 - time spent writing (ms)
|
|
9 - I/Os currently in progress
|
|
10 - time spent doing I/Os (ms)
|
|
11 - weighted time spent doing I/Os (ms)
|
|
For more details refer Documentation/iostats.txt
|
|
|
|
|
|
What: /sys/block/<disk>/<part>/stat
|
|
Date: February 2008
|
|
Contact: Jerome Marchand <jmarchan@redhat.com>
|
|
Description:
|
|
The /sys/block/<disk>/<part>/stat files display the
|
|
I/O statistics of partition <part>. The format is the
|
|
same as the above-written /sys/block/<disk>/stat
|
|
format.
|
|
|
|
|
|
What: /sys/block/<disk>/integrity/format
|
|
Date: June 2008
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Metadata format for integrity capable block device.
|
|
E.g. T10-DIF-TYPE1-CRC.
|
|
|
|
|
|
What: /sys/block/<disk>/integrity/read_verify
|
|
Date: June 2008
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Indicates whether the block layer should verify the
|
|
integrity of read requests serviced by devices that
|
|
support sending integrity metadata.
|
|
|
|
|
|
What: /sys/block/<disk>/integrity/tag_size
|
|
Date: June 2008
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Number of bytes of integrity tag space available per
|
|
512 bytes of data.
|
|
|
|
|
|
What: /sys/block/<disk>/integrity/device_is_integrity_capable
|
|
Date: July 2014
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Indicates whether a storage device is capable of storing
|
|
integrity metadata. Set if the device is T10 PI-capable.
|
|
|
|
What: /sys/block/<disk>/integrity/protection_interval_bytes
|
|
Date: July 2015
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Describes the number of data bytes which are protected
|
|
by one integrity tuple. Typically the device's logical
|
|
block size.
|
|
|
|
What: /sys/block/<disk>/integrity/write_generate
|
|
Date: June 2008
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Indicates whether the block layer should automatically
|
|
generate checksums for write requests bound for
|
|
devices that support receiving integrity metadata.
|
|
|
|
What: /sys/block/<disk>/alignment_offset
|
|
Date: April 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Storage devices may report a physical block size that is
|
|
bigger than the logical block size (for instance a drive
|
|
with 4KB physical sectors exposing 512-byte logical
|
|
blocks to the operating system). This parameter
|
|
indicates how many bytes the beginning of the device is
|
|
offset from the disk's natural alignment.
|
|
|
|
What: /sys/block/<disk>/<partition>/alignment_offset
|
|
Date: April 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Storage devices may report a physical block size that is
|
|
bigger than the logical block size (for instance a drive
|
|
with 4KB physical sectors exposing 512-byte logical
|
|
blocks to the operating system). This parameter
|
|
indicates how many bytes the beginning of the partition
|
|
is offset from the disk's natural alignment.
|
|
|
|
What: /sys/block/<disk>/queue/logical_block_size
|
|
Date: May 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
This is the smallest unit the storage device can
|
|
address. It is typically 512 bytes.
|
|
|
|
What: /sys/block/<disk>/queue/physical_block_size
|
|
Date: May 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
This is the smallest unit a physical storage device can
|
|
write atomically. It is usually the same as the logical
|
|
block size but may be bigger. One example is SATA
|
|
drives with 4KB sectors that expose a 512-byte logical
|
|
block size to the operating system. For stacked block
|
|
devices the physical_block_size variable contains the
|
|
maximum physical_block_size of the component devices.
|
|
|
|
What: /sys/block/<disk>/queue/minimum_io_size
|
|
Date: April 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Storage devices may report a granularity or preferred
|
|
minimum I/O size which is the smallest request the
|
|
device can perform without incurring a performance
|
|
penalty. For disk drives this is often the physical
|
|
block size. For RAID arrays it is often the stripe
|
|
chunk size. A properly aligned multiple of
|
|
minimum_io_size is the preferred request size for
|
|
workloads where a high number of I/O operations is
|
|
desired.
|
|
|
|
What: /sys/block/<disk>/queue/optimal_io_size
|
|
Date: April 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Storage devices may report an optimal I/O size, which is
|
|
the device's preferred unit for sustained I/O. This is
|
|
rarely reported for disk drives. For RAID arrays it is
|
|
usually the stripe width or the internal track size. A
|
|
properly aligned multiple of optimal_io_size is the
|
|
preferred request size for workloads where sustained
|
|
throughput is desired. If no optimal I/O size is
|
|
reported this file contains 0.
|
|
|
|
What: /sys/block/<disk>/queue/nomerges
|
|
Date: January 2010
|
|
Contact:
|
|
Description:
|
|
Standard I/O elevator operations include attempts to
|
|
merge contiguous I/Os. For known random I/O loads these
|
|
attempts will always fail and result in extra cycles
|
|
being spent in the kernel. This allows one to turn off
|
|
this behavior on one of two ways: When set to 1, complex
|
|
merge checks are disabled, but the simple one-shot merges
|
|
with the previous I/O request are enabled. When set to 2,
|
|
all merge tries are disabled. The default value is 0 -
|
|
which enables all types of merge tries.
|
|
|
|
What: /sys/block/<disk>/discard_alignment
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may
|
|
internally allocate space in units that are bigger than
|
|
the exported logical block size. The discard_alignment
|
|
parameter indicates how many bytes the beginning of the
|
|
device is offset from the internal allocation unit's
|
|
natural alignment.
|
|
|
|
What: /sys/block/<disk>/<partition>/discard_alignment
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may
|
|
internally allocate space in units that are bigger than
|
|
the exported logical block size. The discard_alignment
|
|
parameter indicates how many bytes the beginning of the
|
|
partition is offset from the internal allocation unit's
|
|
natural alignment.
|
|
|
|
What: /sys/block/<disk>/queue/discard_granularity
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may
|
|
internally allocate space using units that are bigger
|
|
than the logical block size. The discard_granularity
|
|
parameter indicates the size of the internal allocation
|
|
unit in bytes if reported by the device. Otherwise the
|
|
discard_granularity will be set to match the device's
|
|
physical block size. A discard_granularity of 0 means
|
|
that the device does not support discard functionality.
|
|
|
|
What: /sys/block/<disk>/queue/discard_max_bytes
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may have
|
|
internal limits on the number of bytes that can be
|
|
trimmed or unmapped in a single operation. Some storage
|
|
protocols also have inherent limits on the number of
|
|
blocks that can be described in a single command. The
|
|
discard_max_bytes parameter is set by the device driver
|
|
to the maximum number of bytes that can be discarded in
|
|
a single operation. Discard requests issued to the
|
|
device must not exceed this limit. A discard_max_bytes
|
|
value of 0 means that the device does not support
|
|
discard functionality.
|
|
|
|
What: /sys/block/<disk>/queue/discard_zeroes_data
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may return
|
|
stale or random data when a previously discarded block
|
|
is read back. This can cause problems if the filesystem
|
|
expects discarded blocks to be explicitly cleared. If a
|
|
device reports that it deterministically returns zeroes
|
|
when a discarded area is read the discard_zeroes_data
|
|
parameter will be set to one. Otherwise it will be 0 and
|
|
the result of reading a discarded area is undefined.
|
|
|
|
What: /sys/block/<disk>/queue/write_same_max_bytes
|
|
Date: January 2012
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Some devices support a write same operation in which a
|
|
single data block can be written to a range of several
|
|
contiguous blocks on storage. This can be used to wipe
|
|
areas on disk or to initialize drives in a RAID
|
|
configuration. write_same_max_bytes indicates how many
|
|
bytes can be written in a single write same command. If
|
|
write_same_max_bytes is 0, write same is not supported
|
|
by the device.
|
|
|
|
What: /sys/block/<disk>/queue/write_zeroes_max_bytes
|
|
Date: November 2016
|
|
Contact: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
|
|
Description:
|
|
Devices that support write zeroes operation in which a
|
|
single request can be issued to zero out the range of
|
|
contiguous blocks on storage without having any payload
|
|
in the request. This can be used to optimize writing zeroes
|
|
to the devices. write_zeroes_max_bytes indicates how many
|
|
bytes can be written in a single write zeroes command. If
|
|
write_zeroes_max_bytes is 0, write zeroes is not supported
|
|
by the device.
|
|
|
|
What: /sys/block/<disk>/queue/zoned
|
|
Date: September 2016
|
|
Contact: Damien Le Moal <damien.lemoal@hgst.com>
|
|
Description:
|
|
zoned indicates if the device is a zoned block device
|
|
and the zone model of the device if it is indeed zoned.
|
|
The possible values indicated by zoned are "none" for
|
|
regular block devices and "host-aware" or "host-managed"
|
|
for zoned block devices. The characteristics of
|
|
host-aware and host-managed zoned block devices are
|
|
described in the ZBC (Zoned Block Commands) and ZAC
|
|
(Zoned Device ATA Command Set) standards. These standards
|
|
also define the "drive-managed" zone model. However,
|
|
since drive-managed zoned block devices do not support
|
|
zone commands, they will be treated as regular block
|
|
devices and zoned will report "none".
|
|
|
|
What: /sys/block/<disk>/queue/chunk_sectors
|
|
Date: September 2016
|
|
Contact: Hannes Reinecke <hare@suse.com>
|
|
Description:
|
|
chunk_sectors has different meaning depending on the type
|
|
of the disk. For a RAID device (dm-raid), chunk_sectors
|
|
indicates the size in 512B sectors of the RAID volume
|
|
stripe segment. For a zoned block device, either
|
|
host-aware or host-managed, chunk_sectors indicates the
|
|
size of 512B sectors of the zones of the device, with
|
|
the eventual exception of the last zone of the device
|
|
which may be smaller.
|