linux/block
Ming Lei 396eaf21ee blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
blk_insert_cloned_request() is called in the fast path of a dm-rq driver
(e.g. blk-mq request-based DM mpath).  blk_insert_cloned_request() uses
blk_mq_request_bypass_insert() to directly append the request to the
blk-mq hctx->dispatch_list of the underlying queue.

1) This way isn't efficient enough because the hctx spinlock is always
used.

2) With blk_insert_cloned_request(), we completely bypass underlying
queue's elevator and depend on the upper-level dm-rq driver's elevator
to schedule IO.  But dm-rq currently can't get the underlying queue's
dispatch feedback at all.  Without knowing whether a request was issued
or not (e.g. due to underlying queue being busy) the dm-rq elevator will
not be able to provide effective IO merging (as a side-effect of dm-rq
currently blindly destaging a request from its elevator only to requeue
it after a delay, which kills any opportunity for merging).  This
obviously causes very bad sequential IO performance.

Fix this by updating blk_insert_cloned_request() to use
blk_mq_request_direct_issue().  blk_mq_request_direct_issue() allows a
request to be issued directly to the underlying queue and returns the
dispatch feedback (blk_status_t).  If blk_mq_request_direct_issue()
returns BLK_SYS_RESOURCE the dm-rq driver will now use DM_MAPIO_REQUEUE
to _not_ destage the request.  Whereby preserving the opportunity to
merge IO.

With this, request-based DM's blk-mq sequential IO performance is vastly
improved (as much as 3X in mpath/virtio-scsi testing).

Signed-off-by: Ming Lei <ming.lei@redhat.com>
[blk-mq.c changes heavily influenced by Ming Lei's initial solution, but
they were refactored to make them less fragile and easier to read/review]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17 09:46:54 -07:00
..
partitions partitions/msdos: Unable to mount UFS 44bsd partitions 2018-01-10 09:12:16 -07:00
badblocks.c badblocks: fix wrong return value in badblocks_set if badblocks are disabled 2017-11-03 11:29:50 -07:00
bfq-cgroup.c block, bfq: put async queues for root bfq groups too 2018-01-09 08:45:25 -07:00
bfq-iosched.c block, bfq: fix occurrences of request finish method's old name 2018-01-10 07:43:54 -07:00
bfq-iosched.h block, bfq: let a queue be merged only shortly after starting I/O 2018-01-05 09:26:09 -07:00
bfq-wf2q.c block, bfq: let a queue be merged only shortly after starting I/O 2018-01-05 09:26:09 -07:00
bio-integrity.c block: remove unnecessary NULL checks in bioset_integrity_free() 2017-10-06 13:03:12 -06:00
bio.c block: move bio_alloc_pages() to bcache 2018-01-06 09:18:00 -07:00
blk-cgroup.c blkcg: add sanity check for blkcg policy operations 2017-11-04 12:31:15 -06:00
blk-core.c blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback 2018-01-17 09:46:54 -07:00
blk-exec.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
blk-flush.c blk-mq: don't allocate driver tag upfront for flush rq 2017-11-04 12:40:13 -06:00
blk-integrity.c block: switch bios to blk_status_t 2017-06-09 09:27:32 -06:00
blk-ioc.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-lib.c Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block 2017-11-14 15:32:19 -08:00
blk-map.c blk_rq_map_user_iov: fix error override 2018-01-15 08:50:32 -07:00
blk-merge.c Revert "block: blk-merge: try to make front segments in full size" 2018-01-09 20:23:19 -07:00
blk-mq-cpumap.c blk-mq: map queues to all present CPUs 2017-07-24 10:01:31 -06:00
blk-mq-debugfs.c blk-mq: add missing RQF_STARTED to debugfs 2018-01-12 14:47:57 -07:00
blk-mq-debugfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-mq-pci.c blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL 2017-08-18 08:08:14 -06:00
blk-mq-rdma.c block: Add rdma affinity based queue mapping helper 2017-08-08 14:58:03 -04:00
blk-mq-sched.c blk-mq: remove confusing comment of blk_mq_sched_dispatch_requests 2018-01-05 08:36:33 -07:00
blk-mq-sched.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-mq-sysfs.c block: properly protect the 'queue' kobj in blk_unregister_queue 2018-01-15 08:41:38 -07:00
blk-mq-tag.c blk-mq: improve heavily contended tag case 2017-12-22 11:09:37 -07:00
blk-mq-tag.h Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block 2017-11-14 15:32:19 -08:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback 2018-01-17 09:46:54 -07:00
blk-mq.h blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback 2018-01-17 09:46:54 -07:00
blk-settings.c block: remove __bio_kmap_atomic 2017-11-10 19:53:25 -07:00
blk-softirq.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-stat.c treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
blk-stat.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-sysfs.c block: allow gendisk's request_queue registration to be deferred 2018-01-15 08:41:38 -07:00
blk-tag.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-throttle.c treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
blk-timeout.c block: add accessors for setting/querying request deadline 2018-01-10 11:47:47 -07:00
blk-wbt.c blk-wbt: fix comments typo 2017-11-23 22:00:20 -07:00
blk-wbt.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-zoned.c block: introduce zoned block devices zone write locking 2018-01-05 09:22:17 -07:00
blk.h block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit 2018-01-10 11:47:53 -07:00
bounce.c block: bounce: don't access bio->bi_io_vec in copy_to_high_bio_irq 2018-01-06 09:18:00 -07:00
bsg-lib.c block: Fix kernel-doc warnings reported when building with W=1 2018-01-09 11:15:17 -07:00
bsg.c block: pass full fmode_t to blk_verify_command 2017-11-10 19:53:25 -07:00
cfq-iosched.c block/cfq: cache rightmost rb_node 2017-09-08 18:26:49 -07:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat_ioctl.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
deadline-iosched.c deadline-iosched: Introduce zone locking support 2018-01-05 09:22:17 -07:00
elevator.c blk-mq: quiesce queue during switching io sched and updating nr_requests 2018-01-06 09:25:36 -07:00
genhd.c block: allow gendisk's request_queue registration to be deferred 2018-01-15 08:41:38 -07:00
ioctl.c block: move CAP_SYS_ADMIN check in blkdev_roset() 2017-10-25 12:25:00 -06:00
ioprio.c block: Add fallthrough markers to switch statements 2017-06-21 11:46:07 -06:00
Kconfig License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kconfig.iosched License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kyber-iosched.c block: kyber: check if there are requests in ctx in kyber_has_work() 2017-11-01 08:20:02 -06:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mq-deadline.c mq-deadline: make it clear that __dd_dispatch_request() works on all hw queues 2018-01-06 09:23:11 -07:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled 2017-09-11 09:45:52 -06:00
partition-generic.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
scsi_ioctl.c block: silently forbid sending any ioctl to a partition 2018-01-10 12:30:37 -07:00
sed-opal.c block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled 2017-09-11 09:45:52 -06:00
t10-pi.c t10-pi: Move opencoded contants to common header 2017-07-03 16:56:25 -06:00