linux/block
Ming Lei bf0beec060 blk-mq: drain I/O when all CPUs in a hctx are offline
Most blk-mq drivers depend on the auto-affinity of managed IRQs to set
up their queue mapping. Thomas mentioned the following point[1]:

"That was the constraint of managed interrupts from the very beginning:

 The driver/subsystem has to quiesce the interrupt line and the associated
 queue _before_ it gets shutdown in CPU unplug and not fiddle with it
 until it's restarted by the core when the CPU is plugged in again."

However, the current blk-mq implementation doesn't quiesce the hw queue
before the last CPU in the hctx is shut down.  Even worse,
CPUHP_BLK_MQ_DEAD is a cpuhp state handled after the CPU is down, so
there isn't any chance to quiesce the hctx before shutting down the CPU.

Add a new CPUHP_AP_BLK_MQ_ONLINE state to stop allocating from blk-mq
hctxs where the last CPU goes away, and wait for completion of in-flight
requests.  This guarantees that there is no in-flight I/O before
shutting down the managed IRQ.

Add a BLK_MQ_F_STACKING flag and set it for dm-rq and loop, so that we
don't wait for completion of in-flight requests from these drivers, which
avoids a potential deadlock. It is safe to do this for stacking drivers
as those do not use interrupts at all and their I/O completions are
triggered by the underlying devices' I/O completions.

[1] https://lore.kernel.org/linux-block/alpine.DEB.2.21.1904051331270.1802@nanos.tec.linutronix.de/

[hch: different retry mechanism, merged two patches, minor cleanups]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-29 10:23:25 -06:00
partitions block: always use a percpu variable for disk stats 2020-05-27 05:21:23 -06:00
badblocks.c block: switch all files cleared marked as GPLv2 to SPDX tags 2019-04-30 16:11:57 -06:00
bfq-cgroup.c block, bfq: invoke flush_idle_tree after reparent_active_queues in pd_offline 2020-03-21 14:31:03 -06:00
bfq-iosched.c blk-mq: remove the bio argument to ->prepare_request 2020-05-29 10:23:24 -06:00
bfq-iosched.h block, bfq: turn put_queue into release_process_ref in __bfq_bic_change_cgroup 2020-03-21 14:31:00 -06:00
bfq-wf2q.c block, bfq: get a ref to a group when adding it to a service tree 2020-02-03 06:58:15 -07:00
bio-integrity.c block: Make blk-integrity preclude hardware inline encryption 2020-05-14 09:48:03 -06:00
bio.c block: move update_io_ticks to blk-core.c 2020-05-27 05:21:23 -06:00
blk-cgroup-rwstat.c blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-cgroup-rwstat.h blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-cgroup.c Merge branch 'block-5.7' into for-5.8/block 2020-05-09 16:13:58 -06:00
blk-core.c block: reduce part_stat_lock() scope 2020-05-27 05:21:23 -06:00
blk-crypto-fallback.c block: blk-crypto-fallback: remove redundant initialization of variable err 2020-05-27 05:41:59 -06:00
blk-crypto-internal.h block: blk-crypto-fallback for Inline Encryption 2020-05-14 09:48:03 -06:00
blk-crypto.c block: blk-crypto-fallback for Inline Encryption 2020-05-14 09:48:03 -06:00
blk-exec.c block: add a blk_account_io_merge_bio helper 2020-05-27 05:21:23 -06:00
blk-flush.c block: remove the disk and queue NULL checks in blkdev_issue_flush 2020-05-22 08:45:59 -06:00
blk-integrity.c block: Make blk-integrity preclude hardware inline encryption 2020-05-14 09:48:03 -06:00
blk-ioc.c block: Fix use-after-free issue accessing struct io_cq 2020-03-12 07:07:38 -06:00
blk-iocost.c iocost: don't let vrate run wild while there's no saturation signal 2020-05-14 09:32:09 -06:00
blk-iolatency.c blkcg: s/RQ_QOS_CGROUP/RQ_QOS_LATENCY/ 2019-08-28 21:17:08 -06:00
blk-lib.c
blk-map.c block: Inline encryption support for blk-mq 2020-05-14 09:47:53 -06:00
blk-merge.c block: reduce part_stat_lock() scope 2020-05-27 05:21:23 -06:00
blk-mq-cpumap.c blk-mq: balance mapping between present CPUs and queues 2019-08-04 21:43:12 -06:00
blk-mq-debugfs-zoned.c block: Cleanup license notice 2019-01-17 21:21:40 -07:00
blk-mq-debugfs.c blk-mq: drain I/O when all CPUs in a hctx are offline 2020-05-29 10:23:25 -06:00
blk-mq-debugfs.h blk-mq: no need to check return value of debugfs_create functions 2019-06-13 03:00:30 -06:00
blk-mq-pci.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-rdma.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-sched.c blk-mq: make function '__blk_mq_sched_dispatch_requests' static 2020-04-29 09:16:53 -06:00
blk-mq-sched.h block: blk-mq: Remove blk_mq_sched_started_request and started_request 2019-07-23 07:25:09 -06:00
blk-mq-sysfs.c blk-mq: make sure that line break can be printed 2019-11-04 07:14:10 -07:00
blk-mq-tag.c blk-mq: drain I/O when all CPUs in a hctx are offline 2020-05-29 10:23:25 -06:00
blk-mq-tag.h blk-mq: add blk_mq_all_tag_iter 2020-05-29 10:23:25 -06:00
blk-mq-virtio.c blk-mq: Fix typo in comment 2020-03-17 20:55:21 +01:00
blk-mq.c blk-mq: drain I/O when all CPUs in a hctx are offline 2020-05-29 10:23:25 -06:00
blk-mq.h blk-mq: use BLK_MQ_NO_TAG in more places 2020-05-29 10:23:25 -06:00
blk-pm.c block: bypass blk_set_runtime_active for uninitialized q->dev 2019-09-12 07:11:56 -06:00
blk-pm.h
blk-rq-qos.c blk-wbt: fix performance regression in wbt scale_up/scale_down 2019-10-06 09:26:41 -06:00
blk-rq-qos.h blk-rq-qos: fix first node deletion of rq_qos_del() 2019-10-15 10:13:13 -06:00
blk-settings.c block: Introduce REQ_OP_ZONE_APPEND 2020-05-12 20:36:28 -06:00
blk-softirq.c block: Don't disable interrupts in trigger_softirq() 2019-11-18 07:29:22 -07:00
blk-stat.c blk-stat: Optimise blk_stat_add() 2019-10-07 21:19:10 -06:00
blk-stat.h block: deactivate blk_stat timer in wbt_disable_default() 2018-12-12 06:47:51 -07:00
blk-sysfs.c block: Introduce REQ_OP_ZONE_APPEND 2020-05-12 20:36:28 -06:00
blk-throttle.c blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-timeout.c block: add SPDX tags to block layer files missing licensing information 2019-04-30 16:12:03 -06:00
blk-wbt.c blk-wbt: Use tracepoint_string() for wbt_step tracepoint string literals 2020-04-17 08:21:44 -06:00
blk-wbt.h block/rq_qos: implement rq_qos_ops->queue_depth_changed() 2019-08-28 21:17:07 -06:00
blk-zoned.c block: Modify revalidate zones 2020-05-12 20:36:28 -06:00
blk.h block: add a blk_account_io_merge_bio helper 2020-05-27 05:21:23 -06:00
bounce.c block: Inline encryption support for blk-mq 2020-05-14 09:47:53 -06:00
bsg-lib.c block: Fix the type of 'sts' in bsg_queue_rq() 2019-12-20 11:52:01 -07:00
bsg.c compat_ioctl: bsg: add handler 2020-01-03 09:33:21 +01:00
cmdline-parser.c
elevator.c Merge branch 'for-linus' into for-5.5/block 2019-11-07 12:27:19 -07:00
genhd.c block: remove rcu_read_lock() from part_stat_lock() 2020-05-27 05:21:23 -06:00
ioctl.c block: Fix type of first compat_put_{,u}long() argument 2020-05-19 09:40:29 -06:00
ioprio.c docs: block: convert to ReST 2019-07-15 09:20:27 -03:00
Kconfig block: blk-crypto-fallback for Inline Encryption 2020-05-14 09:48:03 -06:00
Kconfig.iosched blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
keyslot-manager.c block: Make blk-integrity preclude hardware inline encryption 2020-05-14 09:48:03 -06:00
kyber-iosched.c blk-mq: remove the bio argument to ->prepare_request 2020-05-29 10:23:24 -06:00
Makefile block: blk-crypto-fallback for Inline Encryption 2020-05-14 09:48:03 -06:00
mq-deadline.c blk-mq: remove the bio argument to ->prepare_request 2020-05-29 10:23:24 -06:00
opal_proto.h block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
scsi_ioctl.c scsi: core: Allow non-root users to perform ZBC commands 2020-03-16 18:26:31 -04:00
sed-opal.c block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
t10-pi.c block: Allow t10-pi to be modular 2020-01-06 20:59:04 -07:00