linux/block
Yufen Yu 8d6996630c block: fix null pointer dereference in blk_mq_rq_timed_out()
We got a null pointer deference BUG_ON in blk_mq_rq_timed_out()
as following:

[  108.825472] BUG: kernel NULL pointer dereference, address: 0000000000000040
[  108.827059] PGD 0 P4D 0
[  108.827313] Oops: 0000 [#1] SMP PTI
[  108.827657] CPU: 6 PID: 198 Comm: kworker/6:1H Not tainted 5.3.0-rc8+ #431
[  108.829503] Workqueue: kblockd blk_mq_timeout_work
[  108.829913] RIP: 0010:blk_mq_check_expired+0x258/0x330
[  108.838191] Call Trace:
[  108.838406]  bt_iter+0x74/0x80
[  108.838665]  blk_mq_queue_tag_busy_iter+0x204/0x450
[  108.839074]  ? __switch_to_asm+0x34/0x70
[  108.839405]  ? blk_mq_stop_hw_queue+0x40/0x40
[  108.839823]  ? blk_mq_stop_hw_queue+0x40/0x40
[  108.840273]  ? syscall_return_via_sysret+0xf/0x7f
[  108.840732]  blk_mq_timeout_work+0x74/0x200
[  108.841151]  process_one_work+0x297/0x680
[  108.841550]  worker_thread+0x29c/0x6f0
[  108.841926]  ? rescuer_thread+0x580/0x580
[  108.842344]  kthread+0x16a/0x1a0
[  108.842666]  ? kthread_flush_work+0x170/0x170
[  108.843100]  ret_from_fork+0x35/0x40

The bug is caused by the race between timeout handle and completion for
flush request.

When timeout handle function blk_mq_rq_timed_out() try to read
'req->q->mq_ops', the 'req' have completed and reinitiated by next
flush request, which would call blk_rq_init() to clear 'req' as 0.

After commit 12f5b93145 ("blk-mq: Remove generation seqeunce"),
normal requests lifetime are protected by refcount. Until 'rq->ref'
drop to zero, the request can really be free. Thus, these requests
cannot been reused before timeout handle finish.

However, flush request has defined .end_io and rq->end_io() is still
called even if 'rq->ref' doesn't drop to zero. After that, the 'flush_rq'
can be reused by the next flush request handle, resulting in null
pointer deference BUG ON.

We fix this problem by covering flush request with 'rq->ref'.
If the refcount is not zero, flush_end_io() return and wait the
last holder recall it. To record the request status, we add a new
entry 'rq_status', which will be used in flush_end_io().

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: stable@vger.kernel.org # v4.18+
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>

-------
v2:
 - move rq_status from struct request to struct blk_flush_queue
v3:
 - remove unnecessary '{}' pair.
v4:
 - let spinlock to protect 'fq->rq_status'
v5:
 - move rq_status after flush_running_idx member of struct blk_flush_queue
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-27 07:01:25 -06:00
..
partitions
badblocks.c
bfq-cgroup.c bfq: Add per-device weight 2019-09-06 14:33:52 -06:00
bfq-iosched.c block, bfq: push up injection only after setting service time 2019-09-17 20:03:49 -06:00
bfq-iosched.h bfq: Add per-device weight 2019-09-06 14:33:52 -06:00
bfq-wf2q.c bfq: Fix the missing barrier in __bfq_entity_update_weight_prio 2019-09-06 14:33:48 -06:00
bio-integrity.c
bio.c block: move same page handling from __bio_add_pc_page to the callers 2019-08-22 07:14:39 -06:00
blk-cgroup.c bfq: Fix bfq linkage error 2019-09-14 11:33:43 -06:00
blk-core.c block: centralize PI remapping logic to the block layer 2019-09-17 20:03:49 -06:00
blk-exec.c
blk-flush.c block: fix null pointer dereference in blk_mq_rq_timed_out() 2019-09-27 07:01:25 -06:00
blk-integrity.c block: centralize PI remapping logic to the block layer 2019-09-17 20:03:49 -06:00
blk-ioc.c
blk-iocost.c iocost: bump up default latency targets for hard disks 2019-09-26 01:12:01 -06:00
blk-iolatency.c blkcg: s/RQ_QOS_CGROUP/RQ_QOS_LATENCY/ 2019-08-28 21:17:08 -06:00
blk-lib.c
blk-map.c
blk-merge.c
blk-mq-cpumap.c
blk-mq-debugfs-zoned.c
blk-mq-debugfs.c
blk-mq-debugfs.h
blk-mq-pci.c
blk-mq-rdma.c
blk-mq-sched.c blk-mq: move lockdep_assert_held() into elevator_exit 2019-09-26 00:45:05 -06:00
blk-mq-sched.h
blk-mq-sysfs.c block: split .sysfs_lock into two locks 2019-08-27 10:40:20 -06:00
blk-mq-tag.c
blk-mq-tag.h
blk-mq-virtio.c
blk-mq.c block: fix null pointer dereference in blk_mq_rq_timed_out() 2019-09-27 07:01:25 -06:00
blk-mq.h
blk-pm.c block: bypass blk_set_runtime_active for uninitialized q->dev 2019-09-12 07:11:56 -06:00
blk-pm.h
blk-rq-qos.c block/rq_qos: implement rq_qos_ops->queue_depth_changed() 2019-08-28 21:17:07 -06:00
blk-rq-qos.h blkcg: implement blk-iocost 2019-08-28 21:17:12 -06:00
blk-settings.c dma-mapping updates for 5.4: 2019-09-19 13:27:23 -07:00
blk-softirq.c
blk-stat.c
blk-stat.h
blk-sysfs.c rq-qos: get rid of redundant wbt_update_limits() 2019-09-27 01:13:10 -06:00
blk-throttle.c block: make rq sector size accessible for block stats 2019-09-15 16:02:08 -06:00
blk-timeout.c
blk-wbt.c block/rq_qos: implement rq_qos_ops->queue_depth_changed() 2019-08-28 21:17:07 -06:00
blk-wbt.h block/rq_qos: implement rq_qos_ops->queue_depth_changed() 2019-08-28 21:17:07 -06:00
blk-zoned.c
blk.h block: fix null pointer dereference in blk_mq_rq_timed_out() 2019-09-27 07:01:25 -06:00
bounce.c
bsg-lib.c block: drop device references in bsg_queue_rq() 2019-09-23 11:17:24 -06:00
bsg.c
cmdline-parser.c
compat_ioctl.c
elevator.c block: don't release queue's sysfs lock during switching elevator 2019-09-26 00:45:51 -06:00
genhd.c block: Delay default elevator initialization 2019-09-05 19:52:34 -06:00
ioctl.c
ioprio.c
Kconfig blkcg: implement blk-iocost 2019-08-28 21:17:12 -06:00
Kconfig.iosched
kyber-iosched.c
Makefile blkcg: implement blk-iocost 2019-08-28 21:17:12 -06:00
mq-deadline.c block: Introduce elevator features 2019-09-05 19:52:33 -06:00
opal_proto.h block: sed-opal: Removed duplicate OPAL_METHOD_LENGTH definition 2019-08-20 09:34:49 -06:00
partition-generic.c
scsi_ioctl.c
sed-opal.c
t10-pi.c block: t10-pi: fix -Wswitch warning 2019-09-23 08:05:19 -06:00