6223 Commits

Author SHA1 Message Date
Finn Thain
d4d179c37c block/amiflop: Don't log error message on invalid ioctl
Cc: linux-m68k@lists.linux-m68k.org
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-31 10:19:11 -07:00
Chengguang Xu
93f87a74fd block: sunvdc: remove redundant code
Code cleanup for removing redundant break in switch case.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-22 14:45:33 -07:00
Chengguang Xu
c41103691b block: loop: remove redundant code
Code cleanup for removing redundant break in switch case.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-22 14:45:31 -07:00
Nathan Chancellor
5816a0932b drbd: Change drbd_request_detach_interruptible's return type to int
Clang warns when an implicit conversion is done between enumerated
types:

drivers/block/drbd/drbd_state.c:708:8: warning: implicit conversion from
enumeration type 'enum drbd_ret_code' to different enumeration type
'enum drbd_state_rv' [-Wenum-conversion]
                rv = ERR_INTR;
                   ~ ^~~~~~~~

drbd_request_detach_interruptible's only call site is in the return
statement of adm_detach, which returns an int. Change the return type of
drbd_request_detach_interruptible to match, silencing Clang's warning.
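
A minimal userspace sketch of the pattern described above, assuming two
stand-in enums with placeholder values (the real ones live in the DRBD
headers and have many more members). The function name and its argument
are hypothetical; the point is only that an int return type can carry a
value from either enum without an implicit enum-to-enum conversion.

```c
/* Placeholder enums; member values here are illustrative only. */
enum drbd_ret_code { NO_ERROR = 101, ERR_INTR = 129 };
enum drbd_state_rv { SS_UNKNOWN_ERROR = 0, SS_SUCCESS = 1 };

/*
 * Hypothetical stand-in for a function whose only caller returns int.
 * Declaring the return type as int means neither return statement
 * converts one enumeration type into the other.
 */
static int request_detach_interruptible(int interrupted)
{
	if (interrupted)
		return ERR_INTR;	/* value from enum drbd_ret_code */
	return SS_SUCCESS;		/* value from enum drbd_state_rv */
}
```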

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:31 -07:00
Lars Ellenberg
f31e583aa2 drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")
And also re-enable partial-zero-out + discard aligned.

With the introduction of REQ_OP_WRITE_ZEROES,
we started to use that for both WRITE_ZEROES and DISCARDS,
hoping that WRITE_ZEROES would "do what we want",
UNMAP if possible, zero-out the rest.

The example scenario is some LVM "thin" backend.

While an un-allocated block on dm-thin reads as zeroes, on a dm-thin
with "skip_block_zeroing=true", after a partial block write allocated
that block, that same block may well map "undefined old garbage" from
the backends on LBAs that have not yet been written to.

If we cannot distinguish between zero-out and discard on the receiving
side, to keep "undefined old garbage" from popping up randomly at later times
on supposedly zero-initialized blocks, we'd need to map all discards to
zero-out on the receiving side.  But that would potentially do a full
alloc on thinly provisioned backends, even when the expectation was to
unmap/trim/discard/de-allocate.

We need to distinguish on the protocol level, whether we need to guarantee
zeroes (and thus use zero-out, potentially doing the mentioned full-alloc),
or if we want to put the emphasis on discard, and only do a "best effort
zeroing" (by "discarding" blocks aligned to discard-granularity, and zeroing
only potential unaligned head and tail clippings to at least *try* to
avoid "false positives" in an online-verify later), hoping that someone
set skip_block_zeroing=false.
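
The "best effort zeroing" split described above can be sketched roughly as
follows. This is a userspace illustration, not the DRBD code: the struct
and function names are hypothetical, sizes are in sectors, and the sketch
assumes that a request containing no whole granule is zeroed entirely.

```c
#include <assert.h>
#include <stdint.h>

struct range { uint64_t start; uint64_t len; };	/* in sectors */

struct split {
	struct range head;	/* unaligned head: zero out */
	struct range discard;	/* aligned middle: discard */
	struct range tail;	/* unaligned tail: zero out */
};

/* Split [start, start + len) at discard-granularity boundaries. */
static struct split split_for_discard(uint64_t start, uint64_t len,
				      uint64_t granularity)
{
	uint64_t end = start + len;
	struct split s = { { start, 0 }, { end, 0 }, { end, 0 } };
	uint64_t first, last;

	assert(granularity > 0);
	first = (start + granularity - 1) / granularity * granularity;
	last = end / granularity * granularity;

	if (first >= last) {
		/* No whole granule inside the request: zero all of it. */
		s.head.len = len;
		return s;
	}
	s.head.len = first - start;
	s.discard.start = first;
	s.discard.len = last - first;
	s.tail.start = last;
	s.tail.len = end - last;
	return s;
}
```

For example, with a granularity of 8 sectors, a request for sectors 5..24
would zero sectors 5..7, discard 8..23, and zero sector 24.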

For some discussion regarding this on dm-devel, see also
https://www.mail-archive.com/dm-devel%40redhat.com/msg07965.html
https://www.redhat.com/archives/dm-devel/2018-January/msg00271.html

For backward compatibility, P_TRIM means zero-out, unless the
DRBD_FF_WZEROES feature flag is agreed upon during handshake.
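
A small sketch of the compatibility rule, under the assumption that the
agreed feature set is the intersection of what both peers advertise. The
FF_* bit values, the enum, and both helper names are illustrative and not
taken from the DRBD sources.

```c
#include <stdint.h>

#define FF_TRIM		(1u << 0)	/* illustrative bit values */
#define FF_WZEROES	(1u << 3)

enum trim_action { TRIM_ZERO_OUT, TRIM_DISCARD };

/* Assumed negotiation: only features both sides advertise are agreed. */
static uint32_t agreed_features(uint32_t mine, uint32_t peer)
{
	return mine & peer;
}

/* Without the WZEROES feature, P_TRIM keeps its old zero-out meaning. */
static enum trim_action p_trim_action(uint32_t agreed)
{
	return (agreed & FF_WZEROES) ? TRIM_DISCARD : TRIM_ZERO_OUT;
}
```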

To have upper layers even try to submit WRITE ZEROES requests,
we need to announce "efficient zeroout" independently.

We need to fixup max_write_zeroes_sectors after blk_queue_stack_limits():
if we can handle "zeroes" efficiently on the protocol,
we want to do that, even if our backend does not announce
max_write_zeroes_sectors itself.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:31 -07:00
Lars Ellenberg
9848b6ddd8 drbd: skip spurious timeout (ping-timeo) when failing promote
If you try to promote a Secondary while connected to a Primary
and allow-two-primaries is NOT set, we will wait for "ping-timeout"
to give this node a chance to detect a dead primary,
in case the cluster manager noticed faster than we did.

But if we then are *still* connected to a Primary,
we fail (after an additional timeout of ping-timeout).

This change skips the spurious second timeout.

Most people won't notice really,
since "ping-timeout" by default is half a second.

But in some installations, ping-timeout may be 10 or 20 seconds or more,
and spuriously delaying the error return becomes annoying.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:31 -07:00
Lars Ellenberg
9049ccd46f drbd: don't retry connection if peers do not agree on "authentication" settings
emma: "Unexpected data packet AuthChallenge (0x0010)"
 ava: "expected AuthChallenge packet, received: ReportProtocol (0x000b)"
      "Authentication of peer failed, trying again."

Pattern repeats.

There is no point in retrying the handshake,
if we expect to receive an AuthChallenge,
but the peer is not even configured to expect or use a shared secret.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:31 -07:00
Luc Van Oostenryck
2c38f03511 drbd: fix print_st_err()'s prototype to match the definition
print_st_err() is defined with its 4th argument taking an
'enum drbd_state_rv' but its prototype uses an int for it.

Fix this by using 'enum drbd_state_rv' in the prototype too.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:30 -07:00
Lars Ellenberg
be80ff8835 drbd: avoid spurious self-outdating with concurrent disconnect / down
If peers are "simultaneously" told to disconnect from each other,
either explicitly, or implicitly by taking down the resource,
with bad timing, one side may see its disconnect "fail" with
a result of "state change failed by peer", and interpret this as
"please outdate yourself".

Try to catch this by checking for current connection status,
and possibly retry as local-only state change instead.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:30 -07:00
Lars Ellenberg
f708bd08ec drbd: do not block when adjusting "disk-options" while IO is frozen
"suspending" IO is overloaded.
It can mean "do not allow new requests" (obviously),
but it also may mean "must not complete pending IO",
for example while the fencing handlers do their arbitration.

When adjusting disk options, we suspend io (disallow new requests), then
wait for the activity-log to become unused (drain all IO completions),
and possibly replace it with a new activity log of different size.

If the other "suspend IO" aspect is active, pending IO completions won't
happen, and we would block forever (unkillable drbdsetup process).

Fix this by skipping the activity log adjustment if the "al-extents"
setting did not change. Also, in case it did change, fail early without
blocking if it looks like we would block forever.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:30 -07:00
Lars Ellenberg
a2823ea920 drbd: fix comment typos
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:30 -07:00
Lars Ellenberg
fe43ed97bb drbd: reject attach of unsuitable uuids even if connected
Multiple failure scenario:
a) all good
   Connected Primary/Secondary UpToDate/UpToDate
b) lose disk on Primary,
   Connected Primary/Secondary Diskless/UpToDate
c) continue to write to the device,
   changes only make it to the Secondary storage.
d) lose disk on Secondary,
   Connected Primary/Secondary Diskless/Diskless
e) now try to re-attach on Primary

This would have succeeded before, even though that is clearly the
wrong data set to attach to (missing the modifications from c).
Because we only compared our "effective" and the "to-be-attached"
data generation uuid tags if (device->state.conn < C_CONNECTED).

Fix: change that constraint to (device->state.pdsk != D_UP_TO_DATE)
compare the uuids, and reject the attach.

This patch also tries to improve the reverse scenario:
first lose Secondary, then Primary disk,
then try to attach the disk on Secondary.

Before this patch, the attach on the Secondary succeeds, but since commit
drbd: disconnect, if the wrong UUIDs are attached on a connected peer
the Primary will notice unsuitable data, and drop the connection hard.

Though unfortunately at a point in time during the handshake where
we cannot easily abort the attach on the peer without more
refactoring of the handshake.

We now reject any attach to "unsuitable" uuids,
as long as we can see a Primary role,
unless we already have access to "good" data.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:30 -07:00
Lars Ellenberg
ad6e897902 drbd: attach on connected diskless peer must not shrink a consistent device
If we would reject a new handshake, if the peer had attached first,
and then connected, we should force disconnect if the peer first connects,
and only then attaches.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:30 -07:00
Lars Ellenberg
4ef2a4f43f drbd: fix confusing error message during attach
If we attach a (consistent) backing device,
which knows about a last-agreed effective size,
and that effective size is *larger* than the currently requested size,
we refused to attach with ERR_DISK_TOO_SMALL
  Failure: (111) Low.dev. smaller than requested DRBD-dev. size.
which is confusing to say the least.

This patch changes the error code in that case to ERR_IMPLICIT_SHRINK
  Failure: (170) Implicit device shrinking not allowed. See kernel log.
  additional info from kernel:
  To-be-attached device has last effective > current size, and is consistent
  (9999 > 7777 sectors). Refusing to attach.

It also allows attaching with an explicit size.
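
The decision can be sketched like this. Only the two error codes (111 and
170) come from the message above; the function, its parameters, and the
success value are hypothetical, and the real attach path checks a good
deal more.

```c
#include <stdbool.h>
#include <stdint.h>

enum attach_rv {
	ATTACH_OK = 0,			/* illustrative success value */
	ERR_DISK_TOO_SMALL = 111,
	ERR_IMPLICIT_SHRINK = 170,
};

/*
 * Sizes are in sectors. A consistent backing device whose last-agreed
 * effective size exceeds the current size is refused, unless the user
 * asked for a size explicitly.
 */
static enum attach_rv check_attach_size(uint64_t last_effective,
					uint64_t current_size,
					bool consistent, bool explicit_size)
{
	if (consistent && !explicit_size && last_effective > current_size)
		return ERR_IMPLICIT_SHRINK;
	return ATTACH_OK;
}
```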

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:30 -07:00
Lars Ellenberg
b17b59602b drbd: disconnect, if the wrong UUIDs are attached on a connected peer
With "on-no-data-accessible suspend-io", DRBD requires the next attach
or connect to be to the very same data generation uuid tag it lost last.

If we first lost connection to the peer,
then later lost connection to our own disk,
we would usually refuse to re-connect to the peer,
because it presents the wrong data set.

However, if the peer first connects without a disk,
and then attached its disk, we accepted that same wrong data set,
which would be "unexpected" by any user of that DRBD
and cause "undefined results" (read: very likely data corruption).

The fix is to forcefully disconnect as soon as we notice that the peer
attached to the "wrong" dataset.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:30 -07:00
Lars Ellenberg
94c43a13b8 drbd: ignore "all zero" peer volume sizes in handshake
During handshake, if we are diskless ourselves, we used to accept any size
presented by the peer.

Which could be zero if that peer was just brought up and connected
to us without having a disk attached first, in which case both
peers would just "flip" their volume sizes.

Now, even a diskless node will ignore "zero" sizes
presented by a diskless peer.

Also a currently Diskless Primary will refuse to shrink during handshake:
it may be frozen, and waiting for a "suitable" local disk or peer to
re-appear (on-no-data-accessible suspend-io). If the peer is smaller
than what we used to be, it is not suitable.

The logic for a diskless node during handshake is now supposed to be:
believe the peer, if
 - I don't have a current size myself
 - we agree on the size anyways
 - I do have a current size, am Secondary, and he has the only disk
 - I do have a current size, am Primary, and he has the only disk,
   which is larger than my current size
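
The list above can be written down as a predicate. This is a sketch with
hypothetical names, assuming sizes in sectors, that 0 means "no size
known", and that an all-zero peer size is always ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Should a diskless node adopt the size presented by its peer? */
static bool diskless_believes_peer(uint64_t my_size, uint64_t peer_size,
				   bool i_am_primary, bool peer_has_disk)
{
	if (peer_size == 0)
		return false;		/* ignore "all zero" sizes */
	if (my_size == 0)
		return true;		/* no current size myself */
	if (my_size == peer_size)
		return true;		/* we agree anyway */
	if (!peer_has_disk)
		return false;
	if (!i_am_primary)
		return true;		/* Secondary, peer has the only disk */
	return peer_size > my_size;	/* Primary must not shrink */
}
```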

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:29 -07:00
Lars Ellenberg
d5412e8d8e drbd: centralize printk reporting of new size into drbd_set_my_capacity()
Previously, some implicit resizes that happened during handshake
have not been reported as prominently as explicit resize.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:29 -07:00
Lars Ellenberg
792c3fdd94 drbd: must not use connection after kref_put(&connection->kref)
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:29 -07:00
Roland Kammerer
d29e89e349 drbd: narrow rcu_read_lock in drbd_sync_handshake
So far there was the possibility that we called
genlmsg_new(GFP_NOIO)/mutex_lock() while holding an rcu_read_lock().

This included cases like:

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      drbd_bcast_event
        genlmsg_new(GFP_NOIO) --> may sleep

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      notify_helper
        genlmsg_new(GFP_NOIO) --> may sleep

drbd_sync_handshake (acquire the RCU lock)
  drbd_asb_recover_1p
    drbd_khelper
      notify_helper
        mutex_lock --> may sleep

While using GFP_ATOMIC would have been possible in the first two cases,
the real fix is to narrow the rcu_read_lock.
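
What "narrowing" means can be modelled in userspace. Everything below is a
stand-in: the model_* helpers only count nesting and violations, the
after_sb_1p variable plays the role of RCU-protected configuration, and the
assumption that the fix copies the needed value before unlocking is ours,
not something this message states.

```c
static int rcu_depth;		/* > 0 while inside a read-side section */
static int violations;		/* sleeping calls made inside one */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }
static void model_might_sleep(void)     { if (rcu_depth > 0) violations++; }

static int after_sb_1p = 2;	/* stands in for RCU-protected config */

/* Stands in for a helper that may allocate or take a mutex. */
static void run_helper(int policy) { (void)policy; model_might_sleep(); }

/* Wide section: the helper runs while the read lock is held. */
static int handshake_wide(void)
{
	violations = 0;
	model_rcu_read_lock();
	run_helper(after_sb_1p);
	model_rcu_read_unlock();
	return violations;
}

/* Narrow section: copy what is needed, unlock, then call the helper. */
static int handshake_narrow(void)
{
	int policy;

	violations = 0;
	model_rcu_read_lock();
	policy = after_sb_1p;
	model_rcu_read_unlock();
	run_helper(policy);
	return violations;
}
```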

Reported-by: Jia-Ju Bai <baijiaju1990@163.com>
Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20 09:51:29 -07:00
Chengguang Xu
38a3499f6d block: loop: check error using IS_ERR instead of IS_ERR_OR_NULL in loop_add()
blk_mq_init_queue() will not return NULL pointer to its caller,
so it's better to replace IS_ERR_OR_NULL with IS_ERR in loop_add().

If in the future things change to check NULL pointer inside loop_add(),
we should return -ENOMEM as return code instead of PTR_ERR(NULL).
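
To see why PTR_ERR(NULL) is the wrong return code, here is a userspace
re-implementation of the error-pointer helpers. The helper names mirror the
kernel's, but these are simplified stand-ins, and queue_error() is a
hypothetical caller written in the style of loop_add().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* PTR_ERR(NULL) is 0, i.e. "success", so NULL needs its own code. */
static int queue_error(const void *q)
{
	if (IS_ERR(q))
		return (int)PTR_ERR(q);
	if (!q)
		return -ENOMEM;
	return 0;
}
```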

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-16 09:01:38 -07:00
Chengguang Xu
e7cc005fef aoe: add __exit annotation
Add __exit annotation to cleanup helper which
is only called once in the module.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-16 09:01:38 -07:00
Jens Axboe
4ba09f69e2 mtip32xx: use BLK_STS_DEV_RESOURCE for device resources
For cases where we can only fail with IO in-flight, we should be using
BLK_STS_DEV_RESOURCE instead of BLK_STS_RESOURCE. The latter refers to
system wide resource constraints.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-10 14:45:19 -07:00
Arnd Bergmann
e4025e46f0 mtip32xx: avoid using semaphores
The "cmd_slot_unal" semaphore is never used in a blocking way
but only as an atomic counter. Change the code to use
atomic_dec_if_positive() as a better API.
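
A userspace model of the same idea using C11 atomics. dec_if_positive() is
a stand-in for the kernel helper (it returns the old value minus one
whether or not it decremented), and the slot counter with its two wrappers
is hypothetical.

```c
#include <stdatomic.h>

/* Decrement *v unless the result would be negative. */
static int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	for (;;) {
		int dec = old - 1;

		if (dec < 0)
			return dec;	/* nothing to take, leave *v alone */
		if (atomic_compare_exchange_weak(v, &old, dec))
			return dec;
		/* 'old' now holds the current value; try again */
	}
}

static atomic_int free_slots = 1;	/* hypothetical slot counter */

/* Non-blocking "down": returns 1 if a slot was taken, 0 otherwise. */
static int try_get_slot(void)
{
	return dec_if_positive(&free_slots) >= 0;
}

/* "up": give the slot back. */
static void put_slot(void)
{
	atomic_fetch_add(&free_slots, 1);
}
```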

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-10 14:44:56 -07:00
Dennis Zhou
db6638d7d1 blkcg: remove bio->bi_css and instead use bio->bi_blkg
Prior patches ensured that any bio that interacts with a request_queue
is properly associated with a blkg. This makes bio->bi_css unnecessary
as blkg maintains a reference to blkcg already.

This removes the bio field bi_css and transfers corresponding uses to
access via bi_blkg.

Signed-off-by: Dennis Zhou <dennis@kernel.org>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07 22:26:37 -07:00
Jens Axboe
80ff2040ac ataflop: implement mq_ops->commit_rqs() hook
We need this for blk-mq to kick things into gear, if we told it that
we had more IO coming, but then failed to deliver on that promise.
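
A toy model of why the hook is needed, assuming a driver that only
notifies its hardware when a request is flagged as the last of a batch.
The struct, the model_* functions, and the scenario helper are all
hypothetical and only mimic the shape of the blk-mq callbacks.

```c
#include <stdbool.h>

struct model_dev {
	int unnotified;		/* requests queued but not yet kicked */
	int kicks;		/* hardware notifications issued */
};

static struct model_dev dev;

static void kick(struct model_dev *d)
{
	if (d->unnotified) {
		d->unnotified = 0;
		d->kicks++;
	}
}

/* Queue one request; defer the kick while more requests are expected. */
static void model_queue_rq(struct model_dev *d, bool last)
{
	d->unnotified++;
	if (last)
		kick(d);
}

/* Called when the promised follow-up requests never arrived. */
static void model_commit_rqs(struct model_dev *d)
{
	kick(d);
}

/* Returns how many requests are left sitting without a kick. */
static int stalled_requests(bool have_commit_rqs)
{
	dev.unnotified = 0;
	dev.kicks = 0;
	model_queue_rq(&dev, false);	/* "more is coming" */
	/* ... but the next request could not be issued ... */
	if (have_commit_rqs)
		model_commit_rqs(&dev);
	return dev.unnotified;
}
```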

Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-29 10:12:27 -07:00
Jens Axboe
944e7c8796 virtio_blk: implement mq_ops->commit_rqs() hook
We need this for blk-mq to kick things into gear, if we told it that
we had more IO coming, but then failed to deliver on that promise.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-29 10:12:22 -07:00
Dan Carpenter
49379e6d1e ataflop: fix error handling in atari_floppy_init()
Smatch complains that there is an off by one if the allocation fails in:

	DMABuffer = atari_stram_alloc(BUFFER_SIZE+512, "ataflop");

In that situation, "i" would point to one element beyond the end of
the unit[] array.

There is a second bug because the error handling calls
blk_mq_free_tag_set(&unit[i].tag_set); regardless of whether
"disk->queue" is NULL or non-NULL.  So if blk_mq_init_sq_queue() fails,
then that means unit[i].tag_set->tags is NULL and it leads to an Oops.

It's easiest to call put_disk() before the goto to clean up the partial
iteration.  Then the earlier unit[] elements are fully allocated so we
can remove the checks whether "disk->queue" is NULL and the code is
simpler.
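
The unwinding pattern can be sketched with plain counters instead of real
allocations. NUNITS, live_allocs, and the fail_at knob are hypothetical;
each unit is assumed to hold two resources (a disk, then its queue).

```c
#define NUNITS 4

static int live_allocs;		/* resources currently held */

/* fail_at: index of the unit whose queue setup fails, or -1 for none. */
static int init_units(int fail_at)
{
	int i;

	live_allocs = 0;
	for (i = 0; i < NUNITS; i++) {
		live_allocs++;		/* disk for unit i */
		if (i == fail_at) {
			/* undo the partial iteration before the goto */
			live_allocs--;
			goto err;
		}
		live_allocs++;		/* queue for unit i */
	}
	return 0;

err:
	/* every unit below i is fully set up: no NULL checks needed */
	while (i--)
		live_allocs -= 2;
	return -1;
}
```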

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-29 08:16:06 -07:00
Young Xiao
a11f6ca9ae sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN
__vdc_tx_trigger should only loop on EAGAIN a finite
number of times.

See commit adddc32d6f ("sunvnet: Do not spin in an
infinite loop when vio_ldc_send() returns EAGAIN") for detail.
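
A bounded retry loop might look roughly like this. MAX_ATTEMPTS, the
callback type, and the two fake links are hypothetical; a real driver
would probably also delay between attempts, which this sketch omits.

```c
#include <errno.h>

#define MAX_ATTEMPTS 10		/* hypothetical bound */

typedef int (*send_fn)(void);

/* Retry only on -EAGAIN, and only a finite number of times. */
static int send_with_bounded_retry(send_fn send)
{
	int err = -EAGAIN;
	int attempt;

	for (attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
		err = send();
		if (err != -EAGAIN)
			break;
	}
	return err;
}

static int busy_calls, flaky_calls;

/* A link that is never ready. */
static int busy_link(void) { busy_calls++; return -EAGAIN; }

/* A link that becomes ready on the third attempt. */
static int flaky_link(void) { return ++flaky_calls < 3 ? -EAGAIN : 0; }
```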

Signed-off-by: Young Xiao <YangX92@hotmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-28 06:23:12 -07:00
Jens Axboe
a78b03bc73 Linux 4.20-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlvx2sAeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGycgIAIuxobwt0RRKa0zO
 ROS+34JGoC2yU2P9VdEGWdtxS6ANMVQgKPBhWL6s+xR89Kd+V4xSdJLD1pNTxxqP
 0DCva0np1/Q4juH+JbU50v/lykoLgteZ0P0LBRGf1y8p3WiLPv45IbnNsMDNYhB2
 7a8rOmZYakRY9CPznRDw3X8cJt3sddKgFJHIOGz1OQJVWtCD0KPGcJmQNsbDSagY
 Zx6Z5BKSIdjRqaAdN5gDa1Pft3WQo7TpaQGl80lSsgr5LcjmscXA3sClOCy+25Mo
 FZLx0PcwP+Efq8RTGzNK51WSOMa6d37hvjDqUAdQBOR0KbyjRyXQwyQVw/MGbPJs
 7J3Pzm0=
 =56Mt
 -----END PGP SIGNATURE-----

Merge tag 'v4.20-rc3' into for-4.21/block

Merge in -rc3 to resolve a few conflicts, but also to get a few
important fixes that have gone into mainline since the block
4.21 branch was forked off (most notably the SCSI queue issue,
which is both a conflict AND needed fix).

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-18 15:46:03 -07:00
Jens Axboe
fce15a609f floppy: remove now unused 'flags' variable
With the locking removed, it's unused. Kill it.

Fixes: 503f620f0c ("floppy: remove queue_lock around floppy_end_request")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-18 15:42:48 -07:00
Christoph Hellwig
a50f9aec1a pktcdvd: remove queue_lock around blk_queue_max_hw_sectors
blk_queue_max_hw_sectors can't do anything with queue_lock protection
so don't hold it.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-16 09:16:59 -07:00
Christoph Hellwig
503f620f0c floppy: remove queue_lock around floppy_end_request
There is nothing the queue_lock could protect inside floppy_end_request,
so remove it.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-16 09:16:57 -07:00
Linus Torvalds
59749c2d49 for-linus-20181115
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlvt05cQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpg3XD/44tsOBP9Hb0LQbsGuUo0GYyiqIHyhF8u0Q
 01qBsXhx4gw/5tl7Y64IiymGIn/3H7pTB9DYaTYTzbWdG2U7AvTygDGdoTAeWS5O
 vYtuIkS7U/MgWRpAH68sByhTnBbFRLcZ2GUvXV3tMnT6brpafMyFxvcQxKhuckOY
 ZUmlQmgs8Nkce53yNeXTa+66RjhOHKHFdrkP119nEljr9+vsfDjjiCb+vNp8xO6N
 Q/HWCvbl5L1L1QLstoRuDZH/zBENVwqmRPFoQbbYnBHS/zQL0L2asYGNRCSavn5G
 /OQvB2nOKI6K+QiugKkf7gsQvTK0MK896IYge6oW0O96yyDer6NNqAOOPVQiA5g7
 jQPtIjG0YXRW4QVZWwh67FuJwAZGJxaO9wvfRvpuUh8G23DFfo3NdFlKOHhbNTW8
 tzinN8NL3Ixq2eifdkp5FMLPFILBil/A4oxYG2BUMqTOc8BSfhq82Msgq9vyzu/J
 4ZrtpmyhB0js9vU3CpJETywq8DSMsVeAEgOSEMJspFrGWTUEPlH/slG9s2vyhBer
 QkG3LWHHQGyXEdIzSWRzMCqfLtsQY4lHvC66k7xlv8fpwVBSlFa2q07WIpeDtta3
 RdUKifpr0uNUnviOrpErVp1gfv5G/quiPTc5SmQWW3Q1AQQn4sjOUOOOfhuDi1Fi
 m+F2YrZ1Ag==
 =TSpO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20181115' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Discard loop fix, caused by integer overflow (Dave)

 - Blacklist of Samsung drive that hangs with power management (Diego)

 - Copy bio priority when cloning it (Hannes)

 - Fix race condition exposed in floppy (me)

 - Fix SCSI queue cleanup regression. While elusive, it caused oopses in
   queue running (Ming)

 - Fix bad string copy in kyber tracing (Omar)

* tag 'for-linus-20181115' of git://git.kernel.dk/linux-block:
  SCSI: fix queue cleanup race before queue initialization is done
  block: fix 32 bit overflow in __blkdev_issue_discard()
  libata: blacklist SAMSUNG MZ7TD256HAFV-000L9 SSD
  block: copy ioprio in __bio_clone_fast() and bounce
  kyber: fix wrong strlcpy() size in trace_kyber_latency()
  floppy: fix race condition in __floppy_read_block_0()
2018-11-16 09:31:59 -06:00
Christoph Hellwig
0d945c1f96 block: remove the queue_lock indirection
With the legacy request path gone there is no good reason to keep
queue_lock as a pointer, we can always use the embedded lock now.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Fixed floppy and blk-cgroup missing conversions and half done edits.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:17:28 -07:00
Christoph Hellwig
6d46964230 block: remove the lock argument to blk_alloc_queue_node
With the legacy request path gone there is no real need to override the
queue_lock.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:13:35 -07:00
Christoph Hellwig
68fc68f2ff umem: don't override the queue_lock
The umem card->lock and the block layer queue_lock are used for entirely
different resources.  Stop using card->lock as the block layer
queue_lock.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:13:31 -07:00
Christoph Hellwig
8295a69bdc drbd: don't override the queue_lock
The DRBD req_lock and block layer queue_lock are used for entirely
different resources.  Stop using the req_lock as the block layer
queue_lock.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:13:29 -07:00
Christoph Hellwig
39795d6534 block: don't hold the queue_lock over blk_abort_request
There is nothing it could synchronize against, so don't go through
the pains of acquiring the lock.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:13:18 -07:00
Tetsuo Handa
628bd85947 loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()
Commit 0a42e99b58 ("loop: Get rid of loop_index_mutex") forgot to
remove mutex_unlock(&loop_ctl_mutex) from loop_control_ioctl() when
replacing loop_index_mutex with loop_ctl_mutex.

Fixes: 0a42e99b58 ("loop: Get rid of loop_index_mutex")
Reported-by: syzbot <syzbot+c0138741c2290fc5e63f@syzkaller.appspotmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-12 08:44:06 -07:00
Jens Axboe
8e18ebef4d null_blk: remove unused nullb device
The compiler rightfully complains:

drivers/block/null_blk_main.c: In function ‘null_complete_rq’:
drivers/block/null_blk_main.c:647:16: warning: unused variable ‘nullb’ [-Wunused-variable]
  struct nullb *nullb = rq->q->queuedata;
                ^~~~~

Fixes: 49f6613632 ("nullb: remove leftover legacy request code")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10 13:03:52 -07:00
Jens Axboe
de7b75d82f floppy: fix race condition in __floppy_read_block_0()
LKP recently reported a hang at bootup in the floppy code:

[  245.678853] INFO: task mount:580 blocked for more than 120 seconds.
[  245.679906]       Tainted: G                T 4.19.0-rc6-00172-ga9f38e1 #1
[  245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.682181] mount           D 6372   580      1 0x00000004
[  245.683023] Call Trace:
[  245.683425]  __schedule+0x2df/0x570
[  245.683975]  schedule+0x2d/0x80
[  245.684476]  schedule_timeout+0x19d/0x330
[  245.685090]  ? wait_for_common+0xa5/0x170
[  245.685735]  wait_for_common+0xac/0x170
[  245.686339]  ? do_sched_yield+0x90/0x90
[  245.686935]  wait_for_completion+0x12/0x20
[  245.687571]  __floppy_read_block_0+0xfb/0x150
[  245.688244]  ? floppy_resume+0x40/0x40
[  245.688844]  floppy_revalidate+0x20f/0x240
[  245.689486]  check_disk_change+0x43/0x60
[  245.690087]  floppy_open+0x1ea/0x360
[  245.690653]  __blkdev_get+0xb4/0x4d0
[  245.691212]  ? blkdev_get+0x1db/0x370
[  245.691777]  blkdev_get+0x1f3/0x370
[  245.692351]  ? path_put+0x15/0x20
[  245.692871]  ? lookup_bdev+0x4b/0x90
[  245.693539]  blkdev_get_by_path+0x3d/0x80
[  245.694165]  mount_bdev+0x2a/0x190
[  245.694695]  squashfs_mount+0x10/0x20
[  245.695271]  ? squashfs_alloc_inode+0x30/0x30
[  245.695960]  mount_fs+0xf/0x90
[  245.696451]  vfs_kern_mount+0x43/0x130
[  245.697036]  do_mount+0x187/0xc40
[  245.697563]  ? memdup_user+0x28/0x50
[  245.698124]  ksys_mount+0x60/0xc0
[  245.698639]  sys_mount+0x19/0x20
[  245.699167]  do_int80_syscall_32+0x61/0x130
[  245.699813]  entry_INT80_32+0xc7/0xc7

showing that we never complete that read request. The reason is that
the completion setup is racy - it initializes the completion event
AFTER submitting the IO, which means that the IO could complete
before/during the init. If it does, we are passing garbage to
complete() and we may sleep forever waiting for the event to
occur.
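
The ordering problem can be shown with a single-threaded toy model. The
struct and helpers below only borrow the kernel names; submit_io() models
an I/O that finishes immediately, and the racy variant zero-initializes
the completion first so the sketch stays well defined (the real bug hands
uninitialized memory to complete()).

```c
#include <stdbool.h>

struct completion { unsigned int done; };

static void init_completion(struct completion *c) { c->done = 0; }
static void complete(struct completion *c) { c->done++; }

/* A waiter sleeps until 'done' becomes non-zero. */
static bool would_wait_forever(const struct completion *c)
{
	return c->done == 0;
}

/* Models an I/O that completes as soon as it is submitted. */
static void submit_io(struct completion *c) { complete(c); }

static bool racy_order(void)
{
	struct completion c = { 0 };

	submit_io(&c);		/* completion event arrives ... */
	init_completion(&c);	/* ... and the late init erases it */
	return would_wait_forever(&c);
}

static bool fixed_order(void)
{
	struct completion c;

	init_completion(&c);	/* set up before the I/O can complete */
	submit_io(&c);
	return would_wait_forever(&c);
}
```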

Fixes: 7b7b68bba5 ("floppy: bail out in open() if drive is not responding to block0 read")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10 08:16:12 -07:00
Christoph Hellwig
289d088b66 pd: replace ->special use with private data in the request
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10 08:03:50 -07:00
Christoph Hellwig
61e7712e25 aoe: replace ->special use with private data in the request
Makes the code a whole lot easier to read.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10 08:03:49 -07:00
Christoph Hellwig
1bee42438f skd_main: don't use req->special
Add a retries field to the internal request structure instead, which gets
set to zero on the first submission.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10 08:03:47 -07:00
Christoph Hellwig
49f6613632 nullb: remove leftover legacy request code
null_softirq_done_fn is only used for the blk-mq path, so remove the
other branch.  Also rename the function to better match the method name.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10 08:03:46 -07:00
Linus Torvalds
ab6e1f378f xen: fixes for 4.20-rc2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCW+bgfAAKCRCAXGG7T9hj
 vuvvAQDWkWKWrvi6D71g6JV37aDAgv5QlyTnk9HbWKSFtzv1mgEAotDbEMnRuDE/
 CKFo+1J1Lgc8qczbX36X6bXR5TEh9gw=
 =n7Iq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.20a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Several fixes, mostly for rather recent regressions when running under
  Xen"

* tag 'for-linus-4.20a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: remove size limit of privcmd-buf mapping interface
  xen: fix xen_qlock_wait()
  x86/xen: fix pv boot
  xen-blkfront: fix kernel panic with negotiate_mq error path
  xen/grant-table: Fix incorrect gnttab_dma_free_pages() pr_debug message
  CONFIG_XEN_PV breaks xen_create_contiguous_region on ARM
2018-11-10 08:58:48 -06:00
Christoph Hellwig
27d420bc47 mtip32xx: use for_each_sg
Use the proper helper instead of manually iterating the scatterlist,
which is broken in the presence of chained S/G lists.
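
A simplified userspace model of a chained list shows what the helper
takes care of. The struct layout is hypothetical and much simpler than the
kernel's scatterlist; sg_next() and for_each_sg() are stand-ins written to
follow the same shape as the kernel helpers.

```c
#include <stddef.h>

struct sg {
	unsigned int len;
	struct sg *chain;	/* non-NULL: link entry to the next array */
	int last;		/* marks the final data entry */
};

/* Step to the next data entry, following a chain link if present. */
static struct sg *sg_next(struct sg *sg)
{
	if (sg->last)
		return NULL;
	sg++;
	if (sg->chain)
		sg = sg->chain;
	return sg;
}

#define for_each_sg(list, sg, n, i) \
	for (i = 0, sg = (list); i < (n); i++, sg = sg_next(sg))

/* Four data entries spread over two chained arrays. */
static struct sg second[2] = { { 30, NULL, 0 }, { 40, NULL, 1 } };
static struct sg first[3] = { { 10, NULL, 0 }, { 20, NULL, 0 },
			      { 0, second, 0 } };

static unsigned int total_len(void)
{
	struct sg *sg;
	unsigned int sum = 0;
	int i;

	/* Plain first[i] indexing would treat the link entry as data. */
	for_each_sg(first, sg, 4, i)
		sum += sg->len;
	return sum;
}
```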

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09 08:39:21 -07:00
Christoph Hellwig
d85cb20453 mtip32xx: don't use req->special
Instead, add the icmd to struct mtip_cmd, where it can be unioned
with the scatterlist used for the normal I/O path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09 08:39:21 -07:00
Christoph Hellwig
55c7bc37e0 mtip32xx: remove mtip_get_int_command
Merging this function into the only callers makes the code flow easier.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09 08:39:21 -07:00
Christoph Hellwig
7bbf118f3b mtip32xx: remove mtip_init_cmd_header
There isn't much need for this helper - we can just calculate the offset
for the command header once late in the submission path and fill out
the ctba and ctbau fields there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09 08:39:21 -07:00