linux/include
Ilya Dryomov 67645d7619 libceph: fix ceph_msg_revoke()
There are a number of problems with revoking a "was sending" message:

(1) We never make any attempt to revoke data - only kvecs contibute to
con->out_skip.  However, once the header (envelope) is written to the
socket, our peer learns data_len and sets itself to expect at least
data_len bytes to follow front or front+middle.  If ceph_msg_revoke()
is called while the messenger is sending message's data portion,
anything we send after that call is counted by the OSD towards the now
revoked message's data portion.  The effects vary, the most common one
is the eventual hang - higher layers get stuck waiting for the reply to
the message that was sent out after ceph_msg_revoke() returned and
treated by the OSD as a bunch of data bytes.  This is what Matt ran
into.

(2) Flat out zeroing con->out_kvec_bytes worth of bytes to handle kvecs
is wrong.  If ceph_msg_revoke() is called before the tag is sent out or
while the messenger is sending the header, we will get a connection
reset, either due to a bad tag (0 is not a valid tag) or a bad header
CRC, which kind of defeats the purpose of revoke.  Currently the kernel
client refuses to work with header CRCs disabled, but that will likely
change in the future, making this even worse.

(3) con->out_skip is not reset on connection reset, leading to one or
more spurious connection resets if we happen to get a real one between
con->out_skip is set in ceph_msg_revoke() and before it's cleared in
write_partial_skip().

Fixing (1) and (3) is trivial.  The idea behind fixing (2) is to never
zero the tag or the header, i.e. send out tag+header regardless of when
ceph_msg_revoke() is called.  That way the header is always correct, no
unnecessary resets are induced and revoke stands ready for disabled
CRCs.  Since ceph_msg_revoke() rips out con->out_msg, introduce a new
"message out temp" and copy the header into it before sending.

Cc: stable@vger.kernel.org # 4.0+
Reported-by: Matt Conner <matt.conner@keepertech.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Matt Conner <matt.conner@keepertech.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21 19:36:08 +01:00
..
acpi Merge branch 'acpi-pci' 2015-11-07 01:30:10 +01:00
asm-generic treewide: Remove old email address 2015-11-23 09:44:58 +01:00
clocksource
crypto Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2015-11-05 15:32:38 -08:00
drm drm/nouveau: Fix pre-nv50 pageflip events (v4) 2015-12-04 13:49:38 +10:00
dt-bindings ARM: DT updates for v4.4 2015-11-10 15:06:26 -08:00
keys
kvm KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active 2015-11-24 18:07:40 +01:00
linux libceph: fix ceph_msg_revoke() 2016-01-21 19:36:08 +01:00
math-emu
media
memory
misc
net net: Propagate lookup failure in l3mdev_get_saddr to caller 2016-01-04 22:58:30 -05:00
pcmcia
ras
rdma IB/mad: Require CM send method for everything except ClassPortInfo 2015-12-08 12:19:11 -05:00
rxrpc
scsi Merge branch 'mkp-fixes' into fixes 2015-12-03 09:32:33 -08:00
soc ARM: SoC driver updates for v4.4 2015-11-10 15:00:03 -08:00
sound Merge remote-tracking branch 'asoc/fix/dapm' into asoc-linus 2016-01-05 23:07:32 +00:00
target target: Fix race for SCF_COMPARE_AND_WRITE_POST checking 2015-11-28 19:33:15 -08:00
trace Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2015-11-11 09:03:01 -08:00
uapi Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-17 14:05:22 -08:00
video gpu: ipu-v3: drop unused dmfc field from client platform data 2015-11-24 11:30:15 +01:00
xen xen: Add RING_COPY_REQUEST() 2015-12-18 10:00:17 -05:00
Kbuild