linux/net
Johan Hedberg 8ce783dc5e Bluetooth: Fix missing hdev locking for LE scan cleanup
The hci_conn objects don't have a dedicated lock themselves but rely
on the caller to hold the hci_dev lock for most types of access. The
hci_conn_timeout() function has so far sent certain HCI commands based
on the hci_conn state which has been possible without holding the
hci_dev lock.

The recent changes to do LE scanning before connect attempts added
even more operations to hci_conn and hci_dev from hci_conn_timeout,
thereby exposing potential race conditions with the hci_dev and
hci_conn states.

As an example of such a race, here there's a timeout but an
l2cap_sock_connect() call manages to race with the cleanup routine:

[Oct21 08:14] l2cap_chan_timeout: chan ee4b12c0 state BT_CONNECT
[  +0.000004] l2cap_chan_close: chan ee4b12c0 state BT_CONNECT
[  +0.000002] l2cap_chan_del: chan ee4b12c0, conn f3141580, err 111, state BT_CONNECT
[  +0.000002] l2cap_sock_teardown_cb: chan ee4b12c0 state BT_CONNECT
[  +0.000005] l2cap_chan_put: chan ee4b12c0 orig refcnt 4
[  +0.000010] hci_conn_drop: hcon f53d56e0 orig refcnt 1
[  +0.000013] l2cap_chan_put: chan ee4b12c0 orig refcnt 3
[  +0.000063] hci_conn_timeout: hcon f53d56e0 state BT_CONNECT
[  +0.000049] hci_conn_params_del: addr ee:0d:30:09:53:1f (type 1)
[  +0.000002] hci_chan_list_flush: hcon f53d56e0
[  +0.000001] hci_chan_del: hci0 hcon f53d56e0 chan f4e7ccc0
[  +0.004528] l2cap_sock_create: sock e708fc00
[  +0.000023] l2cap_chan_create: chan ee4b1770
[  +0.000001] l2cap_chan_hold: chan ee4b1770 orig refcnt 1
[  +0.000002] l2cap_sock_init: sk ee4b3390
[  +0.000029] l2cap_sock_bind: sk ee4b3390
[  +0.000010] l2cap_sock_setsockopt: sk ee4b3390
[  +0.000037] l2cap_sock_connect: sk ee4b3390
[  +0.000002] l2cap_chan_connect: 00:02:72:d9:e5:8b -> ee:0d:30:09:53:1f (type 2) psm 0x00
[  +0.000002] hci_get_route: 00:02:72:d9:e5:8b -> ee:0d:30:09:53:1f
[  +0.000001] hci_dev_hold: hci0 orig refcnt 8
[  +0.000003] hci_conn_hold: hcon f53d56e0 orig refcnt 0

Above the l2cap_chan_connect() shouldn't have been able to reach the
hci_conn f53d56e0 anymore but since hci_conn_timeout didn't do proper
locking that's not the case. The end result is a reference to hci_conn
that's not in the conn_hash list, resulting in list corruption when
trying to remove it later:

[Oct21 08:15] l2cap_chan_timeout: chan ee4b1770 state BT_CONNECT
[  +0.000004] l2cap_chan_close: chan ee4b1770 state BT_CONNECT
[  +0.000003] l2cap_chan_del: chan ee4b1770, conn f3141580, err 111, state BT_CONNECT
[  +0.000001] l2cap_sock_teardown_cb: chan ee4b1770 state BT_CONNECT
[  +0.000005] l2cap_chan_put: chan ee4b1770 orig refcnt 4
[  +0.000002] hci_conn_drop: hcon f53d56e0 orig refcnt 1
[  +0.000015] l2cap_chan_put: chan ee4b1770 orig refcnt 3
[  +0.000038] hci_conn_timeout: hcon f53d56e0 state BT_CONNECT
[  +0.000003] hci_chan_list_flush: hcon f53d56e0
[  +0.000002] hci_conn_hash_del: hci0 hcon f53d56e0
[  +0.000001] ------------[ cut here ]------------
[  +0.000461] WARNING: CPU: 0 PID: 1782 at lib/list_debug.c:56 __list_del_entry+0x3f/0x71()
[  +0.000839] list_del corruption, f53d56e0->prev is LIST_POISON2 (00000200)

The necessary fix is unfortunately more complicated than just adding
hci_dev_lock/unlock calls to the hci_conn_timeout() call path.
Particularly, the hci_conn_del() API, which expects the hci_dev lock to
be held, performs a cancel_delayed_work_sync(&hcon->disc_work) which
would lead to a deadlock if the hci_conn_timeout() call path tries to
acquire the same lock.

This patch solves the problem by deferring the cleanup work to a
separate work callback. To protect against the hci_dev or hci_conn
going away meanwhile temporary references are taken with the help of
hci_dev_hold() and hci_conn_get().

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org # 4.3
2015-10-21 14:25:34 +02:00
..
6lowpan 6lowpan: put mcast compression in an own function 2015-10-21 00:49:25 +02:00
9p net/9p: Remove ib_get_dma_mr calls 2015-08-30 18:12:36 -04:00
802
8021q net: 8021q: convert to using IFF_NO_QUEUE 2015-08-18 11:55:06 -07:00
appletalk
atm atm: deal with setting entry before mkip was called 2015-09-17 22:13:32 -07:00
ax25
batman-adv batman-adv: turn batadv_neigh_node_get() into local function 2015-08-27 20:15:34 +02:00
bluetooth Bluetooth: Fix missing hdev locking for LE scan cleanup 2015-10-21 14:25:34 +02:00
bridge Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-10-17 14:28:03 +02:00
caif net: caif: convert to using IFF_NO_QUEUE 2015-08-18 11:55:07 -07:00
can can: avoid using timeval for uapi 2015-10-13 17:42:34 +02:00
ceph rbd: use writefull op for object size writes 2015-10-16 16:49:01 +02:00
core Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
dcb net/dcb: make dcbnl.c explicitly non-modular 2015-10-09 07:52:27 -07:00
dccp tcp/dccp: add inet_csk_reqsk_queue_drop_and_put() helper 2015-10-16 00:52:18 -07:00
decnet Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-10-17 14:28:03 +02:00
dns_resolver
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
ethernet net: help compiler generate better code in eth_get_headlen 2015-09-28 22:51:15 -07:00
hsr net: hsr: convert to using IFF_NO_QUEUE 2015-08-18 11:55:07 -07:00
ieee802154 6lowpan: remove lowpan_is_addr_broadcast 2015-10-21 00:49:25 +02:00
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
ipx
irda
iucv s390/iucv: do not use arrays as argument 2015-09-21 16:03:04 -07:00
key net: Fix RCU splat in af_key 2015-08-24 14:48:10 -07:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-02 07:21:25 -07:00
l3mdev net: Add netif_is_l3_slave 2015-10-07 04:27:43 -07:00
lapb
llc tcp: fix recv with flags MSG_WAITALL | MSG_PEEK 2015-07-27 01:06:53 -07:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
mac802154 mac802154: llsec: use kzfree 2015-10-21 00:49:24 +02:00
mpls dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
netfilter Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-10-17 14:28:03 +02:00
netlabel
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
netrom
nfc nfc: netlink: Add capability to reply to vendor_cmd with data 2015-08-20 22:00:11 +02:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
packet ipv4: Pass struct net into ip_defrag and ip_check_defrag 2015-10-12 19:44:16 -07:00
phonet
rds RDS: fix rds-ping deadlock over TCP transport 2015-10-18 22:45:55 -07:00
rfkill rfkill: Copy "all" global state to other types 2015-09-04 14:26:56 +02:00
rose
rxrpc rxrpc: Replace get_seconds with ktime_get_seconds 2015-09-20 21:53:56 -07:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
sctp net: sctp: avoid incorrect time_t use 2015-10-05 03:16:48 -07:00
sunrpc Changes for 4.3-rc5 2015-10-15 13:44:35 -07:00
switchdev Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
unix net/unix: fix logic about sk_peek_offset 2015-10-05 06:33:09 -07:00
vmw_vsock
wimax net:wimax: Fix doucble word "the the" in networking.xml 2015-08-09 22:43:52 -07:00
wireless For the current cycle, we have the following right now: 2015-10-07 04:29:18 -07:00
x25
xfrm dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
compat.c
Kconfig net: Introduce L3 Master device abstraction 2015-09-29 20:40:32 -07:00
Makefile net: Introduce L3 Master device abstraction 2015-09-29 20:40:32 -07:00
socket.c
sysctl_net.c