linux/net
Vlad Buslov 4ce70b4aed net: sched: sch_htb: don't call qdisc_put() while holding tree lock
Recent changes that removed rtnl dependency from rules update path of tc
also made tcf_block_put() function sleeping. This function is called from
ops->destroy() of several Qdisc implementations, which in turn is called by
qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
which results sleeping-while-atomic BUG.

Steps to reproduce for htb:

tc qdisc add dev ens1f0 root handle 1: htb default 12
tc class add dev ens1f0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps
tc qdisc add dev ens1f0 parent 1:1 handle 40: sfq perturb 10
tc class add dev ens1f0 parent 1:1 classid 1:2 htb rate 100kbps ceil 100kbps

Resulting dmesg:

[ 4791.148551] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
[ 4791.151354] in_atomic(): 1, irqs_disabled(): 0, pid: 27273, name: tc
[ 4791.152805] INFO: lockdep is turned off.
[ 4791.153605] CPU: 19 PID: 27273 Comm: tc Tainted: G        W         5.3.0-rc8+ #721
[ 4791.154336] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 4791.155075] Call Trace:
[ 4791.155803]  dump_stack+0x85/0xc0
[ 4791.156529]  ___might_sleep.cold+0xac/0xbc
[ 4791.157251]  __mutex_lock+0x5b/0x960
[ 4791.157966]  ? console_unlock+0x363/0x5d0
[ 4791.158676]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 4791.159395]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 4791.160103]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 4791.160815]  tcf_block_put_ext.part.0+0x21/0x50
[ 4791.161530]  tcf_block_put+0x50/0x70
[ 4791.162233]  sfq_destroy+0x15/0x50 [sch_sfq]
[ 4791.162936]  qdisc_destroy+0x5f/0x160
[ 4791.163642]  htb_change_class.cold+0x5df/0x69d [sch_htb]
[ 4791.164505]  tc_ctl_tclass+0x19d/0x480
[ 4791.165360]  rtnetlink_rcv_msg+0x170/0x4b0
[ 4791.166191]  ? netlink_deliver_tap+0x95/0x400
[ 4791.166907]  ? rtnl_dellink+0x2d0/0x2d0
[ 4791.167625]  netlink_rcv_skb+0x49/0x110
[ 4791.168345]  netlink_unicast+0x171/0x200
[ 4791.169058]  netlink_sendmsg+0x224/0x3f0
[ 4791.169771]  sock_sendmsg+0x5e/0x60
[ 4791.170475]  ___sys_sendmsg+0x2ae/0x330
[ 4791.171183]  ? ___sys_recvmsg+0x159/0x1f0
[ 4791.171894]  ? do_wp_page+0x9c/0x790
[ 4791.172595]  ? __handle_mm_fault+0xcd3/0x19e0
[ 4791.173309]  __sys_sendmsg+0x59/0xa0
[ 4791.174024]  do_syscall_64+0x5c/0xb0
[ 4791.174725]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 4791.175435] RIP: 0033:0x7f0aa41497b8
[ 4791.176129] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 5
4
[ 4791.177532] RSP: 002b:00007fff4e37d588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 4791.178243] RAX: ffffffffffffffda RBX: 000000005d8132f7 RCX: 00007f0aa41497b8
[ 4791.178947] RDX: 0000000000000000 RSI: 00007fff4e37d5f0 RDI: 0000000000000003
[ 4791.179662] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000020149a0
[ 4791.180382] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
[ 4791.181100] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000

In htb_change_class() function save parent->leaf.q to local temporary
variable and put reference to it after sch tree lock is released in order
not to call potentially sleeping cls API in atomic section. This is safe to
do because Qdisc has already been reset by qdisc_purge_queue() inside sch
tree lock critical section.

Fixes: c266f64dbf ("net: sched: protect block state with mutex")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-27 12:13:55 +02:00
..
6lowpan 6lowpan: no need to check return value of debugfs_create functions 2019-07-06 12:50:01 +02:00
9p 9p pull request for inclusion in 5.13 2019-07-12 17:31:19 -07:00
802 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-22 08:59:24 -04:00
appletalk appletalk: enforce CAP_NET_RAW for raw sockets 2019-09-24 16:37:18 +02:00
atm pppoatm: use %*ph to print small buffer 2019-09-05 12:33:28 +02:00
ax25 ax25: enforce CAP_NET_RAW for raw sockets 2019-09-24 16:37:18 +02:00
batman-adv net: Fix Kconfig indentation 2019-09-26 08:56:17 +02:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2019-09-18 12:34:53 -07:00
bpf bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN 2019-07-25 18:00:41 -07:00
bpfilter Kbuild updates for v5.3 2019-07-12 16:03:16 -07:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-15 14:17:27 +02:00
caif treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 194 2019-05-30 11:29:22 -07:00
can can: add support of SAE J1939 protocol 2019-09-04 14:22:33 +02:00
ceph libceph: don't call crypto_free_sync_skcipher() on a NULL tfm 2019-08-28 12:33:46 +02:00
core net: print proper warning on dst underflow 2019-09-26 09:05:56 +02:00
dcb treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
dccp ipv6: add priority parameter to ip6_xmit() 2019-09-27 12:05:02 +02:00
decnet
dns_resolver Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
dsa Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2019-09-17 23:51:10 +02:00
ethernet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-07 11:00:14 -07:00
hsr hsr: switch ->dellink() to ->ndo_uninit() 2019-07-11 14:37:45 -07:00
ieee802154 ieee802154: enforce CAP_NET_RAW for raw sockets 2019-09-24 16:37:18 +02:00
ife net: Fix Kconfig indentation 2019-09-26 08:56:17 +02:00
ipv4 tcp: honor SO_PRIORITY in TIME_WAIT state 2019-09-27 12:05:02 +02:00
ipv6 tcp: honor SO_PRIORITY in TIME_WAIT state 2019-09-27 12:05:02 +02:00
iucv net/af_iucv: mark expected switch fall-throughs 2019-07-29 10:26:14 -07:00
kcm kcm: disable preemption in kcm_parse_func_strparser() 2019-09-27 10:27:14 +02:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-07-08 19:48:57 -07:00
l2tp compat_ioctl: pppoe: fix PPPOEIOCSFWD handling 2019-07-30 14:42:13 -07:00
l3mdev ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF 2019-06-23 13:24:17 -07:00
lapb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 20:20:36 -07:00
llc treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 281 2019-06-05 17:36:36 +02:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-15 14:17:27 +02:00
mac802154 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
mpls ipv4: mpls: fix mpls_xmit for iptunnel 2019-08-25 14:34:08 -07:00
ncsi net/ncsi: Disable global multicast filter 2019-09-19 18:04:40 -07:00
netfilter net: Fix Kconfig indentation 2019-09-26 08:56:17 +02:00
netlabel netlabel: remove redundant assignment to pointer iter 2019-09-01 11:45:02 -07:00
netlink net: remove empty netlink_tap_exit_net 2019-06-14 19:50:33 -07:00
netrom netrom: hold sock when setting skb->destructor 2019-07-24 15:49:05 -07:00
nfc nfc: enforce CAP_NET_RAW for raw sockets 2019-09-24 16:37:18 +02:00
nsh treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
openvswitch openvswitch: change type of UPCALL_PID attribute to NLA_UNSPEC 2019-09-26 09:32:33 +02:00
packet net/packet: fix race in tpacket_snd() 2019-08-15 13:59:48 -07:00
phonet treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 336 2019-06-05 17:37:07 +02:00
psample net: sched: take reference to psample group in flow_action infra 2019-09-16 09:18:03 +02:00
qrtr net: qrtr: Stop rx_worker before freeing node 2019-09-21 18:45:46 -07:00
rds net/rds: Check laddr_check before calling it 2019-09-27 12:10:55 +02:00
rfkill treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
rose treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-15 14:17:27 +02:00
sched net: sched: sch_htb: don't call qdisc_put() while holding tree lock 2019-09-27 12:13:55 +02:00
sctp ipv6: add priority parameter to ip6_xmit() 2019-09-27 12:05:02 +02:00
smc net/smc: make sure EPOLLOUT is raised 2019-08-20 12:25:14 -07:00
strparser Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-22 08:59:24 -04:00
sunrpc Merge branch 'work.mount-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-18 13:15:58 -07:00
switchdev treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-15 14:17:27 +02:00
tls net/tls: align non temporal copy to cache lines 2019-09-07 18:10:34 +02:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-07 11:00:14 -07:00
vmw_vsock vsock/virtio: a better comment on credit update 2019-09-05 09:53:01 +02:00
wimax wimax: no need to check return value of debugfs_create functions 2019-08-10 15:25:47 -07:00
wireless We have a number of changes, but things are settling down: 2019-09-11 14:57:17 +01:00
x25
xdp Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2019-09-06 16:49:17 +02:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-15 14:17:27 +02:00
compat.c uio: make import_iovec()/compat_import_iovec() return bytes on success 2019-05-31 15:30:03 -06:00
Kconfig devlink: Add packet trap infrastructure 2019-08-17 12:40:08 -07:00
Makefile
socket.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
sysctl_net.c