linux/net/sched
Eric Dumazet 138470a9b2 net/sched: act_ct: fix lockdep splat in tcf_ct_flow_table_get
Convert zones_lock spinlock to zones_mutex mutex,
and struct (tcf_ct_flow_table)->ref to a refcount,
so that control path can use regular GFP_KERNEL allocations
from standard process context. This is more robust
in case of memory pressure.

The refcount is needed because tcf_ct_flow_table_put() can
be called from RCU callback, thus in BH context.

The issue was spotted by syzbot, as rhashtable_init()
was called with a spinlock held, which is bad since GFP_KERNEL
allocations can sleep.

Note to developers : Please make sure your patches are tested
with CONFIG_DEBUG_ATOMIC_SLEEP=y

BUG: sleeping function called from invalid context at mm/slab.h:565
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9582, name: syz-executor610
2 locks held by syz-executor610/9582:
 #0: ffffffff8a34eb80 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:72 [inline]
 #0: ffffffff8a34eb80 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x3f9/0xad0 net/core/rtnetlink.c:5437
 #1: ffffffff8a3961b8 (zones_lock){+...}, at: spin_lock_bh include/linux/spinlock.h:343 [inline]
 #1: ffffffff8a3961b8 (zones_lock){+...}, at: tcf_ct_flow_table_get+0xa3/0x1700 net/sched/act_ct.c:67
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 0 PID: 9582 Comm: syz-executor610 Not tainted 5.6.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x188/0x20d lib/dump_stack.c:118
 ___might_sleep.cold+0x1f4/0x23d kernel/sched/core.c:6798
 slab_pre_alloc_hook mm/slab.h:565 [inline]
 slab_alloc_node mm/slab.c:3227 [inline]
 kmem_cache_alloc_node_trace+0x272/0x790 mm/slab.c:3593
 __do_kmalloc_node mm/slab.c:3615 [inline]
 __kmalloc_node+0x38/0x60 mm/slab.c:3623
 kmalloc_node include/linux/slab.h:578 [inline]
 kvmalloc_node+0x61/0xf0 mm/util.c:574
 kvmalloc include/linux/mm.h:645 [inline]
 kvzalloc include/linux/mm.h:653 [inline]
 bucket_table_alloc+0x8b/0x480 lib/rhashtable.c:175
 rhashtable_init+0x3d2/0x750 lib/rhashtable.c:1054
 nf_flow_table_init+0x16d/0x310 net/netfilter/nf_flow_table_core.c:498
 tcf_ct_flow_table_get+0xe33/0x1700 net/sched/act_ct.c:82
 tcf_ct_init+0xba4/0x18a6 net/sched/act_ct.c:1050
 tcf_action_init_1+0x697/0xa20 net/sched/act_api.c:945
 tcf_action_init+0x1e9/0x2f0 net/sched/act_api.c:1001
 tcf_action_add+0xdb/0x370 net/sched/act_api.c:1411
 tc_ctl_action+0x366/0x456 net/sched/act_api.c:1466
 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5440
 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2478
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6b9/0x7d0 net/socket.c:2343
 ___sys_sendmsg+0x100/0x170 net/socket.c:2397
 __sys_sendmsg+0xec/0x1b0 net/socket.c:2430
 do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4403d9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffd719af218 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004403d9
RDX: 0000000000000000 RSI: 0000000020000300 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000005 R09: 00000000004002c8
R10: 0000000000000008 R11: 00000000000

Fixes: c34b961a24 ("net/sched: act_ct: Create nf flow table per zone")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paul Blakey <paulb@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-08 21:47:20 -07:00
..
act_api.c sched: act: allow user to specify type of HW stats for a filter 2020-03-08 21:07:48 -07:00
act_bpf.c net: sched: update action implementations to support flags 2019-10-30 18:07:51 -07:00
act_connmark.c net: sched: update action implementations to support flags 2019-10-30 18:07:51 -07:00
act_csum.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-11-26 15:42:43 -08:00
act_ct.c net/sched: act_ct: fix lockdep splat in tcf_ct_flow_table_get 2020-03-08 21:47:20 -07:00
act_ctinfo.c net: sched: act_ctinfo: fix memory leak 2020-01-19 16:02:15 +01:00
act_gact.c net: sched: update action implementations to support flags 2019-10-30 18:07:51 -07:00
act_ife.c net/sched: act_ife: initalize ife->metalist earlier 2020-01-17 10:58:15 +01:00
act_ipt.c net: sched: update action implementations to support flags 2019-10-30 18:07:51 -07:00
act_meta_mark.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_meta_skbprio.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_meta_skbtcindex.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_mirred.c net/sched: act_mirred: Pull mac prior redir to non mac_header_xmit device 2019-12-27 16:35:32 -08:00
act_mpls.c net: Fixed updating of ethertype in skb_mpls_push() 2019-12-04 17:11:25 -08:00
act_nat.c icmp: remove duplicate code 2019-11-05 14:03:11 -08:00
act_pedit.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-22 16:27:24 -08:00
act_police.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-11-26 15:42:43 -08:00
act_sample.c net: sched: lock action when translating it to flow_action infra 2020-02-17 14:17:02 -08:00
act_simple.c net_sched: extend packet counter to 64bit 2019-11-05 18:20:55 -08:00
act_skbedit.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-11-26 15:42:43 -08:00
act_skbmod.c net: sched: update action implementations to support flags 2019-10-30 18:07:51 -07:00
act_tunnel_key.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-11-26 15:42:43 -08:00
act_vlan.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-11-26 15:42:43 -08:00
cls_api.c sched: act: allow user to specify type of HW stats for a filter 2020-03-08 21:07:48 -07:00
cls_basic.c net_sched: fix ops->bind_class() implementations 2020-01-27 10:51:43 +01:00
cls_bpf.c net_sched: fix ops->bind_class() implementations 2020-01-27 10:51:43 +01:00
cls_cgroup.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
cls_flow.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
cls_flower.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-02-21 13:39:34 -08:00
cls_fw.c net_sched: fix ops->bind_class() implementations 2020-01-27 10:51:43 +01:00
cls_matchall.c net: sched: don't take rtnl lock during flow_action setup 2020-02-17 14:17:02 -08:00
cls_route.c net_sched: fix ops->bind_class() implementations 2020-01-27 10:51:43 +01:00
cls_rsvp6.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
cls_rsvp.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
cls_rsvp.h cls_rsvp: fix rsvp_policy 2020-02-01 12:25:06 -08:00
cls_tcindex.c net_sched: fix a resource leak in tcindex_set_parms() 2020-02-05 14:11:57 +01:00
cls_u32.c net_sched: fix ops->bind_class() implementations 2020-01-27 10:51:43 +01:00
em_canid.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 11 2019-05-21 11:28:45 +02:00
em_cmp.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
em_ipset.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
em_ipt.c net: sched: Replace zero-length array with flexible-array member 2020-02-29 21:27:02 -08:00
em_meta.c net: annotate lockless accesses to sk->sk_max_ack_backlog 2019-11-06 16:14:48 -08:00
em_nbyte.c net: sched: Replace zero-length array with flexible-array member 2020-02-29 21:27:02 -08:00
em_text.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
em_u32.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
ematch.c net_sched: ematch: reject invalid TCF_EM_SIMPLE 2020-01-27 10:55:26 +01:00
Kconfig net/sched: act_ct: Create nf flow table per zone 2020-03-03 15:09:12 -08:00
Makefile net: sched: add Flow Queue PIE packet scheduler 2020-01-23 11:38:31 +01:00
sch_api.c net_sched: walk through all child classes in tc_bind_tclass() 2020-01-27 10:52:46 +01:00
sch_atm.c net: sched: Replace zero-length array with flexible-array member 2020-02-29 21:27:02 -08:00
sch_blackhole.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sch_cake.c net: sched: use skb_list_walk_safe helper for gso segments 2020-01-14 11:48:41 -08:00
sch_cbq.c sch_cbq: validate TCA_CBQ_WRROPT to avoid crash 2019-09-30 11:07:46 -07:00
sch_cbs.c net: sched: cbs: Avoid division by zero when calculating the port rate 2019-10-01 09:51:39 -07:00
sch_choke.c sch_choke: Use kvcalloc 2020-01-29 11:58:10 +01:00
sch_codel.c net: sched: Fix a possible null-pointer dereference in dequeue_func() 2019-07-29 09:46:58 -07:00
sch_drr.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sch_dsmark.c sch_dsmark: fix potential NULL deref in dsmark_init() 2019-10-04 18:28:30 -07:00
sch_etf.c sched: etf: Fix ordering of packets with same txtime 2019-10-15 20:32:04 -07:00
sch_ets.c net: sch_ets: Make the ETS qdisc offloadable 2019-12-18 13:32:29 -08:00
sch_fifo.c net: sched: Make FIFO Qdisc offloadable 2020-03-05 14:03:31 -08:00
sch_fq_codel.c fq_codel: do not include <linux/jhash.h> 2019-10-22 15:31:42 -07:00
sch_fq_pie.c pie: remove pie_vars->accu_prob_overflows 2020-03-04 13:25:55 -08:00
sch_fq.c pkt_sched: fq: do not accept silly TCA_FQ_QUANTUM 2020-01-08 12:40:47 -08:00
sch_generic.c netdev: pass the stuck queue to the timeout handler 2019-12-12 21:38:57 -08:00
sch_gred.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sch_hfsc.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
sch_hhf.c net/flow_dissector: switch to siphash 2019-10-23 20:13:22 -07:00
sch_htb.c net: sched: sch_htb: don't call qdisc_put() while holding tree lock 2019-09-27 12:13:55 +02:00
sch_ingress.c net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_* 2019-07-09 14:38:50 -07:00
sch_mq.c net: sched: fix dump qlen for sch_mq/sch_mqprio with NOLOCK subqueues 2019-12-03 11:53:55 -08:00
sch_mqprio.c mqprio: Fix out-of-bounds access in mqprio_dump 2019-12-06 11:58:45 -08:00
sch_multiq.c net: sched: fix tc -s class show no bstats on class with nolock subqueues 2019-11-30 10:38:40 -08:00
sch_netem.c net: sched: Replace zero-length array with flexible-array member 2020-02-29 21:27:02 -08:00
sch_pie.c pie: remove pie_vars->accu_prob_overflows 2020-03-04 13:25:55 -08:00
sch_plug.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sch_prio.c net: sch_prio: When ungrafting, replace with FIFO 2020-01-08 12:45:53 -08:00
sch_qfq.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sch_red.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sch_sfb.c net/flow_dissector: switch to siphash 2019-10-23 20:13:22 -07:00
sch_sfq.c net/flow_dissector: switch to siphash 2019-10-23 20:13:22 -07:00
sch_skbprio.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sch_taprio.c taprio: Fix dropping packets when using taprio + ETF offloading 2020-02-07 11:30:03 +01:00
sch_tbf.c net: sched: Make TBF Qdisc offloadable 2020-01-25 10:56:31 +01:00
sch_teql.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00