linux/net/sched
Daniel Borkmann 81d947e2b8 net, sched: fix panic when updating miniq {b,q}stats
While working on fixing another bug, I ran into the following panic
on arm64 by simply attaching clsact qdisc, adding a filter and running
traffic on ingress to it:

  [...]
  [  178.188591] Unable to handle kernel read from unreadable memory at virtual address 810fb501f000
  [  178.197314] Mem abort info:
  [  178.200121]   ESR = 0x96000004
  [  178.203168]   Exception class = DABT (current EL), IL = 32 bits
  [  178.209095]   SET = 0, FnV = 0
  [  178.212157]   EA = 0, S1PTW = 0
  [  178.215288] Data abort info:
  [  178.218175]   ISV = 0, ISS = 0x00000004
  [  178.222019]   CM = 0, WnR = 0
  [  178.224997] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000023cb3f33
  [  178.231531] [0000810fb501f000] *pgd=0000000000000000
  [  178.236508] Internal error: Oops: 96000004 [#1] SMP
  [...]
  [  178.311855] CPU: 73 PID: 2497 Comm: ping Tainted: G        W        4.15.0-rc7+ #5
  [  178.319413] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
  [  178.326887] pstate: 60400005 (nZCv daif +PAN -UAO)
  [  178.331685] pc : __netif_receive_skb_core+0x49c/0xac8
  [  178.336728] lr : __netif_receive_skb+0x28/0x78
  [  178.341161] sp : ffff00002344b750
  [  178.344465] x29: ffff00002344b750 x28: ffff810fbdfd0580
  [  178.349769] x27: 0000000000000000 x26: ffff000009378000
  [...]
  [  178.418715] x1 : 0000000000000054 x0 : 0000000000000000
  [  178.424020] Process ping (pid: 2497, stack limit = 0x000000009f0a3ff4)
  [  178.430537] Call trace:
  [  178.432976]  __netif_receive_skb_core+0x49c/0xac8
  [  178.437670]  __netif_receive_skb+0x28/0x78
  [  178.441757]  process_backlog+0x9c/0x160
  [  178.445584]  net_rx_action+0x2f8/0x3f0
  [...]

The reason is that sch_ingress and sch_clsact call mini_qdisc_pair_init(),
which sets up miniq pointers to cpu_{b,q}stats from the underlying qdisc.
This cannot work because the stats are actually allocated right after the
qdisc ->init() callback in qdisc_create(), so the miniq pointers are taken
before the stats exist. The first packet going into sch_handle_ingress()
then calls mini_qdisc_bstats_cpu_update() on them and we panic.

To fix this, allocation of {b,q}stats needs to happen before we call
into ->init(). In net-next, there is already such an option through commit
d59f5ffa59 ("net: sched: a dflt qdisc may be used with per cpu stats").
However, the bug still needs to be fixed in net for 4.15. Thus, include
these bits to reduce any merge churn, reuse the static_flags field to
set TCQ_F_CPUSTATS, and remove the allocation from qdisc_create() since
there is no other user left. Prashant Bhole ran into the same issue but
for net-next, thus adding him below as a co-author as well. The same issue
was also reported by Sandipan Das when using bcc.
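
The ordering problem can be illustrated with a small userspace model. This
is only a sketch under stated assumptions, not the kernel code: every type
and function name here (struct stats, struct qdisc, struct miniq, and the
model_* helpers) is a hypothetical stand-in, and a NULL check takes the place
of the fault the kernel hits when it uses the pointer captured too early.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; not the kernel's real structures. */
struct stats {
	unsigned long packets;
};

struct qdisc {
	struct stats *cpu_bstats;	/* models the qdisc's per-CPU bstats */
};

struct miniq {
	struct stats *cpu_bstats;	/* pointer value captured at init time */
};

/* Models what ->init() does via mini_qdisc_pair_init(): it snapshots
 * whatever pointer value the qdisc holds at that moment. */
static void model_miniq_init(struct miniq *m, const struct qdisc *q)
{
	m->cpu_bstats = q->cpu_bstats;
}

/* Models the stats allocation. */
static bool model_alloc_stats(struct qdisc *q)
{
	q->cpu_bstats = calloc(1, sizeof(*q->cpu_bstats));
	return q->cpu_bstats != NULL;
}

/* Buggy order: ->init() runs first, stats are allocated afterwards,
 * so the miniq keeps the stale (unset) pointer. */
static bool model_create_buggy(struct qdisc *q, struct miniq *m)
{
	q->cpu_bstats = NULL;
	model_miniq_init(m, q);
	return model_alloc_stats(q);
}

/* Fixed order: stats are allocated before ->init() is called. */
static bool model_create_fixed(struct qdisc *q, struct miniq *m)
{
	q->cpu_bstats = NULL;
	if (!model_alloc_stats(q))
		return false;
	model_miniq_init(m, q);
	return true;
}

/* Models the ingress fast path updating stats through the miniq.
 * Returns false where the real code would fault on the bad pointer. */
static bool model_rx_update(struct miniq *m)
{
	if (!m->cpu_bstats)
		return false;
	m->cpu_bstats->packets++;
	return true;
}
```

Calling model_create_buggy() followed by model_rx_update() makes the update
fail in this model, while model_create_fixed() lets the same update succeed
and the counter on the qdisc advance. The caller owns q->cpu_bstats and
should free() it when done.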

Fixes: 46209401f8 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath")
Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2018-January/001190.html
Reported-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Co-authored-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Co-authored-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 15:02:36 -05:00
act_api.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
act_bpf.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
act_connmark.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_csum.c net: accept UFO datagrams from tuntap and packet 2017-11-24 01:37:35 +09:00
act_gact.c net/sched: Fix update of lastuse in act modules implementing stats_update 2018-01-02 13:27:52 -05:00
act_ife.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
act_ipt.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_meta_mark.c net: remove duplicate includes 2017-12-13 13:18:46 -05:00
act_meta_skbprio.c net sched actions: change IFE modules alias names 2017-10-12 22:13:20 -07:00
act_meta_skbtcindex.c net: remove duplicate includes 2017-12-13 13:18:46 -05:00
act_mirred.c net/sched: Fix update of lastuse in act modules implementing stats_update 2018-01-02 13:27:52 -05:00
act_nat.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_pedit.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_police.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_sample.c act_sample: get rid of tcf_sample_cleanup_rcu() 2017-11-30 10:19:17 -05:00
act_simple.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_skbedit.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_skbmod.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_tunnel_key.c Revert "net_sched: hold netns refcnt for each action" 2017-11-09 10:03:09 +09:00
act_vlan.c act_vlan: VLAN action rewrite to use RCU lock/unlock and update 2017-11-10 15:32:20 +09:00
cls_api.c net: sched: fix possible null pointer deref in tcf_block_put 2017-12-26 13:02:05 -05:00
cls_basic.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
cls_bpf.c cls_bpf: fix offload assumptions after callback conversion 2017-12-20 13:08:18 -05:00
cls_cgroup.c cls_cgroup: use tcf_exts_get_net() before call_rcu() 2017-11-09 10:03:09 +09:00
cls_flow.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
cls_flower.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
cls_fw.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
cls_matchall.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
cls_route.c cls_route: use tcf_exts_get_net() before call_rcu() 2017-11-09 10:03:10 +09:00
cls_rsvp6.c
cls_rsvp.c
cls_rsvp.h cls_rsvp: use tcf_exts_get_net() before call_rcu() 2017-11-09 10:03:10 +09:00
cls_tcindex.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
cls_u32.c net: remove duplicate includes 2017-12-13 13:18:46 -05:00
em_canid.c
em_cmp.c
em_ipset.c netfilter: x_tables: move hook state into xt_action_param structure 2016-11-03 10:56:21 +01:00
em_meta.c net: convert sock.sk_refcnt from atomic_t to refcount_t 2017-07-01 07:39:08 -07:00
em_nbyte.c
em_text.c net: Remove state argument from skb_find_text() 2015-02-22 15:59:54 -05:00
em_u32.c
ematch.c net: sched: ematch: obtain net pointer from blocks 2017-10-16 21:00:40 +01:00
Kconfig net/sched: Introduce Credit Based Shaper (CBS) qdisc 2017-10-27 09:48:02 -07:00
Makefile Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
sch_api.c net, sched: fix panic when updating miniq {b,q}stats 2018-01-16 15:02:36 -05:00
sch_atm.c net: sched: store Qdisc pointer in struct block 2017-10-16 21:00:40 +01:00
sch_blackhole.c net_sched: drop packets after root qdisc lock is released 2016-06-25 12:19:35 -04:00
sch_cbq.c net: sched: cbq: create block for q->link.block 2017-11-28 16:04:26 -05:00
sch_cbs.c net_sch: cbs: Change TC_SETUP_CBS to TC_SETUP_QDISC_CBS 2017-11-08 12:23:38 +09:00
sch_choke.c net_sched: red: Avoid illegal values 2017-12-05 14:37:13 -05:00
sch_codel.c netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
sch_drr.c net: sched: mark expected switch fall-throughs 2017-10-22 02:07:08 +01:00
sch_dsmark.c net: sched: store Qdisc pointer in struct block 2017-10-16 21:00:40 +01:00
sch_fifo.c sched: don't use skb queue helpers 2016-09-19 01:47:18 -04:00
sch_fq_codel.c net: sched: mark expected switch fall-throughs 2017-10-22 02:07:08 +01:00
sch_fq.c mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic 2017-07-12 16:26:03 -07:00
sch_generic.c net, sched: fix panic when updating miniq {b,q}stats 2018-01-16 15:02:36 -05:00
sch_gred.c net_sched: red: Avoid illegal values 2017-12-05 14:37:13 -05:00
sch_hfsc.c net: sched: mark expected switch fall-throughs 2017-10-22 02:07:08 +01:00
sch_hhf.c sch_hhf: fix null pointer dereference on init failure 2017-08-30 15:26:11 -07:00
sch_htb.c net: sched: mark expected switch fall-throughs 2017-10-22 02:07:08 +01:00
sch_ingress.c net, sched: fix panic when updating miniq {b,q}stats 2018-01-16 15:02:36 -05:00
sch_mq.c net/sched: Change behavior of mq select_queue() 2017-10-27 09:41:45 -07:00
sch_mqprio.c net_sch: mqprio: Change TC_SETUP_MQPRIO to TC_SETUP_QDISC_MQPRIO 2017-11-08 12:23:38 +09:00
sch_multiq.c net: sched: mark expected switch fall-throughs 2017-10-22 02:07:08 +01:00
sch_netem.c netem: remove unnecessary 64 bit modulus 2017-11-15 14:14:16 +09:00
sch_pie.c net: sched: Convert timers to use timer_setup() 2017-10-18 12:39:54 +01:00
sch_plug.c net_sched: drop packets after root qdisc lock is released 2016-06-25 12:19:35 -04:00
sch_prio.c net: sched: mark expected switch fall-throughs 2017-10-22 02:07:08 +01:00
sch_qfq.c net: sched: mark expected switch fall-throughs 2017-10-22 02:07:08 +01:00
sch_red.c net: sched: Move to new offload indication in RED 2017-12-15 13:35:36 -05:00
sch_sfb.c net: sched: mark expected switch fall-throughs 2017-10-22 02:07:08 +01:00
sch_sfq.c net_sched: red: Avoid illegal values 2017-12-05 14:37:13 -05:00
sch_tbf.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-09-01 17:42:05 -07:00
sch_teql.c net: make ndo_get_stats64 a void function 2017-01-08 17:51:44 -05:00