linux/net/sched
Eric Dumazet edb09eb17e net: sched: do not acquire qdisc spinlock in qdisc/class stats dump
Large tc dumps (tc -s {qdisc|class} sh dev ethX) done by Google BwE host
agent [1] are problematic at scale :

For each qdisc/class found in the dump, we currently lock the root qdisc
spinlock in order to get stats. Sampling stats every 5 seconds from
thousands of HTB classes is a challenge when the root qdisc spinlock is
under high pressure. Not only the dumps take time, they also slow
down the fast path (queue/dequeue packets) by 10 % to 20 % in some cases.

An audit of existing qdiscs showed that sch_fq_codel is the only qdisc
that might need the qdisc lock in fq_codel_dump_stats() and
fq_codel_dump_class_stats()

In v2 of this patch, I now use the Qdisc running seqcount to provide
consistent reads of packets/bytes counters, regardless of 32/64 bit arches.

I also changed rate estimators to use the same infrastructure
so that they no longer need to lock root qdisc lock.

[1]
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdf

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Kevin Athey <kda@google.com>
Cc: Xiaotian Pei <xiaotian@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:37:14 -07:00
..
act_api.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
act_bpf.c net sched: indentation and other OCD stylistic fixes 2016-06-07 15:53:54 -07:00
act_connmark.c net sched actions: aggregate dumping of actions timeinfo 2016-06-07 15:53:43 -07:00
act_csum.c net sched actions: aggregate dumping of actions timeinfo 2016-06-07 15:53:43 -07:00
act_gact.c net sched: indentation and other OCD stylistic fixes 2016-06-07 15:53:54 -07:00
act_ife.c net sched actions: aggregate dumping of actions timeinfo 2016-06-07 15:53:43 -07:00
act_ipt.c net sched: indentation and other OCD stylistic fixes 2016-06-07 15:53:54 -07:00
act_meta_mark.c Support to encoding decoding skb mark on IFE action 2016-03-01 17:15:23 -05:00
act_meta_skbprio.c Support to encoding decoding skb prio on IFE action 2016-03-01 17:15:23 -05:00
act_mirred.c net sched actions: aggregate dumping of actions timeinfo 2016-06-07 15:53:43 -07:00
act_nat.c net sched actions: aggregate dumping of actions timeinfo 2016-06-07 15:53:43 -07:00
act_pedit.c net sched actions: aggregate dumping of actions timeinfo 2016-06-07 15:53:43 -07:00
act_police.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
act_simple.c net sched actions: aggregate dumping of actions timeinfo 2016-06-07 15:53:43 -07:00
act_skbedit.c net sched actions: aggregate dumping of actions timeinfo 2016-06-07 15:53:43 -07:00
act_vlan.c net sched: indentation and other OCD stylistic fixes 2016-06-07 15:53:54 -07:00
cls_api.c net sched: indentation and other OCD stylistic fixes 2016-06-07 15:53:54 -07:00
cls_basic.c net_sched: destroy proto tp when all filters are gone 2015-03-09 15:35:55 -04:00
cls_bpf.c bpf: wire in data and data_end for cls_act_bpf 2016-05-06 16:01:54 -04:00
cls_cgroup.c cls_cgroup: factor out classid retrieval 2015-07-20 12:41:30 -07:00
cls_flow.c sched: cls_flow: use skb_to_full_sk() helper 2015-11-08 20:56:39 -05:00
cls_flower.c net/sched: cls_flower: Introduce support in SKIP SW flag 2016-06-07 15:49:53 -07:00
cls_fw.c net: revert "net_sched: move tp->root allocation into fw_init()" 2015-09-24 14:33:30 -07:00
cls_route.c net_sched: destroy proto tp when all filters are gone 2015-03-09 15:35:55 -04:00
cls_rsvp6.c
cls_rsvp.c
cls_rsvp.h net_sched: convert rsvp to call tcf_exts_destroy from rcu callback 2015-08-26 11:01:45 -07:00
cls_tcindex.c net_sched: convert tcindex to call tcf_exts_destroy from rcu callback 2015-08-26 11:01:44 -07:00
cls_u32.c net: cls_u32: Add support for skip-sw flag to tc u32 classifier. 2016-05-16 13:30:57 -04:00
em_canid.c net: sched: remove tcf_proto from ematch calls 2014-10-06 18:02:32 -04:00
em_cmp.c
em_ipset.c netfilter: x_tables: Pass struct net in xt_action_param 2015-09-18 21:58:14 +02:00
em_meta.c qdisc: constify meta_type_ops structures 2016-04-14 00:35:30 -04:00
em_nbyte.c net: sched: remove tcf_proto from ematch calls 2014-10-06 18:02:32 -04:00
em_text.c net: Remove state argument from skb_find_text() 2015-02-22 15:59:54 -05:00
em_u32.c
ematch.c ematch: Fix auto-loading of ematch modules. 2015-02-20 15:30:56 -05:00
Kconfig Support to encoding decoding skb prio on IFE action 2016-03-01 17:15:23 -05:00
Makefile Support to encoding decoding skb prio on IFE action 2016-03-01 17:15:23 -05:00
sch_api.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_atm.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_blackhole.c net/sched: make sch_blackhole.c explicitly non-modular 2015-10-09 07:52:28 -07:00
sch_cbq.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_choke.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_codel.c codel: split into multiple files 2016-04-25 16:44:27 -04:00
sch_drr.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_dsmark.c net_sched: dsmark: use qdisc_dequeue_peeked() 2016-03-08 14:35:13 -05:00
sch_fifo.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_fq_codel.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_fq.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_generic.c net_sched: transform qdisc running bit into a seqcount 2016-06-07 16:37:13 -07:00
sch_gred.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_hfsc.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_hhf.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_htb.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_ingress.c net, sched: add clsact qdisc 2016-01-10 22:13:15 -05:00
sch_mq.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_mqprio.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_multiq.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_netem.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-04 00:52:29 -04:00
sch_pie.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_plug.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_prio.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_qfq.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sch_red.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_sfb.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_sfq.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_tbf.c sched: use nla_put_u64_64bit() 2016-04-25 15:09:09 -04:00
sch_teql.c net: sched: fix skb->protocol use in case of accelerated vlan path 2015-01-13 17:51:08 -05:00