linux/net
Florian Westphal e5072053b0 netfilter: conntrack: refine gc worker heuristics, redux
This further refines the changes made to conntrack gc_worker in
commit e0df8cae6c ("netfilter: conntrack: refine gc worker heuristics").

The main idea of that change was to reduce the scan interval when evictions
take place.

However, on the reporters' setup, there are 1-2 million conntrack entries
in total and roughly 8k new (and closing) connections per second.

In this case we'll always evict at least one entry per gc cycle and scan
interval is always at 1 jiffy because of this test:

 } else if (expired_count) {
     gc_work->next_gc_run /= 2U;
     next_run = msecs_to_jiffies(1);

being true almost all the time.

Given we scan ~10k entries per run its clearly wrong to reduce interval
based on nonzero eviction count, it will only waste cpu cycles since a vast
majorities of conntracks are not timed out.

Thus only look at the ratio (scanned entries vs. evicted entries) to make
a decision on whether to reduce or not.

Because evictor is supposed to only kick in when system turns idle after
a busy period, pick a high ratio -- this makes it 50%.  We thus keep
the idea of increasing scan rate when its likely that table contains many
expired entries.

In order to not let timed-out entries hang around for too long
(important when using event logging, in which case we want to timely
destroy events), we now scan the full table within at most
GC_MAX_SCAN_JIFFIES (16 seconds) even in worst-case scenario where all
timed-out entries sit in same slot.

I tested this with a vm under synflood (with
sysctl net.netfilter.nf_conntrack_tcp_timeout_syn_recv=3).

While flood is ongoing, interval now stays at its max rate
(GC_MAX_SCAN_JIFFIES / GC_MAX_BUCKETS_DIV -> 125ms).

With feedback from Nicolas Dichtel.

Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Fixes: b87a2f9199 ("netfilter: conntrack: add gc worker to remove timed-out entries")
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-19 14:28:01 +01:00
..
6lowpan 6lowpan: ndisc: no overreact if no short address is available 2016-09-19 20:19:34 +02:00
9p IB/core: add support to create a unsafe global rkey to ib_create_pd 2016-09-23 13:47:44 -04:00
802 Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
8021q Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
appletalk appletalk: use IS_ENABLED() instead of checking for built-in or module 2016-09-10 21:19:10 -07:00
atm net: atm: Fix warnings in net/atm/lec.c when !CONFIG_PROC_FS 2016-12-28 15:11:32 -05:00
ax25 Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-06 21:33:19 -05:00
bluetooth Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-16 10:24:44 -08:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2017-01-05 11:49:57 -05:00
caif Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-06 21:33:19 -05:00
can ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
ceph libceph: remove now unused finish_request() wrapper 2016-12-14 22:39:08 +01:00
core drop_monitor: consider inserted data in genlmsg_end 2017-01-03 11:09:44 -05:00
dcb net: dcb: set error code on failures 2016-12-03 23:54:25 -05:00
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-03 12:29:53 -05:00
decnet Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
dns_resolver
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-03 12:29:53 -05:00
ethernet net: make default TX queue length a defined constant 2016-11-07 20:15:55 -05:00
hsr Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
ieee802154 Makefile: drop -D__CHECK_ENDIAN__ from cflags 2016-12-16 00:13:43 +02:00
ipv4 netfilter: ipt_CLUSTERIP: fix build error without procfs 2017-01-18 20:59:22 +01:00
ipv6 netfilter: rpfilter: fix incorrect loopback packet judgment 2017-01-16 14:23:01 +01:00
ipx ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
irda Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
iucv Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-12 19:25:04 -08:00
kcm Merge branch 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-07 15:36:58 -07:00
key netns: make struct pernet_operations::id unsigned int 2016-11-18 10:59:15 -05:00
l2tp l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups 2017-01-01 22:07:20 -05:00
l3mdev net: ipv6: Remove l3mdev_get_saddr6 2016-09-10 23:12:53 -07:00
lapb Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
llc net: fix sleeping for sk_wait_event() 2016-11-14 13:17:21 -05:00
mac80211 mac80211: initialize fast-xmit 'info' later 2017-01-02 11:28:25 +01:00
mac802154 ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
mpls Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-06 21:33:19 -05:00
ncsi net/ncsi: Improve HNCDSC AEN handler 2016-10-20 11:23:08 -04:00
netfilter netfilter: conntrack: refine gc worker heuristics, redux 2017-01-19 14:28:01 +01:00
netlabel netlabel: add CALIPSO to the list of built-in protocols 2017-01-06 22:20:45 -05:00
netlink Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
netrom
nfc genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
openvswitch openvswitch: upcall: Fix vlan handling. 2016-12-27 12:28:07 -05:00
packet Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
phonet netns: make struct pernet_operations::id unsigned int 2016-11-18 10:59:15 -05:00
qrtr
rds RDS: use rb_entry() 2016-12-20 14:22:49 -05:00
rfkill
rose Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
rxrpc rxrpc: abstract away knowledge of IDR internals 2016-12-14 16:04:10 -08:00
sched net/sched: cls_flower: Fix missing addr_type in classify 2016-12-28 14:28:13 -05:00
sctp ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
strparser strparser: Propagate correct error code in strp_recv() 2016-10-12 01:51:49 -04:00
sunrpc ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
switchdev Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
tipc tipc: don't send FIN message from connectionless socket 2016-12-23 17:53:47 -05:00
unix Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
vmw_vsock Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-17 20:17:04 -08:00
wimax genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
wireless nl80211: fix sched scan netlink socket owner destruction 2017-01-05 10:59:53 +01:00
x25 Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
xfrm ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
compat.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
Kconfig bpf: BPF for lightweight tunnel infrastructure 2016-12-02 10:51:49 -05:00
Makefile
socket.c net: socket: don't set sk_uid to garbage value in ->setattr() 2017-01-01 11:53:34 -05:00
sysctl_net.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-10-06 09:52:23 -07:00