linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-15 08:31:55 +00:00

Eric Dumazet 19757cebf0 tcp: switch orphan_count to bare per-cpu counters	2021-10-15 11:28:34 +01:00

Use of percpu_counter structure to track count of orphaned sockets is causing problems on modern hosts with 256 cpus or more.

Stefan Bach reported a serious spinlock contention in real workloads, that I was able to reproduce with a netfilter rule dropping incoming FIN packets.

    53.56%  server  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
            |
            ---queued_spin_lock_slowpath
               |
                --53.51%--_raw_spin_lock_irqsave
                          |
                           --53.51%--__percpu_counter_sum
                                     tcp_check_oom
                                     |
                                     |--39.03%--__tcp_close
                                     |          tcp_close
                                     |          inet_release
                                     |          inet6_release
                                     |          sock_close
                                     |          __fput
                                     |          ____fput
                                     |          task_work_run
                                     |          exit_to_usermode_loop
                                     |          do_syscall_64
                                     |          entry_SYSCALL_64_after_hwframe
                                     |          __GI___libc_close
                                     |
                                      --14.48%--tcp_out_of_resources
                                                tcp_write_timeout
                                                tcp_retransmit_timer
                                                tcp_write_timer_handler
                                                tcp_write_timer
                                                call_timer_fn
                                                expire_timers
                                                __run_timers
                                                run_timer_softirq
                                                __softirqentry_text_start

As explained in commit `cf86a086a1` ("net/dst: use a smaller percpu_counter batch for dst entries accounting"), default batch size is too big for the default value of tcp_max_orphans (262144).

But even if we reduce batch sizes, there would still be cases where the estimated count of orphans is beyond the limit, and where tcp_too_many_orphans() has to call the expensive percpu_counter_sum_positive().

One solution is to use plain per-cpu counters, and have a timer to periodically refresh this cache.

Updating this cache every 100ms seems about right, tcp pressure state is not radically changing over shorter periods.

percpu_counter was nice 15 years ago while hosts had less than 16 cpus, not anymore by current standards.

v2: Fix the build issue for CONFIG_CRYPTO_DEV_CHELSIO_TLS=m, reported by kernel test robot <lkp@intel.com>
    Remove unused socket argument from tcp_too_many_orphans()

Fixes: `dd24c00191` ("net: Use a percpu_counter for orphan_count")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Stefan Bach <sfb@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
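
The idea described in the commit message above (plain per-cpu counters plus a periodically refreshed cached total) can be illustrated with a small userspace sketch. This is not the kernel code from the commit: it is a single-threaded simulation, and every name in it (`NR_SLOTS`, `ORPHAN_LIMIT`, `slot_count`, `cached_total`, `orphan_inc`, `orphan_dec`, `refresh_cache`, `too_many_orphans`) is hypothetical. The small limit stands in for `tcp_max_orphans`, and calling `refresh_cache()` by hand stands in for the 100ms timer.

```c
#include <assert.h>

#define NR_SLOTS     4   /* hypothetical stand-in for the number of CPUs */
#define ORPHAN_LIMIT 8   /* hypothetical stand-in for tcp_max_orphans */

static int slot_count[NR_SLOTS];  /* one plain counter per simulated CPU */
static int cached_total;          /* refreshed periodically, read by the checks */

/* Update path: touches only the caller's own slot, no shared lock. */
static void orphan_inc(int cpu)
{
    assert(cpu >= 0 && cpu < NR_SLOTS);
    slot_count[cpu]++;
}

static void orphan_dec(int cpu)
{
    assert(cpu >= 0 && cpu < NR_SLOTS);
    slot_count[cpu]--;
}

/*
 * Refresh path: what a periodic timer would run. A single slot may be
 * negative (incremented on one CPU, decremented on another), so only
 * the sum is meaningful; it is clamped at zero here.
 */
static void refresh_cache(void)
{
    int total = 0;

    for (int i = 0; i < NR_SLOTS; i++)
        total += slot_count[i];
    cached_total = total > 0 ? total : 0;
}

/* Check path: reads the possibly stale cache instead of summing all slots. */
static int too_many_orphans(void)
{
    return cached_total > ORPHAN_LIMIT;
}
```

The trade-off in this sketch is that `too_many_orphans()` can be stale by up to one refresh interval, in exchange for never walking every slot under a lock. Until `refresh_cache()` runs, increments and decrements are invisible to the check.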
..
6lowpan
9p	net/9p: increase default msize to 128k	2021-09-05 08:36:44 +09:00
802	llc/snap: constify dev_addr passing	2021-10-13 09:40:46 -07:00
8021q	net: use eth_hw_addr_set() instead of ether_addr_copy()	2021-10-02 14:18:25 +01:00
appletalk	net: socket: rework compat_ifreq_ioctl()	2021-07-23 14:20:25 +01:00
atm	ethernet: replace netdev->dev_addr assignment loops	2021-10-14 09:22:25 -07:00
ax25	ax25: constify dev_addr passing	2021-10-13 09:40:45 -07:00
batman-adv	Kbuild updates for v5.15	2021-09-03 15:33:47 -07:00
bluetooth	bluetooth-next pull request for net-next:	2021-10-05 07:41:16 -07:00
bpf	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	2021-10-01 19:58:02 -07:00
bpfilter
bridge	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-10-07 15:24:06 -07:00
caif	net-caif: avoid user-triggerable WARN_ON(1)	2021-09-14 12:51:15 +01:00
can	net: Remove redundant if statements	2021-08-05 13:27:50 +01:00
ceph
core	page_pool: disable dma mapping support for 32-bit arch with 64-bit DMA	2021-10-15 10:54:20 +01:00
dcb
dccp	tcp: switch orphan_count to bare per-cpu counters	2021-10-15 11:28:34 +01:00
decnet	net: Remove redundant if statements	2021-08-05 13:27:50 +01:00
dns_resolver
dsa	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-10-14 16:50:14 -07:00
ethernet	eth: platform: add a helper for loading netdev->dev_addr	2021-10-08 14:54:33 +01:00
ethtool	ethtool: Add ability to control transceiver modules' power mode	2021-10-06 17:47:49 -07:00
hsr	net: use eth_hw_addr_set() instead of ether_addr_copy()	2021-10-02 14:18:25 +01:00
ieee802154	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-08-13 06:41:22 -07:00
ife
ipv4	tcp: switch orphan_count to bare per-cpu counters	2021-10-15 11:28:34 +01:00
ipv6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-10-14 16:50:14 -07:00
iucv	net/iucv: Replace deprecated CPU-hotplug functions.	2021-08-09 10:13:32 +01:00
kcm
key
l2tp	net/l2tp: Fix reference count leak in l2tp_udp_recv_core	2021-09-09 11:00:20 +01:00
l3mdev
lapb
llc	llc/snap: constify dev_addr passing	2021-10-13 09:40:46 -07:00
mac80211	net: mac80211: check return value of rhashtable_init	2021-09-28 12:59:24 +01:00
mac802154	ieee802154: Remove redundant initialization of variable ret	2021-09-07 14:06:08 +01:00
mctp	mctp: Avoid leak of mctp_sk_key	2021-10-15 11:22:08 +01:00
mpls	mpls: defer ttl decrement in mpls_forward()	2021-07-23 17:17:56 +01:00
mptcp	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-10-14 16:50:14 -07:00
ncsi	net/ncsi: add get MAC address command to get Intel i210 MAC address	2021-09-01 17:18:56 -07:00
netfilter	netfilter: nf_tables: honor NLM_F_CREATE and NLM_F_EXCL in event notification	2021-10-02 12:00:17 +02:00
netlabel	net: fix NULL pointer reference in cipso_v4_doi_free	2021-08-30 12:23:18 +01:00
netlink	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-10-07 15:24:06 -07:00
netrom	ax25: constify dev_addr passing	2021-10-13 09:40:45 -07:00
nfc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-10-14 16:50:14 -07:00
nsh
openvswitch	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-08-19 18:09:18 -07:00
packet	net/packet: clarify source of pr_*() messages	2021-09-10 10:00:59 +01:00
phonet	net: Remove redundant if statements	2021-08-05 13:27:50 +01:00
psample
qrtr	net: qrtr: combine nameservice into main module	2021-09-28 17:36:43 -07:00
rds	net/rds: dma_map_sg is entitled to merge entries	2021-08-18 15:35:50 -07:00
rfkill
rose	rose: constify dev_addr passing	2021-10-13 09:40:45 -07:00
rxrpc	rxrpc: Fix _usecs_to_jiffies() by using usecs_to_jiffies()	2021-09-24 14:18:34 +01:00
sched	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-10-14 16:50:14 -07:00
sctp	sctp: account stream padding length for reconf chunk	2021-10-14 07:15:22 -07:00
smc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2021-10-14 16:50:14 -07:00
strparser
sunrpc	Bug fixes for NFSD error handling paths	2021-10-07 14:11:40 -07:00
switchdev	net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge	2021-08-04 12:35:07 +01:00
tipc	tipc: constify dev_addr passing	2021-10-13 09:40:46 -07:00
tls	net/tls: support SM4 CCM algorithm	2021-09-28 13:26:23 +01:00
unix	af_unix: Rename UNIX-DGRAM to UNIX to maintain backwards compatability	2021-10-12 11:16:49 +01:00
vmw_vsock	vsock: Enable y2038 safe timeval for timeout	2021-10-08 16:21:53 +01:00
wireless	cfg80211: use wiphy DFS domain if it is self-managed	2021-08-26 11:04:55 +02:00
x25
xdp	xsk: Fix clang build error in __xp_alloc	2021-09-29 13:59:13 +02:00
xfrm	xfrm: fix rcu lock in xfrm_notify_userpolicy()	2021-09-23 10:11:12 +02:00
compat.c
devres.c
Kconfig	net/core: disable NET_RX_BUSY_POLL on PREEMPT_RT	2021-10-01 15:45:10 -07:00
Makefile	mctp: Add MCTP base	2021-07-29 15:06:49 +01:00
socket.c	Core:	2021-08-31 16:43:06 -07:00
sysctl_net.c