linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-18 18:11:56 +00:00

Eric Dumazet 8b27dae5a2 tcp: add one skb cache for rx

Often times, recvmsg() system calls and BH handling for a particular TCP socket are done on different cpus.

This means the incoming skb had to be allocated on a cpu, but freed on another.

This incurs a high spinlock contention in slab layer for small rpc, but also a high number of cache line ping pongs for larger packets.

A full size GRO packet might use 45 page fragments, meaning that up to 45 put_page() can be involved.

More over performing the __kfree_skb() in the recvmsg() context adds a latency for user applications, and increase probability of trapping them in backlog processing, since the BH handler might found the socket owned by the user.

This patch, combined with the prior one increases the rpc performance by about 10 % on servers with large number of cores.

(tcp_rr workload with 10,000 flows and 112 threads reach 9 Mpps instead of 8 Mpps)

This also increases single bulk flow performance on 40Gbit+ links, since in this case there are often two cpus working in tandem :

- CPU handling the NIC rx interrupts, feeding the receive queue, and (after this patch) freeing the skbs that were consumed.

- CPU in recvmsg() system call, essentially 100 % busy copying out data to user space.

Having at most one skb in a per-socket cache has very little risk of memory exhaustion, and since it is protected by socket lock, its management is essentially free.

Note that if rps/rfs is used, we do not enable this feature, because there is high chance that the same cpu is handling both the recvmsg() system call and the TCP rx path, but that another cpu did the skb allocations in the device driver right before the RPS/RFS logic.

To properly handle this case, it seems we would need to record on which cpu skb was allocated, and use a different channel to give skbs back to this cpu.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2019-03-23 21:57:38 -04:00
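
The one-slot cache described in the commit message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct buf`, `struct sock_model`, `reader_consume()` and `producer_drain()` are hypothetical names invented for illustration, `free()` stands in for `__kfree_skb()`, and both functions are assumed to be called with the socket lock held.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb; not the kernel's struct sk_buff. */
struct buf {
	int id;
};

/* Hypothetical stand-in for a socket carrying a one-entry rx cache. */
struct sock_model {
	struct buf *rx_cache;   /* at most one consumed buffer awaiting free */
	int freed_by_reader;    /* frees done in the recvmsg()-like path */
	int freed_by_producer;  /* frees done in the rx (BH-like) path */
};

/*
 * Reader side, modelling recvmsg(): the buffer has been copied out.
 * If the slot is empty, park the buffer there so the rx side can free it
 * later; otherwise free it locally, as was done before the cache existed.
 * Assumed to run with the socket lock held.
 */
static void reader_consume(struct sock_model *sk, struct buf *b)
{
	assert(b != NULL);
	if (sk->rx_cache == NULL) {
		sk->rx_cache = b;
		return;
	}
	free(b);
	sk->freed_by_reader++;
}

/*
 * Producer side, modelling the rx handler: release whatever the reader
 * parked, so the free happens in the rx context rather than in the reader.
 * Assumed to run with the socket lock held.
 */
static void producer_drain(struct sock_model *sk)
{
	if (sk->rx_cache != NULL) {
		free(sk->rx_cache);
		sk->rx_cache = NULL;
		sk->freed_by_producer++;
	}
}
```

Because the slot holds at most one buffer, the memory pinned per socket is bounded, and a plain pointer check is enough since the socket lock already serializes both sides. The model does not cover the rps/rfs case that the commit message explicitly excludes.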
..
bpfilter	net: bpfilter: disallow to remove bpfilter module while being used	2019-01-11 18:05:41 -08:00
netfilter	netfilter: nf_tables: merge ipv4 and ipv6 nat chain types	2019-03-01 14:36:59 +01:00
af_inet.c	tcp: add one skb cache for rx	2019-03-23 21:57:38 -04:00
ah4.c	net-ipv4: remove 2 always zero parameters from ipv4_redirect()	2018-09-26 20:30:55 -07:00
arp.c	net: Evict neighbor entries on carrier down	2018-10-12 09:47:39 -07:00
cipso_ipv4.c	netlabel: fix out-of-bounds memory accesses	2019-02-27 21:45:24 -08:00
datagram.c	ipv4: Allow sending multicast packets on specific i/f using VRF socket	2018-10-02 22:28:17 -07:00
devinet.c	net: ignore sysctl_devconf_inherit_init_net without SYSCTL	2019-03-04 13:14:34 -08:00
esp4_offload.c	net: use skb_sec_path helper in more places	2018-12-19 11:21:37 -08:00
esp4.c	esp: Skip TX bytes accounting when sending from a request socket	2019-01-28 11:20:58 +01:00
fib_frontend.c	ipv4: Return error for RTA_VIA attribute	2019-02-26 13:23:17 -08:00
fib_lookup.h
fib_notifier.c
fib_rules.c	ipv4: fib_rules: Fix possible infinite loop in fib_empty_table	2018-12-30 12:57:04 -08:00
fib_semantics.c	ipv4: fib: use struct_size() in kzalloc()	2019-02-01 15:12:29 -08:00
fib_trie.c	ipv4: Allow amount of dirty memory from fib resizing to be controllable	2019-03-21 13:29:53 -07:00
fou.c	genetlink: make policy common to family	2019-03-22 10:38:23 -04:00
gre_demux.c	net: ip_gre: use erspan key field for tunnel lookup	2019-01-22 11:52:17 -08:00
gre_offload.c	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	2018-07-03 10:29:26 +09:00
icmp.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2019-03-02 12:54:35 -08:00
igmp.c	net: remove unneeded switch fall-through	2019-02-21 13:48:00 -08:00
inet_connection_sock.c	inet: minor optimization for backlog setting in listen(2)	2018-11-07 22:31:07 -08:00
inet_diag.c	inet_diag: fix reporting cgroup classid and fallback to priority	2019-02-12 13:35:57 -05:00
inet_fragment.c	net: remove unused struct inet_frag_queue.fragments field	2019-02-26 08:27:05 -08:00
inet_hashtables.c	net: dccp: fix kernel crash on module load	2018-12-24 15:27:56 -08:00
inet_timewait_sock.c	soreuseport: initialise timewait reuseport field	2018-04-07 22:32:32 -04:00
inetpeer.c	net: ipv4: use a dedicated counter for icmp_v4 redirect packets	2019-02-08 21:50:15 -08:00
ip_forward.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-12-20 11:53:36 -08:00
ip_fragment.c	net: remove unused struct inet_frag_queue.fragments field	2019-02-26 08:27:05 -08:00
ip_gre.c	route: Add multipath_hash in flowi_common to make user-define hash	2019-02-27 12:50:17 -08:00
ip_input.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2019-03-02 12:54:35 -08:00
ip_options.c	net: avoid use IPCB in cipso_v4_error	2019-02-25 14:32:35 -08:00
ip_output.c	sk_buff: add skb extension infrastructure	2018-12-19 11:21:37 -08:00
ip_sockglue.c	ip: on queued skb use skb_header_pointer instead of pskb_may_pull	2019-01-10 09:27:20 -05:00
ip_tunnel_core.c	ip_tunnel: Add dst_cache support in lwtunnel_state of ip tunnel	2019-02-24 22:13:49 -08:00
ip_tunnel.c	iptunnel: NULL pointer deref for ip_md_tunnel_xmit	2019-03-06 10:43:06 -08:00
ip_vti.c	vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel	2019-01-09 14:00:37 +01:00
ipcomp.c	net-ipv4: remove 2 always zero parameters from ipv4_redirect()	2018-09-26 20:30:55 -07:00
ipconfig.c	ipconfig: add carrier_timeout kernel parameter	2019-02-01 15:24:13 -08:00
ipip.c	ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit	2019-01-26 09:43:03 -08:00
ipmr_base.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-10-19 11:03:06 -07:00
ipmr.c	ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs	2019-02-21 13:05:05 -08:00
Kconfig	net: remove blank lines at end of file	2018-07-24 14:10:43 -07:00
Makefile	bpf, sockmap: convert to generic sk_msg interface	2018-10-15 12:23:19 -07:00
metrics.c	net: Add extack argument to ip_fib_metrics_init	2018-11-06 15:00:45 -08:00
netfilter.c	netfilter: ipv4: remove useless export_symbol	2019-01-28 11:32:58 +01:00
netlink.c	ipv4: Add ICMPv6 support when parse route ipproto	2019-03-01 16:41:27 -08:00
ping.c	ipv4: Allow sending multicast packets on specific i/f using VRF socket	2018-10-02 22:28:17 -07:00
proc.c	tcp: implement coalescing on backlog queue	2018-11-30 13:26:54 -08:00
protocol.c	fou, fou6: ICMP error handlers for FoU and GUE	2018-11-08 17:13:08 -08:00
raw_diag.c
raw.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-12-20 11:53:36 -08:00
route.c	net: dst: remove gc leftovers	2019-03-21 13:39:25 -07:00
syncookies.c	tcp: free request sock directly upon TFO or syncookies error	2019-03-19 14:13:01 -07:00
sysctl_net_ipv4.c	ipv4: Allow amount of dirty memory from fib resizing to be controllable	2019-03-21 13:29:53 -07:00
tcp_bbr.c	tcp_bbr: adapt cwnd based on ack aggregation estimation	2019-01-24 22:27:27 -08:00
tcp_bic.c
tcp_bpf.c	bpf: sk_msg, sock{map|hash} redirect through ULP	2018-12-20 23:47:09 +01:00
tcp_cdg.c	tcp: cdg: use tcp high resolution clock cache	2018-10-15 22:56:42 -07:00
tcp_cong.c
tcp_cubic.c
tcp_dctcp.c	tcp: refactor DCTCP ECN ACK handling	2018-10-10 22:26:00 -07:00
tcp_dctcp.h	tcp: refactor DCTCP ECN ACK handling	2018-10-10 22:26:00 -07:00
tcp_diag.c
tcp_fastopen.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c	tcp: free request sock directly upon TFO or syncookies error	2019-03-19 14:13:01 -07:00
tcp_ipv4.c	tcp: add one skb cache for rx	2019-03-23 21:57:38 -04:00
tcp_lp.c
tcp_metrics.c	genetlink: make policy common to family	2019-03-22 10:38:23 -04:00
tcp_minisocks.c	tcp: use tcp_md5_needed for timewait sockets	2019-02-26 13:16:03 -08:00
tcp_nv.c
tcp_offload.c	net: use indirect call wrappers at GRO transport layer	2018-12-15 13:23:02 -08:00
tcp_output.c	tcp: remove conditional branches from tcp_mstamp_refresh()	2019-03-23 21:43:21 -04:00
tcp_rate.c	tcp: introduce tcp_skb_timestamp_us() helper	2018-09-21 19:37:59 -07:00
tcp_recovery.c	tcp: introduce tcp_skb_timestamp_us() helper	2018-09-21 19:37:59 -07:00
tcp_scalable.c
tcp_timer.c	tcp: Refactor pingpong code	2019-01-27 13:29:43 -08:00
tcp_ulp.c	tcp, ulp: remove socket lock assertion on ULP cleanup	2018-10-16 12:38:41 -07:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c	tcp: add one skb cache for rx	2019-03-23 21:57:38 -04:00
tunnel4.c	net: Convert protocol error handlers from void to int	2018-11-08 17:13:08 -08:00
udp_diag.c	net: diag: document swapped src/dst in udp_dump_one.	2018-10-28 19:27:21 -07:00
udp_impl.h	udp: add missing rehash callback to udplite	2019-01-17 15:01:08 -08:00
udp_offload.c	udp: use indirect call wrappers for GRO socket lookup	2018-12-15 13:23:02 -08:00
udp_tunnel.c	net/ipv4/udp_tunnel: prefer SO_BINDTOIFINDEX over SO_BINDTODEVICE	2019-01-17 14:55:52 -08:00
udp.c	udp: fix possible user after free in error handler	2019-02-22 16:05:11 -08:00
udplite.c	udp: add missing rehash callback to udplite	2019-01-17 15:01:08 -08:00
xfrm4_input.c	xfrm: reset transport header back to network header after all input transforms ahave been applied	2018-09-04 10:26:30 +02:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c	xfrm: reset transport header back to network header after all input transforms ahave been applied	2018-09-04 10:26:30 +02:00
xfrm4_mode_tunnel.c	xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto	2018-03-07 10:54:29 +01:00
xfrm4_output.c
xfrm4_policy.c	net: Drop pernet_operations::async	2018-03-27 13:18:09 -04:00
xfrm4_protocol.c	net: Convert protocol error handlers from void to int	2018-11-08 17:13:08 -08:00
xfrm4_state.c
xfrm4_tunnel.c