linux/net/ipv6
Kuniyuki Iwashima d4f2c86b2b tcp: Migrate TCP_NEW_SYN_RECV requests at receiving the final ACK.
This patch also changes the code to call reuseport_migrate_sock() and
inet_reqsk_clone(), but unlike the other cases, we do not call
inet_reqsk_clone() right after reuseport_migrate_sock().

Currently, in the receive path for TCP_NEW_SYN_RECV sockets, its listener
has three kinds of refcnt:

  (A) for listener itself
  (B) carried by reuqest_sock
  (C) sock_hold() in tcp_v[46]_rcv()

While processing the req, (A) may disappear by close(listener). Also, (B)
can disappear by accept(listener) once we put the req into the accept
queue. So, we have to hold another refcnt (C) for the listener to prevent
use-after-free.

For socket migration, we call reuseport_migrate_sock() to select a listener
with (A) and to increment the new listener's refcnt in tcp_v[46]_rcv().
This refcnt corresponds to (C) and is cleaned up later in tcp_v[46]_rcv().
Thus we have to take another refcnt (B) for the newly cloned request_sock.

In inet_csk_complete_hashdance(), we hold the count (B), clone the req, and
try to put the new req into the accept queue. By migrating req after
winning the "own_req" race, we can avoid such a worst situation:

  CPU 1 looks up req1
  CPU 2 looks up req1, unhashes it, then CPU 1 loses the race
  CPU 3 looks up req2, unhashes it, then CPU 2 loses the race
  ...

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210612123224.12525-8-kuniyu@amazon.co.jp
2021-06-15 18:01:06 +02:00
..
ila net: Add MODULE_DESCRIPTION entries to network modules 2020-06-20 21:33:57 -07:00
netfilter netfilter: allow to turn off xtables compat layer 2021-04-26 18:16:56 +02:00
addrconf_core.c ipv6: add ipv6_dev_find to stubs 2021-03-30 13:29:39 -07:00
addrconf.c mld: fix suspicious RCU usage in __ipv6_dev_mc_dec() 2021-04-16 15:40:59 -07:00
addrlabel.c ipv6: addrlabel: fix possible memory leak in ip6addrlbl_net_init 2020-11-25 11:20:16 -08:00
af_inet6.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-04-02 11:03:07 -07:00
ah6.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2021-04-14 13:15:12 -07:00
anycast.c ipv6: fix memory leaks on IPV6_ADDRFORM path 2020-07-30 16:30:55 -07:00
calipso.c cipso,calipso: resolve a number of problems with the DOI refcounts 2021-03-04 15:26:57 -08:00
datagram.c lsm,selinux: pass flowi_common instead of flowi to the LSM hooks 2020-11-23 18:36:21 -05:00
esp6_offload.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2021-04-14 13:15:12 -07:00
esp6.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2021-04-14 13:15:12 -07:00
exthdrs_core.c
exthdrs_offload.c
exthdrs.c seg6: add support for IPv4 decapsulation in ipv6_srh_rcv() 2021-03-11 16:09:21 -08:00
fib6_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib6_rules.c fib: use indirect call wrappers in the most common fib_rules_ops 2020-07-28 17:42:31 -07:00
fou6.c net: Add MODULE_DESCRIPTION entries to network modules 2020-06-20 21:33:57 -07:00
icmp.c icmp: ICMPV6: pass RFC 8335 reply messages to ping_rcv 2021-04-13 14:38:01 -07:00
inet6_connection_sock.c lsm,selinux: pass flowi_common instead of flowi to the LSM hooks 2020-11-23 18:36:21 -05:00
inet6_hashtables.c net: ipv6: remove unused arg exact_dif in compute_score 2020-08-31 13:08:10 -07:00
ip6_checksum.c
ip6_fib.c ipv6: Add a sysctl to control multipath hash fields 2021-05-18 13:27:32 -07:00
ip6_flowlabel.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-08-05 20:13:21 -07:00
ip6_gre.c ipv6: remove extra dev_hold() for fallback tunnels 2021-03-31 14:53:11 -07:00
ip6_icmp.c net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending 2021-02-23 11:29:52 -08:00
ip6_input.c ipv6: weaken the v4mapped source check 2021-03-18 11:19:23 -07:00
ip6_offload.c net/core: move gro function declarations to separate header 2021-02-04 18:37:57 -08:00
ip6_offload.h
ip6_output.c net: use indirect call helpers for dst_output 2021-02-03 14:51:39 -08:00
ip6_tunnel.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-04-17 11:08:07 -07:00
ip6_udp_tunnel.c net: Make locking in sock_bindtoindex optional 2020-06-01 14:57:14 -07:00
ip6_vti.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-04-09 20:48:35 -07:00
ip6mr.c net/ipv6: switch ip6_mroute_setsockopt to sockptr_t 2020-07-24 15:41:54 -07:00
ipcomp6.c ipcomp: assign if_id to child tunnel from parent tunnel 2020-07-09 12:55:37 +02:00
ipv6_sockglue.c net/ipv6: propagate user pointer annotation 2020-12-01 11:42:33 -08:00
Kconfig net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC 2020-09-18 17:45:04 -07:00
Makefile net: ipv6: add rpl sr tunnel 2020-03-29 22:30:57 -07:00
mcast_snoop.c net: bridge: mcast: fix broken length + header check for MRDv6 Adv. 2021-04-27 14:02:06 -07:00
mcast.c mld: remove unnecessary prototypes 2021-04-19 15:23:00 -07:00
mip6.c
ndisc.c net: allow user to set metric on default route learned via Router Advertisement 2021-01-26 18:39:45 -08:00
netfilter.c netfilter: Dissect flow after packet mangling 2021-04-18 22:04:16 +02:00
output_core.c
ping.c lsm,selinux: pass flowi_common instead of flowi to the LSM hooks 2020-11-23 18:36:21 -05:00
proc.c net: udp: introduce UDP_MIB_MEMERRORS for udp_mem 2020-11-09 15:34:44 -08:00
protocol.c
raw.c net-ipv6: bugfix - raw & sctp - switch to ipv6_can_nonlocal_bind() 2021-04-05 12:56:52 -07:00
reassembly.c ipv6: Remove dependency of ipv6_frag_thdr_truncated on ipv6 module 2020-11-19 10:49:50 -08:00
route.c ipv6: Add custom multipath hash policy 2021-05-18 13:27:32 -07:00
rpl_iptunnel.c net: ipv6: rpl_iptunnel: simplify the return expression of rpl_do_srh() 2020-12-08 16:22:54 -08:00
rpl.c net: ipv6: rpl*: Fix strange kerneldoc warnings due to bad header 2020-10-30 12:12:52 -07:00
seg6_hmac.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
seg6_iptunnel.c seg6_iptunnel: Refactor seg6_lwt_headroom out of uapi header 2020-08-03 17:57:40 -07:00
seg6_local.c seg6: add counters support for SRv6 Behaviors 2021-04-29 15:26:57 -07:00
seg6.c net: Remove redundant assignment to err 2021-04-29 15:34:15 -07:00
sit.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-04-17 11:08:07 -07:00
syncookies.c selinux/stable-5.11 PR 20201214 2020-12-16 11:01:04 -08:00
sysctl_net_ipv6.c net: Add notifications when multipath hash field change 2021-05-19 12:47:47 -07:00
tcp_ipv6.c tcp: Migrate TCP_NEW_SYN_RECV requests at receiving the final ACK. 2021-06-15 18:01:06 +02:00
tcpv6_offload.c
tunnel6.c tunnel6: add tunnel6_input_afinfo for ipip and ipv6 tunnels 2020-07-09 12:52:37 +02:00
udp_impl.h net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
udp_offload.c udp: properly complete L4 GRO over UDP tunnel packet 2021-03-30 17:06:49 -07:00
udp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-04-02 11:03:07 -07:00
udplite.c net/ipv6: remove compat_ipv6_{get,set}sockopt 2020-07-19 18:16:41 -07:00
xfrm6_input.c xfrm: state: remove extract_input indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm6_output.c xfrm: remove output_finish indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm6_policy.c net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2019-12-24 22:28:54 -08:00
xfrm6_protocol.c xfrm: add support for UDPv6 encapsulation of ESP 2020-04-28 11:28:36 +02:00
xfrm6_state.c xfrm: remove output_finish indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm6_tunnel.c xfrm: interface: fix the priorities for ipip and ipv6 tunnels 2020-10-09 12:29:48 +02:00