linux/net/ipv4
SeongJae Park 9603d47bad tcp: Reduce SYN resend delay if a suspicous ACK is received
When closing a connection, the two acks that required to change closing
socket's status to FIN_WAIT_2 and then TIME_WAIT could be processed in
reverse order.  This is possible in RSS disabled environments such as a
connection inside a host.

For example, expected state transitions and required packets for the
disconnection will be similar to below flow.

	 00 (Process A)				(Process B)
	 01 ESTABLISHED				ESTABLISHED
	 02 close()
	 03 FIN_WAIT_1
	 04 		---FIN-->
	 05 					CLOSE_WAIT
	 06 		<--ACK---
	 07 FIN_WAIT_2
	 08 		<--FIN/ACK---
	 09 TIME_WAIT
	 10 		---ACK-->
	 11 					LAST_ACK
	 12 CLOSED				CLOSED

In some cases such as LINGER option applied socket, the FIN and FIN/ACK
will be substituted to RST and RST/ACK, but there is no difference in
the main logic.

The acks in lines 6 and 8 are the acks.  If the line 8 packet is
processed before the line 6 packet, it will be just ignored as it is not
a expected packet, and the later process of the line 6 packet will
change the status of Process A to FIN_WAIT_2, but as it has already
handled line 8 packet, it will not go to TIME_WAIT and thus will not
send the line 10 packet to Process B.  Thus, Process B will left in
CLOSE_WAIT status, as below.

	 00 (Process A)				(Process B)
	 01 ESTABLISHED				ESTABLISHED
	 02 close()
	 03 FIN_WAIT_1
	 04 		---FIN-->
	 05 					CLOSE_WAIT
	 06 				(<--ACK---)
	 07	  			(<--FIN/ACK---)
	 08 				(fired in right order)
	 09 		<--FIN/ACK---
	 10 		<--ACK---
	 11 		(processed in reverse order)
	 12 FIN_WAIT_2

Later, if the Process B sends SYN to Process A for reconnection using
the same port, Process A will responds with an ACK for the last flow,
which has no increased sequence number.  Thus, Process A will send RST,
wait for TIMEOUT_INIT (one second in default), and then try
reconnection.  If reconnections are frequent, the one second latency
spikes can be a big problem.  Below is a tcpdump results of the problem:

    14.436259 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644
    14.436266 IP 127.0.0.1.4242 > 127.0.0.1.45150: Flags [.], ack 5, win 512
    14.436271 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [R], seq 2541101298
    /* ONE SECOND DELAY */
    15.464613 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644

This commit mitigates the problem by reducing the delay for the next SYN
if the suspicous ACK is received while in SYN_SENT state.

Following commit will add a selftest, which can be also helpful for
understanding of this issue.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-02-02 13:33:21 -08:00
..
bpfilter
netfilter netfilter: arp_tables: init netns pointer in xt_tgdtor_param struct 2020-01-13 19:22:10 +01:00
af_inet.c net: port < inet_prot_sock(net) --> inet_port_requires_bind_service(net, port) 2019-11-26 13:20:46 -08:00
ah4.c xfrm: remove type and offload_type map from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00
arp.c
bpf_tcp_ca.c bpf: Add BPF_FUNC_tcp_send_ack helper 2020-01-09 08:46:18 -08:00
cipso_ipv4.c
datagram.c inet: stop leaking jiffies on the wire 2019-11-01 14:57:52 -07:00
devinet.c inet: protect against too small mtu values. 2019-12-07 11:55:11 -08:00
esp4_offload.c xfrm: support output_mark for offload ESP packets 2020-01-15 12:18:35 +01:00
esp4.c xfrm: add espintcp (RFC 8229) 2019-12-09 09:59:07 +01:00
fib_frontend.c ipv4: move fib4_has_custom_rules() helper to public header 2019-11-21 14:45:55 -08:00
fib_lookup.h ipv4: Add "offload" and "trap" indications to routes 2020-01-14 18:53:35 -08:00
fib_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_rules.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_semantics.c ipv4: Add "offload" and "trap" indications to routes 2020-01-14 18:53:35 -08:00
fib_trie.c Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
fou.c fou: Fix IPv6 netlink policy 2020-01-23 14:32:52 +01:00
gre_demux.c gre: refetch erspan header from skb->data after pskb_may_pull() 2019-12-07 11:53:27 -08:00
gre_offload.c net: remove the check argument from __skb_gro_checksum_convert 2020-01-03 12:24:34 -08:00
icmp.c net: icmp: fix data-race in cmp_global_allow() 2019-11-08 12:32:43 -08:00
igmp.c igmp: uninline ip_mc_validate_checksum() 2019-10-04 14:26:46 -07:00
inet_connection_sock.c tcp/ipv4: remove AF_INET_FAMILY 2020-01-21 12:03:51 +01:00
inet_diag.c tcp/dccp: fix possible race __inet_lookup_established() 2019-12-13 21:40:49 -08:00
inet_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
inet_hashtables.c tcp/dccp: fix possible race __inet_lookup_established() 2019-12-13 21:40:49 -08:00
inet_timewait_sock.c
inetpeer.c inetpeer: fix data-race in inet_putpeer / inet_putpeer 2019-11-07 16:15:56 -08:00
ip_forward.c ipv4: Revert removal of rt_uses_gateway 2019-09-20 18:23:33 -07:00
ip_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
ip_gre.c treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
ip_input.c ipv4: use dst hint for ipv4 list receive 2019-11-21 14:45:55 -08:00
ip_options.c netfilter: nf_tables: add support for matching IPv4 options 2019-06-21 18:35:51 +02:00
ip_output.c net: ipv4: use skb_list_walk_safe helper for gso segments 2020-01-14 11:48:41 -08:00
ip_sockglue.c
ip_tunnel_core.c lwtunnel: check erspan options before allocating tun_info 2019-11-21 11:47:39 -08:00
ip_tunnel.c net, ip_tunnel: fix namespaces move 2020-01-21 16:05:21 +01:00
ip_vti.c vti[6]: fix packet tx through bpf_redirect() 2020-01-14 08:55:38 +01:00
ipcomp.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2019-07-05 15:01:15 -07:00
ipconfig.c net: ipconfig: Wait for deferred device probes 2019-11-20 12:39:53 -08:00
ipip.c ipip: validate header length in ipip_tunnel_xmit 2019-07-25 17:23:40 -07:00
ipmr_base.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
ipmr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-16 21:51:42 -08:00
Kconfig xfrm: add espintcp (RFC 8229) 2019-12-09 09:59:07 +01:00
Makefile bpf: tcp: Support tcp_congestion_ops in bpf 2020-01-09 08:46:18 -08:00
metrics.c
netfilter.c
netlink.c
nexthop.c net: include struct nhmsg size in nh nlmsg size 2020-01-27 10:54:08 +01:00
ping.c ip: support SO_MARK cmsg 2019-09-13 21:44:19 +02:00
proc.c tcp: export count for rehash attempts 2020-01-26 15:28:47 +01:00
protocol.c
raw_diag.c net: don't warn in inet diag when IPV6 is disabled 2019-07-03 11:26:35 -07:00
raw.c netfilter: drop bridge nf reset from nf_reset 2019-10-01 18:42:15 +02:00
route.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-26 10:40:21 +01:00
syncookies.c mptcp: handle tcp fallback when using syn cookies 2020-01-29 17:45:20 +01:00
sysctl_net_ipv4.c net-tcp: Disable TCP ssthresh metrics cache by default 2019-12-09 20:17:48 -08:00
tcp_bbr.c tcp_bbr: improve arithmetic division in bbr_update_bw() 2020-01-21 10:45:49 +01:00
tcp_bic.c
tcp_bpf.c bpf: Sockmap/tls, fix pop data with SK_DROP return code 2020-01-15 23:26:13 +01:00
tcp_cdg.c
tcp_cong.c bpf: tcp: Support tcp_congestion_ops in bpf 2020-01-09 08:46:18 -08:00
tcp_cubic.c tcp_cubic: refactor code to perform a divide only when needed 2019-12-30 14:44:27 -08:00
tcp_dctcp.c
tcp_dctcp.h
tcp_diag.c net: annotate lockless accesses to sk->sk_max_ack_backlog 2019-11-06 16:14:48 -08:00
tcp_fastopen.c tcp: add TCP_INFO status for failed client TFO 2019-10-25 19:25:37 -07:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: Reduce SYN resend delay if a suspicous ACK is received 2020-02-02 13:33:21 -08:00
tcp_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2020-01-23 08:10:16 +01:00
tcp_lp.c
tcp_metrics.c net-tcp: Disable TCP ssthresh metrics cache by default 2019-12-09 20:17:48 -08:00
tcp_minisocks.c bpf: tcp: Support tcp_congestion_ops in bpf 2020-01-09 08:46:18 -08:00
tcp_nv.c
tcp_offload.c
tcp_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-26 10:40:21 +01:00
tcp_rate.c
tcp_recovery.c
tcp_scalable.c
tcp_timer.c tcp: export count for rehash attempts 2020-01-26 15:28:47 +01:00
tcp_ulp.c bpf: Sockmap/tls, push write_space updates through ulp updates 2020-01-15 23:26:13 +01:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c tcp: clear tp->segs_{in|out} in tcp_disconnect() 2020-01-31 22:12:37 -08:00
tunnel4.c
udp_diag.c
udp_impl.h
udp_offload.c udp: Support UDP fraglist GRO/GSO. 2020-01-27 11:00:21 +01:00
udp_tunnel.c
udp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-26 10:40:21 +01:00
udplite.c
xfrm4_input.c
xfrm4_output.c netfilter: Support iif matches in POSTROUTING 2019-11-15 23:44:48 +01:00
xfrm4_policy.c net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2019-12-24 22:28:54 -08:00
xfrm4_protocol.c xfrm: add route lookup to xfrm4_rcv_encap 2019-12-09 09:59:07 +01:00
xfrm4_state.c xfrm: remove eth_proto value from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00
xfrm4_tunnel.c xfrm: remove type and offload_type map from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00