linux/net/ipv4
Eric Dumazet 50c8339e92 tcp: tso: restore IW10 after TSO autosizing
With sysctl_tcp_min_tso_segs being 4, it is very possible
that tcp_tso_should_defer() decides not sending last 2 MSS
of initial window of 10 packets. This also applies if
autosizing decides to send X MSS per GSO packet, and cwnd
is not a multiple of X.

This patch implements an heuristic based on age of first
skb in write queue : If it was sent very recently (less than half srtt),
we can predict that no ACK packet will come in less than half rtt,
so deferring might cause an under utilization of our window.

This is visible on initial send (IW10) on web servers,
but more generally on some RPC, as the last part of the message
might need an extra RTT to get delivered.

Tested:

Ran following packetdrill test
// A simple server-side test that sends exactly an initial window (IW10)
// worth of packets.

`sysctl -e -q net.ipv4.tcp_min_tso_segs=4`

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0    setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0    bind(3, ..., ...) = 0
+0    listen(3, 1) = 0

+.1   < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
+0    > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.1   < . 1:1(0) ack 1 win 257
+0    accept(3, ..., ...) = 4

+0    write(4, ..., 14600) = 14600
+0    > . 1:5841(5840) ack 1 win 457
+0    > . 5841:11681(5840) ack 1 win 457
// Following packet should be sent right now.
+0    > P. 11681:14601(2920) ack 1 win 457

+.1   < . 1:1(0) ack 14601 win 257

+0    close(4) = 0
+0    > F. 14601:14601(0) ack 1
+.1   < F. 1:1(0) ack 14602 win 257
+0    > . 14602:14602(0) ack 2

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28 15:10:39 -05:00
..
netfilter netfilter: nf_tables: fix port natting in little endian archs 2014-12-23 15:34:28 +01:00
af_inet.c net: rfs: add hash collision detection 2015-02-08 16:53:57 -08:00
ah4.c ipsec: Remove obsolete MAX_AH_AUTH_LEN 2014-09-18 10:54:36 +02:00
arp.c neigh: remove dynamic neigh table registration support 2014-11-11 15:23:54 -05:00
cipso_ipv4.c cipso: don't use IPCB() to locate the CIPSO IP option 2015-02-11 14:46:37 -05:00
datagram.c net: Save TX flow hash in sock and set in skbuf on xmit 2014-07-07 21:14:21 -07:00
devinet.c multicast: Extend ip address command to enable multicast group join/leave on 2015-02-27 16:25:25 -05:00
esp4.c net: esp: Convert NETDEBUG to pr_info 2014-11-06 15:11:10 -05:00
fib_frontend.c fib_trie: Push rcu_read_lock/unlock to callers 2014-12-31 18:25:54 -05:00
fib_lookup.h fib_trie: Add slen to fib alias 2015-02-27 16:37:07 -05:00
fib_rules.c fib_trie: Push rcu_read_lock/unlock to callers 2014-12-31 18:25:54 -05:00
fib_semantics.c fib_trie: Convert fib_alias to hlist from list 2015-02-27 16:37:06 -05:00
fib_trie.c fib_trie: Remove leaf_info 2015-02-27 16:37:07 -05:00
fou.c gue: Use checksum partial with remote checksum offload 2015-02-11 15:12:13 -08:00
geneve.c openvswitch: Add support for checksums on UDP tunnels. 2015-01-28 23:04:15 -08:00
gre_demux.c net: Fix GRE RX to use skb_transport_header for GRE header offset 2014-09-08 15:23:05 -07:00
gre_offload.c gre: Set inner mac header in gro complete 2014-12-05 21:18:34 -08:00
icmp.c ipv4: icmp: use percpu allocation 2015-01-31 17:48:18 -08:00
igmp.c multicast: Extend ip address command to enable multicast group join/leave on 2015-02-27 16:25:25 -05:00
inet_connection_sock.c ipv4: make ip_local_reserved_ports per netns 2014-05-14 15:31:45 -04:00
inet_diag.c netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
inet_fragment.c net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited 2014-11-11 14:10:31 -05:00
inet_hashtables.c net: use reciprocal_scale() helper 2014-08-23 12:21:21 -07:00
inet_lro.c lro: remove dead code 2013-12-29 16:34:25 -05:00
inet_timewait_sock.c
inetpeer.c inet: remove dead inetpeer sequence code 2014-09-08 16:42:42 -07:00
ip_forward.c ipv4: try to cache dst_entries which would cause a redirect 2015-01-26 17:28:27 -08:00
ip_fragment.c net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited 2014-11-11 14:10:31 -05:00
ip_gre.c gre/ipip: use be16 variants of netlink functions 2015-02-08 16:28:06 -08:00
ip_input.c net: Fix memory leak if TPROXY used with TCP early demux 2014-01-27 16:22:11 -08:00
ip_options.c ipv4: rename ip_options_echo to __ip_options_echo() 2014-09-28 16:35:42 -04:00
ip_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-02-05 14:33:28 -08:00
ip_sockglue.c net-timestamp: no-payload option 2015-02-02 18:46:51 -08:00
ip_tunnel_core.c ipv4: fix a potential use after free in ip_tunnel_core.c 2014-10-17 23:45:26 -04:00
ip_tunnel.c tunnels: advertise link netns via netlink 2015-01-19 14:32:03 -05:00
ip_vti.c tunnels: advertise link netns via netlink 2015-01-19 14:32:03 -05:00
ipcomp.c ipcomp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
ipconfig.c net: ipv4: handle DSA enabled master network devices 2015-01-19 15:45:10 -05:00
ipip.c gre/ipip: use be16 variants of netlink functions 2015-02-08 16:28:06 -08:00
ipmr.c netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
Kconfig net: Move fou_build_header into fou.c and refactor 2014-11-05 16:30:02 -05:00
Makefile net: Add Geneve tunneling protocol driver 2014-10-06 00:32:20 -04:00
netfilter.c netfilter: remove double colon 2014-02-19 11:41:25 +01:00
ping.c net: switch memcpy_fromiovec()/memcpy_fromiovecend() users to copy_from_iter() 2015-02-04 01:34:15 -05:00
proc.c tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks 2015-02-08 01:03:12 -08:00
protocol.c net: Export inet_offloads and inet6_offloads 2014-09-19 17:15:31 -04:00
raw.c net: switch memcpy_fromiovec()/memcpy_fromiovecend() users to copy_from_iter() 2015-02-04 01:34:15 -05:00
route.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-02-05 14:33:28 -08:00
syncookies.c net: allow setting ecn via routing table 2014-11-04 16:06:09 -05:00
sysctl_net_ipv4.c ipv4: Namespecify TCP PMTU mechanism 2015-02-09 18:45:00 -08:00
tcp_bic.c tcp: stretch ACK fixes prep 2015-01-28 22:18:37 -08:00
tcp_cong.c tcp: silence registration message 2015-02-20 15:04:03 -05:00
tcp_cubic.c tcp: fix timing issue in CUBIC slope calculation 2015-01-28 22:18:38 -08:00
tcp_dctcp.c net: tcp: add DCTCP congestion control algorithm 2014-09-29 00:13:10 -04:00
tcp_diag.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_fastopen.c tcp: make sure skb is not shared before using skb_get() 2015-02-13 07:11:40 -08:00
tcp_highspeed.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_htcp.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_hybla.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_illinois.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_input.c tcp: mitigate ACK loops for connections as tcp_sock 2015-02-08 01:03:12 -08:00
tcp_ipv4.c ipv4: Namespecify TCP PMTU mechanism 2015-02-09 18:45:00 -08:00
tcp_lp.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_memcontrol.c memcg: cleanup static keys decrement 2015-02-12 18:54:10 -08:00
tcp_metrics.c netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
tcp_minisocks.c tcp: mitigate ACK loops for connections as tcp_timewait_sock 2015-02-08 01:03:13 -08:00
tcp_offload.c net: Remove MPLS GSO feature. 2014-11-05 23:52:33 -08:00
tcp_output.c tcp: tso: restore IW10 after TSO autosizing 2015-02-28 15:10:39 -05:00
tcp_probe.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_scalable.c tcp: stretch ACK fixes prep 2015-01-28 22:18:37 -08:00
tcp_timer.c ipv4: Namespecify TCP PMTU mechanism 2015-02-09 18:45:00 -08:00
tcp_vegas.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_vegas.h
tcp_veno.c tcp: stretch ACK fixes prep 2015-01-28 22:18:37 -08:00
tcp_westwood.c net: tcp: split ack slow/fast events from cwnd_event 2014-09-29 00:13:10 -04:00
tcp_yeah.c tcp: stretch ACK fixes prep 2015-01-28 22:18:37 -08:00
tcp.c ip: convert tcp_sendmsg() to iov_iter primitives 2015-02-04 01:34:14 -05:00
tunnel4.c
udp_diag.c udp_diag: Fix socket skipping within chain 2015-01-27 00:02:41 -08:00
udp_impl.h
udp_offload.c udp: Set SKB_GSO_UDP_TUNNEL* in UDP GRO path 2015-02-11 15:12:10 -08:00
udp_tunnel.c udp: Do not require sock in udp_tunnel_xmit_skb 2015-01-24 23:15:40 -08:00
udp.c udp: In udp_flow_src_port use random hash value if skb_get_hash fails 2015-02-27 16:00:01 -05:00
udplite.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
xfrm4_input.c xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
xfrm4_mode_beet.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-24 00:32:30 -04:00
xfrm4_policy.c xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly 2014-03-14 07:28:07 +01:00
xfrm4_protocol.c xfrm4: Remove duplicate semicolon 2014-06-30 07:49:47 +02:00
xfrm4_state.c
xfrm4_tunnel.c