linux/net/ipv4
Ilpo Järvinen 68f8353b48 [TCP]: Rewrite SACK block processing & sack_recv_cache use
Key points of this patch are:

  - In case new SACK information is advance only type, no skb
    processing below previously discovered highest point is done
  - Optimize cases below highest point too since there's no need
    to always go up to highest point (which is very likely still
    present in that SACK), this is not entirely true though
    because I'm dropping the fastpath_skb_hint which could
    previously optimize those cases even better. Whether that's
    significant, I'm not too sure.

Currently it will provide skipping by walking. Combined with
RB-tree, all skipping would become fast too regardless of window
size (can be done incrementally later).

Previously a number of cases in TCP SACK processing fails to
take advantage of costly stored information in sack_recv_cache,
most importantly, expected events such as cumulative ACK and new
hole ACKs. Processing on such ACKs result in rather long walks
building up latencies (which easily gets nasty when window is
huge). Those latencies are often completely unnecessary
compared with the amount of _new_ information received, usually
for cumulative ACK there's no new information at all, yet TCP
walks whole queue unnecessary potentially taking a number of
costly cache misses on the way, etc.!

Since the inclusion of highest_sack, there's a lot information
that is very likely redundant (SACK fastpath hint stuff,
fackets_out, highest_sack), though there's no ultimate guarantee
that they'll remain the same whole the time (in all unearthly
scenarios). Take advantage of this knowledge here and drop
fastpath hint and use direct access to highest SACKed skb as
a replacement.

Effectively "special cased" fastpath is dropped. This change
adds some complexity to introduce better coveraged "fastpath",
though the added complexity should make TCP behave more cache
friendly.

The current ACK's SACK blocks are compared against each cached
block individially and only ranges that are new are then scanned
by the high constant walk. For other parts of write queue, even
when in previously known part of the SACK blocks, a faster skip
function is used (if necessary at all). In addition, whenever
possible, TCP fast-forwards to highest_sack skb that was made
available by an earlier patch. In typical case, no other things
but this fast-forward and mandatory markings after that occur
making the access pattern quite similar to the former fastpath
"special case".

DSACKs are special case that must always be walked.

The local to recv_sack_cache copying could be more intelligent
w.r.t DSACKs which are likely to be there only once but that
is left to a separate patch.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:07 -08:00
..
ipvs [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
netfilter [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
af_inet.c [TCP]: Splice receive support. 2008-01-28 14:53:31 -08:00
ah4.c [IPSEC]: Move state lock into x->type->input 2008-01-28 14:53:52 -08:00
arp.c IPoIB: improve IPv4/IPv6 to IB mcast mapping functions 2008-01-25 14:15:37 -08:00
cipso_ipv4.c [NetLabel]: correct usage of RCU locking 2007-10-26 04:29:08 -07:00
datagram.c [IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP 2007-06-03 18:08:50 -07:00
devinet.c [INET]: Fix netdev renaming and inet address labels 2008-01-04 03:55:34 -08:00
esp4.c [IPSEC]: Move state lock into x->type->input 2008-01-28 14:53:52 -08:00
fib_frontend.c [IPV4]: OOPS with NETLINK_FIB_LOOKUP netlink socket 2007-12-21 02:01:53 -08:00
fib_hash.c [IPV4] FIB_HASH : Avoid unecessary loop in fn_hash_dump_zone() 2008-01-20 20:31:39 -08:00
fib_lookup.h [RTNETLINK]: Fix sending netlink message when replace route. 2007-05-24 16:36:53 -07:00
fib_rules.c [INET]: Small possible memory leak in FIB rules 2007-11-10 22:12:03 -08:00
fib_semantics.c [NETLINK]: Introduce nested and byteorder flag to netlink attribute 2007-10-10 16:49:16 -07:00
fib_trie.c [IPV4] fib_trie: fix duplicated route issue 2008-01-20 20:31:38 -08:00
icmp.c [ICMP]: ICMP_MIB_OUTMSGS increment duplicated 2008-01-21 03:39:45 -08:00
igmp.c [IPV4]: Add ip_local_out 2008-01-28 14:53:47 -08:00
inet_connection_sock.c [NET]: Convert init_timer into setup_timer 2008-01-28 14:53:35 -08:00
inet_diag.c [INET]: Fix inet_diag dead-lock regression 2007-12-03 15:51:25 +11:00
inet_fragment.c [NET]: Convert init_timer into setup_timer 2008-01-28 14:53:35 -08:00
inet_hashtables.c [INET]: Remove per bucket rwlock in tcp/dccp ehash table. 2007-11-07 04:15:11 -08:00
inet_lro.c [LRO] Fix lro_mgr->features checks 2008-01-08 23:30:18 -08:00
inet_timewait_sock.c [INET]: Remove per bucket rwlock in tcp/dccp ehash table. 2007-11-07 04:15:11 -08:00
inetpeer.c [INET]: Use list_head-s in inetpeer.c 2007-11-12 21:27:28 -08:00
ip_forward.c [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
ip_fragment.c [NET]: Fix uninitialised variable in ip_frag_reasm() 2007-10-17 21:37:22 -07:00
ip_gre.c [IPV4] ip_gre: set mac_header correctly in receive path 2007-12-20 00:10:33 -08:00
ip_input.c [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
ip_options.c [IPV4] ip_options.c: kmalloc + memset conversion to kzalloc 2007-07-31 14:06:45 -07:00
ip_output.c [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
ip_sockglue.c [IPV4]: Clean the ip_sockglue.c from some ugly ifdefs 2007-11-07 04:08:55 -08:00
ipcomp.c [IPSEC]: Forbid BEET + ipcomp for now 2008-01-28 14:53:43 -08:00
ipconfig.c [IPV4] ipconfig: Implement DHCP Class-identifier 2008-01-28 14:53:59 -08:00
ipip.c [NET]: Treat the sign of the result of skb_headroom() consistently 2007-10-23 21:27:55 -07:00
ipmr.c [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
Kconfig typo fixes 2007-10-20 01:34:40 +02:00
Makefile [INET]: Collect frag queues management objects together 2007-10-15 12:26:39 -07:00
netfilter.c [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
proc.c [NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA way. 2007-11-07 04:08:57 -08:00
protocol.c [IPV4]: align inet_protos[] on SMP 2007-04-25 22:28:20 -07:00
raw.c [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
route.c [IPSEC]: Merge most of the output path 2008-01-28 14:53:48 -08:00
syncookies.c [SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th 2007-04-25 22:25:26 -07:00
sysctl_net_ipv4.c [TCP]: Problem bug with sysctl_tcp_congestion_control function 2007-11-19 23:28:21 -08:00
tcp_bic.c [TCP]: Remove num_acked>0 checks from cong.ctrl mods pkts_acked 2007-10-10 16:47:55 -07:00
tcp_cong.c [TCP]: remove unused argument to cong_avoid op 2007-07-18 01:46:58 -07:00
tcp_cubic.c [TCP]: Remove num_acked>0 checks from cong.ctrl mods pkts_acked 2007-10-10 16:47:55 -07:00
tcp_diag.c [INET]: Let inet_diag and friends autoload 2007-10-22 02:59:54 -07:00
tcp_highspeed.c [TCP]: remove unused argument to cong_avoid op 2007-07-18 01:46:58 -07:00
tcp_htcp.c [TCP]: H-TCP maxRTT estimation at startup 2007-08-07 18:29:05 -07:00
tcp_hybla.c [TCP]: remove unused argument to cong_avoid op 2007-07-18 01:46:58 -07:00
tcp_illinois.c [TCP] illinois: Incorrect beta usage 2007-11-30 01:10:55 +11:00
tcp_input.c [TCP]: Rewrite SACK block processing & sack_recv_cache use 2008-01-28 14:54:07 -08:00
tcp_ipv4.c [IPV4] TCPMD5: Use memmove() instead of memcpy() because we have overlaps. 2007-11-20 17:30:31 -08:00
tcp_lp.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_minisocks.c [TCP]: Move sack_ok access to obviously named funcs & cleanup 2007-10-10 16:48:00 -07:00
tcp_output.c [TCP]: Rewrite SACK block processing & sack_recv_cache use 2008-01-28 14:54:07 -08:00
tcp_probe.c [NET]: Make /proc/net per network namespace 2007-10-10 16:49:06 -07:00
tcp_scalable.c [TCP]: remove unused argument to cong_avoid op 2007-07-18 01:46:58 -07:00
tcp_timer.c [TCP]: Move sack_ok access to obviously named funcs & cleanup 2007-10-10 16:48:00 -07:00
tcp_vegas.c [TCP] vegas: Fix a bug in disabling slow start by gamma parameter. 2007-10-29 22:37:25 -07:00
tcp_vegas.h [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_veno.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_westwood.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_yeah.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp.c [TCP]: Make tcp_splice_data_recv() static. 2008-01-28 14:53:32 -08:00
tunnel4.c [INET]: Cleanup the xfrm4_tunnel_(un)register 2007-11-10 21:48:54 -08:00
udp_impl.h [UDP]: Randomize port selection. 2007-10-10 16:48:31 -07:00
udp.c [IPV4]: Use the {DEFINE|REF}_PROTO_INUSE infrastructure 2007-11-07 04:08:58 -08:00
udplite.c [IPV4]: Use the {DEFINE|REF}_PROTO_INUSE infrastructure 2007-11-07 04:08:58 -08:00
xfrm4_input.c [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
xfrm4_mode_beet.c [IPSEC]: Separate inner/outer mode processing on input 2008-01-28 14:53:46 -08:00
xfrm4_mode_transport.c [IPSEC]: Use IPv6 calling convention as the convention for x->mode->output 2007-10-10 16:55:54 -07:00
xfrm4_mode_tunnel.c [IPSEC]: Separate inner/outer mode processing on input 2008-01-28 14:53:46 -08:00
xfrm4_output.c [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
xfrm4_policy.c [IPSEC]: Merge most of the output path 2008-01-28 14:53:48 -08:00
xfrm4_state.c [IPSEC]: Kill afinfo->nf_post_routing 2008-01-28 14:53:55 -08:00
xfrm4_tunnel.c [IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input 2007-10-17 21:28:53 -07:00