linux/net
Eric Dumazet 2f53384424 tcp: allow splice() to build full TSO packets
vmsplice()/splice(pipe, socket) call do_tcp_sendpages() one page at a
time, adding at most 4096 bytes to an skb (assuming PAGE_SIZE=4096).

The call to tcp_push() at the end of do_tcp_sendpages() forces an
immediate xmit when the pipe is not already filled, and tso_fragment()
tries to split these skbs into MSS multiples.

4096 bytes are usually split into an skb holding 2 MSS and a remaining
sub-MSS skb (assuming MTU=1500).
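
The arithmetic behind that split can be sketched as follows. This is an
illustration only, not kernel code: struct split_result, split_by_mss()
and the two MSS constants are hypothetical names, and the MSS values
assume IPv4 with 20-byte IP and TCP headers, plus 12 bytes of TCP
timestamp option in the second case.

```c
#include <stddef.h>

/* Illustrative result type (hypothetical, not a kernel structure). */
struct split_result {
	size_t full_segments;	/* number of full-MSS segments */
	size_t remainder;	/* bytes left for the trailing sub-MSS skb */
};

/*
 * Assumed payload sizes for MTU=1500 over IPv4:
 * 1500 - 20 (IP header) - 20 (TCP header) = 1460,
 * and 12 bytes less when the TCP timestamp option is in use.
 */
enum { MSS_NO_TS = 1460, MSS_WITH_TS = 1448 };

/* Split a byte count into whole MSS units plus a remainder. */
static struct split_result split_by_mss(size_t bytes, size_t mss)
{
	struct split_result r = { bytes / mss, bytes % mss };
	return r;
}
```

Under these assumptions, split_by_mss(4096, MSS_NO_TS) gives 2 full
segments and 1176 leftover bytes, so every page pushed on its own ends
with a short frame.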

This makes slow start suboptimal because many small frames are sent to
the qdisc/driver layers instead of big ones (constrained by cwnd and
packets in flight, of course).

In fact, applications using sendmsg() (adding an additional memory copy)
instead of vmsplice()/splice()/sendfile() are a bit faster because of
this anomaly, especially if serving small files in environments with
large initial [c]wnd.

Call tcp_push() only if MSG_MORE is not set in the flags parameter.

splice() internals set this bit automatically on every page except the
last, or on all pages if the user specified the SPLICE_F_MORE splice()
flag.
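
A toy userspace model of that rule, counting how often a push would be
requested while a run of pages is sent. It is a simplification, not the
kernel code path: DEMO_MSG_MORE, demo_page_flags() and
demo_count_pushes() are hypothetical names, and the "old" mode simply
pushes after every page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for MSG_MORE; the value is arbitrary in this model. */
#define DEMO_MSG_MORE 0x1

/*
 * Flags handed to the per-page send call, as described above: "more" on
 * every page but the last, or on all pages when the caller passed
 * SPLICE_F_MORE.
 */
static int demo_page_flags(size_t page, size_t npages, bool splice_f_more)
{
	bool last = (page + 1 == npages);

	return (!last || splice_f_more) ? DEMO_MSG_MORE : 0;
}

/*
 * Count push requests while sending npages pages.
 * honor_more == false models pushing after every page;
 * honor_more == true models pushing only when "more" is not set.
 */
static size_t demo_count_pushes(size_t npages, bool splice_f_more,
				bool honor_more)
{
	size_t pushes = 0;

	for (size_t page = 0; page < npages; page++) {
		int flags = demo_page_flags(page, npages, splice_f_more);

		if (!honor_more || !(flags & DEMO_MSG_MORE))
			pushes++;
	}
	return pushes;
}
```

In this model a 16-page transfer requests 16 pushes when the flag is
ignored and a single push when it is honored; with SPLICE_F_MORE set no
push is requested at all, leaving the data queued for a later send.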

In some workloads, this can reduce the number of logical packets sent
by an order of magnitude, making zero-copy TCP actually faster than
one-copy :)

Reported-by: Tom Herbert <therbert@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-03 17:35:43 -04:00
..
9p net/9p: handle flushed Tclunk/Tremove 2012-02-26 14:49:57 -06:00
802 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-02 17:53:39 -07:00
8021q
appletalk
atm Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
ax25 Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
batman-adv Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge 2012-03-11 15:36:34 -07:00
bluetooth Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-03-09 14:34:20 -08:00
caif caif: make zero a legal caif connetion id. 2012-03-11 15:38:16 -07:00
can
ceph libceph: isolate kmap() call in write_partial_msg_pages() 2012-03-22 10:47:52 -05:00
core net: fix /proc/net/dev regression 2012-04-03 17:23:23 -04:00
dcb
dccp dccp: fix bug in sequence number validation during connection setup 2012-03-03 09:02:52 -07:00
decnet Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
dns_resolver KEYS: Allow special keyrings to be cleared 2012-01-19 14:38:51 +11:00
dsa
econet Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
ethernet Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
ieee802154 net/ieee802154/6lowpan.c: reuse eth_mac_addr() 2012-02-22 14:46:37 -05:00
ipv4 tcp: allow splice() to build full TSO packets 2012-04-03 17:35:43 -04:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-02 17:53:39 -07:00
ipx
irda Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
iucv Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2012-03-22 18:15:32 -07:00
key
l2tp l2tp: enable automatic module loading for l2tp_ppp 2012-03-21 22:14:56 -04:00
lapb Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
llc llc: Fix race condition in llc_ui_recvmsg 2012-01-24 15:33:19 -05:00
mac80211 mac80211: fix oper channel timestamp updation 2012-03-28 14:25:37 -04:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-02 17:53:39 -07:00
netlabel netlabel: use GFP flags from caller instead of GFP_ATOMIC 2012-03-22 19:29:57 -04:00
netlink netlink: allow to pass data pointer to netlink_dump_start() callback 2012-02-26 14:10:44 -05:00
netrom Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
nfc NFC: NCI code identation fixes 2012-03-06 15:16:25 -05:00
openvswitch Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
packet Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
phonet net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
rds RDS: use gfp flags from caller in conn_alloc() 2012-03-22 19:29:58 -04:00
rfkill device.h: cleanup users outside of linux/include (C files) 2012-03-11 14:27:37 -04:00
rose Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-02 17:53:39 -07:00
rxrpc RxRPC: Fix kcalloc parameters swapped 2012-02-14 14:41:55 -05:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2012-03-20 21:04:47 -07:00
sctp sctp: Export sctp_do_peeloff 2012-03-08 13:52:08 -08:00
sunrpc Merge branch 'for-3.4' of git://linux-nfs.org/~bfields/linux 2012-03-29 14:53:25 -07:00
tipc tipc: Optimize setting of immutable payload message header fields 2012-02-29 11:45:35 -05:00
unix poll: add poll_requested_events() and poll_does_not_wait() functions 2012-03-23 16:58:38 -07:00
wanrouter
wimax
wireless cfg80211: allow CFG80211_SIGNAL_TYPE_UNSPEC in station_info 2012-03-26 15:07:25 -04:00
x25 net:x25: use IS_ENABLED 2011-12-16 15:49:52 -05:00
xfrm xfrm: Access the replay notify functions via the registered callbacks 2012-03-22 19:29:58 -04:00
compat.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
Kconfig
Makefile
nonet.c
socket.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
sysctl_net.c sysctl: Modify __register_sysctl_paths to take a set instead of a root and an nsproxy 2012-01-24 16:40:30 -08:00