linux/net/ipv6
Eric Dumazet 5640f76858 net: use a per task frag allocator
We currently use a per-socket order-0 page cache for tcp_sendmsg()
operations.

This page is used to build fragments for skbs.

This is done to increase the probability of coalescing small write()
calls into single segments in skbs still in the write queue (not yet sent).

But it wastes a lot of memory for applications handling many mostly
idle sockets, since each socket holds one page in sk->sk_sndmsg_page.

It's also quite inefficient for building 64KB TSO packets, because we
need about 16 pages per skb on arches where PAGE_SIZE = 4096, so we hit
the page allocator more than wanted.

This patch adds a per-task frag allocator and uses bigger pages,
if available. An automatic fallback is done in case of memory pressure.

(up to 32768 bytes per frag, that's order-3 pages on x86)
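
The fallback described above can be illustrated with a small user-space sketch. It is not the kernel code from this patch: the names `alloc_frag_buffer`, `alloc_fn`, `alloc_small_only` and the `SKETCH_*` constants are hypothetical, and `malloc()` stands in for the page allocator. The idea shown is only the ordering: try the largest buffer first (order 3, 32768 bytes with 4096-byte pages) and step down to smaller orders when an allocation fails.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages, as on x86 */
#define SKETCH_MAX_ORDER 3      /* 4096 << 3 == 32768 bytes */

/* Hypothetical allocator hook, so that failures can be simulated. */
typedef void *(*alloc_fn)(size_t);

/*
 * Try to get the biggest buffer first, then fall back to smaller
 * orders down to a single page. Returns NULL (and *size_out == 0)
 * only if even the order-0 attempt fails.
 */
static void *alloc_frag_buffer(alloc_fn alloc, size_t *size_out)
{
    for (int order = SKETCH_MAX_ORDER; order >= 0; order--) {
        size_t size = (size_t)SKETCH_PAGE_SIZE << order;
        void *mem = alloc(size);

        if (mem) {
            *size_out = size;
            return mem;
        }
    }
    *size_out = 0;
    return NULL;
}

/* Simulated memory pressure: anything larger than one page fails. */
static void *alloc_small_only(size_t size)
{
    return size > SKETCH_PAGE_SIZE ? NULL : malloc(size);
}
```

Calling `alloc_frag_buffer(malloc, &size)` would normally yield a 32768-byte buffer, while passing `alloc_small_only` yields a 4096-byte one, which mirrors the "bigger pages, if available" behaviour.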

This increases TCP stream performance by 20% on the loopback device,
and also benefits other network devices, since 8x fewer frags are
mapped on transmit and unmapped on tx completion. Alexander Duyck
mentioned a probable performance win on systems with IOMMU enabled.
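
The "16 pages" and "8x" figures follow from simple division. As a sketch of the arithmetic only (the helper `frags_needed` is a hypothetical name, not a kernel function), and assuming a full 65536-byte payload:

```c
#include <stddef.h>

/*
 * Number of fragments needed to carry `payload` bytes when each
 * fragment holds at most `frag_size` bytes (rounded up).
 */
static size_t frags_needed(size_t payload, size_t frag_size)
{
    return (payload + frag_size - 1) / frag_size;
}
```

With 4096-byte fragments a 65536-byte payload needs 16 fragments; with 32768-byte fragments it needs 2, which is the 8x reduction in map and unmap operations.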

It's possible some SG-enabled hardware can't cope with bigger fragments,
but their ndo_start_xmit() should already handle this by splitting a
fragment into sub-fragments, since some arches have PAGE_SIZE=65536.
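
Splitting a fragment into sub-fragments could look roughly like the following. This is a generic sketch, not taken from any driver: `struct sub_frag` and `split_fragment` are hypothetical names, and `max_len` stands for whatever per-descriptor limit a given device has.

```c
#include <stddef.h>

/* One piece of a larger fragment: byte offset and length. */
struct sub_frag {
    size_t offset;
    size_t len;
};

/*
 * Split a fragment of `len` bytes into pieces of at most `max_len`
 * bytes. Up to `cap` pieces are written to `out`; the return value
 * is the total number of pieces required, which may exceed `cap`.
 */
static size_t split_fragment(size_t len, size_t max_len,
                             struct sub_frag *out, size_t cap)
{
    size_t n = 0;
    size_t off = 0;

    if (max_len == 0)
        return 0;

    while (off < len) {
        size_t chunk = len - off;

        if (chunk > max_len)
            chunk = max_len;
        if (n < cap) {
            out[n].offset = off;
            out[n].len = chunk;
        }
        n++;
        off += chunk;
    }
    return n;
}
```

For a 32768-byte fragment and a 4096-byte limit this yields 8 equal pieces; a length that is not a multiple of the limit ends with one shorter piece.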

Successfully tested on various ethernet devices.
(ixgbe, igb, bnx2x, tg3, mellanox mlx4)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-24 16:31:37 -04:00
netfilter netfilter: combine ipt_REDIRECT and ip6t_REDIRECT 2012-09-21 12:12:05 +02:00
addrconf_core.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
addrconf.c ipv6: replace write lock with read lock when get route info 2012-09-13 16:53:46 -04:00
addrlabel.c ipv6: Add labels for site-local and 6bone testing addresses (RFC6724) 2012-09-13 16:34:03 -04:00
af_inet6.c ipv6: bool conversions phase1 2012-05-18 02:24:13 -04:00
ah6.c ipv6: Add redirect support to all protocol icmp error handlers. 2012-07-12 00:25:15 -07:00
anycast.c ipv6: bool/const conversions phase2 2012-05-19 01:08:16 -04:00
datagram.c ipv6: bool/const conversions phase2 2012-05-19 01:08:16 -04:00
esp6.c net: ipv6: fix error return code 2012-08-31 16:27:48 -04:00
exthdrs_core.c ipv6: bool/const conversions phase2 2012-05-19 01:08:16 -04:00
exthdrs.c net: Remove casts to same type 2012-06-04 11:45:11 -04:00
fib6_rules.c net/ipv6/fib6_rules.c: Checkpatch cleanup 2012-04-02 04:33:46 -04:00
icmp.c ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect(). 2012-07-12 00:33:37 -07:00
inet6_connection_sock.c ipv6: fix inet6_csk_xmit() 2012-07-18 08:59:58 -07:00
inet6_hashtables.c net: Compute protocol sequence numbers and fragment IDs using MD5. 2011-08-06 18:33:19 -07:00
ip6_fib.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-06-25 15:50:32 -07:00
ip6_flowlabel.c ipv6: move dereference after check in fl_free() 2012-08-16 16:04:42 -07:00
ip6_gre.c ipv6: gre: fix ip6gre_err() 2012-08-22 22:48:32 -07:00
ip6_input.c net: TCP early demux cleanup 2012-07-30 14:53:21 -07:00
ip6_output.c net: use a per task frag allocator 2012-09-24 16:31:37 -04:00
ip6_tunnel.c gre: information leak in ip6_tnl_ioctl() 2012-08-20 02:21:30 -07:00
ip6mr.c netlink: Rename pid to portid to avoid confusion 2012-09-10 15:30:41 -04:00
ipcomp6.c ipv6: Add redirect support to all protocol icmp error handlers. 2012-07-12 00:25:15 -07:00
ipv6_sockglue.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
Kconfig gre: Support GRE over IPv6 2012-08-14 14:28:32 -07:00
Makefile gre: Support GRE over IPv6 2012-08-14 14:28:32 -07:00
mcast.c ipv6: fix unappropriate errno returned for non-multicast address 2012-07-17 01:35:03 -07:00
mip6.c ipv6: correct the ipv6 option name - Pad0 to Pad1 2012-05-17 15:49:51 -04:00
ndisc.c ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect(). 2012-07-12 00:33:37 -07:00
netfilter.c netfilter: ipv6: expand skb head in ip6_route_me_harder after oif change 2012-08-30 03:00:15 +02:00
proc.c net: ipv6: proc: Fix error handling 2012-08-14 14:45:07 -07:00
protocol.c inet: Sanitize inet{,6} protocol demux. 2012-06-19 18:56:21 -07:00
raw.c userns: Print out socket uids in a user namespace aware fashion. 2012-08-14 21:48:06 -07:00
reassembly.c ipv6: unify fragment thresh handling code 2012-09-19 17:23:28 -04:00
route.c ipv6: recursive check rt->dst.from when call rt6_check_expired 2012-09-19 15:35:33 -04:00
sit.c net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() 2012-07-17 03:29:28 -07:00
syncookies.c tcp: TCP Fast Open Server - support TFO listeners 2012-08-31 20:02:19 -04:00
sysctl_net_ipv6.c net: Delete all remaining instances of ctl_path 2012-04-20 21:22:30 -04:00
tcp_ipv6.c tcp: TCP Fast Open Server - take SYNACK RTT after completing 3WHS 2012-09-22 15:47:10 -04:00
tunnel6.c net: ipv6: Standardize prefixes for message logging 2012-05-16 01:01:03 -04:00
udp_impl.h
udp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-09-15 11:43:53 -04:00
udplite.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
xfrm6_input.c
xfrm6_mode_beet.c ipsec: be careful of non existing mac headers 2012-02-23 16:50:45 -05:00
xfrm6_mode_ro.c
xfrm6_mode_transport.c
xfrm6_mode_tunnel.c ipsec: be careful of non existing mac headers 2012-02-23 16:50:45 -05:00
xfrm6_output.c xfrm6: remove unneeded NULL check in __xfrm6_output() 2012-02-01 02:52:48 -05:00
xfrm6_policy.c net: ipv6: fix oops in inet_putpeer() 2012-08-20 02:56:56 -07:00
xfrm6_state.c net: remove ipv6_addr_copy() 2011-11-22 16:43:32 -05:00
xfrm6_tunnel.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00