linux/net/core
Eric Dumazet b2b5ce9d1c net: introduce build_skb()
One of the thing we discussed during netdev 2011 conference was the idea
to change some network drivers to allocate/populate their skb at RX
completion time, right before feeding the skb to network stack.

In old days, we allocated skbs when populating the RX ring.

This means bringing into cpu cache sk_buff and skb_shared_info cache
lines (since we clear/initialize them), then 'queue' skb->data to NIC.

By the time NIC fills a frame in skb->data buffer and host can process
it, cpu probably threw away the cache lines from its caches, because lot
of things happened between the allocation and final use.

So the deal would be to allocate only the data buffer for the NIC to
populate its RX ring buffer. And use build_skb() at RX completion to
attach a data buffer (now filled with an ethernet frame) to a new skb,
initialize the skb_shared_info portion, and give the hot skb to network
stack.

build_skb() is the function to allocate an skb, caller providing the
data buffer that should be attached to it. Drivers are expected to call
skb_reserve() right after build_skb() to adjust skb->data to the
Ethernet frame (usually skipping NET_SKB_PAD and NET_IP_ALIGN, but some
drivers might add a hardware provided alignment)

Data provided to build_skb() MUST have been allocated by a prior
kmalloc() call, with enough room to add SKB_DATA_ALIGN(sizeof(struct
skb_shared_info)) bytes at the end of the data without corrupting
incoming frame.

data = kmalloc(NET_SKB_PAD + NET_IP_ALIGN + 1536 +
               SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
	       GFP_ATOMIC);
...
skb = build_skb(data);
if (!skb) {
	recycle_data(data);
} else {
	skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
	...
}

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Eilon Greenstein <eilong@broadcom.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Tom Herbert <therbert@google.com>
CC: Jamal Hadi Salim <hadi@mojatatu.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Thomas Graf <tgraf@infradead.org>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-14 14:13:30 -05:00
..
datagram.c net: add skb frag size accessors 2011-10-19 03:10:46 -04:00
dev_addr_lists.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
dev.c vlan: allow nested vlan_do_receive() 2011-10-30 04:43:30 -04:00
drop_monitor.c net,rcu: convert call_rcu(free_dm_hw_stat) to kfree_rcu() 2011-05-07 22:50:59 -07:00
dst.c net: fix potential neighbour race in dst_ifdown() 2011-08-09 21:47:14 -07:00
ethtool.c net: consolidate and fix ethtool_ops->get_settings calling 2011-09-15 17:32:26 -04:00
fib_rules.c net: Fix files explicitly needing to include module.h 2011-10-31 19:30:28 -04:00
filter.c filter: use unsigned int to silence static checker warning 2011-10-19 19:35:51 -04:00
flow.c net/flow: Fix potential memory leak 2011-10-17 19:18:42 -04:00
gen_estimator.c net,rcu: convert call_rcu(__gen_kill_estimator) to kfree_rcu() 2011-05-07 22:50:57 -07:00
gen_stats.c net/core: EXPORT_SYMBOL cleanups 2010-07-12 12:57:55 -07:00
iovec.c net: Limit socket I/O iovec total length to INT_MAX. 2010-10-28 11:47:52 -07:00
kmap_skb.h net: convert core to skb paged frag APIs 2011-08-24 17:52:11 -07:00
link_watch.c net: linkwatch: allow vlans to get carrier changes faster 2011-09-15 15:36:34 -04:00
Makefile net: Compute protocol sequence numbers and fragment IDs using MD5. 2011-08-06 18:33:19 -07:00
neighbour.c neigh: new unresolved queue limits 2011-11-14 00:47:54 -05:00
net_namespace.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
net-sysfs.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
net-sysfs.h xps: Add CONFIG_XPS 2010-11-28 18:24:14 -08:00
net-traces.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
netevent.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
netpoll.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
pktgen.c pktgen: remove ndelay() call 2011-10-20 17:00:21 -04:00
request_sock.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-08 13:47:38 -08:00
rtnetlink.c if_link: Add additional parameter to IFLA_VF_INFO for spoof checking 2011-10-16 13:15:38 -07:00
scm.c af_unix: dont send SCM_CREDENTIALS by default 2011-09-28 13:29:50 -04:00
secure_seq.c tcp: add const qualifiers where possible 2011-10-21 05:22:42 -04:00
skbuff.c net: introduce build_skb() 2011-11-14 14:13:30 -05:00
sock.c net: rename sk_clone to sk_clone_lock 2011-11-08 17:07:07 -05:00
stream.c net: Fix the condition passed to sk_wait_event() 2010-10-03 20:41:32 -07:00
sysctl_net_core.c net: Kill ratelimit.h dependency in linux/net.h 2011-05-27 13:41:33 -04:00
timestamping.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
user_dma.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
utils.c net: Kill ratelimit.h dependency in linux/net.h 2011-05-27 13:41:33 -04:00