linux/net
Daniel Borkmann 9aa1206e8f bpf: Add redirect_peer helper
Add an efficient ingress to ingress netns switch that can be used out of tc BPF
programs in order to redirect traffic from host ns ingress into a container
veth device ingress without having to go via CPU backlog queue [0]. For local
containers this can also be utilized and path via CPU backlog queue only needs
to be taken once, not twice. On a high level this borrows from ipvlan which does
similar switch in __netif_receive_skb_core() and then iterates via another_round.
This helps to reduce latency for mentioned use cases.

Pod to remote pod with redirect(), TCP_RR [1]:

  # percpu_netperf 10.217.1.33
          RT_LATENCY:         122.450         (per CPU:         122.666         122.401         122.333         122.401 )
        MEAN_LATENCY:         121.210         (per CPU:         121.100         121.260         121.320         121.160 )
      STDDEV_LATENCY:         120.040         (per CPU:         119.420         119.910         125.460         115.370 )
         MIN_LATENCY:          46.500         (per CPU:          47.000          47.000          47.000          45.000 )
         P50_LATENCY:         118.500         (per CPU:         118.000         119.000         118.000         119.000 )
         P90_LATENCY:         127.500         (per CPU:         127.000         128.000         127.000         128.000 )
         P99_LATENCY:         130.750         (per CPU:         131.000         131.000         129.000         132.000 )

    TRANSACTION_RATE:       32666.400         (per CPU:        8152.200        8169.842        8174.439        8169.897 )

Pod to remote pod with redirect_peer(), TCP_RR:

  # percpu_netperf 10.217.1.33
          RT_LATENCY:          44.449         (per CPU:          43.767          43.127          45.279          45.622 )
        MEAN_LATENCY:          45.065         (per CPU:          44.030          45.530          45.190          45.510 )
      STDDEV_LATENCY:          84.823         (per CPU:          66.770          97.290          84.380          90.850 )
         MIN_LATENCY:          33.500         (per CPU:          33.000          33.000          34.000          34.000 )
         P50_LATENCY:          43.250         (per CPU:          43.000          43.000          43.000          44.000 )
         P90_LATENCY:          46.750         (per CPU:          46.000          47.000          47.000          47.000 )
         P99_LATENCY:          52.750         (per CPU:          51.000          54.000          53.000          53.000 )

    TRANSACTION_RATE:       90039.500         (per CPU:       22848.186       23187.089       22085.077       21919.130 )

  [0] https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf
  [1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperf

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201010234006.7075-3-daniel@iogearbox.net
2020-10-11 10:21:04 -07:00
..
6lowpan
9p treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
802
8021q net: vlan: Fixed signedness in vlan_group_prealloc_vid() 2020-09-28 00:51:39 -07:00
appletalk appletalk: Fix atalk_proc_init() return path 2020-08-03 15:48:32 -07:00
atm net: atm: delete duplicated words 2020-09-18 14:12:43 -07:00
ax25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-07-25 17:49:04 -07:00
batman-adv net: bridge: mcast: rename br_ip's u member to dst 2020-09-23 13:24:34 -07:00
bluetooth Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2020-09-29 13:22:53 -07:00
bpf bpf: fix raw_tp test run in preempt kernel 2020-09-30 08:34:08 -07:00
bpfilter bpf: Add kernel module with user mode driver that populates bpffs. 2020-08-20 16:02:36 +02:00
bridge net: bridge: mcast: remove only S,G port groups from sg_port hash 2020-09-25 16:50:19 -07:00
caif caif: Remove duplicate macro SRVL_CTRL_PKT_SIZE 2020-09-05 15:57:05 -07:00
can can: remove "WITH Linux-syscall-note" from SPDX tag of C files 2020-09-21 10:13:16 +02:00
ceph treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
core bpf: Add redirect_peer helper 2020-10-11 10:21:04 -07:00
dcb net: DCB: Validate DCB_ATTR_DCB_BUFFER argument 2020-09-10 15:09:08 -07:00
dccp inet: remove icsk_ack.blocked 2020-09-30 14:21:30 -07:00
decnet treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dns_resolver
dsa net: dsa: tag_rtl4_a: use the generic flow dissector procedure 2020-09-26 14:17:59 -07:00
ethernet
ethtool Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
hsr Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
ieee802154 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ife
ipv4 bpf: tcp: Do not limit cb_flags when creating child sk from listen sk 2020-10-02 11:34:48 -07:00
ipv6 ip6gre: avoid tx_error when sending MLD/DAD on external tunnels 2020-09-28 16:01:37 -07:00
iucv treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
kcm net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-08-02 01:02:12 -07:00
l2tp l2tp: report rx cookie discards in netlink get 2020-09-29 13:26:36 -07:00
l3mdev net: Fix some comments 2020-08-27 07:55:59 -07:00
lapb
llc net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
mac802154 Merge tag 'ieee802154-for-davem-2020-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan 2020-09-08 20:12:58 -07:00
mpls treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
mptcp net: tcp: drop unused function argument from mptcp_incoming_options 2020-09-24 20:17:01 -07:00
ncsi treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
netlabel netlabel: Fix some kernel-doc warnings 2020-09-08 20:04:27 -07:00
netlink netlink: add spaces around '&' in netlink_recv/sendmsg() 2020-09-17 16:53:47 -07:00
netrom treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nfc NFC: digital: Remove two unused macroes 2020-09-05 16:01:52 -07:00
nsh
openvswitch net: openswitch: reuse the helper variable to improve the code readablity 2020-09-18 14:24:08 -07:00
packet net/packet: Fix a comment about network_header 2020-09-19 16:40:48 -07:00
phonet treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
psample
qrtr net: qrtr: check skb_put_padto() return value 2020-09-09 11:04:39 -07:00
rds RDS: drop double zeroing 2020-09-20 19:09:11 -07:00
rfkill
rose treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
rxrpc rxrpc: Fix an overget of the conn bundle when setting up a client conn 2020-09-14 16:18:59 +01:00
sched net/sched: cls_u32: Replace one-element array with flexible-array member 2020-09-28 18:48:42 -07:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
smc net/smc: CLC decline - V2 enhancements 2020-09-28 15:19:03 -07:00
strparser
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
switchdev net: switchdev: kerneldoc fixes 2020-07-13 17:20:40 -07:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
tls net/tls: Implement getsockopt SOL_TLS TLS_RX 2020-09-01 11:47:12 -07:00
unix net: unix: remove redundant assignment to variable 'err' 2020-09-21 14:51:37 -07:00
vmw_vsock vsock: fix potential null pointer dereference in vsock_poll() 2020-08-12 12:56:06 -07:00
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
x25 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
xdp xsk: Introduce padding between ring pointers 2020-10-09 16:35:01 +02:00
xfrm treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
compat.c net/scm: Fix typo in SCM_RIGHTS compat refactoring 2020-08-07 12:43:25 -07:00
devres.c net: devres: rename the release callback of devm_register_netdev() 2020-06-30 15:57:34 -07:00
Kconfig drop_monitor: Convert to using devlink tracepoint 2020-09-30 18:01:26 -07:00
Makefile
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-04 21:28:59 -07:00
sysctl_net.c