linux/net
Toke Høiland-Jørgensen ad1e03b2b3 core: Don't skip generic XDP program execution for cloned SKBs
The current generic XDP handler skips execution of XDP programs entirely if
an SKB is marked as cloned. This leads to some surprising behaviour, as
packets can end up being cloned in various ways, which will make an XDP
program not see all the traffic on an interface.

This was discovered by a simple test case where an XDP program that always
returns XDP_DROP is installed on a veth device. When combining this with
the Scapy packet sniffer (which uses an AF_PACKET) socket on the sending
side, SKBs reliably end up in the cloned state, causing them to be passed
through to the receiving interface instead of being dropped. A minimal
reproducer script for this is included below.

This patch fixed the issue by simply triggering the existing linearisation
code for cloned SKBs instead of skipping the XDP program execution. This
behaviour is in line with the behaviour of the native XDP implementation
for the veth driver, which will reallocate and copy the SKB data if the SKB
is marked as shared.

Reproducer Python script (requires BCC and Scapy):

from scapy.all import TCP, IP, Ether, sendp, sniff, AsyncSniffer, Raw, UDP
from bcc import BPF
import time, sys, subprocess, shlex

SKB_MODE = (1 << 1)
DRV_MODE = (1 << 2)
PYTHON=sys.executable

def client():
    time.sleep(2)
    # Sniffing on the sender causes skb_cloned() to be set
    s = AsyncSniffer()
    s.start()

    for p in range(10):
        sendp(Ether(dst="aa:aa:aa:aa:aa:aa", src="cc:cc:cc:cc:cc:cc")/IP()/UDP()/Raw("Test"),
              verbose=False)
        time.sleep(0.1)

    s.stop()
    return 0

def server(mode):
    prog = BPF(text="int dummy_drop(struct xdp_md *ctx) {return XDP_DROP;}")
    func = prog.load_func("dummy_drop", BPF.XDP)
    prog.attach_xdp("a_to_b", func, mode)

    time.sleep(1)

    s = sniff(iface="a_to_b", count=10, timeout=15)
    if len(s):
        print(f"Got {len(s)} packets - should have gotten 0")
        return 1
    else:
        print("Got no packets - as expected")
        return 0

if len(sys.argv) < 2:
    print(f"Usage: {sys.argv[0]} <skb|drv>")
    sys.exit(1)

if sys.argv[1] == "client":
    sys.exit(client())
elif sys.argv[1] == "server":
    mode = SKB_MODE if sys.argv[2] == 'skb' else DRV_MODE
    sys.exit(server(mode))
else:
    try:
        mode = sys.argv[1]
        if mode not in ('skb', 'drv'):
            print(f"Usage: {sys.argv[0]} <skb|drv>")
            sys.exit(1)
        print(f"Running in {mode} mode")

        for cmd in [
                'ip netns add netns_a',
                'ip netns add netns_b',
                'ip -n netns_a link add a_to_b type veth peer name b_to_a netns netns_b',
                # Disable ipv6 to make sure there's no address autoconf traffic
                'ip netns exec netns_a sysctl -qw net.ipv6.conf.a_to_b.disable_ipv6=1',
                'ip netns exec netns_b sysctl -qw net.ipv6.conf.b_to_a.disable_ipv6=1',
                'ip -n netns_a link set dev a_to_b address aa:aa:aa:aa:aa:aa',
                'ip -n netns_b link set dev b_to_a address cc:cc:cc:cc:cc:cc',
                'ip -n netns_a link set dev a_to_b up',
                'ip -n netns_b link set dev b_to_a up']:
            subprocess.check_call(shlex.split(cmd))

        server = subprocess.Popen(shlex.split(f"ip netns exec netns_a {PYTHON} {sys.argv[0]} server {mode}"))
        client = subprocess.Popen(shlex.split(f"ip netns exec netns_b {PYTHON} {sys.argv[0]} client"))

        client.wait()
        server.wait()
        sys.exit(server.returncode)

    finally:
        subprocess.run(shlex.split("ip netns delete netns_a"))
        subprocess.run(shlex.split("ip netns delete netns_b"))

Fixes: d445516966 ("net: xdp: support xdp generic on virtual devices")
Reported-by: Stepan Horacek <shoracek@redhat.com>
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:01:29 -08:00
..
6lowpan
9p
802 treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-09 12:13:43 -08:00
appletalk
atm proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
ax25 net: Make sock protocol value checks more specific 2020-01-09 18:41:40 -08:00
batman-adv Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
bluetooth Bluetooth: Fix race condition in hci_release_sock() 2020-01-26 10:34:17 +02:00
bpf bpf: Allow to change skb mark in test_run 2019-12-18 17:05:58 -08:00
bpfilter
bridge net: bridge: vlan: add per-vlan state 2020-01-24 12:58:14 +01:00
caif caif_usb: fix spelling mistake "to" -> "too" 2020-01-24 08:12:06 +01:00
can can: j1939: j1939_sk_bind(): take priv after lock is held 2019-12-08 11:52:02 +01:00
ceph Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-02-08 13:26:41 -08:00
core core: Don't skip generic XDP program execution for cloned SKBs 2020-02-11 17:01:29 -08:00
dcb
dccp treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
decnet net: Make sock protocol value checks more specific 2020-01-09 18:41:40 -08:00
dns_resolver
dsa net: dsa: Fix use-after-free in probing of DSA switch tree 2020-01-27 11:12:46 +01:00
ethernet net: remove eth_change_mtu 2020-01-27 11:09:31 +01:00
ethtool net/core: Replace driver version to be kernel version 2020-01-27 13:47:22 +01:00
hsr net: hsr: fix possible NULL deref in hsr_handle_frame() 2020-02-04 09:27:07 +01:00
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-02 13:54:56 -07:00
ife
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-02-04 13:32:20 +00:00
ipv6 ipv6/addrconf: fix potential NULL deref in inet6_set_link_af() 2020-02-07 18:43:23 +01:00
iucv treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
kcm
key
l2tp l2tp: Allow duplicate session creation with UDP 2020-02-04 12:35:49 +01:00
l3mdev
lapb
llc llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c) 2019-12-20 21:19:36 -08:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
mac802154
mpls net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup 2019-12-04 12:27:13 -08:00
mptcp mptcp: make the symbol 'mptcp_sk_clone_lock' static 2020-02-10 10:23:00 +01:00
ncsi net/ncsi: Support for multi host mellanox card 2020-01-09 18:36:22 -08:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-02-04 13:32:20 +00:00
netlabel
netlink treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
netrom
nfc net: nfc: nci: fix a possible sleep-in-atomic-context bug in nci_uart_tty_receive() 2019-12-18 11:57:33 -08:00
nsh
openvswitch net: openvswitch: use skb_list_walk_safe helper for gso segments 2020-01-14 11:48:41 -08:00
packet y2038: core, driver and file system changes 2020-01-29 14:55:47 -08:00
phonet net: Remove redundant BUG_ON() check in phonet_pernet 2020-01-03 12:25:50 -08:00
psample net: psample: fix skb_over_panic 2019-11-26 14:40:13 -08:00
qrtr net: qrtr: Remove receive worker 2020-01-14 18:36:42 -08:00
rds net/rds: Use prefetch for On-Demand-Paging MR 2020-01-18 11:48:19 +02:00
rfkill rfkill: Fix incorrect check to avoid NULL pointer dereference 2019-12-16 10:15:49 +01:00
rose Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-26 10:40:21 +01:00
rxrpc rxrpc: Fix call RCU cleanup using non-bh-safe locks 2020-02-07 11:20:57 +01:00
sched taprio: Fix dropping packets when using taprio + ETF offloading 2020-02-07 11:30:03 +01:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-09 12:13:43 -08:00
smc net/smc: allow unprivileged users to read pnet table 2020-01-21 11:39:56 +01:00
strparser
sunrpc Highlights: 2020-02-07 17:50:21 -08:00
switchdev
tipc tipc: fix successful connect() but timed out 2020-02-10 10:23:00 +01:00
tls Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
unix skbuff: fix a data race in skb_queue_len() 2020-02-06 13:59:10 +01:00
vmw_vsock Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
x25 net/x25: fix nonblocking connect 2020-01-09 18:39:33 -08:00
xdp mm, tree-wide: rename put_user_page*() to unpin_user_page*() 2020-01-31 10:30:38 -08:00
xfrm treewide: remove redundant IS_ERR() before error code check 2020-02-04 03:05:27 +00:00
compat.c y2038: socket: use __kernel_old_timespec instead of timespec 2019-11-15 14:38:29 +01:00
Kconfig mptcp: Add MPTCP socket stubs 2020-01-24 13:44:07 +01:00
Makefile mptcp: Add MPTCP socket stubs 2020-01-24 13:44:07 +01:00
socket.c socket: fix unused-function warning 2020-01-08 15:02:21 -08:00
sysctl_net.c