linux/net/core
Alexei Starovoitov 1be7f75d16 bpf: enable non-root eBPF programs
In order to let unprivileged users load and execute eBPF programs
teach verifier to prevent pointer leaks.
Verifier will prevent
- any arithmetic on pointers
  (except R10+Imm which is used to compute stack addresses)
- comparison of pointers
  (except if (map_value_ptr == 0) ... )
- passing pointers to helper functions
- indirectly passing pointers in stack to helper functions
- returning pointer from bpf program
- storing pointers into ctx or maps

Spill/fill of pointers into stack is allowed, but mangling
of pointers stored in the stack or reading them byte by byte is not.

Within bpf programs the pointers do exist, since programs need to
be able to access maps, pass skb pointer to LD_ABS insns, etc
but programs cannot pass such pointer values to the outside
or obfuscate them.

Only allow BPF_PROG_TYPE_SOCKET_FILTER unprivileged programs,
so that socket filters (tcpdump), af_packet (quic acceleration)
and future kcm can use it.
tracing and tc cls/act program types still require root permissions,
since tracing actually needs to be able to see all kernel pointers
and tc is for root only.

For example, the following unprivileged socket filter program is allowed:
int bpf_prog1(struct __sk_buff *skb)
{
  u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
  u64 *value = bpf_map_lookup_elem(&my_map, &index);

  if (value)
	*value += skb->len;
  return 0;
}

but the following program is not:
int bpf_prog1(struct __sk_buff *skb)
{
  u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
  u64 *value = bpf_map_lookup_elem(&my_map, &index);

  if (value)
	*value += (u64) skb;
  return 0;
}
since it would leak the kernel address into the map.

Unprivileged socket filter bpf programs have access to the
following helper functions:
- map lookup/update/delete (but they cannot store kernel pointers into them)
- get_random (it's already exposed to unprivileged user space)
- get_smp_processor_id
- tail_call into another socket filter program
- ktime_get_ns

The feature is controlled by sysctl kernel.unprivileged_bpf_disabled.
This toggle defaults to off (0), but can be set true (1).  Once true,
bpf programs and maps cannot be accessed from unprivileged process,
and the toggle cannot be set back to false.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 19:13:35 -07:00
..
datagram.c net: Fix skb_set_peeked use-after-free bug 2015-08-06 21:55:47 -07:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c dev_ioctl: use sizeof(x) instead of sizeof x 2014-11-18 15:27:32 -05:00
dev.c net: use sk_fullsock() in __netdev_pick_tx() 2015-10-05 02:45:25 -07:00
drop_monitor.c net: Replace get_cpu_var through this_cpu_ptr 2014-08-26 13:45:47 -04:00
dst.c dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
ethtool.c net/ethtool: Add current supported tunable options 2015-06-11 00:36:37 -07:00
fib_rules.c fib_rules: fix fib rule dumps across multiple skbs 2015-09-24 15:21:54 -07:00
filter.c bpf: enable non-root eBPF programs 2015-10-12 19:13:35 -07:00
flow_dissector.c flow_dissector: Use 'const' where possible. 2015-09-01 21:19:17 -07:00
flow.c flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c 2015-09-01 17:00:24 -07:00
gen_estimator.c net_sched: gen_estimator: extend pps limit 2015-07-08 13:59:20 -07:00
gen_stats.c gen_stats.c: Duplicate xstats buffer for later use 2015-02-19 15:45:53 -05:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwtunnel.c dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
Makefile lwtunnel: infrastructure for handling light weight tunnels like mpls 2015-07-21 10:39:03 -07:00
neighbour.c net: Add support for filtering neigh dump by device index 2015-10-07 04:12:02 -07:00
net_namespace.c netns: make nsid_lock per net 2015-05-17 23:41:11 -04:00
net-procfs.c rps: selective flow shedding during softnet overflow 2013-05-20 13:48:04 -07:00
net-sysfs.c switchdev: rename SWITCHDEV_ATTR_* enum values to SWITCHDEV_ATTR_ID_* 2015-10-03 04:49:37 -07:00
net-sysfs.h net: netdev_kobject_init: annotate with __init 2014-01-05 20:27:54 -05:00
net-traces.c net: FIB tracepoints 2015-08-29 13:05:16 -07:00
netclassid_cgroup.c cgroup: net_cls: fix false-positive "suspicious RCU usage" 2015-07-25 00:13:18 -07:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c netpoll: Drop budget parameter from NAPI polling call hierarchy 2015-09-29 14:57:16 -07:00
netprio_cgroup.c cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes 2014-07-15 11:05:09 -04:00
pktgen.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-13 16:23:11 -07:00
ptp_classifier.c net: filter: split 'struct sk_filter' into socket and bpf parts 2014-08-02 15:03:58 -07:00
request_sock.c tcp: restore fastopen operations 2015-10-05 03:19:06 -07:00
rtnetlink.c net/core: lockdep_rtnl_is_held can be boolean 2015-10-09 07:49:06 -07:00
scm.c net: introduce helper macro for_each_cmsghdr 2014-12-10 22:41:55 -05:00
secure_seq.c net: remove a sparse error in secure_dccpv6_sequence_number() 2015-05-25 22:55:37 -04:00
skbuff.c skbuff: Fix skb checksum partial check. 2015-09-29 16:48:46 -07:00
sock_diag.c net/core: make sock_diag.c explicitly non-modular 2015-10-09 07:52:27 -07:00
sock.c tcp/dccp: add SLAB_DESTROY_BY_RCU flag for request sockets 2015-10-03 13:25:20 -07:00
stream.c tcp: set SOCK_NOSPACE under memory pressure 2015-05-09 17:38:36 -04:00
sysctl_net_core.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-03-20 18:51:09 -04:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: fix unaligned access to crafted TCP header in helper API 2014-10-22 12:52:55 -04:00
utils.c net: move net_get_random_once to lib 2015-10-08 05:26:35 -07:00