linux/net/core
John Fastabend 90db6d772f bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free
The bucket->lock is not needed in the sock_hash_free and sock_map_free
calls, in fact it is causing a splat due to being inside rcu block.

| BUG: sleeping function called from invalid context at net/core/sock.c:2935
| in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 62, name: kworker/0:1
| 3 locks held by kworker/0:1/62:
|  #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0
|  #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0
|  #2: ffff8881381f6df8 (&stab->lock){+...}, at: sock_map_free+0x26/0x180
| CPU: 0 PID: 62 Comm: kworker/0:1 Not tainted 5.5.0-04008-g7b083332376e #454
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
| Workqueue: events bpf_map_free_deferred
| Call Trace:
|  dump_stack+0x71/0xa0
|  ___might_sleep.cold+0xa6/0xb6
|  lock_sock_nested+0x28/0x90
|  sock_map_free+0x5f/0x180
|  bpf_map_free_deferred+0x58/0x80
|  process_one_work+0x260/0x5e0
|  worker_thread+0x4d/0x3e0
|  kthread+0x108/0x140
|  ? process_one_work+0x5e0/0x5e0
|  ? kthread_park+0x90/0x90
|  ret_from_fork+0x3a/0x50

The reason we have stab->lock and bucket->locks in sockmap code is to
handle checking EEXIST in update/delete cases. We need to be careful during
an update operation that we check for EEXIST and we need to ensure that the
psock object is not in some partial state of removal/insertion while we do
this. So both map_update_common and sock_map_delete need to guard from being
run together potentially deleting an entry we are checking, etc. But by the
time we get to the tear-down code in sock_{ma[|hash}_free we have already
disconnected the map and we just did synchronize_rcu() in the line above so
no updates/deletes should be in flight. Because of this we can drop the
bucket locks from the map free'ing code, noting no update/deletes can be
in-flight.

Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/158385850787.30597.8346421465837046618.stgit@john-Precision-5820-Tower
2020-03-11 14:08:52 +01:00
..
bpf_sk_storage.c bpf: Improve bucket_log calculation logic 2020-02-07 23:01:41 +01:00
datagram.c net: add queue argument to __skb_wait_for_more_packets and __skb_{,try_}recv_datagram 2019-12-09 09:59:07 +01:00
datagram.h net/core: Allow the compiler to verify declaration and definition consistency 2019-03-27 13:49:44 -07:00
dev_addr_lists.c net: remove unnecessary variables and callback 2019-10-24 14:53:49 -07:00
dev_ioctl.c net: Introduce peer to peer one step PTP time stamping. 2019-12-25 19:51:34 -08:00
dev.c net: Fix Tx hash bound checking 2020-02-26 11:14:10 -08:00
devlink.c devlink: validate length of region addr/len 2020-03-03 13:28:48 -08:00
drop_monitor.c drop_monitor: Do not cancel uninitialized work item 2020-02-07 18:48:36 +01:00
dst_cache.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
dst.c net: print proper warning on dst underflow 2019-09-26 09:05:56 +02:00
failover.c failover: allow name change on IFF_UP slave interfaces 2019-04-10 22:12:26 -07:00
fib_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_rules.c net: fib_rules: Correctly set table field when table number exceeds 8 bits 2020-02-16 18:38:24 -08:00
filter.c treewide: remove redundant IS_ERR() before error code check 2020-02-04 03:05:27 +00:00
flow_dissector.c flow_dissector: Fix to use new variables for port ranges in bpf hook 2020-01-27 11:25:07 +01:00
flow_offload.c net: core: rename indirect block ingress cb function 2019-12-06 20:45:09 -08:00
gen_estimator.c net_sched: gen_estimator: extend packet counter to 64bit 2019-11-06 21:51:36 -08:00
gen_stats.c net_sched: add TCA_STATS_PKT64 attribute 2019-11-05 18:20:55 -08:00
gro_cells.c gro_cells: make sure device is up in gro_cells_receive() 2019-03-10 11:07:14 -07:00
hwbm.c net: hwbm: Make the hwbm_pool lock a mutex 2019-06-09 19:40:10 -07:00
link_watch.c net: link_watch: prevent starvation when processing linkwatch wq 2019-07-01 19:02:47 -07:00
lwt_bpf.c net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup 2019-12-04 12:27:13 -08:00
lwtunnel.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
Makefile ethtool: move to its own directory 2019-12-12 17:07:05 -08:00
neighbour.c neigh_stat_seq_next() should increase position index 2020-01-24 11:42:18 +01:00
net_namespace.c netns: Constify exported functions 2020-01-17 13:25:24 +01:00
net-procfs.c net: procfs: use index hashlist instead of name hashlist 2019-10-01 14:47:19 -07:00
net-sysfs.c net-sysfs: Call dev_hold always in rx_queue_add_kobject 2019-12-17 22:57:11 -08:00
net-sysfs.h
net-traces.c page_pool: add tracepoints for page_pool with details need by XDP 2019-06-19 11:23:13 -04:00
netclassid_cgroup.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netevent.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netpoll.c net: fix skb use after free in netpoll 2019-08-27 20:52:02 -07:00
netprio_cgroup.c netprio: use css ID instead of cgroup ID 2019-11-12 08:18:03 -08:00
page_pool.c page_pool: refill page when alloc.count of pool is zero 2020-02-13 14:11:51 -08:00
pktgen.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
ptp_classifier.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
request_sock.c tcp: add rcu protection around tp->fastopen_rsk 2019-10-13 10:13:08 -07:00
rtnetlink.c net: rtnetlink: fix bugs in rtnl_alt_ifname() 2020-02-16 18:51:59 -08:00
scm.c y2038: socket: remove timespec reference in timestamping 2019-11-15 14:38:29 +01:00
secure_seq.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
skbuff.c net: core: Distribute switch variables for initialization 2020-02-20 10:00:19 -08:00
skmsg.c net, sk_msg: Don't check if sock is locked when tearing down psock 2020-01-22 20:30:20 +01:00
sock_diag.c sock: make cookie generation global instead of per netns 2019-08-09 13:14:46 -07:00
sock_map.c bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free 2020-03-11 14:08:52 +01:00
sock_reuseport.c soreuseport: Cleanup duplicate initialization of more_reuse->max_socks. 2020-01-27 11:01:16 +01:00
sock.c xsk, net: Make sock_def_readable() have external linkage 2020-01-22 00:08:52 +01:00
stream.c tcp: make sure EPOLLOUT wont be missed 2019-08-19 13:07:43 -07:00
sysctl_net_core.c net, sysctl: Fix compiler warning when only cBPF is present 2019-12-19 17:17:51 +01:00
timestamping.c net: Introduce a new MII time stamping interface. 2019-12-25 19:51:33 -08:00
tso.c net: Use skb accessors in network core 2019-07-22 20:47:56 -07:00
utils.c net: Fix skb->csum update in inet_proto_csum_replace16(). 2020-01-24 20:54:30 +01:00
xdp.c treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00