linux/net
Paolo Abeni 3f34cfae12 netfilter: on sockopt() acquire sock lock only in the required scope
Syzbot reported several deadlocks in the netfilter area caused by
rtnl lock and socket lock being acquired with a different order on
different code paths, leading to backtraces like the following one:

======================================================
WARNING: possible circular locking dependency detected
4.15.0-rc9+ #212 Not tainted
------------------------------------------------------
syzkaller041579/3682 is trying to acquire lock:
  (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock
include/net/sock.h:1463 [inline]
  (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>]
do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167

but task is already holding lock:
  (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (rtnl_mutex){+.+.}:
        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
        rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
        register_netdevice_notifier+0xad/0x860 net/core/dev.c:1607
        tee_tg_check+0x1a0/0x280 net/netfilter/xt_TEE.c:106
        xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:845
        check_target net/ipv6/netfilter/ip6_tables.c:538 [inline]
        find_check_entry.isra.7+0x935/0xcf0
net/ipv6/netfilter/ip6_tables.c:580
        translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:749
        do_replace net/ipv6/netfilter/ip6_tables.c:1165 [inline]
        do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1691
        nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
        nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
        ipv6_setsockopt+0x115/0x150 net/ipv6/ipv6_sockglue.c:928
        udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
        SYSC_setsockopt net/socket.c:1849 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1828
        entry_SYSCALL_64_fastpath+0x29/0xa0

-> #0 (sk_lock-AF_INET6){+.+.}:
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914
        lock_sock_nested+0xc2/0x110 net/core/sock.c:2780
        lock_sock include/net/sock.h:1463 [inline]
        do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
        ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922
        udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
        SYSC_setsockopt net/socket.c:1849 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1828
        entry_SYSCALL_64_fastpath+0x29/0xa0

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(rtnl_mutex);
                                lock(sk_lock-AF_INET6);
                                lock(rtnl_mutex);
   lock(sk_lock-AF_INET6);

  *** DEADLOCK ***

1 lock held by syzkaller041579/3682:
  #0:  (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

The problem, as Florian noted, is that nf_setsockopt() is always
called with the socket held, even if the lock itself is required only
for very tight scopes and only for some operation.

This patch addresses the issues moving the lock_sock() call only
where really needed, namely in ipv*_getorigdst(), so that nf_setsockopt()
does not need anymore to acquire both locks.

Fixes: 22265a5c3c ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Reported-by: syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-31 16:37:47 +01:00
..
6lowpan License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
9p make sock_alloc_file() do sock_release() on failures 2017-12-05 18:39:29 -05:00
802 treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-12 09:17:05 +09:00
appletalk treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
atm treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
ax25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
batman-adv batman-adv: Fix lock for ogm cnt access in batadv_iv_ogm_calc_tq 2017-12-04 11:47:33 +01:00
bluetooth treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
bpf bpf: add meta pointer for direct access 2017-09-26 13:36:44 -07:00
bridge net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks 2017-12-18 13:29:01 -05:00
caif License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
can treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
ceph We have a set of file locking improvements from Zheng, rbd rw/ro 2017-11-21 05:38:32 -10:00
core rtnetlink: give a user socket to get_target_net() 2018-01-04 13:42:20 -05:00
dcb rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
dccp dccp: CVE-2017-8824: use-after-free in DCCP code 2017-12-05 18:08:53 -05:00
decnet treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
dns_resolver KEYS: Fix race between updating and finding a negative key 2017-10-18 09:12:40 +01:00
dsa net: remove duplicate includes 2017-12-13 13:18:46 -05:00
ethernet networking: make skb_push & __skb_push return void pointers 2017-06-16 11:48:40 -04:00
hsr net: hsr: Convert timers to use timer_setup() 2017-10-25 13:00:27 +09:00
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
ife MAINTAINERS: Update Yotam's E-mail 2017-11-01 12:19:03 +09:00
ipv4 netfilter: on sockopt() acquire sock lock only in the required scope 2018-01-31 16:37:47 +01:00
ipv6 netfilter: on sockopt() acquire sock lock only in the required scope 2018-01-31 16:37:47 +01:00
ipx Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
iucv iucv: Convert sk_wmem_alloc accesses to refcount_t. 2017-07-03 02:31:22 -07:00
kcm make sock_alloc_file() do sock_release() on failures 2017-12-05 18:39:29 -05:00
key af_key: replace BUG_ON on WARN_ON in net_exit hook 2017-11-14 15:45:52 +09:00
l2tp l2tp: exit_net cleanup check added 2017-11-14 15:45:53 +09:00
l3mdev
lapb treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
llc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-11-15 11:56:19 -08:00
mac80211 mac80211: mesh: drop frames appearing to be from us 2018-01-04 15:51:53 +01:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
mpls Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
ncsi treewide: setup_timer() -> timer_setup() (2 field) 2017-11-21 15:57:09 -08:00
netfilter netfilter: x_tables: fix pointer leaks to userspace 2018-01-31 14:59:24 +01:00
netlabel net/netlabel: Add list_next_rcu() in rcu_dereference(). 2017-11-18 10:32:41 +09:00
netlink netlink: Add netns check on taps 2017-12-11 11:58:18 -05:00
netrom treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
nfc treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
nsh openvswitch: enable NSH support 2017-11-08 16:12:33 +09:00
openvswitch openvswitch: Fix pop_vlan action for double tagged frames 2017-12-21 13:02:08 -05:00
packet net/packet: fix a race in packet_bind() and packet_notifier() 2017-11-28 11:13:30 -05:00
phonet phonet: exit_net cleanup check added 2017-11-14 15:45:53 +09:00
psample MAINTAINERS: Update Yotam's E-mail 2017-11-01 12:19:03 +09:00
qrtr Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
rds RDS: null pointer dereference in rds_atomic_free_op 2018-01-04 14:19:26 -05:00
rfkill net: rfkill: gpio: Switch to devm_acpi_dev_add_driver_gpios() 2017-06-13 11:07:51 +02:00
rose treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
rxrpc rxrpc: Use correct netns source in rxrpc_release_sock() 2017-12-03 10:05:20 -05:00
sched net/sched: Fix update of lastuse in act modules implementing stats_update 2018-01-02 13:27:52 -05:00
sctp sctp: fix error path in sctp_stream_init 2018-01-03 11:29:42 -05:00
smc net/smc: Fix preinitialization of buf_desc in __smc_buf_create() 2017-11-24 01:33:34 +09:00
strparser strparser: Call sock_owned_by_user_nocheck 2017-12-28 14:28:22 -05:00
sunrpc NFS client fixes for Linux 4.15-rc4 2017-12-16 13:12:53 -08:00
switchdev net: bridge: Add/del switchdev object on host join/leave 2017-11-10 13:41:40 +09:00
tipc tipc: fix problems with multipoint-to-point flow control 2018-01-02 21:52:07 -05:00
tls tls: don't override sk_write_space if tls_set_sw_offload fails. 2017-11-14 16:26:35 +09:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
vmw_vsock VSOCK: fix outdated sk_state value in hvs_release() 2017-12-05 15:07:37 -05:00
wimax License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
wireless nl80211: Check for the required netlink attribute presence 2018-01-04 15:22:02 +01:00
x25 treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2017-12-27 10:58:23 -05:00
compat.c net: compat: assert the size of cmsg copied in is as expected 2017-09-20 15:36:18 -07:00
Kconfig net: Remove CONFIG_NETFILTER_DEBUG and _ASSERT() macros. 2017-09-04 13:25:20 +02:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
socket.c make sock_alloc_file() do sock_release() on failures 2017-12-05 18:39:29 -05:00
sysctl_net.c