linux/net/ipv4/netfilter
Paolo Abeni 3f34cfae12 netfilter: on sockopt() acquire sock lock only in the required scope
Syzbot reported several deadlocks in the netfilter area caused by
rtnl lock and socket lock being acquired with a different order on
different code paths, leading to backtraces like the following one:

======================================================
WARNING: possible circular locking dependency detected
4.15.0-rc9+ #212 Not tainted
------------------------------------------------------
syzkaller041579/3682 is trying to acquire lock:
  (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock
include/net/sock.h:1463 [inline]
  (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>]
do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167

but task is already holding lock:
  (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (rtnl_mutex){+.+.}:
        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
        rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
        register_netdevice_notifier+0xad/0x860 net/core/dev.c:1607
        tee_tg_check+0x1a0/0x280 net/netfilter/xt_TEE.c:106
        xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:845
        check_target net/ipv6/netfilter/ip6_tables.c:538 [inline]
        find_check_entry.isra.7+0x935/0xcf0
net/ipv6/netfilter/ip6_tables.c:580
        translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:749
        do_replace net/ipv6/netfilter/ip6_tables.c:1165 [inline]
        do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1691
        nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
        nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
        ipv6_setsockopt+0x115/0x150 net/ipv6/ipv6_sockglue.c:928
        udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
        SYSC_setsockopt net/socket.c:1849 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1828
        entry_SYSCALL_64_fastpath+0x29/0xa0

-> #0 (sk_lock-AF_INET6){+.+.}:
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914
        lock_sock_nested+0xc2/0x110 net/core/sock.c:2780
        lock_sock include/net/sock.h:1463 [inline]
        do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
        ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922
        udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
        SYSC_setsockopt net/socket.c:1849 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1828
        entry_SYSCALL_64_fastpath+0x29/0xa0

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(rtnl_mutex);
                                lock(sk_lock-AF_INET6);
                                lock(rtnl_mutex);
   lock(sk_lock-AF_INET6);

  *** DEADLOCK ***

1 lock held by syzkaller041579/3682:
  #0:  (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

The problem, as Florian noted, is that nf_setsockopt() is always
called with the socket held, even if the lock itself is required only
for very tight scopes and only for some operation.

This patch addresses the issues moving the lock_sock() call only
where really needed, namely in ipv*_getorigdst(), so that nf_setsockopt()
does not need anymore to acquire both locks.

Fixes: 22265a5c3c ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Reported-by: syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-31 16:37:47 +01:00
..
arp_tables.c netfilter: remove redundant assignment to e 2017-11-20 12:03:41 +01:00
arpt_mangle.c
arptable_filter.c netfilter: arp_tables: register table in initns 2016-04-07 11:58:49 +02:00
ip_tables.c netfilter: remove redundant assignment to e 2017-11-20 12:03:41 +01:00
ipt_ah.c
ipt_CLUSTERIP.c netfilter: ipt_CLUSTERIP: fix out-of-bounds accesses in clusterip_tg_check() 2018-01-31 15:00:33 +01:00
ipt_ECN.c
ipt_MASQUERADE.c netfilter: nat: add dependencies on conntrack module 2016-12-04 21:16:51 +01:00
ipt_REJECT.c netfilter: x_tables: move hook state into xt_action_param structure 2016-11-03 10:56:21 +01:00
ipt_rpfilter.c netfilter: rpfilter: fix incorrect loopback packet judgment 2017-01-16 14:23:01 +01:00
ipt_SYNPROXY.c netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook 2017-10-09 13:08:39 +02:00
iptable_filter.c
iptable_mangle.c netfilter: x_tables: simplify ip{6}table_mangle_hook() 2016-07-01 16:37:02 +02:00
iptable_nat.c netfilter: nf_hook_ops structs can be const 2017-07-31 19:10:44 +02:00
iptable_raw.c
iptable_security.c
Kconfig netfilter: move socket lookup infrastructure to nf_socket_ipv{4,6}.c 2016-11-01 20:50:31 +01:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nf_conntrack_l3proto_ipv4.c netfilter: on sockopt() acquire sock lock only in the required scope 2018-01-31 16:37:47 +01:00
nf_conntrack_proto_icmp.c netfilter: conntrack: don't cache nlattr_tuple_size result in nla_size 2017-11-06 16:48:38 +01:00
nf_defrag_ipv4.c netfilter: nf_hook_ops structs can be const 2017-07-31 19:10:44 +02:00
nf_dup_ipv4.c netfilter: kill the fake untracked conntrack objects 2017-04-15 11:47:57 +02:00
nf_log_arp.c netfilter: constify nf_loginfo structures 2017-08-02 14:25:59 +02:00
nf_log_ipv4.c netfilter: constify nf_loginfo structures 2017-08-02 14:25:59 +02:00
nf_nat_h323.c netfilter: nf_nat_h323: fix logical-not-parentheses warning 2017-08-24 18:48:05 +02:00
nf_nat_l3proto_ipv4.c ipv4: mark expected switch fall-throughs 2017-10-18 14:10:29 +01:00
nf_nat_masquerade_ipv4.c net: Replace NF_CT_ASSERT() with WARN_ON(). 2017-09-04 13:25:19 +02:00
nf_nat_pptp.c netfilter: pptp: attach nat extension when needed 2017-04-26 09:30:22 +02:00
nf_nat_proto_gre.c netfilter: gre: Use consistent GRE and PTTP header structure instead of the ones defined by netfilter 2016-09-07 10:36:52 +02:00
nf_nat_proto_icmp.c
nf_nat_snmp_basic.c netfilter: snmp: avoid stack size warning 2017-05-01 11:43:58 +02:00
nf_reject_ipv4.c netfilter: nf_reject_ipv4: Fix use-after-free in send_reset 2017-11-01 12:15:29 +01:00
nf_socket_ipv4.c netfilter: remove nf_ct_is_untracked 2017-04-15 11:51:33 +02:00
nf_tables_arp.c netfilter: nf_tables: only allow in/output for arp packets 2017-07-17 17:02:44 +02:00
nf_tables_ipv4.c netfilter: Add the missed return value check of nft_register_chain_type 2016-09-12 19:54:45 +02:00
nft_chain_nat_ipv4.c
nft_chain_route_ipv4.c netfilter: nft_chain_route: re-route before skb is queued to userspace 2016-09-06 18:02:37 +02:00
nft_dup_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
nft_fib_ipv4.c netfilter: nf_tables: fib: use skb_header_pointer 2017-07-31 19:01:39 +02:00
nft_masq_ipv4.c netfilter: nf_tables: fix mismatch in big-endian system 2017-03-13 13:30:28 +01:00
nft_redir_ipv4.c netfilter: nf_tables: fix mismatch in big-endian system 2017-03-13 13:30:28 +01:00
nft_reject_ipv4.c netfilter: nf_tables: use hook state from xt_action_param structure 2016-11-03 11:52:34 +01:00