linux/include/net/netfilter
Pablo Neira Ayuso e53376bef2 netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt
With this patch, the conntrack refcount is initially set to zero and
it is bumped once it is added to any of the list, so we fulfill
Eric's golden rule which is that all released objects always have a
refcount that equals zero.

Andrey Vagin reports that nf_conntrack_free can't be called for a
conntrack with non-zero ref-counter, because it can race with
nf_conntrack_find_get().

A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
ref-counter says that this conntrack is used. So when we release
a conntrack with non-zero counter, we break this assumption.

CPU1                                    CPU2
____nf_conntrack_find()
                                        nf_ct_put()
                                         destroy_conntrack()
                                        ...
                                        init_conntrack
                                         __nf_conntrack_alloc (set use = 1)
atomic_inc_not_zero(&ct->use) (use = 2)
                                         if (!l4proto->new(ct, skb, dataoff, timeouts))
                                          nf_conntrack_free(ct); (use = 2 !!!)
                                        ...
                                        __nf_conntrack_alloc (set use = 1)
 if (!nf_ct_key_equal(h, tuple, zone))
  nf_ct_put(ct); (use = 0)
   destroy_conntrack()
                                        /* continue to work with CT */

After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU
race in nf_conntrack_find_get" another bug was triggered in
destroy_conntrack():

<4>[67096.759334] ------------[ cut here ]------------
<2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
...
<4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G         C ---------------    2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
<4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>]  [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack]
<4>[67096.760255] Call Trace:
<4>[67096.760255]  [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30
<4>[67096.760255]  [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
<4>[67096.760255]  [<ffffffff81484419>] nf_iterate+0x69/0xb0
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814845d4>] nf_hook_slow+0x74/0x110
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910
<4>[67096.760255]  [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0
<4>[67096.760255]  [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140
<4>[67096.760255]  [<ffffffff81444f97>] sock_sendmsg+0x117/0x140
<4>[67096.760255]  [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60
<4>[67096.760255]  [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20
<4>[67096.760255]  [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40
<4>[67096.760255]  [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814457c9>] sys_sendto+0x139/0x190
<4>[67096.760255]  [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200
<4>[67096.760255]  [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290
<4>[67096.760255]  [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210
<4>[67096.760255]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5

I have reused the original title for the RFC patch that Andrey posted and
most of the original patch description.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Andrew Vagin <avagin@parallels.com>
Cc: Florian Westphal <fw@strlen.de>
Reported-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-02-05 17:46:06 +01:00
..
ipv4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables 2014-01-06 13:29:30 -05:00
ipv6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables 2014-01-06 13:29:30 -05:00
nf_conntrack_acct.h netfilter: introduce nf_conn_acct structure 2013-11-03 21:48:49 +01:00
nf_conntrack_core.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_ecache.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_expect.h netfilter: rename netlink related "pid" variables to "portid" 2013-04-19 14:58:36 -04:00
nf_conntrack_extend.h netfilter: introduce nf_conn_acct structure 2013-11-03 21:48:49 +01:00
nf_conntrack_helper.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_l3proto.h netfilter: nf_conntrack: remove dead code 2014-01-03 23:41:37 +01:00
nf_conntrack_l4proto.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_labels.h netfilter: nf_ct_labels: move initialization out of pernet_operations 2013-01-23 12:56:23 +01:00
nf_conntrack_seqadj.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_synproxy.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-10-01 17:06:14 -04:00
nf_conntrack_timeout.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_timestamp.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_tuple.h netfilter: nf_nat: export NAT definitions to userspace 2011-12-23 14:36:43 +01:00
nf_conntrack_zones.h
nf_conntrack.h netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt 2014-02-05 17:46:06 +01:00
nf_log.h netfilter: log: netns NULL ptr bug when calling from conntrack 2013-05-15 14:11:07 +02:00
nf_nat_core.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_nat_helper.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_nat_l3proto.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_nat_l4proto.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_nat.h netfilter: nf_nat: move alloc_null_binding to nf_nat_core.c 2013-10-14 11:29:39 +02:00
nf_queue.h netfilter: xt_NFQUEUE: separate reusable code 2013-12-07 23:20:45 +01:00
nf_tables_core.h nf_tables*.h: Remove extern from function prototypes 2013-10-21 17:19:06 -04:00
nf_tables_ipv4.h netfilter: nft_meta: add l4proto support 2014-01-07 23:57:31 +01:00
nf_tables_ipv6.h netfilter: nft_meta: add l4proto support 2014-01-07 23:57:31 +01:00
nf_tables.h netfilter: nf_tables: rename nft_do_chain_pktinfo() to nft_do_chain() 2014-01-09 20:17:16 +01:00
nfnetlink_log.h netfilter: log: netns NULL ptr bug when calling from conntrack 2013-05-15 14:11:07 +02:00
nfnetlink_queue.h netfilter: nfnetlink_queue: allow to attach expectations to conntracks 2013-08-13 16:32:10 +02:00
xt_log.h netfilter: xt_LOG: don't use xchg() for simple assignment 2012-03-26 14:00:28 +02:00
xt_rateest.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00