linux/include/net
Pablo Neira Ayuso e53376bef2 netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt
With this patch, the conntrack refcount is initially set to zero and
it is bumped once it is added to any of the list, so we fulfill
Eric's golden rule which is that all released objects always have a
refcount that equals zero.

Andrey Vagin reports that nf_conntrack_free can't be called for a
conntrack with non-zero ref-counter, because it can race with
nf_conntrack_find_get().

A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
ref-counter says that this conntrack is used. So when we release
a conntrack with non-zero counter, we break this assumption.

CPU1                                    CPU2
____nf_conntrack_find()
                                        nf_ct_put()
                                         destroy_conntrack()
                                        ...
                                        init_conntrack
                                         __nf_conntrack_alloc (set use = 1)
atomic_inc_not_zero(&ct->use) (use = 2)
                                         if (!l4proto->new(ct, skb, dataoff, timeouts))
                                          nf_conntrack_free(ct); (use = 2 !!!)
                                        ...
                                        __nf_conntrack_alloc (set use = 1)
 if (!nf_ct_key_equal(h, tuple, zone))
  nf_ct_put(ct); (use = 0)
   destroy_conntrack()
                                        /* continue to work with CT */

After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU
race in nf_conntrack_find_get" another bug was triggered in
destroy_conntrack():

<4>[67096.759334] ------------[ cut here ]------------
<2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
...
<4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G         C ---------------    2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
<4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>]  [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack]
<4>[67096.760255] Call Trace:
<4>[67096.760255]  [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30
<4>[67096.760255]  [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
<4>[67096.760255]  [<ffffffff81484419>] nf_iterate+0x69/0xb0
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814845d4>] nf_hook_slow+0x74/0x110
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910
<4>[67096.760255]  [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0
<4>[67096.760255]  [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140
<4>[67096.760255]  [<ffffffff81444f97>] sock_sendmsg+0x117/0x140
<4>[67096.760255]  [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60
<4>[67096.760255]  [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20
<4>[67096.760255]  [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40
<4>[67096.760255]  [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814457c9>] sys_sendto+0x139/0x190
<4>[67096.760255]  [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200
<4>[67096.760255]  [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290
<4>[67096.760255]  [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210
<4>[67096.760255]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5

I have reused the original title for the RFC patch that Andrey posted and
most of the original patch description.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Andrew Vagin <avagin@parallels.com>
Cc: Florian Westphal <fw@strlen.de>
Reported-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-02-05 17:46:06 +01:00
..
9p for-linus-3.12-merge minor 9p fixes and tweaks for 3.12 merge window 2013-09-11 12:34:13 -07:00
bluetooth Bluetooth: Add quirk for disabling Delete Stored Link Key command 2014-01-04 20:10:40 +02:00
caif caif_hsi.h: Remove extern from function prototypes 2013-09-23 16:29:41 -04:00
irda include/net/: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
iucv af_iucv: fix recvmsg by replacing skb_pull() function 2013-04-08 17:16:57 -04:00
netfilter netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt 2014-02-05 17:46:06 +01:00
netns ipv6: add flowlabel_consistency sysctl 2014-01-19 17:12:31 -08:00
nfc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2014-01-17 14:43:17 -05:00
phonet net: remove my future former mail address 2012-06-17 16:29:38 -07:00
sctp sctp: remove macros sctp_bh_[un]lock_sock 2014-01-21 18:41:36 -08:00
tc_act include/net/: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
act_api.h net_sched: act: export tcf_hash_search() instead of tcf_hash_lookup() 2014-01-21 14:43:16 -08:00
addrconf.h ipv6: enable anycast addresses as source addresses for datagrams 2014-01-22 21:57:05 -08:00
af_ieee802154.h
af_rxrpc.h af_rxrpc.h: Remove extern from function prototypes 2013-07-31 17:50:01 -07:00
af_unix.h af_unix: improve STREAM behavior with fragmented memory 2013-08-10 01:16:44 -07:00
af_vsock.h VSOCK: Move af_vsock.h and vsock_addr.h to include/net 2013-07-27 22:14:06 -07:00
ah.h
arp.h arp: make arp_invalidate static 2013-12-28 17:02:46 -05:00
atmclip.h
ax25.h ax25.h: Remove extern from function prototypes 2013-07-31 17:50:02 -07:00
ax88796.h
busy_poll.h sched, net: Fixup busy_loop_us_clock() 2014-01-13 17:39:11 +01:00
cfg80211-wext.h
cfg80211.h cfg80211: Add a function to get the number of supported channels 2014-01-09 14:24:24 +01:00
checksum.h net: checksum: fix warning in skb_checksum 2013-11-04 15:27:08 -05:00
cipso_ipv4.h cipso: cleanup cipso_v4_translate() when !CONFIG_NETLABEL 2013-12-10 17:56:54 -05:00
cls_cgroup.h net: net_cls: move cgroupfs classid handling into core 2014-01-03 23:41:41 +01:00
codel.h net: introduce reciprocal_scale helper and convert users 2014-01-21 23:17:20 -08:00
compat.h compat.h: Remove extern from function prototypes 2013-09-20 14:49:32 -04:00
datalink.h
dcbevent.h include/net/: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
dcbnl.h include/net/: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
dn_dev.h dn_dev: add support for IFA_FLAGS nl attribute 2013-12-10 21:50:00 -05:00
dn_fib.h decnet (dn*.h): Remove extern from function prototypes 2013-09-20 14:49:32 -04:00
dn_neigh.h decnet (dn*.h): Remove extern from function prototypes 2013-09-20 14:49:32 -04:00
dn_nsp.h decnet (dn*.h): Remove extern from function prototypes 2013-09-20 14:49:32 -04:00
dn_route.h decnet (dn*.h): Remove extern from function prototypes 2013-09-20 14:49:32 -04:00
dn.h decnet (dn*.h): Remove extern from function prototypes 2013-09-20 14:49:32 -04:00
dsa.h
dsfield.h ipv6: Optimize ipv6_change_dsfield(). 2013-01-09 23:59:53 -08:00
dst_ops.h net: Fix warnings in dst_ops.h 2012-07-19 10:43:03 -07:00
dst.h net: Add utility functions to clear rxhash 2013-12-17 16:36:21 -05:00
esp.h net: move pskb_put() to core code 2013-11-07 19:28:58 -05:00
ethoc.h
fib_rules.h fib_rules.h: Remove extern from function prototypes 2013-09-20 14:49:33 -04:00
firewire.h firewire net, ipv4 arp: Extend hardware address and remove driver-level packet inspection. 2013-03-26 12:32:13 -04:00
flow_keys.h flow_dissector: factor out the ports extraction in skb_flow_get_ports 2013-10-03 15:36:37 -04:00
flow.h net: Remove FLOWI_FLAG_CAN_SLEEP 2013-12-06 07:24:39 +01:00
garp.h garp.h: Remove extern from function prototypes 2013-09-20 14:49:33 -04:00
gen_stats.h gen_stats.h: Remove extern from function prototypes 2013-09-20 14:49:33 -04:00
genetlink.h genl: Add genlmsg_new_unicast() for unicast message allocation 2014-01-06 15:51:53 -08:00
gre.h gre_offload: statically build GRE offloading support 2014-01-06 20:28:34 -05:00
gro_cells.h gro: Fix kcalloc argument order 2013-01-27 22:46:33 -05:00
icmp.h icmp.h: Remove extern from function prototypes 2013-09-20 14:49:33 -04:00
ieee80211_radiotap.h mac80211: add radiotap flag and handling for 5/10 MHz 2013-07-16 09:58:05 +03:00
ieee802154_netdev.h ieee802154/nl-mac.c: make some MLME operations optional 2013-04-08 12:00:16 -04:00
ieee802154.h
if_inet6.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-18 00:55:41 -08:00
inet6_connection_sock.h inet*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
inet6_hashtables.h ipv6: split inet6_ehashfn to hash functions per compilation unit 2013-10-19 19:45:34 -04:00
inet_common.h inet*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
inet_connection_sock.h inet*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
inet_ecn.h net: Correct comparisons and calculations using skb->tail and skb-transport_header 2013-05-28 23:49:07 -07:00
inet_frag.h inet: remove old fragmentation hash initializing 2013-10-23 17:01:41 -04:00
inet_hashtables.h tcp/dccp: remove twchain 2013-10-08 23:19:24 -04:00
inet_sock.h inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once 2013-10-19 19:45:35 -04:00
inet_timewait_sock.h ipv6: tcp: fix flowlabel value in ACK messages send from TIME_WAIT 2014-01-17 17:56:33 -08:00
inetpeer.h ipv4: remove unused function 2013-12-28 17:03:20 -05:00
ip6_checksum.h net: fix build errors if ipv6 is disabled 2013-10-09 13:04:03 -04:00
ip6_fib.h ipv6: remove prune parameter for fib6_clean_all 2014-01-02 03:30:35 -05:00
ip6_route.h IPv6: add the option to use anycast addresses as source addresses in echo reply 2014-01-07 15:51:39 -05:00
ip6_tunnel.h net: unify the pcpu_tstats and br_cpu_netstats as one 2014-01-04 20:10:24 -05:00
ip_fib.h ip*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
ip_tunnels.h ipv4: fix a dst leak in tunnels 2014-01-17 18:36:39 -08:00
ip_vs.h netfilter: push reasm skb through instead of original frag skbs 2013-11-11 00:19:35 -05:00
ip.h ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams 2014-01-19 19:53:18 -08:00
ipcomp.h
ipconfig.h
ipv6.h ipv6: protect protocols not handling ipv4 from v4 connection/bind attempts 2014-01-21 16:59:19 -08:00
ipx.h ipx.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
iw_handler.h iw_handler.h: Remove extern from function prototypes 2013-09-21 14:01:39 -04:00
lapb.h lapb.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
lib80211.h hostap: Don't use create_proc_read_entry() 2013-04-29 15:41:56 -04:00
llc_c_ac.h llc*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
llc_c_ev.h llc*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
llc_c_st.h
llc_conn.h llc*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
llc_if.h llc*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
llc_pdu.h net: llc: fix order of evaluation in llc_conn_ac_inc_vr_by_1 2014-01-01 22:22:43 -05:00
llc_s_ac.h llc*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
llc_s_ev.h llc*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
llc_s_st.h
llc_sap.h llc*.h: Remove extern from function prototypes 2013-09-21 14:01:38 -04:00
llc.h llc: make lock static 2014-01-03 20:56:48 -05:00
mac80211.h mac80211: handle MMPDUs at EOSP correctly 2014-01-10 09:50:02 +01:00
mac802154.h mac802154: correct a typo in ieee802154_alloc_device() prototype 2013-10-21 18:56:23 -04:00
mip6.h include/net/: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
mld.h net: ipv6: mld: get rid of MLDV2_MRC and simplify calculation 2013-09-04 14:53:20 -04:00
mrp.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-10-01 17:06:14 -04:00
ndisc.h ndisc.h: Remove extern from function prototypes 2013-09-21 14:01:39 -04:00
neighbour.h neigh: use NEIGH_VAR_INIT in ndo_neigh_setup functions. 2014-01-16 11:31:58 -08:00
net_namespace.h netfilter: nf_tables: complete net namespace support 2013-10-14 18:00:59 +02:00
net_ratelimit.h
netdma.h
netevent.h netevent/netlink.h: Remove extern from function prototypes 2013-09-21 14:01:39 -04:00
netlabel.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
netlink.h netevent/netlink.h: Remove extern from function prototypes 2013-09-21 14:01:39 -04:00
netprio_cgroup.h net: netprio: rename config to be more consistent with cgroup configs 2014-01-03 23:41:42 +01:00
netrom.h netrom.h: Remove extern from function prototypes 2013-09-21 14:01:39 -04:00
nexthop.h
nl802154.h
p8022.h p8022.h: Remove extern from function prototypes 2013-09-21 14:01:39 -04:00
ping.h ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams 2014-01-19 19:53:18 -08:00
pkt_cls.h net_sched: optimize tcf_match_indev() 2014-01-13 11:50:15 -08:00
pkt_sched.h pkt_sched: give visibility to mq slave qdiscs 2013-12-09 19:54:47 -05:00
protocol.h net: Add GRO support for UDP encapsulating protocols 2014-01-21 18:05:04 -08:00
psnap.h psnap.h: Remove extern from function prototypes 2013-09-23 01:51:08 -04:00
raw.h raw/rawv6.h: Remove extern from function prototypes 2013-09-23 01:51:08 -04:00
rawv6.h raw/rawv6.h: Remove extern from function prototypes 2013-09-23 01:51:08 -04:00
red.h reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
regulatory.h cfg80211: make regulatory_hint() remove REGULATORY_CUSTOM_REG 2014-01-13 14:46:58 -05:00
request_sock.h inet: includes a sock_common in request_sock 2013-10-10 00:08:07 -04:00
rose.h rose.h: Remove extern from function prototypes 2013-09-23 01:51:08 -04:00
route.h ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing 2014-01-13 11:22:54 -08:00
rtnetlink.h rtnetlink: provide api for getting and setting slave info 2014-01-22 21:57:05 -08:00
sch_generic.h net_sched: add struct net pointer to tcf_proto_ops->dump 2014-01-13 11:50:14 -08:00
scm.h scm.h: Remove extern from function prototypes 2013-09-23 01:51:09 -04:00
secure_seq.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-10-01 17:06:14 -04:00
slhc_vj.h
snmp.h net: avoid reloads in SNMP_UPD_PO_STATS 2012-08-06 13:40:47 -07:00
sock.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2014-01-05 20:18:50 -05:00
Space.h drivers: net: Include new header file in sbni.c 2013-12-19 18:51:20 -05:00
stp.h stp.h: Remove extern from function prototypes 2013-09-23 01:51:09 -04:00
tcp_memcontrol.h tcp_memcontrol: Kill struct tcp_memcontrol 2013-10-21 18:43:02 -04:00
tcp_states.h
tcp.h tcp: make local functions static 2013-12-29 16:34:24 -05:00
timewait_sock.h [PATCH] tcp: Cache inetpeer in timewait socket, and only when necessary. 2012-06-09 14:56:12 -07:00
transp_v6.h ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams 2014-01-19 19:53:18 -08:00
udp.h udp: Remove unnecessary semicolon from do{}while (0) macro 2013-11-07 02:14:33 -05:00
udplite.h udplite.h: Remove extern from function prototypes 2013-09-23 16:29:40 -04:00
vsock_addr.h VSOCK: Move af_vsock.h and vsock_addr.h to include/net 2013-07-27 22:14:06 -07:00
vxlan.h net: Add GRO support for vxlan traffic 2014-01-21 18:05:04 -08:00
wext.h wext.h: Remove extern from function prototypes 2013-09-23 16:29:40 -04:00
wimax.h wimax.h: Remove extern from function prototypes 2013-09-23 16:29:41 -04:00
wpan-phy.h mac802154: monitor device support 2012-05-16 15:17:08 -04:00
x25.h x25.h: Remove extern from function prototypes 2013-09-23 16:29:41 -04:00
x25device.h
xfrm.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00