linux/net
Pablo Neira Ayuso 0838aa7fcf netfilter: fix netns dependencies with conntrack templates
Quoting Daniel Borkmann:

"When adding connection tracking template rules to a netns, f.e. to
configure netfilter zones, the kernel will endlessly busy-loop as soon
as we try to delete the given netns in case there's at least one
template present, which is problematic i.e. if there is such bravery that
the priviledged user inside the netns is assumed untrusted.

Minimal example:

  ip netns add foo
  ip netns exec foo iptables -t raw -A PREROUTING -d 1.2.3.4 -j CT --zone 1
  ip netns del foo

What happens is that when nf_ct_iterate_cleanup() is being called from
nf_conntrack_cleanup_net_list() for a provided netns, we always end up
with a net->ct.count > 0 and thus jump back to i_see_dead_people. We
don't get a soft-lockup as we still have a schedule() point, but the
serving CPU spins on 100% from that point onwards.

Since templates are normally allocated with nf_conntrack_alloc(), we
also bump net->ct.count. The issue why they are not yet nf_ct_put() is
because the per netns .exit() handler from x_tables (which would eventually
invoke xt_CT's xt_ct_tg_destroy() that drops reference on info->ct) is
called in the dependency chain at a *later* point in time than the per
netns .exit() handler for the connection tracker.

This is clearly a chicken'n'egg problem: after the connection tracker
.exit() handler, we've teared down all the connection tracking
infrastructure already, so rightfully, xt_ct_tg_destroy() cannot be
invoked at a later point in time during the netns cleanup, as that would
lead to a use-after-free. At the same time, we cannot make x_tables depend
on the connection tracker module, so that the xt_ct_tg_destroy() would
be invoked earlier in the cleanup chain."

Daniel confirms this has to do with the order in which modules are loaded or
having compiled nf_conntrack as modules while x_tables built-in. So we have no
guarantees regarding the order in which netns callbacks are executed.

Fix this by allocating the templates through kmalloc() from the respective
SYNPROXY and CT targets, so they don't depend on the conntrack kmem cache.
Then, release then via nf_ct_tmpl_free() from destroy_conntrack(). This branch
is marked as unlikely since conntrack templates are rarely allocated and only
from the configuration plane path.

Note that templates are not kept in any list to avoid further dependencies with
nf_conntrack anymore, thus, the tmpl larval list is removed.

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
2015-07-20 14:58:19 +02:00
..
6lowpan 6lowpan: nhc: add other known rfc6282 compressions 2015-02-14 23:08:44 +01:00
9p IB/core: Change ib_create_cq to use struct ib_cq_init_attr 2015-06-12 14:49:10 -04:00
802 net: Kill dev_rebuild_header 2015-03-02 16:43:41 -05:00
8021q vlan: Add GRO support for non hardware accelerated vlan 2015-06-01 16:50:52 -07:00
appletalk net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
atm net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
ax25 ax25: Stop using sock->sk_protinfo. 2015-06-28 16:55:44 -07:00
batman-adv batman-adv: change the MAC of each VLAN upon ndo_set_mac_address 2015-06-07 17:07:20 +02:00
bluetooth Bluetooth: Reinitialize the list after deletion for session user list 2015-06-30 21:46:19 +02:00
bridge bridge: fix potential crash in __netdev_pick_tx() 2015-07-09 22:48:42 -07:00
caif Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-01 22:51:30 -07:00
can Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-24 02:58:51 -07:00
ceph Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-01 22:51:30 -07:00
core net: pktgen: kill the "Wait for kthread_stop" code in pktgen_thread_worker() 2015-07-09 15:05:32 -07:00
dcb net/dcb: Add IEEE QCN attribute 2015-03-06 21:50:02 -05:00
dccp sock_diag: specify info_size per inet protocol 2015-06-15 19:49:22 -07:00
decnet net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
dns_resolver
dsa dsa: fix promiscuity leak on slave dev open error 2015-06-28 16:57:08 -07:00
ethernet net: Add full IPv6 addresses to flow_keys 2015-06-04 15:44:30 -07:00
hsr net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface. 2015-03-01 13:40:23 -05:00
ieee802154 nl802154: export supported commands 2015-06-04 12:27:15 +02:00
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2015-07-09 00:03:10 -07:00
ipv6 ipv6: Make MLD packets to only be processed locally 2015-07-03 09:52:38 -07:00
ipx net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
irda irda: use msecs_to_jiffies for conversion to jiffies 2015-05-25 17:46:21 -04:00
iucv net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
l2tp net: Modify sk_alloc to not reference count the netns of kernel sockets. 2015-05-11 10:50:18 -04:00
lapb
llc net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
mac80211 Minor merge needed, due to function move. 2015-07-01 10:49:25 -07:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
mpls Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-13 23:56:52 -07:00
netfilter netfilter: fix netns dependencies with conntrack templates 2015-07-20 14:58:19 +02:00
netlabel netlink: implement nla_put_in_addr and nla_put_in6_addr 2015-03-31 13:58:35 -04:00
netlink netlink: Delete an unnecessary check before the function call "module_put" 2015-07-03 09:27:43 -07:00
netrom netfilter: Remove spurios included of netfilter.h 2015-06-18 21:14:32 +02:00
nfc NFC: nci: fix mistake in uart generic driver 2015-06-15 18:10:37 +02:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-08 20:06:56 -07:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-24 02:58:51 -07:00
phonet net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
rds net-RDS: Delete an unnecessary check before the function call "module_put" 2015-07-03 09:27:42 -07:00
rfkill net: rfkill: gpio: make better use of gpiod API 2015-05-29 13:13:45 +02:00
rose Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-24 02:58:51 -07:00
rxrpc net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
sched net: sched: flower fix typo 2015-06-25 05:23:02 -07:00
sctp sctp: Fix race between OOTB responce and route removal 2015-06-29 09:28:42 -07:00
sunrpc Minor merge needed, due to function move. 2015-07-01 10:49:25 -07:00
switchdev net: switchdev: ignore unsupported bridge flags 2015-06-24 01:06:34 -07:00
tipc net/tipc: initialize security state for new connection socket 2015-07-08 16:08:23 -07:00
unix net/unix: support SCM_SECURITY for stream sockets 2015-06-10 22:49:20 -07:00
vmw_vsock net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-13 23:56:52 -07:00
x25 net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
compat.c net: switch importing msghdr from userland to {compat_,}import_iovec() 2015-04-09 00:02:26 -04:00
Kconfig net: add CONFIG_NET_INGRESS to enable ingress filtering 2015-05-14 01:10:05 -04:00
Makefile mpls: Refactor how the mpls module is built 2015-03-04 00:26:06 -05:00
socket.c net: Add a struct net parameter to sock_create_kern 2015-05-11 10:50:17 -04:00
sysctl_net.c