linux/include/net/netfilter
Florian Westphal 7223ecd466 netfilter: nat: switch to new rhlist interface
I got offlist bug report about failing connections and high cpu usage.
This happens because we hit 'elasticity' checks in rhashtable that
refuses bucket list exceeding 16 entries.

The nat bysrc hash unfortunately needs to insert distinct objects that
share same key and are identical (have same source tuple), this cannot
be avoided.

Switch to the rhlist interface which is designed for this.

The nulls_base is removed here, I don't think its needed:

A (unlikely) false positive results in unneeded port clash resolution,
a false negative results in packet drop during conntrack confirmation,
when we try to insert the duplicate into main conntrack hash table.

Tested by adding multiple ip addresses to host, then adding
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

... and then creating multiple connections, from same source port but
different addresses:

for i in $(seq 2000 2032);do nc -p 1234 192.168.7.1 $i > /dev/null  & done

(all of these then get hashed to same bysource slot)

Then, to test that nat conflict resultion is working:

nc -s 10.0.0.1 -p 1234 192.168.7.1 2000
nc -s 10.0.0.2 -p 1234 192.168.7.1 2000

tcp  .. src=10.0.0.1 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1024 [ASSURED]
tcp  .. src=10.0.0.2 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1025 [ASSURED]
tcp  .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1234 [ASSURED]
tcp  .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2001 src=192.168.7.1 dst=192.168.7.10 sport=2001 dport=1234 [ASSURED]
[..]

-> nat altered source ports to 1024 and 1025, respectively.
This can also be confirmed on destination host which shows
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1024
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1025
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1234

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Fixes: 870190a9ec ("netfilter: nat: convert nat bysrc hash to rhashtable")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-24 14:43:34 +01:00
..
ipv4 ipv4: Push struct net down into nf_send_reset 2015-09-29 20:21:31 +02:00
ipv6 netfilter: ipv6: avoid nf_iterate recursion 2015-11-23 17:54:45 +01:00
br_netfilter.h netfilter: bridge: add and use br_nf_hook_thresh 2016-09-24 21:25:48 +02:00
nf_conntrack_acct.h netfilter: introduce nf_conn_acct structure 2013-11-03 21:48:49 +01:00
nf_conntrack_core.h netfilter: conntrack: simplify the code by using nf_conntrack_get_ht 2016-08-18 01:20:52 +02:00
nf_conntrack_ecache.h netfilter: don't rely on DYING bit to detect when destroy event was sent 2016-08-30 11:43:08 +02:00
nf_conntrack_expect.h netfilter: conntrack: use a single expectation table for all namespaces 2016-05-06 11:50:01 +02:00
nf_conntrack_extend.h netfilter: move nat hlist_head to nf_conn 2016-07-11 11:47:50 +02:00
nf_conntrack_helper.h netfilter: Add helper array register/unregister functions 2016-07-21 02:31:53 +02:00
nf_conntrack_l3proto.h netfilter: nf_conntrack: remove unused ctl_table_path member in nf_conntrack_l3proto 2016-09-09 16:17:58 +02:00
nf_conntrack_l4proto.h netfilter: remove ip_conntrack* sysctl compat code 2016-08-13 13:27:13 +02:00
nf_conntrack_labels.h netfilter: conntrack: avoid excess memory allocation 2016-10-27 18:29:02 +02:00
nf_conntrack_seqadj.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_synproxy.h netfilter: synproxy: Check oom when adding synproxy and seqadj ct extensions 2016-09-13 10:50:56 +02:00
nf_conntrack_timeout.h netfilter: cttimeout: add netns support 2015-12-14 12:48:58 +01:00
nf_conntrack_timestamp.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_conntrack_tuple.h
nf_conntrack_zones.h netfilter: move zone info into struct nf_conn 2016-06-23 13:33:12 +02:00
nf_conntrack.h netfilter: nat: switch to new rhlist interface 2016-11-24 14:43:34 +01:00
nf_dup_netdev.h netfilter: nf_tables: add packet duplication to the netdev family 2016-01-03 21:04:23 +01:00
nf_log.h netfilter: nft_log: complete NFTA_LOG_FLAGS attr support 2016-09-25 23:16:43 +02:00
nf_nat_core.h netfilter: Pass net into nf_xfrm_me_harder 2015-09-18 22:00:22 +02:00
nf_nat_helper.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_nat_l3proto.h netfilter: Pass priv instead of nf_hook_ops to netfilter hooks 2015-09-18 22:00:16 +02:00
nf_nat_l4proto.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00
nf_nat_redirect.h netfilter: combine IPv4 and IPv6 nf_nat_redirect code in one module 2014-11-27 13:08:42 +01:00
nf_nat.h netfilter: nat: convert nat bysrc hash to rhashtable 2016-07-11 12:07:57 +02:00
nf_queue.h netfilter: replace list_head with single linked list 2016-09-25 14:38:48 +02:00
nf_tables_core.h netfilter: nf_tables: add range expression 2016-09-25 23:16:42 +02:00
nf_tables_ipv4.h netfilter: merge fixup for "nf_tables_netdev: remove redundant ip_hdr assignment" 2016-10-05 20:25:48 -04:00
nf_tables_ipv6.h netfilter: nf_tables: don't drop IPv6 packets that cannot parse transport 2016-09-12 18:52:32 +02:00
nf_tables.h netfilter: nf_tables: fix type mismatch with error return from nft_parse_u32_check 2016-10-27 18:29:01 +02:00
nfnetlink_log.h
nft_dup.h netfilter: nf_tables: add nft_dup expression 2015-08-07 11:49:49 +02:00
nft_masq.h netfilter: nft_masq: support port range 2016-03-02 20:05:27 +01:00
nft_meta.h netfilter: nft_meta: improve the validity check of pkttype set expr 2016-08-25 13:12:03 +02:00
nft_redir.h netfilter: nf_tables: add new expression nft_redir 2014-10-27 22:49:39 +01:00
nft_reject.h netfilter: nft_reject: restrict to INPUT/FORWARD/OUTPUT 2016-08-25 12:55:34 +02:00
xt_rateest.h netfilter: Remove extern from function prototypes 2013-09-23 16:29:42 -04:00