Commit Graph

7598 Commits

Author SHA1 Message Date
Alex Gartrell
6cff339bbd ipvs: Add destination address family to netlink interface
This is necessary to support heterogeneous pools.  For example, if you have
an ipv6 addressed network, you'll want to be able to forward ipv4 traffic
into it.

This patch enforces that destination address family is the same as service
family, as none of the forwarding mechanisms support anything else.

For the old setsockopt mechanism, we simply set the dest address family to
AF_INET as we do with the service.

Signed-off-by: Alex Gartrell <agartrell@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2014-09-16 09:03:33 +09:00
Pablo Neira Ayuso
67981fefb2 netfilter: fix compilation of masquerading without IP_NF_TARGET_MASQUERADE
CONFIG_NF_NAT_MASQUERADE_IPV6=m
 # CONFIG_IP6_NF_TARGET_MASQUERADE is not set

results in:

net/ipv6/netfilter/nf_nat_masquerade_ipv6.c: In function ‘nf_nat_masquerade_ipv6’:
net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:41:14: error: ‘struct nf_conn_nat’ has no member named ‘masq_index’
  nfct_nat(ct)->masq_index = out->ifindex;
              ^
net/ipv6/netfilter/nf_nat_masquerade_ipv6.c: In function ‘device_cmp’:
net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:61:12: error: ‘const struct nf_conn_nat’ has no member named ‘masq_index’
  return nat->masq_index == (int)(long)ifindex;
            ^
net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:62:1: warning: control
reaches end of non-void function [-Wreturn-type]
 }
 ^
make[3]: *** [net/ipv6/netfilter/nf_nat_masquerade_ipv6.o] Error 1

Fix this by using the new NF_NAT_MASQUERADE_IPV4 and _IPV6 symbols
in include/net/netfilter/nf_nat.h.

Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-11 17:02:45 +02:00
David S. Miller
0aac383353 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
nf-next pull request

The following patchset contains Netfilter/IPVS updates for your
net-next tree. Regarding nf_tables, most updates focus on consolidating
the NAT infrastructure and adding support for masquerading. More
specifically, they are:

1) use __u8 instead of u_int8_t in arptables header, from
   Mike Frysinger.

2) Add support to match by skb->pkttype to the meta expression, from
   Ana Rey.

3) Add support to match by cpu to the meta expression, also from
   Ana Rey.

4) A smatch warning about IPSET_ATTR_MARKMASK validation, patch from
   Vytas Dauksa.

5) Fix netnet and netportnet hash types the range support for IPv4,
   from Sergey Popovich.

6) Fix missing-field-initializer warnings resolved, from Mark Rustad.

7) Dan Carperter reported possible integer overflows in ipset, from
   Jozsef Kadlecsick.

8) Filter out accounting objects in nfacct by type, so you can
   selectively reset quotas, from Alexey Perevalov.

9) Move specific NAT IPv4 functions to the core so x_tables and
   nf_tables can share the same NAT IPv4 engine.

10) Use the new NAT IPv4 functions from nft_chain_nat_ipv4.

11) Move specific NAT IPv6 functions to the core so x_tables and
    nf_tables can share the same NAT IPv4 engine.

12) Use the new NAT IPv6 functions from nft_chain_nat_ipv6.

13) Refactor code to add nft_delrule(), which can be reused in the
    enhancement of the NFT_MSG_DELTABLE to remove a table and its
    content, from Arturo Borrero.

14) Add a helper function to unregister chain hooks, from
    Arturo Borrero.

15) A cleanup to rename to nft_delrule_by_chain for consistency with
    the new nft_*() functions, also from Arturo.

16) Add support to match devgroup to the meta expression, from Ana Rey.

17) Reduce stack usage for IPVS socket option, from Julian Anastasov.

18) Remove unnecessary textsearch state initialization in xt_string,
    from Bojan Prtvar.

19) Add several helper functions to nf_tables, more work to prepare
    the enhancement of NFT_MSG_DELTABLE, again from Arturo Borrero.

20) Enhance NFT_MSG_DELTABLE to delete a table and its content, from
    Arturo Borrero.

21) Support NAT flags in the nat expression to indicate the flavour,
    eg. random fully, from Arturo.

22) Add missing audit code to ebtables when replacing tables, from
    Nicolas Dichtel.

23) Generalize the IPv4 masquerading code to allow its re-use from
    nf_tables, from Arturo.

24) Generalize the IPv6 masquerading code, also from Arturo.

25) Add the new masq expression to support IPv4/IPv6 masquerading
    from nf_tables, also from Arturo.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-10 12:46:32 -07:00
Willem de Bruijn
67cc0d4077 net-timestamp: optimize sock_tx_timestamp default path
Few packets have timestamping enabled. Exit sock_tx_timestamp quickly
in this common case.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-09 17:34:41 -07:00
Vincent Bernat
49a601589c net/ipv4: bind ip_nonlocal_bind to current netns
net.ipv4.ip_nonlocal_bind sysctl was global to all network
namespaces. This patch allows to set a different value for each
network namespace.

Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-09 11:27:09 -07:00
Arturo Borrero
9ba1f726be netfilter: nf_tables: add new nft_masq expression
The nft_masq expression is intended to perform NAT in the masquerade flavour.

We decided to have the masquerade functionality in a separated expression other
than nft_nat.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09 16:31:30 +02:00
Arturo Borrero
be6b635cd6 netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables
Let's refactor the code so we can reach the masquerade functionality
from outside the xt context (ie. nftables).

The patch includes the addition of an atomic counter to the masquerade
notifier: the stuff to be done by the notifier is the same for xt and
nftables. Therefore, only one notification handler is needed.

This factorization only involves IPv6; a similar patch exists to
handle IPv4.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09 16:31:29 +02:00
Arturo Borrero
8dd33cc93e netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables
Let's refactor the code so we can reach the masquerade functionality
from outside the xt context (ie. nftables).

The patch includes the addition of an atomic counter to the masquerade
notifier: the stuff to be done by the notifier is the same for xt and
nftables. Therefore, only one notification handler is needed.

This factorization only involves IPv4; a similar patch follows to
handle IPv6.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09 16:31:29 +02:00
Pablo Neira Ayuso
2a5538e9aa netfilter: nat: move specific NAT IPv6 to core
Move the specific NAT IPv6 core functions that are called from the
hooks from ip6table_nat.c to nf_nat_l3proto_ipv6.c. This prepares the
ground to allow iptables and nft to use the same NAT engine code that
comes in a follow up patch.

This also renames nf_nat_ipv6_fn to nft_nat_ipv6_fn in
net/ipv6/netfilter/nft_chain_nat_ipv6.c to avoid a compilation breakage.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-09 16:30:00 +02:00
David S. Miller
5b4c314575 Merge tag 'master-2014-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-09-08

Please pull this batch of updates intended for the 3.18 stream...

For the mac80211 bits, Johannes says:

"Not that much content this time. Some RCU cleanups, crypto
performance improvements, and various patches all over,
rather than listing them one might as well look into the
git log instead."

For the Bluetooth bits, Gustavo says:

"The changes consists of:

        - Coding style fixes to HCI drivers
        - Corrupted ack value fix for the H5 HCI driver
        - A couple of Enhanced L2CAP fixes
        - Conversion of SMP code to use common L2CAP channel API
        - Page scan optimizations when using the kernel-side whitelist
        - Various mac802154 and and ieee802154 6lowpan cleanups
        - One new Atheros USB ID"

For the iwlwifi bits, Emmanuel says:

"We have a new big thing coming up which is called Dynamic Queue
Allocation (or DQA).  This is a completely new way to work with the
Tx queues and it requires major refactoring.  This is being done by
Johannes and Avri.  Besides this, Johannes disables U-APSD by default
because of APs that would disable A-MPDU if the association supports
U-ASPD.  Luca contributed to the power area which he was cleaning
up on the way while working on CSA.  A few more random things here
and there."

For the Atheros bits, Kalle says:

"For ath6kl we had two small fixes and a new SDIO device id.

For ath10k the bigger changes are:

 * support for new firmware version 10.2 (Michal)

 * spectral scan support (Simon, Sven & Mathias)

 * export a firmware crash dump file (Ben & me)

 * cleaning up of pci.c (Michal)

 * print pci id in all messages, which causes most of the churn (Michal)"

Beyond that, we have the usual collection of various updates to ath9k,
b43, mwifiex, and wil6210, as well as a few other bits here and there.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-08 16:43:58 -07:00
Willem de Bruijn
a7f26b7e1e inet: remove dead inetpeer sequence code
inetpeer sequence numbers are no longer incremented, so no need to
check and flush the tree. The function that increments the sequence
number was already dead code and removed in in "ipv4: remove unused
function" (068a6e18). Remove the code that checks for a change, too.

Verifying that v4_seq and v6_seq are never incremented and thus that
flush_check compares bp->flush_seq to 0 is trivial.

The second part of the change removes flush_check completely even
though bp->flush_seq is exactly !0 once, at initialization. This
change is correct because the time this branch is true is when
bp->root == peer_avl_empty_rcu, in which the branch and
inetpeer_invalidate_tree are a NOOP.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-08 16:42:42 -07:00
John W. Linville
61a3d4f9d5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2014-09-08 11:14:56 -04:00
David S. Miller
eb84d6b604 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-09-07 21:41:53 -07:00
David S. Miller
45ce829dd0 Merge tag 'master-2014-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
pull request: wireless 2014-09-05

Please pull this batch of fixes intended for the 3.17 stream...

For the mac80211 bits, Johannes says:

"Here are a few fixes for mac80211. One has been discussed for a while
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
shouldn't be necessary but since many places treat it as a string we
couldn't move to just sending two bytes.

In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
fix for the recently introduced RX aggregation offload, a revert for
a broken patch (that luckily didn't really cause any harm) and a small
fix for alignment in debugfs."

For the iwlwifi bits, Emmanuel says:

"I revert a patch that disabled CTS to self in dvm because users
reported issues. The revert is CCed to stable since the offending
patch was sent to stable too. I also bump the firmware API versions
since a new firmware is coming up. On top of that, Marcel fixes a
bug I introduced while fixing a bug in our Kconfig file."

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-07 16:11:10 -07:00
Eric Dumazet
7faee5c0d5 tcp: remove TCP_SKB_CB(skb)->when
After commit 740b0f1841 ("tcp: switch rtt estimations to usec resolution"),
we no longer need to maintain timestamps in two different fields.

TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:49:33 -07:00
Eric Dumazet
04317dafd1 tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn
TCP_SKB_CB(skb)->when has different meaning in output and input paths.

In output path, it contains a timestamp.
In input path, it contains an ISN, chosen by tcp_timewait_state_process()

Lets add a different name to ease code comprehension.

Note that 'when' field will disappear in following patch,
as skb_mstamp already contains timestamp, the anonymous
union will promptly disappear as well.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:49:33 -07:00
Alexander Duyck
56193d1bce net: Add function for parsing the header length out of linear ethernet frames
This patch updates some of the flow_dissector api so that it can be used to
parse the length of ethernet buffers stored in fragments.  Most of the
changes needed were to __skb_get_poff as it needed to be updated to support
sending a linear buffer instead of a skb.

I have split __skb_get_poff into two functions, the first is skb_get_poff
and it retains the functionality of the original __skb_get_poff.  The other
function is __skb_get_poff which now works much like __skb_flow_dissect in
relation to skb_flow_dissect in that it provides the same functionality but
works with just a data buffer and hlen instead of needing an skb.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:47:02 -07:00
Alexander Duyck
82eabd9eb2 net: merge cases where sock_efree and sock_edemux are the same function
Since sock_efree and sock_demux are essentially the same code for non-TCP
sockets and the case where CONFIG_INET is not defined we can combine the
code or replace the call to sock_edemux in several spots.  As a result we
can avoid a bit of unnecessary code or code duplication.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:43:45 -07:00
Alexander Duyck
62bccb8cdb net-timestamp: Make the clone operation stand-alone from phy timestamping
The phy timestamping takes a different path than the regular timestamping
does in that it will create a clone first so that the packets needing to be
timestamped can be placed in a queue, or the context block could be used.

In order to support these use cases I am pulling the core of the code out
so it can be used in other drivers beyond just phy devices.

In addition I have added a destructor named sock_efree which is meant to
provide a simple way for dropping the reference to skb exceptions that
aren't part of either the receive or send windows for the socket, and I
have removed some duplication in spots where this destructor could be used
in place of sock_edemux.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:43:45 -07:00
Eric Dumazet
d546c62154 ipv4: harden fnhe_hashfun()
Lets make this hash function a bit secure, as ICMP attacks are still
in the wild.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:40:33 -07:00
Masanari Iida
e793c0f70e net: treewide: Fix typo found in DocBook/networking.xml
This patch fix spelling typo found in DocBook/networking.xml.
It is because the neworking.xml is generated from comments
in the source, I have to fix typo in comments within the source.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:35:28 -07:00
Eric Dumazet
caa415270c ipv4: fix a race in update_or_create_fnhe()
nh_exceptions is effectively used under rcu, but lacks proper
barriers. Between kzalloc() and setting of nh->nh_exceptions(),
we need a proper memory barrier.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 4895c771c7 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:15:50 -07:00
Willem de Bruijn
c199105d15 net-timestamp: only report sw timestamp if reporting bit is set
The timestamping API has separate bits for generating and reporting
timestamps. A software timestamp should only be reported for a packet
when the packet has the relevant generation flag (SKBTX_..) set
and the socket has reporting bit SOF_TIMESTAMPING_SOFTWARE set.

The second check was accidentally removed. Reinstitute the original
behavior.

Tested:
  Without this patch, Documentation/networking/txtimestamp reports
  timestamps regardless of whether SOF_TIMESTAMPING_SOFTWARE is set.
  After the patch, it only reports them when the flag is set.

Fixes: f24b9be595 ("net-timestamp: extend SCM_TIMESTAMPING ancillary data struct")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 15:02:43 -07:00
Hannes Frederic Sowa
2f711939d2 ipv6: add sysctl_mld_qrv to configure query robustness variable
This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query
robustness variable. It specifies how many retransmit of unsolicited mld
retransmit should happen. Admins might want to tune this on lossy links.

Also reset mld state on interface down/up, so we pick up new sysctl
settings during interface up event.

IPv6 certification requests this knob to be available.

I didn't make this knob netns specific, as it is mostly a setting in a
physical environment and should be per host.

Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-04 22:26:14 -07:00
John W. Linville
ef4ead3f29 Not that much content this time. Some RCU cleanups, crypto
performance improvements, and various patches all over,
 rather than listing them one might as well look into the
 git log instead.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUAIx4AAoJEDBSmw7B7bqrUYcP/3t4qdFxm0bd4j2AEkl3mPwB
 Qu7obTicOTfBRoJNEgS+8AU2u3PfztU6+ErZs4ETLUuqaZwXisqmwBiMo86+Wtdf
 gx9KonwEW051g7YmB0+6EMwuy04MGzTEk8VavQwqM4g9LIPJ4Buo/kj7MNJ51m11
 XyRmJqZJnKKeiiQ4eC0gPf8e44qiQqaDuYZ0r1UDnNRg2KrbAHlGTBKYI3VRl2u4
 xRpPGVnHwT0qkWb1Zw9fk0VfPr9m1ETthzcZvnhk6uMnJ28D+1B1FjZR1GJU6BW7
 Zx2FbevbZTjDoNT1GQpLGMXBuW0lsZFetXVFiJCr/StaPBtHmtdu28fuNVm8yJYz
 euDlEgrE8F4npdec2F5R2zh7Ue2U7eMEL2uxxjciNSJOipHgx5EXH12Y/5QtrChy
 4OHPbNHgpmqFB7TmkvHDgP/0A7XdyqKVc+NtIV+eECIwE4tHcJ6A+bQ+ZCoRV2Vw
 zmsNuNeNeDW7NEAw9veRXissLZMy/EjUnsOrnW29BpO/yG+2YjqpyQ6JQpcXeCPD
 WQgl2FHpk6ap3jpVjxminxw2HkDnQ0oTKusGLcezalhUlWMo7VYNN59aLzcphxX5
 Fotp/8v1sbDTF46uc/QJ38N5TqflwWeFpxvGkdNGuAT4llP03NaXV0ORBecFmMW2
 esb+PLwlByCDeVFu53q+
 =Qth6
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg <johannes@sipsolutions.net> says:

"Not that much content this time. Some RCU cleanups, crypto
performance improvements, and various patches all over,
rather than listing them one might as well look into the
git log instead."

Signed-off-by: John W. Linville <linville@tuxdriver.com>

Conflicts:
	drivers/net/wireless/ath/wil6210/wmi.c
2014-09-04 13:41:33 -04:00
John W. Linville
190355cc06 Here are a few fixes for mac80211. One has been discussed for a while
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
 shouldn't be necessary but since many places treat it as a string we
 couldn't move to just sending two bytes.
 
 In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
 fix for the recently introduced RX aggregation offload, a revert for
 a broken patch (that luckily didn't really cause any harm) and a small
 fix for alignment in debugfs.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUAFrmAAoJEDBSmw7B7bqr79AP/1yGi9lkv/wWUs5y0AhUSen9
 850MU26BBlyAAFSz11xqgaEeRmeBeqhR3K7w/M02TX0CHxBzMqMZfyE//tq0UJaI
 ZwZmtyQmdMiOSNKignTIIx7OHTioq0wrGKb6O2UvKoJfTlB9t01jCC4jmCTF5Vos
 6ReF7NaZEbxW6XDOsClNTAtIa1c6n1RQ5VbDIEL5Vfvqv8LbcobduF8WcYl80eIQ
 +EvIHtUm/Luxg6DblibgEVtwYOtNpvRz4pofdw3xoSHAnF+zhXbUr0dUjpkBNA7o
 vWboCBl14Qn1M7pOJZ0+TBzFmquAr6CDbDvArVCH01Swh27EUDQUcHQAggGpT71w
 DFgWHOYP0UCB6Y4U0GjBehy8PeuytqJLBSceKVud7DDqd8fY+Lq3MMyicIk0aw3o
 IIDLWrujkCBXsdfuxQETmYxHU05WHSuYOCTgGSqbq3QPTWm8pBGWTdbk+1t/0FyH
 cGLJOWs/jCrtHdzDj6TH+kL8NmvwB7sC9MT45qG0ilevmPW25yrnTJPEMEvFBqvZ
 lnaqiX6D1kGNZd09CxgSIhxrQi+N0Yg+UlLa4IUtOIqnQussOC3xH2U5qTufdpa1
 Gi9aCkBGVKQiObPWucf2QB4t1sZ18rxBhrAelZhQPLTKrnsuLhpcVBlU+L6ScCAk
 FVni4HZH2IGtDQ577k10
 =G/pB
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg <johannes@sipsolutions.net> says:

"Here are a few fixes for mac80211. One has been discussed for a while
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
shouldn't be necessary but since many places treat it as a string we
couldn't move to just sending two bytes.

In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
fix for the recently introduced RX aggregation offload, a revert for
a broken patch (that luckily didn't really cause any harm) and a small
fix for alignment in debugfs."

Signed-off-by: John W. Linville <linville@redhat.com>
2014-09-04 13:08:24 -04:00
Pablo Neira Ayuso
30766f4c2d netfilter: nat: move specific NAT IPv4 to core
Move the specific NAT IPv4 core functions that are called from the
hooks from iptable_nat.c to nf_nat_l3proto_ipv4.c. This prepares the
ground to allow iptables and nft to use the same NAT engine code that
comes in a follow up patch.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-02 17:14:10 +02:00
Willem de Bruijn
364a9e9324 sock: deduplicate errqueue dequeue
sk->sk_error_queue is dequeued in four locations. All share the
exact same logic. Deduplicate.

Also collapse the two critical sections for dequeue (at the top of
the recv handler) and signal (at the bottom).

This moves signal generation for the next packet forward, which should
be harmless.

It also changes the behavior if the recv handler exits early with an
error. Previously, a signal for follow-up packets on the errqueue
would then not be scheduled. The new behavior, to always signal, is
arguably a bug fix.

For rxrpc, the change causes the same function to be called repeatedly
for each queued packet (because the recv handler == sk_error_report).
It is likely that all packets will fail for the same reason (e.g.,
memory exhaustion).

This code runs without sk_lock held, so it is not safe to trust that
sk->sk_err is immutable inbetween releasing q->lock and the subsequent
test. Introduce int err just to avoid this potential race.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 21:49:08 -07:00
David S. Miller
377655fe6b Merge tag 'master-2014-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
pull request: wireless 2014-08-28

Please pull this batch of fixes intended for the 3.17 stream.

For the Bluetooth/6LowPAN/802.15.4 bits, Johan says:

'It contains a connection reference counting fix for LE where a
connection might stay up even though it should get disconnected.

The other 802.15.4 6LoWPAN related patches were sent to the bluetooth
tree by Alexander Aring and described as follows by him:

"
these patches contains patches for the bluetooth branch.

This series includes memory leak fixes and an errno value fix.
Also there are two patches for sending and receiving 1280 6LoWPAN
packets, which makes the IEEE 802.15.4 6LoWPAN stack more RFC
compliant.
"'

Along with that...

Alexey Khoroshilov fixes a use-after-free bug on at76c50x-usb.

Hauke Mehrtens adds a PCI ID to bcma.

Himangi Saraogi fixes a silly "A || A" test in rtlwifi.

Larry Finger adds a device ID to rtl8192cu.

Maks Naumov fixes a strncmp argument in ath9k.

Álvaro Fernández Rojas adds a PCI ID to ssb.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 18:08:11 -07:00
Daniel Borkmann
38ab1fa981 net: sctp: fix ABI mismatch through sctp_assoc_to_state helper
Since SCTP day 1, that is, 19b55a2af145 ("Initial commit") from lksctp
tree, the official <netinet/sctp.h> header carries a copy of enum
sctp_sstat_state that looks like (compared to the current in-kernel
enumeration):

  User definition:                     Kernel definition:

  enum sctp_sstat_state {              typedef enum {
    SCTP_EMPTY             = 0,          <removed>
    SCTP_CLOSED            = 1,          SCTP_STATE_CLOSED            = 0,
    SCTP_COOKIE_WAIT       = 2,          SCTP_STATE_COOKIE_WAIT       = 1,
    SCTP_COOKIE_ECHOED     = 3,          SCTP_STATE_COOKIE_ECHOED     = 2,
    SCTP_ESTABLISHED       = 4,          SCTP_STATE_ESTABLISHED       = 3,
    SCTP_SHUTDOWN_PENDING  = 5,          SCTP_STATE_SHUTDOWN_PENDING  = 4,
    SCTP_SHUTDOWN_SENT     = 6,          SCTP_STATE_SHUTDOWN_SENT     = 5,
    SCTP_SHUTDOWN_RECEIVED = 7,          SCTP_STATE_SHUTDOWN_RECEIVED = 6,
    SCTP_SHUTDOWN_ACK_SENT = 8,          SCTP_STATE_SHUTDOWN_ACK_SENT = 7,
  };                                   } sctp_state_t;

This header was later on also placed into the uapi, so that user space
programs can compile without having <netinet/sctp.h>, but the shipped
with <linux/sctp.h> instead.

While RFC6458 under 8.2.1.Association Status (SCTP_STATUS) says that
sstat_state can range from SCTP_CLOSED to SCTP_SHUTDOWN_ACK_SENT, we
nevertheless have a what it appears to be dummy SCTP_EMPTY state from
the very early days.

While it seems to do just nothing, commit 0b8f9e25b0 ("sctp: remove
completely unsed EMPTY state") did the right thing and removed this dead
code. That however, causes an off-by-one when the user asks the SCTP
stack via SCTP_STATUS API and checks for the current socket state thus
yielding possibly undefined behaviour in applications as they expect
the kernel to tell the right thing.

The enumeration had to be changed however as based on the current socket
state, we access a function pointer lookup-table through this. Therefore,
I think the best way to deal with this is just to add a helper function
sctp_assoc_to_state() to encapsulate the off-by-one quirk.

Reported-by: Tristan Su <sooqing@gmail.com>
Fixes: 0b8f9e25b0 ("sctp: remove completely unsed EMPTY state")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:31:08 -07:00
Florian Fainelli
5037d532b8 net: dsa: add Broadcom tag RX/TX handler
Add support for the 4-bytes Broadcom tag that built-in switches such as
the Starfighter 2 might insert when receiving packets, or that we need
to insert while targetting specific switch ports. We use a fake local
EtherType value for this 4-bytes switch tag: ETH_P_BRCMTAG to make sure
we can assign DSA-specific network operations within the DSA drivers.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
ce31b31c68 net: dsa: allow updating fixed PHY link information
Allow switch drivers to hook a PHY link update callback to perform
port-specific link work.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
ec9436baed net: dsa: allow drivers to do link adjustment
Whenever libphy determines that the link status of a given PHY/port has
changed, allow to call into the switch driver link adjustment callback
so proper actions can be taken care of by the switch driver upon link
notification.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
5aed85cec2 net: dsa: allow switches to work without tagging
In case switch port tagging is disabled (voluntarily, or the switch just
does not support it), allow us to continue using the defined set of
dsa_device_ops in net/dsa/slave.c.

We introduce dsa_protocol_is_tagged() to check whether we need to
override skb->protocol and go through the DSA-specifif packet_type
function, or if we just go on and receive the SKB through the normal
path.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
0d8bcdd383 net: dsa: allow for more complex PHY setups
Modify the DSA slave interface to be bound to an arbitray PHY, not just
the ones that are available as child PHY devices of the switch MDIO bus.

This allows us for instance to have external PHYs connected to a
separate MDIO bus, but yet also connected to a given switch port.

Under certain configurations, the physical port mask might not be a 1:1
mapping to the MII PHYs mask. This is the case, if e.g: Port 1 of the
switch is used and connects to a PHY at a MDIO address different than 1.

Introduce a phys_mii_mask variable which allows driver to implement and
divert their own MDIO read/writes operations for a subset of the MDIO
PHY addresses.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
bd47497a01 net: dsa: retain a per-port device_node pointer
We will later use the per-port device_node pointer to fetch a bunch of
port-specific properties.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
fa981d9af8 net: dsa: provide a switch device device tree node pointer
We might need to fetch additional resources from the device tree node
pointer, such as register ranges or other properties. Keep a device_node
pointer around for this.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:39 -07:00
Florian Fainelli
3e8a72d1da net: dsa: reduce number of protocol hooks
DSA is currently registering one packet_type function per EtherType it
needs to intercept in the receive path of a DSA-enabled Ethernet device.
Right now we have three of them: trailer, DSA and eDSA, and there might
be more in the future, this will not scale to the addition of new
protocols.

This patch proceeds with adding a new layer of abstraction and two new
functions:

dsa_switch_rcv() which will dispatch into the tag-protocol specific
receive function implemented by net/dsa/tag_*.c

dsa_slave_xmit() which will dispatch into the tag-protocol specific
transmit function implemented by net/dsa/tag_*.c

When we do create the per-port slave network devices, we iterate over
the switch protocol to assign the DSA-specific receive and transmit
operations.

A new fake ethertype value is used: ETH_P_XDSA to illustrate the fact
that this is no longer going to look like ETH_P_DSA or ETH_P_TRAILER
like it used to be.

This allows us to greatly simplify the check in eth_type_trans() and
always override the skb->protocol with ETH_P_XDSA for Ethernet switches
tagged protocol, while also reducing the number repetitive slave
netdevice_ops assignments.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:39 -07:00
Johannes Berg
5bc8c1f2b0 cfg80211: allow passing frame type to cfg80211_inform_bss()
When using the cfg80211_inform_bss[_width]() functions drivers
cannot currently indicate whether the data was received in a
beacon or probe response. Fix that by passing a new enum that
indicates such (or unknown).

For good measure, use it in ath6kl.

Acked-by: Kalle Valo <kvalo@qca.qualcomm.com> [ath6kl]
Acked-by: Arend van Spriel <arend@broadcom.com> [brcmfmac]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:16:02 +02:00
Johannes Berg
0e227084ae cfg80211: clarify BSS probe response vs. beacon data
There are a few possible cases of where BSS data came from:
 1) only a beacon has been received
 2) only a probe response has been received
 3) the driver didn't report what it received (this happens when
    using cfg80211_inform_bss[_width]())
 4) both probe response and beacon data has been received

Unfortunately, in the userspace API, a few things weren't there:
 a) there was no way to differentiate cases 1) and 4) above
    without comparing the data of the IEs
 b) the TSF was always from the last frame, instead of being
    exposed for beacon/probe response separately like IEs

Fix this by
   i) exporting a new flag attribute that indicates whether or
      not probe response data has been received - this addresses (a)
  ii) exporting a BEACON_TSF attribute that holds the beacon's TSF
      if a beacon has been received
 iii) not exporting the beacon attributes in case (3) above as that
      would just lead userspace into thinking the data actually came
      from a beacon when that isn't clear

To implement this, track inside the IEs struct whether or not it
(definitely) came from a beacon.

Reported-by: William Seto
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:16:01 +02:00
Ido Yariv
c70f59a2a0 mac80211: don't resize skbs needlessly
Header-less cloned skbs with sufficient headroom need not be cloned
unless the tailroom is going to be modified.

Fix ieee80211_skb_resize so it would only resize cloned skbs if either
the header isn't released or the tailroom is going to be modified.

Some drivers might have assumed that skbs are never cloned, so add a HW
flag that explicitly permits cloned TX skbs. Drivers which do not modify
TX skbs should set this flag to avoid copying skbs.

Signed-off-by: Ido Yariv <idox.yariv@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:16:00 +02:00
Ido Yariv
ca34e3b5c8 mac80211: Fix accounting of the tailroom-needed counter
When hw acceleration is enabled, the GENERATE_IV or PUT_IV_SPACE flags
will only require headroom space. Consequently, the tailroom-needed
counter can safely be decremented.

Signed-off-by: Ido Yariv <idox.yariv@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:15:59 +02:00
Vladimir Kondratiev
970fdfa89b cfg80211: remove @gfp parameter from cfg80211_rx_mgmt()
In the cfg80211_rx_mgmt(), parameter @gfp was used for the memory allocation.
But, memory get allocated under spin_lock_bh(), this implies atomic context.
So, one can't use GFP_KERNEL, only variants with no __GFP_WAIT. Actually, in all
occurrences GFP_ATOMIC is used (wil6210 use GFP_KERNEL by mistake),
and it should be this way or warning triggered in the memory allocation code.

Remove @gfp parameter as no actual choice exist, and use hard coded
GFP_ATOMIC for memory allocation.

Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:15:58 +02:00
WANG Cong
453a940ea7 net: make skb an optional parameter for__skb_flow_dissect()
Fixes: commit 690e36e726 (net: Allow raw buffers to be passed into the flow dissector)
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25 17:21:26 -07:00
John W. Linville
07bc788424 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-08-25 15:58:02 -04:00
John W. Linville
0fdcaa5948 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2014-08-25 15:35:20 -04:00
Tom Herbert
57c67ff4bd udp: additional GRO support
Implement GRO for UDPv6. Add UDP checksum verification in gro_receive
for both UDP4 and UDP6 calling skb_gro_checksum_validate_zero_check.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-24 18:09:24 -07:00
Tom Herbert
1933a7852c net: add gro_compute_pseudo functions
Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are
the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo
for GRO path. The IP header is taken from skb_gro_network_header.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-24 18:09:23 -07:00
David S. Miller
690e36e726 net: Allow raw buffers to be passed into the flow dissector.
Drivers, and perhaps other entities we have not yet considered,
sometimes want to know how deep the protocol headers go before
deciding how large of an SKB to allocate and how much of the packet to
place into the linear SKB area.

For example, consider a driver which has a device which DMAs into
pools of pages and then tells the driver where the data went in the
DMA descriptor(s).  The driver can then build an SKB and reference
most of the data via SKB fragments (which are page/offset/length
triplets).

However at least some of the front of the packet should be placed into
the linear SKB area, which comes before the fragments, so that packet
processing can get at the headers efficiently.  The first thing each
protocol layer is going to do is a "pskb_may_pull()" so we might as
well aggregate as much of this as possible while we're building the
SKB in the driver.

Part of supporting this is that we don't have an SKB yet, so we want
to be able to let the flow dissector operate on a raw buffer in order
to compute the offset of the end of the headers.

So now we have a __skb_flow_dissect() which takes an explicit data
pointer and length.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 12:13:41 -07:00
Eric Dumazet
d2de875c6d net: use ktime_get_ns() and ktime_get_real_ns() helpers
ktime_get_ns() replaces ktime_to_ns(ktime_get())

ktime_get_real_ns() replaces ktime_to_ns(ktime_get_real())

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-22 19:57:23 -07:00