Commit Graph

35328 Commits

Author SHA1 Message Date
Jiri Pirko
f6f6424ba7 net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
Do the work of parsing NDA_VLAN directly in rtnetlink code, pass simple
u16 vid to drivers from there.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 20:01:18 -08:00
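
The parsing step that f6f6424ba7 moves out of the drivers can be sketched in userspace C roughly as follows. This is an illustration only, not the kernel code: `parse_vid`, `VID_MAX` and the raw payload/length arguments are hypothetical stand-ins for the netlink attribute helpers, and the exact range check is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed bound: 802.1Q VLAN IDs are 12 bits wide; 0 and 4095 are reserved. */
#define VID_MAX 4094

/*
 * Parse a VLAN ID from an attribute payload (host byte order).
 * Returns 0 and stores the ID in *vid on success, -EINVAL otherwise.
 * A NULL payload means "attribute absent" and yields vid 0.
 */
int parse_vid(const void *payload, size_t len, uint16_t *vid)
{
	uint16_t v = 0;

	if (payload) {
		/* Reject payloads that are not exactly one u16. */
		if (len != sizeof(v))
			return -EINVAL;
		memcpy(&v, payload, sizeof(v));
		if (v == 0 || v > VID_MAX)
			return -EINVAL;
	}
	*vid = v;
	return 0;
}
```

With the validation done once in a central place, a driver callback in this sketch would only ever receive an already checked `uint16_t`.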
Jiri Pirko
93859b13fa bridge: convert flags in fdb entry into bitfields
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 20:01:17 -08:00
Jiri Pirko
020ec6ba2a bridge: rename fdb_*_hw to fdb_*_hw_addr to avoid confusion
The current name might suggest that this actually offloads the fdb entry to
hw. So rename it to make clear that this is for hardware address
addition/removal.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 20:01:16 -08:00
David S. Miller
60b7379dc5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-11-29 20:47:48 -08:00
Linus Torvalds
8e8459719c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Several small fixes here:

   1) Don't crash in tg3 driver when the number of tx queues has been
      configured to be different from the number of rx queues.  From
      Thadeu Lima de Souza Cascardo.

   2) VLAN filter not disabled properly in promisc mode in ixgbe driver,
      from Vlad Yasevich.

   3) Fix OOPS on dellink op in VTI tunnel driver, from Xin Long.

   4) IPV6 GRE driver WCCP code checks skb->protocol for ETH_P_IP
      instead of ETH_P_IPV6, whoops.  From Yuri Chislov.

   5) Socket matching in ping driver is buggy when packet AF does not
      match socket's AF.  Fix from Jane Zhou.

   6) Fix checksum calculation errors in VXLAN due to where the
      udp_tunnel6_xmit_skb() helper gets its saddr/daddr from.  From
      Alexander Duyck.

   7) Fix 5G detection problem in rtlwifi driver, from Larry Finger.

   8) Fix NULL deref in tcp_v{4,6}_send_reset, from Eric Dumazet.

   9) Various missing netlink attribute verifications in bridging code,
      from Thomas Graf.

  10) tcp_recvmsg() unconditionally calls ipv4 ip_recv_error even for
      ipv6 sockets, whoops.  Fix from Willem de Bruijn"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
  net-timestamp: make tcp_recvmsg call ipv6_recv_error for AF_INET6 socks
  bridge: Sanitize IFLA_EXT_MASK for AF_BRIDGE:RTM_GETLINK
  bridge: Add missing policy entry for IFLA_BRPORT_FAST_LEAVE
  net: Check for presence of IFLA_AF_SPEC
  net: Validate IFLA_BRIDGE_MODE attribute length
  bridge: Validate IFLA_BRIDGE_FLAGS attribute length
  stmmac: platform: fix default values of the filter bins setting
  net/mlx4_core: Limit count field to 24 bits in qp_alloc_res
  net: dsa: bcm_sf2: reset switch prior to initialization
  net: dsa: bcm_sf2: fix unmapping registers in case of errors
  tg3: fix ring init when there are more TX than RX channels
  tcp: fix possible NULL dereference in tcp_vX_send_reset()
  rtlwifi: Change order in device startup
  rtlwifi: rtl8821ae: Fix 5G detection problem
  Revert "netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse"
  vxlan: Fix boolean flip in VXLAN_F_UDP_ZERO_CSUM6_[TX|RX]
  ip6_udp_tunnel: Fix checksum calculation
  net-timestamp: Fix a documentation typo
  net/ping: handle protocol mismatching scenario
  af_packet: fix sparse warning
  ...
2014-11-27 18:05:05 -08:00
Willem de Bruijn
f4713a3dfa net-timestamp: make tcp_recvmsg call ipv6_recv_error for AF_INET6 socks
TCP timestamping introduced MSG_ERRQUEUE handling for TCP sockets.
If the socket is of family AF_INET6, call ipv6_recv_error instead
of ip_recv_error.

This change is more complex than a single branch due to the loadable
ipv6 module. It reuses a pre-existing indirect function call from
ping. The ping code is safe to call, because it is part of the core
ipv6 module and always present when AF_INET6 sockets are active.

Fixes: 4ed2d765 (net-timestamp: TCP timestamping)
Signed-off-by: Willem de Bruijn <willemb@google.com>

----

It may also be worthwhile to add WARN_ON_ONCE(sk->family == AF_INET6)
to ip_recv_error.
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 15:45:04 -05:00
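
The dispatch described in f4713a3dfa can be pictured with a small userspace sketch. All names here (`recv_error`, `v6_ops`, the `*_sketch` handlers) are hypothetical, the family constants are stand-ins, and the return values 4 and 6 exist only to show which handler ran. The point is that the IPv6 handler is reached through a function pointer that the separately loaded IPv6 code fills in, rather than through a direct call.

```c
#include <stddef.h>

/* Hypothetical stand-ins for AF_INET / AF_INET6. */
enum family { FAM_INET = 2, FAM_INET6 = 10 };

typedef int (*recv_error_fn)(int *addr_len);

/* Hook table; the IPv6 side sets the pointer when it is loaded, NULL until then. */
struct v6_stub {
	recv_error_fn ipv6_recv_error;
};
struct v6_stub v6_ops;

/* Placeholder handlers: return 4 or 6 so a caller can tell them apart. */
int ip_recv_error_sketch(int *addr_len)
{
	*addr_len = 16;
	return 4;
}

int ipv6_recv_error_sketch(int *addr_len)
{
	*addr_len = 28;
	return 6;
}

/* Pick the error-queue handler by socket family. */
int recv_error(enum family fam, int *addr_len)
{
	if (fam == FAM_INET6) {
		/* In this sketch a missing hook is an error, not an IPv4 fallback. */
		if (!v6_ops.ipv6_recv_error)
			return -1;
		return v6_ops.ipv6_recv_error(addr_len);
	}
	return ip_recv_error_sketch(addr_len);
}
```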
Thomas Graf
aa68c20ff3 bridge: Sanitize IFLA_EXT_MASK for AF_BRIDGE:RTM_GETLINK
Only search for IFLA_EXT_MASK if the message actually carries a
ifinfomsg header and validate minimal length requirements for
IFLA_EXT_MASK.

Fixes: 6cbdceeb ("bridge: Dump vlan information from a bridge port")
Cc: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 15:29:01 -05:00
Thomas Graf
6f705d8cfc bridge: Add missing policy entry for IFLA_BRPORT_FAST_LEAVE
Fixes: c2d3babf ("bridge: implement multicast fast leave")
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 15:29:01 -05:00
Thomas Graf
6e8d1c5545 bridge: Validate IFLA_BRIDGE_FLAGS attribute length
Payload is currently accessed blindly and may exceed valid message
boundaries.

Fixes: 407af3299 ("bridge: Add netlink interface to configure vlans on bridge ports")
Cc: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 15:29:00 -05:00
Ying Xue
a6ca109443 tipc: use generic SKB list APIs to manage TIPC outgoing packet chains
Use standard SKB list APIs associated with struct sk_buff_head to
manage socket outgoing packet chain and name table outgoing packet
chain, making the relevant code simpler and more readable.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:17 -05:00
Ying Xue
f03273f1e2 tipc: use generic SKB list APIs to manage link receive queue
Use standard SKB list APIs associated with struct sk_buff_head to
manage link's receive queue, reducing the complexity of the relevant code.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:17 -05:00
Ying Xue
bc6fecd409 tipc: use generic SKB list APIs to manage deferred queue of link
Use standard SKB list APIs associated with struct sk_buff_head to
manage link's deferred queue, simplifying relevant code.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:17 -05:00
Ying Xue
58dc55f256 tipc: use generic SKB list APIs to manage link transmission queue
Use standard SKB list APIs associated with struct sk_buff_head to
manage link transmission queue, making the relevant code cleaner.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:17 -05:00
Ying Xue
58d78b328a tipc: use skb_queue_walk_safe macro to simplify link_prepare_wakeup routine
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:17 -05:00
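
For readers unfamiliar with "safe" queue walks such as the one 58d78b328a switches to, the idea can be sketched on a plain circular list. This is a userspace illustration under assumed names (`struct node`, `list_walk_safe`, `drop_acked`), not the kernel's sk_buff_head implementation: the successor is saved before the loop body runs, so the body may unlink the current element.

```c
#include <stddef.h>

struct node {
	struct node *next;
	struct node *prev;
	int seqno;
};

/* Circular doubly linked list; the head is a sentinel, not a payload node. */
void list_init(struct node *head)
{
	head->next = head->prev = head;
}

void list_add_tail(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

void list_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
}

/* Save the successor in tmp before the body runs, so the body may unlink pos. */
#define list_walk_safe(head, pos, tmp)                   \
	for ((pos) = (head)->next, (tmp) = (pos)->next;  \
	     (pos) != (head);                            \
	     (pos) = (tmp), (tmp) = (pos)->next)

/* Unlink every node whose seqno is below acked; return how many were dropped. */
int drop_acked(struct node *head, int acked)
{
	struct node *pos, *tmp;
	int dropped = 0;

	list_walk_safe(head, pos, tmp) {
		if (pos->seqno < acked) {
			list_unlink(pos);
			dropped++;
		}
	}
	return dropped;
}
```

A plain walk that advances with `pos = pos->next` after the body would read the pointers of a node that has just been unlinked, which is what the saved `tmp` avoids.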
Ying Xue
99315ad43d tipc: remove unused between routine
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:17 -05:00
Ying Xue
58311d1690 tipc: eliminate two pseudo message types of BUNDLE_OPEN and BUNDLE_CLOSED
The pseudo message types of BUNDLE_CLOSED as well as BUNDLE_OPEN are
used to flag whether or not more messages can be bundled into a data
packet in the outgoing transmission queue. Obviously, no more messages
can be appended after the packet has been sent and is waiting to be
acknowledged and deleted. These message types do in reality represent
a send-side local implementation flag, and are not defined as part of
the protocol. It is therefore safe to move it to where it belongs,
that is, the control area (TIPC_SKB_CB) of the buffer.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:17 -05:00
Ying Xue
47b4c9a82f tipc: clean up the process of link pushing packets
In original tipc_link_push_packet(), it pushes messages from protocol
message queue, retransmission queue and next_out queue. But as the two
first queues are removed, we can simplify its relevant code through
deleting tipc_link_push_queue().

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:16 -05:00
Ying Xue
7b6f087f98 tipc: remove retransmission queue
TIPC retransmission queue is intended to record which messages
should be retransmitted when bearer is not congested. However,
as the retransmission queue becomes useless with the removal of
bearer congestion mechanism, it should be removed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:16 -05:00
Ying Xue
8965d250c2 tipc: remove protocol message queue
TIPC protocol message queue is intended to save one protocol message
when bearer is congested so that the message stored in the queue can
be immediately transmitted when bearer congestion is released. However,
since the bearer congestion mechanism has been removed, the protocol
queue no longer serves a purpose and should be removed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:16 -05:00
Ying Xue
a8f48af587 tipc: remove node subscription infrastructure
The node subscribe infrastructure represents a virtual base class, so
its users, such as struct tipc_port and struct publication, can derive
its implemented functionalities. However, after the removal of struct
tipc_port, struct publication is left as its only user. So
defining an abstract infrastructure for one user is no longer
reasonable. If corresponding new functions associated with the
infrastructure are moved to name_table.c file, the node subscription
infrastructure can be removed as well.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:16 -05:00
zhuyj
73cf0e923d ipv6: Remove unnecessary test
The "init_net" test in function addrconf_exit_net is introduced
in commit 44a6bd29 [Create ipv6 devconf-s for namespaces] to avoid freeing
init_net. In commit c900a800 [ipv6: fix bad free of addrconf_init_net],
function addrconf_init_net will allocate memory for every net regardless of
init_net. In this case, the "init_net" test is unnecessary.

CC: Hong Zhiguo <honkiko@gmail.com>
CC: Octavian Purdila <opurdila@ixiacom.com>
CC: Pavel Emelyanov <xemul@openvz.org>
CC: Cong Wang <cwang@twopensource.com>
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:27:04 -05:00
Tom Herbert
4fd671ded1 gue: Call remcsum_adjust
Change remote checksum offload to call remcsum_adjust. This also
eliminates the optimization to skip an IP header as part of the
adjustment (really does not seem to be much of a win).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:25:44 -05:00
Eric Dumazet
ced7a04e39 pkt_sched: fq: increase max delay from 125 ms to one second
FQ/pacing has a clamp of delay of 125 ms, to avoid some possible harm.

It turns out this delay is too small to allow pacing at low rates:
some ISPs set up very aggressive policers, as low as 16kbit.

Now that the TCP stack has spurious rtx prevention, it seems safe to increase
this fixed parameter, without adding a qdisc attribute.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:08:04 -05:00
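
The arithmetic behind ced7a04e39 can be worked through with a short sketch. The function name `pacing_delay_ns`, its parameters and the 1500-byte packet size are assumptions for illustration, and 16kbit is read here as 16000 bits per second (2000 bytes per second); the real qdisc code is organised differently.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Time needed to send pkt_len bytes at rate_bytes_per_sec,
 * clamped to max_delay_ns. A zero rate yields the maximum delay.
 */
uint64_t pacing_delay_ns(uint32_t pkt_len, uint64_t rate_bytes_per_sec,
			 uint64_t max_delay_ns)
{
	uint64_t delay;

	if (rate_bytes_per_sec == 0)
		return max_delay_ns;
	delay = (uint64_t)pkt_len * NS_PER_SEC / rate_bytes_per_sec;
	return delay > max_delay_ns ? max_delay_ns : delay;
}
```

Under these assumptions a 1500-byte packet at 2000 bytes per second needs 750 ms. A 125 ms clamp would cut that to 125 ms, which corresponds to sending at about 96 kbit/s, well above the policer; a one second clamp leaves the 750 ms gap intact.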
Linus Torvalds
b914c5b213 Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
 "These fix one mishandling of the case when security labels are
  configured out, and two races in the 4.1 backchannel code"

* 'for-3.18' of git://linux-nfs.org/~bfields/linux:
  nfsd: Fix slot wake up race in the nfsv4.1 callback code
  SUNRPC: Fix locking around callback channel reply receive
  nfsd: correctly define v4.2 support attributes
2014-11-25 19:05:41 -08:00
David S. Miller
d3fc6b3fdd Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
More work from Al Viro to move away from modifying iovecs
by using iov_iter instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-25 20:02:51 -05:00
Eric Dumazet
c3658e8d0f tcp: fix possible NULL dereference in tcp_vX_send_reset()
After commit ca777eff51 ("tcp: remove dst refcount false sharing for
prequeue mode") we have to relax check against skb dst in
tcp_v[46]_send_reset() if prequeue dropped the dst.

If a socket is provided, a full lookup was done to find this socket,
so the dst test can be skipped.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=88191
Reported-by: Jaša Bartelj <jasa.bartelj@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Daniel Borkmann <dborkman@redhat.com>
Fixes: ca777eff51 ("tcp: remove dst refcount false sharing for prequeue mode")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-25 14:29:18 -05:00
Pablo Neira
43612d7c04 Revert "netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse"
This reverts commit 5195c14c8b.

If the conntrack clashes with an existing one, it is left out of
the unconfirmed list, thus, crashing when dropping the packet and
releasing the conntrack since golden rule is that conntracks are
always placed in any of the existing lists for traceability reasons.

Reported-by: Daniel Borkmann <dborkman@redhat.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=88841
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-25 14:14:51 -05:00
Alexander Duyck
f3750817a9 ip6_udp_tunnel: Fix checksum calculation
The UDP checksum calculation for VXLAN tunnels is currently using the
socket addresses instead of the actual packet source and destination
addresses.  As a result the checksum calculated is incorrect in some
cases.

Also uh->check was being set twice, first it was set to 0, and then it is
set again in udp6_set_csum.  This change removes the redundant assignment
to 0.

Fixes: acbf74a7 ("vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions.")

Cc: Andy Zhou <azhou@nicira.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-25 14:12:12 -05:00
Jane Zhou
91a0b60346 net/ping: handle protocol mismatching scenario
ping_lookup() may return a wrong sock if sk_buff's and sock's protocols
don't match. For example, if sk_buff's protocol is ETH_P_IPV6 but sock's
sk_family is AF_INET, and sk->sk_bound_dev_if is zero, a wrong
sock will be returned.
The fix is to "continue" the search in that case and to return NULL if nothing matches.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Jane Zhou <a17711@motorola.com>
Signed-off-by: Yiwei Zhao <gbjc64@motorola.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:48:20 -05:00
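
The shape of the fix in 91a0b60346 can be sketched as a lookup loop that skips candidates whose family does not match the packet. Everything in this block is hypothetical (`struct sock_sketch`, `lookup`, the protocol and family tags); it only illustrates the "continue on mismatch, NULL if nothing matches" pattern, not the actual ping_lookup() code.

```c
#include <stddef.h>
#include <stdint.h>

enum { PROTO_IPV4 = 4, PROTO_IPV6 = 6 };  /* hypothetical packet protocol tags */
enum { FAM_INET = 2, FAM_INET6 = 10 };    /* hypothetical socket family tags */

struct sock_sketch {
	int family;
	uint16_t ident;    /* identifier the socket is bound to */
	int bound_dev_if;  /* 0 means "not bound to a device" */
};

const struct sock_sketch *lookup(const struct sock_sketch *socks, size_t n,
				 int pkt_proto, uint16_t ident, int dif)
{
	for (size_t i = 0; i < n; i++) {
		const struct sock_sketch *sk = &socks[i];

		if (sk->ident != ident)
			continue;
		/* Keep searching when packet protocol and socket family disagree. */
		if (pkt_proto == PROTO_IPV4 && sk->family != FAM_INET)
			continue;
		if (pkt_proto == PROTO_IPV6 && sk->family != FAM_INET6)
			continue;
		if (pkt_proto != PROTO_IPV4 && pkt_proto != PROTO_IPV6)
			continue;
		/* A socket bound to a device only matches that device. */
		if (sk->bound_dev_if && sk->bound_dev_if != dif)
			continue;
		return sk;
	}
	return NULL;  /* nothing matched */
}
```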
Michael S. Tsirkin
6e58040b84 af_packet: fix sparse warning
af_packet produces lots of these:
	net/packet/af_packet.c:384:39: warning: incorrect type in return expression (different modifiers)
	net/packet/af_packet.c:384:39:    expected struct page [pure] *
	net/packet/af_packet.c:384:39:    got struct page *

this seems to be because sparse does not realize that _pure
refers to function, not the returned pointer.

Tweak code slightly to avoid the warning.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:15:36 -05:00
Yuri Chislov
be6572fdb1 ipv6: gre: fix wrong skb->protocol in WCCP
When using GRE redirection in WCCP, it sets the wrong skb->protocol,
that is, ETH_P_IP instead of ETH_P_IPV6 for the encapsulated traffic.

Fixes: c12b395a46 ("gre: Support GRE over IPv6")
Cc: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Yuri Chislov <yuri.chislov@gmail.com>
Tested-by: Yuri Chislov <yuri.chislov@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:11:05 -05:00
Richard Alpe
d8182804cf tipc: fix sparse warnings in new nl api
Fix sparse warnings about non-static declaration of static functions
in the new tipc netlink API.

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:10:23 -05:00
David S. Miller
958d03b016 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
netfilter/ipvs updates for net-next

The following patchset contains Netfilter updates for your net-next
tree, this includes the NAT redirection support for nf_tables, the
cgroup support for nft meta and conntrack zone support for the connlimit
match. Coming after those, a bunch of sparse warning fixes, missing
netns bits and cleanups. More specifically, they are:

1) Prepare IPv4 and IPv6 NAT redirect code to use it from nf_tables,
   patches from Arturo Borrero.

2) Introduce the nf_tables redir expression, from Arturo Borrero.

3) Remove an unnecessary assignment in ip_vs_xmit/__ip_vs_get_out_rt().
   Patch from Alex Gartrell.

4) Add nft_log_dereference() macro to the nf_log infrastructure, patch
   from Marcelo Leitner.

5) Add some extra validation when registering logger families, also
   from Marcelo.

6) Some spelling cleanups from stephen hemminger.

7) Fix sparse warning in nf_logger_find_get().

8) Add cgroup support to nf_tables meta, patch from Ana Rey.

9) A Kconfig fix for the new redir expression and fix sparse warnings in
   the new redir expression.

10) Fix several sparse warnings in the netfilter tree, from
    Florian Westphal.

11) Reduce verbosity when OOM in nfnetlink_log. User can basically do
    nothing when this situation occurs.

12) Add conntrack zone support to xt_connlimit, again from Florian.

13) Add netnamespace support to the h323 conntrack helper, contributed
    by Vasily Averin.

14) Remove unnecessary null-pointer checks before free_percpu() and
    module_put(), from Markus Elfring.

15) Use pr_fmt in nfnetlink_log, again patch from Marcelo Leitner.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:00:58 -05:00
Al Viro
083735f4b0 rds: switch rds_message_copy_from_user() to iov_iter
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:43 -05:00
Al Viro
c310e72c89 rds: switch ->inc_copy_to_user() to passing iov_iter
instances get considerably simpler from that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:43 -05:00
Al Viro
7424ce6506 [atm] switch vcc_sendmsg() to copy_from_iter()
... and make it handle multi-segment iovecs - deals with that
"fix this later" issue for free.  A bit of shame, really - it
had been there since 2.3.15pre3 when the whole thing went into the
tree, practically a historical artefact by now...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:42 -05:00
Al Viro
0f7db23a07 vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:42 -05:00
Al Viro
45dcc687f7 tipc_msg_build(): pass msghdr instead of its ->msg_iov
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:41 -05:00
Al Viro
562640f3c3 tipc_sendmsg(): pass msghdr instead of its ->msg_iov
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:40 -05:00
Al Viro
e0eb093e79 switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:40 -05:00
Al Viro
8feb2fb2bb switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()
... and kill skb_copy_datagram_iovec()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:39 -05:00
Al Viro
195e952d03 kill zerocopy_sg_from_iovec()
no users left

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:39 -05:00
Al Viro
3a654f975b new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:03:08 -05:00
Al Viro
7eab8d9e8a new helper: memcpy_to_msg()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 04:28:51 -05:00
Al Viro
e169371823 switch ipxrtr_route_packet() from iovec to msghdr
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 04:28:49 -05:00
Al Viro
6ce8e9ce59 new helper: memcpy_from_msg()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 04:28:48 -05:00
Al Viro
227158db16 new helper: skb_copy_and_csum_datagram_msg()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 04:28:44 -05:00
lucien
20ea60ca99 ip_tunnel: the lack of vti_link_ops' dellink() cause kernel panic
Currently vti_link_ops does not set .dellink, so for the fb tunnel device
(ip_vti0) the net_device is removed by the default .dellink,
unregister_netdevice_queue, but the tunnel stays in the tunnel list.
If we then add a new vti tunnel, in ip_tunnel_find():

        hlist_for_each_entry_rcu(t, head, hash_node) {
                if (local == t->parms.iph.saddr &&
                    remote == t->parms.iph.daddr &&
                    link == t->parms.link &&
==>                 type == t->dev->type &&
                    ip_tunnel_key_match(&t->parms, flags, key))
                        break;
        }

the panic happens because the dev of ip_tunnel *t is NULL:
[ 3835.072977] IP: [<ffffffffa04103fd>] ip_tunnel_find+0x9d/0xc0 [ip_tunnel]
[ 3835.073008] PGD b2c21067 PUD b7277067 PMD 0
[ 3835.073008] Oops: 0000 [#1] SMP
.....
[ 3835.073008] Stack:
[ 3835.073008]  ffff8800b72d77f0 ffffffffa0411924 ffff8800bb956000 ffff8800b72d78e0
[ 3835.073008]  ffff8800b72d78a0 0000000000000000 ffffffffa040d100 ffff8800b72d7858
[ 3835.073008]  ffffffffa040b2e3 0000000000000000 0000000000000000 0000000000000000
[ 3835.073008] Call Trace:
[ 3835.073008]  [<ffffffffa0411924>] ip_tunnel_newlink+0x64/0x160 [ip_tunnel]
[ 3835.073008]  [<ffffffffa040b2e3>] vti_newlink+0x43/0x70 [ip_vti]
[ 3835.073008]  [<ffffffff8150d4da>] rtnl_newlink+0x4fa/0x5f0
[ 3835.073008]  [<ffffffff812f68bb>] ? nla_strlcpy+0x5b/0x70
[ 3835.073008]  [<ffffffff81508fb0>] ? rtnl_link_ops_get+0x40/0x60
[ 3835.073008]  [<ffffffff8150d11f>] ? rtnl_newlink+0x13f/0x5f0
[ 3835.073008]  [<ffffffff81509cf4>] rtnetlink_rcv_msg+0xa4/0x270
[ 3835.073008]  [<ffffffff8126adf5>] ? sock_has_perm+0x75/0x90
[ 3835.073008]  [<ffffffff81509c50>] ? rtnetlink_rcv+0x30/0x30
[ 3835.073008]  [<ffffffff81529e39>] netlink_rcv_skb+0xa9/0xc0
[ 3835.073008]  [<ffffffff81509c48>] rtnetlink_rcv+0x28/0x30
....

modprobe ip_vti
ip link del ip_vti0 type vti
ip link add ip_vti0 type vti
rmmod ip_vti

do that one or more times, kernel will panic.

fix it by assigning ip_tunnel_dellink to vti_link_ops' dellink, in
which we skip the unregister of fb tunnel device. do the same on ip6_vti.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23 21:11:17 -05:00
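
The difference between the default delete path and a tunnel-aware one, as described in 20ea60ca99, can be modelled with a toy sketch. The types and functions below (`dev_sketch`, `tunnel_net_sketch`, `default_dellink`, `tunnel_dellink`) are hypothetical and reduce the device and tunnel state to two booleans; they are not the kernel structures.

```c
#include <stdbool.h>
#include <stddef.h>

struct dev_sketch {
	bool registered;      /* device still exists */
	bool in_tunnel_list;  /* tunnel list still references it */
};

struct tunnel_net_sketch {
	struct dev_sketch *fb_tunnel_dev;  /* the fallback (fb) tunnel device */
};

/* Default behaviour in this model: unregister only, the tunnel list is untouched. */
void default_dellink(struct dev_sketch *dev)
{
	dev->registered = false;
}

/* Tunnel-aware delete: skip the fb device, otherwise unlist and unregister. */
void tunnel_dellink(struct tunnel_net_sketch *tn, struct dev_sketch *dev)
{
	if (tn->fb_tunnel_dev == dev)
		return;
	dev->in_tunnel_list = false;
	dev->registered = false;
}
```

In this model the default path leaves a list entry that points at a device which no longer exists, which is the state a later lookup would trip over.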
Ian Morris
e5d08d718a ipv6: coding style improvements (remove assignment in if statements)
This change has no functional impact and simply addresses some coding
style issues detected by checkpatch. Specifically this change
adjusts "if" statements which also include the assignment of a
variable.

No changes to the resultant object files result as determined by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23 21:00:56 -05:00
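
The kind of change e5d08d718a describes looks roughly like the following pair. The functions `dup_before` and `dup_after` are made-up userspace examples, not code from the patch; both behave identically, only the placement of the assignment differs.

```c
#include <stdlib.h>
#include <string.h>

/* Style that checkpatch complains about: assignment inside the condition. */
char *dup_before(const char *s)
{
	char *p;

	if ((p = malloc(strlen(s) + 1)) == NULL)
		return NULL;
	strcpy(p, s);
	return p;
}

/* Preferred style: assign first, then test the result. */
char *dup_after(const char *s)
{
	char *p;

	p = malloc(strlen(s) + 1);
	if (!p)
		return NULL;
	strcpy(p, s);
	return p;
}
```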
Alexander Duyck
b6fef4c6b8 ipv6: Do not treat a GSO_TCPV4 request from UDP tunnel over IPv6 as invalid
This patch adds SKB_GSO_TCPV4 to the list of supported GSO types handled by
the IPv6 GSO offloads.  Without this change VXLAN tunnels running over IPv6
do not currently handle IPv4 TCP TSO requests correctly and end up handing
the non-segmented frame off to the device.

Below is the before and after for a simple netperf TCP_STREAM test between
two endpoints tunneling IPv4 over a VXLAN tunnel running on IPv6 on top of
a 1Gb/s network adapter.

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.29       0.88      Before
 87380  16384  16384    10.03     895.69      After

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23 14:18:11 -05:00
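
The check that b6fef4c6b8 relaxes can be thought of as a bitmask test against a set of supported GSO types. The flag names and values below are hypothetical (the kernel's SKB_GSO_* constants differ), and `gso_types_supported` is an illustrative helper, not a kernel function.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration only. */
enum {
	GSO_TCPV4      = 1 << 0,
	GSO_UDP        = 1 << 1,
	GSO_TCPV6      = 1 << 2,
	GSO_UDP_TUNNEL = 1 << 3,
};

/* A request is acceptable only if every requested type bit is in the supported set. */
bool gso_types_supported(unsigned int gso_type, unsigned int supported)
{
	return (gso_type & ~supported) == 0;
}
```

In this sketch an IPv4 TCP segment carried in a UDP tunnel requests `GSO_TCPV4 | GSO_UDP_TUNNEL`; it is rejected while `GSO_TCPV4` is missing from the supported mask and accepted once that bit is added.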