496736 Commits

Author SHA1 Message Date
Siva Mannem
9a05dde59a bridge: Let bridge not age 'externally' learnt FDB entries, they are removed when 'external' entity notifies the aging
When 'learned_sync' flag is turned on, the offloaded switch
 port syncs learned MAC addresses to bridge's FDB via switchdev notifier
 (NETDEV_SWITCH_FDB_ADD). Currently, FDB entries learnt via this mechanism are
 wrongly being deleted by bridge aging logic. This patch ensures that FDB
 entries synced from offloaded switch ports are not deleted by bridging logic.
 Such entries can only be deleted via switchdev notifier
 (NETDEV_SWITCH_FDB_DEL).

Signed-off-by: Siva Mannem <siva.mannem.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 13:51:10 -08:00
LEROY Christophe
4fc9b87bae net: fs_enet: Implement NETIF_F_SG feature
Freescale ethernet controllers have the capability to re-assemble fragmented
data into a single ethernet frame. This patch uses this capability and
implements the NETIF_F_SG feature in the fs_enet ethernet driver.

On a MPC885, I get 53% performance improvement on a ftp transfer of a 15Mb file:
  * Without the patch : 2,8 Mbps
  * With the patch : 4,3 Mbps

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 13:13:04 -08:00
Eric Dumazet
2bd82484bb xps: fix xps for stacked devices
A typical qdisc setup is the following :

bond0 : bonding device, using HTB hierarchy
eth1/eth2 : slaves, multiqueue NIC, using MQ + FQ qdisc

XPS allows to spread packets on specific tx queues, based on the cpu
doing the send.

Problem is that dequeues from bond0 qdisc can happen on random cpus,
due to the fact that qdisc_run() can dequeue a batch of packets.

CPUA -> queue packet P1 on bond0 qdisc, P1->ooo_okay=1
CPUA -> queue packet P2 on bond0 qdisc, P2->ooo_okay=0

CPUB -> dequeue packet P1 from bond0
        enqueue packet on eth1/eth2
CPUC -> dequeue packet P2 from bond0
        enqueue packet on eth1/eth2 using sk cache (ooo_okay is 0)

get_xps_queue() then might select wrong queue for P1, since current cpu
might be different than CPUA.

P2 might be sent on the old queue (stored in sk->sk_tx_queue_mapping),
if CPUC runs a bit faster (or CPUB spins a bit on qdisc lock)

Effect of this bug is TCP reorders, and more generally not optimal
TX queue placement. (A victim bulk flow can be migrated to the wrong TX
queue for a while)

To fix this, we have to record sender cpu number the first time
dev_queue_xmit() is called for one tx skb.

We can union napi_id (used on receive path) and sender_cpu,
granted we clear sender_cpu in skb_scrub_packet() (credit to Willem for
this union idea)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 13:02:54 -08:00
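
The fix described in commit 2bd82484bb above (record the sender cpu the first time a tx packet is transmitted, then use that recorded value for queue selection) can be pictured with a small user-space model. This is only a sketch: `struct model_pkt`, `record_sender_cpu()` and `pick_tx_queue()` are hypothetical names, the "0 means unset, otherwise cpu + 1" encoding is an assumption, and the modulo queue mapping stands in for the real XPS maps.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-packet state; not the kernel's struct sk_buff. */
struct model_pkt {
    unsigned int sender_cpu; /* assumed encoding: 0 = not recorded, else cpu + 1 */
};

/* Record the cpu only on the first transmit attempt; later calls keep the value. */
static void record_sender_cpu(struct model_pkt *pkt, unsigned int current_cpu)
{
    if (pkt->sender_cpu == 0)
        pkt->sender_cpu = current_cpu + 1;
}

/* Choose a tx queue from the recorded cpu, not from the cpu doing the dequeue. */
static unsigned int pick_tx_queue(const struct model_pkt *pkt, unsigned int num_queues)
{
    assert(num_queues > 0);
    assert(pkt->sender_cpu != 0); /* must be recorded before queue selection */
    return (pkt->sender_cpu - 1) % num_queues;
}
```

In this model a packet queued on one cpu and dequeued on another still maps to the queue of the original sender, which is the property the commit message relies on to avoid reordering.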
David S. Miller
7e8acbb69e Merge branch 'netlabel-next'
Markus Elfring says:

====================
netlabel: Deletion of a few unnecessary checks

Further update suggestions were taken into account after patches were applied
from static source code analysis.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03 16:22:20 -08:00
Markus Elfring
4de46d5ebc netlabel: Less function calls in netlbl_mgmt_add_common() after error detection
The functions "cipso_v4_doi_putdef" and "kfree" could be called in some cases
by the netlbl_mgmt_add_common() function during error handling even if the
passed variables still contained a null pointer.

* This implementation detail could be improved by adjustments for jump labels.

* Let us return immediately after the first failed function call according to
  the current Linux coding style convention.

* Let us delete also an unnecessary check for the variable "entry" there.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03 16:22:13 -08:00
Markus Elfring
7a11b1d303 netlabel: Deletion of an unnecessary check before the function call "cipso_v4_doi_free"
The cipso_v4_doi_free() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03 16:22:12 -08:00
Markus Elfring
79b7cf60e1 netlabel: Deletion of an unnecessary check before the function call "cipso_v4_doi_putdef"
The cipso_v4_doi_putdef() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03 16:22:12 -08:00
Shruti Kanetkar
132d7bcafa net/fsl_pq_mdio: Document supported compatibles
The device tree binding(s) document has fallen out of sync with the
driver code. Update the list of supported devices to reflect current
driver capabilities

Change-Id: I440d8de2ee2d9c3b7b23e69b3da851cab18a4c9a
Signed-off-by: Shruti Kanetkar <Kanetkar.Shruti@gmail.com>
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03 16:11:39 -08:00
Markus Elfring
7d37d0c159 net: sctp: Deletion of an unnecessary check before the function call "kfree"
The kfree() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-By: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:29:43 -08:00
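
The pattern removed by the Coccinelle-driven commits above (a NULL test wrapped around a call that already tolerates NULL) has a direct user-space analogue, since the C standard defines free(NULL) as a no-op. The sketch below is illustrative only: `struct demo_ctx` and `demo_ctx_release()` are hypothetical and use free() where kernel code would use kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context with two optional heap allocations. */
struct demo_ctx {
    char *name;
    int *table;
};

static void demo_ctx_release(struct demo_ctx *ctx)
{
    assert(ctx != NULL);
    /* No "if (ctx->name)" guard: free(NULL) is defined to do nothing. */
    free(ctx->name);
    free(ctx->table);
    /* Reset the pointers so that a second release is harmless. */
    ctx->name = NULL;
    ctx->table = NULL;
}
```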
David S. Miller
193cdc4a04 Merge branch 'udpv6_lockless_send'
Vladislav Yasevich says:

====================
ipv6: Add lockless UDP send path

This series introduces a lockless UDPv6 send path similar to
what Herbert Xu did for IPv4 a while ago.

There are some differences from IPv4.  IPv6 caching for flow
label is a bit different, and it requires another cork
structure that holds the IPv6 ancillary data.

Please take a look.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:28:19 -08:00
Vlad Yasevich
32dce968dd ipv6: Allow for partial checksums on non-ufo packets
Currently, if we are not doing UFO on the packet, all UDP
packets will start with CHECKSUM_NONE and thus perform full
checksum computations in software even if the device supports
IPv6 checksum offloading.

Let's start with CHECKSUM_PARTIAL if the device
supports it and we are sending only a single packet at
or below mtu size.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:28:05 -08:00
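
The condition stated in commit 32dce968dd above can be written as a single predicate. This is a user-space sketch of the rule as the message words it, not the kernel code: the enum, the function name and all parameters are hypothetical, and the real check in the IPv6 output path may test additional details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for CHECKSUM_NONE and CHECKSUM_PARTIAL. */
enum model_csum_mode { MODEL_CSUM_NONE, MODEL_CSUM_PARTIAL };

/*
 * Start with a partial checksum only when UFO is not in use, the device
 * offers IPv6 checksum offload, and a single packet at or below the mtu
 * is being sent. Otherwise fall back to a full software checksum.
 */
static enum model_csum_mode choose_udp6_csum(bool using_ufo, bool dev_csum_offload,
                                             bool single_packet, size_t packet_len,
                                             size_t mtu)
{
    assert(mtu > 0);
    if (!using_ufo && dev_csum_offload && single_packet && packet_len <= mtu)
        return MODEL_CSUM_PARTIAL;
    return MODEL_CSUM_NONE;
}
```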
Vlad Yasevich
03485f2adc udpv6: Add lockless sendmsg() support
This commit adds the same functionality to IPv6 that
commit 903ab86d19
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue Mar 1 02:36:48 2011 +0000

    udp: Add lockless transmit path

added to IPv4.

UDP transmit path can now run without a socket lock,
thus allowing multiple threads to send to a single socket
more efficiently.
This is only used when corking/MSG_MORE is not used.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:28:04 -08:00
Vlad Yasevich
d39d938c82 ipv6: Introduce udpv6_send_skb()
Now that we can individually construct IPv6 skbs to send, add a
udpv6_send_skb() function to populate the udp header and send the
skb.  This allows udp_v6_push_pending_frames() to re-use this
function as well as enables us to add lockless sendmsg() support.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:28:04 -08:00
Vlad Yasevich
6422398c2a ipv6: introduce ipv6_make_skb
This commit is very similar to
commit 1c32c5ad6f
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue Mar 1 02:36:47 2011 +0000

    inet: Add ip_make_skb and ip_finish_skb

It adds IPv6 version of the helpers ip6_make_skb and ip6_finish_skb.

The job of ip6_make_skb is to collect messages into an ipv6 packet
and populate the ipv6 header.  The job of ip6_finish_skb is to transmit
the generated skb.  Together they replicate the job of
ip6_push_pending_frames() while also providing the capability to be
called independently.  This will be needed to add lockless UDP sendmsg
support.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:28:04 -08:00
Vlad Yasevich
0bbe84a67b ipv6: Append sending data to arbitrary queue
Add the ability to append data to arbitrary queue.  This
will be needed later to implement lockless UDP sends.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:28:04 -08:00
Vlad Yasevich
366e41d977 ipv6: pull cork initialization into its own function.
Pull IPv6 cork initialization into its own function that
can be re-used.  IPv6 specific cork data did not have an
explicit data structure.  This patch creates one so that
just ipv6 cork data can be passed as arguments.  Also, since
IPv6 tries to save the flow label into inet_cork_full
structure, pass the full cork.

Adjust ip6_cork_release() to take cork data structures.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:28:04 -08:00
Anish Bhatt
ba0c39cb98 cxgb4 : Improve IEEE DCBx support, other minor open-lldp fixes
* Add support for IEEE ets & pfc api.
* Fix bug that resulted in incorrect bandwidth percentage being returned for
  CEE peers
* Convert pfc enabled info from firmware format to what dcbnl expects before
  returning

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:54:35 -08:00
Arnd Bergmann
98830dd0fe net/tulip: don't warn about unknown ARM architecture
ARM has 32-byte cache lines, which according to the comment in
the init registers function seems to work best with the default
value of 0x4800 that is also used on sparc and parisc.

This adds ARM to the same list, to use that default but no
longer warn about it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:53:34 -08:00
Arnd Bergmann
4c0c46be90 net: hip04: add missing MODULE_LICENSE
The hip04 ethernet driver causes a new compile-time warning
when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/hisilicon/hip04_eth.o
see include/linux/module.h for more information

This adds the license as "GPL", which matches the header of the file.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:51:03 -08:00
Florian Westphal
843c2fdf7a net: dctcp: loosen requirement to assert ECT(0) during 3WHS
One deployment requirement of DCTCP is to be able to run
in a DC setting along with TCP traffic. As Glenn Judd's
NSDI'15 paper "Attaining the Promise and Avoiding the Pitfalls
of TCP in the Datacenter" [1] (tba) explains, one way to
solve this on switch side is to split DCTCP and TCP traffic
in two queues per switch port based on the DSCP: one queue
solely intended for DCTCP traffic and one for non-DCTCP traffic.

For the DCTCP queue, there's the marking threshold K as
explained in commit e3118e8359 ("net: tcp: add DCTCP congestion
control algorithm") for RED marking ECT(0) packets with CE.
For the non-DCTCP queue, there's f.e. a classic tail drop queue.
As already explained in e3118e8359, running DCTCP at scale
when not marking SYN/SYN-ACK packets with ECT(0) has severe
consequences as for non-ECT(0) packets, traversing the RED
marking DCTCP queue will result in a severe reduction of
connection probability.

This is due to the DCTCP queue being dominated by ECT(0) traffic
and switches handle non-ECT traffic in the RED marking queue
after passing K as drops, where K is usually a low watermark
in order to leave enough tailroom for bursts. Splitting DCTCP
traffic among several queues (ECN and non-ECN queue) is being
considered a terrible idea in the network community as it
splits single flows across multiple network paths.

Therefore, commit e3118e8359 implements this on Linux as
ECT(0) marked traffic, as we argue that marking all packets
of a DCTCP flow is the only viable solution and also doesn't
speak against the draft.

However, recently, a DCTCP implementation for FreeBSD hit also
their mainline kernel [2]. In order to let them play well
together with Linux' DCTCP, we would need to loosen the
requirement that ECT(0) has to be asserted during the 3WHS as
not implemented in FreeBSD. This simplifies the ECN test and
lets DCTCP work together with FreeBSD.

Joint work with Daniel Borkmann.

  [1] https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/judd
  [2] 8ad8794452

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:48:55 -08:00
David S. Miller
6942241616 Merge branch 'net-timestamp'
Willem de Bruijn says:

====================
net-timestamp: blinding

Changes
  (v2 -> v3)
  - rebase only: v2 did not make it to patchwork / netdev
  (v1 -> v2)
  - fix capability check in patch 2
      this could be moved into net/core/sock.c as sk_capable_nouser()
  (rfc -> v1)
  - dropped patch 4: timestamp batching
      due to complexity, as discussed
  - dropped patch 5: default mode
      because it does not really cover all use cases, as discussed
  - added documentation
  - minor fix, see patch 2

Two issues were raised during recent timestamping discussions:
1. looping full packets on the error queue exposes packet headers
2. TCP timestamping with retransmissions generates many timestamps

This RFC patchset is an attempt at addressing both without breaking
legacy behavior.

Patch 1 reintroduces the "no payload" timestamp option, which loops
timestamps onto an empty skb. This reduces the pressure on SO_RCVBUF
from looping many timestamps. It does not reduce the number of recv()
calls needed to process them. The timestamp cookie mechanism developed
in http://patchwork.ozlabs.org/patch/427213/ did, but this is
considerably simpler.

Patch 2 then gives administrators the power to block all timestamp
requests that contain data by unprivileged users. I proposed this
earlier as a backward compatible workaround in the discussion of

  net-timestamp: pull headers for SOCK_STREAM
  http://patchwork.ozlabs.org/patch/414810/

Patch 3 only updates the txtimestamp example to test this option.
Verified that with option '-n', length is zero in all cases and
option '-I' (PKTINFO) stops working.
====================

Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:46:57 -08:00
Willem de Bruijn
2368592365 net-timestamp: no-payload option in txtimestamp test
Demonstrate how SOF_TIMESTAMPING_OPT_TSONLY can be used and
test the implementation.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:46:51 -08:00
Willem de Bruijn
b245be1f4d net-timestamp: no-payload only sysctl
Tx timestamps are looped onto the error queue on top of an skb. This
mechanism leaks packet headers to processes unless the no-payload
option SOF_TIMESTAMPING_OPT_TSONLY is set.

Add a sysctl that optionally drops looped timestamps with data. This
only affects processes without CAP_NET_RAW.

The policy is checked when timestamps are generated in the stack.
It is possible for timestamps with data to be reported after the
sysctl is set, if these were queued internally earlier.

No vulnerability is immediately known that exploits knowledge
gleaned from packet headers, but it may still be preferable to allow
administrators to lock down this path at the cost of possible
breakage of legacy applications.

Signed-off-by: Willem de Bruijn <willemb@google.com>

----

Changes
  (v1 -> v2)
  - test socket CAP_NET_RAW instead of capable(CAP_NET_RAW)
  (rfc -> v1)
  - document the sysctl in Documentation/sysctl/net.txt
  - fix access control race: read .._OPT_TSONLY only once,
        use same value for permission check and skb generation.
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:46:51 -08:00
Willem de Bruijn
49ca0d8bfa net-timestamp: no-payload option
Add timestamping option SOF_TIMESTAMPING_OPT_TSONLY. For transmit
timestamps, this loops timestamps on top of empty packets.

Doing so reduces the pressure on SO_RCVBUF. Payload inspection and
cmsg reception (aside from timestamps) are no longer possible. This
works together with a follow on patch that allows administrators to
only allow tx timestamping if it does not loop payload or metadata.

Signed-off-by: Willem de Bruijn <willemb@google.com>

----

Changes (rfc -> v1)
  - add documentation
  - remove unnecessary skb->len test (thanks to Richard Cochran)
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:46:51 -08:00
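
From user space, the no-payload option introduced in commit 49ca0d8bfa above is requested through the SO_TIMESTAMPING socket option. The following is a minimal Linux-only sketch, assuming kernel headers that already define SOF_TIMESTAMPING_OPT_TSONLY; the helper names `tsonly_tx_flags()` and `enable_tsonly_tx_timestamps()` are hypothetical, and the choice of software tx timestamps is just one possible combination of flags.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Flag set for software tx timestamps looped back without the packet payload. */
static unsigned int tsonly_tx_flags(void)
{
    return SOF_TIMESTAMPING_TX_SOFTWARE |   /* generate timestamps on transmit */
           SOF_TIMESTAMPING_SOFTWARE |      /* report software timestamps */
           SOF_TIMESTAMPING_OPT_TSONLY;     /* loop the timestamp on an empty skb */
}

/* Returns 0 on success, -1 with errno set on failure (as setsockopt does). */
static int enable_tsonly_tx_timestamps(int fd)
{
    unsigned int flags = tsonly_tx_flags();

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

With this option set, timestamps are still read from the socket error queue via recvmsg() with MSG_ERRQUEUE, but, as the commit message notes, payload inspection and other cmsg reception are no longer possible on those messages.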
David Ahern
9766e97af1 net: rocker: Add support for retrieving port level statistics
Add support for retrieving port level statistics from device.
Hook is added for ethtool's stats functionality. For example,

$ ethtool -S eth3
NIC statistics:
     rx_packets: 12
     rx_bytes: 2790
     rx_dropped: 0
     rx_errors: 0
     tx_packets: 8
     tx_bytes: 728
     tx_dropped: 0
     tx_errors: 0

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:17:39 -08:00
David S. Miller
fe3ef61653 Merge branch 'switchdev_offload_flags'
Roopa Prabhu says:

====================
switchdev offload flags

This patch series introduces new offload flags for switchdev.
Kernel network subsystems can use this flag to accelerate
network functions by offloading to hw.

I expect that there will be need for subsystem specific feature
flag in the future.

This patch series currently only addresses bridge driver link
attribute offloads to hardware.

Looking at the current state of bridge l2 offload in the kernel,
    - flag 'self' is the way to directly manage the bridge device in hw via
      the ndo_bridge_setlink/ndo_bridge_getlink calls

    - flag 'master' is always used to manage the in kernel bridge devices
      via the same ndo_bridge_setlink/ndo_bridge_getlink calls

Today these are used separately. The nic offloads use hwmode "vepa/veb" to go
directly to hw with the "self" flag.

At this point i am trying not to introduce any new user facing flags/attributes.
In the model where we want the kernel bridging to be accelerated with
hardware, we very much want the bridge driver to be involved.

In this proposal,
- The offload flag/bit helps switch asic drivers to indicate that they
  accelerate the kernel networking objects/functions
- The user does not have to specify a new flag to do so. A bridge created with
  switch asic ports will be accelerated if the switch driver supports it.
- The user can continue to directly manage l2 in nics (ixgbe) using the
  existing hwmode/self flags
- It also does not stop users from using the 'self' flag to talk to the
  switch asic driver directly
- Involving the bridge driver makes sure the add/del notifications to user
  space go out after both kernel and hardware are programmed

(To selectively offload bridge port attributes,
example learning in hw only etc, we can introduce offload bits for
per bridge port flag attribute as in my previous patch
https://patchwork.ozlabs.org/patch/413211/. I have not included that in this
series)

v2
   - try a different name for the offload flag/bit
   - tries to solve the stacked netdev case by traversing the lowerdev
     list to reach the switch port

v3 -
    - Tested with bond as bridge port for the stacked device case.
      Includes a bond_fix_features change to not ignore the
      NETIF_F_HW_NETFUNC_OFFLOAD flag
    - Some checkpatch fixes

v4 -
    - rename flag to NETIF_F_HW_SWITCH_OFFLOAD
    - add ndo_bridge_setlink/dellink handlers in bond and team drivers as
      suggested by jiri.
    - introduce default ndo_dflt_netdev_switch_port_bridge_setlink/dellink
    handlers that masters can use to call offload api on lowerdevs.
====================

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
2015-02-01 23:16:40 -08:00
Roopa Prabhu
a16a8ee7f6 team: handle NETIF_F_HW_SWITCH_OFFLOAD flag and add ndo_bridge_setlink/dellink handlers
Currently ndo_bridge_setlink and ndo_bridge_dellink handlers point
to the default switchdev handlers

This follows my bonding driver changes.

I have only compile tested this patch. However similar
bonding code has been tested.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:34 -08:00
Roopa Prabhu
c158cba38c bonding: handle NETIF_F_HW_SWITCH_OFFLOAD flag and add ndo_bridge_setlink/dellink handlers
We want bond to pick up the offload flag if any of its slaves have it.

NETIF_F_HW_SWITCH_OFFLOAD flag is added to the mask, so that
netdev_increment_features does not ignore it.

This also adds ndo_bridge_setlink and ndo_bridge_dellink handlers.
These currently point to the default handlers provided by the
switchdev api.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:34 -08:00
Roopa Prabhu
eb0ac4207f rocker: set feature NETIF_F_HW_SWITCH_OFFLOAD
This patch sets the NETIF_F_HW_SWITCH_OFFLOAD feature flag on rocker ports

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:34 -08:00
Roopa Prabhu
68e331c785 bridge: offload bridge port attributes to switch asic if feature flag set
This patch adds support to set/del bridge port attributes in hardware from
the bridge driver.

With this, when the user sends a bridge setlink message with no flags or
master flags set,
   - the bridge driver ndo_bridge_setlink handler sets settings in the kernel
   - calls the switchdev api to propagate the attrs to the switchdev
	hardware

   You can still use the self flag to go to the switch hw or switch port
   driver directly.

With this, it also makes sure a notification goes out only after the
attributes are set both in the kernel and hw.

The patch calls switchdev api only if BRIDGE_FLAGS_SELF is not set.
This is because the offload cases with BRIDGE_FLAGS_SELF are handled in
the caller (in rtnetlink.c).

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:34 -08:00
Roopa Prabhu
8a44dbb202 swdevice: add new apis to set and del bridge port attributes
This patch adds two new api's netdev_switch_port_bridge_setlink
and netdev_switch_port_bridge_dellink to offload bridge port attributes
to switch port

(The names of the apis look odd with 'switch_port_bridge',
but am more inclined to change the prefix of the api to something else.
Will take any suggestions).

The api's look at the NETIF_F_HW_SWITCH_OFFLOAD feature flag to
pass bridge port attributes to the port device.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:34 -08:00
Roopa Prabhu
add511b382 bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink
bridge flags are needed inside ndo_bridge_setlink/dellink handlers to
avoid another call to parse IFLA_AF_SPEC inside these handlers

This is used later in this series

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:33 -08:00
Roopa Prabhu
aafb3e98b2 netdev: introduce new NETIF_F_HW_SWITCH_OFFLOAD feature flag for switch device offloads
This is a high level feature flag for all switch asic offloads

switch drivers set this flag on switch ports. Logical devices like
bridge, bonds, vxlans can inherit this flag from their slaves/ports.

The patch also adds the flag to NETIF_F_ONE_FOR_ALL, so that it gets
propagated to the upperdevices (bridges and bonds).

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:33 -08:00
Sonic Zhang
b2dec116fb stmmac: DMA threshold mode or SF mode can be different among multiple device instance
- In tx_hard_error_bump_tc interrupt, tc should be bumped only when current
device instance is in DMA threshold mode. Check per device xstats.threshold
other than global tc.

- Set per device xstats.threshold to SF_DMA_MODE when current device
instance is set to SF mode.

v2-changes:
- fix ident style

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:14:24 -08:00
Hariprasad Shenai
3051fa617a cxgb4: Remove preprocessor check for CONFIG_CXGB4_DCB
In commit dc9daab226 ("cxgb4: Added support in debugfs to dump
sge_qinfo") a preprocessor check for CONFIG_CXGB4_DCB got added, which should
have been CONFIG_CHELSIO_T4_DCB. After adding the right preprocessor, build
fails due to missing function ethqset2pinfo. Fixing that as well.

V2: Updated description since the patch also fixes build failure

Reported-by: Paul Bolle <pebolle@tiscal.nl>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 22:38:00 -08:00
David S. Miller
2caabb3d2e Merge branch 'hso-next'
Olivier Sobrie says:

====================
hso: fix some problems in the disconnect path

These patches attempt to fix some problems I observed when the hso
device is disconnected.
Several patches of this series fix crashes or memleaks when a
hso device is disconnected.
This series of patches is based on v3.18.

changes in v2:
 - Last patch of the series dropped since another patch fixes the issue.
   See http://marc.info/?l=linux-usb&m=142186699418489 for more info.

 - Added an extra patch avoiding name conflicts for the rfkill interface.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:31 -08:00
Olivier Sobrie
38121067b1 hso: fix rfkill name conflicts
By using only the usb interface number for the rfkill name, we might
have name conflicts in case two similar hso devices are connected.

In this patch, the name of the hso rfkill interface embeds the value
of a counter that is incremented each time a new rfkill interface is
added.

Suggested-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:27 -08:00
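
The naming scheme described in commit 38121067b1 above (embed a counter that is incremented for each new rfkill interface) can be sketched in user space as follows. The "hso-%u" format, the counter variable and the helper name are assumptions made for illustration, not the driver's actual identifiers, and the sketch is single-threaded with no locking around the counter.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter, bumped once per created rfkill interface. */
static unsigned int demo_rfkill_counter;

/*
 * Write a unique interface name into buf. Returns the snprintf() result,
 * i.e. the length the full name needs, so callers can detect truncation.
 */
static int demo_rfkill_name(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "hso-%u", demo_rfkill_counter++);
}
```

Because the name no longer depends on the usb interface number alone, two similar devices plugged in at the same time receive distinct names.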
Olivier Sobrie
cc491970f5 hso: add missing cancel_work_sync in disconnect()
For hso serial devices, two cancel_work_sync were missing in the
disconnect method.

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:27 -08:00
Olivier Sobrie
301d3b7e10 hso: update serial_table in usb disconnect method
The serial_table is used to map the minor number of the usb serial device
to its associated context. The table is updated in the probe method and
in hso_serial_ref_free() which is called either from the tty cleanup
method or from the usb disconnect method.
This patch ensures that the serial_table is updated in the disconnect
method and no more from the cleanup method to avoid the following
potential race condition.

 - hso_disconnect() is called for usb interface "x". Because the serial
   port was open and because the cleanup method of the tty_port hasn't
   been called yet, hso_serial_ref_free() is not run.
 - hso_probe() is called and fails for a new hso serial usb interface
   "y". The function hso_free_interface() is called and iterates
   over the elements of serial_table to find the device associated to
   the usb interface context.
   If the usb interface context of usb interface "y" has been created
   at the same place as for usb interface "x", then the cleanup
   functions are called for usb interfaces "x" and "y" and
   hso_serial_ref_free() is called for both interfaces.
 - release_tty() is called for serial port linked to usb interface "x"
   and possibly crashes because the tty_port structure contained in the
   hso_device structure has been freed.

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:27 -08:00
Olivier Sobrie
69b377b31b hso: move tty_unregister outside hso_serial_common_free()
The function hso_serial_common_free() is called either by the cleanup
method of the tty or by the usb disconnect method.
In the former case, the usb_disconnect() has been already called
and the sysfs group associated to the device has been removed.
By calling tty_unregister directly from the usb_disconnect() method,
we avoid a warning due to the removal of the sysfs group of the usb
device.

Example of warning:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 778 at fs/sysfs/group.c:225 sysfs_remove_group+0x50/0x94()
sysfs group c0645a88 not found for kobject 'ttyHS5'
Modules linked in:
CPU: 0 PID: 778 Comm: kworker/0:3 Tainted: G        W      3.18.0+ #105
Workqueue: events release_one_tty
[<c000dfe4>] (unwind_backtrace) from [<c000c014>] (show_stack+0x14/0x1c)
[<c000c014>] (show_stack) from [<c0016bac>] (warn_slowpath_common+0x5c/0x7c)
[<c0016bac>] (warn_slowpath_common) from [<c0016c60>] (warn_slowpath_fmt+0x30/0x40)
[<c0016c60>] (warn_slowpath_fmt) from [<c00ddd14>] (sysfs_remove_group+0x50/0x94)
[<c00ddd14>] (sysfs_remove_group) from [<c0221e44>] (device_del+0x30/0x190)
[<c0221e44>] (device_del) from [<c0221fb0>] (device_unregister+0xc/0x18)
[<c0221fb0>] (device_unregister) from [<c0221fec>] (device_destroy+0x30/0x3c)
[<c0221fec>] (device_destroy) from [<c01fe1dc>] (tty_unregister_device+0x2c/0x5c)
[<c01fe1dc>] (tty_unregister_device) from [<c029a428>] (hso_serial_common_free+0x2c/0x88)
[<c029a428>] (hso_serial_common_free) from [<c029a4c0>] (hso_serial_ref_free+0x3c/0xb8)
[<c029a4c0>] (hso_serial_ref_free) from [<c01ff430>] (release_one_tty+0x30/0x84)
[<c01ff430>] (release_one_tty) from [<c00271d4>] (process_one_work+0x21c/0x3c8)
[<c00271d4>] (process_one_work) from [<c0027758>] (worker_thread+0x3d8/0x560)
[<c0027758>] (worker_thread) from [<c002be4c>] (kthread+0xc0/0xcc)
[<c002be4c>] (kthread) from [<c0009630>] (ret_from_fork+0x14/0x24)
---[ end trace cb88537fdc8fa208 ]---

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:27 -08:00
Olivier Sobrie
26c1f1f544 hso: replace reset_device work by usb_queue_reset_device()
There is no need for a dedicated reset work in the hso driver since
there is already a reset work foreseen in usb_interface that does
the same.

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:27 -08:00
Olivier Sobrie
f6516b697c hso: rename hso_dev into serial in hso_free_interface()
In other functions of the driver, variables of type "struct hso_serial"
are denoted by "serial" and variables of type "struct hso_device" are
denoted by "hso_dev". This patch makes the hso_free_interface()
consistent with these notations.

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:27 -08:00
Olivier Sobrie
799276791f hso: fix small indentation error
Simply remove the useless extra tab.

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:26 -08:00
Olivier Sobrie
2e6d01ff75 hso: fix memory leak in hso_create_rfkill()
When the rfkill interface was created, a buffer containing the name
of the rfkill node was allocated. This buffer was never freed when the
device disappeared.

To fix the problem, we store the name given to rfkill_alloc() in
the hso_net structure.
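
As a rough userspace illustration of the idea (not the driver's code),
the sketch below keeps the name in a fixed-size array inside the owning
structure, so no separate buffer is left to leak. The struct, the
function names, the 24-byte size and the "hso-%d" format are all
assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the structure that owns the rfkill name. */
struct net_model {
    char name[24];   /* embedded buffer; the size is an assumption */
};

/* Allocate the owner and format the name directly into its buffer. */
struct net_model *net_model_create(int ifnum)
{
    struct net_model *n;

    assert(ifnum >= 0);
    n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    /* snprintf() never writes past sizeof(n->name) and always terminates. */
    snprintf(n->name, sizeof(n->name), "hso-%d", ifnum);
    return n;
}

/* A single free() releases the owner and the name together. */
void net_model_destroy(struct net_model *n)
{
    free(n);
}
```

Because the name lives inside the structure, its lifetime is tied to
that of the owner and no extra cleanup step is required.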

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:26 -08:00
Olivier Sobrie
295fc56f46 hso: fix memory leak when device disconnects
In the disconnect path, tx_buffer should be freed like tx_data to avoid
a memory leak when the device disconnects.

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:26 -08:00
Olivier Sobrie
29bd3bc119 hso: fix crash when device disappears while serial port is open
When the device disappears, the function hso_disconnect() is called to
perform cleanup. In the cleanup function, hso_free_interface() calls
tty_port_tty_hangup() to schedule work that hangs up the tty if
needed. If the port was not open, hso_serial_ref_free() is called
directly to clean up everything. Otherwise, hso_serial_ref_free() is
called when the last fd associated with the port is closed.

For each open port, tty_release() calls the close method,
hso_serial_close(), which drops the last kref and calls
hso_serial_ref_free(), which unregisters and destroys the tty port
and finally frees the structure in which the tty_port structure
is embedded. Later in tty_release(), more precisely when release_tty()
is called, the previously freed tty_port is accessed to cancel
the tty buf workqueue, which leads to a crash.

To avoid this crash, we add a cleanup method that is called
at the end of the hangup process, and we drop the last kref in this
function, once all the ports have been closed, tty_port is no
longer needed and it is safe to free the structure containing the
tty_port structure.
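
The ordering described above can be modelled in plain userspace C.
This is only a simplified sketch with hypothetical names (serial_model,
serial_cleanup() and so on); it is not the driver's code and it uses a
bare counter in place of a kref. It shows why deferring the final put
from close to a later cleanup step keeps the embedded port valid while
the tty core still touches it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the port embedded in the container. */
struct port_model {
    int buf_work_pending;
};

struct serial_model {
    int refs;                 /* plays the role of the kref */
    struct port_model port;   /* embedded, so it dies with the container */
    int *freed_flag;          /* lets the caller observe the free */
};

struct serial_model *serial_new(int *freed_flag)
{
    struct serial_model *s = calloc(1, sizeof(*s));

    if (!s)
        return NULL;
    s->refs = 1;              /* reference held by the interface */
    s->freed_flag = freed_flag;
    *freed_flag = 0;
    return s;
}

/* Drop one reference; free the container when the count reaches zero. */
static void serial_put(struct serial_model *s)
{
    assert(s->refs > 0);
    if (--s->refs == 0) {
        *s->freed_flag = 1;
        free(s);
    }
}

/* open: take a reference on behalf of the open port. */
void serial_open(struct serial_model *s) { s->refs++; }

/* disconnect: the interface gives up its own reference. */
void serial_disconnect(struct serial_model *s) { serial_put(s); }

/* close: deliberately drops nothing; the port is still needed afterwards. */
void serial_close(struct serial_model *s) { (void)s; }

/* Models the tty core cancelling buffer work after close has returned. */
void tty_core_release(struct serial_model *s) { s->port.buf_work_pending = 0; }

/* cleanup: last step of the teardown, drops the reference taken in open. */
void serial_cleanup(struct serial_model *s) { serial_put(s); }
```

In this model the sequence open, disconnect, close, tty_core_release,
cleanup frees the container only in the last step. If close dropped the
reference instead, tty_core_release() would write to freed memory,
which mirrors the crash described in the commit message.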

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:26 -08:00
Olivier Sobrie
3ac856c100 hso: remove useless header file timer.h
No timer-related function is used in this driver.

Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 12:33:26 -08:00
Eric Dumazet
349c9e3c73 ipv4: icmp: use percpu allocation
Get rid of nr_cpu_ids and use modern percpu allocation.

Note that the sockets themselves are not yet allocated
using NUMA affinity.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31 17:48:18 -08:00
Karicheri, Muralidharan
58c11b5fae drivers: net: cpsw: make cpsw_ale.c a module to allow re-use on Keystone
NetCP on Keystone has a CPSW ALE function similar to that of other TI
SoCs, so this driver is re-used. To allow both TI CPSW and Keystone NetCP
to re-use the driver, convert the CPSW ALE to a module and configure
it through the Kconfig option CONFIG_TI_CPSW_ALE. Currently it is
statically linked into both TI CPSW and NetCP, which causes issues when
those drivers are built as dynamic modules. This patch addresses this issue.

While at it, fix the Makefile and code to build both netcp_core and
netcp_ethss as dynamic modules. This is needed to support arm allmodconfig.
This also requires exporting the API calls provided by netcp_core so that
both of the above can be dynamic modules.

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Tested-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31 17:33:07 -08:00
Kenneth Klette Jonassen
932eb7638a tcp: use SACK RTTs for CC
Current behavior only passes RTTs from sequentially acked data to CC.

If the sender gets a combined ACK for segment 1 and SACK for segment 3,
then the computed RTT for CC is the time between sending segment 1 and
receiving the SACK for segment 3.

Pass the minimum computed RTT from any acked data to CC, i.e. time between
sending segment 3 and receiving SACK for segment 3.
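
The difference between the two measurements can be sketched in
userspace C as follows. This is an illustration only, not the kernel's
implementation: the seg structure, both function names and the
microsecond timestamps are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one transmitted segment (not a kernel struct). */
struct seg {
    long sent_us;   /* send time in microseconds */
    int  acked;     /* nonzero if covered by the cumulative ACK or a SACK block */
};

/*
 * RTT measured from the first acked segment in send order to the
 * arrival of the ACK. Returns -1 if no segment was acked.
 */
long first_acked_rtt_us(const struct seg *segs, size_t n, long ack_us)
{
    for (size_t i = 0; i < n; i++) {
        if (segs[i].acked) {
            assert(ack_us >= segs[i].sent_us);
            return ack_us - segs[i].sent_us;
        }
    }
    return -1;
}

/*
 * Smallest RTT over every acked segment, cumulatively acked or SACKed.
 * Returns -1 if no segment was acked.
 */
long min_acked_rtt_us(const struct seg *segs, size_t n, long ack_us)
{
    long best = -1;

    for (size_t i = 0; i < n; i++) {
        long rtt;

        if (!segs[i].acked)
            continue;
        assert(ack_us >= segs[i].sent_us);
        rtt = ack_us - segs[i].sent_us;
        if (best < 0 || rtt < best)
            best = rtt;
    }
    return best;
}
```

For three segments sent at 0, 10000 and 20000 us, with segments 1 and 3
acked by an ACK arriving at 50000 us, first_acked_rtt_us() yields
50000 us while min_acked_rtt_us() yields 30000 us, the value measured
from segment 3.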

Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31 17:25:37 -08:00