Commit Graph

43494 Commits

Author SHA1 Message Date
Marcel Holtmann
9db5c62951 Bluetooth: Use command status event for Set IO Capability errors
In case of failure, the Set IO Capability command is supposed to return
command status and not command complete.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
Marcel Holtmann
56f787c502 Bluetooth: Fix wrong Get Clock Information return parameters
The address information in the Get Clock Information return parameters
is copied from the wrong memory location. The code uses &cmd->param when
it actually needs cmd->param.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
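The &cmd->param versus cmd->param distinction in 56f787c502 can be illustrated with a minimal sketch. The struct, the function names and the 4-byte PARAM_LEN below are hypothetical stand-ins, not the kernel's actual definitions; the sketch assumes only that param is a pointer to the parameter buffer.

```c
#include <string.h>

#define PARAM_LEN 4   /* hypothetical parameter length for this sketch */

/* Hypothetical stand-in for a pending command: param points at the
 * buffer holding the command parameters. */
struct pending_cmd {
    void *param;
};

/* Buggy shape: &cmd->param is the address of the pointer field itself,
 * so this copies the pointer's own bytes, not the parameters. */
static void copy_param_buggy(unsigned char *dst, const struct pending_cmd *cmd)
{
    memcpy(dst, &cmd->param, PARAM_LEN);
}

/* Fixed shape: cmd->param is the buffer the pointer refers to. */
static void copy_param_fixed(unsigned char *dst, const struct pending_cmd *cmd)
{
    memcpy(dst, cmd->param, PARAM_LEN);
}
```

Both variants compile without warnings because memcpy() takes a void pointer, which is why this class of bug is easy to miss in review.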
Marcel Holtmann
5504c3a310 Bluetooth: Use individual flags for certain management events
Instead of hiding everything behind a general management events flag,
introduce individual flags that allow fine control over which events are
sent to a given management channel.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
Johan Hedberg
37d3a1fab5 Bluetooth: mgmt: Fix sending redundant event for Advertising Instance
When an Advertising Instance is removed, the Advertising Removed event
shouldn't be sent to the same socket that issued the Remove
Advertising command (it gets a command complete event instead). The
mgmt_advertising_removed() function already has a parameter for
skipping a specific socket, but there was no code to propagate the
right value to this parameter. This patch fixes the issue by making
sure the intermediate hci_req_clear_adv_instance() function gets the
socket pointer.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-09-19 20:19:34 +02:00
Marcel Holtmann
38ceaa00d0 Bluetooth: Add support for sending MGMT commands and events to monitor
This adds support for tracing all management commands and events via the
monitor interface.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
Marcel Holtmann
249fa1699f Bluetooth: Add support for sending MGMT open and close to monitor
This sends new notifications to the monitor support whenever a
management channel has been opened or closed. This allows tracing of
control channels really easily.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
Marcel Holtmann
03c979c471 Bluetooth: Introduce helper to pack mgmt version information
The mgmt version information will also be needed for the control
channel tracing feature. This provides a helper to pack it.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
Marcel Holtmann
70ecce91e3 Bluetooth: Store control socket cookie and comm information
To further allow unique identification and tracking of control sockets,
store cookie and comm information when binding the socket.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
Marcel Holtmann
47b0f573f2 Bluetooth: Check SOL_HCI for raw socket options
The SOL_HCI level should be enforced when using socket options on the
HCI raw socket interface.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
Aristeu Rozanski
bd89bb6daa mac802154: use rate limited warnings for malformed frames
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-09-19 20:19:34 +02:00
Aristeu Rozanski
ca1de81aa2 mac802154: don't warn on unsupported frames
Just because we don't support certain types of frames yet doesn't mean
we have to flood the message log with warnings about "invalid" frames.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-09-19 20:19:34 +02:00
Alexander Aring
5ddedce3b7 6lowpan: ndisc: no overreact if no short address is available
This patch removes the handling that drops the short address of a
neighbour entry if an RS/RA/NS/NA doesn't contain a short address. If
these messages don't have a short address option, the existing short
address from the ndisc cache will be used. The current behaviour marks
the neighbour as no longer having a short address.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-09-19 20:19:34 +02:00
Alexander Aring
abbcc341ad mac802154: set phy net namespace for new ifaces
This patch sets the net namespace when creating SoftMAC interfaces. This
is important if the namespace at the phy layer was switched before.
Currently we lose interfaces in some namespaces and it's not possible
to recover them.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-09-19 20:19:34 +02:00
Marcel Holtmann
e64c97b53b Bluetooth: Add combined LED trigger for controller power
Instead of just having a LED trigger for power on a specific controller,
this adds the LED trigger "bluetooth-power" that combines the power
states of all controllers into a single trigger. This simplifies the
trigger selection and also supports multiple controllers per host
system via a single LED.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-09-19 20:19:34 +02:00
David S. Miller
e867e87ae8 RxRPC rewrite

Merge tag 'rxrpc-rewrite-20160917-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Tracepoint addition and improvement

Here is a set of patches that add some more tracepoints and improve a couple
of existing ones.  New additions include:

 (1) Connection refcount tracking.

 (2) Client connection state machine tracking.

 (3) Tx and Rx packet lifecycle.

 (4) ACK reception and transmission.

 (5) recvmsg processing.

Updates include:

 (1) Print the symbolic packet name in the Rx packet tracepoint.

 (2) Additional call refcount trace events.

 (3) Improvements to sk_buff tracking with AF_RXRPC.

In addition:

 (1) Config option to inject packet loss during both transmission and
     reception.

 (2) Removal of some printks.

This series needs to be applied on top of the previously posted fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:52:21 -04:00
David S. Miller
5b0c6fc8ef RxRPC rewrite

Merge tag 'rxrpc-rewrite-20160917-1' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fixes & miscellany

Here are some more AF_RXRPC fix patches with a couple of miscellaneous
changes also.  Fixes include:

 (1) Make RxRPC IPv6 support conditional on IPv6 being available.

 (2) Move the condition check in rxrpc_locate_data() into the caller and
     check the error return.

 (3) Fix the detection of the last received packet in recvmsg.

 (4) Account calls that need acceptance and clean up any unaccepted ones if
     the socket gets closed.

 (5) Fix the cleanup of client connections.

 (6) Fix the soft-ACK parsing and the retransmission of packets based on
     those ACKs.

 (7) Suppress transmission of an ACK when there's no pending ACK to
     transmit because another thread stole it.

And some miscellany:

 (8) Whitespace removal.

 (9) Switch-value consistency in rxrpc_send_call_packet().

(10) Fix the basic transmission packet size to allow for spur-of-the-moment
     jumbo DATA packet production.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:51:21 -04:00
Florian Westphal
48da34b7a7 sched: add and use qdisc_skb_head helpers
This change replaces sk_buff_head struct in Qdiscs with new qdisc_skb_head.

It's similar to the sk_buff_head API, but does not use skb->prev pointers.

Qdiscs will commonly enqueue at the tail of a list and dequeue at head.
While sk_buff_head works fine for this, enqueue/dequeue also needs to
adjust the prev pointer of the next element.

The ->prev pointer is not required for qdiscs so we can just leave
it undefined and avoid one cacheline write access for en/dequeue.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
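The idea behind 48da34b7a7 (enqueue at the tail, dequeue at the head, never touch a prev pointer) can be sketched as a singly-linked queue. This is a minimal illustration only: struct pkt, struct pkt_queue and both helpers are hypothetical names, not the kernel's qdisc_skb_head API.

```c
#include <stddef.h>

struct pkt {                 /* hypothetical stand-in for struct sk_buff */
    struct pkt *next;
    int id;
};

struct pkt_queue {           /* hypothetical stand-in for qdisc_skb_head */
    struct pkt *head;
    struct pkt *tail;
    unsigned int qlen;
};

/* Append at the tail: only the old tail's next pointer is written. */
static void pkt_enqueue_tail(struct pkt_queue *q, struct pkt *p)
{
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->qlen++;
}

/* Remove from the head: the following element is not written at all,
 * whereas a doubly-linked list would have to update its prev pointer. */
static struct pkt *pkt_dequeue_head(struct pkt_queue *q)
{
    struct pkt *p = q->head;

    if (p) {
        q->head = p->next;
        if (!q->head)
            q->tail = NULL;
        p->next = NULL;
        q->qlen--;
    }
    return p;
}
```

The trade-off is that removal from the middle or the tail is no longer O(1), which the commit message argues qdiscs do not need.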
Florian Westphal
ed760cb8aa sched: replace __skb_dequeue with __qdisc_dequeue_head
After previous patch these functions are identical.
Replace __skb_dequeue in qdiscs with __qdisc_dequeue_head.

The next patch will then make __qdisc_dequeue_head handle a
singly-linked list instead of a struct sk_buff_head argument.

Doesn't change generated code.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Florian Westphal
ec32336879 sched: remove qdisc arg from __qdisc_dequeue_head
Moves qdisc stat accounting to qdisc_dequeue_head.

The only direct caller of the __qdisc_dequeue_head version open-codes
this now.

This allows us to later use __qdisc_dequeue_head as a replacement
of __skb_dequeue() (which operates on sk_buff_head list).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Florian Westphal
97d0678f91 sched: don't use skb queue helpers
A followup change will replace the sk_buff_head in the qdisc
struct with a slightly different list.

Use of the sk_buff_head helpers will thus cause compiler
warnings.

Open-code these accesses in an extra change to ease review.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Florian Westphal
1486587b2f pie: use qdisc_dequeue_head wrapper
Doesn't change generated code.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Christophe Jaillet
e8bc8f9a67 sctp: Remove some redundant code
In commit 311b21774f ("sctp: simplify sk_receive_queue locking"), a call
to 'skb_queue_splice_tail_init()' has been made explicit. Previously it was
hidden in 'sctp_skb_list_tail()'

Now, the code around it looks redundant. The '_init()' part of
'skb_queue_splice_tail_init()' should already do the same.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:34:01 -04:00
Mahesh Bandewar
e8bffe0cf9 net: Add _nf_(un)register_hooks symbols
Add _nf_register_hooks() and _nf_unregister_hooks() calls which allow
caller to hold RTNL mutex.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:25:22 -04:00
Mahesh Bandewar
d409b84768 ipv6: Export ip6_route_input_lookup symbol
Make ip6_route_input_lookup available outside of the ipv6 module,
similar to ip_route_input_noref in the IPv4 world.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:25:22 -04:00
Nogah Frankel
69ae6ad2ff net: core: Add offload stats to if_stats_msg
Add a nested attribute of offload stats to if_stats_msg
named IFLA_STATS_LINK_OFFLOAD_XSTATS.
Under it, add SW stats, meaning stats only for packets that went via
the slowpath to the CPU, named IFLA_OFFLOAD_XSTATS_CPU_HIT.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:33:42 -04:00
David S. Miller
c13ed534b8 This time we have various things - all across the board:
 * MU-MIMO sniffer support in mac80211
 * a create_singlethread_workqueue() cleanup
 * interface dump filtering that was documented but not implemented
 * support for the new radiotap timestamp field
 * send delBA in two unexpected conditions (as required by the spec)
 * connect keys cleanups - allow only WEP with index 0-3
 * per-station aggregation limit to work around broken APs
 * debugfs improvement for the integrated codel algorithm
and various other small improvements and cleanups.

Merge tag 'mac80211-next-for-davem-2016-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
This time we have various things - all across the board:
 * MU-MIMO sniffer support in mac80211
 * a create_singlethread_workqueue() cleanup
 * interface dump filtering that was documented but not implemented
 * support for the new radiotap timestamp field
 * send delBA in two unexpected conditions (as required by the spec)
 * connect keys cleanups - allow only WEP with index 0-3
 * per-station aggregation limit to work around broken APs
 * debugfs improvement for the integrated codel algorithm
and various other small improvements and cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:29:08 -04:00
Eric Dumazet
695b4ec0f0 pkt_sched: fq: use proper locking in fq_dump_stats()
When fq is used on 32bit kernels, we need to lock the qdisc before
copying 64bit fields.

Otherwise "tc -s qdisc ..." might report bogus values.

Fixes: afe4fd0624 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:15:08 -04:00
Thadeu Lima de Souza Cascardo
db74a3335e openvswitch: use percpu flow stats
Instead of using flow stats per NUMA node, use it per CPU. When using
megaflows, the stats lock can be a bottleneck in scalability.

On a E5-2690 12-core system, usual throughput went from ~4Mpps to
~15Mpps when forwarding between two 40GbE ports with a single flow
configured on the datapath.

This has been tested on a system with possible CPUs 0-7,16-23. After
module removal, there was no corruption in the slab cache.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Cc: pravin shelar <pshelar@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:14:01 -04:00
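The per-CPU accounting described in db74a3335e can be sketched in plain C. This is a single-threaded illustration under stated assumptions: NR_CPUS, struct flow_stats and both helpers are hypothetical, and the real code uses the kernel's own per-CPU and locking primitives rather than a bare array.

```c
#include <stdint.h>

#define NR_CPUS 24   /* hypothetical; sized to cover CPU ids 0-23 */

struct flow_stats {
    uint64_t packets;
    uint64_t bytes;
};

/* One slot per CPU: each CPU writes only its own slot, so the packet
 * path does not contend on a lock shared with other CPUs. */
static struct flow_stats percpu_stats[NR_CPUS];

static void flow_stats_update(unsigned int cpu, uint64_t len)
{
    percpu_stats[cpu].packets++;
    percpu_stats[cpu].bytes += len;
}

/* Readers pay the cost instead: they walk every slot and sum it. */
static struct flow_stats flow_stats_sum(void)
{
    struct flow_stats total = { 0, 0 };

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
        total.packets += percpu_stats[cpu].packets;
        total.bytes += percpu_stats[cpu].bytes;
    }
    return total;
}
```

Reads are rare compared with updates on the forwarding path, so moving the cost to the reader is the design choice the throughput figures in the commit message reflect.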
Thadeu Lima de Souza Cascardo
40773966cc openvswitch: fix flow stats accounting when node 0 is not possible
On a system with only node 1 as possible, all statistics are going to be
accounted on node 0 as it will have a single writer.

However, when getting and clearing the statistics, node 0 is not going
to be considered, as it's not a possible node.

Tested that statistics are not zero on a system with only node 1
possible. Also compile-tested with CONFIG_NUMA off.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:14:01 -04:00
Xin Long
41001faf95 sctp: not return ENOMEM err back in sctp_packet_transmit
As suggested by David and Marcelo, the ENOMEM error shouldn't be returned
to the user in the transmit path. Instead, sctp's retransmit will take
care of the chunks that fail to send because of ENOMEM.

This patch only does the release work when alloc_skb fails and no longer
returns ENOMEM.

Besides, it also cleans up sctp_packet_transmit's err path, and fixes
some issues in err path:

 - It didn't free the head skb in nomem: path.
 - No need to check nskb in no_route: path.
 - It should goto err: path if alloc_skb fails for head.
 - Not all the NOMEMs should free nskb.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:02:33 -04:00
Xin Long
83dbc3d4a3 sctp: make sctp_outq_flush/tail/uncork return void
The return value of sctp_outq_flush is meaningless now, so this patch
makes sctp_outq_flush return void, as well as sctp_outq_tail
and sctp_outq_uncork.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:02:33 -04:00
Xin Long
645194409b sctp: save transmit error to sk_err in sctp_outq_flush
Every time sctp calls sctp_outq_flush, it sends out the chunks of the
control queue, retransmit queue and data queue. Even if some chunks
fail to transmit, it still has to flush all the transports, as it's
the only chance to clean that transmit_list.

So the latest transmit error here should be returned. This transmit
error is an internal error of the sctp stack.

I checked all the places that use the transmit error (the return
value of sctp_outq_flush); most of them actually just save it to
sk_err.

The exceptions are sctp_assoc/endpoint_bh_rcv, which drop the chunk if
they fail to send a REPLY. That is actually incorrect, as we can't
be sure the error that sctp_outq_flush returns is from sending that
REPLY.

So it's meaningless for sctp_outq_flush to return the error.

This patch saves the transmit error to sk_err in sctp_outq_flush; a
new error can overwrite the old value. Eventually, sctp_wait_for_* will
check for it.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:02:32 -04:00
Xin Long
b61c654f9b sctp: free msg->chunks when sctp_primitive_SEND return err
Last patch "sctp: do not return the transmit err back to sctp_sendmsg"
made sctp_primitive_SEND return err only when asoc state is unavailable.
In this case, chunks are not enqueued, they have no chance to be freed if
we don't take care of them later.

This patch actually reverts commit 1cd4d5c432 ("sctp: remove the
unused sctp_datamsg_free()"), commit 69b5777f2e ("sctp: hold the chunks
only after the chunk is enqueued in outq") and commit 8b570dc9f7 ("sctp:
only drop the reference on the datamsg after sending a msg"), to use
sctp_datamsg_free to free the chunks of current msg.

Fixes: 8b570dc9f7 ("sctp: only drop the reference on the datamsg after sending a msg")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:02:32 -04:00
Xin Long
66388f2c08 sctp: do not return the transmit err back to sctp_sendmsg
Once a chunk is enqueued successfully, the sctp queues can take care of it.
Even if it fails to transmit (for example because of nomem), it should be
put into the retransmit queue.

If sctp reports this error to users, it confuses them: they may resend
that msg, but the in-kernel sctp stack is actually already in charge of
retransmitting it.

Besides, this error is probably not from the failure of transmitting the
current msg, but from transmitting or retransmitting another msg's chunks,
as sctp_outq_flush just tries to send out all transports' chunks.

This patch makes sctp_cmd_send_msg return void, and not return the
transmit err back to sctp_sendmsg.

Fixes: 8b570dc9f7 ("sctp: only drop the reference on the datamsg after sending a msg")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:02:32 -04:00
Xin Long
2c89791eeb sctp: remove the unnecessary state check in sctp_outq_tail
Data Chunks are only sent by sctp_primitive_SEND, in which sctp checks
the asoc's state through statetable before calling sctp_outq_tail. So
there's no need to check the asoc's state again in sctp_outq_tail.

Besides, sctp_do_sm is protected by lock_sock; even if sending a msg is
interrupted by timer events, the event's processing still needs to acquire
lock_sock first. It means no other CMDs can be enqueued into the side effect
list before CMD_SEND_MSG to change asoc->state, so it's safe to remove it.

This patch removes the redundant asoc->state check from sctp_outq_tail.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:02:32 -04:00
Alexei Starovoitov
8d79266bc4 ip6_tunnel: add collect_md mode to IPv6 tunnels
Similar to gre, vxlan, geneve tunnels allow IPIP6 and IP6IP6 tunnels
to operate in 'collect metadata' mode.
Unlike ipv4 code here it's possible to reuse ip6_tnl_xmit() function
for both collect_md and traditional tunnels.
bpf_skb_[gs]et_tunnel_key() helpers and ovs (in the future) are the users.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17 10:13:07 -04:00
Alexei Starovoitov
cfc7381b30 ip_tunnel: add collect_md mode to IPIP tunnel
Similar to gre, vxlan, geneve tunnels allow IPIP tunnels to
operate in 'collect metadata' mode.
bpf_skb_[gs]et_tunnel_key() helpers can make use of it right away.
ovs can use it as well in the future (once appropriate ovs-vport
abstractions and user apis are added).
Note that just like in other tunnels we cannot cache the dst,
since tunnel_info metadata can be different for every packet.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17 10:13:07 -04:00
Julia Lawall
eb94737d71 l2tp: constify net_device_ops structures
Check for net_device_ops structures that are only stored in the netdev_ops
field of a net_device structure.  This field is declared const, so
net_device_ops structures that have this property can be declared as const
also.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p = { ... };

@ok@
identifier r.i;
struct net_device e;
position p;
@@
e.netdev_ops = &i@p;

@bad@
position p != {r.p,ok.p};
identifier r.i;
struct net_device_ops e;
@@
e@i@p

@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
 struct net_device_ops i = { ... };
// </smpl>

The result of size on this file before the change is:
   text    data     bss     dec     hex filename
   3401     931      44    4376    1118 net/l2tp/l2tp_eth.o

and after the change it is:
   text    data     bss     dec     hex filename
   3993     347      44    4384    1120 net/l2tp/l2tp_eth.o

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17 10:07:23 -04:00
Alan Cox
5ff904d55d llc: switch type to bool as the timeout is only tested versus 0
(As asked by Dave in February)

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17 10:05:05 -04:00
Eric Dumazet
3613b3dbd1 tcp: prepare skbs for better sack shifting
With large BDP TCP flows and lossy networks, it is very important
to keep a low number of skbs in the write queue.

RACK and SACK processing can perform a linear scan of it.

We should avoid putting any payload in skb->head, so that SACK
shifting can be done if needed.

With this patch, we allow packing ~0.5 MB per skb instead of
the 64KB initially cooked at tcp_sendmsg() time.

This reduces the number of skbs in the write queue by a factor of eight.
tcp_rack_detect_loss() likes this.

We still allow payload in skb->head for first skb put in the queue,
to not impact RPC workloads.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17 10:05:05 -04:00
David Howells
8a681c3605 rxrpc: Add config to inject packet loss
Add a configuration option to inject packet loss by discarding
approximately every 8th packet received and approximately every 8th DATA
packet transmitted.

Note that no locking is used, but it shouldn't really matter.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:04 +01:00
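A minimal sketch of the "approximately every 8th packet" drop described in 8a681c3605, assuming a plain unlocked counter. The names loss_counter and should_drop_packet are hypothetical and not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical unlocked counter. Concurrent callers may race on it and
 * skew the ratio slightly, which is why the rate is only approximate. */
static unsigned int loss_counter;

/* Returns true on every 8th call, i.e. when the low three bits of the
 * pre-increment counter value are all set. */
static bool should_drop_packet(void)
{
    return (loss_counter++ & 7) == 7;
}
```

A caller on the receive or transmit path would check should_drop_packet() and discard the packet instead of processing it; a lost or duplicated increment only shifts which packet is dropped, so skipping the lock is harmless for a debugging aid.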
David Howells
71f3ca408f rxrpc: Improve skb tracing
Improve sk_buff tracing within AF_RXRPC by the following means:

 (1) Use an enum to note the event type rather than plain integers and use
     an array of event names rather than a big multi ?: list.

 (2) Distinguish Rx from Tx packets and account them separately.  This
     requires the call phase to be tracked so that we know what we might
     find in rxtx_buffer[].

 (3) Add a parameter to rxrpc_{new,see,get,free}_skb() to indicate the
     event type.

 (4) A pair of 'rotate' events are added to indicate packets that are about
     to be rotated out of the Rx and Tx windows.

 (5) A pair of 'lost' events are added, along with rxrpc_lose_skb() for
     packet loss injection recording.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:04 +01:00
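Point (1) of 71f3ca408f, an enum for the event type plus an array of event names instead of a chain of ?: operators, can be sketched as follows. The event names and strings are hypothetical examples, not the ones the patch defines.

```c
#include <stddef.h>

/* Hypothetical event set for illustration. */
enum skb_trace {
    skb_seen,
    skb_got,
    skb_freed,
    skb_rotated,
    skb_lost,
    skb_trace__nr   /* number of events; sizes the name table */
};

/* Designated initializers keep each name next to its enum value. */
static const char *const skb_trace_names[skb_trace__nr] = {
    [skb_seen]    = "SEE",
    [skb_got]     = "GOT",
    [skb_freed]   = "FRE",
    [skb_rotated] = "ROT",
    [skb_lost]    = "LOS",
};

/* A table lookup replaces a nested conditional such as
 * ev == 0 ? "SEE" : ev == 1 ? "GOT" : ... */
static const char *skb_trace_name(enum skb_trace ev)
{
    if ((unsigned int)ev >= skb_trace__nr)
        return "???";
    return skb_trace_names[ev];
}
```

Adding an event then means one new enum value and one new table entry, and the compiler rejects an initializer that falls outside the array bound.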
David Howells
ba39f3a0ed rxrpc: Remove printks from rxrpc_recvmsg_data() to fix uninit var
Remove _enter/_debug/_leave calls from rxrpc_recvmsg_data() of which one
uses an uninitialised variable.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:04 +01:00
David Howells
849979051c rxrpc: Add a tracepoint to follow what recvmsg does
Add a tracepoint to follow what recvmsg does within AF_RXRPC.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:03 +01:00
David Howells
58dc63c998 rxrpc: Add a tracepoint to follow packets in the Rx buffer
Add a tracepoint to follow the life of packets that get added to a call's
receive buffer.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:03 +01:00
David Howells
f3639df2d9 rxrpc: Add a tracepoint to log ACK transmission
Add a tracepoint to log information about ACK transmission.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:03 +01:00
David Howells
ec71eb9ada rxrpc: Add a tracepoint to log received ACK packets
Add a tracepoint to log information from received ACK packets.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:03 +01:00
David Howells
a124fe3ee5 rxrpc: Add a tracepoint to follow the life of a packet in the Tx buffer
Add a tracepoint to follow the insertion of a packet into the transmit
buffer, its transmission and its rotation out of the buffer.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:03 +01:00
David Howells
363deeab6d rxrpc: Add connection tracepoint and client conn state tracepoint
Add a pair of tracepoints, one to track rxrpc_connection struct ref
counting and the other to track the client connection cache state.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:03 +01:00
David Howells
a84a46d730 rxrpc: Add some additional call tracing
Add additional call tracepoint points for noting call-connected,
call-released and connection-failed events.

Also fix one tracepoint that was using an integer instead of the
corresponding enum value as the point type.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:03 +01:00
David Howells
a3868bfc8d rxrpc: Print the packet type name in the Rx packet trace
Print a symbolic packet type name for each valid received packet in the
trace output, not just a number.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 11:24:03 +01:00
David Howells
182f505624 rxrpc: Fix the basic transmit DATA packet content size at 1412 bytes
Fix the basic transmit DATA packet content size at 1412 bytes so that they
can be arbitrarily assembled into jumbo packets.

In the future, I'm thinking of moving to keeping a jumbo packet header at
the beginning of each packet in the Tx queue and creating the packet header
on the spot when kernel_sendmsg() is invoked.  That way, jumbo packets can
be assembled on the spur of the moment for (re-)transmission.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:54:32 +01:00
David Howells
2311e327cd rxrpc: Be consistent about switch value in rxrpc_send_call_packet()
rxrpc_send_call_packet() should use type in both its switch-statements
rather than using pkt->whdr.type.  This might give the compiler an easier
job of uninitialised variable checking.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:54:21 +01:00
David Howells
27d0fc431c rxrpc: Don't transmit an ACK if there's no reason set
Don't transmit an ACK if call->ackr_reason is unset.  There's the
possibility of a race between recvmsg() sending an ACK and the background
processing thread trying to send the same one.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:53:55 +01:00
David Howells
dfa7d92040 rxrpc: Fix retransmission algorithm
Make the retransmission algorithm use for-loops instead of do-loops and
move the counter increments into the for-statement increment slots.

Though the do-loops are slightly more efficient since there will be at least
one pass through each loop, the counter increments are harder to get
right as the continue-statements skip them.

Without this, if there are any positive acks within the loop, the do-loop
will cycle forever because the counter increment is never done.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:53:21 +01:00
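The loop hazard described in dfa7d92040 can be reproduced in a few lines. This is a sketch, not the rxrpc code: the function names and the ACK array are hypothetical, and the do-loop variant carries an artificial max_steps cap so that the hang shows up as a -1 return instead of an endless loop.

```c
#include <stddef.h>

/* Buggy shape: a continue inside a do-loop jumps straight to the loop
 * condition, skipping the i++ at the bottom of the body. Returns the
 * number of NACKed slots, or -1 once max_steps is exceeded. */
static int count_nacks_do_loop(const unsigned char *acks, size_t n,
                               size_t max_steps)
{
    size_t i = 0, steps = 0;
    int nacks = 0;

    if (n == 0)
        return 0;
    do {
        if (++steps > max_steps)
            return -1;      /* stand-in for "cycles forever" */
        if (acks[i])
            continue;       /* positive ACK: i is never advanced */
        nacks++;
        i++;
    } while (i < n);
    return nacks;
}

/* Fixed shape: the increment lives in the for-statement, so continue
 * still advances the counter. */
static int count_nacks_for_loop(const unsigned char *acks, size_t n)
{
    int nacks = 0;

    for (size_t i = 0; i < n; i++) {
        if (acks[i])
            continue;
        nacks++;
    }
    return nacks;
}
```

With no positive ACKs both variants agree; a single positive ACK is enough to stall the do-loop version.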
David Howells
d01dc4c3c1 rxrpc: Fix the parsing of soft-ACKs
The soft-ACK parser doesn't increment the pointer into the soft-ACK list,
resulting in the first ACK/NACK value being applied to all the relevant
packets in the Tx queue.  This has the potential to miss retransmissions
and cause excessive retransmissions.

Fix this by incrementing the pointer.
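
As a rough illustration of the fix (hypothetical names and values, not the
actual parser), the pointer into the soft-ACK list has to advance in step
with the Tx slot index:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encoding for this sketch only. */
enum { SOFT_NACK = 0, SOFT_ACK = 1 };

/* Hypothetical illustration: copy each soft-ACK value onto the matching
 * Tx slot and return how many slots were NACK'd.
 */
size_t apply_soft_acks(const unsigned char *acks, size_t nr_acks,
		       unsigned char *slot_state)
{
	size_t nacked = 0;

	assert((acks != NULL && slot_state != NULL) || nr_acks == 0);

	/* 'acks++' is the step the buggy parser was missing: without it
	 * every slot would receive the first ACK/NACK value.
	 */
	for (size_t i = 0; i < nr_acks; i++, acks++) {
		slot_state[i] = *acks;
		if (*acks == SOFT_NACK)
			nacked++;
	}
	return nacked;
}
```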

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:53:21 +01:00
David Howells
78883793f8 rxrpc: Fix unexposed client conn release
If the last call on a client connection is released after the connection has
had a bunch of calls allocated but before any DATA packets are sent (so
that it's not yet marked RXRPC_CONN_EXPOSED), an assertion will happen in
rxrpc_disconnect_client_call().

	af_rxrpc: Assertion failed - 1(0x1) >= 2(0x2) is false
	------------[ cut here ]------------
	kernel BUG at ../net/rxrpc/conn_client.c:753!

This is because it's expecting the conn to have been exposed and to have 2
or more refs - but this isn't necessarily the case.

Simply remove the assertion.  This allows the conn to be moved into the
inactive state and deleted if it isn't resurrected before the final put is
called.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:53:21 +01:00
David Howells
357f5ef646 rxrpc: Call rxrpc_release_call() on error in rxrpc_new_client_call()
Call rxrpc_release_call() on getting an error in rxrpc_new_client_call()
rather than trying to do the cleanup ourselves.  This isn't a problem,
provided we set RXRPC_CALL_HAS_USERID only if we actually add the call to
the calls tree as cleanup code fragments that would otherwise cause
problems are conditional.

Without this, we miss some of the cleanup.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:53:21 +01:00
David Howells
66d58af7f4 rxrpc: Fix the putting of client connections
In rxrpc_put_one_client_conn(), if a connection has RXRPC_CONN_COUNTED set
on it, then it's accounted for in rxrpc_nr_client_conns and may be on
various lists - and this is cleaned up correctly.

However, if the connection doesn't have RXRPC_CONN_COUNTED set on it, then
the put routine returns rather than just skipping the extra bit of cleanup.

Fix this by making the extra bit of clean up conditional instead and always
killing off the connection.

This manifests itself as connections with a zero usage count hanging around
in /proc/net/rxrpc_conns because the connection was allocated, but then
discarded, due to a race with another process that set up a parallel
connection, which was then shared instead.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:53:20 +01:00
David Howells
0360da6db7 rxrpc: Purge the to_be_accepted queue on socket release
Purge the queue of to_be_accepted calls on socket release.  Note that
purging sock_calls doesn't release the ref owned by to_be_accepted.

Probably the sock_calls list is redundant given the purging of the recvmsg_q,
the to_be_accepted queue and the calls tree.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:51:54 +01:00
David Howells
e6f3afb3fc rxrpc: Record calls that need to be accepted
Record calls that need to be accepted using sk_acceptq_added() otherwise
the backlog counter goes negative because sk_acceptq_removed() is called.
This causes the preallocator to malfunction.

Calls that are preaccepted by AFS within the kernel aren't affected by
this.
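
The accounting problem can be sketched with a mock socket (fake_sock and
the helpers below are hypothetical stand-ins, not the kernel's struct sock
or its sk_acceptq_*() helpers): every removal has to be paired with an
earlier addition, or the backlog counter drops below zero.

```c
#include <assert.h>

/* Hypothetical stand-in for the socket's accept backlog accounting. */
struct fake_sock {
	int ack_backlog;
};

/* Record a call that is waiting to be accepted. */
void queue_call_for_accept(struct fake_sock *sk)
{
	sk->ack_backlog++;
}

/* Accept a previously queued call.  Without the matching increment in
 * queue_call_for_accept(), this decrement would take the counter negative.
 */
void accept_call(struct fake_sock *sk)
{
	assert(sk->ack_backlog > 0);
	sk->ack_backlog--;
}
```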

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:51:54 +01:00
David Howells
816c9fce12 rxrpc: Fix handling of the last packet in rxrpc_recvmsg_data()
The code for determining the last packet in rxrpc_recvmsg_data() has been
using the RXRPC_CALL_RX_LAST flag to determine if the rx_top pointer points
to the last packet or not.  This isn't a good idea, however, as the input
code may be running simultaneously on another CPU and that sets the flag
*before* updating the top pointer.

Fix this by the following means:

 (1) Restrict the use of RXRPC_CALL_RX_LAST to the input routines only.
     There's otherwise a synchronisation problem between detecting the flag
     and checking rx_top.  This could probably be dealt with by appropriate
     application of memory barriers, but there's a simpler way.

 (2) Set RXRPC_CALL_RX_LAST after setting rx_top.

 (3) Make rxrpc_rotate_rx_window() consult the flags header field of the
     DATA packet it's about to discard to see if that was the last packet.
     Use this as the basis for ending the Rx phase.  This shouldn't be a
     problem because the recvmsg side of things is guaranteed to see the
     packets in order.

 (4) Make rxrpc_recvmsg_data() return 1 to indicate the end of the data if:

     (a) the packet it has just processed is marked as RXRPC_LAST_PACKET

     (b) the call's Rx phase has been ended.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:51:54 +01:00
David Howells
2e2ea51dec rxrpc: Check the return value of rxrpc_locate_data()
Check the return value of rxrpc_locate_data() in rxrpc_recvmsg_data().

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:50:49 +01:00
David Howells
4b22457c06 rxrpc: Move the check of rx_pkt_offset from rxrpc_locate_data() to caller
Move the check of rx_pkt_offset from rxrpc_locate_data() to the caller,
rxrpc_recvmsg_data(), so that it's more clear what's going on there.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:50:48 +01:00
David Howells
fabf920180 rxrpc: Remove some whitespace.
Remove a tab that's on a line that should otherwise be blank.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-17 10:50:15 +01:00
David Howells
d19127473a rxrpc: Make IPv6 support conditional on CONFIG_IPV6
Add CONFIG_AF_RXRPC_IPV6 and make the IPv6 support code conditional on it.
This is then made conditional on CONFIG_IPV6.

Without this, the following can be seen:

   net/built-in.o: In function `rxrpc_init_peer':
>> peer_object.c:(.text+0x18c3c8): undefined reference to `ip6_route_output_flags'

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17 03:58:45 -04:00
Luca Coelho
fbd05e4a6e cfg80211: add helper to find an IE that matches a byte-array
There are a few places where an IE that matches not only the EID, but
also other bytes inside the element, needs to be found.  To simplify
that and reduce the amount of similar code, implement a new helper
function to match the EID and an extra array of bytes.

Additionally, simplify cfg80211_find_vendor_ie() by using the new
match function.
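
A minimal sketch of such a matcher, assuming the usual element layout of
one ID byte, one length byte and then the data.  The function name and
signature here are hypothetical and are not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the cfg80211 function): walk a buffer of
 * elements laid out as [id][len][len bytes of data] and return the first
 * element with the given id whose data holds 'match' at 'match_offset'
 * (an offset into the element data).  Returns NULL if nothing matches.
 */
const unsigned char *find_ie_match(unsigned char eid,
				   const unsigned char *ies, size_t len,
				   const unsigned char *match,
				   size_t match_len, size_t match_offset)
{
	assert(ies != NULL || len == 0);

	/* Stop as soon as a complete element no longer fits. */
	while (len >= 2 && len - 2 >= ies[1]) {
		size_t ie_len = ies[1];

		if (ies[0] == eid &&
		    match_offset + match_len <= ie_len &&
		    (match_len == 0 ||
		     memcmp(ies + 2 + match_offset, match, match_len) == 0))
			return ies;

		len -= ie_len + 2;
		ies += ie_len + 2;
	}
	return NULL;
}
```

Passing a zero match_len degrades to a plain lookup by element ID, which
is how a vendor-specific lookup could be layered on top by matching the
OUI bytes at offset 0.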

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-16 14:49:52 +02:00
Emmanuel Grumbach
c68df2e7be mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE
In 46fa38e84b ("mac80211: allow software PS-Poll/U-APSD with
AP_LINK_PS"), Johannes allowed mac80211's code for handling
stations that go to PS or send PS-Poll / uAPSD trigger frames to be
used for devices that enable RSS.

This means that mac80211 doesn't look at frames anymore but rather
relies on a notification that will come from the device when a PS
transition occurs or when a PS-Poll / trigger frame is detected by
the device.

iwlwifi will need this capability but still needs mac80211 to take
care of the TIM IE. Today, if a driver sets AP_LINK_PS, mac80211
will not update the TIM IE. Change mac80211 to check existence of
the set_tim driver callback rather than using AP_LINK_PS to decide
if the driver handles the TIM IE internally or not.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
[reword commit message a bit]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-16 14:49:23 +02:00
John Crispin
cafdc45c94 net-next: dsa: add Qualcomm tag RX/TX handler
Add support for the 2-bytes Qualcomm tag that gigabit switches such as
the QCA8337/N might insert when receiving packets, or that we need
to insert while targeting specific switch ports. The tag is inserted
directly behind the ethernet header.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-16 04:31:51 -04:00
Eric Dumazet
76f0dcbb5a tcp: fix a stale ooo_last_skb after a replace
When skb replaces another one in ooo queue, I forgot to also
update tp->ooo_last_skb as well, if the replaced skb was the last one
in the queue.

To fix this, we simply can re-use the code that runs after an insertion,
trying to merge skbs at the right of current skb.

This not only fixes the bug, but also removes all small skbs that might
be a subset of the new one.

Example:

We receive segments 2001:3001,  4001:5001

Then we receive 2001:8001 : We should replace 2001:3001 with the big
skb, but also remove 4001:5001 from the queue to save space.
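
The example above can be sketched as follows.  This is a simplified
model on a flat array with hypothetical names; the real code works on an
RB tree of skbs, and sequence number wraparound is ignored here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a queued out-of-order segment: [start, end). */
struct seg {
	unsigned int start;
	unsigned int end;
};

/* Drop every queued segment that is fully covered by 'incoming' and
 * return the number of segments kept (compacted to the front of 'q').
 * Assumption: no sequence wraparound.
 */
size_t drop_covered(struct seg *q, size_t n, struct seg incoming)
{
	size_t kept = 0;

	assert(q != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		bool covered = incoming.start <= q[i].start &&
			       q[i].end <= incoming.end;

		if (!covered)
			q[kept++] = q[i];
	}
	return kept;
}
```

With 2001:3001 and 4001:5001 queued, an incoming 2001:8001 covers both, so
both are dropped in favour of the big segment.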

packetdrill test demonstrating the bug

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

+0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
+0.100 < . 1:1(0) ack 1 win 1024
+0 accept(3, ..., ...) = 4

+0.01 < . 1001:2001(1000) ack 1 win 1024
+0    > . 1:1(0) ack 1 <nop,nop, sack 1001:2001>

+0.01 < . 1001:3001(2000) ack 1 win 1024
+0    > . 1:1(0) ack 1 <nop,nop, sack 1001:2001 1001:3001>

Fixes: 9f5afeae51 ("tcp: use an RB tree for ooo receive queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Yuchung Cheng <ycheng@google.com>
Cc: Yaogong Wang <wygivan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-16 04:09:49 -04:00
David S. Miller
364eac0c8b RxRPC rewrite

Merge tag 'rxrpc-rewrite-20160913-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Support IPv6

Here is a set of patches that add IPv6 support.  They need to be applied on
top of the just-posted miscellaneous fix patches.  They are:

 (1) Make autobinding of an unconnected socket work when sendmsg() is
     called to initiate a client call.

 (2) Don't specify the protocol when creating the client socket, but rather
     take the default instead.

 (3) Use rxrpc_extract_addr_from_skb() in a couple of places that were
     doing the same thing manually.  This allows the IPv6 address
     extraction to be done in fewer places.

 (4) Add IPv6 support.  With this, calls can be made to IPv6 servers from
     userspace AF_RXRPC programs; AFS, however, can't use IPv6 yet as the
     RPC calls need to be upgradeable.
====================

Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-16 01:57:19 -04:00
David S. Miller
39caa8bf6d RxRPC rewrite

Merge tag 'rxrpc-rewrite-20160913-1' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Miscellaneous fixes

Here's a set of miscellaneous fix patches.  There are a couple of points of
note:

 (1) There is one non-fix patch that adjusts the call ref tracking
     tracepoint to make kernel API-held refs on calls more obvious.  This
     is a prerequisite for the patch that fixes prealloc refcounting.

 (2) The final patch alters how jumbo packets that partially exceed the
     receive window are handled.  Previously, space was being left in the
     Rx buffer for them, but this significantly hurts performance as the Rx
     window can't be increased to match the OpenAFS Tx window size.

     Instead, the excess subpackets are discarded and an EXCEEDS_WINDOW ACK
     is generated for the first.  To avoid the problem of someone trying to
     run the kernel out of space by feeding the kernel a series of
     overlapping maximal jumbo packets, we stop allowing jumbo packets on a
     call if we encounter more than three jumbo packets with duplicate or
     excessive subpackets.
====================

Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-16 01:52:20 -04:00
Lance Richardson
2679d04041 openvswitch: avoid deferred execution of recirc actions
The ovs kernel data path currently defers the execution of all
recirc actions until stack utilization is at a minimum.
This is too limiting for some packet forwarding scenarios due to
the small size of the deferred action FIFO (10 entries). For
example, broadcast traffic sent out more than 10 ports with
recirculation results in packet drops when the deferred action
FIFO becomes full, as reported here:

     http://openvswitch.org/pipermail/dev/2016-March/067672.html

Since the current recursion depth is available (it is already tracked
by the exec_actions_level pcpu variable), we can use it to determine
whether to execute recirculation actions immediately (safe when
recursion depth is low) or defer execution until more stack space is
available.

With this change, the deferred action fifo size becomes a non-issue
for currently failing scenarios because it is no longer used when
there are three or fewer recursions through ovs_execute_actions().
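
A sketch of the decision described above.  The names are hypothetical; the
limit of 3 is taken from the "three or fewer recursions" wording and the
FIFO size of 10 from the first paragraph:

```c
#include <assert.h>

#define RECURSION_THRESHOLD	3	/* assumed from the text above */
#define DEFERRED_FIFO_SIZE	10	/* FIFO size quoted in the text */

/* Hypothetical model of the deferred action FIFO: only its fill level. */
struct deferred_fifo {
	int used;
};

enum recirc_result {
	RECIRC_RUN_NOW,		/* executed immediately on the current stack */
	RECIRC_DEFERRED,	/* queued until the stack has unwound */
	RECIRC_DROPPED,		/* FIFO full, packet lost */
};

/* Decide what to do with a recirc action at the given recursion level
 * (1 for the outermost call).
 */
enum recirc_result handle_recirc(int level, struct deferred_fifo *fifo)
{
	assert(level >= 1 && fifo != 0);

	/* Shallow recursion: plenty of stack left, so do not touch the FIFO. */
	if (level <= RECURSION_THRESHOLD)
		return RECIRC_RUN_NOW;

	if (fifo->used >= DEFERRED_FIFO_SIZE)
		return RECIRC_DROPPED;

	fifo->used++;
	return RECIRC_DEFERRED;
}
```

In this model a broadcast to more than 10 ports at a shallow recursion
level never consumes a FIFO slot, so it cannot overflow the FIFO.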

Suggested-by: Pravin Shelar <pshelar@ovn.org>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 20:35:52 -04:00
Or Gerlitz
a53d850a79 net/sched: cls_flower: Remove an unused field from the filter key structure
Commit c3f8324188 "net: Add full IPv6 addresses to flow_keys" added an
unused instance of struct flow_dissector_key_addrs into struct fl_flow_key,
remove it.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 20:27:23 -04:00
Or Gerlitz
aa72d70837 net/sched: cls_flower: Support masking for matching on tcp/udp ports
Add the definitions for src/dst udp/tcp port masks and use
them when setting and dumping the relevant keys.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 20:27:23 -04:00
Jamal Hadi Salim
86da71b573 net_sched: Introduce skbmod action
This action is intended to be an upgrade from a usability perspective
from pedit (as well as operational debuggability).
Compare this:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action pedit munge offset -14 u8 set 0x02 \
munge offset -13 u8 set 0x15 \
munge offset -12 u8 set 0x15 \
munge offset -11 u8 set 0x15 \
munge offset -10 u16 set 0x1515 \
pipe

to:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod dmac 02:15:15:15:15:15

Also try to do a MAC address swap with pedit or worse
try to debug a policy with destination mac, source mac and
ethertype. Then make a few rules out of those and you'll get my point.

In the future common use cases on pedit can be migrated to this action
(as an example different fields in ip v4/6, transports like tcp/udp/sctp
etc). For this first cut, this allows modifying basic ethernet header.

The most important ethernet use case at the moment is when redirecting or
mirroring packets to a remote machine. The dst mac address needs a re-write
so that the packet doesn't get dropped by, or confuse, an interconnecting
(learning) switch, and isn't dropped by the target machine (which looks at
the dst mac). And at times, when flipping back the packet, a swap of the MAC
addresses is needed.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 19:33:47 -04:00
Daniel Borkmann
f53d8c7b18 bpf: use skb_at_tc_ingress helper in tcf_bpf
We have a small skb_at_tc_ingress() helper for testing for ingress, so
make use of it. cls_bpf already uses it and so should act_bpf.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 19:29:47 -04:00
Daniel Borkmann
04b3f8de4b bpf: drop unnecessary test in cls_bpf_classify and tcf_bpf
The skb_mac_header_was_set() test in cls_bpf's and act_bpf's fast-path is
actually unnecessary and can be removed altogether. This was added by
commit a166151cbe ("bpf: fix bpf helpers to use skb->mac_header relative
offsets"), which was later on improved by 3431205e03 ("bpf: make programs
see skb->data == L2 for ingress and egress"). We're always guaranteed to
have valid mac header at the time we invoke cls_bpf_classify() or tcf_bpf().

Reason is that since 6d1ccff627 ("net: reset mac header in dev_start_xmit()")
we do skb_reset_mac_header() in __dev_queue_xmit() before we could call
into sch_handle_egress() or any subsequent enqueue. sch_handle_ingress()
always sees a valid mac header as well (things like skb_reset_mac_len()
would badly fail otherwise). Thus, drop the unnecessary test in classifier
and action case.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 19:29:47 -04:00
Hadar Hen Zion
07c0f09e23 net/sched: act_tunnel_key: Remove rcu_read_lock protection
Remove rcu_read_lock protection from tunnel_key_dump and use
rtnl_dereference; the dump operation is protected by the rtnl lock.

Also, remove rcu_read_lock from tunnel_key_release and use
rcu_dereference_protected.

Both operations are running exclusively and a writer couldn't modify
t->params while those functions are executed.

Fixes: 54d94fd89d90 ('net/sched: Introduce act_tunnel_key')
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 19:18:18 -04:00
Johannes Berg
ec53c832ee cfg80211: remove unnecessary pointer-of
For an array, there's no need to use &array, so just use the
plain wiphy->addresses[i].addr here to silence smatch.
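
To illustrate why the two spellings are interchangeable at run time, here
is a small sketch with a hypothetical struct modelled on the text (not the
cfg80211 definition).  The address is the same; only the pointer type
differs, which is what static checkers complain about:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for an entry holding a MAC address array. */
struct mac_address {
	unsigned char addr[MAC_LEN];
};

/* m->addr decays to 'unsigned char *', while &m->addr has type
 * 'unsigned char (*)[MAC_LEN]'.  Both point at the same first byte.
 */
bool same_start_address(const struct mac_address *m)
{
	assert(m != NULL);
	return (const void *)m->addr == (const void *)&m->addr;
}

/* The plain array name is all that memcpy() needs. */
void copy_addr(unsigned char *dst, const struct mac_address *src)
{
	memcpy(dst, src->addr, MAC_LEN);
}
```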

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:20 +02:00
Rajkumar Manoharan
e8a24cd4b8 mac80211: allow driver to handle packet-loss mechanism
Based on consecutive msdu failures, mac80211 triggers CQM packet-loss
mechanism. Drivers like ath10k have their own connection monitoring
algorithm, offloaded to firmware, for triggering station kickout. In case
of station kickout, the driver will report low ack status via the mac80211
API (ieee80211_report_low_ack).

This flag will enable the driver to completely rely on firmware events
for station kickout and bypass mac80211 packet loss mechanism.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:20 +02:00
Johannes Berg
c7e9dbcf09 mac80211: remove sta_remove_debugfs driver callback
No drivers implement this, relying either on the recursive
directory removal to remove their debugfs, or not having any
to start with. Remove the dead driver callback.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:19 +02:00
Johannes Berg
8826fef95b mac80211: remove pointless chanctx NULL check
If chanctx is derived as container_of() from a non-NULL pointer,
it can't ever be NULL. Since we checked conf before, that's true
here, so remove the useless NULL check.
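
A sketch of the reasoning, using a simplified container_of() without the
kernel's type checking and hypothetical struct names: the result is the
member pointer minus a fixed offset, so the only useful NULL check is on
the member pointer itself, before the conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified version of the kernel macro, without type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical structures for this sketch only. */
struct chanctx_conf {
	int channel;
};

struct chanctx {
	int refcount;
	struct chanctx_conf conf;	/* embedded member */
};

/* Check 'conf' once; the derived pointer then needs no further check. */
struct chanctx *chanctx_from_conf(struct chanctx_conf *conf)
{
	if (!conf)
		return NULL;
	return container_of(conf, struct chanctx, conf);
}
```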

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:19 +02:00
Johannes Berg
5140974dca mac80211: remove unused assignment
The next line overwrites this assignment, so remove it; there's
no real value in using it for the next assignment either.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:18 +02:00
Johannes Berg
53b18980fd nl80211: always check nla_put* return values
A few instances were found where we didn't check them, add the
missing checks even though they'll probably never trigger as
the message should be large enough here.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:17 +02:00
Johannes Berg
76e1fb4b55 nl80211: always check nla_nest_start() return value
If the message got full during nla_nest_start(), it can return
NULL. None of the cases here seem like that can really happen,
but check the return value nonetheless.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:17 +02:00
Johannes Berg
58bd7f1158 mac80211: fix scan completed tracing
Passing the 'info' pointer where 'info->aborted' is expected will
always lead the tracing to erroneously record that the scan was aborted;
fix that by passing the correct info->aborted. The remaining data will
be collected in cfg80211, so I haven't duplicated it here.
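
The bug pattern can be reproduced in a few lines.  The names below are
hypothetical; the point is only that a non-NULL pointer converts to true
when a bool is expected:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the scan result information. */
struct scan_info {
	bool aborted;
};

/* Stand-in for a trace hook that takes the 'aborted' flag. */
bool record_aborted(bool aborted)
{
	return aborted;
}

/* Buggy form: the pointer itself is converted to bool, so any non-NULL
 * 'info' is recorded as "aborted".
 */
bool trace_buggy(const struct scan_info *info)
{
	assert(info != NULL);
	return record_aborted(info);
}

/* Fixed form: pass the field, not the pointer. */
bool trace_fixed(const struct scan_info *info)
{
	assert(info != NULL);
	return record_aborted(info->aborted);
}
```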

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:16 +02:00
Johannes Berg
93db1d9e6c mac80211: fix possible out-of-bounds access
In the unlikely situation that the supplicant has negotiated
admission for the background AC (which it has no reason to as
it's not supposed to be requiring admission control to start
with, and we'd ignore such a requirement anyway), the loop
here may terminate with non_acm_ac == 4, which leads to an
array overrun.

Check this explicitly just for completeness.
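
A sketch of the kind of check meant here, with hypothetical names and a
simplified search order: if every access category requires admission
control, the search index ends up equal to the array size and must not be
used as an index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_ACS 4

/* Hypothetical illustration: return the index of the first access
 * category that does not require admission control, or -1 if all of
 * them do.  acm[i] is true when category i requires admission control.
 */
int find_non_acm_ac(const bool acm[NUM_ACS])
{
	int ac;

	assert(acm != NULL);

	for (ac = 0; ac < NUM_ACS; ac++) {
		if (!acm[ac])
			break;
	}

	/* Explicit guard: ac == NUM_ACS would index one past the array. */
	if (ac == NUM_ACS)
		return -1;
	return ac;
}
```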

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:46:16 +02:00
Johannes Berg
f1c1f17ac5 cfg80211: allow connect keys only with default (TX) key
There's no point in allowing connect keys when one of them
isn't also configured as the TX key, it would just confuse
drivers and probably cause them to pick something for TX.
Disallow this confusing and erroneous configuration.

As wpa_supplicant will always send NL80211_ATTR_KEYS, even
when there are no keys inside, allow that and treat it as
though the attribute isn't present at all.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-09-15 16:45:41 +02:00
David Howells
75b54cb57c rxrpc: Add IPv6 support
Add IPv6 support to AF_RXRPC.  With this, AF_RXRPC sockets can be created:

	service = socket(AF_RXRPC, SOCK_DGRAM, PF_INET6);

instead of:

	service = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);

The AFS filesystem doesn't support IPv6 at the moment, though, since that
requires upgrades to some of the RPC calls.

Note that a good portion of this patch is replacing "%pI4:%u" in print
statements with "%pISpc" which is able to handle both protocols and print
the port.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 23:09:13 +01:00
David Howells
1c2bc7b948 rxrpc: Use rxrpc_extract_addr_from_skb() rather than doing this manually
There are two places that want to transmit a packet in response to one just
received and manually pick the address to reply to out of the sk_buff.
Make them use rxrpc_extract_addr_from_skb() instead so that IPv6 is handled
automatically.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 23:09:13 +01:00
David Howells
aaa31cbc66 rxrpc: Don't specify protocol when creating transport socket
Pass 0 as the protocol argument when creating the transport socket rather
than IPPROTO_UDP.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 23:09:13 +01:00
David Howells
cd5892c756 rxrpc: Create an address for sendmsg() to bind unbound socket with
Create an address for sendmsg() to bind an unbound socket with rather than
using a completely blank address; otherwise, the transport socket creation
will fail because it will try to use address family 0.

We use the address family specified in the protocol argument when the
AF_RXRPC socket was created and SOCK_DGRAM as the default.  For anything
else, bind() must be used.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 23:09:13 +01:00
David Howells
75e4212639 rxrpc: Correctly initialise, limit and transmit call->rx_winsize
call->rx_winsize should be initialised to the sysctl setting and the sysctl
setting should be limited to the maximum we want to permit.  Further, we
need to place this in the ACK info instead of the sysctl setting.

Furthermore, discard the idea of accepting the subpackets of a jumbo packet
that lie beyond the receive window when the first packet of the jumbo is
within the window.  Just discard the excess subpackets instead.  This
allows the receive window to be opened up right to the buffer size less one
for the dead slot.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 22:38:45 +01:00
David Howells
3432a757b1 rxrpc: Fix prealloc refcounting
The preallocated call buffer holds a ref on the calls within that buffer.
The ref was being released in the wrong place - it worked okay for incoming
calls to the AFS cache manager service, but doesn't work right for incoming
calls to a userspace service.

Instead of releasing an extra ref on service calls in rxrpc_release_call(),
the ref needs to be released during the acceptance/rejection process.  To
this end:

 (1) The prealloc ref is now normally released during
     rxrpc_new_incoming_call().

 (2) For preallocated kernel API calls, the kernel API's ref needs to be
     released when the call is discarded on socket close.

 (3) We shouldn't take a second ref in rxrpc_accept_call().

 (4) rxrpc_recvmsg_new_call() needs to get a ref of its own when it adds
     the call to the to_be_accepted socket queue.

In doing (4) above, we would prefer not to put the call's refcount down to
0 as that entails doing cleanup in softirq context, but it's unlikely as
there are several refs held elsewhere, at least one of which must be put by
someone in process context calling rxrpc_release_call().  However, it's not
a problem if we do have to do that.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 22:38:37 +01:00
David Howells
cbd00891de rxrpc: Adjust the call ref tracepoint to show kernel API refs
Adjust the call ref tracepoint to show references held on a call by the
kernel API separately as much as possible and add an additional trace at
the allocation point from the preallocation buffer for an incoming call.

Note that this doesn't show the allocation of a client call for the kernel
separately at the moment.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 22:38:30 +01:00
David Howells
01fd074224 rxrpc: Allow tx_winsize to grow in response to an ACK
Allow tx_winsize to grow when the ACK info packet shows a larger receive
window at the other end rather than only permitting it to shrink.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 22:38:24 +01:00
David Howells
89a80ed4c0 rxrpc: Use skb->len not skb->data_len
skb->len should be used rather than skb->data_len when referring to the
amount of data in a packet.  This will only cause a malfunction in the
following cases:

 (1) We receive a jumbo packet (validation and splitting both are wrong).

 (2) We see if there's extra ACK info in an ACK packet (we think it's not
     there and just ignore it).
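
The difference between the two fields can be sketched with a mock buffer
(fake_skb and its helpers are hypothetical, not the kernel's struct
sk_buff API): len covers the whole packet, while data_len only counts the
bytes held in paged fragments and is 0 for a fully linear buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two length fields. */
struct fake_skb {
	unsigned int len;	/* total bytes: linear part plus fragments */
	unsigned int data_len;	/* bytes in paged fragments only */
};

/* Bytes in the linear part of the buffer. */
unsigned int fake_skb_headlen(const struct fake_skb *skb)
{
	assert(skb != NULL && skb->data_len <= skb->len);
	return skb->len - skb->data_len;
}

/* Payload remaining after a header of 'hdr' bytes, based on the total
 * length.  Using data_len here would report 0 for a linear packet.
 */
unsigned int payload_after_header(const struct fake_skb *skb,
				  unsigned int hdr)
{
	assert(skb != NULL);
	return skb->len > hdr ? skb->len - hdr : 0;
}
```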

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 22:36:22 +01:00
David Howells
b25de36053 rxrpc: Add missing unlock in rxrpc_call_accept()
Add a missing unlock in rxrpc_call_accept() in the path taken if there's no
call to wake up.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 22:36:22 +01:00
David Howells
33b603fda8 rxrpc: Requeue call for recvmsg if more data
rxrpc_recvmsg() needs to make sure that the call it has just been
processing gets requeued for further attention if the buffer has been
filled and there's more data to be consumed.  The softirq producer only
queues the call and wakes the socket if it fills the first slot in the
window, so userspace might end up sleeping forever otherwise, despite there
being data available.

This is not a problem provided the userspace buffer is big enough or it
empties the buffer completely before more data comes in.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-09-13 22:36:21 +01:00