Commit Graph

520465 Commits

Author SHA1 Message Date
Martin KaFai Lau
83a09abd1a ipv6: Break up ip6_rt_copy()
This patch breaks up ip6_rt_copy() into ip6_rt_copy_init() and
ip6_rt_cache_alloc().

In the later patch, we need to create a percpu rt6_info copy. Hence,
refactor the common rt6_info init codes to ip6_rt_copy_init().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:34 -04:00
Martin KaFai Lau
8d0b94afdc ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister
This patch keeps track of the DST_NOCACHE routes in a list and replaces its
dev with loopback during the iface down/unregister event.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:34 -04:00
Martin KaFai Lau
3da59bd945 ipv6: Create RTF_CACHE clone when FLOWI_FLAG_KNOWN_NH is set
This patch always creates RTF_CACHE clone with DST_NOCACHE
when FLOWI_FLAG_KNOWN_NH is set so that the rt6i_dst is set to
the fl6->daddr.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Julian Anastasov <ja@ssi.bg>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:34 -04:00
Martin KaFai Lau
48e8aa6e31 ipv6: Set FLOWI_FLAG_KNOWN_NH at flowi6_flags
The neighbor look-up used to depend on the rt6i_gateway (if
there is a gateway) or the rt6i_dst (if it is a RTF_CACHE clone)
as the nexthop address.  Note that rt6i_dst is set to fl6->daddr
for the RTF_CACHE clone where fl6->daddr is the one used to do
the route look-up.

Now, we only create RTF_CACHE clone after encountering exception.
When doing the neighbor look-up with a route that is neither a gateway
nor a RTF_CACHE clone, the daddr in skb will be used as the nexthop.

In some cases, the daddr in skb is not the one used to do
the route look-up.  One example is in ip_vs_dr_xmit_v6() where the
real nexthop server address is different from the one in the skb.

This patch is going to follow the IPv4 approach and ask the
ip6_pol_route() callers to set the FLOWI_FLAG_KNOWN_NH properly.

In the next patch, ip6_pol_route() will honor the FLOWI_FLAG_KNOWN_NH
and create a RTF_CACHE clone.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Julian Anastasov <ja@ssi.bg>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:34 -04:00
Martin KaFai Lau
b197df4f0f ipv6: Add rt6_get_cookie() function
Instead of doing the rt6->rt6i_node check whenever we need
to get the route's cookie.  Refactor it into rt6_get_cookie().
It is a prep work to handle FLOWI_FLAG_KNOWN_NH and also
percpu rt6_info later.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:34 -04:00
Martin KaFai Lau
45e4fd2668 ipv6: Only create RTF_CACHE routes after encountering pmtu exception
This patch creates a RTF_CACHE routes only after encountering a pmtu
exception.

After ip6_rt_update_pmtu() has inserted the RTF_CACHE route to the fib6
tree, the rt->rt6i_node->fn_sernum is bumped which will fail the
ip6_dst_check() and trigger a relookup.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:33 -04:00
Martin KaFai Lau
8b9df26577 ipv6: Combine rt6_alloc_cow and rt6_alloc_clone
A prep work for creating RTF_CACHE on exception only.  After this
patch, the same condition (rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY))
is checked twice. This redundancy will be removed in the later patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:33 -04:00
Martin KaFai Lau
2647a9b070 ipv6: Remove external dependency on rt6i_gateway and RTF_ANYCAST
When creating a RTF_CACHE route, RTF_ANYCAST is set based on rt6i_dst.
Also, rt6i_gateway is always set to the nexthop while the nexthop
could be a gateway or the rt6i_dst.addr.

After removing the rt6i_dst and rt6i_src dependency in the last patch,
we also need to stop the caller from depending on rt6i_gateway and
RTF_ANYCAST.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:33 -04:00
Martin KaFai Lau
fd0273d793 ipv6: Remove external dependency on rt6i_dst and rt6i_src
This patch removes the assumptions that the returned rt is always
a RTF_CACHE entry with the rt6i_dst and rt6i_src containing the
destination and source address.  The dst and src can be recovered from
the calling site.

We may consider to rename (rt6i_dst, rt6i_src) to
(rt6i_key_dst, rt6i_key_src) later.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:32 -04:00
Martin KaFai Lau
286c2349f6 ipv6: Clean up ipv6_select_ident() and ip6_fragment()
This patch changes the ipv6_select_ident() signature to return a
fragment id instead of taking a whole frag_hdr as a param to
only set the frag_hdr->identification.

It also cleans up ip6_fragment() to obtain the fragment id at the
beginning instead of using multiple "if" later to check fragment id
has been generated or not.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 13:25:32 -04:00
Hariprasad Shenai
01b6961410 cxgb4: Add PHY firmware support for T420-BT cards
Add support for flashing 10GBaseT adapter with BCM 84834 PHY and
Aquantia AQ1202 PHY.

Updating of the PHY firmware must happen before the INITIALIZE_CMD.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:17:24 -04:00
Daniel Borkmann
3b52960266 test_bpf: add more eBPF jump torture cases
Add two more eBPF test cases for JITs, i.e. the second one revealed a
bug in the x86_64 JIT compiler, where only an int3 filled image from
the allocator was emitted and later wrongly set by the compiler as the
bpf_func program code since optimization pass boundary was surpassed
w/o actually emitting opcodes.

Interpreter:

  [   45.782892] test_bpf: #242 BPF_MAXINSNS: Very long jump backwards jited:0 11 PASS
  [   45.783062] test_bpf: #243 BPF_MAXINSNS: Edge hopping nuthouse jited:0 14705 PASS

After x86_64 JIT (fixed):

  [   80.495638] test_bpf: #242 BPF_MAXINSNS: Very long jump backwards jited:1 6 PASS
  [   80.495957] test_bpf: #243 BPF_MAXINSNS: Edge hopping nuthouse jited:1 17157 PASS

Reference: http://thread.gmane.org/gmane.linux.network/364729
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:15:18 -04:00
David S. Miller
cff497c870 Merge branch 'amd-xgbe-next'
Tom Lendacky says:

====================
amd-xgbe: AMD XGBE driver updates 2015-05-22

The following patches are included in this driver update series:

- Retrieve and set an additional hardware feature setting
- Fix the initial mode/speed determination when auto-negotiation is
  disabled
- Add additional netif_dbg support to the driver

This patch series is based on net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:13:58 -04:00
Lendacky, Thomas
d5c78399b0 amd-xgbe: Add more netif_dbg output to the driver
Change more netdev_dbg statements over to netif_dbg and add some new
netif_dbg statements to the driver.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:13:58 -04:00
Lendacky, Thomas
471e14b232 amd-xgbe: Fix initial mode when auto-negotiation is disabled
When the ethtool command is used to set the speed of the device while
the device is down, the check to set the initial mode may fail when
the device is brought up, causing failure to bring the device up.

Update the code to set the initial mode based on the desired speed if
auto-negotiation is disabled.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:13:58 -04:00
Lendacky, Thomas
73c259165e amd-xgbe: Add setting of a missing hardware feature
The device private data structure contains all the defined hardware
features for the device. However one of the features is not set. Even
though the feature is not currently used, set it to avoid future
issues of the feature being checked thinking it has been properly set.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:13:58 -04:00
Florian Westphal
cf82624432 ip: reject too-big defragmented DF-skb when forwarding
Send icmp pmtu error if we find that the largest fragment of df-skb
exceeded the output path mtu.

The ip output path will still catch this later on but we can avoid the
forward/postrouting hook traversal by rejecting right away.

This is what ipv6 already does.

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:08:48 -04:00
David S. Miller
b10e3d6c2e Merge branch 'af_unix_sendpage'
Hannes Frederic Sowa says:

====================
net: af_unix: zerocopy stream bits

This series implements zerocopy support for AF_UNIX SOCK_STREAM sockets.

Changelog in the specific patches. Thanks to all the reviewers!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:06:59 -04:00
Hannes Frederic Sowa
2b514574f7 net: af_unix: implement splice for stream af_unix sockets
unix_stream_recvmsg is refactored to unix_stream_read_generic in this
patch and enhanced to deal with pipe splicing. The refactoring is
inneglible, we mostly have to deal with a non-existing struct msghdr
argument.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:06:59 -04:00
Hannes Frederic Sowa
a60e3cc7c9 net: make skb_splice_bits more configureable
Prepare skb_splice_bits to be able to deal with AF_UNIX sockets.

AF_UNIX sockets don't use lock_sock/release_sock and thus we have to
use a callback to make the locking and unlocking configureable.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:06:59 -04:00
Hannes Frederic Sowa
869e7c6248 net: af_unix: implement stream sendpage support
This patch implements sendpage support for AF_UNIX SOCK_STREAM
sockets. This is also required for a complete splice implementation.

The implementation is a bit tricky because we append to already existing
skbs and so have to hold unix_sk->readlock to protect the reading side
from either advancing UNIXCB.consumed or freeing the skb at the socket
receive tail.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:06:58 -04:00
Hannes Frederic Sowa
be12a1fe29 net: skbuff: add skb_append_pagefrags and use it
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-25 00:06:58 -04:00
David S. Miller
d98c3edcbb ath10k:
* enable channel 144 on 5 GHz
 * enable Adaptive Noise Immunity (ANI) by default
 * add Wake on Wireless LAN (WOW) patterns support
 * add basic Tunneled Direct Link Setup (TDLS) support
 * add multi-channel support for QCA6174
 * enable IBSS RSN support
 * enable Bluetooth Coexistance whenever firmware supports it
 * add more versatile way to set bitrates used by the firmware
 
 ath9k:
 
 * spectral scan: add support for multiple FFT frames per report
 
 iwlwifi:
 
 * major rework of the scan code (Luca)
 * some work on the thermal code (Chaya Rachel)
 * some work on the firwmare debugging infrastructure
 
 brcmfmac:
 
 * SDIO suspend and resume fixes
 * wiphy band info and changes in regulatory settings
 * add support for BCM4324 SDIO and BCM4358 PCIe
 * enable support of PCIe devices on router platforms (Hante)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJVXbP3AAoJEG4XJFUm622b5KoH/1qlTHsKcyvdxlhQOgYNGCXA
 HNMwcxtwFyRYHFeVTGOQp2BVknEoqWTwGv1m4FQ1pBSSwuUvAyw4BHNSRat/zaNc
 wLnZgUYKH5VHeoE/cpe/Asowau+u8hru1adPsVSjudTXMinKrNaDUfjSs2U+UR0+
 BaC3PtsANk7wH82+bZq3qXYjcaZITObDe3WBmMNMG0nTimS6pScgnTUnfHch+CEA
 0sTOlZF+QTGiH/c5tw2SAoRft4OG+oTnWYQ+vEEQsVev7Yegasa/kg4NdDVdjBNk
 9VH9aDlQfGgxodCoeJuQCDzUZL8ixnvYTLeUTxqypzx9Cw0TsLDwoMQA+Ux3G8w=
 =JSya
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-for-davem-2015-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
ath10k:

* enable channel 144 on 5 GHz
* enable Adaptive Noise Immunity (ANI) by default
* add Wake on Wireless LAN (WOW) patterns support
* add basic Tunneled Direct Link Setup (TDLS) support
* add multi-channel support for QCA6174
* enable IBSS RSN support
* enable Bluetooth Coexistance whenever firmware supports it
* add more versatile way to set bitrates used by the firmware

ath9k:

* spectral scan: add support for multiple FFT frames per report

iwlwifi:

* major rework of the scan code (Luca)
* some work on the thermal code (Chaya Rachel)
* some work on the firwmare debugging infrastructure

brcmfmac:

* SDIO suspend and resume fixes
* wiphy band info and changes in regulatory settings
* add support for BCM4324 SDIO and BCM4358 PCIe
* enable support of PCIe devices on router platforms (Hante)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:23:01 -04:00
David S. Miller
4029685acc Merge branch 'mlx4-next'
Or Gerlitz says:

====================
mlx4: Enable single ported VFs over IB ports

This series further enhances the support for mlx4 single ported VFs
introduced in 3.15 to work over IB ports too.

Just as quick reminder, the ConnectX3 device family exposes one PCI device
which serves both ports.

This can be non-optimal under virtualization schemes where the admin
would like the VF to expose one interface to the VM, etc.

Since all the VF interaction with the firmware passes through the PF
driver, we can emulate to the VF they have one port, and further create
a set of the VFs which act on port1 of the device and another set which
acts on port2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:05:10 -04:00
Or Gerlitz
be9b9eca25 net/mlx4_core: Enable single ported IB VFs
Remove the limitation that disallows configuring single ported VFs
in the presence of IB ports, after addressing the issues that
prevented that to work.

SMI (QP0) requests/responses are still not supported for single
ported IB VFs.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:05:10 -04:00
Or Gerlitz
e5dfbf9a79 net/mlx4_core: Adjust the schedule queue port in reset-to-init too
It's legal for drivers to provide the QP port through the
QPC schedule-queue field on the reset-to-init QP state change.

Add adjusting of the schedule queue port in the SRIOV wrapper
for that operation too.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:05:10 -04:00
Or Gerlitz
f40e99e927 net/mlx4_core: Adjust the schedule queue port for single ported IB VFs
Some VF drivers flow set the schedule queue in the QP context but
without setting none of OPTPAR_SCHED_QUEUE or OPTPAR_PRIMARY_ADDR_PATH.

To allow for such non-modified drivers to function as single ported
IB VFs, we must adjust the schedule queue port whenever being set,
e.g as currently done for single ported Eth VFs.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:05:09 -04:00
Or Gerlitz
74d4943fbb net/mlx4_core: Modify port values when generting EQEs for VFs
As part of enabling single ported VFs over IB ports we need to handle
some of the flows for generting EQ events for VFs which don't come
into play under Eth ports.

This mainly includes port management events derived from changes of the
phyiscal port (lid change, client re-register, down/up, etc), VF pkey table
changes and VF guid changes initiated by the IB driver.

(1) make sure that events are generated only for VFs sitting on
    the relevant physical port (under the ALL_SLAVES flow).

(2) before generating the event, convert from physical (one or two)
    to VF port (always equals one).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:05:09 -04:00
Or Gerlitz
430910b1b9 IB/mlx4: Convert slave port before building address-handle
When multiplexling a MAD sent from VF, we should convert the port used
by the guest to send the packet to the actual physical port which will be
used to transmit the packet, before building the relevant address-handle (AH).

This is needed under VPI for single ported VFs, since the code that builds
the AH (mlx4_ib_query_ah()) makes decisions based on the input port. If we
use the port number provided by the guest, it might have different protocol
vs. the one this packat has to go from, and hence the result could be wrong.

So far, the conversion was done after the AH was built and it worked for
single ported Eth VFs which were not enabled under VPI. When adding support
for single ported IB VFs and VPI, we hit that.

Fixes: 449fc48866 ('net/mlx4: Adapt code for N-Port VF')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:05:09 -04:00
Or Gerlitz
7c35ef4525 net/mlx4_core: Enhance the MAD_IFC wrapper to convert VF port to physical
Single port VFs always provide port = 1 (even if the actual physical
port used is port 2). As such, we need to convert the port provided
by the VF to the physical port before calling into the firmware.

It turns out that the Linux mlx4 VF RoCE driver maintains a copy of
the GID table and hence this change became critical only for single
ported IB VFs, but it could be needed for other RoCE VF drivers too.

Fixes: 449fc48866 ('net/mlx4: Adapt code for N-Port VF')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:05:09 -04:00
Geert Uytterhoeven
29044b5886 enic: Grammar s/an negative/a negative/
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-24 23:03:05 -04:00
David S. Miller
36583eb54d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/cadence/macb.c
	drivers/net/phy/phy.c
	include/linux/skbuff.h
	net/ipv4/tcp.c
	net/switchdev/switchdev.c

Switchdev was a case of RTNH_H_{EXTERNAL --> OFFLOAD}
renaming overlapping with net-next changes of various
sorts.

phy.c was a case of two changes, one adding a local
variable to a function whilst the second was removing
one.

tcp.c overlapped a deadlock fix with the addition of new tcp_info
statistic values.

macb.c involved the addition of two zyncq device entries.

skbuff.h involved adding back ipv4_daddr to nf_bridge_info
whilst net-next changes put two other existing members of
that struct into a union.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-23 01:22:35 -04:00
David S. Miller
fa7912be96 Merge branch 'pktgen-new-scripts'
Jesper Dangaard Brouer says:

====================
pktgen: cleanups and introducing new samples/pktgen scripts

v3:
 - Aborted v2 send due it was not generating diff stat
   (this is a bug in stg-mail, if not in the root directory)

v2: address nitpicks from Cong Wang
 - Remove useless cat's, but keep them for old pgset()
 - Comment on: Due to pgctrl, cannot use exit code $? from grep
 - Use arithmetic compare in pktgen_sample03_burst_single_flow.sh

This patchset is focused on making pktgen easier to use and better
documented. It contains a number of documentation updates and minor
changes to pktgen.  The major contribution is introduction of common
helper function for sample scripts.

Instead of the old pgset() function, three new shell functions for
configuring the different components of pktgen are introduced:
 pg_ctrl(), pg_thread() and pg_set().

The new functions correspond to pktgens different components.
 * pg_ctrl()   control "pgctrl" (/proc/net/pktgen/pgctrl)
 * pg_thread() control the kernel threads and binding to devices
 * pg_set()    control setup of individual devices

Helpers also provide consistent parameter parsing across the sample
scripts.

Usage example:
 ./pktgen_sample01_simple.sh -i eth41 -m 00:12:C0:02:AC:5A -d 192.168.41.2

Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
  -i : ($DEV)       output interface/device (required)
  -s : ($PKT_SIZE)  packet size
  -d : ($DEST_IP)   destination IP
  -m : ($DST_MAC)   destination MAC-addr
  -t : ($THREADS)   threads to start
  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
  -b : ($BURST)     HW level bursting of SKBs
  -v : ($VERBOSE)   verbose
  -x : ($DEBUG)     debug

These scripts are borrowed from:
 https://github.com/netoptimizer/network-testing/tree/master/pktgen
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:23 -04:00
Jesper Dangaard Brouer
05a14d5e17 pktgen: add benchmark script pktgen_bench_xmit_mode_netif_receive.sh
This script pktgen_bench_xmit_mode_netif_receive.sh is a benchmark
script, which can be used for benchmarking part of the network stack.
This can be used for performance improving or catching regression in
that area.

The script is developed for benchmarking ingress qdisc path, original
idea by Alexei Starovoitov.  This script don't really need any
hardware.  This is achieved via the recently introduced stack inject
feature "xmit_mode netif_receive". See commit 62f64aed62 ("pktgen:
introduce xmit_mode '<start_xmit|netif_receive>'").

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer
1d73ba16ad pktgen: add sample script pktgen_sample03_burst_single_flow.sh
Add the pktgen samples script pktgen_sample03_burst_single_flow.sh
that demonstrates how to acheive maximum performance.

If correctly tuned[1] single CPU 10Gbit/s wirespeed small pkts is
possible[2] which is 14.88Mpps.  The trick is to take advantage of the
"burst" feature introduced in commit 38b2cf2982 ("net: pktgen:
packet bursting via skb->xmit_more").

[1] http://netoptimizer.blogspot.dk/2014/06/pktgen-for-network-overload-testing.html
[2] http://netoptimizer.blogspot.dk/2014/10/unlocked-10gbps-tx-wirespeed-smallest.html

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer
282fb58947 pktgen: add sample script pktgen_sample02_multiqueue.sh
Add the pktgen samples script pktgen_sample02_multiqueue.sh that
demonstrates generating packets on multiqueue NICs.

Specifically notice the options "-t" that specifies how many
kernel threads to activate.  Also notice the flag QUEUE_MAP_CPU,
which cause the SKB TX queue to be mapped to the CPU running the
kernel thread.  For best scalability people are also encourage to
map NIC IRQ /proc/irq/*/smp_affinity to CPU number.

Usage example with "-t" 4 threads and help:
 ./pktgen_sample02_multiqueue.sh -i eth4 -m 00:1B:21:3C:9D:F8 -t 4

Usage: ./pktgen_sample02_multiqueue.sh [-vx] -i ethX
  -i : ($DEV)       output interface/device (required)
  -s : ($PKT_SIZE)  packet size
  -d : ($DEST_IP)   destination IP
  -m : ($DST_MAC)   destination MAC-addr
  -t : ($THREADS)   threads to start
  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
  -b : ($BURST)     HW level bursting of SKBs
  -v : ($VERBOSE)   verbose
  -x : ($DEBUG)     debug

Removing pktgen.conf-2-1 and pktgen.conf-2-2 as these examples
should be covered now.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer
6f09479758 pktgen: add sample script pktgen_sample01_simple.sh
Add the first basic pktgen samples script pktgen_sample01_simple.sh,
which demonstrates the a simple use of the helper functions.
Removing pktgen.conf-1-1 as that example should be covered now.

The naming scheme pktgen_sampleNN, where NN is a number, should encourage
reading the samples in a specific order.

Script cause pktgen sending with a single thread and single interface,
and introduce flow variation via random UDP source port.

Usage example and help:
 ./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2

Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
  -i : ($DEV)       output interface/device (required)
  -s : ($PKT_SIZE)  packet size
  -d : ($DEST_IP)   destination IP
  -m : ($DST_MAC)   destination MAC-addr
  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
  -v : ($VERBOSE)   verbose
  -x : ($DEBUG)     debug

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:17 -04:00
Jesper Dangaard Brouer
b64b0d1e64 pktgen: new pktgen helper functions for samples scripts
Preparing for removing existing samples/pktgen/ scripts, and
replacing these with easier to use samples.

This commit provides two helper shell files, that can
be "included" by shell source'ing. Namely "functions.sh"
and "parameters.sh".

The parameters.sh file support easy and consistant parameter
parsing across the sample scripts.  Usage example is printed on
errors.

The functions.sh file provides, three new shell functions for
configuring the different components of pktgen: pg_ctrl(),
pg_thread() and pg_set().  A slightly improved version of the old
pgset() function is also provided for backwards compat.

The new functions correspond to pktgens different components.
 * pg_ctrl()   control "pgctrl" (/proc/net/pktgen/pgctrl)
 * pg_thread() control the kernel threads and binding to devices
 * pg_set()    control setup of individual devices

These changes are borrowed from:
 https://github.com/netoptimizer/network-testing/tree/master/pktgen

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Jesper Dangaard Brouer
4020726479 pktgen: make /proc/net/pktgen/pgctrl report fail on invalid input
Giving /proc/net/pktgen/pgctrl an invalid command just returns shell
success and prints a warning in dmesg.  This is not very useful for
shell scripting, as it can only detect the error by parsing dmesg.

Instead return -EINVAL when the command is unknown, as this provides
userspace shell scripting a way of detecting this.

Also bump version tag to 2.75, because (1) reading /proc/net/pktgen/pgctrl
output this version number which would allow to detect this small
semantic change, and (2) because the pktgen version tag have not been
updated since 2010.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Jesper Dangaard Brouer
2a1ddf27e8 pktgen: document ability to add same device to several threads
The pktgen.txt documentation still claimed that adding same device to
multiple threads were not supported, but it have been since 2008 via
commit e6fce5b916 ("pktgen: multiqueue etc.").

Document this and describe the naming scheme dev@X, as the procfile name
still need to be unique.

Fixes: e6fce5b916 ("pktgen: multiqueue etc.")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Jesper Dangaard Brouer
91db4b3c89 pktgen: doc were missing several config options
The pktgen.txt documentation over available config options were not complete.
Making the list complete by adding the following.

Pgcontrol commands:
 reset

Device commands:
 burst
 queue_map_min
 queue_map_max
 skb_priority
 tos
 traffic_class
 node
 spi
 dst6_max
 dst6_min
 vlan_cfi
 vlan_id
 vlan_p
 svlan_cfi
 svlan_id
 svlan_p

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Jesper Dangaard Brouer
d079abd181 pktgen: adjust spacing in proc file interface output
Too many spaces were introduced in commit 63adc6fb8a ("pktgen: cleanup
checkpatch warnings"), thus misaligning "src_min:" to other columns.

Fixes: 63adc6fb8a ("pktgen: cleanup checkpatch warnings")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Jesper Dangaard Brouer
d012827e81 pktgen: remove obsolete "max_before_softirq" from pktgen doc
And cleanup some whitespaces in pktgen.txt.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 23:59:16 -04:00
Linus Torvalds
cf539cbd8a Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Radeon has two displayport fixes, one for a regression.

  i915 regression flicker fix needed so 4.0 can get fixed.

  A bunch of msm fixes and a bunch of exynos fixes, these two are
  probably a bit larger than I'd like, but most of them seems pretty
  good"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (29 commits)
  drm/radeon: fix error flag checking in native aux path
  drm/radeon: retry dcpd fetch
  drm/msm/mdp5: fix incorrect parameter for msm_framebuffer_iova()
  drm/exynos: dp: Lower level of EDID read success message
  drm/exynos: cleanup exynos_drm_plane
  drm/exynos: 'win' is always unsigned
  drm/exynos: mixer: don't dump registers under spinlock
  drm/exynos: Consolidate return statements in fimd_bind()
  drm/exynos: Constify exynos_drm_crtc_ops
  drm/exynos: Fix build breakage on !DRM_EXYNOS_FIMD
  drm/exynos: mixer: Constify platform_device_id
  drm/exynos: mixer: cleanup pixelformat handling
  drm/exynos: mixer: also allow NV21 for the video processor
  drm/exynos: mixer: remove buffer count handling in vp_video_buffer()
  drm/exynos: plane: honor buffer offset for dma_addr
  drm/exynos: fb: use drm_format_num_planes to get buffer count
  drm/i915: fix screen flickering
  drm/msm: fix locking inconsistencies in gpu->destroy()
  drm/msm/dsi: Simplify the code to get the number of read byte
  drm/msm: Attach assigned encoder to eDP and DSI connectors
  ...
2015-05-22 17:34:24 -07:00
Linus Torvalds
0b6280c620 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Don't leak ipvs->sysctl_tbl, from Tommi Rentala.

 2) Fix neighbour table entry leak in rocker driver, from Ying Xue.

 3) Do not emit bonding notifications for unregistered interfaces, from
    Nicolas Dichtel.

 4) Set ipv6 flow label properly when in TIME_WAIT state, from Florent
    Fourcot.

 5) Fix regression in ipv6 multicast filter test, from Henning Rogge.

 6) do_replace() in various footables netfilter modules is missing a
    check for 0 counters in the datastructure provided by the user.  Fix
    from Dave Jones, and found with trinity.

 7) Fix RCU bug in packet scheduler classifier module unloads, from
    Daniel Borkmann.

 8) Avoid deadlock in tcp_get_info() by using u64_sync.  From Eric
    Dumzaet.

 9) Input packet processing can race with inetdev_destroy() teardown,
    fix potential OOPS in ip_error() by explicitly testing whether the
    inetdev is still attached.  From Eric W Biederman.

10) MLDv2 parser in bridge multicast code breaks too early while
    parsing.  Fix from Thadeu Lima de Souza Cascardo.

11) Asking for settings on non-zero PHYID doesn't work because we do not
    import the command structure from the user and use the PHYID
    provided there.  Fix from Arun Parameswaran.

12) Fix UDP checksums with IPV6 RAW sockets, from Vlad Yasevich.

13) Missing NF_TABLES depends for TPROXY etc can cause build failures,
    fix from Florian Westphal.

14) Fix netfilter conntrack to handle RFC5961 challenge ACKs properly,
    from Jesper Dangaard Brouer.

15) If netlink autobind retry fails, we have to reset the sockets portid
    back to zero.  From Herbert Xu.

16) VXLAN netns exit code unregisters using wrong device, from John W
    Linville.

17) Add some USB device IDs to ath3k and btusb bluetooth drivers, from
    Dmitry Tunin and Wen-chien Jesse Sung.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  bridge: fix lockdep splat
  net: core: 'ethtool' issue with querying phy settings
  bridge: fix parsing of MLDv2 reports
  ARM: zynq: DT: Use the zynq binding with macb
  net: macb: Disable half duplex gigabit on Zynq
  net: macb: Document zynq gem dt binding
  ipv4: fill in table id when replacing a route
  cdc_ncm: Fix tx_bytes statistics
  ipv4: Avoid crashing in ip_error
  tcp: fix a potential deadlock in tcp_get_info()
  net: sched: fix call_rcu() race on classifier module unloads
  net: phy: Make sure phy_start() always re-enables the phy interrupts
  ipv6: fix ECMP route replacement
  ipv6: do not delete previously existing ECMP routes if add fails
  Revert "netfilter: bridge: query conntrack about skb dnat"
  netfilter: ensure number of counters is >0 in do_replace()
  netfilter: nfnetlink_{log,queue}: Register pernet in first place
  tcp: don't over-send F-RTO probes
  tcp: only undo on partial ACKs in CA_Loss
  net/ipv6/udp: Fix ipv6 multicast socket filter regression
  ...
2015-05-22 15:44:50 -07:00
Linus Torvalds
1c8df7bd48 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Three small fixes that have been picked up the last few weeks.
  Specifically:

   - Fix a memory corruption issue in NVMe with malignant user
     constructed request.  From Christoph.

   - Kill (now) unused blk_queue_bio(), dm was changed to not need this
     anymore.  From Mike Snitzer.

   - Always use blk_schedule_flush_plug() from the io_schedule() path
     when flushing a plug, fixing a !TASK_RUNNING warning with md.  From
     Shaohua"

* 'for-linus' of git://git.kernel.dk/linux-block:
  sched: always use blk_schedule_flush_plug in io_schedule_out
  nvme: fix kernel memory corruption with short INQUIRY buffers
  block: remove export for blk_queue_bio
2015-05-22 15:15:30 -07:00
Linus Torvalds
a30ec4b347 md fixes for 4.1-rc4
- one serious RAID0 data corruption - caused by recent bugfix that wasn't
   reviewed properly.
 - one raid5 fix in new code (a couple more of those to come).
 - one little fix to stop static analysis complaining about silly rcu
   annotation.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIVAwUAVV6rmDnsnt1WYoG5AQICpQ//VleOWnZERlVQN1nYzGHqIyPUz+dV6x8W
 RfzccMzUWqaHpZY/BaTMqJxPWgt6lxqqtN5F3u+XMdM+L2Pl0NXoRTqB+FzpLucR
 d0p76FwbW2IWnycxXUZtGjm8xT942KPdqXIP+LQRKaIhguV9P9th9WbPRvlXFfB6
 pO1h1AdxVlsfn6qgBpe49pYoUuNmpwSTTJ8Gvb7YRsHXnmB/lsjQvVsWLiMqPwug
 +bR0gM07OvTfYenrY82ztu42+xAoVoeu/XhHmPPLEV3S/SA/9kVLgBcWjq2bBjw0
 REzR2i+exEtCRe2yvetDX8320IKZcZSOE8BwBB2iUm8OVjplPxw83Bk/kB2+Zr8n
 VxAa9/vCZPLNCWyzOSi2V471NeFBys9PVkVIroHT5nQZGadkw0JYt1cEbcHSBKS/
 NYmJCu1IM0o9DPzn8jW7sXjDcB0WE8VwFMJ09QSiEYQu2bAb7j2f1pcoweN2MeNF
 qOitGj/luygXg2TBvema9qPL533DUj+eSYyyO7OQBsfkfZSDM5di4bGoZBVw118Q
 kARaJ3IZffT8gsVpgSTCsviDv5QUcs0izVS1sMKB6/SDJjtLXBMERW+LkrmjBsnB
 6+CMmwNZUIzugzBgSPBVozdNQQYjAXL7LNV6Q9YQhE0RK0+10fLLxmpovJR62bRf
 PWlg23+n4+E=
 =jXKM
 -----END PGP SIGNATURE-----

Merge tag 'md/4.1-rc4-fixes' of git://neil.brown.name/md

Pull md bugfixes from Neil Brown:
 "I have a few more raid5 bugfixes pending, but I want them to get a bit
  more review first.  In the meantime:

   - one serious RAID0 data corruption - caused by recent bugfix that
     wasn't reviewed properly.

   - one raid5 fix in new code (a couple more of those to come).

   - one little fix to stop static analysis complaining about silly rcu
     annotation"

* tag 'md/4.1-rc4-fixes' of git://neil.brown.name/md:
  md/bitmap: remove rcu annotation from pointer arithmetic.
  md/raid0: fix restore to sector variable in raid0_make_request
  raid5: fix broken async operation chain
2015-05-22 15:10:07 -07:00
Linus Torvalds
1d82b0baf9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "Updates for the input subsystem.

  The main change is that we tell joydev not to touch "absolute mice",
  such as VMware virtual mouse, as that produced bad result (cursor
  stuck in upper right corner) with games"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: smtpe-ts - wait 50mS until polling for pen-up
  Input: smtpe-ts - use msecs_to_jiffies() instead of HZ
  Input: joydev - don't classify the vmmouse as a joystick
  Input: vmmouse - do not reference non-existing version of X driver
  Input: alps - fix finger jumps on lifting 2 fingers on v7 touchpad
  Input: elantech - fix semi-mt protocol for v3 HW
  Input: sx8654 - fix memory allocation check
2015-05-22 14:49:55 -07:00
Linus Torvalds
2a058f388d Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull another crypto fix from Herbert Xu:
 "Fix ICV corruption in s390/ghash when the same tfm is used by more
  than one thread"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: s390/ghash - Fix incorrect ghash icv buffer handling.
2015-05-22 14:26:36 -07:00
Eric Dumazet
93a33a584e bridge: fix lockdep splat
Following lockdep splat was reported :

[   29.382286] ===============================
[   29.382315] [ INFO: suspicious RCU usage. ]
[   29.382344] 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 Not tainted
[   29.382380] -------------------------------
[   29.382409] net/bridge/br_private.h:626 suspicious
rcu_dereference_check() usage!
[   29.382455]
               other info that might help us debug this:

[   29.382507]
               rcu_scheduler_active = 1, debug_locks = 0
[   29.382549] 2 locks held by swapper/0/0:
[   29.382576]  #0:  (((&p->forward_delay_timer))){+.-...}, at:
[<ffffffff81139f75>] call_timer_fn+0x5/0x4f0
[   29.382660]  #1:  (&(&br->lock)->rlock){+.-...}, at:
[<ffffffffa0450dc1>] br_forward_delay_timer_expired+0x31/0x140
[bridge]
[   29.382754]
               stack backtrace:
[   29.382787] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.1.0-0.rc0.git11.1.fc23.x86_64 #1
[   29.382838] Hardware name: LENOVO 422916G/LENOVO, BIOS A1KT53AUS 04/07/2015
[   29.382882]  0000000000000000 3ebfc20364115825 ffff880666603c48
ffffffff81892d4b
[   29.382943]  0000000000000000 ffffffff81e124e0 ffff880666603c78
ffffffff8110bcd7
[   29.383004]  ffff8800785c9d00 ffff88065485ac58 ffff880c62002800
ffff880c5fc88ac0
[   29.383065] Call Trace:
[   29.383084]  <IRQ>  [<ffffffff81892d4b>] dump_stack+0x4c/0x65
[   29.383130]  [<ffffffff8110bcd7>] lockdep_rcu_suspicious+0xe7/0x120
[   29.383178]  [<ffffffffa04520f9>] br_fill_ifinfo+0x4a9/0x6a0 [bridge]
[   29.383225]  [<ffffffffa045266b>] br_ifinfo_notify+0x11b/0x4b0 [bridge]
[   29.383271]  [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
[   29.383320]  [<ffffffffa0450de8>]
br_forward_delay_timer_expired+0x58/0x140 [bridge]
[   29.383371]  [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
[   29.383416]  [<ffffffff8113a033>] call_timer_fn+0xc3/0x4f0
[   29.383454]  [<ffffffff81139f75>] ? call_timer_fn+0x5/0x4f0
[   29.383493]  [<ffffffff8110a90f>] ? lock_release_holdtime.part.29+0xf/0x200
[   29.383541]  [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
[   29.383587]  [<ffffffff8113a6a4>] run_timer_softirq+0x244/0x490
[   29.383629]  [<ffffffff810b68cc>] __do_softirq+0xec/0x670
[   29.383666]  [<ffffffff810b70d5>] irq_exit+0x145/0x150
[   29.383703]  [<ffffffff8189f506>] smp_apic_timer_interrupt+0x46/0x60
[   29.383744]  [<ffffffff8189d523>] apic_timer_interrupt+0x73/0x80
[   29.383782]  <EOI>  [<ffffffff816f131f>] ? cpuidle_enter_state+0x5f/0x2f0
[   29.383832]  [<ffffffff816f131b>] ? cpuidle_enter_state+0x5b/0x2f0

Problem here is that br_forward_delay_timer_expired() is a timer
handler, calling br_ifinfo_notify() which assumes either rcu_read_lock()
or RTNL are held.

Simplest fix seems to add rcu read lock section.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Reported-by: Dominick Grift <dac.override@gmail.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22 16:23:56 -04:00