Commit Graph

25387 Commits

Author SHA1 Message Date
Michael Chan
74706afa71 bnxt_en: Update interrupt coalescing logic.
New firmware spec. allows interrupt coalescing parameters, such as
maximums, timer units, supported features to be queried.  Update
the driver to make use of the new call to query these parameters
and provide the legacy defaults if the call is not available.

Replace the hard-coded values with these parameters.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Michael Chan
1dfddc41ae bnxt_en: Add maximum extended request length fw message support.
Support the max_ext_req_len field from the HWRM_VER_GET_RESPONSE.
If this field is valid and greater than the mailbox size, use the
short command format to send firmware messages greater than the
mailbox size.  Newer devices use this method to send larger messages
to the firmware.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Michael Chan
36e53349b6 bnxt_en: Add additional extended port statistics.
Latest firmware spec. has some additional rx extended port stats and new
tx extended port stats added.  We now need to check the size of the
returned rx and tx extended stats and determine how many counters are
valid.  New counters added include CoS byte and packet counts for rx
and tx.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Michael Chan
31d357c069 bnxt_en: Update firmware interface spec. to 1.10.0.3.
Among the new changes are trusted VF support, 200Gbps support, and new
API to dump ring information on the new chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Colin Ian King
fbe1222c63 qed: fix spelling mistake "Ireelevant" -> "Irrelevant"
Trivial fix to spelling mistake in DP_INFO message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:40:27 -07:00
Heiner Kallweit
2527e4037f r8169: remove unneeded call to netif_stop_queue in rtl8169_net_suspend
netif_device_detach() stops all tx queues already, so we don't need
this call.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:35:03 -07:00
Heiner Kallweit
34bc009543 r8169: simplify rtl8169_set_magic_reg
Simplify this function, no functional change intended.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:34:34 -07:00
Arnd Bergmann
44eb385bc5 octeontx2-af: remove unused cgx_fwi_link_change
The newly added driver causes a warning about a function that is
not used anywhere:

drivers/net/ethernet/marvell/octeontx2/af/cgx.c:320:12: error: 'cgx_fwi_link_change' defined but not used [-Werror=unused-function]

Remove it for now, until a user gets added. If we want to use this
function from another module, we also need a declaration in a header
file, which is currently missing, so it would have to change anyway.

Fixes: 1463f382f5 ("octeontx2-af: Add support for CGX link management")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:31:54 -07:00
Ryan C Goodfellow
5948185b97 nfp: devlink port split support for 1x100G CXP NIC
This commit makes it possible to use devlink to split the 100G CXP
Netronome into two 40G interfaces. Currently when you ask for 2
interfaces, the math in src/nfp_devlink.c:nfp_devlink_port_split
calculates that you want 5 lanes per port because for some reason
eth_port.port_lanes=10 (shouldn't this be 12 for CXP?). What we really
want when asking for 2 breakout interfaces is 4 lanes per port. This
commit makes that happen by calculating based on 8 lanes if 10 are
present.

Signed-off-by: Ryan C Goodfellow <rgoodfel@isi.edu>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Greg Weeks <greg.weeks@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:29:55 -07:00
Ioana Radulescu
b948c8c6a7 dpaa2-eth: remove unused FD field
According to the hardware ArchDef, the PTV1 field in FD[CTRL]
is ignored by WRIOP, so setting it for Tx FDs is pointless.

Remove all references to it from the code.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:23:19 -07:00
Ioana Ciornei
b00c898c00 dpaa2-eth: mark unused parameter in dpaa2_eth_tx_conf
The ch parameter is never used in the dpaa2_eth_tx_conf function but
since its prototype must match the type defined in the consume field of
struct dpaa2_eth_fq, just mark it as __always_unused.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:23:19 -07:00
Ioana Ciornei
fdb6ca9e46 dpaa2-eth: remove unused priv parameter
The priv parameter is never used in the build_linear_skb and
drain_channel function. Remove it from the function definitions.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:23:19 -07:00
Ioana Ciornei
85b7a342ba dpaa2-eth: fix uninitialized variable warnings
All 3 cases of possible uninitialized variables are false
positives since they are used only as output parameters.
Nonetheless, fix the warnings.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:23:19 -07:00
Ioana Ciornei
3233c1514f dpaa2-eth: make dpaa2_eth_set_dist_key static
The dpaa2_eth_set_dist_key function is only used in a single file.
Make it static.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:23:19 -07:00
Ioana Radulescu
b12cef51b5 dpaa2-eth: Fix Kconfig dependencies
Both ARCH_LAYERSCAPE and COMPILE_TEST dependencies are already implied
through the FSL_MC_BUS dep, so there's no need to state it explicitly.

Also, the fsl-mc bus depends on COMPILE_TEST only for some
architectures (arm, arm64, ppc, x86), so it's not correct to
claim build support unconditionally.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:23:19 -07:00
Ivan Khoronzhuk
5b3a5a14f8 net: ethernet: ti: cpsw: use for mcast entries only host port
In dual-emac mode the cpsw driver sends directed packets, that means
that packets go to the directed port, but an ALE lookup is performed
to determine untagged egress only. It means that on tx side no need
to add port bit for ALE mcast entry mask, and basically ALE entry
for port identification is needed only on rx side.

So, add only host port in dual_emac mode as used directed
transmission, and no need in one more port. For single port boards
and switch mode all ports used, as usual, so no changes for them.
Also it simplifies farther changes.

In other words, mcast entries for dual-emac should behave exactly
like unicast. It also can help avoid leaking packets between ports
with same vlan on h/w level if ports could became members of same vid.

So now, for instance, if mcast address 33:33:00:00:00:01 is added then
entries in ALE table:

vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x1
vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x1

Instead of:
vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x3
vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x5

With the same considerations, set only host port for unregistered
mcast for dual-emac mode in case of IFF_ALLMULTI is set, exactly like
it's done in cpsw_ale_set_allmulti().

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:22:12 -07:00
Ivan Khoronzhuk
5da1948969 net: ethernet: ti: cpsw: fix lost of mcast packets while rx_mode update
Whenever kernel or user decides to call rx mode update, it clears
every multicast entry from forwarding table and in some time adds
it again. This time can be enough to drop incoming multicast packets.

That's why clear only staled multicast entries and update or add new
one afterwards.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:21:28 -07:00
Ivan Khoronzhuk
58bdeac8b0 net: ethernet: ti: cpsw_ale: use const for API having pointer on mac address
It allows to use function under callbacks with same const qualifier of
mac address for farther changes.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:21:27 -07:00
Florian Fainelli
64bd9c8135 net: bcmgenet: Poll internal PHY for GENETv5
On GENETv5, there is a hardware issue which prevents the GENET hardware
from generating a link UP interrupt when the link is operating at
10Mbits/sec. Since we do not have any way to configure the link
detection logic, fallback to polling in that case.

Fixes: 421380856d ("net: bcmgenet: add support for the GENETv5 hardware")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:10:21 -07:00
David S. Miller
d0f068e572 mlx5-fixes-2018-10-10
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbvqcMAAoJEEg/ir3gV/o+eFsH/2TbJH+i1BuGMVwCB8o+U1Rz
 C01pJmR7Lb7WwQZ8ZKTOqQkS7BkGX1hNGyIlc4i6ZnP+4gsVJAbP6LKPjTvyD7e6
 TNb8bvxTUCOovknrevKkGba8tzoTTsC4wwwbHLGHd1hkKSY1P5hXg8R7vpear+n6
 /PFJwzpIXDAa8AHqeORCNYj7MneUm3kaahcmSOxOhvDbRx3UG9cgy7tEhPjZbRn5
 jPFsxFCSPcGedtI+g8bzodmpneTcu1KF6QCunrl2bGt5EzgDrbaw1UUoctxD2CJR
 Ch45W807EvBJoFiJXXCNf9N+p5020F/Q+mTmK7khPirUjdtoLdcT9Goswpjfbtk=
 =vJIA
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2018-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2018-10-10

This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.

For -stable v4.11:
('net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type')
For -stable v4.17:
('net/mlx5: Fix memory leak when setting fpga ipsec caps')
For -stable v4.18:
('net/mlx5: WQ, fixes for fragmented WQ buffers API')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 21:51:28 -07:00
David S. Miller
1986647c2f mlx5e-updates-2018-10-10
IPoIB netlink support and mlx5e pre-allocated netdevice initialization
 
 IP link was broken due to the changes in IPoIB for the rdma_netdev
 support after commit cd565b4b51
 ("IB/IPoIB: Support acceleration options callbacks").
 
 This patchset fixes IPoIB pkey creation and removal using rtnetlink by
 adding support in both IPoIB ULP layer and mlx5 layer:
 
 From Jason and Denis:
 1) Introduces changes in the RDMA netdev code in order to
    allow allocation of the netdev to be done by the rtnl netdev code.
 2) Reworks IPoIB initialization to use the two step rdma_netdev
    creation.
 
 From Feras and Saeed, mlx5e netdev layer refactoring to allow accepting
 pre-allocated netdevs:
 3) Adds support to initialize/cleanup netdevs that are not created
    by mlx5 driver.
 4) Change mlx5e netdevice layer to accept the pre-allocated netdevice
    queue number.
 5) Initialize mlx5e generic structures in one place to be used for all
    netdevs types NIC/representors/IPoIB (both mlx5 allocated and
    pre-allocted).
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbvqHhAAoJEEg/ir3gV/o+TRQH+wYWgvi5AbqYlGpT8xSho8Dg
 5tuFE9AbYDGLLWptZgtYXsBsTaltGVqKnwWy//dAuuSI7LvqtjeKqq0G1JKau70L
 /49+vv8NTQ6vov2yY95yzzEo5B1/zVcYEl9dmJl4y6Xlwc+YIteewReVqRj+tucR
 xO7ghLWc1o4Pl7gWBAcOULyOsZc+CtG8ElwEaBROXTtYRpsR7FKn7vN+WdWZ2VOG
 MjtCUaG0vZlK1vaF76J3P9bL2V3y6o6i6S8RZwg1kPxO4jnrzyGrkxWMOX6f32KB
 JAUcLVlJW5wckcRAKfTwLxsFMrc9vLBPHyj4XneXVWGKh8/dGUGY5gdGIJV5Jlc=
 =m0tM
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-10-10

IPoIB netlink support and mlx5e pre-allocated netdevice initialization

IP link was broken due to the changes in IPoIB for the rdma_netdev
support after commit cd565b4b51
("IB/IPoIB: Support acceleration options callbacks").

This patchset fixes IPoIB pkey creation and removal using rtnetlink by
adding support in both IPoIB ULP layer and mlx5 layer:

From Jason and Denis:
1) Introduces changes in the RDMA netdev code in order to
   allow allocation of the netdev to be done by the rtnl netdev code.
2) Reworks IPoIB initialization to use the two step rdma_netdev
   creation.

From Feras and Saeed, mlx5e netdev layer refactoring to allow accepting
pre-allocated netdevs:
3) Adds support to initialize/cleanup netdevs that are not created
   by mlx5 driver.
4) Change mlx5e netdevice layer to accept the pre-allocated netdevice
   queue number.
5) Initialize mlx5e generic structures in one place to be used for all
   netdevs types NIC/representors/IPoIB (both mlx5 allocated and
   pre-allocted).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 21:49:56 -07:00
Jian-Hong Pan
d49c88d767 r8169: Enable MSI-X on RTL8106e
Originally, we have an issue where r8169 MSI-X interrupt is broken after
S3 suspend/resume on RTL8106e of ASUS X441UAR.

02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
(rev 07)
	Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
Ethernet controller [1043:200f]
	Flags: bus master, fast devsel, latency 0, IRQ 16
	I/O ports at e000 [size=256]
	Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
	Memory at e0000000 (64-bit, prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 01
	Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
	Capabilities: [d0] Vital Product Data
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Virtual Channel
	Capabilities: [160] Device Serial Number 01-00-00-00-36-4c-e0-00
	Capabilities: [170] Latency Tolerance Reporting
	Kernel driver in use: r8169
	Kernel modules: r8169

We found the all of the values in PCI BAR=4 of the ethernet adapter
become 0xFF after system resumes.  That breaks the MSI-X interrupt.
Therefore, we can only fall back to MSI interrupt to fix the issue at
that time.

However, there is a commit which resolves the drivers getting nothing in
PCI BAR=4 after system resumes.  It is 04cb3ae895d7 "PCI: Reprogram
bridge prefetch registers on resume" by Daniel Drake.

After apply the patch, the ethernet adapter works fine before suspend
and after resume.  So, we can revert the workaround after the commit
"PCI: Reprogram bridge prefetch registers on resume" is merged into main
tree.

This patch reverts commit 7bb05b85bc
"r8169: don't use MSI-X on RTL8106e".

Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=201181
Fixes: 7bb05b85bc ("r8169: don't use MSI-X on RTL8106e")
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 21:31:53 -07:00
David S. Miller
d864991b22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were easy to resolve using immediate context mostly,
except the cls_u32.c one where I simply too the entire HEAD
chunk.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 21:38:46 -07:00
Jian Shen
829edbd8d4 net: hns3: Resume promisc mode and vlan filter status after loopback test
This patch resumes promisc mode and vlan filter status after
loopback test.

Fixes: 3b75c3df59 ("net: hns3: net: hns3: Add support for IFF_ALLMULTI flag")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 11:23:45 -07:00
Jian Shen
7325523ab6 net: hns3: Resume promisc mode and vlan filter status after reset
This patch resumes promisc mode and vlan filter status after reset.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 11:23:45 -07:00
Jian Shen
c60edc17df net: hns3: Enable promisc mode when mac vlan table is full
Currently, the driver does nothing when mac vlan table is full.
In this case, the packet with new mac address will be dropped
by hardware. This patch adds check for the result of sync mac
address, and enable promisc mode when mac vlan table is full.
Furtherly, disable vlan filter when enable promisc by user
command.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 11:23:45 -07:00
Jakub Kicinski
96de25060d nfp: replace long license headers with SPDX
Replace the repeated license text with SDPX identifiers.
While at it bump the Copyright dates for files we touched
this year.

Signed-off-by: Edwin Peer <edwin.peer@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 12:16:21 -07:00
Maciej S. Szmigiero
511cfd580f r8169: set RX_MULTI_EN bit in RxConfig for 8168F-family chips
It has been reported that since
commit 05212ba813 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices")
at least RTL_GIGA_MAC_VER_38 NICs work erratically after a resume from
suspend.
The problem has been traced to a missing RX_MULTI_EN bit in the RxConfig
register.
We already set this bit for RTL_GIGA_MAC_VER_35 NICs of the same 8168F
chip family so let's do it also for its other siblings: RTL_GIGA_MAC_VER_36
and RTL_GIGA_MAC_VER_38.

Curiously, the NIC seems to work fine after a system boot without having
this bit set as long as the system isn't suspended and resumed.

Fixes: 05212ba813 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices")
Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 12:08:04 -07:00
Ilias Apalodimas
2a1e89df78 net: socionext: clear rx irq correctly
commit 63ae7949e9 ("net: socionext: Use descriptor info instead of MMIO reads on Rx")
removed constant mmio reads from the driver and started using a descriptor
field to check if packet should be processed.
This lead the napi rx handler being constantly called while no packets
needed processing and ksoftirq getting 100% cpu usage. Issue one mmio read
to clear the irq correcty after processing packets

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 12:04:08 -07:00
Moshe Shemesh
2645060834 net/mlx4_core: Fix warnings during boot on driverinit param set failures
During boot, mlx4_core sets the driverinit configuration parameters and
updates the devlink module on the initial values calling
devlink_param_driverinit_value_set().
If devlink_param_driverinit_value_set() returns an error mlx4_core
reports kernel module warning.

This caused false alarm during boot in case kernel was compiled with
CONFIG_NET_DEVLINK off.
Fix by removing warning reported in case
devlink_param_driverinit_value_set() fails.

This actually makes the function mlx4_devlink_set_init_value()
redundant to using directly devlink_param_driverinit_value_set() and so
removed.

It fixes the following kernel trace:

 mlx4_core 0000:00:06.0: devlink set parameter 0 value failed (err = -95)
 mlx4_core 0000:00:06.0: devlink set parameter 1 value failed (err = -95)
 mlx4_core 0000:00:06.0: devlink set parameter 4 value failed (err = -95)
 mlx4_core 0000:00:06.0: devlink set parameter 5 value failed (err = -95)
 mlx4_core 0000:00:06.0: devlink set parameter 3 value failed (err = -95)

Fixes: bd1b51dc66 ("mlx4: Add mlx4 initial parameters table and register it")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:24:35 -07:00
Arnd Bergmann
e70a57fa59 cxgb4: fix thermal configuration dependencies
With CONFIG_THERMAL=m, we get a build error:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.c: In function 'cxgb4_thermal_get_trip_type':
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.c:48:11: error: 'struct adapter' has no member named 'ch_thermal'

Once that is fixed by using IS_ENABLED() checks, we get a link error
against the thermal subsystem when cxgb4 is built-in:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_init':
cxgb4_thermal.c:(.text+0x180): undefined reference to `thermal_zone_device_register'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_remove':
cxgb4_thermal.c:(.text+0x1e0): undefined reference to `thermal_zone_device_unregister'

Finally, since CONFIG_THERMAL can be =m, the Makefile fails to pick up the
extra file into built-in.a, and we get another link failure against the
cxgb4_thermal_init/cxgb4_thermal_remove files, so the Makefile has to
be adapted as well to work for both CONFIG_THERMAL=y and =m.

Fixes: b187191577 ("cxgb4: Add thermal zone support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:21:15 -07:00
Arthur Kiyanovski
be26667cb3 net: ena: fix indentations in ena_defs for better readability
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
3a7b9d8ddd net: ena: update driver version to 2.0.1
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
f1e90f6e2c net: ena: remove redundant parameter in ena_com_admin_init()
Remove redundant spinlock acquire parameter from ena_com_admin_init()

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
87731f0c68 net: ena: change rx copybreak default to reduce kernel memory pressure
Improves socket memory utilization when receiving packets larger
than 128 bytes (the previous rx copybreak) and smaller than 256 bytes.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
0574bb806d net: ena: limit refill Rx threshold to 256 to avoid latency issues
Currently Rx refill is done when the number of required descriptors is
above 1/8 queue size. With a default of 1024 entries per queue the
threshold is 128 descriptors.
There is intention to increase the queue size to 8196 entries.
In this case threshold of 1024 descriptors is too large and can hurt
latency.
Add another limitation to Rx threshold to be at most 256 descriptors.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
bd791175a6 net: ena: explicit casting and initialization, and clearer error handling
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
cb36bb36e1 net: ena: use CSUM_CHECKED device indication to report skb's checksum status
Set skb->ip_summed to the correct value as reported by the device.
Add counter for the case where rx csum offload is enabled but
device didn't check it.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
38005ca816 net: ena: add functions for handling Low Latency Queues in ena_netdev
This patch includes all code changes necessary in ena_netdev to enable
packet sending via the LLQ placemnt mode.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
689b2bdaaa net: ena: add functions for handling Low Latency Queues in ena_com
This patch introduces APIs for detection, initialization, configuration
and actual usage of low latency queues(LLQ). It extends transmit API with
creation of LLQ descriptors in device memory (which include host buffers
descriptors as well as packet header)

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
a7982b8ec9 net: ena: introduce Low Latency Queues data structures according to ENA spec
Low Latency Queues(LLQ) allow usage of device's memory for descriptors
and headers. Such queues decrease processing time since data is already
located on the device when driver rings the doorbell.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
095f2f1fac net: ena: complete host info to match latest ENA spec
Add new fields and definitions to host info and fill them
according to the latest ENA spec version.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:50 -07:00
Arthur Kiyanovski
0e575f8542 net: ena: minor performance improvement
Reduce fastpath overhead by making ena_com_tx_comp_req_id_get() inline.
Also move it to ena_eth_com.h file with its dependency function
ena_com_cq_inc_head().

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:50 -07:00
Ido Schimmel
b02597d513 mlxsw: spectrum: Add NVE packet traps
The DECAP_ECN0 trap will be used to trap packets where the overlay
packet is marked with Non-ECT, but the underlay packet is marked with
either ECT(0), ECT(1) or CE. When trapped, such packets will be counted
as errors by the VxLAN driver and thus provide better visibility.

The NVE_ENCAP_ARP trap will be used to trap ARP packets undergoing NVE
encapsulation. This is needed in order to support E-VPN ARP suppression,
where the Linux bridge does not flood ARP packets through tunnel ports
in case it can answer the ARP request itself.

Note that all the packets trapped via these traps are marked with
'offload_fwd_mark', so as to not be re-flooded by the Linux bridge
through the ASIC ports.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
2bd414aef6 mlxsw: resources: Add NVE resources
Add the following resources to be used by the NVE code:
* Number of IPv4 underlay destination IPs in a single TNUMT record
* Number of IPv6 underlay destination IPs in a single TNUMT record

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
27f68c0850 mlxsw: reg: Add Monitoring Parsing State Register
This register is used for setting up the parsing for hash, policy-engine
and routing.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
0933781f11 mlxsw: reg: Add definition of unicast tunnel record for SFD register
Will be used to program the device with FDB records pointing to a NVE
tunnel.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
8efcf6bb48 mlxsw: reg: Add Tunneling NVE QoS Default Register
The TNQDR register configures the default QoS settings for NVE
encapsulation.

It will be used to set the default DSCP of each port to 0, so that when
DSCP is set to inherit and the overlay packet does not have an IP header
the outer DSCP will be set to 0, in accordance with the software data
path.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
fd6db27cac mlxsw: reg: Add Tunneling NVE QoS Configuration Register
The register configures how QoS is set in Encapsulation into the
underlay network.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
a77d5f0bde mlxsw: reg: Add Tunneling NVE Decapsulation ECN Mapping Register
This register configures the actions that are done during NVE
decapsulation based on the ECN bits.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
4a8d1860ed mlxsw: reg: Add Tunneling NVE Encapsulation ECN Mapping Register
This register performs mapping from overlay ECN to underlay ECN during
NVE encapsulation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
c723d19fad mlxsw: reg: Add Tunneling NVE Underlay Multicast Table Register
This register builds the linked list of underlay destination IPs used
for BUM traffic on the overlay.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
50e6eb2a63 mlxsw: reg: Add Tunnel Port Configuration Register
This register enables / disables learning on different types of tunnel
ports (e.g., NVE, VPLS).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:23 -07:00
Ido Schimmel
710dd1a0ec mlxsw: reg: Add Tunneling NVE General Configuration Register
This register configures global NVE configuration such as source IP of
the NVE tunnel and UDP source port calculation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
Ido Schimmel
beda7f72c3 mlxsw: spectrum: Seed LAG hash function
Currently, the seed of the LAG hash function is always set to 0, which
means it is identical across all switches. Instead, use a random number.

This is especially important now that VxLAN is supported, as the LAG
hash function is used to calculate the UDP source port of the
encapsulated packet.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
Ido Schimmel
a682a3024f mlxsw: reg: Extend FDB flush types for NVE
The device has the ability to flush all the FDB records that perform NVE
encapsulation or only a subset of these with a specific filtering
identifier (FID).

Expose these types so that they could be used by subsequent patches
where we need to flush the FDB records when an NVE device is unlinked
from a bridge (FID).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
Ido Schimmel
90ea0bb551 mlxsw: spectrum: Add a new type of KVD linear record
When the device needs to flood an overlay packet to remote VTEPs it
retrieves a pointer to the head of a linked-list of records that store
the IP addresses of these VTEPs.

These records are stored in the KVD linear memory and configured via the
Tunneling NVE Underlay Multicast Table (TNUMT) register.

Add a new KVD linear entry type for these records, so that we will be
able to allocate and free them.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
Ido Schimmel
12066d612b mlxsw: spectrum: Move L3 protocol and address definitions to global header file
The L3 protocol and address definitions are going to be used by the NVE
code, so move them to the global header file from the one private to the
router.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
Ido Schimmel
9c73b1d120 mlxsw: spectrum_switchdev: Do not assume notifier information type
VxLAN notifications are going to use a different notifier information
type, so cast to the correct type based on the received event.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
Ido Schimmel
5050f6ae25 mlxsw: spectrum_switchdev: Check notification relevance based on upper device
VxLAN FDB updates are sent with the VxLAN device which is not our upper
and will therefore be ignored by current code.

Solve this by checking whether the upper device (bridge) is our upper.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
Ido Schimmel
ab74c3a127 mlxsw: spectrum_switchdev: Prepare for VxLAN FDB notifications
VxLAN FDB notifications need to be handled differently than bridge FDB
notifications, so initialize the work item based on the received
notification and rename the invoked function accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
Ido Schimmel
bf341eb895 mlxsw: spectrum: Remove misuses of private header file
The spectrum_router.h header file is private to the router block and
should only be included by direct consumers of it, such as dpipe and the
multicast routing code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:08:22 -07:00
YueHaibing
df92062e49 octeontx2-af: Remove set but not used variable 'dev'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/marvell/octeontx2/af/cgx.c: In function 'cgx_fwi_event_handler':
drivers/net/ethernet/marvell/octeontx2/af/cgx.c:257:17: warning:
 variable 'dev' set but not used [-Wunused-but-set-variable]

It never be used since introduction in
commit 1463f382f5 ("octeontx2-af: Add support for CGX link management")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:05:58 -07:00
Antoine Tenart
60f8e67d98 net: mscc: allow extracting the FCS into the skb
This patch adds support for the NETIF_F_RXFCS feature in the Mscc
Ethernet driver. This feature is disabled by default and allow a user
to request the driver not to drop the FCS and to extract it into the skb
for debugging purposes.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:00:27 -07:00
Peng Li
232fc64b6e net: hns3: Add HW RSS hash information to RX skb
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Jian Shen
d97b307213 net: hns3: Add RSS tuples support for VF
This patch adds RSS tuple support for VF in revision
0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Jian Shen
374ad29176 net: hns3: Add RSS general configuration support for VF
This patch adds RSS key, hash algorithm configuration support
for VF in revision 0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Jian Shen
775501a1aa net: hns3: Add new RSS hash algorithm support for PF
This patch adds ETH_RSS_HASH_XOR hash algorithm supports, which
is supported by hw revision 0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Pieter Jansen van Vuuren
12ecf61529 nfp: flower: use host context count provided by firmware
Read the host context count symbols provided by firmware and use
it to determine the number of allocated stats ids. Previously it
won't be possible to offload more than 2^17 filter even if FW was
able to do so.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:32:44 -07:00
Pieter Jansen van Vuuren
7fade1077c nfp: flower: use stats array instead of storing stats per flow
Make use of an array stats instead of storing stats per flow which
would require a hash lookup at critical times.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:32:44 -07:00
Pieter Jansen van Vuuren
c01d0efa51 nfp: flower: use rhashtable for flow caching
Make use of relativistic hash tables for tracking flows instead
of fixed sized hash tables.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:32:44 -07:00
Nir Dotan
9e66431640 mlxsw: pci: Fix a typo
Signed-off-by: Nir Dotan <nird@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:22:07 -07:00
Colin Ian King
a26b0b53cc net: aquantia: remove some redundant variable initializations
There are several variables being initialized that are being set later
and hence the initialization is redundant and can be removed. Remove
then.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:18:01 -07:00
Tariq Toukan
37fdffb217 net/mlx5: WQ, fixes for fragmented WQ buffers API
mlx5e netdevice used to calculate fragment edges by a call to
mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct
indication for queues smaller than a PAGE_SIZE, (broken by default on
PowerPC, where PAGE_SIZE == 64KB).  Here it is replaced by the correct new
calls/API.

Since (TX/RX) Work Queues buffers are fragmented, here we introduce
changes to the API in core driver, so that it gets a stride index and
returns the index of last stride on same fragment, and an additional
wrapping function that returns the number of physically contiguous
strides that can be written contiguously to the work queue.

This obsoletes the following API functions, and their buggy
usage in EN driver:
* mlx5_wq_cyc_get_frag_size()
* mlx5_wq_cyc_ctr2fragix()

The new API improves modularity and hides the details of such
calculation for mlx5e netdevice and mlx5_ib rdma drivers.

New calculation is also more efficient, and improves performance
as follows:

Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size.

Before: 16,477,619 pps
After:  17,085,793 pps

3.7% improvement

Fixes: 3a2f703312 ("net/mlx5: Use order-0 allocations for all WQ types")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 18:26:16 -07:00
Huy Nguyen
a48bc51315 net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type
The HW spec defines only bits 24-26 of pftype_wq as the page fault type,
use the required mask to ensure that.

Fixes: d9aaed8387 ("{net,IB}/mlx5: Refactor page fault handling")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 18:26:16 -07:00
Talat Batheesh
fd7e848077 net/mlx5: Fix memory leak when setting fpga ipsec caps
Allocated memory for context should be freed once
finished working with it.

Fixes: d6c4f0298c ("net/mlx5: Refactor accel IPSec code")
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 18:26:15 -07:00
Feras Daoud
779d986d60 net/mlx5e: Do not ignore netdevice TX/RX queues number
The current design of mlx5e driver ignores the netdevice TX/RX queues
number for netdevices that RDMA IPoIB ULP creates. Instead, the queue
number is initialized to the maximum number that mlx5 thinks best for
performance. As a result, ULP drivers that choose to create a netdevice
with queue number that is less than the maximum channels mlx5 creates,
will get a memory corruption.

This fix changes the mlx5e netdev logic to respect ULP netdevices TX/RX
queue number and use it when creating resources instead of the maximum
channel number.

Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 17:58:17 -07:00
Saeed Mahameed
cdeef2b152 net/mlx5e: Use non-delayed work for update stats
Convert mlx5e update stats work to a normal work structure, since it is
never used delayed.

Add a helper function to queue update stats work on demand which checks
for some conditions and reduce code duplication to have a better
abstraction.

Fixes: ed56c5193a ("net/mlx5e: Update NIC HW stats on demand only")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 17:58:15 -07:00
Saeed Mahameed
519a0bf5b2 net/mlx5e: Initialize all netdev common structures in one place
Move all mlx5e generic structures initializations to mlx5e_netdev_init.
The common structure new initializer function will be used to initialize
mlx5 context for netlink created netdevs such as IPoIB mlx5 accelerated
child netdevs.

Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
2018-10-10 17:58:14 -07:00
Feras Daoud
303211b44c net/mlx5e: Always initialize update stats delayed work
mlx5e_detach_netdev cancels update_stats work which was not initialized
in ipoib netdevice profile, as a result, the following assert occurs:

ODEBUG: assert_init not available (active state 0) object type:
timer_list hint:(null)

This change moves the update stats work to be initialized for all
mlx5e netdevices.

Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 17:58:14 -07:00
Feras Daoud
182570b262 net/mlx5e: Gather common netdev init/cleanup functionality in one place
Introduce a helper init/cleanup function that initializes mlx5e generic
netdev private structure, and use them from all profiles init/cleanup
callbacks.

This patch will also be helpful to initialize/cleanup netdevs that are
not created by mlx5 driver, e.g: accelerated ipoib child netdevs.

Fixes: 26e59d8077 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 17:58:13 -07:00
Denis Drozdov
f6a8a19bb1 RDMA/netdev: Hoist alloc_netdev_mqs out of the driver
netdev has several interfaces that expect to call alloc_netdev_mqs from
the core code, with the driver only providing the arguments.  This is
incompatible with the rdma_netdev interface that returns the netdev
directly.

Thus re-organize the API used by ipoib so that the verbs core code calls
alloc_netdev_mqs for the driver. This is done by allowing the drivers to
provide the allocation parameters via a 'get_params' callback and then
initializing an allocated netdev as a second step.

Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 17:58:11 -07:00
Linu Cherian
afb8902c46 octeontx2-af: Register for CGX lmac events
Added support in RVU AF driver to register for
CGX LMAC link status change events from firmware
and managing them. Processing part will be added
in followup patches.

- Introduced eventqueue for posting events from cgx lmac.
  Queueing mechanism will ensure that events can be posted
  and firmware can be acked immediately and hence event
  reception and processing are decoupled.
- Events gets added to the queue by notification callback.
  Notification callback is expected to be atomic, since it
  is called from interrupt context.
- Events are dequeued and processed in a worker thread.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Linu Cherian
1463f382f5 octeontx2-af: Add support for CGX link management
CGX LMAC initialization, link status polling etc is done
by low level secure firmware. For link management this patch
adds a interface or communication mechanism between firmware
and this kernel CGX driver.

- Firmware interface specification is defined in cgx_fw_if.h.
- Support to send/receive commands/events to/form firmware.
- events/commands implemented
  * link up
  * link down
  * reading firmware version

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Nithya Mani <nmani@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Linu Cherian
3a4fa841b0 octeontx2-af: Set RVU PFs to CGX LMACs mapping
Each of the enabled CGX LMAC is considered a physical
interface and RVU PFs are mapped to these. VFs of these
SRIOV PFs will be virtual interfaces and share CGX LMAC
along with PF.

This mapping info will be used later on for Rx/Tx pkt steering.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Sunil Goutham
8e22f04082 octeontx2-af: Add Marvell OcteonTX2 CGX driver
This patch adds basic template for Marvell OcteonTX2's
CGX ethernet interface driver. Just the probe.
RVU AF driver will use APIs exported by this driver
for various things like PF to physical interface mapping,
loopback mode, interface stats etc. Hence marged both
drivers into a single module.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Geetha sowjanya
34b34ee07d octeontx2-af: Reconfig MSIX base with IOVA
HW interprets RVU_AF_MSIXTR_BASE address as an IOVA, hence
create a IOMMU mapping for the physcial address configured by
firmware and reconfig RVU_AF_MSIXTR_BASE with IOVA.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Sunil Goutham
756051e23c octeontx2-af: Configure block LF's MSIX vector offset
Firmware configures a certain number of MSIX vectors to each of
enabled RVU PF/VF. When a block LF is attached to a PF/VF, number
of MSIX vectors needed by that LF are set aside (out of PF/VF's
total MSIX vectors) and LF's msix_offset is configured in HW.

Also added support for a RVU PF/VF to retrieve that block LF's
MSIX vector offset information from AF via mbox.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Sunil Goutham
746ea74241 octeontx2-af: Add RVU block LF provisioning support
Added support for a RVU PF/VF to request AF via mailbox
to attach or detach NPA/NIX/SSO/SSOW/TIM/CPT block LFs.
Also supports partial detachment and modifying current
LF attached count of a certian block type.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Sunil Goutham
114a767e8b octeontx2-af: Scan blocks for LFs provisioned to PF/VF
Scan all RVU blocks to find any 'LF to RVU PF/VF' mapping done by
low level firmware. If found any, mark them as used in respective
block's LF bitmap and also save mapped PF/VF's PF_FUNC info.

This is done to avoid reattaching a block LF to a different RVU PF/VF.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Aleksey Makarov
1f15462539 octeontx2-af: Convert mbox msg id check to a macro
With 10's of mailbox messages expected to be handled in future,
checking for message id could become a lengthy switch case. Hence
added a macro to auto generate the switch case for each msg id.

Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:02 -07:00
Sunil Goutham
7304ac4567 octeontx2-af: Add mailbox IRQ and msg handlers
This patch adds support for mailbox interrupt and message
handling. Mapped mailbox region and registered a workqueue
for message handling. Enabled mailbox IRQ of RVU PFs
and registered a interrupt handler. When IRQ is triggered
work is added to the mbox workqueue for msgs to get processed.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:01 -07:00
Aleksey Makarov
021e2e53b8 octeontx2-af: Add mailbox support infra
This patch adds mailbox support infrastructure APIs.
Each RVU device has a dedicated 64KB mailbox region
shared with it's peer for communication. RVU AF has
a separate mailbox region shared with each of RVU PFs
and a RVU PF has a separate region shared with each of
it's VF.

These set of APIs are used by this driver (RVU AF) and
other RVU PF/VF drivers eg netdev, crypto e.t.c.

Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Lukasz Bartosik <lbartosik@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:01 -07:00
Sunil Goutham
1054a6227c octeontx2-af: Gather RVU blocks HW info
This patch gathers NPA/NIX/SSO/SSOW/TIM/CPT RVU blocks's
HW info like number of LFs. Important register offsets
saved for later use to avoid code duplication for each block.
A bitmap is allocated for each of the blocks which later
on will be used to allocate a LF for a RVU PF/VF.

Also added RVU NIX/NPA block registers and few registers
of other blocks.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:01 -07:00
Sunil Goutham
54d557815e octeontx2-af: Reset all RVU blocks
Go through all BLKADDRs and check which ones are implemented
on this silicon and do a HW reset of each implemented block.
Also added all RVU AF and PF register offsets.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:01 -07:00
Sunil Goutham
54494aa5d1 octeontx2-af: Add Marvell OcteonTX2 RVU AF driver
This patch adds basic template for Marvell OcteonTX2's
resource virtualization unit (RVU) admin function (AF)
driver. Just the driver registration and probe.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 10:06:01 -07:00
Sudarsana Reddy Kalluru
e40a826a6c qed: Add support for virtual link.
Currently driver registers to physical link notifications (of the device)
from Management firmware (MFW). Driver doesn't get notified if there's a
change in the virtual link e.g., link-flap on the peer PF interface.
Virtual link indication from MFW reflects the per PF link status instead
of the physical link.

The patch adds driver support for,
  - Advertising the virtual link support to MFW.
  - Handling the virtual link notification from MFW.

Please consider applying it to 'net-next'.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 09:58:31 -07:00
Ganesh Goudar
b187191577 cxgb4: Add thermal zone support
Add thermal zone support to monitor ASIC's temperature.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 11:16:28 -07:00
Alaa Hleihel
27055454b4 net/mlx4_en: Use minimal rx and tx ring sizes on kdump kernel
When memory is limited (on kdump kernel), reduce size of rx and tx rings.
Also reduce the number of rx rings.

Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 11:08:48 -07:00
Arthur Kiyanovski
248ab77342 net: ena: fix auto casting to boolean
Eliminate potential auto casting compilation error.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
78a55d05de net: ena: fix NULL dereference due to untimely napi initialization
napi poll functions should be initialized before running request_irq(),
to handle a rare condition where there is a pending interrupt, causing
the ISR to fire immediately while the poll function wasn't set yet,
causing a NULL dereference.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
d7703ddbd7 net: ena: fix rare bug when failed restart/resume is followed by driver removal
In a rare scenario when ena_device_restore() fails, followed by device
remove, an FLR will not be issued. In this case, the device will keep
sending asynchronous AENQ keep-alive events, even after driver removal,
leading to memory corruption.

Fixes: 8c5c7abdeb ("net: ena: add power management ops to the ENA driver")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
d79c3888bd net: ena: fix warning in rmmod caused by double iounmap
Memory mapped with devm_ioremap is automatically freed when the driver
is disconnected from the device. Therefore there is no need to
explicitly call devm_iounmap.

Fixes: 0857d92f71 ("net: ena: add missing unmap bars on device removal")
Fixes: 411838e7b4 ("net: ena: fix rare kernel crash when bar memory remap fails")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
David S. Miller
071a234ad7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-10-08

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) sk_lookup_[tcp|udp] and sk_release helpers from Joe Stringer which allow
BPF programs to perform lookups for sockets in a network namespace. This would
allow programs to determine early on in processing whether the stack is
expecting to receive the packet, and perform some action (eg drop,
forward somewhere) based on this information.

2) per-cpu cgroup local storage from Roman Gushchin.
Per-cpu cgroup local storage is very similar to simple cgroup storage
except all the data is per-cpu. The main goal of per-cpu variant is to
implement super fast counters (e.g. packet counters), which don't require
neither lookups, neither atomic operations in a fast path.
The example of these hybrid counters is in selftests/bpf/netcnt_prog.c

3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet

4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski

5) rename of libbpf interfaces from Andrey Ignatov.
libbpf is maturing as a library and should follow good practices in
library design and implementation to play well with other libraries.
This patch set brings consistent naming convention to global symbols.

6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov
to let Apache2 projects use libbpf

7) various AF_XDP fixes from Björn and Magnus
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 23:42:44 -07:00
Ioana Ciocoi Radulescu
68049a5f4d dpaa2-eth: Don't account Tx confirmation frames on NAPI poll
Until now, both Rx and Tx confirmation frames handled during
NAPI poll were counted toward the NAPI budget. However, Tx
confirmations are lighter to process than Rx frames, which can
skew the amount of work actually done inside one NAPI cycle.

Update the code to only count Rx frames toward the NAPI budget
and set a separate threshold on how many Tx conf frames can be
processed in one poll cycle.

The NAPI poll routine stops when either the budget is consumed
by Rx frames or when Tx confirmation frames reach this threshold.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:05:05 -07:00
YueHaibing
9e19dabc05 net: mscc: ocelot: remove set but not used variable 'phy_mode'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/mscc/ocelot_board.c: In function 'mscc_ocelot_probe':
drivers/net/ethernet/mscc/ocelot_board.c:262:17: warning:
 variable 'phy_mode' set but not used [-Wunused-but-set-variable]
   enum phy_mode phy_mode;

It never used since introduction in
commit 71e32a20cf ("net: mscc: ocelot: make use of SerDes PHYs for handling their configuration")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:03:42 -07:00
Yangbo Lu
590ac2ffde net: dpaa2: fix and improve dpaa2-ptp driver
This patch is to fix and improve dpaa2-ptp driver
in some places.

- Fixed the return for some functions.
- Replaced kzalloc with devm_kzalloc.
- Removed dev_set_drvdata(dev, NULL).
- Made ptp_dpaa2_caps const.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:23:22 -07:00
Yangbo Lu
15b49f360c net: dpaa2: remove unused code for dprtc
This patch is to removed unused code for dprtc.
This code will be re-added along with more features
of dpaa2-ptp added.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:23:22 -07:00
Yangbo Lu
180f539d75 net: dpaa2: rename rtc as ptp in dpaa2-ptp driver
In dpaa2-ptp driver, it's odd to use rtc in names of
some functions and structures except these dprtc APIs.
This patch is to use ptp instead of rtc in names.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:23:22 -07:00
Yangbo Lu
58b1e729b3 net: dpaa2: fix dependency of config FSL_DPAA2_ETH
The NETDEVICES dependency and ETHERNET dependency hadn't
been required since dpaa2-eth was moved out of staging.
Also allowed COMPILE_TEST for dpaa2-eth.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:23:22 -07:00
Yangbo Lu
0a006a2f89 net: dpaa2: move DPAA2 PTP driver out of staging/
This patch is to move DPAA2 PTP driver out of staging/
since the dpaa2-eth had been moved out.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:23:22 -07:00
Quentin Monnet
7ff0ccde43 nfp: bpf: support pointers to other stack frames for BPF-to-BPF calls
Mark instructions that use pointers to areas in the stack outside of the
current stack frame, and process them accordingly in mem_op_stack().
This way, we also support BPF-to-BPF calls where the caller passes a
pointer to data in its own stack frame to the callee (typically, when
the caller passes an address to one of its local variables located in
the stack, as an argument).

Thanks to Jakub and Jiong for figuring out how to deal with this case,
I just had to turn their email discussion into this patch.

Suggested-by: Jiong Wang <jiong.wang@netronome.com>
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
4454962314 nfp: bpf: optimise save/restore for R6~R9 based on register usage
When pre-processing the instructions, it is trivial to detect what
subprograms are using R6, R7, R8 or R9 as destination registers. If a
subprogram uses none of those, then we do not need to jump to the
subroutines dedicated to saving and restoring callee-saved registers in
its prologue and epilogue.

This patch introduces detection of callee-saved registers in subprograms
and prevents the JIT from adding calls to those subroutines whenever we
can: we save some instructions in the translated program, and some time
at runtime on BPF-to-BPF calls and returns.

If no subprogram needs to save those registers, we can avoid appending
the subroutines at the end of the program.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
2178f3f0dc nfp: bpf: fix return address from register-saving subroutine to callee
On performing a BPF-to-BPF call, we first jump to a subroutine that
pushes callee-saved registers (R6~R9) to the stack, and from there we
goes to the start of the callee next. In order to do so, the caller must
pass to the subroutine the address of the NFP instruction to jump to at
the end of that subroutine. This cannot be reliably implemented when
translated the caller, as we do not always know the start offset of the
callee yet.

This patch implement the required fixup step for passing the start
offset in the callee via the register used by the subroutine to hold its
return address.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
bdf4c66faf nfp: bpf: update fixup function for BPF-to-BPF calls support
Relocation for targets of BPF-to-BPF calls are required at the end of
translation. Update the nfp_fixup_branches() function in that regard.

When checking that the last instruction of each bloc is a branch, we
must account for the length of the instructions required to pop the
return address from the stack.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
fb19816541 nfp: bpf: account for additional stack usage when checking stack limit
Offloaded programs using BPF-to-BPF calls use the stack to store the
return address when calling into a subprogram. Callees also need some
space to save eBPF registers R6 to R9. And contrarily to kernel
verifier, we align stack frames on 64 bytes (and not 32). Account for
all this when checking the stack size limit before JIT-ing the program.
This means we have to recompute maximum stack usage for the program, we
cannot get the value from the kernel.

In addition to adapting the checks on stack usage, move them to the
finalize() callback, now that we have it and because such checks are
part of the verification step rather than translation.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
389f263b60 nfp: bpf: add main logics for BPF-to-BPF calls support in nfp driver
This is the main patch for the logics of BPF-to-BPF calls in the nfp
driver.

The functions called on BPF_JUMP | BPF_CALL and BPF_JUMP | BPF_EXIT were
used to call helpers and exit from the program, respectively; make them
usable for calling into, or returning from, a BPF subprogram as well.

For all calls, push the return address as well as the callee-saved
registers (R6 to R9) to the stack, and pop them upon returning from the
calls. In order to limit the overhead in terms of instruction number,
this is done through dedicated subroutines. Jumping to the callee
actually consists in jumping to the subroutine, that "returns" to the
callee: this will require some fixup for passing the address in a later
patch. Similarly, returning consists in jumping to the subroutine, which
pops registers and then return directly to the caller (but no fixup is
needed here).

Return to the caller is performed with the RTN instruction newly added
to the JIT.

For the few steps where we need to know what subprogram an instruction
belongs to, the struct nfp_insn_meta is extended with a new subprog_idx
field.

Note that checks on the available stack size, to take into account the
additional requirements associated to BPF-to-BPF calls (storing R6-R9
and return addresses), are added in a later patch.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
e3b49dc69b nfp: bpf: account for BPF-to-BPF calls when preparing nfp JIT
Similarly to "exit" or "helper call" instructions, BPF-to-BPF calls will
require additional processing before translation starts, in order to
record and mark jump destinations.

We also mark the instructions where each subprogram begins. This will be
used in a following commit to determine where to add prologues for
subprograms.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:13 +02:00
Quentin Monnet
bcfdfb7c96 nfp: bpf: ignore helper-related checks for BPF calls in nfp verifier
The checks related to eBPF helper calls are performed each time the nfp
driver meets a BPF_JUMP | BPF_CALL instruction. However, these checks
are not relevant for BPF-to-BPF call (same instruction code, different
value in source register), so just skip the checks for such calls.

While at it, rename the function that runs those checks to make it clear
they apply to _helper_ calls only.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:12 +02:00
Quentin Monnet
c5da54d93e nfp: bpf: copy eBPF subprograms information from kernel verifier
In order to support BPF-to-BPF calls in offloaded programs, the nfp
driver must collect information about the distinct subprograms: namely,
the number of subprograms composing the complete program and the stack
depth of those subprograms. The latter in particular is non-trivial to
collect, so we copy those elements from the kernel verifier via the
newly added post-verification hook. The struct nfp_prog is extended to
store this information. Stack depths are stored in an array of dedicated
structs.

Subprogram start indexes are not collected. Instead, meta instructions
associated to the start of a subprogram will be marked with a flag in a
later patch.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:12 +02:00
Quentin Monnet
1a7e62e632 nfp: bpf: rename nfp_prog->stack_depth as nfp_prog->stack_frame_depth
In preparation for support for BPF to BPF calls in offloaded programs,
rename the "stack_depth" field of the struct nfp_prog as
"stack_frame_depth". This is to make it clear that the field refers to
the maximum size of the current stack frame (as opposed to the maximum
size of the whole stack memory).

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:12 +02:00
Quentin Monnet
c941ce9c28 bpf: add verifier callback to get stack usage info for offloaded progs
In preparation for BPF-to-BPF calls in offloaded programs, add a new
function attribute to the struct bpf_prog_offload_ops so that drivers
supporting eBPF offload can hook at the end of program verification, and
potentially extract information collected by the verifier.

Implement a minimal callback (returning 0) in the drivers providing the
structs, namely netdevsim and nfp.

This will be useful in the nfp driver, in later commits, to extract the
number of subprograms as well as the stack depth for those subprograms.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08 10:24:12 +02:00
Gustavo A. R. Silva
5fc7c12ffa bnxt_en: Remove unnecessary unsigned integer comparison and initialize variable
There is no need to compare *val.vu32* with < 0 because
such variable is of type u32 (32 bits, unsigned), making it
impossible to hold a negative value. Fix this by removing
such comparison.

Also, initialize variable *max_val* to -1, just in case
it is not initialized to either BNXT_MSIX_VEC_MAX or
BNXT_MSIX_VEC_MIN_MAX before using it in a comparison
with val.vu32 at line 159:

	if (val.vu32 > max_val)

Addresses-Coverity-ID: 1473915 ("Unsigned compared against 0")
Addresses-Coverity-ID: 1473920 ("Uninitialized scalar variable")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-07 20:36:40 -07:00
David S. Miller
72438f8cef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-06 14:43:42 -07:00
Maxime Chevallier
35f3625c21 net: mvpp2: Extract the correct ethtype from the skb for tx csum offload
When offloading the L3 and L4 csum computation on TX, we need to extract
the l3_proto from the ethtype, independently of the presence of a vlan
tag.

The actual driver uses skb->protocol as-is, resulting in packets with
the wrong L4 checksum being sent when there's a vlan tag in the packet
header and checksum offloading is enabled.

This commit makes use of vlan_protocol_get() to get the correct ethtype
regardless the presence of a vlan tag.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 14:52:43 -07:00
Quentin Schulz
71e32a20cf net: mscc: ocelot: make use of SerDes PHYs for handling their configuration
Previously, the SerDes muxing was hardcoded to a given mode in the MAC
controller driver. Now, the SerDes muxing is configured within the
Device Tree and is enforced in the MAC controller driver so we can have
a lot of different SerDes configurations.

Make use of the SerDes PHYs in the MAC controller to set up the SerDes
according to the SerDes<->switch port mapping and the communication mode
with the Ethernet PHY.

Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 14:36:44 -07:00
Quentin Schulz
66c2132333 net: mscc: ocelot: simplify register access for PLL5 configuration
Since HSIO address space can be accessed by different drivers, let's
simplify the register address definitions so that it can be easily used
by all drivers and put the register address definition in the
include/soc/mscc/ocelot_hsio.h header file.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 14:36:44 -07:00
Quentin Schulz
8afc978925 net: mscc: ocelot: move the HSIO header to include/soc
Since HSIO address space can be used by different drivers (PLL, SerDes
muxing, temperature sensor), let's move it somewhere it can be included
by all drivers.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 14:36:44 -07:00
Quentin Schulz
19aedfbe65 net: mscc: ocelot: get HSIO regmap from syscon
HSIO address space was moved to a syscon, hence we need to get the
regmap of this address space from there and no more from the device
node.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 14:36:44 -07:00
Jian Shen
701a6d6ac7 net: hns3: Fix for rx vlan id handle to support Rev 0x21 hardware
In revision 0x20, we use vlan id != 0 to check whether a vlan tag
has been offloaded, so vlan id 0 is not supported.

In revision 0x21, rx buffer descriptor adds two bits to indicate
whether one or more vlan tags have been offloaded, so vlan id 0
is valid now.

This patch seperates the handle for vlan id 0, add vlan id 0 support
for revision 0x21.

Fixes: 5b5455a9ed ("net: hns3: Add STRP_TAGP field support for hardware revision 0x21")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 12:01:55 -07:00
Zhongzhu Liu
64d114f0a7 net: hns3: Add egress/ingress vlan filter for revision 0x21
In revision 0x21, hw supports both ingress and egress vlan filter.
This patch adds support for it.

Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 12:01:55 -07:00
Jian Shen
1f6db58973 net: hns3: Drop depricated mta table support
For mta table support has been dropped, remove the code for mta table.

Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 12:01:55 -07:00
Jian Shen
39932473b6 net: hns3: Optimize for unicast mac vlan table
In previously implement for unicast mac vlan table, the space is
shared by all the functions, driver does nothing when the space is
exhausted. This patch preallocates the space of unicast mac vlan
table for each function by software. Each function can only use its
private space and available shared space, avoiding single function
exhausts too much space, and other functions are unable to add
unicast mac address.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 12:01:55 -07:00
Jian Shen
f05e210971 net: hns3: Clear mac vlan table entries when unload driver or function reset
In original codes, the mac vlan table entries are not cleared when
unload hns3 driver. The dirty mac vlan table entries will make the
result of looking up mac vlan table being unexpected.

When doing core reset or global reset, the firmware will clear all
the tables for driver, and driver shouldn't send any commands to
firmware during reset. But when doing function reset, the driver
needs to clear the tables itself.

This patch clears the mac vlan table entries for each client when
unload driver or reset.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 12:01:54 -07:00
Jian Shen
dd2b6ef950 net: hns3: Remove the default mask configuration for mac vlan table
The default mask configuration has been done by firmware, so the driver
doesn't need to do it any more.

Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 12:01:54 -07:00
Wenwen Wang
2c05d88818 net: cxgb3_main: fix a missing-check bug
In cxgb_extension_ioctl(), the command of the ioctl is firstly copied from
the user-space buffer 'useraddr' to 'cmd' and checked through the
switch statement. If the command is not as expected, an error code
EOPNOTSUPP is returned. In the following execution, i.e., the cases of the
switch statement, the whole buffer of 'useraddr' is copied again to a
specific data structure, according to what kind of command is requested.
However, after the second copy, there is no re-check on the newly-copied
command. Given that the buffer 'useraddr' is in the user space, a malicious
user can race to change the command between the two copies. By doing so,
the attacker can supply malicious data to the kernel and cause undefined
behavior.

This patch adds a re-check in each case of the switch statement if there is
a second copy in that case, to re-check whether the command obtained in the
second copy is the same as the one in the first copy. If not, an error code
EINVAL is returned.

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 11:47:19 -07:00
Ganesh Goudar
a657dbf617 cxgb4: use FW_PORT_ACTION_L1_CFG32 for 32 bit capability
when 32 bit port capability is in use, use FW_PORT_ACTION_L1_CFG32
rather than FW_PORT_ACTION_L1_CFG.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 10:34:18 -07:00
Davide Caratti
2d52527e80 be2net: don't flip hw_features when VXLANs are added/deleted
the be2net implementation of .ndo_tunnel_{add,del}() changes the value of
NETIF_F_GSO_UDP_TUNNEL bit in 'features' and 'hw_features', but it forgets
to call netdev_features_change(). Moreover, ethtool setting for that bit
can potentially be reverted after a tunnel is added or removed.

GSO already does software segmentation when 'hw_enc_features' is 0, even
if VXLAN offload is turned on. In addition, commit 096de2f83e ("benet:
stricter vxlan offloading check in be_features_check") avoids hardware
segmentation of non-VXLAN tunneled packets, or VXLAN packets having wrong
destination port. So, it's safe to avoid flipping the above feature on
addition/deletion of VXLAN tunnels.

Fixes: 630f4b7056 ("be2net: Export tunnel offloads only when a VxLAN tunnel is created")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-05 00:59:21 -07:00
Vasundhara Volam
c78fe05887 bnxt_en: get the reduced max_irqs by the ones used by RDMA
When getting the max rings supported, get the reduced max_irqs
by the ones used by RDMA.

If the number MSIX is the limiting factor, this bug may cause the
max ring count to be higher than it should be when RDMA driver is
loaded and may result in ring allocation failures.

Fixes: 30f529473e ("bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 21:41:16 -07:00
Venkat Duvvuru
a2bf74f4e1 bnxt_en: free hwrm resources, if driver probe fails.
When the driver probe fails, all the resources that were allocated prior
to the failure must be freed. However, hwrm dma response memory is not
getting freed.

This patch fixes the problem described above.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 21:41:16 -07:00
Vasundhara Volam
5db0e0969a bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
In HWRM_QUEUE_COS2BW_CFG request, enables field should have the bits
set only for the queue ids which are having the valid parameters.

This causes firmware to return error when the TC to hardware CoS queue
mapping is not 1:1 during DCBNL ETS setup.

Fixes: 2e8ef77ee0 ("bnxt_en: Add TC to hardware QoS queue mapping logic.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 21:41:16 -07:00
Michael Chan
dbe80d446c bnxt_en: Fix VNIC reservations on the PF.
The enables bit for VNIC was set wrong when calling the HWRM_FUNC_CFG
firmware call to reserve VNICs.  This has the effect that the firmware
will keep a large number of VNICs for the PF, and having very few for
VFs.  DPDK driver running on the VFs, which requires more VNICs, may not
work properly as a result.

Fixes: 674f50a5b0 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 21:41:16 -07:00
Vasundhara Volam
2dc0865e9a bnxt_en: Add a driver specific gre_ver_check devlink parameter.
This patch adds following driver-specific permanent mode boolean
parameter.

gre_ver_check - Generic Routing Encapsulation(GRE) version check
will be enabled in the device. If disabled, device skips version
checking for GRE packets.

Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 13:49:43 -07:00
Vasundhara Volam
f399e84978 bnxt_en: Use msix_vec_per_pf_max and msix_vec_per_pf_min devlink params.
This patch adds support for following generic permanent mode
devlink parameters. They can be modified using devlink param
commands.

msix_vec_per_pf_max - This param sets the number of MSIX vectors
that the device requests from the host on driver initialization.
This value is set in the device which limits MSIX vectors per PF.

msix_vec_per_pf_min - This param sets the number of minimal MSIX
vectors required for the device initialization. Value 0 indicates
a default value is selected. This value is set in the device which
limits MSIX vectors per PF.

Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 13:49:43 -07:00
Vasundhara Volam
3a1d52a54a bnxt_en: return proper error when FW returns HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED
Return proper error code when Firmware returns
HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED for HWRM_NVM_GET/SET_VARIABLE
commands.

Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 13:49:43 -07:00
Vasundhara Volam
7d85923487 bnxt_en: Use ignore_ari devlink parameter
This patch adds support for ignore_ari generic permanent mode
devlink parameter. This parameter is disabled by default. It can be
enabled using devlink param commands.

ignore_ari - If enabled, device ignores ARI(Alternate Routing ID)
capability, even when platforms has the support and creates same number
of partitions when platform does not support ARI capability.

Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 13:49:43 -07:00
Nathan Chancellor
8fa74e3c49 qed: Avoid implicit enum conversion in qed_ooo_submit_tx_buffers
Clang warns when one enumerated type is implicitly converted to another.

drivers/net/ethernet/qlogic/qed/qed_ll2.c:799:32: warning: implicit
conversion from enumeration type 'enum core_tx_dest' to different
enumeration type 'enum qed_ll2_tx_dest' [-Wenum-conversion]
                tx_pkt.tx_dest = p_ll2_conn->tx_dest;
                               ~ ~~~~~~~~~~~~^~~~~~~
1 warning generated.

Fix this by using a switch statement to convert between the enumerated
values since they are not 1 to 1, which matches how the rest of the
driver handles this conversion.

Link: https://github.com/ClangBuiltLinux/linux/issues/125
Suggested-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 09:57:26 -07:00
Ido Schimmel
c360867ec4 mlxsw: spectrum: Delete RIF when VLAN device is removed
In commit 602b74eda8 ("mlxsw: spectrum_switchdev: Do not leak RIFs
when removing bridge") I handled the case where RIFs created for VLAN
devices were not properly cleaned up when their real device (a bridge)
was removed.

However, I forgot to handle the case of the VLAN device itself being
removed. Do so now when the VLAN device is being unlinked from its real
device.

Fixes: 99f44bb352 ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reported-by: Artem Shvorin <art@qrator.net>
Tested-by: Artem Shvorin <art@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 09:53:03 -07:00
Nir Dotan
f3c84a8e3e mlxsw: pci: Derive event type from event queue number
Due to a hardware issue in Spectrum-2, the field event_type of the event
queue element (EQE) has become reserved. It was used to distinguish between
command interface completion events and completion events.

Use queue number to determine event type, as command interface completion
events are always received on EQ0 and mlxsw driver maps completion events
to EQ1.

Fixes: c3ab435466 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 09:53:03 -07:00
David S. Miller
9e50727f0e mlx5-updates-2018-10-03
mlx5 core driver and ethernet netdev updates, please note there is a small
 devlink releated update to allow extack argument to eswitch operations.
 
 From Eli Britstein,
 1) devlink: Add extack argument to the eswitch related operations
 2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
 3) net/mlx5e: Add extack messages for TC offload failures
 
 From Eran Ben Elisha,
 4) mlx5e: Add counter for aRFS rule insertion failures
 
 From Feras Daoud
 5) Fast teardown support for mlx5 device
 This change introduces the enhanced version of the "Force teardown" that
 allows SW to perform teardown in a faster way without the need to reclaim
 all the FW pages.
 Fast teardown provides the following advantages:
     1- Fix a FW race condition that could cause command timeout
     2- Avoid moving to polling mode
     3- Close the vport to prevent PCI ACK to be sent without been scatter
     to memory
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbtU45AAoJEEg/ir3gV/o+/C4H/RHA4KImrb476EdB3VNYMqAN
 dgXb+bmh6sZP+jHWqQ4c3aVeh6/T8qm4gwiSn2nVTtHEnxtCdIYljzDC1Nswczeg
 pSjD1eOP7M1LpAOmBb8xdnJcX7yM7r1bTklnp2sN853WShbsDRYgZBHsBwTzx25U
 ZdzL4QTLuohlG/aLrbGXMntIy45ya2fVQrnK54s18nFlgsdFjEs0mi0xaUKNBC6+
 P8CTohHAxuuxmL5b+6MIYLZCdgd8cLNQFdtqbckEVw7SvcRTxfraRlyqJ0YOgTGB
 TdSWnqZz2JYH29wSFbpFG8qX6GCv8FoiZ+fKzldbolHk442rrktHv3+Y7qQuZVs=
 =NVks
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2018-10-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2018-10-03

mlx5 core driver and ethernet netdev updates, please note there is a small
devlink releated update to allow extack argument to eswitch operations.

From Eli Britstein,
1) devlink: Add extack argument to the eswitch related operations
2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
3) net/mlx5e: Add extack messages for TC offload failures

From Eran Ben Elisha,
4) mlx5e: Add counter for aRFS rule insertion failures

From Feras Daoud
5) Fast teardown support for mlx5 device
This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the FW pages.
Fast teardown provides the following advantages:
    1- Fix a FW race condition that could cause command timeout
    2- Avoid moving to polling mode
    3- Close the vport to prevent PCI ACK to be sent without been scatter
    to memory
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 09:48:37 -07:00
Colin Ian King
0aa63eb9a9 liquidio: fix a couple of spelling mistakes
Trivial fix to spelling mistakes in dev_dbg warning messages

"Reloade" -> "Reload"
"chang" -> "change"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 09:33:38 -07:00
David S. Miller
6f41617bf2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 21:00:17 -07:00
Feras Daoud
fcd29ad17c net/mlx5: Add Fast teardown support
Today mlx5 devices support two teardown modes:
1- Regular teardown
2- Force teardown

This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the pages.

Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been scatter
to memory

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-03 16:18:00 -07:00
Eran Ben Elisha
94563847a8 net/mlx5e: Add new counter for aRFS rule insertion failures
Count aRFS rules insertion failure for ethtool output. In addition, move
the error print into debug prints mechanism, as it could flood the dmesg
and reduce system BW dramatically.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-03 16:17:59 -07:00
Eli Britstein
e98bedf5e6 net/mlx5e: Add extack messages for TC offload failures
Return tc extack messages for failures to user space.
Messages provide reasons for not being able to offload rules to HW.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-03 16:17:59 -07:00
Eli Britstein
8c98ee77d9 net/mlx5e: E-Switch, Add extack messages to devlink callbacks
Return extack messages for failures in the e-switch devlink callbacks.
Messages provide reasons for not being able to issue the operation.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-03 16:17:58 -07:00
Eli Britstein
db7ff19e7b devlink: Add extack for eswitch operations
Add extack argument to the eswitch related operations.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-03 16:17:58 -07:00
Greg Kroah-Hartman
cec4de302c Merge gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net
David writes:
  "Networking fixes:
   1) Prefix length validation in xfrm layer, from Steffen Klassert.

   2) TX status reporting fix in mac80211, from Andrei Otcheretianski.

   3) Fix hangs due to TX_DROP in mac80211, from Bob Copeland.

   4) Fix DMA error regression in b43, from Larry Finger.

   5) Add input validation to xenvif_set_hash_mapping(), from Jan Beulich.

   6) SMMU unmapping fix in hns driver, from Yunsheng Lin.

   7) Bluetooh crash in unpairing on SMP, from Matias Karhumaa.

   8) WoL handling fixes in the phy layer, from Heiner Kallweit.

   9) Fix deadlock in bonding, from Mahesh Bandewar.

   10) Fill ttl inherit infor in vxlan driver, from Hangbin Liu.

   11) Fix TX timeouts during netpoll, from Michael Chan.

   12) RXRPC layer fixes from David Howells.

   13) Another batch of ndo_poll_controller() removals to deal with
       excessive resource consumption during load.  From Eric Dumazet.

   14) Fix a specific TIPC failure secnario, from LUU Duc Canh.

   15) Really disable clocks in r8169 during suspend so that low
       power states can actually be reached.

   16) Fix SYN backlog lockdep issue in tcp and dccp, from Eric Dumazet.

   17) Fix RCU locking in netpoll SKB send, which shows up in bonding,
       from Dave Jones.

   18) Fix TX stalls in r8169, from Heiner Kallweit.

   19) Fix locksup in nfp due to control message storms, from Jakub
       Kicinski.

   20) Various rmnet bug fixes from Subash Abhinov Kasiviswanathan and
       Sean Tranchetti.

   21) Fix use after free in ip_cmsg_recv_dstaddr(), from Eric Dumazet."

* gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (122 commits)
  ixgbe: check return value of napi_complete_done()
  sctp: fix fall-through annotation
  r8169: always autoneg on resume
  ipv4: fix use-after-free in ip_cmsg_recv_dstaddr()
  net: qualcomm: rmnet: Fix incorrect allocation flag in receive path
  net: qualcomm: rmnet: Fix incorrect allocation flag in transmit
  net: qualcomm: rmnet: Skip processing loopback packets
  net: systemport: Fix wake-up interrupt race during resume
  rtnl: limit IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES to 4096
  bonding: fix warning message
  inet: make sure to grab rcu_read_lock before using ireq->ireq_opt
  nfp: avoid soft lockups under control message storm
  declance: Fix continuation with the adapter identification message
  net: fec: fix rare tx timeout
  r8169: fix network stalls due to missing bit TXCFG_AUTO_FIFO
  tun: napi flags belong to tfile
  tun: initialize napi_mutex unconditionally
  tun: remove unused parameters
  bond: take rcu lock in netpoll_send_skb_on_dev
  rtnetlink: Fail dump if target netnsid is invalid
  ...
2018-10-03 16:09:11 -07:00
Song Liu
4233cfe6ec ixgbe: check return value of napi_complete_done()
The NIC driver should only enable interrupts when napi_complete_done()
returns true. This patch adds the check for ixgbe.

Cc: stable@vger.kernel.org # 4.10+
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 14:40:32 -07:00
Rami Rosen
37ebb5fa6f iavf: fix a typo
This trivial patch fixes a typo in iavf.h.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:56:36 -07:00
Björn Töpel
8221c5eba8 ixgbe: add AF_XDP zero-copy Tx support
This patch adds zero-copy Tx support for AF_XDP sockets. It implements
the ndo_xsk_async_xmit netdev ndo and performs all the Tx logic from a
NAPI context. This means pulling egress packets from the Tx ring,
placing the frames on the NIC HW descriptor ring and completing sent
frames back to the application via the completion ring.

The regular XDP Tx ring is used for AF_XDP as well. This rationale for
this is as follows: XDP_REDIRECT guarantees mutual exclusion between
different NAPI contexts based on CPU id. In other words, a netdev can
XDP_REDIRECT to another netdev with a different NAPI context, since
the operation is bound to a specific core and each core has its own
hardware ring.

As the AF_XDP Tx action is running in the same NAPI context and using
the same ring, it will also be protected from XDP_REDIRECT actions
with the exact same mechanism.

As with AF_XDP Rx, all AF_XDP Tx specific functions are added to
ixgbe_xsk.c.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:54:37 -07:00
Björn Töpel
05ae861450 ixgbe: move common Tx functions to ixgbe_txrx_common.h
This patch prepares for the upcoming zero-copy Tx functionality by
moving common functions used both by the regular path and zero-copy
path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:52:08 -07:00
Björn Töpel
d0bcacd0a1 ixgbe: add AF_XDP zero-copy Rx support
This patch adds zero-copy Rx support for AF_XDP sockets. Instead of
allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are
allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain
queue.

All AF_XDP specific functions are added to a new file, ixgbe_xsk.c.

Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS
will allocate a new buffer and copy the zero-copy frame prior passing
it to the kernel stack.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:51:14 -07:00
Björn Töpel
46515fdb1a ixgbe: move common Rx functions to ixgbe_txrx_common.h
This patch prepares for the upcoming zero-copy Rx functionality, by
moving/changing linkage of common functions, used both by the regular
path and zero-copy path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:40:44 -07:00
Björn Töpel
024aa5800f ixgbe: added Rx/Tx ring disable/enable functions
Add functions for Rx/Tx ring enable/disable. Instead of resetting the
whole device, only the affected ring is disabled or enabled.

This plumbing is used in later commits, when zero-copy AF_XDP support
is introduced.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:38:40 -07:00
Radoslaw Tyl
5d826d2091 ixgbe: Fix crash with VFs and flow director on interface flap
This patch fix crash when we have restore flow director filters after reset
adapter. In ixgbe_fdir_filter_restore() filter->action is outside of the
rx_ring array, as it has a VF identifier in the upper 32 bits.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:36:20 -07:00
Nathan Chancellor
92fb7aaff8 i40e: Remove unnecessary print statement
Clang warns that the address of a pointer will always evaluated as true
in a boolean context.

drivers/net/ethernet/intel/i40e/i40e_debugfs.c:136:9: warning: address
of array 'vsi->active_vlans' will always evaluate to 'true'
[-Wpointer-bool-conversion]
                 vsi->active_vlans ? "<valid>" : "<null>");
                 ~~~~~^~~~~~~~~~~~ ~
./include/linux/device.h:1431:33: note: expanded from macro 'dev_info'
        _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
                                       ^~~~~~~~~~~
1 warning generated.

Given that the statement shows that active_vlans is always valid, just
remove the statement since it's not giving any useful information.

Link: https://github.com/ClangBuiltLinux/linux/issues/82
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:33:34 -07:00
Nathan Chancellor
43ade6ad18 i40e: Use proper enum in i40e_ndo_set_vf_link_state
Clang warns when one enumerated type is converted implicitly to another.

drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4214:42: warning:
implicit conversion from enumeration type 'enum i40e_aq_link_speed' to
different enumeration type 'enum virtchnl_link_speed'
      [-Wenum-conversion]
                pfe.event_data.link_event.link_speed = I40E_LINK_SPEED_40GB;
                                                     ~ ^~~~~~~~~~~~~~~~~~~~
1 warning generated.

Use the proper enum from virtchnl_link_speed, which has the same value
as I40E_LINK_SPEED_40GB, VIRTCHNL_LINK_SPEED_40GB. This appears to be
missed by commit ff3f4cc267 ("virtchnl: finish conversion to virtchnl
interface").

Link: https://github.com/ClangBuiltLinux/linux/issues/81
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:31:47 -07:00
Dan Carpenter
617cc646a7 ixgbevf: off by one in ixgbevf_ipsec_tx()
The ipsec->tx_tbl[] array has IXGBE_IPSEC_MAX_SA_COUNT elements so the >
should be a >=.

Fixes: 0062e7cc95 ("ixgbevf: add VF IPsec offload code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:29:34 -07:00
YueHaibing
6b27f3de22 ixgbe: remove redundant function ixgbe_fw_recovery_mode()
There are no in-tree callers.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:28:04 -07:00
Radoslaw Tyl
8d7179b1e2 ixgbe: Fix ixgbe TX hangs with XDP_TX beyond queue limit
We have Tx hang when number Tx and XDP queues are more than 64.
In XDP always is MTQC == 0x0 (64TxQs). We need more space for Tx queues.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 12:26:05 -07:00
Shannon Nelson
2c49d34f3b ixgbevf: fix msglen for ipsec mbx messages
Don't be fancy with message lengths, just set lengths to
number of dwords, not bytes.

Fixes: 0062e7cc95 ("ixgbevf: add VF IPsec offload code")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 11:42:40 -07:00
David S. Miller
072eff2d9e Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-10-03

This series contains updates to ice and virtchnl.

Yashaswini Raghuram adds a new virtchnl capability flag to support the
exchange of additional supported speeds.

Anirudh adds support for SR-IOV for the ice driver.  Added code to
initialize, configure and use mailbox queues for PF and VF
communication.  Updated the VSI and queue management to handle both PF
and VF VSI type.  Added "Adaptive Virtual Function (AVF)" support for
the ice PF driver by implementing virtchnl commands.  Extended the
malicious driver detection logic to include the VF driver as well.
Fixed the queue region size which needs to be log base 2 of the number
of queues in region.

Brett fixes an issue which was causing switch rules to be lost, by
making a call to ice_update_pkt_fwd_rule() with the necessary changes.
Fixed how the PF and VF assigned the ITR index by adding a struct member
itr_idx to be used to dynamically program the correct ITR index.

Dave fixed a potential NULL pointer dereference by adding checks in the
filter handling.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 09:39:42 -07:00
Ganesh Goudar
db3408a150 cxgb4: remove the unneeded locks
cxgb_set_tx_maxrate will be called holding rtnl lock,
hence remove all unneeded locks.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 09:34:52 -07:00
Anirudh Venkataramanan
5cc6c8b30c ice: Update version string
Update version string to 0.7.2-k

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Dave Ertman
124cd54796 ice: Use the right function to enable/disable VSI
The ice_ena/dis_vsi should have a single differentiating
factor to determine if the netdev_ops call is used or a
direct call to ice_vsi_open/close.  This is if the netif is
running or not.  If netif is running, use ndo_open/ndo_close.
Else, use ice_vsi_open/ice_vsi_close.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Brett Creeley
d2b464a7ff ice: Add more flexibility on how we assign an ITR index
This issue came about when looking at the VF function
ice_vc_cfg_irq_map_msg. Currently we are assigning the itr_setting value
to the itr_idx received from the AVF driver, which is not correct and is
not used for the VF flow anyway. Currently the only way we set the ITR
index for both the PF and VF driver is by hard coding ICE_TX_ITR or
ICE_RX_ITR for the ITR index on each q_vector.

To fix this, add the member itr_idx in struct ice_ring_container. This
can then be used to dynamically program the correct ITR index. This change
also affected the PF driver so make the necessary changes there as well.

Also, removed the itr_setting member in struct ice_ring because it is not
being used meaningfully and is going to be removed in a future patch that
includes dynamic ITR.

On another note, this will be useful moving forward if we decide to split
Rx/Tx rings on different q_vectors instead of sharing them as queue pairs.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Dave Ertman
072f0c3db9 ice: Fix potential null pointer issues
Add checks in the filter handling flow to avoid dereferencing
NULL pointers.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Brett Creeley
c60cdb13ec ice: Add code to go from ICE_FWD_TO_VSI_LIST to ICE_FWD_TO_VSI
When a switch rule is initially created we set the filter action to
ICE_FWD_TO_VSI. The filter action changes to ICE_FWD_TO_VSI_LIST
whenever more than one VSI is subscribed to the same switch rule. When
the switch rule goes from 2 VSIs in the list to 1 VSI we remove and
delete the VSI list rule, but we currently don't update the switch rule
in hardware. This is causing switch rules to be lost, so fix that by
making a call to ice_update_pkt_fwd_rule() with the necessary changes.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Anirudh Venkataramanan
be8ff000bf ice: Fix forward to queue group logic
When adding a rule, queue region size needs to be provided as log base 2
of the number of queues in region. Fix that.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Anirudh Venkataramanan
7c4bc1f576 ice: Extend malicious operations detection logic
This patch extends the existing malicious driver operation detection
logic to cover malicious operations by the VF driver as well.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Anirudh Venkataramanan
53b8decbb7 ice: Notify VF of link status change
When PF gets a link status change event, notify the VFs of the same.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Anirudh Venkataramanan
1071a8358a ice: Implement virtchnl commands for AVF support
virtchnl is a protocol/interface specification that allows the Intel
"Adaptive Virtual Function (AVF)" driver (iavf.ko) to work with more than
one physical function driver. The AVF driver sends "virtchnl commands"
(control plane only) to the PF driver over mailbox queues and the PF driver
executes these commands and returns a result to the VF, again over mailbox.

This patch adds AVF support for the ice PF driver by implementing the
following virtchnl commands:

VIRTCHNL_OP_VERSION
VIRTCHNL_OP_GET_VF_RESOURCES
VIRTCHNL_OP_RESET_VF
VIRTCHNL_OP_ADD_ETH_ADDR
VIRTCHNL_OP_DEL_ETH_ADDR
VIRTCHNL_OP_CONFIG_VSI_QUEUES
VIRTCHNL_OP_ENABLE_QUEUES
VIRTCHNL_OP_DISABLE_QUEUES
VIRTCHNL_OP_ADD_ETH_ADDR
VIRTCHNL_OP_DEL_ETH_ADDR
VIRTCHNL_OP_CONFIG_VSI_QUEUES
VIRTCHNL_OP_ENABLE_QUEUES
VIRTCHNL_OP_DISABLE_QUEUES
VIRTCHNL_OP_REQUEST_QUEUES
VIRTCHNL_OP_CONFIG_IRQ_MAP
VIRTCHNL_OP_CONFIG_RSS_KEY
VIRTCHNL_OP_CONFIG_RSS_LUT
VIRTCHNL_OP_GET_STATS
VIRTCHNL_OP_ADD_VLAN
VIRTCHNL_OP_DEL_VLAN
VIRTCHNL_OP_ENABLE_VLAN_STRIPPING
VIRTCHNL_OP_DISABLE_VLAN_STRIPPING

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Anirudh Venkataramanan
7c710869d6 ice: Add handlers for VF netdevice operations
This patch implements handlers for the following NDO operations:

.ndo_set_vf_spoofchk
.ndo_set_vf_mac
.ndo_get_vf_config
.ndo_set_vf_trust
.ndo_set_vf_vlan
.ndo_set_vf_link_state

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Anirudh Venkataramanan
007676b4ac ice: Add support for VF reset events
Post VF initialization, there are a couple of different ways in which a
VF reset can be triggered. One is when the underlying PF itself goes
through a reset and other is via a VFLR interrupt. ice_reset_vf introduced
in this patch handles both these cases.

Also introduced in this patch is a helper function ice_aq_send_msg_to_vf
to send messages to VF over the mailbox queue. The PF uses this to send
reset notifications to VFs.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:30 -07:00
Anirudh Venkataramanan
8ede01785f ice: Update VSI and queue management code to handle VF VSI
Until now, all the VSI and queue management code supported only the PF
VSI type (ICE_VSI_PF). Update these flows to handle the VF VSI type
(ICE_VSI_VF) type as well.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:29 -07:00
Anirudh Venkataramanan
ddf30f7ff8 ice: Add handler to configure SR-IOV
This patch implements parts of ice_sriov_configure and VF reset flow.

To create virtual functions (VFs), the user sets a value in num_vfs
through sysfs. This results in the kernel calling the handler for
.sriov_configure which is ice_sriov_configure.

VF setup first starts with a VF reset, followed by allocation of the VF
VSI using ice_vf_vsi_setup. Once the VF setup is complete a state bit
ICE_VF_STATE_INIT is set in the vf->states bitmap to indicate that
the VF is ready to go.

Also for VF reset to go into effect, it's necessary to issue a disable
queue command (ice_aqc_opc_dis_txqs). So this patch updates multiple
functions in the disable queue flow to take additional parameters that
distinguish if queues are being disabled due to VF reset.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:29 -07:00
Anirudh Venkataramanan
75d2b25302 ice: Add support to detect SR-IOV capability and mailbox queues
Mailbox queue is a type of control queue that's used for communication
between PF and VF. This patch adds code to initialize, configure and
use mailbox queues.

This patch also adds support to detect and parse SR-IOV capabilities
returned by the hardware.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-03 07:42:29 -07:00
Alex Xu (Hello71)
9003b36949 r8169: always autoneg on resume
This affects at least versions 25 and 33, so assume all cards are broken
and just renegotiate by default.

Fixes: 10bc6a6042 ("r8169: fix autoneg issue on resume with RTL8168E")
Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:33:43 -07:00
Nathan Chancellor
258b6d1418 cxgb4: Use proper enum in IEEE_FAUX_SYNC
Clang warns when one enumerated type is implicitly converted to another.

drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c:390:4: warning: implicit
conversion from enumeration type 'enum cxgb4_dcb_state' to different
enumeration type 'enum cxgb4_dcb_state_input' [-Wenum-conversion]
                        IEEE_FAUX_SYNC(dev, dcb);
                        ^~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h:70:10: note: expanded
from macro 'IEEE_FAUX_SYNC'
                                            CXGB4_DCB_STATE_FW_ALLSYNCED);
                                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use the equivalent value of the expected type to silence Clang while
resulting in no functional change.

CXGB4_DCB_STATE_FW_ALLSYNCED = CXGB4_DCB_INPUT_FW_ALLSYNCED = 3

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:30:52 -07:00
Nathan Chancellor
3b0b8f0d9a cxgb4: Use proper enum in cxgb4_dcb_handle_fw_update
Clang warns when one enumerated type is implicitly converted to another.

drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c:303:7: warning: implicit
conversion from enumeration type 'enum cxgb4_dcb_state' to different
enumeration type 'enum cxgb4_dcb_state_input' [-Wenum-conversion]
                         ? CXGB4_DCB_STATE_FW_ALLSYNCED
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c:304:7: warning: implicit
conversion from enumeration type 'enum cxgb4_dcb_state' to different
enumeration type 'enum cxgb4_dcb_state_input' [-Wenum-conversion]
                         : CXGB4_DCB_STATE_FW_INCOMPLETE);
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.

Use the equivalent value of the expected type to silence Clang while
resulting in no functional change.

CXGB4_DCB_STATE_FW_INCOMPLETE = CXGB4_DCB_INPUT_FW_INCOMPLETE = 2
CXGB4_DCB_STATE_FW_ALLSYNCED = CXGB4_DCB_INPUT_FW_ALLSYNCED = 3

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:30:26 -07:00
Nathan Chancellor
0fd5480751 dpaa_eth: Remove useless declaration
Clang warns:

drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:2734:34: warning:
tentative array definition assumed to have one element
static const struct of_device_id dpaa_match[];
                                 ^
1 warning generated.

Turns out that since this driver was introduced in commit 9ad1a37493
("dpaa_eth: add support for DPAA Ethernet"), this declaration has been
unused. Remove it to silence the warning.

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:29:34 -07:00
Ioana Radulescu
afb90dbb5f dpaa2-eth: Add ethtool support for flow classification
Add support for inserting and deleting Rx flow classification
rules through ethtool.

We support classification based on some header fields for
flow-types ether, ip4, tcp4, udp4 and sctp4.

Rx queues are core affine, so the action argument effectively
selects on which cpu the matching frame will be processed.
Discarding the frame is also supported.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:24:08 -07:00
Ioana Radulescu
4aaaf9b95a dpaa2-eth: Configure Rx flow classification key
For firmware versions that support it, configure an Rx flow
classification key at probe time.

Hardware expects all rules in the classification table to share
the same key. So we setup a key containing all supported fields
at driver init and when a user adds classification rules through
ethtool, we will just mask out the unused header fields.

Since the key composition process is the same for flow
classification and hashing, reuse existing code where possible.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:24:08 -07:00
Ioana Radulescu
f76c483a0b dpaa2-eth: Rename structure
Since the array of supported header fields will be used for
Rx flow classification as well, rename it from "hash_fields" to
the more inclusive "dist_fields".

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:24:08 -07:00
Ioana Radulescu
df85aeb9b6 dpaa2-eth: Use new API for Rx flow hashing
The Management Complex (MC) firmware initially allowed the
configuration of a single key to be used both for Rx flow hashing
and flow classification. This prevented us from supporting
Rx flow classification through ethtool.

Starting with version 10.7.0, the Management Complex(MC) offers
a new set of APIs for separate configuration of Rx hashing and
classification keys.

Update the Rx flow hashing support to use the new API, if available.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:24:08 -07:00
David S. Miller
b9f1bcb220 mlx5-fixes-2018-10-01
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbsmA5AAoJEEg/ir3gV/o+Bg0IAM4qUD878349YufTxlNI3usz
 pPyRYLFfh+f3oTjGWxZ4SlAzMsCVnGDWMKFBj33qnJen1YcMxD2DriTIStQfOHkE
 EpErxxMa70YFY2VyvD6Gt/Xfw9Q4ZGrp9j4tNnXVBMVT+EDhui/OpHFlpgoLtpLk
 A8pxPRkbQNdql4OWOLwVpFJoleB0mJp4+ed+z743z4ectBfY0Iqx6Yyk1TRUzLfB
 o/cUV02M33Fi8XjjZ59crPrIc4r0iUcPDcd5+8i2WtOkfbFod9s1Dg4GMNbGpcFn
 RPJX9A+XilMPCJNM8YhwcS2JxX8Y2XwUjBDbspx0RHX3HZY22SjuN2i5p1GZnLo=
 =pu43
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2018-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2018-10-01

This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.

For -stable v4.11:
"6e0a4a23c59a ('net/mlx5: E-Switch, Fix out of bound access when setting vport rate')"

For -stable v4.18:
"98d6627c372a ('net/mlx5e: Set vlan masks for all offloaded TC rules')"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:20:24 -07:00
Subash Abhinov Kasiviswanathan
ec405641e2 net: qualcomm: rmnet: Fix incorrect allocation flag in receive path
The incoming skb needs to be reallocated in case the headroom
is not sufficient to adjust the ethernet header. This allocation
needs to be atomic otherwise it results in this splat

 [<600601bb>] ___might_sleep+0x185/0x1a3
 [<603f6314>] ? _raw_spin_unlock_irqrestore+0x0/0x27
 [<60069bb0>] ? __wake_up_common_lock+0x95/0xd1
 [<600602b0>] __might_sleep+0xd7/0xe2
 [<60065598>] ? enqueue_task_fair+0x112/0x209
 [<600eea13>] __kmalloc_track_caller+0x5d/0x124
 [<600ee9b6>] ? __kmalloc_track_caller+0x0/0x124
 [<602696d5>] __kmalloc_reserve.isra.34+0x30/0x7e
 [<603f629b>] ? _raw_spin_lock_irqsave+0x0/0x3d
 [<6026b744>] pskb_expand_head+0xbf/0x310
 [<6025ca6a>] rmnet_rx_handler+0x7e/0x16b
 [<6025c9ec>] ? rmnet_rx_handler+0x0/0x16b
 [<6027ad0c>] __netif_receive_skb_core+0x301/0x96f
 [<60033c17>] ? set_signals+0x0/0x40
 [<6027bbcb>] __netif_receive_skb+0x24/0x8e

Fixes: 74692caf1b ("net: qualcomm: rmnet: Process packets over ethernet")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:16:00 -07:00
Subash Abhinov Kasiviswanathan
6392ff3c8e net: qualcomm: rmnet: Fix incorrect allocation flag in transmit
The incoming skb needs to be reallocated in case the headroom
is not sufficient to add the MAP header. This allocation needs to
be atomic otherwise it results in the following splat

[32805.801456] BUG: sleeping function called from invalid context
[32805.841141] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[32805.904773] task: ffffffd7c5f62280 task.stack: ffffff80464a8000
[32805.910851] pc : ___might_sleep+0x180/0x188
[32805.915143] lr : ___might_sleep+0x180/0x188
[32806.131520] Call trace:
[32806.134041]  ___might_sleep+0x180/0x188
[32806.137980]  __might_sleep+0x50/0x84
[32806.141653]  __kmalloc_track_caller+0x80/0x3bc
[32806.146215]  __kmalloc_reserve+0x3c/0x88
[32806.150241]  pskb_expand_head+0x74/0x288
[32806.154269]  rmnet_egress_handler+0xb0/0x1d8
[32806.162239]  rmnet_vnd_start_xmit+0xc8/0x13c
[32806.166627]  dev_hard_start_xmit+0x148/0x280
[32806.181181]  sch_direct_xmit+0xa4/0x198
[32806.185125]  __qdisc_run+0x1f8/0x310
[32806.188803]  net_tx_action+0x23c/0x26c
[32806.192655]  __do_softirq+0x220/0x408
[32806.196420]  do_softirq+0x4c/0x70

Fixes: ceed73a2cf ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:16:00 -07:00
Sean Tranchetti
a07f388e2c net: qualcomm: rmnet: Skip processing loopback packets
RMNET RX handler was processing invalid packets that were
originally sent on the real device and were looped back via
dev_loopback_xmit(). This was detected using syzkaller.

Fixes: ceed73a2cf ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 22:16:00 -07:00
Florian Fainelli
45ec318578 net: systemport: Fix wake-up interrupt race during resume
The AON_PM_L2 is normally used to trigger and identify the source of a
wake-up event. Since the RX_SYS clock is no longer turned off, we also
have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and
that interrupt remains active up until the magic packet detector is
disabled which happens much later during the driver resumption.

The race happens if we have a CPU that is entering the SYSTEMPORT
INTRL2_0 handler during resume, and another CPU has managed to clear the
wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we
have the first CPU stuck in the interrupt handler with an interrupt
cause that has been cleared under its feet, and so we keep returning
IRQ_NONE and we never make any progress.

This was not a problem before because we would always turn off the
RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned
off as well, thus not latching the interrupt.

The fix is to make sure we do not enable either the MPD or
BRCM_TAG_MATCH interrupts since those are redundant with what the
AON_PM_L2 interrupt controller already processes and they would cause
such a race to occur.

Fixes: bb9051a2b2 ("net: systemport: Add support for WAKE_FILTER")
Fixes: 83e82f4c70 ("net: systemport: add Wake-on-LAN support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 17:34:47 -07:00
Oza Pawandeep
62b36c3ea6 PCI/AER: Remove pci_cleanup_aer_uncorrect_error_status() calls
After bfcb79fca1 ("PCI/ERR: Run error recovery callbacks for all affected
devices"), AER errors are always cleared by the PCI core and drivers don't
need to do it themselves.

Remove calls to pci_cleanup_aer_uncorrect_error_status() from device
driver error recovery functions.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove PCI core changes, remove unused variables]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-02 16:04:40 -05:00
Jakub Kicinski
ff58e2df62 nfp: avoid soft lockups under control message storm
When FW floods the driver with control messages try to exit the cmsg
processing loop every now and then to avoid soft lockups.  Cmsg
processing is generally very lightweight so 512 seems like a reasonable
budget, which should not be exceeded under normal conditions.

Fixes: 77ece8d5f1 ("nfp: add control vNIC datapath")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 11:47:58 -07:00
David S. Miller
d5486377b8 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-10-02

This series contains updates to ice driver only.

Anirudh expands the use of VSI handles across the rest of the driver,
which includes refactoring the code to correctly use VSI handles.  After
a reset, ensure that all configurations for a VSI get re-applied before
moving on to rebuilding the next VSI.

Dave fixed the driver to check the current link state after reset to
ensure that the correct link state of a port is reported.  Fixed an
issue where if the driver is unloaded when traffic is in progress,
errors are generated.

Preethi breaks up the IRQ tracker into a software and hardware IRQ
tracker, where the software IRQ tracker tracks only the PF's IRQ
requests and does not play any role in the VF initialization.  The
hardware IRQ tracker represents the device's interrupt space and will be
looked up to see if the device has run our of interrupts when a
interrupt has to be allocated in the device for either PF or VF.

Md Fahad adds support for enabling/disabling RSS via ethtool.

Brett aligns the ice_reset_req enum values to the values that the
hardware understands.  Also added initial support for dynamic interrupt
moderation in the ice driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 11:39:09 -07:00
Maciej W. Rozycki
fe3a83af6a declance: Fix continuation with the adapter identification message
Fix a commit 4bcc595ccd ("printk: reinstate KERN_CONT for printing
continuation lines") regression with the `declance' driver, which caused
the adapter identification message to be split between two lines, e.g.:

declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA
, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.

Address that properly, by printing identification with a single call,
making the messages now look like:

declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Fixes: 4bcc595ccd ("printk: reinstate KERN_CONT for printing continuation lines")
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 11:31:54 -07:00
Sudarsana Reddy Kalluru
631b67072b qede: Add driver support for 20G link speed.
Add driver support for reading/configuring the 20G link speed via ethtool.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 11:29:40 -07:00
Sudarsana Reddy Kalluru
5bf0961cc6 qed: Add driver support for 20G link speed.
Add driver support for configuring/reading the 20G link speed.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 11:29:40 -07:00
Rickard x Andersson
657ade07df net: fec: fix rare tx timeout
During certain heavy network loads TX could time out
with TX ring dump.
TX is sometimes never restarted after reaching
"tx_stop_threshold" because function "fec_enet_tx_queue"
only tests the first queue.

In addition the TX timeout callback function failed to
recover because it also operated only on the first queue.

Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 11:27:10 -07:00
Dave Ertman
81b23589f4 ice: Fix error on driver remove
If the driver is unloaded when traffic is in progress, errors are
generated. Fix this by releasing qvectors and NAPI handler on remove.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:20:31 -07:00
Brett Creeley
9e4ab4c29a ice: Add support for dynamic interrupt moderation
Currently there is no support for dynamic interrupt moderation. This
patch adds some initial code to support this. The following changes
were made:

1. Currently we are using multiple members to store the interrupt
   granularity (itr_gran_25/50/100/200). This is not necessary because
   we can query the device to determine what the interrupt granularity
   should be set to, done by a new function ice_get_itr_intrl_gran.

2. Added intrl to ice_q_vector structure to support interrupt rate
   limiting.

3. Added the function ice_intrl_usecs_to_reg for converting to a value
   in usecs that the device understands.

4. Added call to write to the GLINT_RATE register. Disable intrl by
   default for now.

5. Changed rx/tx_itr_setting to itr_setting because having both seems
   redundant because a ring is either Tx or Rx.

6. Initialize itr_setting for both Tx/Rx rings in ice_vsi_alloc_rings()

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:19:30 -07:00
Brett Creeley
ca4929b6df ice: Align ice_reset_req enum values to hardware reset values
Currently the ice_reset_req enum values have to be translated into
a different set of values that the hardware understands for the same
reset types. Avoid this translation by aligning ice_reset_req enum
values to the ones that the hardware understands.

Also add and else if block to check for ICE_RESET_EMPR and put a dev_dbg
message in the else case.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:18:20 -07:00
Md Fahad Iqbal Polash
492af0ab4f ice: Implement ethtool hook for RSS switch
This patch implements ethtool hook for enabling/disabling
RSS. While disabling RSS, the LUT should be cleared. And
the LUT should be reconfigured while enabling RSS.

Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:17:18 -07:00
Preethi Banala
eb0208ec42 ice: Split irq_tracker into sw_irq_tracker and hw_irq_tracker
For the PF driver, when mapping interrupts to queues, we need to request
IRQs from the kernel and we also have to allocate interrupts from
the device.

Similarly, when the VF driver (iavf.ko) initializes, it requests the kernel
IRQs that it needs but it can't directly allocate interrupts in the device.
Instead, it sends a mailbox message to the ice driver, which then allocates
interrupts in the device on the VF driver's behalf.

Currently both these cases end up having to reserve entries in
pf->irq_tracker but irq_tracker itself is sized based on how many vectors
the PF driver needs. Under the right circumstances, the VF driver can fail
to get entries in irq_tracker, which will result in the VF driver failing
probe.

To fix this, sw_irq_tracker and hw_irq_tracker are introduced. The
sw_irq_tracker tracks only the PF's IRQ request and doesn't play any
role in VF init. hw_irq_tracker represents the device's interrupt space.
When interrupts have to be allocated in the device for either PF or VF,
hw_irq_tracker will be looked up to see if the device has run out of
interrupts.

Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:16:19 -07:00
Dave Ertman
5755143dd1 ice: Check for actual link state of port after reset
We are currently replaying the link state of a port after a reset, but
it is possible that the link state of a port can change during the reset
process. So check for the current link state of a port during the rebuild
process of a reset.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:15:22 -07:00
Anirudh Venkataramanan
334cb0626d ice: Implement VSI replay framework
Currently, switch filters get replayed after reset. In addition to
filters, other VSI attributes (like RSS configuration, Tx scheduler
configuration, etc.) also need to be replayed after reset.

Thus, instead of replaying based on functional blocks (i.e. replay
all filters for all VSIs, followed by RSS configuration replay for
all VSIs, and so on), it makes more sense to have the replay centered
around a VSI. In other words, replay all configurations for a VSI before
moving on to rebuilding the next VSI.

To that effect, this patch introduces a VSI replay framework in a new
function ice_vsi_replay_all. Currently it only replays switch filters,
but it will be expanded in the future to replay additional VSI attributes.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:14:23 -07:00
Anirudh Venkataramanan
4fb33f3107 ice: Expand use of VSI handles part 2/2
This patch is a continuation of the previous patch where VSI
handles are used instead of VSI numbers.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:13:23 -07:00
Anirudh Venkataramanan
5726ca0e5e ice: Expand use of VSI handles part 1/2
A VSI handle is just a number the driver maintains to uniquely identify
a VSI. A VSI handle is backed by a VSI number in the hardware. When
interacting when the hardware, VSI handles are converted into VSI numbers.

In commit 0f9d5027a7 ("ice: Refactor VSI allocation, deletion and
rebuild flow"), VSI handles were introduced but it was used only
when creating and deleting VSIs. This patch is part one of two patches
that expands the use of VSI handles across the rest of the driver. Also
in this patch, certain parts of the code had to be refactored to correctly
use VSI handles.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-02 07:11:57 -07:00
Jakub Kicinski
0c9864c05f nfp: bpf: allow control message sizing for map ops
In current ABI the size of the messages carrying map elements was
statically defined to at most 16 words of key and 16 words of value
(NFP word is 4 bytes).  We should not make this assumption and use
the max key and value sizes from the BPF capability instead.

To make sure old kernels don't get surprised with larger (or smaller)
messages bump the FW ABI version to 3 when key/value size is different
than 16 words.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02 14:39:59 +02:00
Jakub Kicinski
9bbdd41b8a nfp: allow apps to request larger MTU on control vNIC
Some apps may want to have higher MTU on the control vNIC/queue.
Allow them to set the requested MTU at init time.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02 14:39:59 +02:00
Jakub Kicinski
28264eb227 nfp: bpf: parse global BPF ABI version capability
Up until now we only had per-vNIC BPF ABI version capabilities,
which are slightly awkward to use because bulk of the resources
and configuration does not relate to any particular vNIC.  Add
a new capability for global ABI version and check the per-vNIC
version are equal to it.  Assume the ABI version 2 if no explicit
version capability is present.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02 14:39:58 +02:00
Thomas Falcon
723ad91613 ibmvnic: Add ethtool private flag for driver-defined queue limits
When choosing channel amounts and ring sizes, the maximums in the
ibmvnic driver are defined by the virtual i/o server management
partition. Even though they are defined as maximums, the client
driver may in fact successfully request resources that exceed
these limits, which are mostly dependent on a user's hardware

With this in mind, provide an ethtool flag that when enabled will
allow the user to request resources limited by driver-defined
maximums instead of limits defined by the management partition.
The driver will try to honor the user's request but may not allowed
by the management partition. In this case, the driver requests
as close as it can get to the desired amount until it succeeds.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:31:16 -07:00
Thomas Falcon
20b5ba1f61 ibmvnic: Introduce driver limits for ring sizes
Introduce driver-defined maximums for queue ring sizes. Devices
available for IBM vNIC today will likely not allow this amount,
but this should give us some leeway for future devices that may
support larger ring sizes.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:31:16 -07:00
Thomas Falcon
ad95a240a1 ibmvnic: Increase maximum queue size limit
Increase queue size limit to 16. Devices available for IBM vNIC today
will not allow this amount, but this should give us some leeway for
future devices that may support more RX or TX queues.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:31:16 -07:00
Heiner Kallweit
ad5f97faff r8169: fix network stalls due to missing bit TXCFG_AUTO_FIFO
Some of the chip-specific hw_start functions set bit TXCFG_AUTO_FIFO
in register TxConfig. The original patch changed the order of some
calls resulting in these changes being overwritten by
rtl_set_tx_config_registers() in rtl_hw_start(). This eventually
resulted in network stalls especially under high load.

Analyzing the chip-specific hw_start functions all chip version from
34, with the exception of version 39, need this bit set.
This patch moves setting this bit to rtl_set_tx_config_registers().

Fixes: 4fd48c4ac0 ("r8169: move common initializations to tp->hw_start")
Reported-by: Ortwin Glück <odi@odi.ch>
Reported-by: David Arendt <admin@prnet.org>
Root-caused-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Tested-by: Tony Atkinson <tatkinson@linux.com>
Tested-by: David Arendt <admin@prnet.org>
Tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:28:21 -07:00
Florian Fainelli
a5d78ce793 net: systemport: Add software counters to track reallocations
When inserting the TSB, keep track of how many times we had to do it and
if there was a failure in doing so, this helps profile the driver for
possibly incorrect headroom settings.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:53 -07:00
Florian Fainelli
aa6ca0ec71 net: systemport: Be drop monitor friendly while re-allocating headroom
During bcm_sysport_insert_tsb() make sure we differentiate a SKB
headroom re-allocation failure from the normal swap and replace path.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:52 -07:00
Florian Fainelli
b5061778f8 net: systemport: Turn on offloads by default
We can turn on the RX/TX checksum offloads by default and make sure that
those are properly reflected back to e.g: stacked devices such as VLAN
or DSA.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:52 -07:00
Florian Fainelli
297357d1a1 net: systemport: Utilize bcm_sysport_set_features() during resume/open
During driver resume and open, the HW may have lost its context/state,
utilize bcm_sysport_set_features() to make sure we do restore the
correct set of features that were previously configured.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:52 -07:00
Florian Fainelli
10b476c57b net: systemport: Refactor bcm_sysport_set_features()
In preparation for unconditionally enabling TX and RX checksum offloads,
refactor bcm_sysport_set_features() a bit such that
__netdev_update_features() during register_netdev() can make sure that
features are correctly programmed during network device registration.

Since we can now be called during register_netdev() with clocks gated,
we need to temporarily turn them on/off in order to have a successful
register programming.

We also move the CRC forward setting read into
bcm_sysport_set_features() since priv->crc_fwd matters while turning on
RX checksum offload, that way we are guaranteed they are in sync in case
we ever add support for NETIF_F_RXFCS at some point in the future.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:52 -07:00
Jian Shen
c17852a893 net: hns3: Add support for enable/disable flow director
This patch adds switch for flow director with ethtool command

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
dc5e606477 net: hns3: Remove all flow director rules when unload hns3 driver
This patch removes all flow director rules when unload hns3 driver.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
6871af29b3 net: hns3: Add reset handle for flow director
When doing reset, remove all entries in TCAM block, and keep flow
director rules list. After finishing reset, restore all entries.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
05c2314fe6 net: hns3: Add support for rule query of flow director
This patch adds support for querying rule number and rule details
by ethtool commands.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
dd74f815dd net: hns3: Add support for rule add/delete for flow director
This patch adds support for add and delete rule by ethtool commands.
HNS3 driver supports several flow types, include ETHER_FLOW,
IP_USER_FLOW, TCP_V4_FLOW, UDP_V4_FLOW, SCTP_V4_FLOW, IPV6_USER_FLOW,
TCP_V6_FLOW, UDP_V6_FLOW and SCTP_V6_FLOW.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
1173286802 net: hns3: Add input key and action config support for flow director
Each flow director rule consists of input key and action. The input key
is the condition for matching, includes tuples of L2/L3/L4 header.
Action is the behaviour when a packet matches with the input key, such
as drop the packet, or forward to a specified queue.

The input key is stored in the tcam blocks, Each bit of input key can
be masked.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
d695964d72 net: hns3: Add flow director initialization
Flow director is a new feature supported by hardware with revision 0x21.
This patch adds flow direcor initialization for each PF. It queries flow
director mode and tcam resource from firmware, selects tuples used for
input key.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Andrew Lunn
719655a149 net: phy: Replace phy driver features u32 with link_mode bitmap
This is one step in allowing phylib to make use of link_mode bitmaps,
instead of u32 for supported and advertised features. Convert the phy
drivers to use bitmaps to indicates the features they support.

Build bitmap equivalents of the u32 values at runtime, and have the
drivers point to the appropriate bitmap. These bitmaps are shared, and
we don't want a driver to modify them. So mark them __ro_after_init.

Within phylib, the features bitmap is currently turned back into a
u32. This will be removed once the whole of phylib, and the drivers
are converted to use bitmaps.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:55:36 -07:00
Andrew Lunn
d0939c26c5 net: ethernet: xgbe: expand PHY_GBIT_FEAUTRES
The macro PHY_GBIT_FEAUTRES needs to change into a bitmap in order to
support link_modes. Remove its use from xgde by replacing it with its
definition.

Probably, the current behavior is wrong. It probably should be
ANDing not assigning.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:55:36 -07:00
Andrew Lunn
5f991f7bdd net: phy: Add helper for advertise to lcl value
Add a helper to convert the local advertising to an LCL capabilities,
which is then used to resolve pause flow control settings.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:55:36 -07:00
Jakub Kicinski
97ea8ac360 nfp: warn on experimental TLV types
Reserve two TLV types for feature development, and warn in the driver
if they ever leak into production.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:49:30 -07:00
Moritz Fischer
ea43a5907f net: nixge: Address compiler warnings when building for i386
Address compiler warning reported by kbuild autobuilders
when building for i386 as a result of dma_addr_t size on
different architectures.

warning: cast to pointer from integer of different size
	[-Wint-to-pointer-cast]

Fixes: 7e8d5755be ("net: nixge: Add support for 64-bit platforms")
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:48:08 -07:00
David S. Miller
3bd09b05b0 mlx5e-updates-2018-10-01
This series includes updates to mlx5e ethernet netdevice driver:
 
 From Or Gerlitz:
 1) Support masks for l3/l4 filters in ethtool flow steering
 2) Report checksum unnecessary also when the L3 checksum flag on the
    cqe is set and there's no L4 header
 3) Allow reporting of checksum unnecessary, using an ethtool private flag.
 
 From Gavi Teitz and Or, VF representors netdevs performance improvements
 4) Allow striding RQ in VF representor and bigger RQ size, ~3X performance improvement
 5) Enable stateless offloads for VF representor, csum and TSO, 1.5X performance improvement
 6) RSS Support for VF representors
    6.1) Allow flow table destination fir VF representor steering rule.
    6.2) Create RSS flow table per representor netdev
    6.3) Expose mlx5e RSS ethtool to be used by representor netdevs
    6.4) Enable multi-queue and RSS for VF representors, using mlx5e existing infrastructure
             for managing a multi-queue RX RSS tables.
 
 From Alaa Hleihel:
 7) Cache the system image guid, The system image guid is a read-only field
    Read this once and save it on the core device.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbsmk5AAoJEEg/ir3gV/o+S+IIAMpGR1b79uPjCIdZSTxLlCvg
 xGdc3MGQ4Lv+DuUEP1DQpZDIAJQeh10KqICGfRZxo7d4jUtajwujwaUylkZrb9da
 QNIeK2FLMXkxQo51owZxXgdAy8BNpPmg3D7xI/O/DET4UWCKRmZ/4eJfWLG7r8qp
 +BEqQmJjhY67mj1tuzJ1JsysrOvUhfXEwqJGgBCLlME8KVWh0No4fTsrAlKBmeec
 kYxaW9hS57LKOu32y8LdYb830yHDjUqtWAd+5cGWNTOaygm0dd5ZPVnR9oJ/Ga7I
 Ru64aMMQoLuRbMfryUh8gYZ/oIwP48oIIr3Wler5SXkHxEKR++F5fJrApgxvibA=
 =ivVt
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-10-01

This series includes updates to mlx5e ethernet netdevice driver:

From Or Gerlitz:
1) Support masks for l3/l4 filters in ethtool flow steering
2) Report checksum unnecessary also when the L3 checksum flag on the
   cqe is set and there's no L4 header
3) Allow reporting of checksum unnecessary, using an ethtool private flag.

From Gavi Teitz and Or, VF representors netdevs performance improvements
4) Allow striding RQ in VF representor and bigger RQ size, ~3X performance improvement
5) Enable stateless offloads for VF representor, csum and TSO, 1.5X performance improvement
6) RSS Support for VF representors
   6.1) Allow flow table destination fir VF representor steering rule.
   6.2) Create RSS flow table per representor netdev
   6.3) Expose mlx5e RSS ethtool to be used by representor netdevs
   6.4) Enable multi-queue and RSS for VF representors, using mlx5e existing infrastructure
            for managing a multi-queue RX RSS tables.

From Alaa Hleihel:
7) Cache the system image guid, The system image guid is a read-only field
   Read this once and save it on the core device.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 15:49:17 -07:00
David S. Miller
d96112b2ca Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-10-01

This series contains updates to ice driver only.

Anirudh provides several changes to "prep" the driver for upcoming
features.  Specifically, the functions that are used for PF VSI/netdev
setup will also be used in SR-IOV support and to allow the reuse of
these functions, code needs to move.

Dave provides the only other change in the series, updates the driver to
protect the reset patch in its entirety.  This is done by adding the
various bit checks to determine if a reset is scheduled/initiated and
whether it came from the software or firmware.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 15:43:58 -07:00
Dave Ertman
5df7e45d54 ice: Change pf state behavior to protect reset path
Currently, there is no bit, or set of bits, that protect the entirety
of the reset path.

If the reset is originated by the driver, then the relevant
one of the following bits will be set when the reset is scheduled:
__ICE_PFR_REQ
__ICE_CORER_REQ
__ICE_GLOBR_REQ
This bit will not be cleared until after the rebuild has completed.

If the reset is originated by the FW, then the first the driver knows of
it will be the reception of the OICR interrupt.  The __ICE_RESET_OICR_RECV
bit will be set in the interrupt handler.  This will also be the indicator
in a SW originated reset that we have completed the pre-OICR tasks and
have informed the FW that a reset was requested.

To utilize these bits, change the function:
ice_is_reset_recovery_pending()
	to be:
ice_is_reset_in_progress()

The new function will check all of the above bits in the pf->state and
will return a true if one or more of these bits are set.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:50:50 -07:00
Anirudh Venkataramanan
37bb839012 ice: Move common functions out of ice_main.c part 7/7
This patch completes the code move out of ice_main.c

The following top level functions and related dependency functions) were
moved to ice_lib.c:
ice_vsi_setup
ice_vsi_cfg_tc

The following functions were made static again:
ice_vsi_setup_vector_base
ice_vsi_alloc_q_vectors
ice_vsi_get_qs
void ice_vsi_map_rings_to_vectors
ice_vsi_alloc_rings
ice_vsi_set_rss_params
ice_vsi_set_num_qs
ice_get_free_slot
ice_vsi_init
ice_vsi_alloc_arrays

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:50:30 -07:00
Anirudh Venkataramanan
df0f847915 ice: Move common functions out of ice_main.c part 6/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_setup_vector_base
ice_vsi_alloc_q_vectors
ice_vsi_get_qs

The following functions were made static again:
ice_vsi_free_arrays
ice_vsi_clear_rings

Also, in this patch, the netdev and NAPI registration logic was de-coupled
from the VSI creation logic (ice_vsi_setup) as for SR-IOV, while we want to
create VF VSIs using ice_vsi_setup, we don't want to create netdevs.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:50:06 -07:00
Anirudh Venkataramanan
07309a0e59 ice: Move common functions out of ice_main.c part 5/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_clear
ice_vsi_close
ice_vsi_free_arrays
ice_vsi_map_rings_to_vectors

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:49:45 -07:00
Anirudh Venkataramanan
28c2a64573 ice: Move common functions out of ice_main.c part 4/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_alloc_rings
ice_vsi_set_rss_params
ice_vsi_set_num_qs
ice_get_free_slot
ice_vsi_init
ice_vsi_clear_rings
ice_vsi_alloc_arrays

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:49:21 -07:00
Anirudh Venkataramanan
5153a18e57 ice: Move common functions out of ice_main.c part 3/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_delete
ice_free_res
ice_get_res
ice_is_reset_recovery_pending
ice_vsi_put_qs
ice_vsi_dis_irq
ice_vsi_free_irq
ice_vsi_free_rx_rings
ice_vsi_free_tx_rings
ice_msix_clean_rings

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:48:58 -07:00
Anirudh Venkataramanan
72adf2421d ice: Move common functions out of ice_main.c part 2/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_start_rx_rings
ice_vsi_stop_rx_rings
ice_vsi_stop_tx_rings
ice_vsi_cfg_rxqs
ice_vsi_cfg_txqs
ice_vsi_cfg_msix

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:45:56 -07:00
Anirudh Venkataramanan
45d3d428ea ice: Move common functions out of ice_main.c part 1/7
The functions that are used for PF VSI/netdev setup will also be used
for SR-IOV support. To allow reuse of these functions, move these
functions out of ice_main.c to ice_common.c/ice_lib.c

This move is done across multiple patches. Each patch moves a few
functions and may have minor adjustments. For example, a function that was
previously static in ice_main.c will be made non-static temporarily in
its new location to allow the driver to build cleanly. These adjustments
will be removed in subsequent patches where more code is moved out of
ice_main.c

In this particular patch, the following functions were moved out of
ice_main.c:
int ice_add_mac_to_list
ice_free_fltr_list
ice_stat_update40
ice_stat_update32
ice_update_eth_stats
ice_vsi_add_vlan
ice_vsi_kill_vlan
ice_vsi_manage_vlan_insertion
ice_vsi_manage_vlan_stripping

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:39:40 -07:00
Alaa Hleihel
59c9d35ea9 net/mlx5: Cache the system image guid
The system image guid is a read-only field which is used by the TC
offloads code to determine if two mlx5 devices belong to the same
ASIC while adding flows.

Read this once and save it on the core device rather than querying each
time an offloaded flow is added.

Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:47 -07:00
Or Gerlitz
b856df28f9 net/mlx5e: Allow reporting of checksum unnecessary
Currently we practically never report checksum unnecessary, because
for all IP packets we take the checksum complete path.

Enable non-default runs with reprorting checksum unnecessary, using
an ethtool private flag. This can be useful for performance evals
and other explorations.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:47 -07:00
Or Gerlitz
b820e6fb09 net/mlx5e: Enable reporting checksum unnecessary also for L3 packets
We can report checksum unnecessary also when the L3 checksum
flag on the cqe is set and there's no L4 header.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:46 -07:00
Gavi Teitz
f128f138cc net/mlx5e: Add ethtool control of ring params to VF representors
Added ethtool control to the representors for setting and querying
the ring params.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
2018-10-01 11:32:46 -07:00
Gavi Teitz
84a0973386 net/mlx5e: Enable multi-queue and RSS for VF representors
Increased the amount of channels the representors can open to be the
amount of CPUs. The default amount opened remains one.

Used the standard NIC netdev functions to:
* Set RSS params when building the representors' params.
* Setup an indirect TIR and RQT for the representors upon
  initialization.
* Create a TTC flow table for the representors' indirect TIR (when
  creating the TTC table, mlx5e_set_ttc_basic_params() is not called,
  in order to avoid setting the inner_ttc param, which is not needed).

Added ethtool control to the representors for setting and querying
the amount of open channels. Additionally, included logic in the
representors' ethtool set channels handler which controls a
representor's vport rx rule, so that if there is one open channel
the rx rule steers traffic to the representor's direct TIR, whereas
if there is more than one channel, the rx rule steers traffic to the
new TTC flow table.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:45 -07:00
Or Gerlitz
a5355de878 net/mlx5e: Expose ethtool rss key size / indirection table functions
Towards enabling RSS for the vport representors, expose the functions for
querying the rss hash key size and indirection table size via ethtool.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:45 -07:00
Gavi Teitz
3edc0159c0 net/mlx5e: Expose function for building RSS params
Towards enabling RSS for the vport representors, extract the
procedure for building a device's RSS params, and expose the
function.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:44 -07:00
Or Gerlitz
46dc933cee net/mlx5e: Provide explicit directive if to create inner indirect tirs
Change the driver functions that deal with creating indirect tirs
to get a flag telling if inner ttc is desired.

A pre-step for enabling rss on the vport representors, where
inner ttc is not needed.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:44 -07:00
Gavi Teitz
c966f7d55d net/mlx5: E-Switch, Provide flow dest when creating vport rx rule
Currently the destination for the representor e-switch rx rule is
a TIR number. Towards changing that to potentially be a flow table,
as part of enabling RSS for representors, modify the signature of
the related e-switch API to get a flow destination.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:43 -07:00
Gavi Teitz
092297e09a net/mlx5e: Extract creation of rep's default flow rule
Cleaning up the flow of the representors' rx initialization, towards
enabling RSS for the representors.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:43 -07:00
Gavi Teitz
dabeb3b0d5 net/mlx5e: Enable stateless offloads for VF representor netdevs
Enabled checksum and TSO offloads for the representors, in
order to increase their performance, which is required to
increase the performance of flows that cannot be offloaded.

Checksum offloads contribute to a general acceleration of all
traffic (to around 150%), whereas the TSO offload contributes
to a prominent acceleration of the representor's TX for traffic
flows with larger than MTU sized packets (to around 200%). This
is the usual case for TCP streams, as the PF, which serves as
the uplink representor, and the VF representors employ GRO before
forwarding the packets to the representor.

GRO was enabled implicitly for the representors beforehand, and
is explicitly enabled here to ensure that the representors preserve
the performance boost it provides (of around 200%) when working in
tandem with the TSO offload by the forwardee, which is the standard
case as both the PF and the VF representors employ HW TSO.

The impact of these changes can be seen in the following
measurements taken on a setup of a VM over a VF, connected
to OVS via the VF representor, to an external host:

Before current changes:
                     TCP Throughput [Gb/s]
External host to VM         ~ 10.5
VM to external host         ~ 23.5

With just checksum offloads enabled:
                     TCP Throughput [Gb/s]
External host to VM         ~ 14.9
VM to external host         ~ 28.5

With the TSO offload also enabled:
                     TCP Throughput [Gb/s]
External host to VM         ~ 30.5

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:42 -07:00
Gavi Teitz
749359f4aa net/mlx5e: Change VF representors' RQ type
The representors' RQ size was not large enough for them to achieve
high enough performance, and therefore needed to be enlarged, while
suffering a minimum hit to its memory usage. To achieve this the
representors RQ size was increased, and its type was changed to be a
striding RQ if it is supported.

Towards that goal the following changes were made:

* Extracted the sequence for setting the standard netdev's RQ parmas
  into a function

* Replaced the sequence for setting the representor's RQ params with
  the standard sequence

The impact of this change can be seen in the following measurements
taken on a setup of a VM over a VF, connected to OVS via the VF
representor, to an external host:

Before current change:
                     TCP Throughput [Gb/s]
VM to external host         ~  7.2

With the current change (measured with a striding RQ):
                     TCP Throughput [Gb/s]
VM to external host         ~ 23.5

Each representor now consumes 2 [MB] of memory for its packet
buffers.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:42 -07:00
Or Gerlitz
3a95e0ccaf net/mlx5e: Ethtool steering, Support masks for l3/l4 filters
Allow using partial masks for L3 addresses and L4 ports across
the place.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:41 -07:00
Jianbo Liu
cee2648762 net/mlx5e: Set vlan masks for all offloaded TC rules
In flow steering, if asked to, the hardware matches on the first ethertype
which is not vlan. It's possible to set a rule as follows, which is meant
to match on untagged packet, but will match on a vlan packet:
    tc filter add dev eth0 parent ffff: protocol ip flower ...

To avoid this for packets with single tag, we set vlan masks to tell
hardware to check the tags for every matched packet.

Fixes: 095b6cfd69 ('net/mlx5e: Add TC vlan match parsing')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Eran Ben Elisha
11aa5800ed net/mlx5: E-Switch, Fix out of bound access when setting vport rate
The code that deals with eswitch vport bw guarantee was going beyond the
eswitch vport array limit, fix that.  This was pointed out by the kernel
address sanitizer (KASAN).

The error from KASAN log:
[2018-09-15 15:04:45] BUG: KASAN: slab-out-of-bounds in
mlx5_eswitch_set_vport_rate+0x8c1/0xae0 [mlx5_core]

Fixes: c9497c9890 ("net/mlx5: Add support for setting VF min rate")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Alaa Hleihel
4d8fcf216c net/mlx5e: Avoid unbounded peer devices when unpairing TC hairpin rules
If the peer device was already unbound, then do not attempt to modify
it's resources, otherwise we will crash on dereferencing non-existing
device.

Fixes: 5c65c564c9 ("net/mlx5e: Support offloading TC NIC hairpin flows")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Hans de Goede
ac8bd9e13b r8169: Disable clk during suspend / resume
Disable the clk during suspend to save power. Note that tp->clk may be
NULL, the clk core functions handle this without problems.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Carlo Caione <carlo@endlessm.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:47:57 -07:00
Shahed Shaikh
c333fa0c4f qlcnic: fix Tx descriptor corruption on 82xx devices
In regular NIC transmission flow, driver always configures MAC using
Tx queue zero descriptor as a part of MAC learning flow.
But with multi Tx queue supported NIC, regular transmission can occur on
any non-zero Tx queue and from that context it uses
Tx queue zero descriptor to configure MAC, at the same time TX queue
zero could be used by another CPU for regular transmission
which could lead to Tx queue zero descriptor corruption and cause FW
abort.

This patch fixes this in such a way that driver always configures
learned MAC address from the same Tx queue which is used for
regular transmission.

Fixes: 7e2cf4feba ("qlcnic: change driver hardware interface mechanism")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:46:07 -07:00
David S. Miller
3ff6cde846 hns3: Another build fix.
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c: In function ‘hclge_get_sset_count’:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:496:31: error: ‘HNAE3_REVISION_ID_21’ undeclared (first use in this function); did you mean ‘FADT2_REVISION_ID’?
   if (hdev->pdev->revision >= HNAE3_REVISION_ID_21 ||
                               ^~~~~~~~~~~~~~~~~~~~
                               FADT2_REVISION_ID

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:21:06 -07:00
David S. Miller
d401766585 hns3: Fix the build.
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c: In function ‘hns3_self_test’:
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c:278:15: error: ‘HNS3_SELF_TEST_TYPE_NUM’ undeclared (first use in this function); did you mean ‘HNS3_SELF_TEST_TPYE_NUM’?
  int st_param[HNS3_SELF_TEST_TYPE_NUM][2];
               ^~~~~~~~~~~~~~~~~~~~~~~
               HNS3_SELF_TEST_TPYE_NUM

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 10:49:58 -07:00
Bruce Allan
3b6bf296c4 ice: fix changing of ring descriptor size (ethtool -G)
rx_mini_pending was set to an incorrect value. This was causing EINVAL to
always be returned to 'ethtool -G'. The driver does not support mini or
jumbo rings so the respective settings should be zero.

Also, change the valid range of the number of descriptors in the rings to
make the code simpler and easier for users to understand (this removes the
valid settings of 8 and 16). Add a system log message indicating when the
number is rounded-up from what the user specifies with the 'ethtool -G'
command (i.e. when it is not a multiple of 32), and update the log message
when a user-provided value is out of range to also indicate the stride.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Anirudh Venkataramanan
7d86cf3840 ice: Update to capabilities admin queue command
This patch makes a couple of changes in the way the driver uses the
"get capabilities" command.

1. Get device capabilities in addition to function capabilities

2. Align to latest spec by using cap_count to determine size of the
   buffer in case of length error.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Anirudh Venkataramanan
1886588fb6 ice: Query the Tx scheduler node before adding it
Query the Tx scheduler tree node information from FW before adding it to
the driver's software database. This will keep the node information current
in driver.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Brett Creeley
8bc8d188cd ice: Update comment for ice_fltr_mgmt_list_entry
Previously the comment stated that VSI lists should be used when a
second VSI becomes a subscriber to the "VLAN address". VSI lists
are always used for VLAN membership, so replace "VLAN address" with
"MAC address". Also note that VLAN(s) always use VSI list rules.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Jacob Keller
b2ccf317ed ice: update fw version check logic
We have MAX_FW_API_VER_BRANCH, MAX_FW_API_VER_MAJOR, and
MAX_FW_API_VER_MINOR that we use in ice_controlq.h to test when a
firmware version is newer than expected. This is currently tested by
comparing each field separately. Thus, we compare the branch field
against the MAX_FW_API_VER_BRANCH, and so forth.

This means that currently, if we suppose that the max firmware version
is defined as 0.2.1, i.e.

Then firmware 0.1.3 will fail to load. This is because the minor version
3 is greater than the max minor version 1.

This is not intuitive, because of the notion that increasing the major
firmware version to 2 should mean any firmware version with a major
version is less than 2 should be considered older than 2...

In order to allow both 0.2.1 and 0.1.3 to load, you would have to define
the "max" firmware version as 0.2.3.. It is possible that such
a firmware version doesn't even exist yet!

Fix this by replacing the current logic with an updated check that
behaves as follows:

First, we check the major version. If it is greater than the expected
version, then we prevent driver load. Additionally, a warning message is
logged to indicate to the system administrator that they need to update
their driver. This is now the only case where the driver will refuse to
load.

Second, if the major version is less than the expected version, we log
an information message indicating the NVM should be updated.

Third, if the major version is exact, we'll then check the minor
version. If the minor version is more than two versions less than
expected, we log an information message indicating the NVM should be
updated. If it is more than two versions greater than the expected
version, we log an information message that the driver should be
updated.

To support this, the ice_aq_ver_check function needs its signature
updated to pass the HW structure. Since we now pass this structure,
there is no need to pass the firmware API versions separately.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Bruce Allan
e4a0e1ee94 ice: update branding strings and supported device ids
Update branding strings and remove device ids 0x1594 and 0x1595.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Bruce Allan
95a525bee0 ice: replace unnecessary memcpy with direct assignment
Direct assignment is preferred over a memcpy()

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Jacob Keller
c913b73cd0 ice: use [sr]q.count when checking if queue is initialized
When shutting down the controlqs, we check if they are initialized
before we shut them down and destroy the lock. This is important, as it
prevents attempts to access the lock of an already shutdown queue.

Unfortunately, we checked rq.head and sq.head as the value to determine
if the queue was initialized. This doesn't work, because head is not
reset when the queue is shutdown. In some flows, the adminq will have
already been shut down prior to calling ice_shutdown_all_ctrlqs. This
can result in a crash due to attempting to access the already destroyed
mutex.

Fix this by using rq.count and sq.count instead. Indeed, ice_shutdown_sq
and ice_shutdown_rq already indicate that this is the value we should be
using to determine of the queue was initialized.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Colin Ian King
6a42b5128d qed: fix spelling mistake "b_cb_registred" -> "b_cb_registered"
Trivial fix to spelling mistake struct field name, rename it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:15:11 -07:00
Eric Dumazet
0c3b9d1b37 ibmvnic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ibmvnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

ibmvnic_netpoll_controller() was completely wrong anyway,
as it was scheduling NAPI to service RX queues (instead of TX),
so I doubt netpoll ever worked on this driver.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
a4f570be65 sfc-falcon: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

sfc-falcon uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Acked-By: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
9447a10ff6 sfc: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

sfc uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Acked-By: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
21627982e4 net: ena: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ena uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Netanel Belgazal <netanel@amazon.com>
Cc: Saeed Bishara <saeedb@amazon.com>
Cc: Zorik Machulsky <zorik@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
3548fcf7d8 qlogic: netxen: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

netxen uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Cc: Rahul Verma <rahul.verma@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
81b059b218 qlcnic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

qlcnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
4bd2c03be7 net: hns: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

hns uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
Eric Dumazet
226a2dd62c ehea: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ehea uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
Eric Dumazet
e71fb423e0 hinic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

hinic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Note that hinic_netpoll() was incorrectly scheduling NAPI
on both RX and TX queues.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
David S. Miller
ec72001d38 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-09-27

This series contains fixes to the ice driver only.

Jake fixes a potential crash due to attempting to access the mutex which
is already destroyed.  Fix this by using rq.count and sq.count to
determine if the queue was initialized.  Fixed the current logic for
checking the firmware version to properly handle situations when
firmware major/minor versions differ and when the branch version
differs.

Bruce replaces a memcpy() with a direct assignment, which is preferred.
Also updated the branding strings and device ids supported by the
driver.  Fixed the "ethtool -G" command in the driver, which was always
returning EINVAL when changing the descriptor ring size.

Brett update and clarified code comments.

Anirudh updates the driver to ensure we query the firmware for the
transmit scheduler node information before adding it to the driver
database, to ensure we have the current information.  Also update the
"get capabilities" command to get device and function capabilities.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:09:02 -07:00
Sudarsana Reddy Kalluru
5f672090e4 qed: Fix shmem structure inconsistency between driver and the mfw.
The structure shared between driver and the management FW (mfw) differ in
sizes. This would lead to issues when driver try to access the structure
members which are not-aligned with the mfw copy e.g., data_ptr usage in the
case of mfw_tlv request.
Align the driver structure with mfw copy, add reserved field(s) to driver
structure for the members not used by the driver.

Fixes: dd006921d6 ("qed: Add MFW interfaces for TLV request support.)
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
2018-09-28 10:39:44 -07:00
Huazhong Tan
e4fd75022c net: hns3: Fix loss of coal configuration while doing reset
The user's coal configuration will be lost after reset, so the tx_coal
and rx_coal fields are added to the struct hns_nic_priv to save the coal
configuration and used to restore the user's configuration after the reset
is complete.

Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Huazhong Tan
0d43bf45f4 net: hns3: Modify hns3_get_max_available_channels
The current hns3_get_max_available_channels returns the total number
of queues for the device, which makes ethtool -L set the number of queues
per channel queues incorrectly, so hns3_get_max_available_channels should
return the maximum available number of queues per channel, depending on
the total number of queues allocated and the hardware configurations.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Huazhong Tan
fe5eb04318 net: hns3: Change return type of hclge_tm_schd_info_update()
hclge_tm_schd_info_update should return an error when num_tc is greater
than alloc_tqps.

This patch changes the return type of hnae3_register_ae_algo from void
to int.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Yunsheng Lin
93d8daf460 net: hns3: Fix for netdev not up problem when setting mtu
Currently hns3_nic_change_mtu will try to down the netdev before
setting mtu, and it does not up the netdev when the setting fails,
which causes netdev not up problem.

This patch fixes it by not returning when the setting fails.

Fixes: a8e8b7ff35 ("net: hns3: Add support to change MTU in HNS3 hardware")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Yunsheng Lin
996ff91840 net: hns3: Fix for packet buffer setting bug
The hardware expects a unit of 128 bytes when setting
packet buffer. When calculating the packet buffer size,
hclge_rx_buffer_calc does not round up the size as a unit
of 128 byte, which may casue packet lost problem when stress
testing.

This patch fixes it by rounding up packet size when calculating.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Fuyun Liang
4dc13b9668 net: hns3: Add serdes parallel inner loopback support
This patch adds serdes parallel inner loopback support for self test.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
eb66d50352 net: hns3: Rename mac loopback to app loopback
In fact, our implementation of mac loopback is the implementation of app
loopback now. Current name is wrong. This patch renames mac loopback to
app loopback.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
a7b687b354 net: hns3: Rename loop mode
Our loop mode includes mac loop, serdes loop and phy loop. Not all of them
are related with mac. This patch corrects their names.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
cd2086bf49 net: hns3: Set extra mac address of pause param for HW
The extra mac address of pause param is used to do double check
for pause frame. This patch set it to HW. If we do not do that,
pfc pause frame will be transferred protocol stack when normal
flow control mode is enabled.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00