Commit Graph

41235 Commits

Author SHA1 Message Date
Michael Chan
0fb8582ae5 bnxt_en: Log error report for dropped doorbell
Log the unrecognized error report type value as well.

Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 12:00:28 +00:00
Somnath Kotur
5a717f4a8e bnxt_en: Add event handler for PAUSE Storm event
FW has been modified to send a new async event when it detects
a pause storm. Register for this new event and log it upon receipt.

Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-27 12:00:28 +00:00
Jakub Kicinski
6f6f0ac664 mlx5-fixes-2021-12-22
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmHD/VkACgkQSD+KveBX
 +j4AXgf/eOPLXGtPLtWI6J5tVtIRk1sX0BDRcOhvHJiQtFtRGFNpQKdwsZ0bDjos
 YSuRtqiY1SORQOuqDL41r2m68jzXpU49z3O6jD4ELojw2+rKmTC6PiNdRdNm34rl
 1bU25qYNK7rsW1EyoaW1FUp91+5+1pkzWcJwO0JY6mrCoUa2FFdwFDkb6KBRDiCZ
 JLBRKFzsqzprYIFWqBm6FyE+0vFipkMzp33tIYzgoe1/A1gWOspsTJDd2tOFVXvw
 UWOudh3xYi1+7WcFov1K4vf1ppFvPhe3JkzC47Q/qqNia8gDYcXRosGw06c05Z5p
 G8CU7D44OUJ4OLjG8oMyNhFAfpUKEg==
 =tS8n
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2021-12-22

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'
  net/mlx5e: Delete forward rule for ct or sample action
  net/mlx5e: Fix ICOSQ recovery flow for XSK
  net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow
  net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled
  net/mlx5e: Wrap the tx reporter dump callback to extract the sq
  net/mlx5: Fix tc max supported prio for nic mode
  net/mlx5: Fix SF health recovery flow
  net/mlx5: Fix error print in case of IRQ request failed
  net/mlx5: Use first online CPU instead of hard coded CPU
  net/mlx5: DR, Fix querying eswitch manager vport for ECPF
  net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources
====================

Link: https://lore.kernel.org/r/20211223190441.153012-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 19:04:33 -08:00
Jakub Kicinski
8b3f913322 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/net/sock.h
  commit 8f905c0e73 ("inet: fully convert sk->sk_rx_dst to RCU rules")
  commit 43f51df417 ("net: move early demux fields close to sk_refcnt")
  https://lore.kernel.org/all/20211222141641.0caa0ab3@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 16:09:58 -08:00
Nobuhiro Iwamatsu
391e5975c0 net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M
ETHER_CLK_SEL_FREQ_SEL_2P5M is not 0 bit of the register. This is a
value, which is 0. Fix from BIT(0) to 0.

Reported-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Fixes: b38dd98ff8 ("net: stmmac: Add Toshiba Visconti SoCs glue driver")
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Link: https://lore.kernel.org/r/20211223073633.101306-1-nobuhiro1.iwamatsu@toshiba.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:58:13 -08:00
Xiaoliang Yang
eccffcf465 net: stmmac: ptp: fix potentially overflowing expression
Convert the u32 variable to type u64 in a context where expression of
type u64 is required to avoid potential overflow.

Fixes: e9e3720002 ("net: stmmac: ptp: update tas basetime after ptp adjust")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Link: https://lore.kernel.org/r/20211223073928.37371-1-xiaoliang.yang_1@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 09:45:55 -08:00
Ong Boon Leong
e48cb313fd net: stmmac: add tc flower filter for EtherType matching
This patch adds basic support for EtherType RX frame steering for
LLDP and PTP using the hardware offload capabilities.

Example steps for setting up RX frame steering for LLDP and PTP:
$ IFDEVNAME=eth0
$ tc qdisc add dev $IFDEVNAME ingress
$ tc qdisc add dev $IFDEVNAME root mqprio num_tc 8 \
     map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \
     queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0

For LLDP
$ tc filter add dev $IFDEVNAME parent ffff: protocol 0x88cc \
     flower hw_tc 5
OR
$ tc filter add dev $IFDEVNAME parent ffff: protocol LLDP \
     flower hw_tc 5

For PTP
$ tc filter add dev $IFDEVNAME parent ffff: protocol 0x88f7 \
     flower hw_tc 6

Show tc ingress filter
$ tc filter show dev $IFDEVNAME ingress

v1->v2:
 Thanks to Kurt's and Sebastian's suggestion.
 - change from __be16 to u16 etype
 - change ETHER_TYPE_FULL_MASK to use cpu_to_be16() macro

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-23 11:20:49 +00:00
Horatiu Vultur
2e49761e4f net: lan966x: Add support for multiple bridge flags
This patch series extends the current supported bridge flags with the
following flags: BR_FLOOD, BR_BCAST_FLOOD and BR_LEARNING.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-23 11:19:06 +00:00
Christophe JAILLET
4390c6edc0 net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'
All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end to 'err_out'
where 'flow_flag_set(flow, FAILED);' is called.

All but the new error handling paths added by the commits given in the
Fixes tag below.

Fix these error handling paths and branch to 'err_out'.

Fixes: 166f431ec6 ("net/mlx5e: Add indirect tc offload of ovs internal port")
Fixes: b16eb3c81f ("net/mlx5: Support internal port as decap route device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
(cherry picked from commit 31108d142f)
2021-12-22 20:38:49 -08:00
Chris Mi
2820110d94 net/mlx5e: Delete forward rule for ct or sample action
When there is ct or sample action, the ct or sample rule will be deleted
and return. But if there is an extra mirror action, the forward rule can't
be deleted because of the return.

Fix it by removing the return.

Fixes: 69e2916ebc ("net/mlx5: CT: Add support for mirroring")
Fixes: f94d6389f6 ("net/mlx5e: TC, Add support to offload sample action")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:49 -08:00
Maxim Mikityanskiy
19c4aba2d4 net/mlx5e: Fix ICOSQ recovery flow for XSK
There are two ICOSQs per channel: one is needed for RX, and the other
for async operations (XSK TX, kTLS offload). Currently, the recovery
flow for both is the same, and async ICOSQ is mistakenly treated like
the regular ICOSQ.

This patch prevents running the regular ICOSQ recovery on async ICOSQ.
The purpose of async ICOSQ is to handle XSK wakeup requests and post
kTLS offload RX parameters, it has nothing to do with RQ and XSKRQ UMRs,
so the regular recovery sequence is not applicable here.

Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:48 -08:00
Maxim Mikityanskiy
17958d7cd7 net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow
Both regular RQ and XSKRQ use the same ICOSQ for UMRs. When doing
recovery for the ICOSQ, don't forget to deactivate XSKRQ.

XSK can be opened and closed while channels are active, so a new mutex
prevents the ICOSQ recovery from running at the same time. The ICOSQ
recovery deactivates and reactivates XSKRQ, so any parallel change in
XSK state would break consistency. As the regular RQ is running, it's
not enough to just flush the recovery work, because it can be
rescheduled.

Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:48 -08:00
Gal Pressman
a0cb909644 net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled
When TC classifier action offloads are disabled (CONFIG_MLX5_CLS_ACT in
Kconfig), the mlx5e_rep_tc_receive() function which is responsible for
passing the skb to the stack (or freeing it) is defined as a nop, and
results in leaking the skb memory. Replace the nop with a call to
napi_gro_receive() to resolve the leak.

Fixes: 28e7606fa8 ("net/mlx5e: Refactor rx handler of represetor device")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:48 -08:00
Amir Tzin
918fc3855a net/mlx5e: Wrap the tx reporter dump callback to extract the sq
Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
mlx5e_txqsq *, but in TX-timeout-recovery flow the argument is actually
of type struct mlx5e_tx_timeout_ctx *.

 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
 BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
 kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
 CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 [mlx5_core]
 Call Trace:
 mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
 devlink_health_do_dump.part.91+0x71/0xd0
 devlink_health_report+0x157/0x1b0
 mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
 ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0
 [mlx5_core]
 ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
 ? update_load_avg+0x19b/0x550
 ? set_next_entity+0x72/0x80
 ? pick_next_task_fair+0x227/0x340
 ? finish_task_switch+0xa2/0x280
   mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
   process_one_work+0x1de/0x3a0
   worker_thread+0x2d/0x3c0
 ? process_one_work+0x3a0/0x3a0
   kthread+0x115/0x130
 ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30
 --[ end trace 51ccabea504edaff ]---
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 PKRU: 55555554
 Kernel panic - not syncing: Fatal exception
 Kernel Offset: disabled
 end Kernel panic - not syncing: Fatal exception

To fix this bug add a wrapper for mlx5e_tx_reporter_dump_sq() which
extracts the sq from struct mlx5e_tx_timeout_ctx and set it as the
TX-timeout-recovery flow dump callback.

Fixes: 5f29458b77 ("net/mlx5e: Support dump callback in TX reporter")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:48 -08:00
Chris Mi
d671e109bd net/mlx5: Fix tc max supported prio for nic mode
Only prio 1 is supported if firmware doesn't support ignore flow
level for nic mode. The offending commit removed the check wrongly.
Add it back.

Fixes: 9a99c8f125 ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:47 -08:00
Moshe Shemesh
33de865f7b net/mlx5: Fix SF health recovery flow
SF do not directly control the PCI device. During recovery flow SF
should not be allowed to do pci disable or pci reset, its PF will do it.

It fixes the following kernel trace:
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:387:(pid 40948): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab175 after 1 iterations
mlx5_core.sf mlx5_core.sf.25: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.23: mlx5_health_try_recover:387:(pid 40946): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab193 after 1 iterations
mlx5_core.sf mlx5_core.sf.23: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.25: mlx5_cmd_check:813:(pid 40948): ENABLE_HCA(0x104) op_mod(0x0) failed,
status bad resource state(0x9), syndrome (0x658908)
mlx5_core.sf mlx5_core.sf.25: mlx5_function_setup:1292:(pid 40948): enable hca failed
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:389:(pid 40948): health recovery failed

Fixes: 1958fc2f07 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:47 -08:00
Shay Drory
aa968f9220 net/mlx5: Fix error print in case of IRQ request failed
In case IRQ layer failed to find or to request irq, the driver is
printing the first cpu of the provided affinity as part of the error
print. Empty affinity is a valid input for the IRQ layer, and it is
an error to call cpumask_first() on empty affinity.

Remove the first cpu print from the error message.

Fixes: c36326d38d ("net/mlx5: Round-Robin EQs over IRQs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:47 -08:00
Shay Drory
26a7993c93 net/mlx5: Use first online CPU instead of hard coded CPU
Hard coded CPU (0 in our case) might be offline. Hence, use the first
online CPU instead.

Fixes: f891b7cdbd ("net/mlx5: Enable single IRQ for PCI Function")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:46 -08:00
Yevgeny Kliteynik
624bf42c2e net/mlx5: DR, Fix querying eswitch manager vport for ECPF
On BlueField the E-Switch manager is the ECPF (vport 0xFFFE), but when
querying capabilities of ECPF eswitch manager, need to query vport 0
with other_vport = 0.

Fixes: 9091b821aa ("net/mlx5: DR, Handle eswitch manager and uplink vports separately")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:46 -08:00
Miaoqian Lin
6b8b425858 net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources
The mlx5_get_uars_page() function  returns error pointers.
Using IS_ERR() to check the return value to fix this.

Fixes: 4ec9e7b026 ("net/mlx5: DR, Expose steering domain functionality")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-22 20:38:46 -08:00
Amit Cohen
70ec72d5b6 mlxsw: spectrum_flower: Make vlan_id limitation more specific
Spectrum ASICs do not support matching of VLAN ID at egress.
Currently, mlxsw driver forbids matching of all VLAN related fields at
egress, which is too strict check.

For example, the following filter is not supported by the driver:
$ tc filter add dev swpX egress protocol 802.1q pref 1 handle 101 flower
vlan_ethtype ipv4 src_ip .. dst_ip .. skip_sw action pass
Error: mlxsw_spectrum: vlan_id key is not supported on egress.
We have an error talking to the kernel

The filter above does not match on VLAN ID, but is bounced anyway.

Make the check more specific, forbid only matching of 'vlan_id' at egress.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:32 -08:00
Jakub Kicinski
5de24da1b3 mlx5-updates-2021-12-21
1) From Shay Drory: Devlink user knobs to control device's EQ size
 
 This series provides knobs which will enable users to
 minimize memory consumption of mlx5 Functions (PF/VF/SF).
 mlx5 exposes two new generic devlink params for EQ size
 configuration and uses devlink generic param max_macs.
 
 LINK: https://lore.kernel.org/netdev/20211208141722.13646-1-shayd@nvidia.com/
 
 2) From Tariq and Lama, allocate software channel objects and statistics
   of a mlx5 netdevice private data dynamically upon first demand to save on
   memory.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmHClsoACgkQSD+KveBX
 +j5d6gf+NKH8mQd6Aa/Gt4Y2DtS7GzN+dPD+2MokuT2YSWU8kVGlMnC01MTE2V2s
 jiOUC6ZEbayx2ORzd58XlcfAMEZz2WH8VGLXTdmM3niv13D7AueSsUP5SVK7Oamk
 h0bwzOV8CE1Ru5s3Q3zPaLBqTWN+TmyG42HNwYyD8GZT7O5q8iBfor97N2KN5U5u
 rhxzFcD2jfhtYxACkjea6RTN9CTM7l9FS4+DIjhu53PAOBaOZLj8NKSYUqP6I17L
 ITJDd9za4Nq/YiB2yU4UVNkDEXsuXWoqnTNsL1oEUPForEomJsnikHV9ywzyKVMn
 SbWdZOr5C14UBH9wyyauEBBv5oLB7w==
 =cotW
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-12-21

1) From Shay Drory: Devlink user knobs to control device's EQ size

This series provides knobs which will enable users to
minimize memory consumption of mlx5 Functions (PF/VF/SF).
mlx5 exposes two new generic devlink params for EQ size
configuration and uses devlink generic param max_macs.

LINK: https://lore.kernel.org/netdev/20211208141722.13646-1-shayd@nvidia.com/

2) From Tariq and Lama, allocate software channel objects and statistics
  of a mlx5 netdevice private data dynamically upon first demand to save on
  memory.

* tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Take packet_merge params directly from the RX res struct
  net/mlx5e: Allocate per-channel stats dynamically at first usage
  net/mlx5e: Use dynamic per-channel allocations in stats
  net/mlx5e: Allow profile-specific limitation on max num of channels
  net/mlx5e: Save memory by using dynamic allocation in netdev priv
  net/mlx5e: Add profile indications for PTP and QOS HTB features
  net/mlx5e: Use bitmap field for profile features
  net/mlx5: Remove the repeated declaration
  net/mlx5: Let user configure max_macs generic param
  devlink: Clarifies max_macs generic devlink param
  net/mlx5: Let user configure event_eq_size param
  devlink: Add new "event_eq_size" generic device param
  net/mlx5: Let user configure io_eq_size param
  devlink: Add new "io_eq_size" generic device param
====================

Link: https://lore.kernel.org/r/20211222031604.14540-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:13:01 -08:00
Colin Ian King
62a3106697 net: broadcom: bcm4908enet: remove redundant variable bytes
The variable bytes is being used to summate slot lengths,
however the value is never used afterwards. The summation
is redundant so remove variable bytes.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20211222003937.727325-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 14:58:43 -08:00
Jesse Brandeburg
0092db5fac ice: trivial: fix odd indenting
Fix an odd indent where some code was left indented, and causes smatch
to warn:
ice_log_pkg_init() warn: inconsistent indenting

While here, for consistency, add a break after the default case.

This commit has a Fixes: but we caught this while it was only in net-next.

Fixes: 247dd97d71 ("ice: Refactor status flow for DDP load")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20211221230538.2546315-1-jesse.brandeburg@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 14:57:55 -08:00
Jakub Kicinski
2030eddced Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-12-21

This series contains updates to ice driver only.

Karol modifies the reset flow to correct issues with PTP reset.

Jake extends PTP support for E822 based devices. This includes a few
cleanup patches, that fix some minor issues. In addition, there are some
slight refactors to ease the addition of E822 support, followed by adding
the new hardware implementation ice_ptp_hw.c.

There are a few major differences with E822 support compared to E810
support:

*) The E822 device has a Clock Generation Unit which must be initialized in
order to generate proper clock frequencies on the output that drives the PTP
hardware clock registers

*) The E822 PHY is a bit different and requires a more complex
initialization procedure which must be rerun any time the link configuration
changes.

*) The E822 devices support enhanced timestamp calibration by making use of
a process called Vernier offset measurement. This allows the hardware to
measure phase offset related to the PHY clocks for Serdes and FEC, reducing
the inaccuracy of the timestamp relative to the actual packet transmission
and receipt. Making use of this requires data gathered from the first
transmitted and received packets, and waiting for the PHY to complete the
calibration measurements. This is done as part of a new kthread, ov_work.
Note that to avoid delay in enabling timestamps, we start the PHY in
'bypass' mode which allows timestamps to be captured without the Vernier
calibration measurement. Once the first packets have been sent and received,
we then complete the calibration setup and exit bypass mode and begin using
the more precise timestamps. According to the datasheet, timestamps without
calibration data can be incorrect relative to actual receipt or transmission
by up to 1 clock cycle (~1.25 nanoseconds), while calibrated timestamps
should be correct to within 1/8th of a clock cycle (~0.15 nanoseconds).

*) E822 devices support crosstimestamping via PCIe PTM, which we enable when
available on the platform.

There is a fair amount of logic required to perform PHY and CGU
initialization, which is the vast majority of the new code, but it is fairly
self contained within ice_ptp_hw.c, with the exception of monitoring for
offset validity being handled by a kthread.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: support crosstimestamping on E822 devices if supported
  ice: exit bypass mode once hardware finishes timestamp calibration
  ice: ensure the hardware Clock Generation Unit is configured
  ice: implement basic E822 PTP support
  ice: convert clk_freq capability into time_ref
  ice: introduce ice_ptp_init_phc function
  ice: use 'int err' instead of 'int status' in ice_ptp_hw.c
  ice: PTP: move setting of tstamp_config
  ice: introduce ice_base_incval function
  ice: Fix E810 PTP reset flow
====================

Link: https://lore.kernel.org/r/20211221174845.3063640-1-anthony.l.nguyen@intel.com
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 14:25:42 -08:00
Jiasheng Jiang
9b8bdd1eb5 sfc: falcon: Check null pointer of rx_queue->page_ring
Because of the possible failure of the kcalloc, it should be better to
set rx_queue->page_ptr_mask to 0 when it happens in order to maintain
the consistency.

Fixes: 5a6681e22c ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20211220140344.978408-1-jiasheng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 12:25:18 -08:00
Jiasheng Jiang
bdf1b5c388 sfc: Check null pointer of rx_queue->page_ring
Because of the possible failure of the kcalloc, it should be better to
set rx_queue->page_ptr_mask to 0 when it happens in order to maintain
the consistency.

Fixes: 5a6681e22c ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20211220135603.954944-1-jiasheng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 12:23:18 -08:00
David E. Box
a5f8ef0baf net/mlx5e: Use auxiliary_device driver data helpers
Use auxiliary_get_drvdata and auxiliary_set_drvdata helpers.

Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20211221235852.323752-4-david.e.box@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22 13:59:01 +01:00
Jiasheng Jiang
99d7fbb5ce net: ks8851: Check for error irq
Because platform_get_irq() could fail and return error irq.
Therefore, it might be better to check it if order to avoid the use of
error irq.

Fixes: 797047f875 ("net: ks8851: Implement Parallel bus operations")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-22 10:23:50 +00:00
Jiasheng Jiang
cb93b3e11d drivers: net: smc911x: Check for error irq
Because platform_get_irq() could fail and return error irq.
Therefore, it might be better to check it if order to avoid the use of
error irq.

Fixes: ae150435b5 ("smsc: Move the SMC (SMSC) drivers")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-22 10:23:03 +00:00
Tariq Toukan
1f08917ab9 net/mlx5e: Take packet_merge params directly from the RX res struct
As packet_merge params structure is saved on the RX resources structure, there
is no need to pass it separately.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:58 -08:00
Lama Kayal
fa691d0c9c net/mlx5e: Allocate per-channel stats dynamically at first usage
Make stats allocation per-channel dynamic on demand, at channel open
operation.

Previously the stats array was pre-allocated for the maximum possible
number of channels. Here we defer the per-channel stats instance allocation
upon its first usage, so that it's allocated only if really needed.

Allocating stats on demand helps maintain a more memory-efficient code,
as we're saving memory when the used number of channels is smaller than
the maximum.

The stats memory instances are still freed in mlx5e_priv_arrays_free(),
so that they are persistent to channels' closure.

Memory size allocated for struct mlx5e_channel_stats is 3648 bytes.
If maximum number of channel stands for 64, the total memory space
allocated for stats is 3648x64 = 228K bytes. In scenarios where the
number of channels in use is significantly smaller than maximum number,
the memory saved can be remarkable.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:58 -08:00
Tariq Toukan
be98737a4f net/mlx5e: Use dynamic per-channel allocations in stats
Make stats array an array of pointer. This patch comes in to prepare for
the next patch where allocations of the stats are to be performed
dynamically on first usage.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:57 -08:00
Tariq Toukan
473baf2e9e net/mlx5e: Allow profile-specific limitation on max num of channels
Let SF/VF representor's netdev use profile-specific limitation on
max_nch to reduce its memory and HW resources consumption.

This is particularly important for environments with limited memory
and high number of SFs.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Vu Pham <vuhuong@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:57 -08:00
Tariq Toukan
0246a57ab5 net/mlx5e: Save memory by using dynamic allocation in netdev priv
Many arrays in priv are statically allocated with a pre-defined maximum
(for num channels, num TCs, etc...), that is in some cases significantly
larger than the actual maximum. Examples:
- The more VFs are supported, the less MSIX vectors each of them could
  have. This limits the max_nch for each.
- Systems with limited number of cores or MSIX (< 64).
- Netdev profiles that do not support: QoS (DCB / HTB), PTP TX port
  timestamping.

Here we save some amount of memory by moving several structures
and arrays to follow the actual maximum instead.
This patch also prepares the code for even more savings to follow.

For example, on a system where the maximum num of channel is 8,
the channels stats structs alone go down from 3648*64 = 228 KB to
3648*8 = 28.5 KB per interface.

This is important for environments with high number of VFs/SFs or
limited memory.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:57 -08:00
Tariq Toukan
1958c2bddf net/mlx5e: Add profile indications for PTP and QOS HTB features
Let the profile indicate support of the PTP and HTB (QOS) features.
This unifies the logic that calculates the number of netdev queues needed
for the features, and allows simplification of mlx5e_create_netdev(),
which no longer requires number of rx/tx queues as parameters.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:56 -08:00
Tariq Toukan
6c72cb05d4 net/mlx5e: Use bitmap field for profile features
Use a features bitmap field in mlx5e_profile to declare profile support
state of the different features.  Let it replace the existing
rx_ptp_support boolean. It will be extended to cover more features in a
downstream patch.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:56 -08:00
Shaokun Zhang
08ab0ff47b net/mlx5: Remove the repeated declaration
Function 'mlx5_esw_vport_match_metadata_supported' and
'mlx5_esw_offloads_vport_metadata_set' are declared twice, so remove
the repeated declaration and blank line.

Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:56 -08:00
Shay Drory
8680a60fc1 net/mlx5: Let user configure max_macs generic param
Currently, max_macs is taking 70Kbytes of memory per function. This
size is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the number of max_macs.

For example, to reduce the number of max_macs to 1, execute::
$ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:55 -08:00
Shay Drory
57ca767820 net/mlx5: Let user configure event_eq_size param
Event EQ is an EQ which received the notification of almost all the
events generated by the NIC.
Currently, each event EQ is taking 512KB of memory. This size is not
needed in most use cases, and is critical with large scale. Hence,
allow user to configure the size of the event EQ.

For example to reduce event EQ size to 64, execute::
$ devlink dev param set pci/0000:00:0b.0 name event_eq_size value 64 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:55 -08:00
Shay Drory
0844fa5f7b net/mlx5: Let user configure io_eq_size param
Currently, each I/O EQ is taking 128KB of memory. This size
is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the size of I/O EQs.

For example, to reduce I/O EQ size to 64, execute:
$ devlink dev param set pci/0000:00:0b.0 name io_eq_size value 64 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:54 -08:00
Xiang wangx
37cf276df1 fm10k: Fix syntax errors in comments
Delete the redundant word 'by'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Karen Sornek
630f6edc48 igbvf: Refactor trace
Refactoring "PF still resetting" message, because previous version looked
like a bug - it informed about changes that worked as designed but might
confuse users. Changes requested to make message more user-friendly.

Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Jason Wang
890781af31 igb: remove never changed variable `ret_val'
The variable used for return status in `igb_write_xmdio_reg' function
is never changed  and this function is just need return 0. Thus, the
`ret_val' can be removed and return 0 at the end of the
`igb_write_xmdio_reg' function.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
b8773a66f6 igc: Remove obsolete define
'MII_CR_FULL_DUPLEX' define not in use. This patch comes to tidy up
 obsolete define.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
d2a66dd3fd igc: Remove obsolete mask
'IGC_CTRL_EXT_LINK_MODE_MASK' not in use. This patch comes to tidy up
obsolete define.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
2a8807a765 igc: Remove obsolete nvm type
i225 devices use only spi nvm type. This patch comes to tidy up
obsolete nvm types.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
8e153faf58 igc: Remove unused phy type
_phy_none type not in use. Clean up the code accordingly,
and get rid of the unused enum line

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
7a34cda1ee igc: Remove unused _I_PHY_ID define
_I_PHY_ID not in use. Clean up the code accordingly,
and get rid of the unused define

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Jacob Keller
13a64f0b98 ice: support crosstimestamping on E822 devices if supported
E822 devices on supported platforms can generate a cross timestamp
between the platform ART and the device time. This process allows for
very precise measurement of the difference between the PTP hardware
clock and the platform time.

This is only supported if we know the TSC frequency relative to ART, so
we do not enable this unless the boot CPU has a known TSC frequency (as
required by convert_art_ns_to_tsc).

Because PCIe PTM support is not available on all platforms, introduce
CONFIG_ICE_HWTS and make it depend on X86 where we know the support
exists.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:40 -08:00
Jacob Keller
a69f1cb62a ice: exit bypass mode once hardware finishes timestamp calibration
Once the E822 device has sent and received one packet, the hardware
computes the internal delay of the PHY using a process known as Vernier
calibration. This calibration calculates a more accurate offset for the
Tx and Rx timestamps. To make use of this offset, we need to exit the
bypass mode. This cannot be done until the PHY has completed offset
calibration, as indicated by the offset valid bits.

To handle this, introduce a kthread work item which will poll the offset
valid bits every few milliseconds seeing if it is safe to exit bypass
mode.

Once we have finished calibrating the offsets, we can program the total
Tx and Rx offset registers and turn off the bypass bit. This allows the
hardware to include the more precise vernier calibration offset, and
improves the timestamp precision.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:40 -08:00
Jacob Keller
b111ab5a11 ice: ensure the hardware Clock Generation Unit is configured
The E822 device has a Clock Generation Unit (CGU) responsible for
determining the clock frequency that drives the timers.

Ensure this function is initialized when bringing up the PTP support, so
that the clock has a known frequency.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:40 -08:00
Jacob Keller
3a7496234d ice: implement basic E822 PTP support
Implement support for the basic operations needed to enable the PTP
hardware clock on E822 devices.

This includes implementations for the various PHY access functions, as
well as the ability to start and stop the PHY timers. This is different
from the E810 device because the configuration depends on link speed, so
we cannot just start the PHYs immediately. We must wait until the link
is up to get proper values for the speed based initialization.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:37 -08:00
Jacob Keller
405efa49b5 ice: convert clk_freq capability into time_ref
Convert the clk_freq value into the associated time_ref frequency value
for E822 devices. This simplifies determining the time reference value
for the clock.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Jacob Keller
b2ee72565c ice: introduce ice_ptp_init_phc function
When we enable support for E822 devices, there are some additional
steps required to initialize the PTP hardware clock. To make this easier
to implement as device-specific behavior, refactor the register setups
in ice_ptp_init_owner to a new ice_ptp_init_phc function defined in
ice_ptp_hw.c

This function will have a common section, and an e810 specific
sub-implementation.

This will enable easily extending the functionality to cover the E822
specific setup required to initialize the hardware clock generation
unit. It also makes it clear which steps are E810 specific vs which ones
are necessary for all ice devices.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Jacob Keller
39b2810642 ice: use 'int err' instead of 'int status' in ice_ptp_hw.c
The ice_ptp_hw.c file introduced a bunch of uses of "int status" instead
of the more traditional "int err" or "int ret". These are actually
traditional Linux error codes (as opposed to the recently removed
ice_status enumeration values).

We're about to add a bunch of new functions to ice_ptp_hw.c. It's
normally preferred in the ice driver to use "int ret" or "int err" when
dealing with error code values.

Instead of making the new functions use "int status" lets just fix all
of ice_ptp_hw.c to use "int err". This will match the new functions and
ensures a consistent style across at least the PTP related files.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Jacob Keller
e59d75dd41 ice: PTP: move setting of tstamp_config
The tstamp_config structure is being set inside of
ice_ptp_cfg_timestamp, which is the function used to set Tx and
Rx timestamping during initialization.

This function is also used in order to set the PHY port timestamping
status. However, it makes sense to always set the tstamp_config directly
whenever the ice_set_tx_tstamp or ice_set_rx_tstamp functions are
called.

Move assignment of tstamp_config into the related functions and out of
ice_ptp_cfg_timestamp.

Now that we assign the timestamp mode in the relevant functions, we no
longer modify the config value in ice_set_timestamp_mode. In turn, we
no longer want to copy that config value into the PF cached structure.
Instead, this is now the source of truth for actual configuration. On
success of ice_set_timestamp_mode, copy the real configured mode back to
report it out to userspace.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Jacob Keller
78267d0c9c ice: introduce ice_base_incval function
A future change will add additional possible increment values for the
E822 device support. To handle this, we want to look up the increment
value to use instead of hard coding it to the nominal value for E810
devices. Introduce ice_base_incval as a function to get the best nominal
increment value to use.

For now, it just returns the E810 value, but will be refactored in the
future to look up the value based on the device type and configured
clock frequency.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Karol Kolacinski
4809671015 ice: Fix E810 PTP reset flow
The PF reset does not reset PHC and PHY clocks so it's unnecessary to
stop them and reinitialize after the reset.
Configuring timestamping changes the VSI fields so it needs to be
performed after VSIs are initialized, which was not done in case of a
reset.

Suggested-by: Patrick Talbert <ptalbert@redhat.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Pasi Vaananen <pvaanane@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:09:57 -08:00
Heiner Kallweit
ac8c58f5b5 igb: fix deadlock caused by taking RTNL in RPM resume path
Recent net core changes caused an issue with few Intel drivers
(reportedly igb), where taking RTNL in RPM resume path results in a
deadlock. See [0] for a bug report. I don't think the core changes
are wrong, but taking RTNL in RPM resume path isn't needed.
The Intel drivers are the only ones doing this. See [1] for a
discussion on the issue. Following patch changes the RPM resume path
to not take RTNL.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=215129
[1] https://lore.kernel.org/netdev/20211125074949.5f897431@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/t/

Fixes: bd869245a3 ("net: core: try to runtime-resume detached device in __dev_open")
Fixes: f32a213765 ("ethtool: runtime-resume netdev parent before ethtool ioctl ops")
Tested-by: Martin Stolpe <martin.stolpe@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20211220201844.2714498-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20 18:48:09 -08:00
Jeroen de Borst
1f06f7d97f gve: Correct order of processing device options
The legacy raw addressing device option was processed before the
new RDA queue format option.  This caused the supported features mask,
which is provided only on the RDA queue format option, not to be set.

This disabled jumbo-frame support when using raw adressing.

Fixes: 255489f5b3 ("gve: Add a jumbo-frame device option")
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Link: https://lore.kernel.org/r/20211220192746.2900594-1-jeroendb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20 18:47:56 -08:00
Raju Rangoju
6f60ecf233 net: amd-xgbe: Disable the CDR workaround path for Yellow Carp Devices
Yellow Carp Ethernet devices do not require
Autonegotiation CDR workaround, hence disable the same.

Co-developed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20 18:42:40 -08:00
Raju Rangoju
2d4a0b79dc net: amd-xgbe: Alter the port speed bit range
Newer generation Hardware uses the slightly different
port speed bit widths, so alter the existing port speed
bit range to extend support to the newer generation hardware
while maintaining the backward compatibility with older
generation hardware.

The previously reserved bits are now being used which
then requires the adjustment to the BIT values, e.g.:

Before:
   PORT_PROPERTY_0[22:21] - Reserved
   PORT_PROPERTY_0[26:23] - Supported Speeds

After:
   PORT_PROPERTY_0[21] - Reserved
   PORT_PROPERTY_0[26:22] - Supported Speeds

To make this backwards compatible, the existing BIT
definitions for the port speeds are incremented by one
to maintain the original position.

Co-developed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20 18:42:39 -08:00
Raju Rangoju
dbb6c58b5a net: amd-xgbe: Add Support for Yellow Carp Ethernet device
Yellow Carp Ethernet devices use the existing PCI ID but
the window settings for the indirect PCS access have been
altered. Add the check for Yellow Carp Ethernet devices to
use the new register values.

Co-developed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20 18:42:39 -08:00
Horatiu Vultur
811ba27711 net: lan966x: Extend switchdev with fdb support
Extend lan966x driver with fdb support by implementing the switchdev
calls SWITCHDEV_FDB_ADD_TO_DEVICE and SWITCHDEV_FDB_DEL_TO_DEVICE.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:44:06 +00:00
Horatiu Vultur
e14f72398d net: lan966x: Extend switchdev bridge flags
Currently allow a port to be part or not of the multicast flooding mask.
By implementing the switchdev calls SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS
and SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:44:06 +00:00
Horatiu Vultur
6d2c186afa net: lan966x: Add vlan support.
Extend the driver to support vlan filtering  by implementing the
switchdev calls SWITCHDEV_OBJ_ID_PORT_VLAN,
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:44:05 +00:00
Horatiu Vultur
cf2f60897e net: lan966x: Add support to offload the forwarding.
This patch adds basic support to offload in the HW the forwarding of the
frames. The driver registers to the switchdev callbacks and implements
the callbacks for attributes SWITCHDEV_ATTR_ID_PORT_STP_STATE and
SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME.
It is not allowed to add a lan966x port to a bridge that contains a
different interface than lan966x.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:44:05 +00:00
Horatiu Vultur
571bb516a8 net: lan966x: Remove .ndo_change_rx_flags
The function lan966x_port_change_rx_flags() was used only when
IFF_PROMISC flag was set. In that case it was setting to copy all the
frames to the CPU instead of removing any RX filters. Therefore remove
it.

Fixes: d28d6d2e37 ("net: lan966x: add port module support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:44:05 +00:00
Horatiu Vultur
25ee9561ec net: lan966x: More MAC table functionality
This patch adds support for adding/removing mac entries in the SW list
of entries and in the HW table. This is used by the bridge
functionality.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:44:05 +00:00
Horatiu Vultur
5ccd66e01c net: lan966x: add support for interrupts from analyzer
This patch adds support for handling the interrupts generated by the
analyzer. Currently, only the MAC table generates these interrupts.
The MAC table will generate an interrupt whenever it learns or forgets
an entry in the table. It is the SW responsibility figure out which
entries were added/removed.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:44:05 +00:00
Horatiu Vultur
ef14049f4d net: lan966x: Add registers that are used for switch and vlan functionality
This patch adds the registers that will be used to enable switchdev and
vlan functionality in the HW.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:44:04 +00:00
Danielle Ratson
239cdd3f4c mlxsw: core: Extend devlink health reporter with new events and parameters
Extend the devlink health reporter registered by mlxsw to report new
health events and their related parameters. These are meant to aid in
debugging of hardware / firmware issues.

Beside the test event ('MLXSW_REG_MFDE_EVENT_ID_TEST') that is triggered
following the devlink health 'test' sub-command, the new events are used
to report the triggering of asserts in firmware code
('MLXSW_REG_MFDE_EVENT_ID_FW_ASSERT') and hardware issues
('MLXSW_REG_MFDE_EVENT_ID_FATAL_CAUSE').

Each event is accompanied with a severity parameter and per-event
parameters that are meant to help root cause the detected issue.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:32:21 +00:00
Danielle Ratson
e25c060c5f mlxsw: reg: Extend MFDE register with new events and parameters
Extend the Monitoring Firmware Debug (MFDE) register with new events and
their related parameters. These events will be utilized by
devlink-health in the next patch.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:32:21 +00:00
Danielle Ratson
4bcbf50291 mlxsw: core: Convert a series of if statements to switch case
Convert a series of if statements that handle different events to a
switch case statement. Encapsulate the per-event code in different
functions to simplify the code.

This is a preparation for subsequent patches that will add more events
that need to be handled.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:32:20 +00:00
Danielle Ratson
cbbd5fff86 mlxsw: Fix naming convention of MFDE fields
Currently, the MFDE register field names are using the convention:
reg_mfde_<NAME_OF_FIELD>, and do not consider the name of the MFDE
event.

Fix the field names so they fit the more accurate convention:
reg_mfde_<NAME_OF_EVENT>_<NAME_OF_FIELD>.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:32:20 +00:00
Manish Chopra
802d4d207e bnx2x: Invalidate fastpath HSI version for VFs
Commit 0a6890b9b4 ("bnx2x: Utilize FW 7.13.15.0.")
added validation for fastpath HSI versions for different
client init which was not meant for SR-IOV VF clients, which
resulted in firmware asserts when running VF clients with
different fastpath HSI version.

This patch along with the new firmware support in patch #1
fixes this behavior in order to not validate fastpath HSI
version for the VFs.

Fixes: 0a6890b9b4 ("bnx2x: Utilize FW 7.13.15.0.")
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:29:20 +00:00
Manish Chopra
b7a49f7305 bnx2x: Utilize firmware 7.13.21.0
This new firmware addresses few important issues and enhancements
as mentioned below -

- Support direct invalidation of FP HSI Ver per function ID, required for
  invalidating FP HSI Ver prior to each VF start, as there is no VF start
- BRB hardware block parity error detection support for the driver
- Fix the FCOE underrun flow
- Fix PSOD during FCoE BFS over the NIC ports after preboot driver
- Maintains backward compatibility

This patch incorporates this new firmware 7.13.21.0 in bnx2x driver.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-20 11:29:20 +00:00
Baowen Zheng
5a9959008f flow_offload: add index to flow_action_entry structure
Add index to flow_action_entry structure and delete index from police and
gate child structure.

We make this change to offload tc action for driver to identify a tc
action.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:47 +00:00
Baowen Zheng
144d4c9e80 flow_offload: reject to offload tc actions in offload drivers
A follow-up patch will allow users to offload tc actions independent of
classifier in the software datapath.

In preparation for this, teach all drivers that support offload of the flow
tables to reject such configuration as currently none of them support it.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:47 +00:00
Jiasheng Jiang
60ec7fcfe7 qlcnic: potential dereference null pointer of rx_queue->page_ring
The return value of kcalloc() needs to be checked.
To avoid dereference of null pointer in case of the failure of alloc.
Therefore, it might be better to change the return type of
qlcnic_sriov_alloc_vlans() and return -ENOMEM when alloc fails and
return 0 the others.
Also, qlcnic_sriov_set_guest_vlan_mode() and __qlcnic_pci_sriov_enable()
should deal with the return value of qlcnic_sriov_alloc_vlans().

Fixes: 154d0c810c ("qlcnic: VLAN enhancement for 84XX adapters")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-18 12:37:12 +00:00
David S. Miller
23044d77d6 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2021-12-17

Brett Creeley says:

This patch series adds support in the iavf driver for communicating and
using VIRTCHNL_VF_OFFLOAD_VLAN_V2. The current VIRTCHNL_VF_OFFLOAD_VLAN
is very limited and covers all 802.1Q VLAN offloads and filtering with
no granularity.

The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 adds more granularity, flexibility,
and support for 802.1ad offloads and filtering. This includes the VF
negotiating which VLAN offloads/filtering it's allowed, where VLAN tags
should be inserted and/or stripped into and from descriptors, and the
supported VLAN protocols.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-18 12:22:17 +00:00
David S. Miller
aa3cc8a9e4 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-17

Maciej Fijalkowski says:

It seems that previous [0] Rx fix was not enough and there are still
issues with AF_XDP Rx ZC support in ice driver. Elza reported that for
multiple XSK sockets configured on a single netdev, some of them were
becoming dead after a while. We have spotted more things that needed to
be addressed this time. More of information can be found in particular
commit messages.

It also carries Alexandr's patch that was sent previously which was
overlapping with this set.

[0]: https://lore.kernel.org/bpf/20211129231746.2767739-1-anthony.l.nguyen@intel.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-18 12:19:15 +00:00
Dan Carpenter
ab9d0e2171 net: ethernet: mtk_eth_soc: delete some dead code
The debugfs_create_dir() function never returns NULL.  It does return
error pointers but in normal situations like this there is no need to
check for errors.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211217071037.GE26548@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:24:38 -08:00
Dan Carpenter
ddfbe18da5 net: mtk_eth_soc: delete an unneeded variable
There is already an "int err" declared at the start of the function so
re-use that instead of declaring a shadow err variable.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211217070735.GC26548@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:24:26 -08:00
Gerhard Engleder
00315e1627 tsnep: Fix s390 devm_ioremap_resource warning
The following warning is fixed with additional config dependencies:

s390-linux-ld: drivers/net/ethernet/engleder/tsnep_main.o: in function `tsnep_probe':
tsnep_main.c:(.text+0x1de6): undefined reference to `devm_ioremap_resource'

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/20211216200154.1520-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:22:57 -08:00
Yevhen Orlov
2efc2256fe net: marvell: prestera: fix incorrect structure access
In line:
	upper = info->upper_dev;
We access upper_dev field, which is related only for particular events
(e.g. event == NETDEV_CHANGEUPPER). So, this line cause invalid memory
access for another events,
when ptr is not netdev_notifier_changeupper_info.

The KASAN logs are as follows:

[   30.123165] BUG: KASAN: stack-out-of-bounds in prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera]
[   30.133336] Read of size 8 at addr ffff80000cf772b0 by task udevd/778
[   30.139866]
[   30.141398] CPU: 0 PID: 778 Comm: udevd Not tainted 5.16.0-rc3 #6
[   30.147588] Hardware name: DNI AmazonGo1 A7040 board (DT)
[   30.153056] Call trace:
[   30.155547]  dump_backtrace+0x0/0x2c0
[   30.159320]  show_stack+0x18/0x30
[   30.162729]  dump_stack_lvl+0x68/0x84
[   30.166491]  print_address_description.constprop.0+0x74/0x2b8
[   30.172346]  kasan_report+0x1e8/0x250
[   30.176102]  __asan_load8+0x98/0xe0
[   30.179682]  prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera]
[   30.186847]  prestera_netdev_event_handler+0x1b4/0x1c0 [prestera]
[   30.193313]  raw_notifier_call_chain+0x74/0xa0
[   30.197860]  call_netdevice_notifiers_info+0x68/0xc0
[   30.202924]  register_netdevice+0x3cc/0x760
[   30.207190]  register_netdev+0x24/0x50
[   30.211015]  prestera_device_register+0x8a0/0xba0 [prestera]

Fixes: 3d5048cc54 ("net: marvell: prestera: move netdev topology validation to prestera_main")
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Link: https://lore.kernel.org/r/20211216171714.11341-1-yevhen.orlov@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:19:09 -08:00
Yevhen Orlov
8b681bd7c3 net: marvell: prestera: fix incorrect return of port_find
In case, when some ports is in list and we don't find requested - we
return last iterator state and not return NULL as expected.

Fixes: 501ef3066c ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Link: https://lore.kernel.org/r/20211216170736.8851-1-yevhen.orlov@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:19:00 -08:00
Aleksander Jan Bajkowski
1488fc2045 net: lantiq_xrx200: increase buffer reservation
If the user sets a lower mtu on the CPU port than on the switch,
then DMA inserts a few more bytes into the buffer than expected.
In the worst case, it may exceed the size of the buffer. The
experiments showed that the buffer should be a multiple of the
burst length value. This patch rounds the length of the rx buffer
upwards and fixes this bug. The reservation of FCS space in the
buffer has been removed as PMAC strips the FCS.

Fixes: 998ac35801 ("net: lantiq: add support for jumbo frames")
Reported-by: Thomas Nixon <tom@tomn.co.uk>
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:18:29 -08:00
Brett Creeley
92fc508598 iavf: Restrict maximum VLAN filters for VIRTCHNL_VF_OFFLOAD_VLAN_V2
For VIRTCHNL_VF_OFFLOAD_VLAN, PF's would limit the number of VLAN
filters a VF was allowed to add. However, by the time the opcode failed,
the VLAN netdev had already been added. VIRTCHNL_VF_OFFLOAD_VLAN_V2
added the ability for a PF to tell the VF how many VLAN filters it's
allowed to add. Make changes to support that functionality.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 12:37:19 -08:00
Brett Creeley
8afadd1cd8 iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 offload enable/disable
The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows
the VF to support 802.1Q and 802.1ad VLAN insertion and stripping if
successfully negotiated via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS.
Multiple changes were needed to support this new functionality.

1. Added new aq_required flags to support any kind of VLAN stripping and
   insertion offload requests via virtchnl.

2. Added the new method iavf_set_vlan_offload_features() that's
   used during VF initialization, VF reset, and iavf_set_features() to
   set the aq_required bits based on the current VLAN offload
   configuration of the VF's netdev.

3. Added virtchnl handling for VIRTCHNL_OP_ENABLE_STRIPPING_V2,
   VIRTCHNL_OP_DISABLE_STRIPPING_V2, VIRTCHNL_OP_ENABLE_INSERTION_V2,
   and VIRTCHNL_OP_ENABLE_INSERTION_V2.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 12:37:19 -08:00
Brett Creeley
ccd219d2ea iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 hotpath
The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows
the PF to set the location of the Tx and Rx VLAN tag for insertion and
stripping offloads. In order to support this functionality a few changes
are needed.

1. Add a new method to cache the VLAN tag location based on negotiated
   capabilities for the Tx and Rx ring flags. This needs to be called in
   the initialization and reset paths.

2. Refactor the transmit hotpath to account for the new Tx ring flags.
   When IAVF_TXR_FLAGS_VLAN_LOC_L2TAG2 is set, then the driver needs to
   insert the VLAN tag in the L2TAG2 field of the transmit descriptor.
   When the IAVF_TXRX_FLAGS_VLAN_LOC_L2TAG1 is set, then the driver needs
   to use the l2tag1 field of the data descriptor (same behavior as
   before).

3. Refactor the iavf_tx_prepare_vlan_flags() function to simplify
   transmit hardware VLAN offload functionality by only depending on the
   skb_vlan_tag_present() function. This can be done because the OS
   won't request transmit offload for a VLAN unless the driver told the
   OS it's supported and enabled.

4. Refactor the receive hotpath to account for the new Rx ring flags and
   VLAN ethertypes. This requires checking the Rx ring flags and
   descriptor status bits to determine the location of the VLAN tag.
   Also, since only a single ethertype can be supported at a time, check
   the enabled netdev features before specifying a VLAN ethertype in
   __vlan_hwaccel_put_tag().

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 12:37:19 -08:00
Brett Creeley
48ccc43ecf iavf: Add support VIRTCHNL_VF_OFFLOAD_VLAN_V2 during netdev config
Based on VIRTCHNL_VF_OFFLOAD_VLAN_V2, the VF can now support more VLAN
capabilities (i.e. 802.1AD offloads and filtering). In order to
communicate these capabilities to the netdev layer, the VF needs to
parse its VLAN capabilities based on whether it was able to negotiation
VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2 or neither of
these.

In order to support this, add the following functionality:

iavf_get_netdev_vlan_hw_features() - This is used to determine the VLAN
features that the underlying hardware supports and that can be toggled
off/on based on the negotiated capabiltiies. For example, if
VIRTCHNL_VF_OFFLOAD_VLAN_V2 was negotiated, then any capability marked
with VIRTCHNL_VLAN_TOGGLE can be toggled on/off by the VF. If
VIRTCHNL_VF_OFFLOAD_VLAN was negotiated, then only VLAN insertion and/or
stripping can be toggled on/off.

iavf_get_netdev_vlan_features() - This is used to determine the VLAN
features that the underlying hardware supports and that should be
enabled by default. For example, if VIRTHCNL_VF_OFFLOAD_VLAN_V2 was
negotiated, then any supported capability that has its ethertype_init
filed set should be enabled by default. If VIRTCHNL_VF_OFFLOAD_VLAN was
negotiated, then filtering, stripping, and insertion should be enabled
by default.

Also, refactor iavf_fix_features() to take into account the new
capabilities. To do this, query all the supported features (enabled by
default and toggleable) and make sure the requested change is supported.
If VIRTCHNL_VF_OFFLOAD_VLAN_V2 is successfully negotiated, there is no
need to check VIRTCHNL_VLAN_TOGGLE here because the driver already told
the netdev layer which features can be toggled via netdev->hw_features
during iavf_process_config(), so only those features will be requested
to change.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 12:37:19 -08:00
Brett Creeley
209f2f9c71 iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation
In order to support the new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability the
VF driver needs to rework it's initialization state machine and reset
flow. This has to be done because successful negotiation of
VIRTCHNL_VF_OFFLOAD_VLAN_V2 requires the VF driver to perform a second
capability request via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS before
configuring the adapter and its netdev.

Add the VIRTCHNL_VF_OFFLOAD_VLAN_V2 bit when sending the
VIRTHCNL_OP_GET_VF_RESOURECES message. The underlying PF will either
support VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2 or
neither. Both of these offloads should never be supported together.

Based on this, add 2 new states to the initialization state machine:

__IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS
__IAVF_INIT_CONFIG_ADAPTER

The __IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS state is used to request/store
the new VLAN capabilities if and only if VIRTCHNL_VLAN_OFFLOAD_VLAN_V2
was successfully negotiated in the __IAVF_INIT_GET_RESOURCES state.

The __IAVF_INIT_CONFIG_ADAPTER state is used to configure the
adapter/netdev after the resource requests have finished. The VF will
move into this state regardless of whether it successfully negotiated
VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2.

Also, add a the new flag IAVF_FLAG_AQ_GET_OFFLOAD_VLAN_V2_CAPS and set
it during VF reset. If VIRTCHNL_VF_OFFLOAD_VLAN_V2 was successfully
negotiated then the VF will request its VLAN capabilities via
VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS during the reset. This is needed
because the PF may change/modify the VF's configuration during VF reset
(i.e. modifying the VF's port VLAN configuration).

This also, required the VF to call netdev_update_features() since its
VLAN features may change during VF reset. Make sure to call this under
rtnl_lock().

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 12:37:18 -08:00
Maciej Fijalkowski
dcbaf72aa4 ice: xsk: fix cleaned_count setting
Currently cleaned_count is initialized to ICE_DESC_UNUSED(rx_ring) and
later on during the Rx processing it is incremented per each frame that
driver consumed. This can result in excessive buffers requested from xsk
pool based on that value.

To address this, just drop cleaned_count and pass
ICE_DESC_UNUSED(rx_ring) directly as a function argument to
ice_alloc_rx_bufs_zc(). Idea is to ask for buffers as many as consumed.

Let us also call ice_alloc_rx_bufs_zc unconditionally at the end of
ice_clean_rx_irq_zc. This has been changed in that way for corresponding
ice_clean_rx_irq, but not here.

Fixes: 2d4238f556 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 11:18:21 -08:00
Maciej Fijalkowski
8bea15ab74 ice: xsk: allow empty Rx descriptors on XSK ZC data path
Commit ac6f733a7b ("ice: allow empty Rx descriptors") stated that ice
HW can produce empty descriptors that are valid and they should be
processed.

Add this support to xsk ZC path to avoid potential processing problems.

Fixes: 2d4238f556 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 11:16:55 -08:00
Maciej Fijalkowski
8b51a13c37 ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor
The descriptor that ntu is pointing at when we exit
ice_alloc_rx_bufs_zc() should not have its corresponding DD bit cleared
as descriptor is not allocated in there and it is not valid for HW
usage.

The allocation routine at the entry will fill the descriptor that ntu
points to after it was set to ntu + nb_buffs on previous call.

Even the spec says:
"The tail pointer should be set to one descriptor beyond the last empty
descriptor in host descriptor ring."

Therefore, step away from clearing the status_error0 on ntu + nb_buffs
descriptor.

Fixes: db804cfc21 ("ice: Use the xsk batched rx allocation interface")
Reported-by: Elza Mathew <elza.mathew@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 11:15:15 -08:00
Alexander Lobakin
0708b6facb ice: remove dead store on XSK hotpath
The 'if (ntu == rx_ring->count)' block in ice_alloc_rx_buffers_zc()
was previously residing in the loop, but after introducing the
batched interface it is used only to wrap-around the NTU descriptor,
thus no more need to assign 'xdp'.

Fixes: db804cfc21 ("ice: Use the xsk batched rx allocation interface")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 11:12:05 -08:00
Maciej Fijalkowski
617f3e1b58 ice: xsk: allocate separate memory for XDP SW ring
Currently, the zero-copy data path is reusing the memory region that was
initially allocated for an array of struct ice_rx_buf for its own
purposes. This is error prone as it is based on the ice_rx_buf struct
always being the same size or bigger than what the zero-copy path needs.
There can also be old values present in that array giving rise to errors
when the zero-copy path uses it.

Fix this by freeing the ice_rx_buf region and allocating a new array for
the zero-copy path that has the right length and is initialized to zero.

Fixes: 57f7f8b6bc ("ice: Use xdp_buf instead of rx_buf for xsk zero-copy")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 11:09:04 -08:00
Maciej Fijalkowski
afe8a3ba85 ice: xsk: return xsk buffers back to pool when cleaning the ring
Currently we only NULL the xdp_buff pointer in the internal SW ring but
we never give it back to the xsk buffer pool. This means that buffers
can be leaked out of the buff pool and never be used again.

Add missing xsk_buff_free() call to the routine that is supposed to
clean the entries that are left in the ring so that these buffers in the
umem can be used by other sockets.

Also, only go through the space that is actually left to be cleaned
instead of a whole ring.

Fixes: 2d4238f556 ("ice: Add support for AF_XDP")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 11:07:04 -08:00
Paul Cercueil
5ef11c56b2 r8169: Avoid misuse of pm_ptr() macro
The pm_ptr() macro should be used when the suspend and resume functions
can be compiled independently of the CONFIG_PM Kconfig option.

In the case of this driver, the suspend and resume functions are inside
a section protected by a #ifdef CONFIG_PM guard. Therefore pm_ptr()
should not be used.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17 16:04:14 +01:00
Dexuan Cui
6cc74443a7 net: mana: Add RX fencing
RX fencing allows the driver to know that any prior change to the RQs has
finished, e.g. when the RQs are disabled/enabled or the hashkey/indirection
table are changed, RX fencing is required.

Remove the previous workaround "ssleep(1)" and add the real support for
RX fencing as the PF driver supports the MANA_FENCE_RQ request now (any
old PF driver not supporting the request won't be used in production).

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/20211216001748.8751-1-decui@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 20:27:32 -08:00
Yang Li
431b9b4d97 net: vertexcom: remove unneeded semicolon
Eliminate the following coccicheck warning:
./drivers/net/ethernet/vertexcom/mse102x.c:414:2-3: Unneeded semicolon

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20211216015433.83383-1-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 20:26:25 -08:00
Yinjun Zhang
7ffd9041de nfp: flower: refine the use of circular buffer
The current use of circular buffer to manage stats_context_id is
very obscure, and it will cause problem if its element size is
not power of two. So change the use more straightforward and
scalable, and also change that for mask_id to keep consistency.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/1639618621-5857-1-git-send-email-yinjun.zhang@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 20:25:56 -08:00
Jakub Kicinski
7cd2802d74 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 16:13:19 -08:00
Thomas Gleixner
d86a6d47bc bus: fsl-mc: fsl-mc-allocator: Rework MSI handling
Storing a pointer to the MSI descriptor just to track the Linux interrupt
number is daft. Just store the interrupt number and be done with it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221815.207838579@linutronix.de
2021-12-16 22:16:41 +01:00
Florian Fainelli
8b8e6e7824 net: systemport: Add global locking for descriptor lifecycle
The descriptor list is a shared resource across all of the transmit queues, and
the locking mechanism used today only protects concurrency across a given
transmit queue between the transmit and reclaiming. This creates an opportunity
for the SYSTEMPORT hardware to work on corrupted descriptors if we have
multiple producers at once which is the case when using multiple transmit
queues.

This was particularly noticeable when using multiple flows/transmit queues and
it showed up in interesting ways in that UDP packets would get a correct UDP
header checksum being calculated over an incorrect packet length. Similarly TCP
packets would get an equally correct checksum computed by the hardware over an
incorrect packet length.

The SYSTEMPORT hardware maintains an internal descriptor list that it re-arranges
when the driver produces a new descriptor anytime it writes to the
WRITE_PORT_{HI,LO} registers, there is however some delay in the hardware to
re-organize its descriptors and it is possible that concurrent TX queues
eventually break this internal allocation scheme to the point where the
length/status part of the descriptor gets used for an incorrect data buffer.

The fix is to impose a global serialization for all TX queues in the short
section where we are writing to the WRITE_PORT_{HI,LO} registers which solves
the corruption even with multiple concurrent TX queues being used.

Fixes: 80105befdb ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20211215202450.4086240-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 08:15:31 -08:00
Jiasheng Jiang
407ecd1bd7 sfc_ef100: potential dereference of null pointer
The return value of kmalloc() needs to be checked.
To avoid use in efx_nic_update_stats() in case of the failure of alloc.

Fixes: b593b6f1b4 ("sfc_ef100: statistics gathering")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:55:32 +00:00
Volodymyr Mytnyk
604ba23090 net: prestera: flower template support
Add user template explicit support. At this moment, max
TCAM rule size is utilized for all rules, doesn't matter
which and how much flower matches are provided by user. It
means that some of TCAM space is wasted, which impacts
the number of filters that can be offloaded.

Introducing the template, allows to have more HW offloaded
filters by specifying the template explicitly.

Example:
  tc qd add dev PORT clsact
  tc chain add dev PORT ingress protocol ip \
    flower dst_ip 0.0.0.0/16
  tc filter add dev PORT ingress protocol ip \
    flower skip_sw dst_ip 1.2.3.4/16 action drop

NOTE: chain 0 is the default chain id for "tc chain" & "tc filter"
      command, so it is omitted in the example above.

This patch adds only template support for default chain 0 suppoerted
by prestera driver at this moment. Chains are not supported yet,
and will be added later.

Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:52:53 +00:00
John Keeping
0546b224cc net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup
KASAN reports an out-of-bounds read in rk_gmac_setup on the line:

	while (ops->regs[i]) {

This happens for most platforms since the regs flexible array member is
empty, so the memory after the ops structure is being read here.  It
seems that mostly this happens to contain zero anyway, so we get lucky
and everything still works.

To avoid adding redundant data to nearly all the ops structures, add a
new flag to indicate whether the regs field is valid and avoid this loop
when it is not.

Fixes: 3bb3d6b1c1 ("net: stmmac: Add RK3566/RK3568 SoC support")
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:47:48 +00:00
Tao Liu
6081ac2013 gve: Add tx|rx-coalesce-usec for DQO
Adding ethtool support for changing rx-coalesce-usec and tx-coalesce-usec
when using the DQO queue format.

Signed-off-by: Tao Liu <xliutaox@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Jordan Kim
2c9198356d gve: Add consumed counts to ethtool stats
Being able to see how many descriptors are in-use is helpful
when diagnosing certain issues.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: Jordan Kim <jrkim@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Catherine Sullivan
974365e518 gve: Implement suspend/resume/shutdown
Add support for suspend, resume and shutdown.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Willem de Bruijn
497dbb2b97 gve: Add optional metadata descriptor type GVE_TXD_MTD
Allow drivers to pass metadata along with packet data to the device.
Introduce a new metadata descriptor type

* GVE_TXD_MTD

This descriptor is optional. If present it immediate follows the
packet descriptor and precedes the segment descriptor.

This descriptor may be repeated. Multiple metadata descriptors may
follow. There are no immediate uses for this, this is for future
proofing. At present devices allow only 1 MTD descriptor.

The lower four bits of the type_flags field encode GVE_TXD_MTD.
The upper four bits of the type_flags field encodes a *sub*type.

Introduce one such metadata descriptor subtype

* GVE_MTD_SUBTYPE_PATH

This shares path information with the device for network failure
discovery and robust response:

Linux derives ipv6 flowlabel and ECMP multipath from sk->sk_txhash,
and updates this field on error with sk_rethink_txhash. Allow the host
stack to do the same. Pass the tx_hash value if set. Also communicate
whether the path hash is set, or more exactly, what its type is. Define
two common types

  GVE_MTD_PATH_HASH_NONE
  GVE_MTD_PATH_HASH_L4

Concrete examples of error conditions that are resolved are
mentioned in the commits that add sk_rethink_txhash calls. Such as
commit 7788174e87 ("tcp: change IPv6 flow-label upon receiving
spurious retransmission").

Experimental results mirror what the theory suggests: where IPv6
FlowLabel is included in path selection (e.g., LAG/ECMP), flowlabel
rotation on TCP timeout avoids the vast majority of TCP disconnects
that would otherwise have occurred during link failures in long-haul
backbones, when an alternative path is available.

Rotation can be applied to various bad connection signals, such as
timeouts and spurious retransmissions. In aggregate, such flow level
signals can help locate network issues. Define initial common states:

  GVE_MTD_PATH_STATE_DEFAULT
  GVE_MTD_PATH_STATE_TIMEOUT
  GVE_MTD_PATH_STATE_CONGESTION
  GVE_MTD_PATH_STATE_RETRANSMIT

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Catherine Sullivan
5fd07df47a gve: remove memory barrier around seqno
No longer needed after we introduced the barrier in gve_napi_poll.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Catherine Sullivan
13e7939c95 gve: Update gve_free_queue_page_list signature
The id field should be a u32 not a signed int.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:53 +00:00
Catherine Sullivan
d30baacc04 gve: Move the irq db indexes out of the ntfy block struct
Giving the device access to other kernel structs is not ideal.
Move the indexes into their own array and just keep pointers to
them in the ntfy block struct.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:53 +00:00
Jeroen de Borst
a10834a36c gve: Correct order of processing device options
The legacy raw addressing device option was processed before the
new RDA queue format option.  This caused the supported features mask,
which is provided only on the RDA queue format option, not to be set.

This disabled jumbo-frame support when using raw adressing.

Fixes: 255489f5b3 ("gve: Add a jumbo-frame device option")
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:53 +00:00
Russell King (Oracle)
d8c3669397 net: mvneta: convert to pcs_validate() and phylink_generic_validate()
Convert mvneta to validate the autoneg state for 1000base-X in the
pcs_validate() operation, rather than the MAC validate() operation.
This allows us to switch the MAC validate() to use
phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:37:14 +00:00
Russell King
c2e7d2df4a net: mvneta: convert to phylink pcs operations
An initial stab at converting mvneta to PCS operations.  There's a few
FIXMEs to be solved.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:37:14 +00:00
Russell King
5a7d895369 net: mvneta: convert to use mac_prepare()/mac_finish()
Convert mvneta to use the mac_prepare() and mac_finish() methods in
preparation to converting mvneta to split-PCS support.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:37:14 +00:00
Russell King (Oracle)
85e3e0ebdb net: mvpp2: convert to pcs_validate() and phylink_generic_validate()
Convert mvpp2 to validate the autoneg state for 1000base-X in the
pcs_validate() operation, rather than the MAC validate() operation.
This allows us to switch the MAC validate() to use
phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:37:13 +00:00
Russell King (Oracle)
cff0563223 net: mvpp2: use .mac_select_pcs() interface
Use the mac_select_pcs() method to choose between the GMAC and XLG
PCS implementations.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:37:13 +00:00
David S. Miller
6209dd778f Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-15

This series contains updates to igb, igbvf, igc and ixgbe drivers.

Karen moves checks for invalid VF MAC filters to occur earlier for
igb.

Letu Ren fixes a double free issue in igbvf probe.

Sasha fixes incorrect min value being used when calculating for max for
igc.

Robert Schlabbach adds documentation on enabling NBASE-T support for
ixgbe.

Cyril Novikov adds missing initialization of MDIO bus speed for ixgbe.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:27:12 +00:00
David S. Miller
4134c846b6 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nex
t-queue

Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-12-15

This series contains updates to ice driver only.

Jake makes changes to flash update. This includes the following:

 * a new shadow-ram region similar to NVM region but for the device shadow
   RAM contents. This is distinct from NVM region because shadow RAM is
   built up during device init and may be different from the raw NVM flash
   data.
 * refactoring of the ice_flash_pldm_image to become the main flash update
   entry point. This is simpler than having both an
   ice_devlink_flash_update and an ice_flash_pldm_image. It will make
   additions like dry-run easier in the future.
 * reducing time to read Option ROM version information.
 * adding support for firmware activation via devlink reload, when
   possible.

The major new work is the reload support, which allows activating firmware
immediately without a reboot when possible. Reload support only supports
firmware activation.

Jesse improves transmit code: utilizing newer netif_tx* API, adding some
prefetch calls, correcting expected conditions when calling ice_vsi_down(),
and utilizing __netdev_tx_sent_queue() call.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:23:34 +00:00
David S. Miller
823f7a5497 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:

====================
mlx5-next branch 2021-12-15

Hi Dave, Jakub, Jason

This pulls mlx5-next branch into net-next and rdma branches.
All patches already reviewed on both rdma and netdev mailing lists.

Please pull and let me know if there's any problem.

1) Add multiple FDB steering priorities [1]
2) Introduce HW bits needed to configure MAC list size of VF/SF.
   Required for ("net/mlx5: Memory optimizations") upcoming series [2].

[1] https://lore.kernel.org/netdev/20211201193621.9129-1-saeed@kernel.org/
[2] https://lore.kernel.org/lkml/20211208141722.13646-1-shayd@nvidia.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:22:20 +00:00
Ioana Ciornei
972ce7e380 dpaa2-eth: fix ethtool statistics
Unfortunately, with the blamed commit I also added a side effect in the
ethtool stats shown. Because I added two more fields in the per channel
structure without verifying if its size is used in any way, part of the
ethtool statistics were off by 2.
Fix this by not looking up the size of the structure but instead on a
fixed value kept in a macro.

Fixes: fc398bec03 ("net: dpaa2: add adaptive interrupt coalescing")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20211215105831.290070-1-ioana.ciornei@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-15 17:48:54 -08:00
Cyril Novikov
bf0a375055 ixgbe: set X550 MDIO speed before talking to PHY
The MDIO bus speed must be initialized before talking to the PHY the first
time in order to avoid talking to it using a speed that the PHY doesn't
support.

This fixes HW initialization error -17 (IXGBE_ERR_PHY_ADDR_INVALID) on
Denverton CPUs (a.k.a. the Atom C3000 family) on ports with a 10Gb network
plugged in. On those devices, HLREG0[MDCSPD] resets to 1, which combined
with the 10Gb network results in a 24MHz MDIO speed, which is apparently
too fast for the connected PHY. PHY register reads over MDIO bus return
garbage, leading to initialization failure.

Reproduced with Linux kernel 4.19 and 5.15-rc7. Can be reproduced using
the following setup:

* Use an Atom C3000 family system with at least one X552 LAN on the SoC
* Disable PXE or other BIOS network initialization if possible
  (the interface must not be initialized before Linux boots)
* Connect a live 10Gb Ethernet cable to an X550 port
* Power cycle (not reset, doesn't always work) the system and boot Linux
* Observe: ixgbe interfaces w/ 10GbE cables plugged in fail with error -17

Fixes: e84db72727 ("ixgbe: Introduce function to control MDIO speed")
Signed-off-by: Cyril Novikov <cnovikov@lynx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 11:24:39 -08:00
Robert Schlabbach
271225fd57 ixgbe: Document how to enable NBASE-T support
Commit a296d665ea ("ixgbe: Add ethtool support to enable 2.5 and 5.0
Gbps support") introduced suppression of the advertisement of NBASE-T
speeds by default, according to Todd Fujinaka to accommodate customers
with network switches which could not cope with advertised NBASE-T
speeds, as posted in the E1000-devel mailing list:

https://sourceforge.net/p/e1000/mailman/message/37106269/

However, the suppression was not documented at all, nor was how to
enable NBASE-T support.

Properly document the NBASE-T suppression and how to enable NBASE-T
support.

Fixes: a296d665ea ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support")
Reported-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 11:09:29 -08:00
Sasha Neftin
0182d1f3fa igc: Fix typo in i225 LTR functions
The LTR maximum value was incorrectly written using the scale from
the LTR minimum value. This would cause incorrect values to be sent,
in cases where the initial calculation lead to different min/max scales.

Fixes: 707abf0695 ("igc: Add initial LTR support")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 11:09:29 -08:00
Letu Ren
b6d335a60d igbvf: fix double free in igbvf_probe
In `igbvf_probe`, if register_netdev() fails, the program will go to
label err_hw_init, and then to label err_ioremap. In free_netdev() which
is just below label err_ioremap, there is `list_for_each_entry_safe` and
`netif_napi_del` which aims to delete all entries in `dev->napi_list`.
The program has added an entry `adapter->rx_ring->napi` which is added by
`netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has
been freed below label err_hw_init. So this a UAF.

In terms of how to patch the problem, we can refer to igbvf_remove() and
delete the entry before `adapter->rx_ring`.

The KASAN logs are as follows:

[   35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450
[   35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366
[   35.128360]
[   35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ #14
[   35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[   35.131749] Call Trace:
[   35.132199]  dump_stack_lvl+0x59/0x7b
[   35.132865]  print_address_description+0x7c/0x3b0
[   35.133707]  ? free_netdev+0x1fd/0x450
[   35.134378]  __kasan_report+0x160/0x1c0
[   35.135063]  ? free_netdev+0x1fd/0x450
[   35.135738]  kasan_report+0x4b/0x70
[   35.136367]  free_netdev+0x1fd/0x450
[   35.137006]  igbvf_probe+0x121d/0x1a10 [igbvf]
[   35.137808]  ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf]
[   35.138751]  local_pci_probe+0x13c/0x1f0
[   35.139461]  pci_device_probe+0x37e/0x6c0
[   35.165526]
[   35.165806] Allocated by task 366:
[   35.166414]  ____kasan_kmalloc+0xc4/0xf0
[   35.167117]  foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf]
[   35.168078]  igbvf_probe+0x9c5/0x1a10 [igbvf]
[   35.168866]  local_pci_probe+0x13c/0x1f0
[   35.169565]  pci_device_probe+0x37e/0x6c0
[   35.179713]
[   35.179993] Freed by task 366:
[   35.180539]  kasan_set_track+0x4c/0x80
[   35.181211]  kasan_set_free_info+0x1f/0x40
[   35.181942]  ____kasan_slab_free+0x103/0x140
[   35.182703]  kfree+0xe3/0x250
[   35.183239]  igbvf_probe+0x1173/0x1a10 [igbvf]
[   35.184040]  local_pci_probe+0x13c/0x1f0

Fixes: d4e0fe01a3 (igbvf: add new driver to support 82576 virtual functions)
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 11:09:21 -08:00
Karen Sornek
584af82154 igb: Fix removal of unicast MAC filters of VFs
Move checking condition of VF MAC filter before clearing
or adding MAC filter to VF to prevent potential blackout caused
by removal of necessary and working VF's MAC filter.

Fixes: 1b8b062a99 ("igb: add VF trust infrastructure")
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 11:08:18 -08:00
Jesse Brandeburg
9c99d099f7 ice: use modern kernel API for kick
The kernel gained a new interface for drivers to use to combine tail
bump (doorbell) and BQL updates, attempt to use those new interfaces.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:49:25 -08:00
Jesse Brandeburg
21c6e36b1e ice: tighter control over VSI_DOWN state
The driver had comments to the effect of: This flag should be set before
calling this function. While reviewing code it was found that there were
several violations of this policy, which could introduce hard to find
bugs or races.

Fix the violations of the "VSI DOWN state must be set before calling
ice_down" and make checking the state into code with a WARN_ON.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:48:26 -08:00
Jesse Brandeburg
cc14db11c8 ice: use prefetch methods
The kernel provides some prefetch mechanisms to speed up commonly
cold cache line accesses during receive processing. Since these are
software structures it helps to have these strategically placed
prefetches.

Be careful to call BQL prefetch complete only for non XDP queues.

Co-developed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:46:28 -08:00
Jesse Brandeburg
1c96c16858 ice: update to newer kernel API
Use the netif_tx_* API from netdevice.h which has simpler parameters.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:45:28 -08:00
Jacob Keller
399e27dbbd ice: support immediate firmware activation via devlink reload
The ice hardware contains an embedded chip with firmware which can be
updated using devlink flash. The firmware which runs on this chip is
referred to as the Embedded Management Processor firmware (EMP
firmware).

Activating the new firmware image currently requires that the system be
rebooted. This is not ideal as rebooting the system can cause unwanted
downtime.

In practical terms, activating the firmware does not always require a
full system reboot. In many cases it is possible to activate the EMP
firmware immediately. There are a couple of different scenarios to
cover.

 * The EMP firmware itself can be reloaded by issuing a special update
   to the device called an Embedded Management Processor reset (EMP
   reset). This reset causes the device to reset and reload the EMP
   firmware.

 * PCI configuration changes are only reloaded after a cold PCIe reset.
   Unfortunately there is no generic way to trigger this for a PCIe
   device without a system reboot.

When performing a flash update, firmware is capable of responding with
some information about the specific update requirements.

The driver updates the flash by programming a secondary inactive bank
with the contents of the new image, and then issuing a command to
request to switch the active bank starting from the next load.

The response to the final command for updating the inactive NVM flash
bank includes an indication of the minimum reset required to fully
update the device. This can be one of the following:

 * A full power on is required
 * A cold PCIe reset is required
 * An EMP reset is required

The response to the command to switch flash banks includes an indication
of whether or not the firmware will allow an EMP reset request.

For most updates, an EMP reset is sufficient to load the new EMP
firmware without issues. In some cases, this reset is not sufficient
because the PCI configuration space has changed. When this could cause
incompatibility with the new EMP image, the firmware is capable of
rejecting the EMP reset request.

Add logic to ice_fw_update.c to handle the response data flash update
AdminQ commands.

For the reset level, issue a devlink status notification informing the
user of how to complete the update with a simple suggestion like
"Activate new firmware by rebooting the system".

Cache the status of whether or not firmware will restrict the EMP reset
for use in implementing devlink reload.

Implement support for devlink reload with the "fw_activate" flag. This
allows user space to request the firmware be activated immediately.

For the .reload_down handler, we will issue a request for the EMP reset
using the appropriate firmware AdminQ command. If we know that the
firmware will not allow an EMP reset, simply exit with a suitable
netlink extended ACK message indicating that the EMP reset is not
available.

For the .reload_up handler, simply wait until the driver has finished
resetting. Logic to handle processing of an EMP reset already exists in
the driver as part of its reset and rebuild flows.

Implement support for the devlink reload interface with the
"fw_activate" action. This allows userspace to request activation of
firmware without a reboot.

Note that support for indicating the required reset and EMP reset
restriction is not supported on old versions of firmware. The driver can
determine if the two features are supported by checking the device
capabilities report. I confirmed support has existed since at least
version 5.5.2 as reported by the 'fw.mgmt' version. Support to issue the
EMP reset request has existed in all version of the EMP firmware for the
ice hardware.

Check the device capabilities report to determine whether or not the
indications are reported by the running firmware. If the reset
requirement indication is not supported, always assume a full power on
is necessary. If the reset restriction capability is not supported,
always assume the EMP reset is available.

Users can verify if the EMP reset has activated the firmware by using
the devlink info report to check that the 'running' firmware version has
updated. For example a user might do the following:

 # Check current version
 $ devlink dev info

 # Update the device
 $ devlink dev flash pci/0000:af:00.0 file firmware.bin

 # Confirm stored version updated
 $ devlink dev info

 # Reload to activate new firmware
 $ devlink dev reload pci/0000:af:00.0 action fw_activate

 # Confirm running version updated
 $ devlink dev info

Finally, this change does *not* implement basic driver-only reload
support. I did look into trying to do this. However, it requires
significant refactor of how the ice driver probes and loads everything.
The ice driver probe and allocation flows were not designed with such
a reload in mind. Refactoring the flow to support this is beyond the
scope of this change.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:40:38 -08:00
Jacob Keller
af18d8866c ice: reduce time to read Option ROM CIVD data
During probe and device reset, the ice driver reads some data from the
NVM image as part of ice_init_nvm. Part of this data includes a section
of the Option ROM which contains version information.

The function ice_get_orom_civd_data is used to locate the '$CIV' data
section of the Option ROM.

Timing of ice_probe and ice_rebuild indicate that the
ice_get_orom_civd_data function takes about 10 seconds to finish
executing.

The function locates the section by scanning the Option ROM every 512
bytes. This requires a significant number of NVM read accesses, since
the Option ROM bank is 500KB. In the worst case it would take about 1000
reads. Worse, all PFs serialize this operation during reload because of
acquiring the NVM semaphore.

The CIVD section is located at the end of the Option ROM image data.
Unfortunately, the driver has no easy method to determine the offset
manually. Practical experiments have shown that the data could be at
a variety of locations, so simply reversing the scanning order is not
sufficient to reduce the overall read time.

Instead, copy the entire contents of the Option ROM into memory. This
allows reading the data using 4Kb pages instead of 512 bytes at a time.
This reduces the total number of firmware commands by a factor of 8. In
addition, reading the whole section together at once allows better
indication to firmware of when we're "done".

Re-write ice_get_orom_civd_data to allocate virtual memory to store the
Option ROM data. Copy the entire OptionROM contents at once using
ice_read_flash_module. Finally, use this memory copy to scan for the
'$CIV' section.

This change significantly reduces the time to read the Option ROM CIVD
section from ~10 seconds down to ~1 second. This has a significant
impact on the total time to complete a driver rebuild or probe.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:38:14 -08:00
Jacob Keller
c9f7a483e4 ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image
The ice_devlink_flash_update function performs a few upfront checks and
then calls ice_flash_pldm_image.

Most if these checks make more sense in the context of code within
ice_flash_pldm_image. Merge ice_devlink_flash_update and
ice_flash_pldm_image into one function, placing it in ice_fw_update.c

Since this is still the entry point for devlink, call the function
ice_devlink_flash_update instead of ice_flash_pldm_image. This leaves a
single function which handles the devlink parameters and then initiates
a PLDM update.

With this change, the ice_devlink_flash_update function in
ice_fw_update.c becomes the main entry point for flash update. It
elimintes some unnecessary boiler plate code between the two previous
functions. The ultimate motivation for this is that it eases supporting
a dry run with the PLDM library in a future change.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:37:14 -08:00
Jacob Keller
c356eaa824 ice: move and rename ice_check_for_pending_update
The ice_devlink_flash_update function performs a few checks and then
calls ice_flash_pldm_image. One of these checks is to call
ice_check_for_pending_update. This function checks if the device has
a pending update, and cancels it if so. This is necessary to allow
a new flash update to proceed.

We want to refactor the ice code to eliminate ice_devlink_flash_update,
moving its checks into ice_flash_pldm_image.

To do this, ice_check_for_pending_update will become static, and only
called by ice_flash_pldm_image. To make this change easier to review,
first just move the function up within the ice_fw_update.c file.

While at it, note that the function has a misleading name. Its primary
action is to cancel a pending update. Using the verb "check" does not
imply this. Rename it to ice_cancel_pending_update.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:36:10 -08:00
Jacob Keller
78ad87da99 ice: devlink: add shadow-ram region to snapshot Shadow RAM
We have a region for reading the contents of the NVM flash as
a snapshot. This region does not allow reading the Shadow RAM, as it
always passes the FLASH_ONLY bit to the low level firmware interface.

Add a separate shadow-ram region which will allow snapshot of the
current contents of the Shadow RAM. This data is built from the NVM
contents but is distinct as the device builds up the Shadow RAM during
initialization, so being able to snapshot its contents can be useful
when attempting to debug flash related issues.

Fix the comment description of the nvm-flash region which incorrectly
stated that it filled the shadow-ram region, and add a comment
explaining that the nvm-flash region does not actually read the Shadow
RAM.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-15 08:34:54 -08:00
Amit Cohen
06c08f869c mlxsw: Add support for VxLAN with IPv6 underlay
Currently, mlxsw driver supports VxLAN with IPv4 underlay only.
Add support for IPv6 underlay.

The main differences are:

* Learning is not supported for IPv6 FDB entries, use static entries and
  do not allow 'learning' flag for IPv6 VxLAN.

* IPv6 addresses for FDB entries should be saved as part of KVDL.
  Use the new API to allocate and release entries for IPv6 addresses.

* Spectrum ASICs do not fill UDP checksum, while in software IPv6 UDP
  packets with checksum zero are dropped.
  Force the relevant flags which allow the VxLAN device to generate UDP
  packets with zero checksum and also receive them.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 15:05:44 +00:00
Amit Cohen
0860c76416 mlxsw: spectrum_nve: Keep track of IPv6 addresses used by FDB entries
FDB entries that perform VxLAN encapsulation with an IPv6 underlay hold
a reference on a resource. Namely, the KVDL entry where the IPv6
underlay destination IP is stored. When such an FDB entry is deleted, it
needs to drop the reference from the corresponding KVDL entry.

To that end, maintain a hash table that maps an FDB entry (i.e., {MAC,
FID}) to the IPv6 address used by it.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 15:05:44 +00:00
Amit Cohen
4b08c3e676 mlxsw: reg: Add a function to fill IPv6 unicast FDB entries
Add a function to fill IPv6 unicast FDB entries. Use the common function
for common fields.

Unlike IPv4 entries, the underlay IP address is not filled in the
register payload, but instead a pointer to KVDL is used.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 15:05:44 +00:00
Amit Cohen
1fd85416e3 mlxsw: Split handling of FDB tunnel entries between address families
Currently, the function which adds/removes unicast tunnel FDB entries is
shared between IPv4 and IPv6, while for IPv6 it warns because there is
no support for it.

The code for IPv6 will be more complicated because it needs to
allocate/release a KVDL pointer for the underlay IPv6 address.

As a preparation for IPv6 underlay support, split the code according to
address family.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 15:05:44 +00:00
Amit Cohen
720d683cbe mlxsw: spectrum_nve_vxlan: Make VxLAN flags check per address family
As part of 'can_offload' checks, there is a check of VxLAN flags.

The supported flags for IPv6 VxLAN will be different from the existing
flags because of some limitations.

As preparation for IPv6 underlay support, make this check per address
family.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 15:05:44 +00:00
Amit Cohen
cf42911523 mlxsw: spectrum_ipip: Use common hash table for IPv6 address mapping
Use the common hash table introduced by the previous patch instead of
the IP-in-IP specific implementation.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 15:05:43 +00:00
Amit Cohen
e846efe273 mlxsw: spectrum: Add hash table for IPv6 address mapping
The device supports forwarding entries such as routes and FDBs that
perform tunnel (e.g., VXLAN, IP-in-IP) encapsulation or decapsulation.
When the underlay is IPv6, these entries do not encode the 128 bit IPv6
address used for encapsulation / decapsulation. Instead, these entries
encode a 24 bit pointer to an array called KVDL where the IPv6 address
is stored.

Currently, only IP-in-IP with IPv6 underlay is supported, but subsequent
patches will add support for VxLAN with IPv6 underlay. To avoid
duplicating the logic required to store and retrieve these IPv6
addresses, introduce a hash table that will store the mapping between
IPv6 addresses and their KVDL index.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 15:05:43 +00:00
David S. Miller
f71f1bcbd8 mlx5-updates-2021-12-14
Parsing Infrastructure for TC actions:
 
 The series introduce a TC action infrastructure to help
 parsing TC actions in a generic way for both FDB and NIC rules.
 
 To help maintain the parsing code of TC actions, we the parsing code to
 action parser per action TC type in separate files, instead of having one
 big switch case loop, duplicated between FDB and NIC parsers as before this
 patchset.
 
 Each TC flow_action->id is represented by a dedicated mlx5e_tc_act handler
 which has callbacks to check if the specific action is offload supported and
 to parse the specific action.
 
 We move each case (TC action) handling into the specific handler, which is
 responsible for parsing and determining if the action is supported.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmG5fUoACgkQSD+KveBX
 +j6FXwgAuth9IVE/9N/KRxlTmdG2MqHF4fXFFGtQgZ+f1a7ViwsyEXtxGb4mISzF
 EVF14etoyAuvHSFZDhD/8uqxwAKe+kGywT6BVzYKHHeRQbPRdUulOQ4AEa/CmJ6C
 fJF5d3I2ktoSkGIn1L9sOLJQJ1bWy+qpohBkkW0q0fdW1kjb2QPb0hXtIQA0gM2J
 zXZVPt0yHg21Px3stYn3HSUCdxjY9CweXZRsP5uJ7eMmDxCp7qb3xFXzzExjI9zF
 d+3QM3rRj9GBAGJHGTMYrpRVSeVPwdJL2WgA1YTO7qNXHTtewGjizHtEIKTyxBRG
 3e9HZjA4ZuwzZDlCegisw/WCE/dUIg==
 =6jKv
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saed Mahameed says:

====================
mlx5-updates-2021-12-14

Parsing Infrastructure for TC actions:

The series introduce a TC action infrastructure to help
parsing TC actions in a generic way for both FDB and NIC rules.

To help maintain the parsing code of TC actions, we the parsing code to
action parser per action TC type in separate files, instead of having one
big switch case loop, duplicated between FDB and NIC parsers as before this
patchset.

Each TC flow_action->id is represented by a dedicated mlx5e_tc_act handler
which has callbacks to check if the specific action is offload supported and
to parse the specific action.

We move each case (TC action) handling into the specific handler, which is
responsible for parsing and determining if the action is supported.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 14:46:33 +00:00
Jason Gunthorpe
c8f476da84 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
Currently, the driver ignores the user's priority for flow steering rules
in FDB namespace. Change it and create the rule in the right priority.

It will allow to create FDB steering rules in up to 16 different
priorities.
====================

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

* mellanox/mlx5-next:
  RDMA/mlx5: Add support to multiple priorities for FDB rules
  net/mlx5: Create more priorities for FDB bypass namespace
  net/mlx5: Refactor mlx5_get_flow_namespace
  net/mlx5: Separate FDB namespace
2021-12-15 08:15:53 -04:00
David S. Miller
5a21bf5bb4 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-12-14

This series contains updates to ice driver only.

Haiyue adds support to query hardware for supported PTYPEs.

Jeff changes PTYPE validation to utilize the capabilities queried from
the hardware instead of maintaining a per DDP support list.

Brett refactors promiscuous functions to provide common and clear
interfaces to call for configuration.

Wojciech modifies DDP package load to simplify determining the final
state of the load.

Tony removes the use of ice_status from the driver. This involves
removing string conversion functions, converting variables and values to
standard errors, and clean up. He also removes an unused define.

Dan Carpenter removes unneeded casts.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 10:59:57 +00:00
Joakim Zhang
0b6f65c707 net: fec: fix system hang during suspend/resume
1. During normal suspend (WoL not enabled) process, system has posibility
to hang. The root cause is TXF interrupt coming after clocks disabled,
system hang when accessing registers from interrupt handler. To fix this
issue, disable all interrupts when system suspend.

2. System also has posibility to hang with WoL enabled during suspend,
after entering stop mode, then magic pattern coming after clocks
disabled, system will be waked up, and interrupt handler will be called,
system hang when access registers. To fix this issue, disable wakeup
irq in .suspend(), and enable it in .resume().

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 10:30:25 +00:00
Clément Léger
8438699512 net: ocelot: add support to get port mac from device-tree
Add support to get mac from device-tree using of_get_ethdev_address.

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 10:29:41 +00:00
Conley Lee
3899c928bc sun4i-emac.c: remove unnecessary branch
According to the current implementation of emac_rx, every arrived packet
will be processed in the while loop. So, there is no remain packet last
time. The skb_last field and this branch for dealing with it is
unnecessary.

Signed-off-by: Conley Lee <conleylee@foxmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 10:29:03 +00:00
Roi Dayan
35bb524214 net/mlx5e: Move goto action checks into tc_action goto post parse op
Move goto action checks from parse nic/fdb funcs into the tc action
infra goto post parse op.
While moving this part also use NL_SET_ERR_MSG_MOD() instead of
NL_SET_ERR_MSG().

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:46 -08:00
Roi Dayan
c22080352e net/mlx5e: Move vlan action chunk into tc action vlan post parse op
Move vlan prio tag rewrite handling into tc action infra vlan post parse op.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:46 -08:00
Roi Dayan
dd5ab6d115 net/mlx5e: Add post_parse() op to tc action infrastructure
The post_parse() op should be called after the parse op was called
for all actions. It could be an action state is dependent on other
actions. In the new op an action can fail the parse if the state
is not valid anymore.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:45 -08:00
Roi Dayan
6bcba1bded net/mlx5e: Move sample attr allocation to tc_action sample parse op
There is no reason to wait with the kmalloc to after parsing all
other actions. There could still be a failure later and before
offloading the rule. So alloc the mem when parsing.
The memory is being released on mlx5e_flow_put() which is called
also on error flow.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:45 -08:00
Roi Dayan
8333d53e3f net/mlx5e: TC action parsing loop
Introduce a common function to implement the generic parsing loop.
The same function can be used for parsing NIC and FDB (Switchdev mode) flows.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:45 -08:00
Roi Dayan
922d69ed96 net/mlx5e: Add redirect ingress to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:44 -08:00
Roi Dayan
3929ff583d net/mlx5e: Add sample and ptype to tc_action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:44 -08:00
Roi Dayan
758bc13422 net/mlx5e: Add ct to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:44 -08:00
Roi Dayan
ab3f3d5eff net/mlx5e: Add mirred/redirect to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:43 -08:00
Roi Dayan
163b766f56 net/mlx5e: Add mpls push/pop to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:43 -08:00
Roi Dayan
8ee7263834 net/mlx5e: Add vlan push/pop/mangle to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:43 -08:00
Roi Dayan
e36db1ee7a net/mlx5e: Add pedit to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:42 -08:00
Roi Dayan
9ca1bb2cf6 net/mlx5e: Add csum to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:42 -08:00
Roi Dayan
c65686d79c net/mlx5e: Add tunnel encap/decap to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:42 -08:00
Roi Dayan
67d62ee7f4 net/mlx5e: Add goto to tc action infra
Add parsing support by implementing struct mlx5e_tc_act
for this action.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:41 -08:00
Roi Dayan
fad5479069 net/mlx5e: Add tc action infrastructure
Add an infrastructure to help parsing tc actions in a generic way.

Supporting an action parser means implementing struct mlx5e_tc_act
for that action.

The infrastructure will give the possibility to be generic when parsing tc
actions, i.e. parse_tc_nic_actions() and parse_tc_fdb_actions().
To parse tc actions a user needs to allocate a parse_state instance
and pass it when iterating over the tc actions parsers.
If a parser doesn't exists then a user can treat it as unsupported.

To add an action parser a user needs to implement two callbacks.
The can_offload() callback to quickly check if an action can be offloaded.
The parse_action() callback to do actual parsing and prepare for offload.

Add implementation for drop, trap, mark and accept action parsers with this
commit to act as examples and implement usage of the new infrastructure for
those actions.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-14 21:29:41 -08:00
Kees Cook
c2ed5611af iw_cxgb4: Use memset_startat() for cpl_t5_pass_accept_rpl
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Use memset_startat() so memset() doesn't get confused about writing beyond
the destination member that is intended to be the starting point of
zeroing through the end of the struct. Additionally, since everything
appears to perform a roundup (including allocation), just change the size
of the struct itself and add a build-time check to validate the expected
size.

Link: https://lore.kernel.org/r/20211213223331.135412-13-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-14 20:21:22 -04:00
Karol Kolacinski
37e738b6fd ice: Don't put stale timestamps in the skb
The driver has to check if it does not accidentally put the timestamp in
the SKB before previous timestamp gets overwritten.
Timestamp values in the PHY are read only and do not get cleared except
at hardware reset or when a new timestamp value is captured.
The cached_tstamp field is used to detect the case where a new timestamp
has not yet been captured, ensuring that we avoid sending stale
timestamp data to the stack.

Fixes: ea9b847cda ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-14 11:31:47 -08:00
Karol Kolacinski
0013881c11 ice: Use div64_u64 instead of div_u64 in adjfine
Change the division in ice_ptp_adjfine from div_u64 to div64_u64.
div_u64 is used when the divisor is 32 bit but in this case incval is
64 bit and it caused incorrect calculations and incval adjustments.

Fixes: 06c16d89d2 ("ice: register 1588 PTP clock device object for E810 devices")
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-14 11:29:34 -08:00
Tony Nguyen
f8a3bcceb4 ice: Remove unused ICE_FLOW_SEG_HDRS_L2_MASK
Remove the unused define ICE_FLOW_SEG_HDRS_L2_MASK.

Reported-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
2021-12-14 10:19:14 -08:00
Dan Carpenter
e53a80835f ice: Remove unnecessary casts
The "bitmap" variable is already an unsigned long so there is no need
for this cast.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-14 10:19:14 -08:00
Tony Nguyen
c14846914e ice: Propagate error codes
As all functions now return standard error codes, propagate the values
being returned instead of converting them to generic values.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
2021-12-14 10:19:14 -08:00
Tony Nguyen
2ccc1c1ccc ice: Remove excess error variables
ice_status previously had a variable to contain these values where other
error codes had a variable as well. With ice_status now being an int,
there is no need for two variables to hold error values. In cases where
this occurs, remove one of the excess variables and use a single one.
Some initialization of variables are no longer needed and have been
removed.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
2021-12-14 10:19:13 -08:00
Tony Nguyen
5518ac2a64 ice: Cleanup after ice_status removal
Clean up code after changing ice_status to int. Rearrange to fix reverse
Christmas tree and pull lines up where applicable.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
2021-12-14 10:19:13 -08:00
Tony Nguyen
d54699e27d ice: Remove enum ice_status
Replace uses of ice_status to, as equivalent as possible, error codes.
Remove enum ice_status and its helper conversion function as they are no
longer needed.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
2021-12-14 10:19:13 -08:00
Tony Nguyen
5e24d5984c ice: Use int for ice_status
To prepare for removal of ice_status, change the variables from
ice_status to int. This eases the transition when values are changed to
return standard int error codes over enum ice_status.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
2021-12-14 10:19:13 -08:00
Tony Nguyen
5f87ec4861 ice: Remove string printing for ice_status
Remove the ice_stat_str() function which prints the string
representation of the ice_status error code. With upcoming changes
moving away from ice_status, there will be no need for this function.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
2021-12-14 10:19:13 -08:00
Wojciech Drewek
247dd97d71 ice: Refactor status flow for DDP load
Before this change, final state of the DDP pkg load process was
dependent on many variables such as: ice_status, pkg version,
ice_aq_err. The last one had be stored in hw->pkg_dwnld_status.
It was impossible to conclude this state just from ice_status, that's
why logging process of DDP pkg load in the caller was a little bit
complicated.

With this patch new status enum is introduced - ice_ddp_state.
It covers all the possible final states of the loading process.
What's tricky for ice_ddp_state is that not only
ICE_DDP_PKG_SUCCESS(=0) means that load was successful. Actually
three states mean that:
 - ICE_DDP_PKG_SUCCESS
 - ICE_DDP_PKG_SAME_VERSION_ALREADY_LOADED
 - ICE_DDP_PKG_COMPATIBLE_ALREADY_LOADED
ice_is_init_pkg_successful can tell that information.

One ddp_state should not be used outside of ice_init_pkg which is
ICE_DDP_PKG_ALREADY_LOADED. It is more generic, it is used in
ice_dwnld_cfg_bufs to see if pkg is already loaded. At this point
we can't use one of the specific one (SAME_VERSION, COMPATIBLE,
NOT_SUPPORTED) because we don't have information on the package
currently loaded in HW (we are before calling ice_get_pkg_info).

We can get rid of hw->pkg_dwnld_status because we are immediately
mapping aq errors to ice_ddp_state in ice_dwnld_cfg_bufs.

Other errors like ICE_ERR_NO_MEMORY, ICE_ERR_PARAM are mapped the
generic ICE_DDP_PKG_ERR.

Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-14 10:19:13 -08:00
Brett Creeley
fabf480bf9 ice: Refactor promiscuous functions
Some of the promiscuous mode functions take a boolean to indicate
set/clear, which affects readability. Refactor and provide an
interface for the promiscuous mode code with explicit set and clear
promiscuous mode operations.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-14 10:18:52 -08:00
Jeff Guo
60f44fe4cd ice: refactor PTYPE validating
Since the capability of a PTYPE within a specific package could be
negotiated by checking the HW bit map, it means that there's no need
to maintain a different PTYPE list for each type of the package when
parsing PTYPE. So refactor the PTYPE validating mechanism.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-14 08:06:47 -08:00
Haiyue Wang
8818b95409 ice: Add package PTYPE enable information
Scan the 'Marker Ptype TCAM' section to retrieve the Rx parser PTYPE
enable information from the current package.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-14 08:06:47 -08:00
Dany Madden
fe4c82a7e0 ibmvnic: remove unused defines
IBMVNIC_STATS_TIMEOUT and IBMVNIC_INIT_FAILED are not used in the driver.
Remove them.

Suggested-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:58:02 +00:00
Dany Madden
b6ee566cf3 ibmvnic: Update driver return codes
Update return codes to be more informative.

Signed-off-by: Jacob Root <otis@otisroot.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:58:02 +00:00
Danielle Ratson
b442f2ea84 mlxsw: spectrum_router: Consolidate MAC profiles when possible
Currently, when setting a router interface (RIF) MAC address while the
MAC profile is not shared with other RIFs, the profile is edited so that
the new MAC address is assigned to it.

This does not take into account a situation in which the new MAC address
already matches an existing MAC profile. In that situation, two MAC
profiles will be occupied even though they hold MAC addresses from the
same profile.

In order to prevent that, add a check to ensure that editing a MAC
profile takes place only when the new MAC address does not match an
existing profile.

Fixes: 605d25cd78 ("mlxsw: spectrum_router: Add RIF MAC profiles support")
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:56:10 +00:00
David S. Miller
a41c4d96ae Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-13

This series contains updates to iavf driver only.

Dan Carpenter fixes some missing mutex unlocking.

Stefan Assmann restores stopping watchdog from overriding to reset state.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:41:05 +00:00
Ong Boon Leong
aeb7c75cb7 net: stmmac: fix tc flower deletion for VLAN priority Rx steering
To replicate the issue:-

1) Add 1 flower filter for VLAN Priority based frame steering:-
$ IFDEVNAME=eth0
$ tc qdisc add dev $IFDEVNAME ingress
$ tc qdisc add dev $IFDEVNAME root mqprio num_tc 8 \
   map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \
   queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc filter add dev $IFDEVNAME parent ffff: protocol 802.1Q \
   flower vlan_prio 0 hw_tc 0

2) Get the 'pref' id
$ tc filter show dev $IFDEVNAME ingress

3) Delete a specific tc flower record (say pref 49151)
$ tc filter del dev $IFDEVNAME parent ffff: pref 49151

From dmesg, we will observe kernel NULL pointer ooops

[  197.170464] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  197.171367] #PF: supervisor read access in kernel mode
[  197.171367] #PF: error_code(0x0000) - not-present page
[  197.171367] PGD 0 P4D 0
[  197.171367] Oops: 0000 [#1] PREEMPT SMP NOPTI

<snip>

[  197.171367] RIP: 0010:tc_setup_cls+0x20b/0x4a0 [stmmac]

<snip>

[  197.171367] Call Trace:
[  197.171367]  <TASK>
[  197.171367]  ? __stmmac_disable_all_queues+0xa8/0xe0 [stmmac]
[  197.171367]  stmmac_setup_tc_block_cb+0x70/0x110 [stmmac]
[  197.171367]  tc_setup_cb_destroy+0xb3/0x180
[  197.171367]  fl_hw_destroy_filter+0x94/0xc0 [cls_flower]

The above issue is due to previous incorrect implementation of
tc_del_vlan_flow(), shown below, that uses flow_cls_offload_flow_rule()
to get struct flow_rule *rule which is no longer valid for tc filter
delete operation.

  struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
  struct flow_dissector *dissector = rule->match.dissector;

So, to ensure tc_del_vlan_flow() deletes the right VLAN cls record for
earlier configured RX queue (configured by hw_tc) in tc_add_vlan_flow(),
this patch introduces stmmac_rfs_entry as driver-side flow_cls_offload
record for 'RX frame steering' tc flower, currently used for VLAN
priority. The implementation has taken consideration for future extension
to include other type RX frame steering such as EtherType based.

v2:
 - Clean up overly extensive backtrace and rewrite git message to better
   explain the kernel NULL pointer issue.

Fixes: 0e039f5cf8 ("net: stmmac: add RX frame steering based on VLAN priority in tc flower")
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:32:19 +00:00
Hangbin Liu
9c9211a3fc net_tstamp: add new flag HWTSTAMP_FLAG_BONDED_PHC_INDEX
Since commit 94dd016ae5 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP
ioctl to active device") the user could get bond active interface's
PHC index directly. But when there is a failover, the bond active
interface will change, thus the PHC index is also changed. This may
break the user's program if they did not update the PHC timely.

This patch adds a new hwtstamp_config flag HWTSTAMP_FLAG_BONDED_PHC_INDEX.
When the user wants to get the bond active interface's PHC, they need to
add this flag and be aware the PHC index may be changed.

With the new flag. All flag checks in current drivers are removed. Only
the checking in net_hwtstamp_validate() is kept.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:28:24 +00:00
Maor Gottlieb
c7d5fa105b net/mlx5: Create more priorities for FDB bypass namespace
Create 16 flow steering priorities for FDB bypass users.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-13 16:03:00 -08:00
Maor Gottlieb
4588fed7be net/mlx5: Refactor mlx5_get_flow_namespace
Have all the namespace type check in the same switch case.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-13 16:02:59 -08:00
Maor Gottlieb
22c3f2f56b net/mlx5: Separate FDB namespace
This patch doesn't add an additional namespaces, but just separates the
naming to be used by each FDB user, bypass and kernel.
Downstream patches will actually split this up and allow to have more
than single priority for the bypass users.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-13 16:02:59 -08:00
Maciej Fijalkowski
d27a662290 xsk: Wipe out dead zero_copy_allocator declarations
zero_copy_allocator has been removed back when Bjorn Topel introduced
xsk_buff_pool. Remove references to it that were dangling in the tree.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20211210171511.11574-1-maciej.fijalkowski@intel.com
2021-12-14 00:24:24 +01:00
Paolo Abeni
c8064e5b4a bpf: Let bpf_warn_invalid_xdp_action() report more info
In non trivial scenarios, the action id alone is not sufficient to
identify the program causing the warning. Before the previous patch,
the generated stack-trace pointed out at least the involved device
driver.

Let's additionally include the program name and id, and the relevant
device name.

If the user needs additional infos, he can fetch them via a kernel
probe, leveraging the arguments added here.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/ddb96bb975cbfddb1546cf5da60e77d5100b533c.1638189075.git.pabeni@redhat.com
2021-12-13 22:28:27 +01:00
Stefan Assmann
fe523d7c9a iavf: do not override the adapter state in the watchdog task (again)
The watchdog task incorrectly changes the state to __IAVF_RESETTING,
instead of letting the reset task take care of that. This was already
resolved by commit 22c8fd71d3 ("iavf: do not override the adapter
state in the watchdog task") but the problem was reintroduced by the
recent code refactoring in commit 45eebd6299 ("iavf: Refactor iavf
state machine tracking").

Fixes: 45eebd6299 ("iavf: Refactor iavf state machine tracking")
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-13 10:47:03 -08:00
Dan Carpenter
bc2f39a625 iavf: missing unlocks in iavf_watchdog_task()
This code was re-organized and there some unlocks missing now.

Fixes: 898ef1cb1c ("iavf: Combine init and watchdog state machines")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-13 09:19:22 -08:00
Lorenzo Bianconi
a3c62a0422 net: mtk_eth: add COMPILE_TEST support
Improve the build testing of mtk_eth drivers by enabling them when
COMPILE_TEST is selected. Moreover COMPILE_TEST will be useful
for the driver development.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 15:07:40 +00:00
David Wu
884d2b8454 net: stmmac: Add GFP_DMA32 for rx buffers if no 64 capability
Use page_pool_alloc_pages instead of page_pool_dev_alloc_pages, which
can give the gfp parameter, in the case of not supporting 64-bit width,
using 32-bit address memory can reduce a copy from swiotlb.

Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 15:06:07 +00:00
Wang Qing
be565ec71d net: ethernet: ti: add missing of_node_put before return
Fix following coccicheck warning:
WARNING: Function "for_each_child_of_node"
should have of_node_put() before return.

Early exits from for_each_child_of_node should decrement the
node reference counter.

Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:54:54 +00:00
Clément Léger
3cfcda2aee net: ocelot: use dma_unmap_addr to get tx buffer dma_addr
dma_addr was declared using DEFINE_DMA_UNMAP_ADDR() which requires to
use dma_unmap_addr() to access it.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 753a026cfe ("net: ocelot: add FDMA support")
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:51:21 +00:00
Horatiu Vultur
b26980ab2a net: lan966x: Fix the configuration of the pcs
When inserting a SFP that runs at 2.5G, then the Serdes was still
configured to run at 1G. Because the config->speed was 0, and then the
speed of the serdes was not configured at all, it was using the default
value which is 1G. This patch stop calling the serdes function set_speed
and allow the serdes to figure out the speed based on the interface
type.

Fixes: d28d6d2e37 ("net: lan966x: add port module support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:39:41 +00:00
Miaoqian Lin
ab8eb798dd net: bcmgenet: Fix NULL vs IS_ERR() checking
The phy_attach() function does not return NULL. It returns error pointers.

Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:32:08 +00:00
Stefan Wahren
2f207cbf0d net: vertexcom: Add MSE102x SPI support
This implements an SPI protocol driver for Vertexcom MSE102x
Homeplug GreenPHY chip.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:15:41 +00:00
Russell King (Oracle)
2106be4fdf net: mvneta: mark as a legacy_pre_march2020 driver
mvneta provides mac_an_restart and mac_pcs_get_state methods, so needs
to be marked as a legacy driver. Marek spotted that mvneta had stopped
working in 2500base-X mode - thanks for reporting.

Reported-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:07:09 +00:00
Russell King (Oracle)
62cc9a7387 net: axienet: mark as a legacy_pre_march2020 driver
axienet has a PCS, but does not make use of the phylink PCS support.
Mark it was a pre-March 2020 driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:07:09 +00:00
Xiaoliang Yang
3a6c12a0c6 net: stmmac: bump tc when get underflow error from DMA descriptor
In DMA threshold mode, frame underflow errors may sometimes occur when
the TC(threshold control) value is not enough. The TC value need to be
bumped up in this case.

There is no underflow interrupt bit on DMA_CH(#i)_Status of dwmac4, so
the DMA threshold cannot be bumped up in stmmac_dma_interrupt(). The
i.mx8mp board observed an underflow error while running NFS boot, the
NFS rootfs could not be mounted.

The underflow error can be got from the DMA descriptor TDES3 on dwmac4.
This patch bump up tc value once underflow error is got from TDES3.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 12:20:35 +00:00
Yufeng Mo
6dde452bce net: hns3: fix race condition in debugfs
When multiple threads concurrently access the debugfs content, data
and pointer exceptions may occur. Therefore, mutex lock protection is
added for debugfs.

Fixes: 5e69ea7ee2 ("net: hns3: refactor the debugfs process")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 16:20:50 +00:00
Jie Wang
27cbf64a76 net: hns3: fix use-after-free bug in hclgevf_send_mbx_msg
Currently, the hns3_remove function firstly uninstall client instance,
and then uninstall acceletion engine device. The netdevice is freed in
client instance uninstall process, but acceletion engine device uninstall
process still use it to trace runtime information. This causes a use after
free problem.

So fixes it by check the instance register state to avoid use after free.

Fixes: d8355240cf ("net: hns3: add trace event support for PF/VF mailbox")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 16:20:50 +00:00
Clément Léger
753a026cfe net: ocelot: add FDMA support
Ethernet frames can be extracted or injected autonomously to or from
the device’s DDR3/DDR3L memory and/or PCIe memory space. Linked list
data structures in memory are used for injecting or extracting Ethernet
frames. The FDMA generates interrupts when frame extraction or
injection is done and when the linked lists need updating.

The FDMA is shared between all the ethernet ports of the switch and
uses a linked list of descriptors (DCB) to inject and extract packets.
Before adding descriptors, the FDMA channels must be stopped. It would
be inefficient to do that each time a descriptor would be added so the
channels are restarted only once they stopped.

Both channels uses ring-like structure to feed the DCBs to the FDMA.
head and tail are never touched by hardware and are completely handled
by the driver. On top of that, page recycling has been added and is
mostly taken from gianfar driver.

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Co-developed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 20:56:58 -08:00
Clément Léger
de5841e1c9 net: ocelot: add support for ndo_change_mtu
This commit adds support for changing MTU for the ocelot register based
interface. For ocelot, JUMBO frame size can be set up to 25000 bytes
but has been set to 9000 which is a saner value and allows for maximum
gain of performance with FDMA.

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 20:56:57 -08:00
Clément Léger
b471a71e52 net: ocelot: add and export ocelot_ptp_rx_timestamp()
In order to support PTP in FDMA, PTP handling code is needed. Since
this is the same as for register-based extraction, export it with
a new ocelot_ptp_rx_timestamp() function.

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 20:56:57 -08:00
Clément Léger
e5150f0072 net: ocelot: export ocelot_ifh_port_set() to setup IFH
FDMA will need this code to prepare the injection frame header when
sending SKBs. Move this code into ocelot_ifh_port_set() and add
conditional IFH setting for vlan and rew op if they are not set.

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 20:56:57 -08:00
Erik Ekman
7adf905333 net: bna: Update supported link modes
The BR-series installation guide from https://driverdownloads.qlogic.com/
mentions the cards support 10Gbase-SR/LR as well as direct attach cables.

The cards only have SFP+ ports, so 10000baseT is not the right mode.
Switch to using more specific link modes added in commit 5711a98221
("net: ethtool: add support for 1000BaseX and missing 10G link modes").

Only compile tested.

Signed-off-by: Erik Ekman <erik@kryo.se>
Link: https://lore.kernel.org/r/20211208230022.153496-1-erik@kryo.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 19:50:40 -08:00
Nitesh Narayan Lal
4b3ddc6462 net/mlx4: Use irq_update_affinity_hint()
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.

To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20210903152430.244937-15-nitesh@redhat.com
2021-12-10 20:47:40 +01:00
Nitesh Narayan Lal
7451e9ea8e net/mlx5: Use irq_set_affinity_and_hint()
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts and to apply
the provided mask as the affinity for the mlx5 interrupts. However,
irq_set_affinity_hint() applying the provided cpumask as an affinity for
the interrupt is an undocumented side effect.

To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_set_affinity_and_hint()
where the provided mask needs to be applied as the affinity and
affinity_hint pointer needs to be set and replace with
irq_update_affinity_hint() where only affinity_hint needs to be updated.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210903152430.244937-14-nitesh@redhat.com
2021-12-10 20:47:39 +01:00
Nitesh Narayan Lal
2d1e72f235 hinic: Use irq_set_affinity_and_hint()
The driver uses irq_set_affinity_hint() to:

 - Set the affinity_hint which is consumed by the userspace for
   distributing the interrupts

 - Enforce affinity

As per commit 352f58b0d9 ("net-next/hinic: Set Rxq irq to specific cpu
for NUMA"), the hinic driver enforces its own affinity to bind IRQs to the
local NUMA node. However, irq_set_affinity_hint() applying the provided
cpumask as an affinity for the interrupt is an undocumented side effect.

To remove this side effect irq_set_affinity_hint() has been marked as
deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_set_affinity_and_hint()
where the provided mask needs to be applied as the affinity and
affinity_hint pointer needs to be set and replace with
irq_update_affinity_hint() where only affinity_hint needs to be updated.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210903152430.244937-13-nitesh@redhat.com
2021-12-10 20:47:39 +01:00
Nitesh Narayan Lal
cc493264c0 ixgbe: Use irq_update_affinity_hint()
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.

To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20210903152430.244937-10-nitesh@redhat.com
2021-12-10 20:47:39 +01:00
Nitesh Narayan Lal
b8b9dd5252 be2net: Use irq_update_affinity_hint()
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.

To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210903152430.244937-9-nitesh@redhat.com
2021-12-10 20:47:39 +01:00
Nitesh Narayan Lal
cb39ca92eb enic: Use irq_update_affinity_hint()
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.

To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christian Benvenuti <benve@cisco.com>
Link: https://lore.kernel.org/r/20210903152430.244937-8-nitesh@redhat.com
2021-12-10 20:47:39 +01:00
Nitesh Narayan Lal
d34c54d173 i40e: Use irq_update_affinity_hint()
The driver uses irq_set_affinity_hint() for two purposes:

 - To set the affinity_hint which is consumed by the userspace for
   distributing the interrupts

 - To apply an affinity that it provides for the i40e interrupts

The latter is done to ensure that all the interrupts are evenly spread
across all available CPUs. However, since commit a0c9259dc4 ("irq/matrix:
Spread interrupts on allocation") the spreading of interrupts is
dynamically performed at the time of allocation. Hence, there is no need
for the drivers to enforce their own affinity for the spreading of
interrupts.

Also, irq_set_affinity_hint() applying the provided cpumask as an affinity
for the interrupt is an undocumented side effect. To remove this side
effect irq_set_affinity_hint() has been marked as deprecated and new
interfaces have been introduced. Hence, replace the irq_set_affinity_hint()
with the new interface irq_update_affinity_hint() that only sets the
pointer for the affinity_hint.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20210903152430.244937-4-nitesh@redhat.com
2021-12-10 20:47:38 +01:00
Nitesh Narayan Lal
0f9744f4ed iavf: Use irq_update_affinity_hint()
The driver uses irq_set_affinity_hint() for two purposes:

- To set the affinity_hint which is consumed by the userspace for
  distributing the interrupts

- To apply an affinity that it provides for the iavf interrupts

The latter is done to ensure that all the interrupts are evenly spread
across all available CPUs. However, since commit a0c9259dc4 ("irq/matrix:
Spread interrupts on allocation") the spreading of interrupts is
dynamically performed at the time of allocation. Hence, there is no need
for the drivers to enforce their own affinity for the spreading of
interrupts.

Also, irq_set_affinity_hint() applying the provided cpumask as an affinity
for the interrupt is an undocumented side effect. To remove this side
effect irq_set_affinity_hint() has been marked as deprecated and new
interfaces have been introduced. Hence, replace the irq_set_affinity_hint()
with the new interface irq_update_affinity_hint() that only sets the
pointer for the affinity_hint.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20210903152430.244937-3-nitesh@redhat.com
2021-12-10 20:47:38 +01:00
Geert Uytterhoeven
e5d75fc20b sh_eth: Use dev_err_probe() helper
Use the dev_err_probe() helper, instead of open-coding the same
operation.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/2576cc15bdbb5be636640f491bcc087a334e2c02.1638959463.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 19:10:23 -08:00
Jakub Kicinski
3150a73366 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 13:23:02 -08:00
Russell King (Oracle)
11053047a4 net: ag71xx: remove unnecessary legacy methods
ag71xx may have a PCS, but it does not appear to support configuration
of the PCS in the current code. The functions to get its state merely
report that the link is down, and the AN restart function is empty.

Since neither of these functions will be called unless phylink's legacy
flag is set, we can safely remove these functions and indicate this is
a modern driver.

Should PCS support be added later, it will need to be modelled using
the phylink_pcs support rather than operating as a legacy driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 11:21:04 -08:00
Russell King (Oracle)
b06515367f net: mtk_eth_soc: mark as a legacy_pre_march2020 driver
mtk_eth_soc has not been updated for commit 7cceb599d1 ("net: phylink:
avoid mac_config calls"), and makes use of state->speed and
state->duplex in contravention of the phylink documentation. This makes
reliant on the legacy behaviours, so mark it as a legacy driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 11:21:03 -08:00
José Expósito
9acfc57fa2 net: mana: Fix memory leak in mana_hwc_create_wq
If allocating the DMA buffer fails, mana_hwc_destroy_wq was called
without previously storing the pointer to the queue.

In order to avoid leaking the pointer to the queue, store it as soon as
it is allocated.

Addresses-Coverity-ID: 1484720 ("Resource leak")
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/20211208223723.18520-1-jose.exposito89@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 07:58:41 -08:00
Jianglei Nie
c56c96303e nfp: Fix memory leak in nfp_cpp_area_cache_add()
In line 800 (#1), nfp_cpp_area_alloc() allocates and initializes a
CPP area structure. But in line 807 (#2), when the cache is allocated
failed, this CPP area structure is not freed, which will result in
memory leak.

We can fix it by freeing the CPP area when the cache is allocated
failed (#2).

792 int nfp_cpp_area_cache_add(struct nfp_cpp *cpp, size_t size)
793 {
794 	struct nfp_cpp_area_cache *cache;
795 	struct nfp_cpp_area *area;

800	area = nfp_cpp_area_alloc(cpp, NFP_CPP_ID(7, NFP_CPP_ACTION_RW, 0),
801 				  0, size);
	// #1: allocates and initializes

802 	if (!area)
803 		return -ENOMEM;

805 	cache = kzalloc(sizeof(*cache), GFP_KERNEL);
806 	if (!cache)
807 		return -ENOMEM; // #2: missing free

817	return 0;
818 }

Fixes: 4cb584e0ee ("nfp: add CPP access core")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20211209061511.122535-1-niejianglei2021@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 07:53:33 -08:00
Gustavo A. R. Silva
9d922f5df5 net: huawei: hinic: Use devm_kcalloc() instead of devm_kzalloc()
Use 2-factor multiplication argument form devm_kcalloc() instead
of devm_kzalloc().

Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211208040311.GA169838@embeddedor
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 18:37:28 -08:00
Gustavo A. R. Silva
d7ca9a34dd net: hinic: Use devm_kcalloc() instead of devm_kzalloc()
Use 2-factor multiplication argument form devm_kcalloc() instead
of devm_kzalloc().

Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211208003527.GA75483@embeddedor
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 18:37:06 -08:00
Louis Amas
a50e659b2a net: mvpp2: fix XDP rx queues registering
The registration of XDP queue information is incorrect because the
RX queue id we use is invalid. When port->id == 0 it appears to works
as expected yet it's no longer the case when port->id != 0.

The problem arised while using a recent kernel version on the
MACCHIATOBin. This board has several ports:
 * eth0 and eth1 are 10Gbps interfaces ; both ports has port->id == 0;
 * eth2 is a 1Gbps interface with port->id != 0.

Code from xdp-tutorial (more specifically advanced03-AF_XDP) was used
to test packet capture and injection on all these interfaces. The XDP
kernel was simplified to:

	SEC("xdp_sock")
	int xdp_sock_prog(struct xdp_md *ctx)
	{
		int index = ctx->rx_queue_index;

		/* A set entry here means that the correspnding queue_id
		* has an active AF_XDP socket bound to it. */
		if (bpf_map_lookup_elem(&xsks_map, &index))
			return bpf_redirect_map(&xsks_map, index, 0);

		return XDP_PASS;
	}

Starting the program using:

	./af_xdp_user -d DEV

Gives the following result:

 * eth0 : ok
 * eth1 : ok
 * eth2 : no capture, no injection

Investigating the issue shows that XDP rx queues for eth2 are wrong:
XDP expects their id to be in the range [0..3] but we found them to be
in the range [32..35].

Trying to force rx queue ids using:

	./af_xdp_user -d eth2 -Q 32

fails as expected (we shall not have more than 4 queues).

When we register the XDP rx queue information (using
xdp_rxq_info_reg() in function mvpp2_rxq_init()) we tell it to use
rxq->id as the queue id. This value is computed as:

	rxq->id = port->id * max_rxq_count + queue_id

where max_rxq_count depends on the device version. In the MACCHIATOBin
case, this value is 32, meaning that rx queues on eth2 are numbered
from 32 to 35 - there are four of them.

Clearly, this is not the per-port queue id that XDP is expecting:
it wants a value in the range [0..3]. It shall directly use queue_id
which is stored in rxq->logic_rxq -- so let's use that value instead.

rxq->id is left untouched ; its value is indeed valid but it should
not be used in this context.

This is consistent with the remaining part of the code in
mvpp2_rxq_init().

With this change, packet capture is working as expected on all the
MACCHIATOBin ports.

Fixes: b27db2274b ("mvpp2: use page_pool allocator")
Signed-off-by: Louis Amas <louis.amas@eho.link>
Signed-off-by: Emmanuel Deloget <emmanuel.deloget@eho.link>
Reviewed-by: Marcin Wojtas <mw@semihalf.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/r/20211207143423.916334-1-louis.amas@eho.link
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 18:29:37 -08:00
Jakub Kicinski
b5b6b6baf2 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-08

Yahui adds re-initialization of Flow Director for VF reset.

Paul restores interrupts when enabling VFs.

Dave re-adds bandwidth check for DCBNL and moves DSCP mode check
earlier in the function.

Jesse prevents reporting of dropped packets that occur during
initialization and fixes reporting of statistics which could occur with
frequent reads.

Michal corrects setting of protocol type for UDP header and fixes lack
of differentiation when adding filters for tunnels.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: safer stats processing
  ice: fix adding different tunnels
  ice: fix choosing UDP header type
  ice: ignore dropped packets during init
  ice: Fix problems with DSCP QoS implementation
  ice: rearm other interrupt cause register after enabling VFs
  ice: fix FDIR init missing when reset VF
====================

Link: https://lore.kernel.org/r/20211208211144.2629867-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 16:36:13 -08:00
Jakub Kicinski
6efcdadc15 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2021-12-08

We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).

The main changes are:

1) Fix an off-by-two error in packet range markings and also add a batch of
   new tests for coverage of these corner cases, from Maxim Mikityanskiy.

2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.

3) Fix two functional regressions and a build warning related to BTF kfunc
   for modules, from Kumar Kartikeya Dwivedi.

4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
   PREEMPT_RT kernels, from Sebastian Andrzej Siewior.

5) Add missing includes in order to be able to detangle cgroup vs bpf header
   dependencies, from Jakub Kicinski.

6) Fix regression in BPF sockmap tests caused by missing detachment of progs
   from sockets when they are removed from the map, from John Fastabend.

7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
   dispatcher, from Björn Töpel.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add selftests to cover packet access corner cases
  bpf: Fix the off-by-two error in range markings
  treewide: Add missing includes masked by cgroup -> bpf dependency
  tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
  bpf: Fix bpf_check_mod_kfunc_call for built-in modules
  bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
  mips, bpf: Fix reference to non-existing Kconfig symbol
  bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
  Documentation/locking/locktypes: Update migrate_disable() bits.
  bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
  bpf, sockmap: Attach map progs to psock early for feature probes
  bpf, x86: Fix "no previous prototype" warning
====================

Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 16:06:44 -08:00
Jesse Brandeburg
1a0f25a52e ice: safer stats processing
The driver was zeroing live stats that could be fetched by
ndo_get_stats64 at any time. This could result in inconsistent
statistics, and the telltale sign was when reading stats frequently from
/proc/net/dev, the stats would go backwards.

Fix by collecting stats into a local, and delaying when we write to the
structure so it's not incremental.

Fixes: fcea6f3da5 ("ice: Add stats and ethtool support")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-08 10:37:02 -08:00
Colin Foster
32ecd22ba6 net: mscc: ocelot: split register definitions to a separate file
Move these to a separate file will allow them to be shared to other
drivers.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 21:44:49 -08:00
Joakim Zhang
b5bd95d171 net: fec: only clear interrupt of handling queue in fec_enet_rx_queue()
Background:
We have a customer is running a Profinet stack on the 8MM which receives and
responds PNIO packets every 4ms and PNIO-CM packets every 40ms. However, from
time to time the received PNIO-CM package is "stock" and is only handled when
receiving a new PNIO-CM or DCERPC-Ping packet (tcpdump shows the PNIO-CM and
the DCERPC-Ping packet at the same time but the PNIO-CM HW timestamp is from
the expected 40 ms and not the 2s delay of the DCERPC-Ping).

After debugging, we noticed PNIO, PNIO-CM and DCERPC-Ping packets would
be handled by different RX queues.

The root cause should be driver ack all queues' interrupt when handle a
specific queue in fec_enet_rx_queue(). The blamed patch is introduced to
receive as much packets as possible once to avoid interrupt flooding.
But it's unreasonable to clear other queues'interrupt when handling one
queue, this patch tries to fix it.

Fixes: ed63f1dcd5 (net: fec: clear receive interrupts before processing a packet)
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Reported-by: Nicolas Diaz <nicolas.diaz@nxp.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20211206135457.15946-1-qiangqing.zhang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 21:39:39 -08:00
Colin Ian King
3c5290a2dc net: hns3: Fix spelling mistake "faile" -> "failed"
There is a spelling mistake in a dev_err message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20211206091207.113648-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 21:35:12 -08:00
Jakub Kicinski
65af674a59 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-06

This series contains updates to iavf and i40e drivers.

Mitch adds restoration of MSI state during reset for iavf.

Michal fixes checking and reporting of descriptor count changes to
communicate changes and/or issues for iavf.

Karen resolves an issue with failed handling of VF requests while a VF
reset is occurring for i40e.

Mateusz removes clearing of VF requested queue count when configuring
VF ADQ for i40e.

Norbert fixes a NULL pointer dereference that can occur when getting VSI
descriptors for i40e.
====================

Link: https://lore.kernel.org/r/20211206183519.2733180-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 21:33:12 -08:00
Ameer Hamza
e6f60c51f0 gve: fix for null pointer dereference.
Avoid passing NULL skb to __skb_put() function call if
napi_alloc_skb() returns NULL.

Fixes: 37149e9374 ("gve: Implement packet continuation for RX.")
Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com>
Link: https://lore.kernel.org/r/20211205183810.8299-1-amhamza.mgc@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 20:57:17 -08:00
Michal Swiatkowski
de6acd1cdd ice: fix adding different tunnels
Adding filters with the same values inside for VXLAN and Geneve causes HW
error, because it looks exactly the same. To choose between different
type of tunnels new recipe is needed. Add storing tunnel types in
creating recipes function and start checking it in finding function.

Change getting open tunnels function to return port on correct tunnel
type. This is needed to copy correct port to dummy packet.

Block user from adding enc_dst_port via tc flower, because VXLAN and
Geneve filters can be created only with destination port which was
previously opened.

Fixes: 8b032a55c1 ("ice: low level support for tunnels")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07 13:21:01 -08:00
Michal Swiatkowski
0e32ff0240 ice: fix choosing UDP header type
In tunnels packet there can be two UDP headers:
- outer which for hw should be mark as ICE_UDP_OF
- inner which for hw should be mark as ICE_UDP_ILOS or as ICE_TCP_IL if
  inner header is of TCP type

In none tunnels packet header can be:
- UDP, which for hw should be mark as ICE_UDP_ILOS
- TCP, which for hw should be mark as ICE_TCP_IL

Change incorrect ICE_UDP_OF for none tunnel packets to ICE_UDP_ILOS.
ICE_UDP_OF is incorrect for none tunnel packets and setting it leads to
error from hw while adding this kind of recipe.

In summary, for tunnel outer port type should always be set to
ICE_UDP_OF, for none tunnel outer and tunnel inner it should always be
set to ICE_UDP_ILOS.

Fixes: 9e300987d4 ("ice: VXLAN and Geneve TC support")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07 13:21:01 -08:00
Jesse Brandeburg
28dc1b86f8 ice: ignore dropped packets during init
If the hardware is constantly receiving unicast or broadcast packets
during driver load, the device previously counted many GLV_RDPC (VSI
dropped packets) events during init. This causes confusing dropped
packet statistics during driver load. The dropped packets counter
incrementing does stop once the driver finishes loading.

Avoid this problem by baselining our statistics at the end of driver
open instead of the end of probe.

Fixes: cdedef59de ("ice: Configure VSIs for Tx/Rx")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07 13:21:01 -08:00
Dave Ertman
6d39ea19b0 ice: Fix problems with DSCP QoS implementation
The patch that implemented DSCP QoS implementation removed a
bandwidth check that was used to check for a specific condition
caused by some corner cases.  This check should not of been
removed.

The same patch also added a check for when the DCBx state could
be changed in relation to DSCP, but the check was erroneously
added nested in a check for CEE mode, which made the check useless.

Fix these problems by re-adding the bandwidth check and relocating
the DSCP mode check earlier in the function that changes DCBx state
in the driver.

Fixes: 2a87bd73e5 ("ice: Add DSCP support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07 13:21:01 -08:00
Paul Greenwalt
2657e16d8c ice: rearm other interrupt cause register after enabling VFs
The other interrupt cause register (OICR), global interrupt 0, is
disabled when enabling VFs to prevent handling VFLR. If the OICR is
not rearmed then the VF cannot communicate with the PF.

Rearm the OICR after enabling VFs.

Fixes: 916c7fdf5e ("ice: Separate VF VSI initialization/creation from reset flow")
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07 13:21:01 -08:00
Yahui Cao
f23ab04dd6 ice: fix FDIR init missing when reset VF
When VF is being reset, ice_reset_vf() will be called and FDIR
resource should be released and initialized again.

Fixes: 1f7ea1cd6a ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Yahui Cao <yahui.cao@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07 13:21:00 -08:00
Dan Carpenter
d17b9737c2 net/qla3xxx: fix an error code in ql_adapter_up()
The ql_wait_for_drvr_lock() fails and returns false, then this
function should return an error code instead of returning success.

The other problem is that the success path prints an error message
netdev_err(ndev, "Releasing driver lock\n");  Delete that and
re-order the code a little to make it more clear.

Fixes: 5a4faa8737 ("[PATCH] qla3xxx NIC driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211207082416.GA16110@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 10:37:10 -08:00
Guangbin Huang
364d470d54 Revert "net: hns3: add void before function which don't receive ret"
This reverts commit 5ac4f180bd.

Sorry for taking no notice that the function devlink_register() has been
already declared as void, so it is needs to revert this patch.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Link: https://lore.kernel.org/r/20211204012448.51360-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:30:12 -08:00
José Expósito
01081be1ea net: prestera: replace zero-length array with flexible-array member
One-element and zero-length arrays are deprecated and should be
replaced with flexible-array members:
https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Replace zero-length array with flexible-array member and make use
of the struct_size() helper.

Link: https://github.com/KSPP/linux/issues/78
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Reviewed-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Tested-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20211204171349.22776-1-jose.exposito89@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:29:27 -08:00
Norbert Zulinski
23ec111bf3 i40e: Fix NULL pointer dereference in i40e_dbg_dump_desc
When trying to dump VFs VSI RX/TX descriptors
using debugfs there was a crash
due to NULL pointer dereference in i40e_dbg_dump_desc.
Added a check to i40e_dbg_dump_desc that checks if
VSI type is correct for dumping RX/TX descriptors.

Fixes: 02e9c29081 ("i40e: debugfs interface")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-06 09:44:56 -08:00
Mateusz Palczewski
8aa55ab422 i40e: Fix pre-set max number of queues for VF
After setting pre-set combined to 16 queues and reserving 16 queues by
tc qdisc, pre-set maximum combined queues returned to default value
after VF reset being 4 and this generated errors during removing tc.
Fixed by removing clear num_req_queues before reset VF.

Fixes: e284fc2804 (i40e: Add and delete cloud filter)
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Bindushree P <Bindushree.p@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-06 09:44:56 -08:00
Karen Sornek
61125b8be8 i40e: Fix failed opcode appearing if handling messages from VF
Fix failed operation code appearing if handling messages from VF.
Implemented by waiting for VF appropriate state if request starts
handle while VF reset.
Without this patch the message handling request while VF is in
a reset state ends with error -5 (I40E_ERR_PARAM).

Fixes: 5c3c48ac6b ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-06 09:44:56 -08:00
Michal Maloszewski
1a1aa356dd iavf: Fix reporting when setting descriptor count
iavf_set_ringparams doesn't communicate to the user that

1. The user requested descriptor count is out of range. Instead it
   just quietly sets descriptors to the "clamped" value and calls it
   done. This makes it look an invalid value was successfully set as
   the descriptor count when this isn't actually true.

2. The user provided descriptor count needs to be inflated for alignment
   reasons.

This behavior is confusing. The ice driver has already addressed this
by rejecting invalid values for descriptor count and
messaging for alignment adjustments.
Do the same thing here by adding the error and info messages.

Fixes: fbb7ddfef2 ("i40evf: core ethtool functionality")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-06 08:27:50 -08:00
Manish Chopra
823163ba6e qed*: esl priv flag support through ethtool
ESL(Enhanced System Lockdown) was designed to lock PCI adapter firmware
images and prevent changes to critical non-volatile configuration data
so that uncontrolled, malicious or unintentional modification to the
adapters are avoided, ensuring it's operational state. Once this feature is
enabled, the device is locked, rejecting any modification to non-volatile
images. Once unlocked, the protection is off such that firmware and
non-volatile configurations may be altered.

Driver just reflects the capability and status of this through
the ethtool private flag.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-03 18:24:21 -08:00
Manish Chopra
0cc3a80179 qed*: enhance tx timeout debug info
This patch add some new qed APIs to query status block
info and report various data to MFW on tx timeout event

Along with that it enhances qede to dump more debug logs
(not just specific to the queue which was reported by stack)
on tx timeout which includes various other basic metadata about
all tx queues and other info (like status block etc.)

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-03 18:24:20 -08:00
Manish Chopra
8e227b198a qede: validate non LSO skb length
Although it is unlikely that stack could transmit a non LSO
skb with length > MTU, however in some cases or environment such
occurrences actually resulted into firmware asserts due to packet
length being greater than the max supported by the device (~9700B).

This patch adds the safeguard for such odd cases to avoid firmware
asserts.

v2: Added "Fixes" tag with one of the initial driver commit
    which enabled the TX traffic actually (as this was probably
    day1 issue which was discovered recently by some customer
    environment)

Fixes: a2ec6172d2 ("qede: Add support for link")
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Link: https://lore.kernel.org/r/20211203174413.13090-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-03 16:53:37 -08:00
Jakub Kicinski
8581fd402a treewide: Add missing includes masked by cgroup -> bpf dependency
cgroup.h (therefore swap.h, therefore half of the universe)
includes bpf.h which in turn includes module.h and slab.h.
Since we're about to get rid of that dependency we need
to clean things up.

v2: drop the cpu.h include from cacheinfo.h, it's not necessary
and it makes riscv sensitive to ordering of include files.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Acked-by: Peter Chen <peter.chen@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/all/20211120035253.72074-1-kuba@kernel.org/  # v1
Link: https://lore.kernel.org/all/20211120165528.197359-1-kuba@kernel.org/ # cacheinfo discussion
Link: https://lore.kernel.org/bpf/20211202203400.1208663-1-kuba@kernel.org
2021-12-03 10:58:13 -08:00
Dan Carpenter
badd7857f5 net: altera: set a couple error code in probe()
There are two error paths which accidentally return success instead of
a negative error code.

Fixes: bbd2190ce9 ("Altera TSE: Add main and header file for Altera Ethernet Driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 14:23:11 +00:00
Dan Carpenter
bb14bfc7eb net: lan966x: fix a IS_ERR() vs NULL check in lan966x_create_targets()
The devm_ioremap() function does not return error pointers.  It returns
NULL.

Fixes: db8bcaad53 ("net: lan966x: add the basic lan966x driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 14:22:00 +00:00
Yang Yingliang
f6882b8fac net: prestera: acl: fix return value check in prestera_acl_rule_entry_find()
rhashtable_lookup_fast() returns NULL pointer not ERR_PTR().
Return rhashtable_lookup_fast() directly to fix this.

Fixes: 47327e198d ("net: prestera: acl: migrate to new vTCAM api")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 14:19:54 +00:00
Jiasheng Jiang
128f6ec95a net: bcm4908: Handle dma_set_coherent_mask error codes
The return value of dma_set_coherent_mask() is not always 0.
To catch the exception in case that dma is not support the mask.

Fixes: 9d61d138ab ("net: broadcom: rename BCM4908 driver & update DT binding")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 14:18:53 +00:00
Jie Wang
184da9dc78 net: hns3: fix hns3 driver header file not self-contained issue
The hns3 driver header file uses the structure of other files, but does
not include corresponding file, which causes a check warning that the
header file is not self-contained.

Therefore, the required header file is included in the header file, and
the structure declaration is added to the header file to avoid cyclic
dependency of the header file.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:01:00 +00:00
Hao Chen
7acf76b1cd net: hns3: replace one tab with space in for statement
Replace one tab with space between symbol ')' and '{' in for statement of
function hclge_map_tqp().

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:59 +00:00
Hao Chen
40975e749d net: hns3: remove rebundant line for hclge_dbg_dump_tm_pg()
Return value judgment should follow the function call, so remove line
between them.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:59 +00:00
Hao Chen
4e599dddee net: hns3: add comments for hclge_dbg_fill_content()
When we use hclge_dbg_fill_content() to fill contents with
specific format according to struct hclge_dbg_item *items,
it may cause content cover due to unreasonable items.

So add comments to explain how to avoid it.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:59 +00:00
Hao Chen
5ac4f180bd net: hns3: add void before function which don't receive ret
Add void before function which don't receive ret to improve code
readability.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:59 +00:00
Hao Chen
9fcadbaae8 net: hns3: align return value type of atomic_read() with its output
Change output value type of atomic_read() from %u to %d.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:59 +00:00
Guangbin Huang
72dcdec10f net: hns3: modify one argument type of function hclge_ncl_config_data_print
The argument len will not be changed in hclge_ncl_config_data_print(), it
is no need to declare as a pointer, so modify it into int type.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:59 +00:00
Hao Chen
0cc25c6a14 net: hns3: Align type of some variables with their print type
The c language has a set of implicit type conversions, when
two variables perform bitwise or arithmetic operations.

For example, variable A (type u16/u8) -1, its output is int type variable.
u16/u8 will convert to int type implicitly before it does arithmetic
operations. So, change 1 to unsigned type.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:59 +00:00
Guangbin Huang
114967adbc net: hns3: add print vport id for failed message of vlan
This patch adds print vport id when failed to get or set vlan
filter parameters.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:59 +00:00
Guangbin Huang
e7a51bf590 net: hns3: refactor function hclge_set_vlan_filter_hw
Function hclge_set_vlan_filter_hw() is a bit too long, so add a new
function hclge_need_update_port_vlan() to simplify code and improve
code readability.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:58 +00:00
Yufeng Mo
23e0316049 net: hns3: optimize function hclge_cfg_common_loopback()
hclge_cfg_common_loopback() is a bit too long, so
encapsulate hclge_cfg_common_loopback_cmd_send() and
hclge_cfg_common_loopback_wait() two functions to
improve readability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:00:58 +00:00
Avihai Horon
b247f32aec net/mlx5: Dynamically resize flow counters query buffer
The flow counters bulk query buffer is allocated once during
mlx5_fc_init_stats(). For PFs and VFs this buffer usually takes a little
more than 512KB of memory, which is aligned to the next power of 2, to
1MB. For SFs, this buffer is reduced and takes around 128 Bytes.

The buffer size determines the maximum number of flow counters that
can be queried at a time. Thus, having a bigger buffer can improve
performance for users that need to query many flow counters.

There are cases that don't use many flow counters and don't need a big
buffer (e.g. SFs, VFs). Since this size is critical with large scale,
in these cases the buffer size should be reduced.

In order to reduce memory consumption while maintaining query
performance, change the query buffer's allocation scheme to the
following:
- First allocate the buffer with small initial size.
- If the number of counters surpasses the initial size, resize the
  buffer to the maximum size.

The buffer only grows and isn't shrank, because users with many flow
counters don't care about the buffer size and we don't want to add
resize overhead if the current number of counters drops.

This solution is preferable to the current one, which is less accurate
and only addresses SFs.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:16 -08:00
Roi Dayan
d4bb053139 net/mlx5e: TC, Set flow attr ip_version earlier
Setting flow attr ip_version is not related to parsing tc flow actions.
It needs to be set after parsing flower matches which changes the spec.
So move it outside parse_tc_fdb_actions() and set it in
__mlx5e_add_fdb_flow().

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:16 -08:00
Roi Dayan
df99047724 net/mlx5e: TC, Move common flow_action checks into function
Remove duplicate checks on flow_action by using common function.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:15 -08:00
Roi Dayan
70a140ea6f net/mlx5e: Remove redundant actions arg from vlan push/pop funcs
Passing actions is redundant and can be retrieved from flow attr.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:14 -08:00
Roi Dayan
3cc78411f3 net/mlx5e: Remove redundant actions arg from validate_goto_chain()
Passing actions is redundant and can be retrieved from flow.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:13 -08:00
Roi Dayan
9745dbe036 net/mlx5e: TC, Remove redundant action stack var
Remove the action stack var from parse tc fdb actions
and prase tc nic actions, use the flow attr action var directly.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:12 -08:00
Tariq Toukan
e9542221c4 net/mlx5e: Hide function mlx5e_num_channels_changed
No calls for mlx5e_num_channels_changed() out of en_main.c,
turn it static and remove from header.
Keep the wrapper function mlx5e_num_channels_changed_ctx exposed.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:11 -08:00
Ben Ben-Ishay
3ef1f8e795 net/mlx5e: SHAMPO, clean MLX5E_MAX_KLM_PER_WQE macro
This commit reduces unused variable from MLX5E_MAX_KLM_PER_WQE macro that
introduced by commit d7b896acbd ("net/mlx5e: Add support to klm_umr_wqe").

Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:10 -08:00
Saeed Mahameed
fad1783a6d net/mlx5: Print more info on pci error handlers
In case mlx5_pci_err_detected was called with state equals to
pci_channel_io_perm_failure, the driver will never come back up.

It is nice to know why the driver went to zombie land, so print some
useful information on pci err handlers.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
2021-12-02 16:53:10 -08:00
Dan Carpenter
c64d01b3ce net/mlx5: SF, silence an uninitialized variable warning
This code sometimes calls mlx5_sf_hw_table_hwc_init() when "ext_base_id"
is uninitialized.  It's not used on that path, but it generates a static
checker warning to pass uninitialized variables to another function.
It may also generate runtime UBSan  warnings depending on if the
mlx5_sf_hw_table_hwc_init() function is inlined or not.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:09 -08:00
Christophe JAILLET
31108d142f net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'
All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end to 'err_out'
where 'flow_flag_set(flow, FAILED);' is called.

All but the new error handling paths added by the commits given in the
Fixes tag below.

Fix these error handling paths and branch to 'err_out'.

Fixes: 166f431ec6 ("net/mlx5e: Add indirect tc offload of ovs internal port")
Fixes: b16eb3c81f ("net/mlx5: Support internal port as decap route device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:09 -08:00
Wei Yongjun
baf5c00130 net/mlx5: Fix error return code in esw_qos_create()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 85c5f7c920 ("net/mlx5: E-switch, Create QoS on demand")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:08 -08:00
Arnd Bergmann
d2b8c7ba3c mlx5: fix mlx5i_grp_sw_update_stats() stack usage
The mlx5e_sw_stats structure has grown to the point of triggering
a warning when put on the stack of a function:

mlx5/core/ipoib/ipoib.c: In function 'mlx5i_grp_sw_update_stats':
mlx5/core/ipoib/ipoib.c:136:1: error: the frame size of 1028 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

In this case, only five of the structure members are actually set,
so it's sufficient to have those as separate local variables.
As en_rep.c uses 'struct rtnl_link_stats64' for this, just use
the same one here for consistency.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:07 -08:00
Arnd Bergmann
7a7dd5114f mlx5: fix psample_sample_packet link error
When PSAMPLE is a loadable module, built-in drivers cannot use it:

aarch64-linux-ld: drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.o: in function `mlx5e_tc_sample_skb':
sample.c:(.text+0xd68): undefined reference to `psample_sample_packet'

Add the same dependency here that is used for MLXSW

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:07 -08:00
Jakub Kicinski
fc993be36f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-02 11:44:56 -08:00
Horatiu Vultur
cc9cf69eea net: lan966x: Fix builds for lan966x driver
The lan966x is using the function 'packing' to create/extract the
information for the IFH, that is used to be added in front of the frames
when they are injected/extracted.
Therefore update the Kconfig to select config option 'PACKING' whenever
lan966x driver is enabled.

Fixes: db8bcaad53 ("net: lan966x: add the basic lan966x driver")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:24:08 +00:00
Prabhakar Kushwaha
7e9979e360 qed: Enhance rammod debug prints to provide pretty details
Instead of printing numbers of protocol IDs and rammod commands,
enhance debug prints to provide names. s_protocol_types[] and
s_ramrod_cmd_ids arrays[] are added to support along with APIs.

Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:22:17 +00:00
Horatiu Vultur
a290cf6927 net: lan966x: Fix duplicate check in frame extraction
The blamed commit generates the following smatch static checker warning:

 drivers/net/ethernet/microchip/lan966x/lan966x_main.c:515 lan966x_xtr_irq_handler()
         warn: duplicate check 'sz < 0' (previous on line 502)

This patch fixes this issue removing the duplicate check 'sz < 0'

Fixes: d28d6d2e37 ("net: lan966x: add port module support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:17:28 +00:00
Sukadev Bhattiprolu
5b08560181 ibmvnic: drop bad optimization in reuse_tx_pools()
When trying to decide whether or not reuse existing rx/tx pools
we tried to allow a range of values for the pool parameters rather
than exact matches. This was intended to reuse the resources for
instance when switching between two VIO servers with different
default parameters.

But this optimization is incomplete and breaks when we try to
change the number of queues for instance. The optimization needs
to be updated, so drop it for now and simplify the code.

Fixes: bbd809305b ("ibmvnic: Reuse tx pools when possible")
Reported-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:09:19 +00:00
Sukadev Bhattiprolu
0584f49496 ibmvnic: drop bad optimization in reuse_rx_pools()
When trying to decide whether or not reuse existing rx/tx pools
we tried to allow a range of values for the pool parameters rather
than exact matches. This was intended to reuse the resources for
instance when switching between two VIO servers with different
default parameters.

But this optimization is incomplete and breaks when we try to
change the number of queues for instance. The optimization needs
to be updated, so drop it for now and simplify the code.

Fixes: 489de956e7 ("ibmvnic: Reuse rx pools when possible")
Reported-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:09:19 +00:00
Tianhao Chai
553217c244 ethernet: aquantia: Try MAC address from device tree
Apple M1 Mac minis (2020) with 10GE NICs do not have MAC address in the
card, but instead need to obtain MAC addresses from the device tree. In
this case the hardware will report an invalid MAC.

Currently atlantic driver does not query the DT for MAC address and will
randomly assign a MAC if the NIC doesn't have a permanent MAC burnt in.
This patch causes the driver to perfer a valid MAC address from OF (if
present) over HW self-reported MAC and only fall back to a random MAC
address when neither of them is valid.

Signed-off-by: Tianhao Chai <cth451@gmail.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Reviewed-by: Hector Martin <marcan@marcan.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:06:03 +00:00
Jie Wang
1b33341e3d net: hns3: refactor function hns3_get_vector_ring_chain()
Currently  hns3_get_vector_ring_chain() is a bit long. Refactor it by
extracting sub process to improve the readability.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:43 +00:00
Jie Wang
358e3edb31 net: hns3: refactor function hclge_set_channels()
Currently  hclge_set_channels() is a bit long. Refactor it by extracting
sub process to improve the readability.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:43 +00:00
Jie Wang
673b35b6a5 net: hns3: refactor function hclge_configure()
Currently  hclge_configure() is a bit long. Refactor it by extracting sub
process to improve the readability.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:43 +00:00
Jian Shen
d25f5eddbe net: hns3: split function hclge_update_port_base_vlan_cfg()
Currently the function hclge_update_port_base_vlan_cfg() is a
bit long. Split it to several small functions, to improve the
readability.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:43 +00:00
Yufeng Mo
8d4b409bac net: hns3: split function hns3_nic_net_xmit()
Function hns3_nic_net_xmit() is a bit too long. So add a
new function hns3_handle_skb_desc() to simplify code and improve
code readability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:43 +00:00
Jian Shen
a41fb3961d net: hns3: split function hclge_get_fd_rule_info()
Currently the function hclge_get_fd_rule_info() is a bit long.
Split it to several small functions, to improve readability.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:42 +00:00
Jian Shen
b60f9d2ec4 net: hns3: split function hclge_init_vlan_config()
Currently the function hclge_init_vlan_config() is a bit long.
Split it to several small functions, to simplify code and
improve code readability.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:42 +00:00
Peng Li
a1cfb24d01 net: hns3: refactor function hns3_fill_skb_desc to simplify code
The function hns3_fill_skb_desc is hard to read, this patch
extract 2 functions and add new a struct data to simplify the
code and Improve readability.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:42 +00:00
Peng Li
e6d72f6ac2 net: hns3: extract macro to simplify ring stats update code
As the code to update ring stats is alike for different ring stats
type, this patch extract macro to simplify ring stats update code.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 11:53:42 +00:00
Zhou Qingyang
e07a097b49 octeontx2-af: Fix a memleak bug in rvu_mbox_init()
In rvu_mbox_init(), mbox_regions is not freed or passed out
under the switch-default region, which could lead to a memory leak.

Fix this bug by changing 'return err' to 'goto free_regions'.

This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.

Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.

Builds with CONFIG_OCTEONTX2_AF=y show no new warnings,
and our static analyzer no longer warns about this code.

Fixes: 98c5611163 (“octeontx2-af: cn10k: Add mbox support for CN10K platform”)
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Link: https://lore.kernel.org/r/20211130165039.192426-1-zhou1615@umn.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01 19:11:05 -08:00
Zhou Qingyang
addad76431 net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
In mlx4_en_try_alloc_resources(), mlx4_en_copy_priv() is called and
tmp->tx_cq will be freed on the error path of mlx4_en_copy_priv().
After that mlx4_en_alloc_resources() is called and there is a dereference
of &tmp->tx_cq[t][i] in mlx4_en_alloc_resources(), which could lead to
a use after free problem on failure of mlx4_en_copy_priv().

Fix this bug by adding a check of mlx4_en_copy_priv()

This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.

Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.

Builds with CONFIG_MLX4_EN=m show no new warnings,
and our static analyzer no longer warns about this code.

Fixes: ec25bc04ed ("net/mlx4_en: Add resilience in low memory systems")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20211130164438.190591-1-zhou1615@umn.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01 19:04:50 -08:00
Russell King
0dc1df0598 net: mvneta: program 1ms autonegotiation clock divisor
Program the 1ms autonegotiation clock divisor according to the clocking
rate of neta - without this, the 1ms clock ticks at about 660us on
Armada 38x configured for 250MHz. Bring this into correct specification.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Marek Behún <kabel@kernel.org>
Link: https://lore.kernel.org/r/E1ms4WD-00EKLK-Ld@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01 19:01:36 -08:00
Zhou Qingyang
e2dabc4f7e net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
In qlcnic_83xx_add_rings(), the indirect function of
ahw->hw_ops->alloc_mbx_args will be called to allocate memory for
cmd.req.arg, and there is a dereference of it in qlcnic_83xx_add_rings(),
which could lead to a NULL pointer dereference on failure of the
indirect function like qlcnic_83xx_alloc_mbx_args().

Fix this bug by adding a check of alloc_mbx_args(), this patch
imitates the logic of mbx_cmd()'s failure handling.

This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.

Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.

Builds with CONFIG_QLCNIC=m show no new warnings, and our
static analyzer no longer warns about this code.

Fixes: 7f9664525f ("qlcnic: 83xx memory map and HW access routine")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Link: https://lore.kernel.org/r/20211130110848.109026-1-zhou1615@umn.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01 18:51:36 -08:00
Christophe JAILLET
699e53e4fa net: spider_net: Use non-atomic bitmap API when applicable
No concurrent access is possible when a bitmap is local to a function.
So prefer the non-atomic functions to save a few cycles.
   - replace a 'for' loop by an equivalent non-atomic 'bitmap_fill()' call
   - use '__set_bit()'

While at it, clear the 'bitmask' bitmap only when needed.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/3de0792f5088f00d135c835df6c19e63ae95f5d2.1638026251.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01 18:45:46 -08:00
Mitch Williams
7e4dcc1396 iavf: restore MSI state on reset
If the PF experiences an FLR, the VF's MSI and MSI-X configuration will
be conveniently and silently removed in the process. When this happens,
reset recovery will appear to complete normally but no traffic will
pass. The netdev watchdog will helpfully notify everyone of this issue.

To prevent such public embarrassment, restore MSI configuration at every
reset. For normal resets, this will do no harm, but for VF resets
resulting from a PF FLR, this will keep the VF working.

Fixes: 5eae00c57f ("i40evf: main driver core")
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-01 13:46:14 -08:00
Amit Cohen
51ef6b0079 mlxsw: Use Switch Multicast ID Register Version 2
The SMID-V2 register maps Multicast ID (MID) into a list of local ports.
It is a new version of SMID in order to support 1024 bits of local_port.

Add SMID-V2 register and use it instead of SMID.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:34 +00:00
Amit Cohen
e86ad8ce5b mlxsw: Use Switch Flooding Table Register Version 2
The SFTR-V2 register is used for flooding packet replication.
It is a new version of SFTR in order to support 1024 bits of local_port.

Add SFTR-V2 register and use it instead of SFTR.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:34 +00:00
Amit Cohen
f8538aec88 mlxsw: Add support for more than 256 ports in SBSR register
Add 'port_page' field in SBSR to be able to query occupancy of more than
256 ports. The field determines the range of the ports specified in the
'ingress_port_mask' and 'egress_port_mask' bit masks:
>From '256 * port_page' to '256 * port_page + 255'.

For each local port, the appropriate port page is used. A query is never
performed for a port range that spans multiple port pages.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:34 +00:00
Amit Cohen
c934757d90 mlxsw: Use u16 for local_port field instead of u8
Currently, local_port field is saved as u8, which means that maximum 256
ports can be used.

As preparation for Spectrum-4, which will support more than 256 ports,
local_port field should be extended.

Save local_port as u16 to allow use of additional ports.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:34 +00:00
Amit Cohen
242e696e03 mlxsw: reg: Adjust PPCNT register to support local port 255
Local port 255 has a special meaning in PPCNT register, it is used to
refer to all local ports. This wild card ability is not currently used
by the driver.

Special casing local port 255 in Spectrum-4 systems where it is a valid
port is going to be a problem.

Work around this issue by adding and always setting the 'lp_gl' bit
which instructs the device's firmware to treat this local port like an
ordinary port.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:34 +00:00
Amit Cohen
da56f1a0d2 mlxsw: reg: Increase 'port_num' field in PMTDB register
'port_num' field is used to indicate the local port value which can be
assigned to a module.

Increase the field from 8 bits to 10 bits in order to support more than
255 ports.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:34 +00:00
Amit Cohen
fd24b29a1b mlxsw: reg: Align existing registers to use extended local_port field
Add support for 10-bit local ports in device registers by making use of the
MLXSW_ITEM32_LP() macro that was added in the previous patch.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:33 +00:00
Amit Cohen
fda39347d9 mlxsw: item: Add support for local_port field in a split form
Currently, local_port field uses 8 bits, which means that maximum 256
ports can be used.

As preparation for the next ASIC, which will support more than 256
ports, local_port field should be extended to 10 bits.

It is not possible to use 10 consecutive bits in all registers, and
therefore, the field is split into 2 fields:
1. local_port - the existing 8 bits, represent LSB of the extended
   field.
2. lp_msb - extra 2 bits, represent MSB of the extended field.

To avoid complex programming when reading/writing local_port, add a
dedicated macro which creates get and set functions which handle both parts
of local_port.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:33 +00:00
Amit Cohen
b25dea489b mlxsw: reg: Remove unused functions
The functions mlxsw_reg_sfd_uc_unpack() and
mlxsw_reg_sfd_uc_lag_unpack() are not used. Remove them.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:33 +00:00
Amit Cohen
2cb310dc44 mlxsw: spectrum: Bump minimum FW version to xx.2010.1006
Add latest verified version of Nvidia Spectrum-family switch firmware,
for Spectrum (13.2010.1006), Spectrum-2 (29.2010.1006) and
Spectrum-3 (30.2010.1006).

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:52:33 +00:00
David S. Miller
8c659fdab0 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2021-11-30

This series contains updates to iavf driver only.

Patryk adds a debug message when MTU is changed.

Grzegorz adds messaging when transitioning in and out of multicast
promiscuous mode.

Jake returns correct error codes for iavf_parse_cls_flower().

Jedrzej adds messaging for when the driver is removed and refactors
struct usage to take less memory. He also adjusts ethtool statistics to
only display information on active queues.

Tony allows for user to specify the RSS hash.

Karen resolves some static analysis warnings, corrects format specifiers,
and rewords a message to come across as informational.

v2:
- Dropped patch 1 (for net) and 5
- Change MTU message from info to debug
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:46:03 +00:00
David S. Miller
749c69400a Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-11-30

This series contains updates to ice driver only.

Shiraz corrects assignment of boolean variable and removes an unused
enum.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:42:23 +00:00
David S. Miller
b8a841a9da Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-11-30

Jesper Dangaard Brouer says:

Changes to fix and enable XDP metadata to a specific Intel driver igc.
Tested with hardware i225 that uses driver igc, while testing AF_XDP
access to metadata area.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-01 14:41:02 +00:00
Ben Ben-Ishay
8c8cf03822 net/mlx5e: SHAMPO, Fix constant expression result
mlx5e_build_shampo_hd_umr uses counters i and index incorrectly
as unsigned, thus the err state err_unmap could stuck in endless loop.
Change i to int to solve the first issue.
Reduce index check to solve the second issue, the caller function
validates that index could not rotate.

Fixes: 64509b0525 ("net/mlx5e: Add data path for SHAMPO feature")
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:06 -08:00
Aya Levin
502e82b913 net/mlx5: Fix access to a non-supported register
Validate MRTC register is supported before triggering a delayed work
which accesses it.

Fixes: 5a1023deee ("net/mlx5: Add periodic update of host time to firmware")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:06 -08:00
Gal Pressman
924cc4633f net/mlx5: Fix too early queueing of log timestamp work
The log timestamp work should not be queued before the command interface
is initialized, move it to a later stage in the init flow.

Fixes: 5a1023deee ("net/mlx5: Add periodic update of host time to firmware")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:05 -08:00
Amir Tzin
76091b0fb6 net/mlx5: Fix use after free in mlx5_health_wait_pci_up
The device health recovery flow calls mlx5_health_wait_pci_up() which
queries the device for FW_RESET timeout after freeing the device
timeouts structure on mlx5_function_teardown(). Fix this bug by moving
timeouts structure init/cleanup to the device's init/uninit phases.
Since it is necessary to reset default software timeouts on function
reload, extract setting of defaults values from mlx5_tout_init() and
call mlx5_tout_set_def_val() directly from mlx5_function_setup().

Fixes: 5945e1adea ("net/mlx5: Read timeout values from init segment")
Reported by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:05 -08:00
Maor Dickman
e219440da0 net/mlx5: E-Switch, Use indirect table only if all destinations support it
When adding rule with multiple destinations, indirect table is used for all of
the destinations if at least one of the destinations support it, this can cause
creation of invalid indirect tables for the destinations that doesn't support it.

Fixed it by using indirect table only if all destinations support it.

Fixes: a508728a4c ("net/mlx5e: VF tunnel RX traffic offloading")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:05 -08:00
Dmytro Linkin
5c4e8ae7aa net/mlx5: E-Switch, Check group pointer before reading bw_share value
If log_esw_max_sched_depth is not supported group pointer of the vport
is NULL. Hence, check the pointer before reading bw_share value.

Fixes: 0fe132eac3 ("net/mlx5: E-switch, Allow to add vports to rate groups")
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:04 -08:00
Mark Bloch
43a0696f11 net/mlx5: E-Switch, fix single FDB creation on BlueField
Always use MLX5_FLOW_TABLE_OTHER_VPORT flag when creating egress ACL
table for single FDB. Not doing so on BlueField will make firmware fail
the command. On BlueField the E-Switch manager is the ECPF (vport 0xFFFE)
which is filled in the flow table creation command but as the
other_vport field wasn't set the firmware complains about a bad parameter.

This is different from a regular HCA where the E-Switch manager vport is
the PF (vport 0x0). Passing MLX5_FLOW_TABLE_OTHER_VPORT will make the
firmware happy both on BlueField and on regular HCAs without special
condition for each.

This fixes the bellow firmware syndrome:
mlx5_cmd_check:819:(pid 571): CREATE_FLOW_TABLE(0x930) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x754a4)

Fixes: db202995f5 ("net/mlx5: E-Switch, add logic to enable shared FDB")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:04 -08:00
Dmytro Linkin
1e59b32e45 net/mlx5: E-switch, Respect BW share of the new group
To enable transmit schduler on vport FW require non-zero configuration
for vport's TSAR. If vport added to the group which has configured BW
share value and TX rate values of the vport are zero, then scheduler
wouldn't be enabled on this vport.
Fix that by calling BW normalization if BW share of the new group is
configured.

Fixes: 0fe132eac3 ("net/mlx5: E-switch, Allow to add vports to rate groups")
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:04 -08:00
Maor Gottlieb
ffdf453152 net/mlx5: Lag, Fix recreation of VF LAG
Driver needs to nullify the port select attributes of the LAG when
port selection is destroyed, otherwise it breaks recreation of the
LAG.
It fixes the below kernel oops:

 [  587.906377] BUG: kernel NULL pointer dereference, address: 0000000000000008
 [  587.908843] #PF: supervisor read access in kernel mode
 [  587.910730] #PF: error_code(0x0000) - not-present page
 [  587.912580] PGD 0 P4D 0
 [  587.913632] Oops: 0000 [#1] SMP PTI
 [  587.914644] CPU: 5 PID: 165 Comm: kworker/u20:5 Tainted: G           OE     5.9.0_mlnx #1
 [  587.916152] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 [  587.918332] Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
 [  587.919479] RIP: 0010:mlx5_del_flow_rules+0x10/0x270 [mlx5_core]
 [  587.920568] mlx5_core 0000:08:00.1 enp8s0f1: Link up
 [  587.920680] Code: c0 09 80 a0 e8 cf 42 a4 e0 48 c7 c3 f4 ff ff ff e8 8a 88 dd e0 e9 ab fe ff ff 0f 1f 44 00 00 41 56 41 55 49 89 fd 41 54 55 53 <48> 8b 47 08 48 8b 68 28 48 85 ed 74 2e 48 8d 7d 38 e8 6a 64 34 e1
 [  587.925116] bond0: (slave enp8s0f1): Enslaving as an active interface with an up link
 [  587.930415] RSP: 0018:ffffc9000048fd88 EFLAGS: 00010282
 [  587.930417] RAX: ffff88846c14fac0 RBX: ffff88846cddcb80 RCX: 0000000080400007
 [  587.930417] RDX: 0000000080400008 RSI: ffff88846cddcb80 RDI: 0000000000000000
 [  587.930419] RBP: ffff88845fd80140 R08: 0000000000000001 R09: ffffffffa074ba00
 [  587.938132] R10: ffff88846c14fec0 R11: 0000000000000001 R12: ffff88846c122f10
 [  587.939473] R13: 0000000000000000 R14: 0000000000000001 R15: ffff88846d7a0000
 [  587.940800] FS:  0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000
 [  587.942416] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [  587.943536] CR2: 0000000000000008 CR3: 000000000240a002 CR4: 0000000000770ee0
 [  587.944904] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [  587.946308] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [  587.947639] PKRU: 55555554
 [  587.948236] Call Trace:
 [  587.948834]  mlx5_lag_destroy_definer.isra.3+0x16/0x90 [mlx5_core]
 [  587.950033]  mlx5_lag_destroy_definers+0x5b/0x80 [mlx5_core]
 [  587.951128]  mlx5_deactivate_lag+0x6e/0x80 [mlx5_core]
 [  587.952146]  mlx5_do_bond+0x150/0x450 [mlx5_core]
 [  587.953086]  mlx5_do_bond_work+0x3e/0x50 [mlx5_core]
 [  587.954086]  process_one_work+0x1eb/0x3e0
 [  587.954899]  worker_thread+0x2d/0x3c0
 [  587.955656]  ? process_one_work+0x3e0/0x3e0
 [  587.956493]  kthread+0x115/0x130
 [  587.957174]  ? kthread_park+0x90/0x90
 [  587.957929]  ret_from_fork+0x1f/0x30
 [  587.973055] ---[ end trace 71ccd6eca89f5513 ]---

Fixes: b7267869e9 ("net/mlx5: Lag, add support to create/destroy/modify port selection")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:03 -08:00
Moshe Shemesh
e45c0b3449 net/mlx5: Move MODIFY_RQT command to ignore list in internal error state
When the device is in internal error state, command interface isn't
accessible and the driver decides which commands to fail and which
to ignore.

Move the MODIFY_RQT command to the ignore list in order to avoid
the following redundant warning messages in internal error state:

mlx5_core 0000:82:00.1: mlx5e_rss_disable:419:(pid 23754): Failed to redirect RQT 0x0 to drop RQ 0xc00848: err = -5
mlx5_core 0000:82:00.1: mlx5e_rx_res_channels_deactivate:598:(pid 23754): Failed to redirect direct RQT 0x1 to drop RQ 0xc00848 (channel 0): err = -5
mlx5_core 0000:82:00.1: mlx5e_rx_res_channels_deactivate:607:(pid 23754): Failed to redirect XSK RQT 0x19 to drop RQ 0xc00848 (channel 0): err = -5

Fixes: 43ec0f41fa ("net/mlx5e: Hide all implementation details of mlx5e_rx_res")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:03 -08:00
Tariq Toukan
4cce2ccf08 net/mlx5e: Sync TIR params updates against concurrent create/modify
Transport Interface Receive (TIR) objects perform the packet processing and
reassembly and is also responsible for demultiplexing the packets into the
different RQs.

There are certain TIR context attributes that propagate to the pointed RQs
and applied to them (like packet_merge offloads (LRO/SHAMPO) and
tunneled_offload_en).  When TIRs do not agree on attributes values, a "last
one wins" policy is applied.  Hence, if not synced properly, a race between
TIR params update and a concurrent TIR create/modify operation might yield
to a mismatch between the shadow parameters in SW and the actual applied
state of the RQs in HW.

tunneled_offload_en is a fixed attribute per profile, while packet merge
offload state might be toggled and get out-of-sync. When this happens,
packet_merge offload might be working although not requested, or the
opposite.

All updates to packet_merge state and all create/modify operations of
regular redirection/steering TIRs are done under the same priv->state_lock,
so they do not run in parallel, and no race is possible.

However, there are other kind of TIRs (acceleration offloads TIRs, like TLS
TIRs) which are created on demand for each new connection without holding
the coarse priv->state_lock, hence might race.

Fix this by synchronizing all packet_merge state reads and writes against
all TIR create/modify operations. Include the modify operations of the
regular redirection steering TIRs under the new lock, for better code
layering and division of responsibilities.

Fixes: 1182f36593 ("net/mlx5e: kTLS, Add kTLS RX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:03 -08:00
Raed Salem
51ebf5db67 net/mlx5e: Fix missing IPsec statistics on uplink representor
The cited patch added the IPsec support to uplink representor, however
as uplink representors have his private statistics where IPsec stats
is not part of it, that effectively makes IPsec stats hidden when uplink
representor stats queried.

Resolve by adding IPsec stats to uplink representor private statistics.

Fixes: 5589b8f1a2 ("net/mlx5e: Add IPsec support to uplink representor")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:02 -08:00
Raed Salem
c65d638ab3 net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation
Current code wrongly uses the skb->protocol field which reflects the
outer l3 protocol to set the inner l3 type in Software Parser (SWP)
fields settings in the ethernet segment (eseg) in flows where inner
l3 exists like in Vxlan over ESP flow, the above method wrongly use
the outer protocol type instead of the inner one. thus breaking cases
where inner and outer headers have different protocols.

Fix by setting the inner l3 type in SWP according to the inner l3 ip
header version.

Fixes: 2ac9cfe782 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30 22:35:02 -08:00
Max Filippov
23ea630f86 net: natsemi: fix hw address initialization for jazz and xtensa
Use eth_hw_addr_set function instead of writing the address directly to
net_device::dev_addr.

Fixes: adeef3e321 ("net: constify netdev->dev_addr")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20211130143600.31970-1-jcmvbkbc@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-30 18:15:58 -08:00
Randy Dunlap
b0f38e1597 natsemi: xtensa: fix section mismatch warnings
Fix section mismatch warnings in xtsonic. The first one appears to be
bogus and after fixing the second one, the first one is gone.

WARNING: modpost: vmlinux.o(.text+0x529adc): Section mismatch in reference from the function sonic_get_stats() to the function .init.text:set_reset_devices()
The function sonic_get_stats() references
the function __init set_reset_devices().
This is often because sonic_get_stats lacks a __init
annotation or the annotation of set_reset_devices is wrong.

WARNING: modpost: vmlinux.o(.text+0x529b3b): Section mismatch in reference from the function xtsonic_probe() to the function .init.text:sonic_probe1()
The function xtsonic_probe() references
the function __init sonic_probe1().
This is often because xtsonic_probe lacks a __init
annotation or the annotation of sonic_probe1 is wrong.

Fixes: 74f2a5f0ef ("xtensa: Add support for the Sonic Ethernet device for the XT2000 board.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Chris Zankel <chris@zankel.net>
Cc: linux-xtensa@linux-xtensa.org
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Link: https://lore.kernel.org/r/20211130063947.7529-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-30 18:13:37 -08:00
Jedrzej Jagielski
64430f70ba iavf: Fix displaying queue statistics shown by ethtool
Driver provided too many lines as an output to ethtool -S command.
Return actual length of string set of ethtool stats. Instead of predefined
maximal value use the actual value on netdev, iterate over active queues.
Without this patch, ethtool -S report would produce additional
erroneous lines of queues that are not configured.

Signed-off-by: Witold Fijalkowski <witoldx.fijalkowski@intel.com>
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:07 -08:00
Karen Sornek
c2fbcc94d5 iavf: Refactor string format to avoid static analysis warnings
Change format to match variable type that is used in string.

Use %u format for unsigned variable and %d format for signed variable
to remove static analysis warnings.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:07 -08:00
Karen Sornek
fbe66f57d3 iavf: Refactor text of informational message
This message is intended to be informational to indicate a reset is about
to happen, but the use of "warning" in the message text can cause concern
with users.  Reword the message to make it less alarming.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:07 -08:00
Karen Sornek
349181b7b8 iavf: Fix static code analysis warning
Change min() to min_t() to fix static code analysis warning of possible
overflow.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:07 -08:00
Jedrzej Jagielski
4d0dbd9678 iavf: Refactor iavf_mac_filter struct memory usage
iavf_mac_filter struct contained couple boolean
flags using up more memory than is necessary.
Change the flags to be bitfields in an anonymous struct
so all the flags now fit in one byte.

Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:07 -08:00
Tony Nguyen
b231b59a2f iavf: Enable setting RSS hash key
Driver support for changing the RSS hash key exists, however, checks
have caused it to be reported as unsupported. Remove the check and
allow the hash key to be specified.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
2021-11-30 08:56:07 -08:00
Jedrzej Jagielski
bdb9e5c7ae iavf: Add trace while removing device
Add kernel trace that device was removed.
Currently there is no such information.
I.e. Host admin removes a PCI device from a VM,
than on VM shall be info about the event.

This patch adds info log to iavf_remove function.

Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:07 -08:00
Jacob Keller
9f4651ea3e iavf: return errno code instead of status code
The iavf_parse_cls_flower function returns an integer error code, and
not an iavf_status enumeration.

Fix the function to use the standard errno value EINVAL as its return
instead of using IAVF_ERR_CONFIG.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:06 -08:00
Grzegorz Szczurek
f1db020ba4 iavf: Log info when VF is entering and leaving Allmulti mode
Add log when VF is entering and leaving Allmulti mode.
The change of VF state is visible in dmesg now.
Without this commit, entering and leaving Allmulti mode
is not logged in dmesg.

Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:06 -08:00
Patryk Małek
aeb5d11fd1 iavf: Add change MTU message
Add a netdev_dbg log entry in case of a change of MTU so that user is
notified about this change in the same manner as in case of pf driver.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:56:00 -08:00
Jesper Dangaard Brouer
f51b5e2b59 igc: enable XDP metadata in driver
Enabling the XDP bpf_prog access to data_meta area is a very small
change. Hint passing 'true' to xdp_prepare_buff().

The SKB layers can also access data_meta area, which required more
driver changes to support. Reviewers, notice the igc driver have two
different functions that can create SKBs, depending on driver config.

Hint for testers, ethtool priv-flags legacy-rx enables
the function igc_construct_skb()

 ethtool --set-priv-flags DEV legacy-rx on

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:40:31 -08:00
Jesper Dangaard Brouer
4fa8fcd344 igc: AF_XDP zero-copy metadata adjust breaks SKBs on XDP_PASS
Driver already implicitly supports XDP metadata access in AF_XDP
zero-copy mode, as xsk_buff_pool's xp_alloc() naturally set xdp_buff
data_meta equal data.

This works fine for XDP and AF_XDP, but if a BPF-prog adjust via
bpf_xdp_adjust_meta() and choose to call XDP_PASS, then igc function
igc_construct_skb_zc() will construct an invalid SKB packet. The
function correctly include the xdp->data_meta area in the memcpy, but
forgot to pull header to take metasize into account.

Fixes: fc9df2a0b5 ("igc: Enable RX via AF_XDP zero-copy")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:19:25 -08:00
Shiraz Saleem
244714da8d net/ice: Remove unused enum
Remove ice_devlink_param_id enum as its not used.

Suggested-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:02:12 -08:00
Shiraz Saleem
7b62483f64 net/ice: Fix boolean assignment
vbool in ice_devlink_enable_roce_get can be assigned to a
non-0/1 constant.

Fix this assignment of vbool to be 0/1.

Fixes: e523af4ee5 ("net/ice: Add support for enable_iwarp and enable_roce devlink param")
Suggested-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30 08:02:12 -08:00
Lv Ruyi
9c32950f24 net: mscc: ocelot: fix mutex_lock not released
If err is true, the function will be returned, but mutex_lock isn't
released.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:30:26 +00:00
Wei Yongjun
c019087932 net: hns3: make symbol 'hclge_mac_speed_map_to_fw' static
The sparse tool complains as follows:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:2656:28: warning:
 symbol 'hclge_mac_speed_map_to_fw' was not declared. Should it be static?

This symbol is not used outside of hclge_main.c, so marks it static.

Fixes: e46da6a3d4 ("net: hns3: refine function hclge_cfg_mac_speed_dup_hw()")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:28:14 +00:00
Volodymyr Mytnyk
adefefe528 net: prestera: acl: add rule stats support
Make flower to use counter API to get rule HW statistics.

Co-developed-by: Serhiy Boiko <serhiy.boiko@marvell.com>
Signed-off-by: Serhiy Boiko <serhiy.boiko@marvell.com>
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:26:01 +00:00
Volodymyr Mytnyk
6e36c7bcb4 net: prestera: add counter HW API
Add counter API for getting HW statistics.

- HW statistics gathered by this API are deleyed.
- Batch of conters is supported.
- acl stat is supported.

Co-developed-by: Serhiy Boiko <serhiy.boiko@marvell.com>
Signed-off-by: Serhiy Boiko <serhiy.boiko@marvell.com>
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:26:01 +00:00
Volodymyr Mytnyk
47327e198d net: prestera: acl: migrate to new vTCAM api
- Add new vTCAM HW API to configure HW ACLs.
- Migrate acl to use new vTCAM HW API.
- No counter support in this patch-set.

Co-developed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:26:01 +00:00
Leon Romanovsky
4c897cfc46 devlink: Simplify devlink resources unregister call
The devlink_resources_unregister() used second parameter as an
entry point for the recursive removal of devlink resources. None
of the callers outside of devlink core needed to use this field,
so let's remove it.

As part of this removal, the "struct devlink_resource" was moved
from .h to .c file as it is not possible to use in any place in
the code except devlink.c.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:23:32 +00:00
Jean Sacren
6167597d44 net: cxgb: fix a typo in kernel doc
Fix a trivial typo of 'pakcet' in cxgb kernel doc.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:18:08 +00:00
Jean Sacren
067bb3c307 net: cxgb3: fix typos in kernel doc
Fix two trivial typos of 'pakcet' in cxgb3 kernel doc.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:17:46 +00:00
Dongliang Mu
f4a8adbfe4 dpaa2-eth: destroy workqueue at the end of remove function
The commit c55211892f ("dpaa2-eth: support PTP Sync packet one-step
timestamping") forgets to destroy workqueue at the end of remove
function.

Fix this by adding destroy_workqueue before fsl_mc_portal_free and
free_netdev.

Fixes: c55211892f ("dpaa2-eth: support PTP Sync packet one-step timestamping")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:14:33 +00:00
Yang Yingliang
2680ce7fc9 net: lantiq: fix missing free_netdev() on error in ltq_etop_probe()
Add the missing free_netdev() before return from ltq_etop_probe()
in the error handling case.

Fixes: 14d4e308e0 ("net: lantiq: configure the burst length in ethernet drivers")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:11:27 +00:00
Maciej Fijalkowski
d1ec975f9f ice: xsk: clear status_error0 for each allocated desc
Fix a bug in which the receiving of packets can stop in the zero-copy
driver. Ice HW ignores 3 lower bits from QRX_TAIL register, which means
that tail is bumped only on intervals of 8. Currently with XSK RX
batching in place, ice_alloc_rx_bufs_zc() clears the status_error0 only
of the last descriptor that has been allocated/taken from the XSK buffer
pool. status_error0 includes DD bit that is looked upon by the
ice_clean_rx_irq_zc() to tell if a descriptor can be processed.

The bug can be triggered when driver updates the ntu but not the
QRX_TAIL, so HW wouldn't have a chance to write to the ready
descriptors. Later on driver moves the ntc to the mentioned set of
descriptors and interprets them as a ready to be processed, since
corresponding DD bits were not cleared nor any writeback has happened
that would clear it. This can then lead to ntc == ntu case which means
that ring is empty and no further packet processing.

Fix the XSK traffic hang that can be observed when l2fwd scenario from
xdpsock is used by making sure that status_error0 is cleared for each
descriptor that is fed to HW and therefore we are sure that driver will
not processed non-valid DD bits. This will also prevent the driver from
processing the descriptors that were allocated in favor of the
previously processed ones, but writeback didn't happen yet.

Fixes: db804cfc21 ("ice: Use the xsk batched rx allocation interface")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:09:33 +00:00
Christophe JAILLET
b83f5ac7d9 net: marvell: mvpp2: Fix the computation of shared CPUs
'bitmap_fill()' fills a bitmap one 'long' at a time.
It is likely that an exact number of bits is expected.

Use 'bitmap_set()' instead in order not to set unexpected bits.

Fixes: e531f76757 ("net: mvpp2: handle cases where more CPUs are available than s/w threads")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:05:16 +00:00
Bhupesh Sharma
4047b9db1a net: stmmac: Add platform level debug register dump feature
dwmac-qcom-ethqos currently exposes a mechanism to dump rgmii registers
after the 'stmmac_dvr_probe()' returns. However with commit
5ec5582343 ("net: stmmac: add clocks management for gmac driver"),
we now let 'pm_runtime_put()' disable the clocks before returning from
'stmmac_dvr_probe()'.

This causes a crash when 'rgmii_dump()' register dumps are enabled,
as the clocks are already off.

Since other dwmac drivers (possible future users as well) might
require a similar register dump feature, introduce a platform level
callback to allow the same.

This fixes the crash noticed while enabling rgmii_dump() dumps in
dwmac-qcom-ethqos driver as well. It also allows future changes
to keep a invoking the register dump callback from the correct
place inside 'stmmac_dvr_probe()'.

Fixes: 5ec5582343 ("net: stmmac: add clocks management for gmac driver")
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 11:57:58 +00:00
Wei Yongjun
1a59c9c555 net: mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()
Add the missing mutex_unlock before return from function
ocelot_hwstamp_set() in the ocelot_setup_ptp_traps() error
handling case.

Fixes: 96ca08c058 ("net: mscc: ocelot: set up traps for PTP packets")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211129151652.1165433-1-weiyongjun1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 20:20:34 -08:00
Heiner Kallweit
09ae03e2fc stmmac: remove ethtool driver version info
I think there's no benefit in reporting a date from almost 6 yrs ago.
Let ethtool report the default (kernel version) instead.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:40:44 +00:00
Yufeng Mo
1d851c0905 net: hns3: split function hns3_set_l2l3l4()
Function hns3_set_l2l3l4() is a bit too long. So add two
new functions hns3_set_l3_type() and hns3_set_l4_csum_length()
to simplify code and improve code readability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Yufeng Mo
2fbf6a07f5 net: hns3: split function hns3_handle_bdinfo()
Function hns3_handle_bdinfo() is a bit too long. So add two
new functions hns3_handle_rx_ts_info() and hns3_handle_rx_vlan_tag(
to simplify code and improve code readability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Yufeng Mo
8469b645c9 net: hns3: split function hns3_nic_get_stats64()
Function hns3_nic_get_stats64() is a bit too long. So add a
new function hns3_fetch_stats() to simplify code and improve
code readability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Guangbin Huang
e06dac5290 net: hns3: refine function hclge_tm_pri_q_qs_cfg()
This patch encapsulates the process code for queue to qset config of two
mode(tc based and vnet based) into two function, for making code more
concise.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Guangbin Huang
7ca561be11 net: hns3: add new function hclge_tm_schd_mode_tc_base_cfg()
This patch encapsulates the process code of tc based schedule mode of
function hclge_tm_lvl34_schd_mode_cfg() into a new function
hclge_tm_schd_mode_tc_base_cfg(). It make code more concise and the new
process code can be reused.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Guangbin Huang
e46da6a3d4 net: hns3: refine function hclge_cfg_mac_speed_dup_hw()
To reuse the code of converting speed of driver to speed of firmware in
function hclge_cfg_mac_speed_dup_hw(), encapsulate them into a new
function hclge_convert_to_fw_speed().

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Yufeng Mo
a4ae2bc0ab net: hns3: split function hns3_get_tx_timeo_queue_info()
Function hns3_get_tx_timeo_queue_info() is a bit too long. So add two
new functions hns3_dump_queue_stats() and hns3_dump_queue_reg() to
simplify code and improve code readability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Jie Wang
e6fe5e1671 net: hns3: refactor two hns3 debugfs functions
Use for statement to optimize some print work of function
hclge_dbg_dump_rst_info() and hclge_dbg_dump_mac_enable_status() to
improve code simplicity.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Hao Chen
e74a726da2 net: hns3: refactor hns3_nic_reuse_page()
Split rx copybreak handle into a separate function from function
hns3_nic_reuse_page() to improve code simplicity.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Jiaran Zhang
ed0e658c51 net: hns3: refactor reset_prepare_general retry statement
Currently, the hclge_reset_prepare_general function uses the goto
statement to jump upwards, which increases code complexity and makes
the program structure difficult to understand. In addition, if
reset_pending is set, retry_cnt cannot be increased. This may result
in a failure to exit the retry or increase the number of retries.

Use the while statement instead to make the program easier to understand
and solve the problem that the goto statement cannot be exited.

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:26:17 +00:00
Sameer Saurabh
060a0fb721 atlantic: Remove warn trace message.
Remove the warn trace message - it's not a correct check here, because
the function can still be called on the device in DOWN state

Fixes: 508f2e3dce ("net: atlantic: split rx and tx per-queue stats")
Signed-off-by: Sameer Saurabh <ssaurabh@marvell.com>
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:24:22 +00:00
Dmitry Bogdanov
2087ced0fc atlantic: Fix statistics logic for production hardware
B0 is the main and widespread device revision of atlantic2 HW. In the
current state, driver will incorrectly fetch the statistics for this
revision.

Fixes: 5cfd54d7dc ("net: atlantic: minimal A2 fw_ops")
Signed-off-by: Dmitry Bogdanov <dbezrukov@marvell.com>
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:24:22 +00:00
Sameer Saurabh
03fa512189 Remove Half duplex mode speed capabilities.
Since Half Duplex mode has been deprecated by the firmware, driver should
not advertise Half Duplex speed in ethtool support link speed values.

Fixes: 071a02046c ("net: atlantic: A2: half duplex support")
Signed-off-by: Sameer Saurabh <ssaurabh@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:24:22 +00:00
Nikita Danilov
413d5e09ca atlantic: Add missing DIDs and fix 115c.
At the late production stages new dev ids were introduced. These are
now in production, so its important for the driver to recognize these.
And also fix the board caps for AQC115C adapter.

Fixes: b3f0c79cba ("net: atlantic: A2 hw_ops skeleton")
Signed-off-by: Nikita Danilov <ndanilov@aquantia.com>
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:24:21 +00:00
Sameer Saurabh
2465c80223 atlantic: Fix to display FW bundle version instead of FW mac version.
The correct way to reflect firmware version is to use bundle version.
Hence populating the same instead of MAC fw version.

Fixes: c1be0bf092 ("net: atlantic: common functions needed for basic A2 init/deinit hw_ops")
Signed-off-by: Sameer Saurabh <ssaurabh@marvell.com>
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:24:21 +00:00
Nikita Danilov
aa685acd98 atlatnic: enable Nbase-t speeds with base-t
When 2.5G is advertised, N-Base should be advertised against the T-base
caps. N5G is out of use in baseline code and driver should treat both 5G
and N5G (and also 2.5G and N2.5G) equally from user perspective.

Fixes: 5cfd54d7dc ("net: atlantic: minimal A2 fw_ops")
Signed-off-by: Nikita Danilov <ndanilov@aquantia.com>
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:24:21 +00:00
Dmitry Bogdanov
aa1dcb5646 atlantic: Increase delay for fw transactions
The max waiting period (of 1 ms) while reading the data from FW shared
buffer is too small for certain types of data (e.g., stats). There's a
chance that FW could be updating buffer at the same time and driver
would be unsuccessful in reading data. Firmware manual recommends to
have 1 sec timeout to fix this issue.

Fixes: 5cfd54d7dc ("net: atlantic: minimal A2 fw_ops")
Signed-off-by: Dmitry Bogdanov <dbezrukov@marvell.com>
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 14:24:21 +00:00
Erik Ekman
2191b1dfef net/mlx4_en: Update reported link modes for 1/10G
When link modes were initially added in commit 2c76267943
("net/mlx4_en: Use PTYS register to query ethtool settings") and
later updated for the new ethtool API in commit 3d8f7cc78d
("net: mlx4: use new ETHTOOL_G/SSETTINGS API") the only 1/10G non-baseT
link modes configured were 1000baseKX, 10000baseKX4 and 10000baseKR.
It looks like these got picked to represent other modes since nothing
better was available.

Switch to using more specific link modes added in commit 5711a98221
("net: ethtool: add support for 1000BaseX and missing 10G link modes").

Tested with MCX311A-XCAT connected via DAC.
Before:

% sudo ethtool enp3s0
Settings for enp3s0:
	Supported ports: [ FIBRE ]
	Supported link modes:   1000baseKX/Full
	                        10000baseKR/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  1000baseKX/Full
	                        10000baseKR/Full
	Advertised pause frame use: Symmetric
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: 10000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: Direct Attach Copper
	PHYAD: 0
	Transceiver: internal
	Supports Wake-on: d
	Wake-on: d
        Current message level: 0x00000014 (20)
                               link ifdown
	Link detected: yes

With this change:

% sudo ethtool enp3s0
	Settings for enp3s0:
	Supported ports: [ FIBRE ]
	Supported link modes:   1000baseX/Full
	                        10000baseCR/Full
 	                        10000baseSR/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  1000baseX/Full
 	                        10000baseCR/Full
 	                        10000baseSR/Full
	Advertised pause frame use: Symmetric
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: 10000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: Direct Attach Copper
	PHYAD: 0
	Transceiver: internal
	Supports Wake-on: d
	Wake-on: d
        Current message level: 0x00000014 (20)
                               link ifdown
	Link detected: yes

Tested-by: Michael Stapelberg <michael@stapelberg.ch>
Signed-off-by: Erik Ekman <erik@kryo.se>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 13:03:32 +00:00
Horatiu Vultur
12c2d0a5b8 net: lan966x: add ethtool configuration and statistics
This patch adds support for statistics counters for the network
interfaces. Also adds support for configuring the network interface via
ethtool like: speed, duplex etc.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:58:38 +00:00
Horatiu Vultur
e18aba8941 net: lan966x: add mactable support
This patch adds support for MAC table operations like add and forget.
Also add the functionality to read the MAC address from DT, if there is
no MAC set in DT it would use a random one.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:58:38 +00:00
Horatiu Vultur
d28d6d2e37 net: lan966x: add port module support
This patch adds support for netdev and phylink in the switch. The
injection + extraction is register based. This will be replaced with DMA
accees.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:58:38 +00:00
Horatiu Vultur
db8bcaad53 net: lan966x: add the basic lan966x driver
This patch adds basic SwitchDev driver framework for lan966x. It
includes only the IO range mapping and probing of the switch.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:58:38 +00:00
Hao Chen
e54b708c54 net: hns3: use macro IANA_VXLAN_GPE_UDP_PORT to replace number 4790
This patch uses macro IANA_VXLAN_GPE_UDP_PORT to replace number 4790 for
cleanup.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:19:53 +00:00
Vincent Whitchurch
f8e7dfd6fd net: stmmac: Avoid DMA_CHAN_CONTROL write if no Split Header support
The driver assumes that split headers can be enabled/disabled without
stopping/starting the device, so it writes DMA_CHAN_CONTROL from
stmmac_set_features().  However, on my system (IP v5.10a without Split
Header support), simply writing DMA_CHAN_CONTROL when DMA is running
(for example, with the commands below) leads to a TX watchdog timeout.

 host$ socat TCP-LISTEN:1024,fork,reuseaddr - &
 device$ ethtool -K eth0 tso off
 device$ ethtool -K eth0 tso on
 device$ dd if=/dev/zero bs=1M count=10 | socat - TCP4:host:1024
 <tx watchdog timeout>

Note that since my IP is configured without Split Header support, the
driver always just reads and writes the same value to the
DMA_CHAN_CONTROL register.

I don't have access to any platforms with Split Header support so I
don't know if these writes to the DMA_CHAN_CONTROL while DMA is running
actually work properly on such systems.  I could not find anything in
the databook that says that DMA_CHAN_CONTROL should not be written when
the DMA is running.

But on systems without Split Header support, there is in any case no
need to call enable_sph() in stmmac_set_features() at all since SPH can
never be toggled, so we can avoid the watchdog timeout there by skipping
this call.

Fixes: 8c6fc097a2 ("net: stmmac: gmac4+: Add Split Header support")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:08:05 +00:00
Maxime Chevallier
2551dc9e39 net: mvneta: Add TC traffic shaping offload
The mvneta controller is able to do some tocken-bucket per-queue traffic
shaping. This commit adds support for setting these using the TC mqprio
interface.

The token-bucket parameters are customisable, but the current
implementation configures them to have a 10kbps resolution for the
rate limitation, since it allows to cover the whole range of max_rate
values from 10kbps to 5Gbps with 10kbps increments.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:05:52 +00:00
Maxime Chevallier
e9f7099d07 net: mvneta: Allow having more than one queue per TC
The current mqprio implementation assumed that we are only using one
queue per TC. Use the offset and count parameters to allow using
multiple queues per TC. In that case, the controller will use a standard
round-robin algorithm to pick queues assigned to the same TC, with the
same priority.

This only applies to VLAN priorities in ingress traffic, each TC
corresponding to a vlan priority.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:05:52 +00:00
Maxime Chevallier
e7ca75fe66 net: mvneta: Don't force-set the offloading flag
The qopt->hw flag is set by the TC code according to the offloading mode
asked by user. Don't force-set it in the driver, but instead read it to
make sure we do what's asked.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:05:52 +00:00
Maxime Chevallier
75fa71e3ac net: mvneta: Use struct tc_mqprio_qopt_offload for MQPrio configuration
The struct tc_mqprio_qopt_offload is a container for struct tc_mqprio_qopt,
that allows passing extra parameters, such as traffic shaping. This commit
converts the current mqprio code to that new struct.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29 12:05:52 +00:00
Jakub Kicinski
93d5404e89 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ipa/ipa_main.c
  8afc7e471a ("net: ipa: separate disabling setup from modem stop")
  76b5fbcd6b ("net: ipa: kill ipa_modem_init()")

Duplicated include, drop one.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 13:45:19 -08:00
Vladimir Oltean
c49a35eedf net: mscc: ocelot: correctly report the timestamping RX filters in ethtool
The driver doesn't support RX timestamping for non-PTP packets, but it
declares that it does. Restrict the reported RX filters to PTP v2 over
L2 and over L4.

Fixes: 4e3b0468e6 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:38:21 -08:00
Vladimir Oltean
96ca08c058 net: mscc: ocelot: set up traps for PTP packets
IEEE 1588 support was declared too soon for the Ocelot switch. Out of
reset, this switch does not apply any special treatment for PTP packets,
i.e. when an event message is received, the natural tendency is to
forward it by MAC DA/VLAN ID. This poses a problem when the ingress port
is under a bridge, since user space application stacks (written
primarily for endpoint ports, not switches) like ptp4l expect that PTP
messages are always received on AF_PACKET / AF_INET sockets (depending
on the PTP transport being used), and never being autonomously
forwarded. Any forwarding, if necessary (for example in Transparent
Clock mode) is handled in software by ptp4l. Having the hardware forward
these packets too will cause duplicates which will confuse endpoints
connected to these switches.

So PTP over L2 barely works, in the sense that PTP packets reach the CPU
port, but they reach it via flooding, and therefore reach lots of other
unwanted destinations too. But PTP over IPv4/IPv6 does not work at all.
This is because the Ocelot switch have a separate destination port mask
for unknown IP multicast (which PTP over IP is) flooding compared to
unknown non-IP multicast (which PTP over L2 is) flooding. Specifically,
the driver allows the CPU port to be in the PGID_MC port group, but not
in PGID_MCIPV4 and PGID_MCIPV6. There are several presentations from
Allan Nielsen which explain that the embedded MIPS CPU on Ocelot
switches is not very powerful at all, so every penny they could save by
not allowing flooding to the CPU port module matters. Unknown IP
multicast did not make it.

The de facto consensus is that when a switch is PTP-aware and an
application stack for PTP is running, switches should have some sort of
trapping mechanism for PTP packets, to extract them from the hardware
data path. This avoids both problems:
(a) PTP packets are no longer flooded to unwanted destinations
(b) PTP over IP packets are no longer denied from reaching the CPU since
    they arrive there via a trap and not via flooding

It is not the first time when this change is attempted. Last time, the
feedback from Allan Nielsen and Andrew Lunn was that the traps should
not be installed by default, and that PTP-unaware switching may be
desired for some use cases:
https://patchwork.ozlabs.org/project/netdev/patch/20190813025214.18601-5-yangbo.lu@nxp.com/

To address that feedback, the present patch adds the necessary packet
traps according to the RX filter configuration transmitted by user space
through the SIOCSHWTSTAMP ioctl. Trapping is done via VCAP IS2, where we
keep 5 filters, which are amended each time RX timestamping is enabled
or disabled on a port:
- 1 for PTP over L2
- 2 for PTP over IPv4 (UDP ports 319 and 320)
- 2 for PTP over IPv6 (UDP ports 319 and 320)

The cookie by which these filters (invisible to tc) are identified is
strategically chosen such that it does not collide with the filters used
for the ocelot-8021q tagging protocol by the Felix driver, or with the
MRP traps set up by the Ocelot library.

Other alternatives were considered, like patching user space to do
something, but there are so many ways in which PTP packets could be made
to reach the CPU, generically speaking, that "do what?" is a very valid
question. The ptp4l program from the linuxptp stack already attempts to
do something: it calls setsockopt(IP_ADD_MEMBERSHIP) (and
PACKET_ADD_MEMBERSHIP, respectively) which translates in both cases into
a dev_mc_add() on the interface, in the kernel:
https://github.com/richardcochran/linuxptp/blob/v3.1.1/udp.c#L73
https://github.com/richardcochran/linuxptp/blob/v3.1.1/raw.c

Reality shows that this is not sufficient in case the interface belongs
to a switchdev driver, as dev_mc_add() does not show the intention to
trap a packet to the CPU, but rather the intention to not drop it (it is
strictly for RX filtering, same as promiscuous does not mean to send all
traffic to the CPU, but to not drop traffic with unknown MAC DA). This
topic is a can of worms in itself, and it would be great if user space
could just stay out of it.

On the other hand, setting up PTP traps privately within the driver is
not new by any stretch of the imagination:
https://elixir.bootlin.com/linux/v5.16-rc2/source/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c#L833
https://elixir.bootlin.com/linux/v5.16-rc2/source/drivers/net/dsa/hirschmann/hellcreek.c#L1050
https://elixir.bootlin.com/linux/v5.16-rc2/source/include/linux/dsa/sja1105.h#L21

So this is the approach taken here as well. The difference here being
that we prepare and destroy the traps per port, dynamically at runtime,
as opposed to driver init time, because apparently, PTP-unaware
forwarding is a use case.

Fixes: 4e3b0468e6 ("net: mscc: PTP Hardware Clock (PHC) support")
Reported-by: Po Liu <po.liu@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:38:21 -08:00
Vladimir Oltean
95706be13b net: mscc: ocelot: create a function that replaces an existing VCAP filter
VCAP (Versatile Content Aware Processor) is the TCAM-based engine behind
tc flower offload on ocelot, among other things. The ingress port mask
on which VCAP rules match is present as a bit field in the actual key of
the rule. This means that it is possible for a rule to be shared among
multiple source ports. When the rule is added one by one on each desired
port, that the ingress port mask of the key must be edited and rewritten
to hardware.

But the API in ocelot_vcap.c does not allow for this. For one thing,
ocelot_vcap_filter_add() and ocelot_vcap_filter_del() are not symmetric,
because ocelot_vcap_filter_add() works with a preallocated and
prepopulated filter and programs it to hardware, and
ocelot_vcap_filter_del() does both the job of removing the specified
filter from hardware, as well as kfreeing it. That is to say, the only
option of editing a filter in place, which is to delete it, modify the
structure and add it back, does not work because it results in
use-after-free.

This patch introduces ocelot_vcap_filter_replace, which trivially
reprograms a VCAP entry to hardware, at the exact same index at which it
existed before, without modifying any list or allocating any memory.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:38:20 -08:00
Vladimir Oltean
8a075464d1 net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP
The ocelot driver, when asked to timestamp all receiving packets, 1588
v1 or NTP, says "nah, here's 1588 v2 for you".

According to this discussion:
https://patchwork.kernel.org/project/netdevbpf/patch/20211104133204.19757-8-martin.kaistra@linutronix.de/#24577647
drivers that downgrade from a wider request to a narrower response (or
even a response where the intersection with the request is empty) are
buggy, and should return -ERANGE instead. This patch fixes that.

Fixes: 4e3b0468e6 ("net: mscc: PTP Hardware Clock (PHC) support")
Suggested-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:38:20 -08:00
Jie Wang
82229c4dbb net: hns3: fix incorrect components info of ethtool --reset command
Currently, HNS3 driver doesn't clear the reset flags of components after
successfully executing reset, it causes userspace info of
"Components reset" and "Components not reset" is incorrect.

So fix this problem by clear corresponding reset flag after reset process.

Fixes: ddccc5e368 ("net: hns3: add support for triggering reset by ethtool")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:36:29 -08:00
Hao Chen
9c14791748 net: hns3: fix one incorrect value of page pool info when queried by debugfs
Currently, when user queries page pool info by debugfs command
"cat page_pool_info", the cnt of allocated page for page pool may be
incorrect because of memory inconsistency problem caused by compiler
optimization.

So this patch uses READ_ONCE() to read value of pages_state_hold_cnt to
fix this problem.

Fixes: 850bfb912a ("net: hns3: debugfs add support dumping page pool info")
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:36:29 -08:00
Hao Chen
b8af344cfe net: hns3: add check NULL address for page pool
When page pool is not enabled, its address value is still NULL and page
pool should not be accessed, so add a check for it.

Fixes: 850bfb912a ("net: hns3: debugfs add support dumping page pool info")
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:36:29 -08:00
Guangbin Huang
8d2ad993aa net: hns3: fix VF RSS failed problem after PF enable multi-TCs
When PF is set to multi-TCs and configured mapping relationship between
priorities and TCs, the hardware will active these settings for this PF
and its VFs.

In this case when VF just uses one TC and its rx packets contain priority,
and if the priority is not mapped to TC0, as other TCs of VF is not valid,
hardware always put this kind of packets to the queue 0. It cause this kind
of packets of VF can not be used RSS function.

To fix this problem, set tc mode of all unused TCs of VF to the setting of
TC0, then rx packet with priority which map to unused TC will be direct to
TC0.

Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:36:29 -08:00
zhangyue
0435a4d080 net: qed: fix the array may be out of bound
If the variable 'p_bit->flags' is always 0,
the loop condition is always 0.

The variable 'j' may be greater than or equal to 32.

At this time, the array 'p_aeu->bits[32]' may be out
of bound.

Signed-off-by: zhangyue <zhangyue1@kylinos.cn>
Link: https://lore.kernel.org/r/20211125113610.273841-1-zhangyue1@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:29:20 -08:00
Yannick Vignon
b270bfe697 net: stmmac: Disable Tx queues when reconfiguring the interface
The Tx queues were not disabled in situations where the driver needed to
stop the interface to apply a new configuration. This could result in a
kernel panic when doing any of the 3 following actions:
* reconfiguring the number of queues (ethtool -L)
* reconfiguring the size of the ring buffers (ethtool -G)
* installing/removing an XDP program (ip l set dev ethX xdp)

Prevent the panic by making sure netif_tx_disable is called when stopping
an interface.

Without this patch, the following kernel panic can be observed when doing
any of the actions above:

Unable to handle kernel paging request at virtual address ffff80001238d040
[....]
 Call trace:
  dwmac4_set_addr+0x8/0x10
  dev_hard_start_xmit+0xe4/0x1ac
  sch_direct_xmit+0xe8/0x39c
  __dev_queue_xmit+0x3ec/0xaf0
  dev_queue_xmit+0x14/0x20
[...]
[ end trace 0000000000000002 ]---

Fixes: 5fabb01207 ("net: stmmac: Add initial XDP support")
Fixes: aa042f60e4 ("net: stmmac: Add support to Ethtool get/set ring parameters")
Fixes: 0366f7e06a ("net: stmmac: add ethtool support for get/set channels")
Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com>
Link: https://lore.kernel.org/r/20211124154731.1676949-1-yannick.vignon@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 10:38:37 -08:00
Vladimir Oltean
8abe197038 net: dsa: felix: enable cut-through forwarding between ports by default
The VSC9959 switch embedded within NXP LS1028A (and that version of
Ocelot switches only) supports cut-through forwarding - meaning it can
start the process of looking up the destination ports for a packet, and
forward towards those ports, before the entire packet has been received
(as opposed to the store-and-forward mode).

The up side is having lower forwarding latency for large packets. The
down side is that frames with FCS errors are forwarded instead of being
dropped. However, erroneous frames do not result in incorrect updates of
the FDB or incorrect policer updates, since these processes are deferred
inside the switch to the end of frame. Since the switch starts the
cut-through forwarding process after all packet headers (including IP,
if any) have been processed, packets with large headers and small
payload do not see the benefit of lower forwarding latency.

There are two cases that need special attention.

The first is when a packet is multicast (or flooded) to multiple
destinations, one of which doesn't have cut-through forwarding enabled.
The switch deals with this automatically by disabling cut-through
forwarding for the frame towards all destination ports.

The second is when a packet is forwarded from a port of lower link speed
towards a port of higher link speed. This is not handled by the hardware
and needs software intervention.

Since we practically need to update the cut-through forwarding domain
from paths that aren't serialized by the rtnl_mutex (phylink
mac_link_down/mac_link_up ops), this means we need to serialize physical
link events with user space updates of bonding/bridging domains.

Enabling cut-through forwarding is done per {egress port, traffic class}.
I don't see any reason why this would be a configurable option as long
as it works without issues, and there doesn't appear to be any user
space configuration tool to toggle this on/off, so this patch enables
cut-through forwarding on all eligible ports and traffic classes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211125125808.2383984-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:32:07 -08:00
Vladimir Oltean
a8bd9fa5b5 net: ocelot: remove "bridge" argument from ocelot_get_bridge_fwd_mask
The only called takes ocelot_port->bridge and passes it as the "bridge"
argument to this function, which then compares it with
ocelot_port->bridge. This is not useful.

Instead, we would like this function to return 0 if ocelot_port->bridge
is not present, which is what this patch does.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211125125808.2383984-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:32:07 -08:00
Ong Boon Leong
61da6ac715 net: stmmac: perserve TX and RX coalesce value during XDP setup
When XDP program is loaded, it is desirable that the previous TX and RX
coalesce values are not re-inited to its default value. This prevents
unnecessary re-configurig the coalesce values that were working fine
before.

Fixes: ac746c8520 ("net: stmmac: enhance XDP ZC driver level switching performance")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20211124114019.3949125-1-boon.leong.ong@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:27:13 -08:00
Yang Yingliang
739752d655 tsnep: Add missing of_node_put() in tsnep_mdio_init()
The node pointer is returned by of_get_child_by_name() with
refcount incremented in tsnep_mdio_init(). Calling of_node_put()
to aovid the refcount leak in tsnep_mdio_init().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211124084048.175456-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:20:34 -08:00
Russell King (Oracle)
cc0a75eb03 net: macb: convert to phylink_generic_validate()
Populate the supported interfaces bitmap and MAC capabilities mask for
the macb driver and remove the old validate implementation.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1mpuRv-00D4rb-Lz@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:13:21 -08:00
Heiner Kallweit
4e9c91cf92 r8169: disable detection of chip version 60
It seems only XID 609 made it to the mass market. Therefore let's
disable detection of the other RTL8125a XID's. If nobody complains
we can remove support for RTL_GIGA_MAC_VER_60 later.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/2cd3df01-5f8b-08dd-6def-3f31a3014bde@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:10:12 -08:00
Jesse Brandeburg
eaeace6077 igb: fix netpoll exit with traffic
Oleksandr brought a bug report where netpoll causes trace
messages in the log on igb.

Danielle brought this back up as still occurring, so we'll try
again.

[22038.710800] ------------[ cut here ]------------
[22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll
[22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 netpoll_poll_dev+0x18a/0x1a0

As Alex suggested, change the driver to return work_done at the
exit of napi_poll, which should be safe to do in this driver
because it is not polling multiple queues in this single napi
context (multiple queues attached to one MSI-X vector). Several
other drivers contain the same simple sequence, so I hope
this will not create new problems.

Fixes: 16eb8815c2 ("igb: Refactor clean_rx_irq to reduce overhead and improve performance")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Danielle Ratson <danieller@nvidia.com>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/20211123204000.1597971-1-jesse.brandeburg@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 07:39:31 -08:00
Heiner Kallweit
ddb826c2c9 lan743x: fix deadlock in lan743x_phy_link_status_change()
Usage of phy_ethtool_get_link_ksettings() in the link status change
handler isn't needed, and in combination with the referenced change
it results in a deadlock. Simply remove the call and replace it with
direct access to phydev->speed. The duplex argument of
lan743x_phy_update_flowcontrol() isn't used and can be removed.

Fixes: c10a485c3d ("phy: phy_ethtool_ksettings_get: Lock the phy for consistency")
Reported-by: Alessandro B Maurici <abmaurici@gmail.com>
Tested-by: Alessandro B Maurici <abmaurici@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/40e27f76-0ba3-dcef-ee32-a78b9df38b0f@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24 18:19:58 -08:00
Rahul Lakkireddy
e670e1e86b cxgb4: allow reading unrecognized port module eeprom
Even if firmware fails to recognize the plugged-in port module type,
allow reading port module EEPROM anyway. This helps in obtaining
necessary diagnostics information for debugging and analysis.

Signed-off-by: Manoj Malviya <manojmalviya@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Link: https://lore.kernel.org/r/1637682437-31407-1-git-send-email-rahul.lakkireddy@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24 17:27:07 -08:00
Gerhard Engleder
1aad9634b9 tsnep: Fix resource_size cocci warning
The following warning is fixed, by removing the unused resource size:

drivers/net/ethernet/engleder/tsnep_main.c:1155:21-24:
WARNING: Suspicious code. resource_size is maybe missing with io

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/20211124205225.13985-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24 17:09:13 -08:00
Yang Li
6a9d66a05b tsnep: fix platform_no_drv_owner.cocci warning
Remove .owner field if calls are used which set it automatically

Eliminate the following coccicheck warning:
./drivers/net/ethernet/engleder/tsnep_main.c:1263:3-8: No need to set
.owner here. The core will do it.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/1637721384-70836-2-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24 17:09:07 -08:00
Yufeng Mo
db596298ed net: hns3: add dql info when tx timeout
When tx timeout occurs, the info of dql maybe helpful, so print
these info to hns3_get_tx_timeo_queue_info().

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-24 14:12:26 +00:00
Jie Wang
8488e3c682 net: hns3: debugfs add drop packet statistics of multicast and broadcast for igu
Currently, there is no way to get drop packet number of multicast and
broadcast in IGU hardware module, it is not convenient to find problem
when multicast packet or broadcast packet is dropped in IGU, so this
patch adds statistics for them in debugfs.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-24 14:12:26 +00:00
Yufeng Mo
4f331fda35 net: hns3: format the output of the MAC address
Printing the whole MAC addresse may bring security risks. Therefore,
the MAC address is partially encrypted to improve security.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-24 14:12:26 +00:00
Yufeng Mo
d9069dab20 net: hns3: add log for workqueue scheduled late
When the mbx or reset message arrives, the driver is informed
through an interrupt. This task can be processed only after
the workqueue is scheduled. In some cases, this workqueue
scheduling takes a long time. As a result, the mbx or reset
service task cannot be processed in time. So add some warning
message to improve debugging efficiency for this case.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-24 14:12:26 +00:00
Kurt Kanzenbach
c6d5f19330 net: stmmac: Calculate CDC error only once
The clock domain crossing error (CDC) is calculated at every fetch of Tx or Rx
timestamps. It includes a division. Especially on arm32 based systems it is
expensive. It also requires two conditionals in the hotpath.

Add a compensation value cache to struct plat_stmmacenet_data and subtract it
unconditionally in the RX/TX functions which spares the conditionals.

The value is initialized to 0 and if supported calculated in the PTP
initialization code.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20211122111931.135135-1-kurt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-23 20:11:05 -08:00
Marek Behún
7b1b62bc1e net: marvell: mvpp2: increase MTU limit when XDP enabled
Currently mvpp2_xdp_setup won't allow attaching XDP program if
  mtu > ETH_DATA_LEN (1500).

The mvpp2_change_mtu on the other hand checks whether
  MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE.

These two checks are semantically different.

Moreover this limit can be increased to MVPP2_MAX_RX_BUF_SIZE, since in
mvpp2_rx we have
  xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
  xdp.frame_sz = PAGE_SIZE;

Change the checks to check whether
  mtu > MVPP2_MAX_RX_BUF_SIZE

Fixes: 07dd0a7aae ("mvpp2: add basic XDP support")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 12:37:09 +00:00
Jakub Kicinski
2106efda78 net: remove .ndo_change_proto_down
.ndo_change_proto_down was added seemingly to enable out-of-tree
implementations. Over 2.5yrs later we still have no real users
upstream. Hardwire the generic implementation for now, we can
revert once real users materialize. (rocker is a test vehicle,
not a user.)

We need to drop the optimization on the sysfs side, because
unlike ndos priv_flags will be changed at runtime, so we'd
need READ_ONCE/WRITE_ONCE everywhere..

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 12:18:48 +00:00
David S. Miller
c384cee14a Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-11-22

Shiraz Saleem says:

Currently E800 devices come up as RoCEv2 devices by default.

This series add supports for users to configure iWARP or RoCEv2 functionality
per PCI function. devlink parameters is used to realize this and is keyed
off similar work in [1].

[1] https://lore.kernel.org/linux-rdma/20210810132424.9129-1-parav@nvidia.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 12:17:24 +00:00
Zheyu Ma
b82d71c0f8 net: chelsio: cxgb4vf: Fix an error code in cxgb4vf_pci_probe()
During the process of driver probing, probe function should return < 0
for failure, otherwise kernel will treat value == 0 as success.

Therefore, we should set err to -EINVAL when
adapter->registered_device_map is NULL. Otherwise kernel will assume
that driver has been successfully probed and will cause unexpected
errors.

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 12:15:53 +00:00
Marek Behún
4043ec701c net: marvell: mvpp2: Add support for 5gbase-r
Add support for PHY_INTERFACE_MODE_5GBASER.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 12:14:48 +00:00
Heiner Kallweit
c75a9ad436 r8169: fix incorrect mac address assignment
The original changes brakes MAC address assignment on older chip
versions (see bug report [0]), and it brakes random MAC assignment.

is_valid_ether_addr() requires that its argument is word-aligned.
Add the missing alignment to array mac_addr.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=215087

Fixes: 1c5d09d587 ("ethernet: r8169: use eth_hw_addr_set()")
Reported-by: Richard Herbert <rherbert@sympatico.ca>
Tested-by: Richard Herbert <rherbert@sympatico.ca>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 12:12:37 +00:00
Gerhard Engleder
75e4720651 tsnep: Fix set MAC address
Commit 4dfb998264 ("tsn:  Fix build.") fixed compilation with const
dev_addr. In tsnep_netdev_set_mac_address() the call of ether_addr_copy()
was replaced with dev_set_mac_address(), which calls
ndo_set_mac_address(). This results in an endless recursive loop because
ndo_set_mac_address is set to tsnep_netdev_set_mac_address.

Call eth_hw_addr_set() instead of dev_set_mac_address() in
ndo_set_mac_address()/tsnep_netdev_set_mac_address() to copy the address
as intended.

[   26.563303] Insufficient stack space to handle exception!
[   26.563312] ESR: 0x96000047 -- DABT (current EL)
[   26.563317] FAR: 0xffff80000a507fc0
[   26.563320] Task stack:     [0xffff80000a508000..0xffff80000a50c000]
[   26.563324] IRQ stack:      [0xffff80000a0c0000..0xffff80000a0c4000]
[   26.563327] Overflow stack: [0xffff00007fbaf2b0..0xffff00007fbb02b0]
[   26.563333] CPU: 3 PID: 381 Comm: ifconfig Not tainted 5.16.0-rc1-zynqmp #60
[   26.563340] Hardware name: TSN endpoint (DT)
[   26.563343] pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   26.563351] pc : inetdev_event+0x4/0x560
[   26.563364] lr : raw_notifier_call_chain+0x54/0x78
[   26.563372] sp : ffff80000a508040
[   26.563374] x29: ffff80000a508040 x28: ffff00000132b800 x27: 0000000000000000
[   26.563386] x26: 0000000000000000 x25: ffff800000ea5058 x24: 0904030201020001
[   26.563396] x23: ffff800000ea5058 x22: ffff80000a5080e0 x21: 0000000000000009
[   26.563405] x20: 00000000fffffffa x19: ffff80000a009510 x18: 0000000000000000
[   26.563414] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffd1341030
[   26.563422] x14: ffffffffffffffff x13: 0000000000000020 x12: 0101010101010101
[   26.563432] x11: 0000000000000020 x10: 0101010101010101 x9 : 7f7f7f7f7f7f7f7f
[   26.563441] x8 : 7f7f7f7f7f7f7f7f x7 : fefefeff30677364 x6 : 0000000080808080
[   26.563450] x5 : 0000000000000000 x4 : ffff800008dee170 x3 : ffff80000a50bd42
[   26.563459] x2 : ffff80000a5080e0 x1 : 0000000000000009 x0 : ffff80000a0092d0
[   26.563470] Kernel panic - not syncing: kernel stack overflow
[   26.563474] CPU: 3 PID: 381 Comm: ifconfig Not tainted 5.16.0-rc1-zynqmp #60
[   26.563481] Hardware name: TSN endpoint (DT)
[   26.563484] Call trace:
[   26.563486]  dump_backtrace+0x0/0x1b0
[   26.563497]  show_stack+0x18/0x68
[   26.563504]  dump_stack_lvl+0x68/0x84
[   26.563513]  dump_stack+0x18/0x34
[   26.563519]  panic+0x164/0x324
[   26.563524]  nmi_panic+0x64/0x98
[   26.563533]  panic_bad_stack+0x108/0x128
[   2k6.563539]  handle_bad_stack+0x38/0x68
[   26.563548]  __bad_stack+0x88/0x8c
[   26.563553]  inetdev_event+0x4/0x560
[   26.563560]  call_netdevice_notifiers_info+0x58/0xa8
[   26.563569]  dev_set_mac_address+0x78/0x110
[   26.563576]  tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
[   26.563591]  dev_set_mac_address+0xc4/0x110
[   26.563599]  tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
...
[   26.565444]  dev_set_mac_address+0xc4/0x110
[   26.565452]  tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
[   26.565462]  dev_set_mac_address+0xc4/0x110
[   26.565469]  dev_set_mac_address_user+0x44/0x68
[   26.565477]  dev_ifsioc+0x30c/0x568
[   26.565483]  dev_ioctl+0x124/0x3f0
[   26.565489]  sock_do_ioctl+0xb4/0xf8
[   26.565497]  sock_ioctl+0x2f4/0x398
[   26.565503]  __arm64_sys_ioctl+0xa8/0xe8
[   26.565511]  invoke_syscall+0x44/0x108
[   26.565520]  el0_svc_common.constprop.3+0x94/0xf8
[   26.565527]  do_el0_svc+0x24/0x88
[   26.565534]  el0_svc+0x20/0x50
[   26.565541]  el0t_64_sync_handler+0x90/0xb8
[   26.565548]  el0t_64_sync+0x180/0x184
[   26.565556] SMP: stopping secondary CPUs
[   26.565622] Kernel Offset: disabled
[   26.565624] CPU features: 0x0,00004002,00000846
[   26.565628] Memory Limit: none
[   27.843428] ---[ end Kernel panic - not syncing: kernel stack overflow ]---

Fixes: 4dfb998264 ("tsn:  Fix build.")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 12:11:00 +00:00
David S. Miller
52911bb62e Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-11-22

Maciej Fijalkowski says:

Here are the two fixes for issues around ethtool's set_channels()
callback for ice driver. Both are related to XDP resources. First one
corrects the size of vsi->txq_map that is used to track the usage of Tx
resources and the second one prevents the wrong refcounting of bpf_prog.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 12:10:07 +00:00
Amit Cohen
63b08b1f68 mlxsw: spectrum: Protect driver from buggy firmware
When processing port up/down events generated by the device's firmware,
the driver protects itself from events reported for non-existent local
ports, but not the CPU port (local port 0), which exists, but lacks a
netdev.

This can result in a NULL pointer dereference when calling
netif_carrier_{on,off}().

Fix this by bailing early when processing an event reported for the CPU
port. Problem was only observed when running on top of a buggy emulator.

Fixes: 28b1987ef5 ("mlxsw: spectrum: Register CPU port with devlink")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 11:46:18 +00:00
Danielle Ratson
ce4995bc6c mlxsw: spectrum: Allow driver to load with old firmware versions
The driver fails to load with old firmware versions that cannot report
the maximum number of RIF MAC profiles [1].

Fix this by defaulting to a maximum of a single profile in such
situations, as multiple profiles are not supported by old firmware
versions.

[1]
mlxsw_spectrum 0000:03:00.0: cannot register bus device
mlxsw_spectrum: probe of 0000:03:00.0 failed with error -5

Fixes: 1c375ffb2e ("mlxsw: spectrum_router: Expose RIF MAC profiles to devlink resource")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reported-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 11:46:18 +00:00
Danielle Ratson
c1020d3cf4 mlxsw: pci: Add shutdown method in PCI driver
On an arm64 platform with the Spectrum ASIC, after loading and executing
a new kernel via kexec, the following trace [1] is observed. This seems
to be caused by the fact that the device is not properly shutdown before
executing the new kernel.

Fix this by implementing a shutdown method which mirrors the remove
method, as recommended by the kexec maintainer [2][3].

[1]
BUG: Bad page state in process devlink pfn:22f73d
page:fffffe00089dcf40 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x2ffff00000000000()
raw: 2ffff00000000000 0000000000000000 ffffffff089d0201 0000000000000000
raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
page dumped because: nonzero _refcount
Modules linked in:
CPU: 1 PID: 16346 Comm: devlink Tainted: G B 5.8.0-rc6-custom-273020-gac6b365b1bf5 #44
Hardware name: Marvell Armada 7040 TX4810M (DT)
Call trace:
 dump_backtrace+0x0/0x1d0
 show_stack+0x1c/0x28
 dump_stack+0xbc/0x118
 bad_page+0xcc/0xf8
 check_free_page_bad+0x80/0x88
 __free_pages_ok+0x3f8/0x418
 __free_pages+0x38/0x60
 kmem_freepages+0x200/0x2a8
 slab_destroy+0x28/0x68
 slabs_destroy+0x60/0x90
 ___cache_free+0x1b4/0x358
 kfree+0xc0/0x1d0
 skb_free_head+0x2c/0x38
 skb_release_data+0x110/0x1a0
 skb_release_all+0x2c/0x38
 consume_skb+0x38/0x130
 __dev_kfree_skb_any+0x44/0x50
 mlxsw_pci_rdq_fini+0x8c/0xb0
 mlxsw_pci_queue_fini.isra.0+0x28/0x58
 mlxsw_pci_queue_group_fini+0x58/0x88
 mlxsw_pci_aqs_fini+0x2c/0x60
 mlxsw_pci_fini+0x34/0x50
 mlxsw_core_bus_device_unregister+0x104/0x1d0
 mlxsw_devlink_core_bus_device_reload_down+0x2c/0x48
 devlink_reload+0x44/0x158
 devlink_nl_cmd_reload+0x270/0x290
 genl_rcv_msg+0x188/0x2f0
 netlink_rcv_skb+0x5c/0x118
 genl_rcv+0x3c/0x50
 netlink_unicast+0x1bc/0x278
 netlink_sendmsg+0x194/0x390
 __sys_sendto+0xe0/0x158
 __arm64_sys_sendto+0x2c/0x38
 el0_svc_common.constprop.0+0x70/0x168
 do_el0_svc+0x28/0x88
 el0_sync_handler+0x88/0x190
 el0_sync+0x140/0x180

[2]
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1195432.html

[3]
https://patchwork.kernel.org/project/linux-scsi/patch/20170212214920.28866-1-anton@ozlabs.org/#20116693

Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 11:44:31 +00:00
Danielle Ratson
ed1607e2dd mlxsw: spectrum_router: Remove deadcode in mlxsw_sp_rif_mac_profile_find
The function idr_for_each_entry() already checks that the next entry in
the IDR is not NULL.

Therefore, checking that again in every iteration leads to deadcode.

Remove the unnecessary check in order to avoid that.

Addresses-Coverity: ("Logically dead code")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23 11:44:31 +00:00
Shiraz Saleem
e523af4ee5 net/ice: Add support for enable_iwarp and enable_roce devlink param
Allow support for 'enable_iwarp' and 'enable_roce' devlink params to turn
on/off iWARP or RoCE protocol support for E800 devices.

For example, a user can turn on iWARP functionality with,

devlink dev param set pci/0000:07:00.0 name enable_iwarp value true cmode runtime

This add an iWARP auxiliary rdma device, ice.iwarp.<>, under this PF.

A user request to enable both iWARP and RoCE under the same PF is rejected
since this device does not support both protocols simultaneously on the
same port.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-22 08:41:56 -08:00
Marta Plantykow
f65ee535df ice: avoid bpf_prog refcount underflow
Ice driver has the routines for managing XDP resources that are shared
between ndo_bpf op and VSI rebuild flow. The latter takes place for
example when user changes queue count on an interface via ethtool's
set_channels().

There is an issue around the bpf_prog refcounting when VSI is being
rebuilt - since ice_prepare_xdp_rings() is called with vsi->xdp_prog as
an argument that is used later on by ice_vsi_assign_bpf_prog(), same
bpf_prog pointers are swapped with each other. Then it is also
interpreted as an 'old_prog' which in turn causes us to call
bpf_prog_put on it that will decrement its refcount.

Below splat can be interpreted in a way that due to zero refcount of a
bpf_prog it is wiped out from the system while kernel still tries to
refer to it:

[  481.069429] BUG: unable to handle page fault for address: ffffc9000640f038
[  481.077390] #PF: supervisor read access in kernel mode
[  481.083335] #PF: error_code(0x0000) - not-present page
[  481.089276] PGD 100000067 P4D 100000067 PUD 1001cb067 PMD 106d2b067 PTE 0
[  481.097141] Oops: 0000 [#1] PREEMPT SMP PTI
[  481.101980] CPU: 12 PID: 3339 Comm: sudo Tainted: G           OE     5.15.0-rc5+ #1
[  481.110840] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
[  481.122021] RIP: 0010:dev_xdp_prog_id+0x25/0x40
[  481.127265] Code: 80 00 00 00 00 0f 1f 44 00 00 89 f6 48 c1 e6 04 48 01 fe 48 8b 86 98 08 00 00 48 85 c0 74 13 48 8b 50 18 31 c0 48 85 d2 74 07 <48> 8b 42 38 8b 40 20 c3 48 8b 96 90 08 00 00 eb e8 66 2e 0f 1f 84
[  481.148991] RSP: 0018:ffffc90007b63868 EFLAGS: 00010286
[  481.155034] RAX: 0000000000000000 RBX: ffff889080824000 RCX: 0000000000000000
[  481.163278] RDX: ffffc9000640f000 RSI: ffff889080824010 RDI: ffff889080824000
[  481.171527] RBP: ffff888107af7d00 R08: 0000000000000000 R09: ffff88810db5f6e0
[  481.179776] R10: 0000000000000000 R11: ffff8890885b9988 R12: ffff88810db5f4bc
[  481.188026] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  481.196276] FS:  00007f5466d5bec0(0000) GS:ffff88903fb00000(0000) knlGS:0000000000000000
[  481.205633] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  481.212279] CR2: ffffc9000640f038 CR3: 000000014429c006 CR4: 00000000003706e0
[  481.220530] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  481.228771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  481.237029] Call Trace:
[  481.239856]  rtnl_fill_ifinfo+0x768/0x12e0
[  481.244602]  rtnl_dump_ifinfo+0x525/0x650
[  481.249246]  ? __alloc_skb+0xa5/0x280
[  481.253484]  netlink_dump+0x168/0x3c0
[  481.257725]  netlink_recvmsg+0x21e/0x3e0
[  481.262263]  ____sys_recvmsg+0x87/0x170
[  481.266707]  ? __might_fault+0x20/0x30
[  481.271046]  ? _copy_from_user+0x66/0xa0
[  481.275591]  ? iovec_from_user+0xf6/0x1c0
[  481.280226]  ___sys_recvmsg+0x82/0x100
[  481.284566]  ? sock_sendmsg+0x5e/0x60
[  481.288791]  ? __sys_sendto+0xee/0x150
[  481.293129]  __sys_recvmsg+0x56/0xa0
[  481.297267]  do_syscall_64+0x3b/0xc0
[  481.301395]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  481.307238] RIP: 0033:0x7f5466f39617
[  481.311373] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb bd 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[  481.342944] RSP: 002b:00007ffedc7f4308 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
[  481.361783] RAX: ffffffffffffffda RBX: 00007ffedc7f5460 RCX: 00007f5466f39617
[  481.380278] RDX: 0000000000000000 RSI: 00007ffedc7f5360 RDI: 0000000000000003
[  481.398500] RBP: 00007ffedc7f53f0 R08: 0000000000000000 R09: 000055d556f04d50
[  481.416463] R10: 0000000000000077 R11: 0000000000000246 R12: 00007ffedc7f5360
[  481.434131] R13: 00007ffedc7f5350 R14: 00007ffedc7f5344 R15: 0000000000000e98
[  481.451520] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mxm_wmi mei_me coretemp mei ipmi_si ipmi_msghandler wmi acpi_pad acpi_power_meter ip_tables x_tables autofs4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel ahci crypto_simd cryptd libahci lpc_ich [last unloaded: ice]
[  481.528558] CR2: ffffc9000640f038
[  481.542041] ---[ end trace d1f24c9ecf5b61c1 ]---

Fix this by only calling ice_vsi_assign_bpf_prog() inside
ice_prepare_xdp_rings() when current vsi->xdp_prog pointer is NULL.
This way set_channels() flow will not attempt to swap the vsi->xdp_prog
pointers with itself.

Also, sprinkle around some comments that provide a reasoning about
correlation between driver and kernel in terms of bpf_prog refcount.

Fixes: efc2214b60 ("ice: Add support for XDP")
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-22 08:35:36 -08:00
Maciej Fijalkowski
792b208658 ice: fix vsi->txq_map sizing
The approach of having XDP queue per CPU regardless of user's setting
exposed a hidden bug that could occur in case when Rx queue count differ
from Tx queue count. Currently vsi->txq_map's size is equal to the
doubled vsi->alloc_txq, which is not correct due to the fact that XDP
rings were previously based on the Rx queue count. Below splat can be
seen when ethtool -L is used and XDP rings are configured:

[  682.875339] BUG: kernel NULL pointer dereference, address: 000000000000000f
[  682.883403] #PF: supervisor read access in kernel mode
[  682.889345] #PF: error_code(0x0000) - not-present page
[  682.895289] PGD 0 P4D 0
[  682.898218] Oops: 0000 [#1] PREEMPT SMP PTI
[  682.903055] CPU: 42 PID: 2878 Comm: ethtool Tainted: G           OE     5.15.0-rc5+ #1
[  682.912214] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
[  682.923380] RIP: 0010:devres_remove+0x44/0x130
[  682.928527] Code: 49 89 f4 55 48 89 fd 4c 89 ff 53 48 83 ec 10 e8 92 b9 49 00 48 8b 9d a8 02 00 00 48 8d 8d a0 02 00 00 49 89 c2 48 39 cb 74 0f <4c> 3b 63 10 74 25 48 8b 5b 08 48 39 cb 75 f1 4c 89 ff 4c 89 d6 e8
[  682.950237] RSP: 0018:ffffc90006a679f0 EFLAGS: 00010002
[  682.956285] RAX: 0000000000000286 RBX: ffffffffffffffff RCX: ffff88908343a370
[  682.964538] RDX: 0000000000000001 RSI: ffffffff81690d60 RDI: 0000000000000000
[  682.972789] RBP: ffff88908343a0d0 R08: 0000000000000000 R09: 0000000000000000
[  682.981040] R10: 0000000000000286 R11: 3fffffffffffffff R12: ffffffff81690d60
[  682.989282] R13: ffffffff81690a00 R14: ffff8890819807a8 R15: ffff88908343a36c
[  682.997535] FS:  00007f08c7bfa740(0000) GS:ffff88a03fd00000(0000) knlGS:0000000000000000
[  683.006910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  683.013557] CR2: 000000000000000f CR3: 0000001080a66003 CR4: 00000000003706e0
[  683.021819] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  683.030075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  683.038336] Call Trace:
[  683.041167]  devm_kfree+0x33/0x50
[  683.045004]  ice_vsi_free_arrays+0x5e/0xc0 [ice]
[  683.050380]  ice_vsi_rebuild+0x4c8/0x750 [ice]
[  683.055543]  ice_vsi_recfg_qs+0x9a/0x110 [ice]
[  683.060697]  ice_set_channels+0x14f/0x290 [ice]
[  683.065962]  ethnl_set_channels+0x333/0x3f0
[  683.070807]  genl_family_rcv_msg_doit+0xea/0x150
[  683.076152]  genl_rcv_msg+0xde/0x1d0
[  683.080289]  ? channels_prepare_data+0x60/0x60
[  683.085432]  ? genl_get_cmd+0xd0/0xd0
[  683.089667]  netlink_rcv_skb+0x50/0xf0
[  683.094006]  genl_rcv+0x24/0x40
[  683.097638]  netlink_unicast+0x239/0x340
[  683.102177]  netlink_sendmsg+0x22e/0x470
[  683.106717]  sock_sendmsg+0x5e/0x60
[  683.110756]  __sys_sendto+0xee/0x150
[  683.114894]  ? handle_mm_fault+0xd0/0x2a0
[  683.119535]  ? do_user_addr_fault+0x1f3/0x690
[  683.134173]  __x64_sys_sendto+0x25/0x30
[  683.148231]  do_syscall_64+0x3b/0xc0
[  683.161992]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fix this by taking into account the value that num_possible_cpus()
yields in addition to vsi->alloc_txq instead of doubling the latter.

Fixes: efc2214b60 ("ice: Add support for XDP")
Fixes: 22bf877e52 ("ice: introduce XDP_TX fallback path")
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-22 08:33:54 -08:00
Arnd Bergmann
a68229ca63 nixge: fix mac address error handling again
The change to eth_hw_addr_set() caused gcc to correctly spot a
bug that was introduced in an earlier incorrect fix:

In file included from include/linux/etherdevice.h:21,
                 from drivers/net/ethernet/ni/nixge.c:7:
In function '__dev_addr_set',
    inlined from 'eth_hw_addr_set' at include/linux/etherdevice.h:319:2,
    inlined from 'nixge_probe' at drivers/net/ethernet/ni/nixge.c:1286:3:
include/linux/netdevice.h:4648:9: error: 'memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread]
 4648 |         memcpy(dev->dev_addr, addr, len);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As nixge_get_nvmem_address() can return either NULL or an error
pointer, the NULL check is wrong, and we can end up reading from
ERR_PTR(-EOPNOTSUPP), which gcc knows to contain zero readable
bytes.

Make the function always return an error pointer again but fix
the check to match that.

Fixes: f3956ebb3b ("ethernet: use eth_hw_addr_set() instead of ether_addr_copy()")
Fixes: abcd3d6fc6 ("net: nixge: Fix error path for obtaining mac address")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 15:05:48 +00:00
Nicolas Iooss
f93fd0ca5e net: ax88796c: do not receive data in pointer
Function axspi_read_status calls:

    ret = spi_write_then_read(ax_spi->spi, ax_spi->cmd_buf, 1,
                              (u8 *)&status, 3);

status is a pointer to a struct spi_status, which is 3-byte wide:

    struct spi_status {
        u16 isr;
        u8 status;
    };

But &status is the pointer to this pointer, and spi_write_then_read does
not dereference this parameter:

    int spi_write_then_read(struct spi_device *spi,
                            const void *txbuf, unsigned n_tx,
                            void *rxbuf, unsigned n_rx)

Therefore axspi_read_status currently receive a SPI response in the
pointer status, which overwrites 24 bits of the pointer.

Thankfully, on Little-Endian systems, the pointer is only used in

    le16_to_cpus(&status->isr);

... which is a no-operation. So there, the overwritten pointer is not
dereferenced. Nevertheless on Big-Endian systems, this can lead to
dereferencing pointers after their 24 most significant bits were
overwritten. And in all systems this leads to possible use of
uninitialized value in functions calling spi_write_then_read which
expect status to be initialized when the function returns.

Moreover function axspi_read_status (and macro AX_READ_STATUS) do not
seem to be used anywhere. So currently this seems to be dead code. Fix
the issue anyway so that future code works properly when using function
axspi_read_status.

Fixes: a97c69ba4f ("net: ax88796c: ASIX AX88796C SPI Ethernet Adapter Driver")

Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Acked-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 14:32:05 +00:00
Christophe JAILLET
5e6c7ccd3e qed: Use the bitmap API to simplify some functions
'cid_map' is a bitmap. So use 'bitmap_zalloc()' to simplify code,
improve the semantic and avoid some open-coded arithmetic in allocator
arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Also change some 'memset()' into 'bitmap_zero()' to keep consistency. This
is also much less verbose.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 14:30:57 +00:00
Holger Assmann
a6da2bbb00 net: stmmac: retain PTP clock time during SIOCSHWTSTAMP ioctls
Currently, when user space emits SIOCSHWTSTAMP ioctl calls such as
enabling/disabling timestamping or changing filter settings, the driver
reads the current CLOCK_REALTIME value and programming this into the
NIC's hardware clock. This might be necessary during system
initialization, but at runtime, when the PTP clock has already been
synchronized to a grandmaster, a reset of the timestamp settings might
result in a clock jump. Furthermore, if the clock is also controlled by
phc2sys in automatic mode (where the UTC offset is queried from ptp4l),
that UTC-to-TAI offset (currently 37 seconds in 2021) would be
temporarily reset to 0, and it would take a long time for phc2sys to
readjust so that CLOCK_REALTIME and the PHC are apart by 37 seconds
again.

To address the issue, we introduce a new function called
stmmac_init_tstamp_counter(), which gets called during ndo_open().
It contains the code snippet moved from stmmac_hwtstamp_set() that
manages the time synchronization. Besides, the sub second increment
configuration is also moved here since the related values are hardware
dependent and runtime invariant.

Furthermore, the hardware clock must be kept running even when no time
stamping mode is selected in order to retain the synchronized time base.
That way, timestamping can be enabled again at any time only with the
need to compensate the clock's natural drifting.

As a side effect, this patch fixes the issue that ptp_clock_info::enable
can be called before SIOCSHWTSTAMP and the driver (which looks at
priv->systime_flags) was not prepared to handle that ordering.

Fixes: 92ba688851 ("stmmac: add the support for PTP hw clock driver")
Reported-by: Michael Olbrich <m.olbrich@pengutronix.de>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Holger Assmann <h.assmann@pengutronix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 14:29:26 +00:00
Yacov Simhony
ac9f66ff04 Fix coverity issue 'Uninitialized scalar variable"
There are three boolean variable which were not initialized and later
being used in the code.

Signed-off-by: Yacov Simhony <ysimhony@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 14:27:00 +00:00
David S. Miller
4dfb998264 tsn: Fix build.
Due to const dev_addr changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 13:56:22 +00:00
Jakub Kicinski
a9c2cf9e93 octeon: constify netdev->dev_addr
Argument of a helper is missing a const.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: adeef3e321 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 13:21:41 +00:00
Haiyang Zhang
ed5356b53f net: mana: Add XDP support
Add support of XDP for the MANA driver.

Supported XDP actions:
	XDP_PASS, XDP_TX, XDP_DROP, XDP_ABORTED

XDP actions not yet supported:
	XDP_REDIRECT

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 13:20:19 +00:00
Gerhard Engleder
403f69bbdb tsnep: Add TSN endpoint Ethernet MAC driver
The TSN endpoint Ethernet MAC is a FPGA based network device for
real-time communication.

It is integrated as Ethernet controller with ethtool and PTP support.
For real-time communcation TC_SETUP_QDISC_TAPRIO is supported.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 13:19:04 +00:00
Eric Dumazet
6d872df3e3 net: annotate accesses to dev->gso_max_segs
dev->gso_max_segs is written under RTNL protection, or when the device is
not yet visible, but is read locklessly.

Add netif_set_gso_max_segs() helper.

Add the READ_ONCE()/WRITE_ONCE() pairs, and use netif_set_gso_max_segs()
where we can to better document what is going on.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:49:42 +00:00
Eric Dumazet
4b66d2161b net: annotate accesses to dev->gso_max_size
dev->gso_max_size is written under RTNL protection, or when the device is
not yet visible, but is read locklessly.

Add the READ_ONCE()/WRITE_ONCE() pairs, and use netif_set_gso_max_size()
where we can to better document what is going on.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:49:42 +00:00
Diana Wang
3bd6b2a838 nfp: checking parameter process for rx-usecs/tx-usecs is invalid
Use nn->tlv_caps.me_freq_mhz instead of nn->me_freq_mhz to check whether
rx-usecs/tx-usecs is valid.

This is because nn->tlv_caps.me_freq_mhz represents the clock_freq (MHz) of
the flow processing cores (FPC) on the NIC. While nn->me_freq_mhz is not
be set.

Fixes: ce991ab666 ("nfp: read ME frequency from vNIC ctrl memory")
Signed-off-by: Diana Wang <na.wang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:44:45 +00:00
Hao Chen
e175eb5fb0 net: hns3: remove the way to set tx spare buf via module parameter
The way to set tx spare buf via module parameter is not such
convenient as the way to set it via ethtool.

So,remove the way to set tx spare buf via module parameter.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:31:49 +00:00
Hao Chen
e65a0231d2 net: hns3: add support to set/get rx buf len via ethtool for hns3 driver
Rx buf len is for rx BD buffer size, support setting it via ethtool -G
parameter and getting it via ethtool -g parameter.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:31:49 +00:00
Hao Chen
7462494408 ethtool: extend ringparam setting/getting API with rx_buf_len
Add two new parameters kernel_ringparam and extack for
.get_ringparam and .set_ringparam to extend more ring params
through netlink.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:31:49 +00:00
Hao Chen
e445f08af2 net: hns3: add support to set/get tx copybreak buf size via ethtool for hns3 driver
Tx copybreak buf size is used for tx copybreak feature, the feature is
used for small size packet or frag. It adds a queue based tx shared
bounce buffer to memcpy the small packet when the len of xmitted skb is
below tx_copybreak(value to distinguish small size and normal size),
and reduce the overhead of dma map and unmap when IOMMU is on.

Support setting it via ethtool --set-tunable parameter and getting
it via ethtool --get-tunable parameter.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:31:47 +00:00
Jakub Kicinski
c9646a1803 bnx2x: constify static inline stub for dev_addr
bnx2x_vfpf_config_mac() was constified by not its stub.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-20 12:25:56 +00:00
Jakub Kicinski
0f98d7e478 82596: use eth_hw_addr_set()
Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-20 12:25:56 +00:00
Yang Li
d9f31aeaa1 ethernet: renesas: Use div64_ul instead of do_div
do_div() does a 64-by-32 division. Here the divisor is an
unsigned long which on some platforms is 64 bit wide. So use
div64_ul instead of do_div to avoid a possible truncation.

Eliminate the following coccicheck warning:
./drivers/net/ethernet/renesas/ravb_main.c:2492:1-7: WARNING:
do_div() does a 64-by-32 division, please consider using div64_ul
instead.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/1637228883-100100-1-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-19 20:13:02 -08:00
Brett Creeley
5951a2b981 iavf: Fix VLAN feature flags after VFR
When a VF goes through a reset, it's possible for the VF's feature set
to change. For example it may lose the VIRTCHNL_VF_OFFLOAD_VLAN
capability after VF reset. Unfortunately, the driver doesn't correctly
deal with this situation and errors are seen from downing/upping the
interface and/or moving the interface in/out of a network namespace.

When setting the interface down/up we see the following errors after the
VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from the VF:

ice 0000:51:00.1: VF 1 failed opcode 12, retval: -64 iavf 0000:51:09.1:
Failed to add VLAN filter, error IAVF_NOT_SUPPORTED ice 0000:51:00.1: VF
1 failed opcode 13, retval: -64 iavf 0000:51:09.1: Failed to delete VLAN
filter, error IAVF_NOT_SUPPORTED

These add/delete errors are happening because the VLAN filters are
tracked internally to the driver and regardless of the VLAN_ALLOWED()
setting the driver tries to delete/re-add them over virtchnl.

Fix the delete failure by making sure to delete any VLAN filter tracking
in the driver when a removal request is made, while preventing the
virtchnl request.  This makes it so the driver's VLAN list is up to date
and the errors are

Fix the add failure by making sure the check for VLAN_ALLOWED() during
reset is done after the VF receives its capability list from the PF via
VIRTCHNL_OP_GET_VF_RESOURCES. If VLAN functionality is not allowed, then
prevent requesting re-adding the filters over virtchnl.

When moving the interface into a network namespace we see the following
errors after the VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from
the VF:

iavf 0000:51:09.1 enp81s0f1v1: NIC Link is Up Speed is 25 Gbps Full Duplex
iavf 0000:51:09.1 temp_27: renamed from enp81s0f1v1
iavf 0000:51:09.1 mgmt: renamed from temp_27
iavf 0000:51:09.1 dev27: set_features() failed (-22); wanted 0x020190001fd54833, left 0x020190001fd54bb3

These errors are happening because we aren't correctly updating the
netdev capabilities and dealing with ndo_fix_features() and
ndo_set_features() correctly.

Fix this by only reporting errors in the driver's ndo_set_features()
callback when VIRTCHNL_VF_OFFLOAD_VLAN is not allowed and any attempt to
enable the VLAN features is made. Also, make sure to disable VLAN
insertion, filtering, and stripping since the VIRTCHNL_VF_OFFLOAD_VLAN
flag applies to all of them and not just VLAN stripping.

Also, after we process the capabilities in the VF reset path, make sure
to call netdev_update_features() in case the capabilities have changed
in order to update the netdev's feature set to match the VF's actual
capabilities.

Lastly, make sure to always report success on VLAN filter delete when
VIRTCHNL_VF_OFFLOAD_VLAN is not supported. The changed flow in
iavf_del_vlans() allows the stack to delete previosly existing VLAN
filters even if VLAN filtering is not allowed. This makes it so the VLAN
filter list is up to date.

Fixes: 8774370d26 ("i40e/i40evf: support for VF VLAN tag stripping control")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-19 09:26:41 -08:00
Jedrzej Jagielski
3b5bdd18eb iavf: Fix refreshing iavf adapter stats on ethtool request
Currently iavf adapter statistics are refreshed only in a
watchdog task, triggered approximately every two seconds,
which causes some ethtool requests to return outdated values.

Add explicit statistics refresh when requested by ethtool -S.

Fixes: b476b0030e ("iavf: Move commands processing to the separate function")
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-19 09:26:41 -08:00
Jedrzej Jagielski
0cc318d2e8 iavf: Fix deadlock occurrence during resetting VF interface
System hangs if close the interface is called from the kernel during
the interface is in resetting state.
During resetting operation the link is closing but kernel didn't
know it and it tried to close this interface again what sometimes
led to deadlock.
Inform kernel about current state of interface
and turn off the flag IFF_UP when interface is closing until reset
is finished.
Previously it was most likely to hang the system when kernel
(network manager) tried to close the interface in the same time
when interface was in resetting state because of deadlock.

Fixes: 3c8e0b989a ("i40vf: don't stop me now")
Signed-off-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-19 09:26:41 -08:00
Nitesh B Venkatesh
e792779e6b iavf: Prevent changing static ITR values if adaptive moderation is on
Resolve being able to change static values on VF when adaptive interrupt
moderation is enabled.

This problem is fixed by checking the interrupt settings is not
a combination of change of static value while adaptive interrupt
moderation is turned on.

Without this fix, the user would be able to change static values
on VF with adaptive moderation enabled.

Fixes: 65e87c0398 ("i40evf: support queue-specific settings for interrupt moderation")
Signed-off-by: Nitesh B Venkatesh <nitesh.b.venkatesh@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-19 08:23:20 -08:00
Yu Xiao
eaa54d6614 nfp: flower: correction of error handling
Removing reduplicated error handling when running into error path
of `nfp_compile_flow_metadata`.

Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 14:15:25 +00:00
Zekun Shen
0f296e782f stmmac_pci: Fix underflow size in stmmac_rx
This bug report came up when we were testing the device driver
by fuzzing. It shows that buf1_len can get underflowed and be
0xfffffffc (4294967292).

This bug is triggerable with a compromised/malfunctioning device.
We found the bug through QEMU emulation tested the patch with
emulation. We did NOT test it on real hardware.

Attached is the bug report by fuzzing.

BUG: KASAN: use-after-free in stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac]
Read of size 4294967292 at addr ffff888016358000 by task ksoftirqd/0/9

CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G        W         5.6.0 #1
Call Trace:
 dump_stack+0x76/0xa0
 print_address_description.constprop.0+0x16/0x200
 ? stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac]
 ? stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac]
 __kasan_report.cold+0x37/0x7c
 ? stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac]
 kasan_report+0xe/0x20
 check_memory_region+0x15a/0x1d0
 memcpy+0x20/0x50
 stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac]
 ? stmmac_suspend+0x850/0x850 [stmmac]
 ? __next_timer_interrupt+0xba/0xf0
 net_rx_action+0x363/0xbd0
 ? call_timer_fn+0x240/0x240
 ? __switch_to_asm+0x40/0x70
 ? napi_busy_loop+0x520/0x520
 ? __schedule+0x839/0x15a0
 __do_softirq+0x18c/0x634
 ? takeover_tasklets+0x5f0/0x5f0
 run_ksoftirqd+0x15/0x20
 smpboot_thread_fn+0x2f1/0x6b0
 ? smpboot_unregister_percpu_thread+0x160/0x160
 ? __kthread_parkme+0x80/0x100
 ? smpboot_unregister_percpu_thread+0x160/0x160
 kthread+0x2b5/0x3b0
 ? kthread_create_on_node+0xd0/0xd0
 ret_from_fork+0x22/0x40

Reported-by: Brendan Dolan-Gavitt <brendandg@nyu.edu>
Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:54:34 +00:00
Zekun Shen
6a405f6c37 atlantic: fix double-free in aq_ring_tx_clean
We found this bug while fuzzing the device driver. Using and freeing
the dangling pointer buff->skb would cause use-after-free and
double-free.

This bug is triggerable with compromised/malfunctioning devices. We
found the bug with QEMU emulation and tested the patch by emulation.
We did NOT test on a real device.

Attached is the bug report.

BUG: KASAN: double-free or invalid-free in consume_skb+0x6c/0x1c0

Call Trace:
 dump_stack+0x76/0xa0
 print_address_description.constprop.0+0x16/0x200
 ? consume_skb+0x6c/0x1c0
 kasan_report_invalid_free+0x61/0xa0
 ? consume_skb+0x6c/0x1c0
 __kasan_slab_free+0x15e/0x170
 ? consume_skb+0x6c/0x1c0
 kfree+0x8c/0x230
 consume_skb+0x6c/0x1c0
 aq_ring_tx_clean+0x5c2/0xa80 [atlantic]
 aq_vec_poll+0x309/0x5d0 [atlantic]
 ? _sub_I_65535_1+0x20/0x20 [atlantic]
 ? __next_timer_interrupt+0xba/0xf0
 net_rx_action+0x363/0xbd0
 ? call_timer_fn+0x240/0x240
 ? __switch_to_asm+0x34/0x70
 ? napi_busy_loop+0x520/0x520
 ? net_tx_action+0x379/0x720
 __do_softirq+0x18c/0x634
 ? takeover_tasklets+0x5f0/0x5f0
 run_ksoftirqd+0x15/0x20
 smpboot_thread_fn+0x2f1/0x6b0
 ? smpboot_unregister_percpu_thread+0x160/0x160
 ? __kthread_parkme+0x80/0x100
 ? smpboot_unregister_percpu_thread+0x160/0x160
 kthread+0x2b5/0x3b0
 ? kthread_create_on_node+0xd0/0xd0
 ret_from_fork+0x22/0x40

Reported-by: Brendan Dolan-Gavitt <brendandg@nyu.edu>
Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:53:51 +00:00
Heiner Kallweit
92e888bc6f sky2: use PCI VPD API in eeprom ethtool ops
Recently pci_read/write_vpd_any() have been added to the PCI VPD API.
These functions allow to access VPD address space outside the
auto-detected VPD, and they can be used to significantly simplify the
eeprom ethtool ops.

Tested with a 88E8070 card with 1KB EEPROM.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:21:46 +00:00
Volodymyr Mytnyk
e8d032507c net: marvell: prestera: fix double free issue on err path
fix error path handling in prestera_bridge_port_join() that
cases prestera driver to crash (see below).

 Trace:
   Internal error: Oops: 96000044 [#1] SMP
   Modules linked in: prestera_pci prestera uio_pdrv_genirq
   CPU: 1 PID: 881 Comm: ip Not tainted 5.15.0 #1
   pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
   pc : prestera_bridge_destroy+0x2c/0xb0 [prestera]
   lr : prestera_bridge_port_join+0x2cc/0x350 [prestera]
   sp : ffff800011a1b0f0
   ...
   x2 : ffff000109ca6c80 x1 : dead000000000100 x0 : dead000000000122
    Call trace:
   prestera_bridge_destroy+0x2c/0xb0 [prestera]
   prestera_bridge_port_join+0x2cc/0x350 [prestera]
   prestera_netdev_port_event.constprop.0+0x3c4/0x450 [prestera]
   prestera_netdev_event_handler+0xf4/0x110 [prestera]
   raw_notifier_call_chain+0x54/0x80
   call_netdevice_notifiers_info+0x54/0xa0
   __netdev_upper_dev_link+0x19c/0x380

Fixes: e1189d9a5f ("net: marvell: prestera: Add Switchdev driver implementation")
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:20:55 +00:00
Volodymyr Mytnyk
253e9b4d11 net: marvell: prestera: fix brige port operation
Return NOTIFY_DONE (dont't care) for switchdev notifications
that prestera driver don't know how to handle them.

With introduction of SWITCHDEV_BRPORT_[UN]OFFLOADED switchdev
events, the driver rejects adding swport to bridge operation
which is handled by prestera_bridge_port_join() func. The root
cause of this is that prestera driver returns error (EOPNOTSUPP)
in prestera_switchdev_blk_event() handler for unknown swdev
events. This causes switchdev_bridge_port_offload() to fail
when adding port to bridge in prestera_bridge_port_join().

Fixes: 957e2235e5 ("net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge")
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:20:10 +00:00
Kees Cook
29fd0ec65e bnx2x: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() in struct nig_stats around members egress_mac_pkt0_lo,
egress_mac_pkt0_hi, egress_mac_pkt1_lo, and egress_mac_pkt1_hi (and the
respective members in struct bnx2x_eth_stats), so they can be referenced
together. This will allow memcpy() and sizeof() to more easily reason
about sizes, improve readability, and avoid future warnings about writing
beyond the end of struct bnx2x_eth_stats's rx_stat_ifhcinbadoctets_hi.

"pahole" shows no size nor member offset changes to either struct.
"objdump -d" shows no meaningful object code changes (i.e. only source
line number induced differences and optimizations).

Additionally adds BUILD_BUG_ON() to compare the separate struct group
sizes.

Reviewed-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Link: https://lore.kernel.org/lkml/DM5PR18MB2229B0413C372CC6E49D59A3B2C59@DM5PR18MB2229.namprd18.prod.outlook.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:18:08 +00:00
Kees Cook
641d3ef00c cxgb4: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() in struct fw_eth_tx_pkt_vm_wr around members ethmacdst,
ethmacsrc, ethtype, and vlantci, so they can be referenced together. This
will allow memcpy() and sizeof() to more easily reason about sizes,
improve readability, and avoid future warnings about writing beyond the
end of ethmacdst.

"pahole" shows no size nor member offset changes to struct
fw_eth_tx_pkt_vm_wr. "objdump -d" shows no object code changes.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:17:09 +00:00
Kees Cook
88181f1d34 cxgb3: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() in struct rss_hdr around members imm_data and intr_gen,
so they can be referenced together. This will allow memcpy() and sizeof()
to more easily reason about sizes, improve readability, and avoid future
warnings about writing beyond the end of imm_data.

"pahole" shows no size nor member offset changes to struct rss_hdr.
"objdump -d" shows no object code changes.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:15:41 +00:00
Jakub Kicinski
bb52aff3e3 natsemi: macsonic: use eth_hw_addr_set()
Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:21 +00:00
Jakub Kicinski
9a962aedd3 cirrus: mac89x0: use eth_hw_addr_set()
Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:21 +00:00
Jakub Kicinski
e217fc4aff apple: macmace: use eth_hw_addr_set()
Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:21 +00:00
Jakub Kicinski
5b6d5affd2 lasi_82594: use eth_hw_addr_set()
dev_addr is set from IO reads, passed to an arch-specific helper.
Note that the helper never reads it so uninitialized temp is fine.

Fixes build on parisc.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:21 +00:00
Jakub Kicinski
80db345e7d smc9194: use eth_hw_addr_set()
dev_addr is set from IO reads, and broken from a u16 value.

Fixes build on Alpha.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:21 +00:00
Jakub Kicinski
f95f8e890a 8390: wd: use eth_hw_addr_set()
IO reads, so save to an array then eth_hw_addr_set().

Fixes build on x86 (32bit).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:21 +00:00
Jakub Kicinski
973a34c087 8390: mac8390: use eth_hw_addr_set()
Use temp to pass to the reading function, the function is generic
so can't fix there.

Fixes m68k build.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:21 +00:00
Jakub Kicinski
d7d28e90e2 8390: hydra: use eth_hw_addr_set()
Loop with offsetting to every second byte, so use a temp buffer.

Fixes m68k build.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:21 +00:00
Jakub Kicinski
5114ddf8dd 8390: smc-ultra: use eth_hw_addr_set()
IO reads, so save to an array then eth_hw_addr_set().

Fixes build on Alpha.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:20 +00:00
Jakub Kicinski
cc71b8b937 amd: mvme147: use eth_hw_addr_set()
Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:20 +00:00
Jakub Kicinski
c3dc2f7196 amd: atarilance: use eth_hw_addr_set()
Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:20 +00:00
Jakub Kicinski
21942eef06 amd: hplance: use eth_hw_addr_set()
Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:20 +00:00
Jakub Kicinski
285e4c664d amd: a2065/ariadne: use eth_hw_addr_set()
dev_addr is initialized byte by byte from series.

Fixes build on x86 (32bit).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:20 +00:00
Jakub Kicinski
69ede3097b amd: ni65: use eth_hw_addr_set()
IO reads, so save to an array then eth_hw_addr_set().

Fixes build on x86 (32bit).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:20 +00:00
Jakub Kicinski
0222ee53c4 amd: lance: use eth_hw_addr_set()
IO reads, so save to an array then eth_hw_addr_set().

Fixes build on x86 (32bit).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 11:05:20 +00:00
Jakub Kicinski
54612b4a8b mlxsw: constify address in mlxsw_sp_port_dev_addr_set
Argument comes from netdev->dev_addr directly, it needs a const.

Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 10:46:04 +00:00
Jakub Kicinski
e291422c8f net: ax88796c: don't write to netdev->dev_addr directly
The future is here, convert the new driver as we are about
to make netdev->dev_addr const.

Acked-by: Lukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19 10:46:04 +00:00
Jakub Kicinski
50fc24944a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-18 13:13:16 -08:00
Xiaoliang Yang
a7e13edf37 net: dsa: felix: restrict psfp rules on ingress port
PSFP rules take effect on the streams from any port of VSC9959 switch.
This patch use ingress port to limit the rule only active on this port.

Each stream can only match two ingress source ports in VSC9959. Streams
from lowest port gets the configuration of SFID pointed by MAC Table
lookup and streams from highest port gets the configuration of (SFID+1)
pointed by MAC Table lookup. This patch defines the PSFP rule on highest
port as dummy rule, which means that it does not modify the MAC table.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 12:07:24 +00:00
Xiaoliang Yang
77043c3709 net: mscc: ocelot: use index to set vcap policer
Policer was previously automatically assigned from the highest index to
the lowest index from policer pool. But police action of tc flower now
uses index to set an police entry. This patch uses the police index to
set vcap policers, so that one policer can be shared by multiple rules.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 12:07:24 +00:00
Xiaoliang Yang
23e2c506ad net: mscc: ocelot: add gate and police action offload to PSFP
PSFP support gate and police action. This patch add the gate and police
action to flower parse action, check chain ID to determine which block
to offload. Adding psfp callback functions to add, delete and update gate
and police in PSFP table if hardware supports it.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 12:07:23 +00:00
Xiaoliang Yang
5b1918a54a net: mscc: ocelot: set vcap IS2 chain to goto PSFP chain
Some chips in the ocelot series such as VSC9959 support Per-Stream
Filtering and Policing(PSFP), which is processing after VCAP blocks.
We set this block on chain 30000 and set vcap IS2 chain to goto PSFP
chain if hardware support.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 12:07:23 +00:00
Xiaoliang Yang
0568c3bf3f net: mscc: ocelot: add MAC table stream learn and lookup operations
ocelot_mact_learn_streamdata() can be used in VSC9959 to overwrite an
FDB entry with stream data. The stream data includes SFID and SSID which
can be used for PSFP and FRER set.

ocelot_mact_lookup() can be used to check if the given {DMAC, VID} FDB
entry is exist, and also can retrieve the DEST_IDX and entry type for
the FDB entry.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 12:07:23 +00:00
Teng Qi
0fa68da72c net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()
The definition of macro MOTO_SROM_BUG is:
  #define MOTO_SROM_BUG    (lp->active == 8 && (get_unaligned_le32(
  dev->dev_addr) & 0x00ffffff) == 0x3e0008)

and the if statement
  if (MOTO_SROM_BUG) lp->active = 0;

using this macro indicates lp->active could be 8. If lp->active is 8 and
the second comparison of this macro is false. lp->active will remain 8 in:
  lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1);
  lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1);
  lp->phy[lp->active].mc  = get_unaligned_le16(p); p += 2;
  lp->phy[lp->active].ana = get_unaligned_le16(p); p += 2;
  lp->phy[lp->active].fdx = get_unaligned_le16(p); p += 2;
  lp->phy[lp->active].ttm = get_unaligned_le16(p); p += 2;
  lp->phy[lp->active].mci = *p;

However, the length of array lp->phy is 8, so array overflows can occur.
To fix these possible array overflows, we first check lp->active and then
return -EINVAL if it is greater or equal to ARRAY_SIZE(lp->phy) (i.e. 8).

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Teng Qi <starmiku1207184332@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 12:03:17 +00:00
zhangyue
61217be886 net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound
In line 5001, if all id in the array 'lp->phy[8]' is not 0, when the
'for' end, the 'k' is 8.

At this time, the array 'lp->phy[8]' may be out of bound.

Signed-off-by: zhangyue <zhangyue1@kylinos.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:59:26 +00:00
David S. Miller
718cc29daa Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
10GbE Intel Wired LAN Driver Updates 2021-11-17

Radoslaw Tyl says:

The change is a consequence of errors reported by the ixgbevf driver
while starting several virtual guests at the same time on ESX host.
During this, VF was not able to communicate correctly with the PF,
as a result reported "PF still in reset state. Is the PF interface up?"
and then goes to locked state. The only thing left was to reload
the VF driver on the guest OS.

The background of the problem is that the current PFU and VFU
semaphore locking mechanism between sender and receiver may cause
overriding Mailbox memory (VFMBMEM), in such scenario receiver of
the original message will read the invalid, corrupted or one (or more)
message may be lost.

This change is actually as a support for communication with PF ESX
driver and does not contains changes and support for ixgbe driver.
For maintain backward compatibility, previous communication method
has been preserved in the form of LEGACY functions.

In the future there is a plan to add a support for a 1.5 mailbox API
communication also to ixgbe driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:49:52 +00:00
David S. Miller
4e5d2124f7 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-
queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-11-17

This series contains updates to i40e driver only.

Eryk adds accounting for VLAN header in packet size when VF port VLAN is
configured. He also fixes TC queue distribution when the user has changed
queue counts as well as for configuration of VF ADQ which caused dropped
packets.

Michal adds tracking for when a VSI is being released to prevent null
pointer dereference when managing filters.

Karen ensures PF successfully initiates VF requested reset which could
cause a call trace otherwise.

Jedrzej moves validation of channel queue value earlier to prevent
partial configuration when the value is invalid.

Grzegorz corrects the reported error when adding filter fails.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:48:33 +00:00
Jesse Brandeburg
5d2ca2e12d e100: fix device suspend/resume
As reported in [1], e100 was no longer working for suspend/resume
cycles. The previous commit mentioned in the fixes appears to have
broken things and this attempts to practice best known methods for
device power management and keep wake-up working while allowing
suspend/resume to work. To do this, I reorder a little bit of code
and fix the resume path to make sure the device is enabled.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=214933

Fixes: 69a74aef8a ("e100: use generic power management")
Cc: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Reported-by: Alexey Kuznetsov <axet@me.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Alexey Kuznetsov <axet@me.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:40:31 +00:00
Russell King (Oracle)
6d386f6613 net: dpaa2-mac: use phylink_generic_validate()
DPAA2 has no special behaviour in its validation implementation, so can
be switched to phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:38:44 +00:00
Russell King (Oracle)
22de481d23 net: dpaa2-mac: remove interface checks in dpaa2_mac_validate()
As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode, nor handle
PHY_INTERFACE_MODE_NA in the validation function. Remove these to
simplify the implementation.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:38:44 +00:00
Russell King
15d0b14cec net: dpaa2-mac: populate supported_interfaces member
Populate the phy interface mode bitmap for the Freescale DPAA2 driver
with interfaces modes supported by the MAC.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:38:44 +00:00
Russell King (Oracle)
c8fa4bac30 net: ag71xx: use phylink_generic_validate()
ag71xx apparently only supports MII port type, which makes it different
from other implementations. However, Oleksij says there is no special
reason for this.

Convert the driver to use phylink_generic_validate(), which will allow
all ethtool port linkmodes instead of only MII, giving the driver
consistent behaviour with other drivers.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:36:48 +00:00
Russell King (Oracle)
5e20a8aa48 net: ag71xx: remove interface checks in ag71xx_mac_validate()
As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode, nor handle
PHY_INTERFACE_MODE_NA in the validation function. Remove these to
simplify the implementation.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:36:48 +00:00
Russell King
680e9d2cd4 net: ag71xx: populate supported_interfaces member
Populate the phy_interface_t bitmap for the Atheros ag71xx driver with
interfaces modes supported by the MAC.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:36:48 +00:00
Bhupesh Sharma
6c950ca7c1 net: stmmac: dwmac-qcom-ethqos: add platform level clocks management
Split clocks settings from init callback into clks_config callback,
which could support platform level clock management.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:33:10 +00:00
Teng Qi
a66998e0fb ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port()
The if statement:
  if (port >= DSAF_GE_NUM)
        return;

limits the value of port less than DSAF_GE_NUM (i.e., 8).
However, if the value of port is 6 or 7, an array overflow could occur:
  port_rst_off = dsaf_dev->mac_cb[port]->port_rst_off;

because the length of dsaf_dev->mac_cb is DSAF_MAX_PORT_NUM (i.e., 6).

To fix this possible array overflow, we first check port and if it is
greater than or equal to DSAF_MAX_PORT_NUM, the function returns.

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Teng Qi <starmiku1207184332@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18 11:15:47 +00:00
Dan Carpenter
a280ef90af octeontx2-af: debugfs: don't corrupt user memory
The user supplies the "count" value to say how big its read buffer is.
The rvu_dbg_lmtst_map_table_display() function does not take the "count"
into account but instead just copies the whole table, potentially
corrupting the user's data.

Introduce the "ret" variable to store how many bytes we can copy.  Also
I changed the type of "off" to size_t to make using min() simpler.

Fixes: 0daa55d033 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211117073454.GD5237@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-17 20:25:40 -08:00
Grzegorz Szczurek
5aff430d4e i40e: Fix display error code in dmesg
Fix misleading display error in dmesg if tc filter return fail.
Only i40e status error code should be converted to string, not linux
error code. Otherwise, we return false information about the error.

Fixes: 2f4b411a3d ("i40e: Enable cloud filters via tc-flower")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 16:20:57 -08:00
Jedrzej Jagielski
2e6d218c1e i40e: Fix creation of first queue by omitting it if is not power of two
Reject TCs creation with proper message if the first queue
assignment is not equal to the power of two.
The first queue number was checked too late in the second queue
iteration, if second queue was configured at all. Now if first queue value
is not a power of two, then trying to create qdisc will be rejected.

Fixes: 8f88b3034d ("i40e: Add infrastructure for queue channel support")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 16:20:57 -08:00
Karen Sornek
3a3b311e38 i40e: Fix warning message and call stack during rmmod i40e driver
Restore part of reset functionality used when reset is called
from the VF to reset itself. Without this fix warning message
is displayed when VF is being removed via sysfs.

Fix the crash of the VF during reset by ensuring
that the PF receives the reset message successfully.
Refactor code to use one function instead of two.

Fixes: 5c3c48ac6b ("i40e: implement virtual device interface")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 16:20:50 -08:00
Alexander Lobakin
e92af33e47 stmmac: fix build due to brainos in trans_start changes
txq_trans_cond_update() takes netdev_tx_queue *nq,
not nq->trans_start.

Fixes: 5337824f4d ("net: annotate accesses to queue->trans_start")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211117152917.3739-1-alexandr.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-17 08:16:07 -08:00
Radoslaw Tyl
339f289641 ixgbevf: Add support for new mailbox communication between PF and VF
Provide improved mailbox communication, between PF and VF,
which is defined as API version 1.5.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:07:42 -08:00
Radoslaw Tyl
c869259881 ixgbevf: Mailbox improvements
Improve reliability of the mailbox communication and remove
its potential flaws that may lead to the undefined or faulty behavior.

Recently some users reported issues on ESX with 10G Intel NICs which were
found to be caused by incorrect implementation of the PF-VF mailbox
communication.

Technical investigation highlighted areas to improve in the communication
between PF or VF that wants to send the message (sender) and the other
part which receives the message (receiver):

 - Locking the mailbox when the sender wants to send a message
 - Releasing the mailbox when the communication ends
 - Returning the result of the mailbox message execution

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:07:42 -08:00
Radoslaw Tyl
9c9463c29d ixgbevf: Add legacy suffix to old API mailbox functions
Add legacy suffix to mailbox functions which should be backwards compatible
with older PF drivers. Communication during API negotiation always has to
be done using the earlier implementation.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:07:42 -08:00
Radoslaw Tyl
887a32031a ixgbevf: Improve error handling in mailbox
Add new handling for error codes:
 IXGBE_ERR_CONFIG - ixgbe_mbx_operations is not correctly set
 IXGBE_ERR_TIMEOUT - mailbox operation, e.g. poll for message, timeout

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:07:42 -08:00
Radoslaw Tyl
0edbecd570 ixgbevf: Rename MSGTYPE to SUCCESS and FAILURE
There is name similarity within IXGBE_VT_MSGTYPE_ACK and
PFMAILBOX.ACK / VFMAILBOX.ACK. MSGTYPE macros are renamed to SUCCESS and
FAILURE because they are not specified in datasheet and now will be
easily distinguishable.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:07:42 -08:00
Eryk Rybak
9e0a603cb7 i40e: Fix ping is lost after configuring ADq on VF
Properly reconfigure VF VSIs after VF request ADQ.
Created new function to update queue mapping and queue pairs per TC
with AQ update VSI. This sets proper RSS size on NIC.
VFs num_queue_pairs should not be changed during setup of queue maps.
Previously, VF main VSI in ADQ had configured too many queues and had
wrong RSS size, which lead to packets not being consumed and drops in
connectivity.

Fixes: bc6d33c8d9 ("i40e: Fix the number of queues available to be mapped for use")
Co-developed-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:05:06 -08:00
Eryk Rybak
d2a69fefd7 i40e: Fix changing previously set num_queue_pairs for PFs
Currently, the i40e_vsi_setup_queue_map is basing the count of queues in
TCs on a VSI's alloc_queue_pairs member which is not changed throughout
any user's action (for example via ethtool's set_channels callback).

This implies that vsi->tc_config.tc_info[n].qcount value that is given
to the kernel via netdev_set_tc_queue() that notifies about the count of
queues per particular traffic class is constant even if user has changed
the total count of queues.

This in turn caused the kernel warning after setting the queue count to
the lower value than the initial one:

$ ethtool -l ens801f0
Channel parameters for ens801f0:
Pre-set maximums:
RX:             0
TX:             0
Other:          1
Combined:       64
Current hardware settings:
RX:             0
TX:             0
Other:          1
Combined:       64

$ ethtool -L ens801f0 combined 40

[dmesg]
Number of in use tx queues changed invalidating tc mappings. Priority
traffic classification disabled!

Reason was that vsi->alloc_queue_pairs stayed at 64 value which was used
to set the qcount on TC0 (by default only TC0 exists so all of the
existing queues are assigned to TC0). we update the offset/qcount via
netdev_set_tc_queue() back to the old value but then the
netif_set_real_num_tx_queues() is using the vsi->num_queue_pairs as a
value which got set to 40.

Fix it by using vsi->req_queue_pairs as a queue count that will be
distributed across TCs. Do it only for non-zero values, which implies
that user actually requested the new count of queues.

For VSIs other than main, stay with the vsi->alloc_queue_pairs as we
only allow manipulating the queue count on main VSI.

Fixes: bc6d33c8d9 ("i40e: Fix the number of queues available to be mapped for use")
Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Co-developed-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:05:06 -08:00
Michal Maloszewski
37d9e304ac i40e: Fix NULL ptr dereference on VSI filter sync
Remove the reason of null pointer dereference in sync VSI filters.
Added new I40E_VSI_RELEASING flag to signalize deleting and releasing
of VSI resources to sync this thread with sync filters subtask.
Without this patch it is possible to start update the VSI filter list
after VSI is removed, that's causing a kernel oops.

Fixes: 41c445ff0f ("i40e: main driver core")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Reviewed-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Reviewed-by: Witold Fijalkowski <witoldx.fijalkowski@intel.com>
Reviewed-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:05:05 -08:00
Eryk Rybak
6afbd7b3c5 i40e: Fix correct max_pkt_size on VF RX queue
Setting VLAN port increasing RX queue max_pkt_size
by 4 bytes to take VLAN tag into account.
Trigger the VF reset when setting port VLAN for
VF to renegotiate its capabilities and reinitialize.

Fixes: ba4e003d29 ("i40e: don't hold spinlock while resetting VF")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-17 08:05:05 -08:00
Eric Dumazet
5337824f4d net: annotate accesses to queue->trans_start
In following patches, dev_watchdog() will no longer stop all queues.
It will read queue->trans_start locklessly.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 14:56:16 +00:00
Łukasz Stelmach
c366ce2875 net: ax88796c: use bit numbers insetad of bit masks
Change the values of EVENT_* constants from bit masks to bit numbers as
accepted by {clear,set,test}_bit() functions.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 14:51:40 +00:00
Pavel Skripkin
9b5a333272 net: dpaa2-eth: fix use-after-free in dpaa2_eth_remove
Access to netdev after free_netdev() will cause use-after-free bug.
Move debug log before free_netdev() call to avoid it.

Fixes: 7472dd9f64 ("staging: fsl-dpaa2/eth: Move print message")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 14:47:07 +00:00
Kurt Kanzenbach
65483559dc net: ethernet: ti: cpsw: Enable PHY timestamping
If the used PHYs also support hardware timestamping, all configuration requests
should be forwared to the PHYs instead of being processed by the MAC driver
itself.

This enables PHY timestamping in combination with the cpsw driver.

Tested with an am335x based board with two DP83640 PHYs connected to the cpsw
switch.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 14:34:33 +00:00
Russell King (Oracle)
7258aa5094 net: ocelot_net: use phylink_generic_validate()
ocelot_net has no special behaviour in its validation implementation, so
can be switched to phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:25:45 +00:00
Russell King (Oracle)
a6f5248bc0 net: ocelot_net: remove interface checks in macb_validate()
As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode in the
validation function. Remove this to simplify it.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:25:45 +00:00
Russell King (Oracle)
8ea8c5b492 net: ocelot_net: populate supported_interfaces member
Populate the phy interface mode bitmap for the MSCC Ocelot driver with
the interface modes supported by the MAC.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:25:45 +00:00
Russell King (Oracle)
a4238f6ce1 net: mtk_eth_soc: use phylink_generic_validate()
mtk_eth_soc has no special behaviour in its validation implementation,
so can be switched to phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:23:39 +00:00
Russell King (Oracle)
71d9274944 net: mtk_eth_soc: drop use of phylink_helper_basex_speed()
Now that we have a better method to select SFP interface modes, we
no longer need to use phylink_helper_basex_speed() in a driver's
validation function, and we can also get rid of our hack to indicate
both 1000base-X and 2500base-X if the comphy is present to make that
work. Remove this hack and use of phylink_helper_basex_speed().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:23:39 +00:00
Russell King (Oracle)
db81ca1538 net: mtk_eth_soc: remove interface checks in mtk_validate()
As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode, nor handle
PHY_INTERFACE_MODE_NA in the validation function. Remove these to
simplify the implementation.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:23:39 +00:00
Russell King (Oracle)
83800d29f0 net: mtk_eth_soc: populate supported_interfaces member
Populate the phy interface mode bitmap for the Mediatek driver with
interfaces modes supported by the MAC.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:23:39 +00:00
Russell King (Oracle)
319faa90b7 net: sparx5: use phylink_generic_validate()
Sparx5 has no special behaviour in its validation implementation, so can
be switched to phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:21:42 +00:00
Russell King (Oracle)
9b5cc05fd9 net: sparx5: clean up sparx5_phylink_validate()
sparx5_phylink_validate() no longer needs to check for
PHY_INTERFACE_MODE_NA as phylink will walk the supported interface
types to discover the link mode capabilities. Neither is it necessary
to check the device capabilities as we will not be called for
unsupported interface modes. Remove these checks.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:21:42 +00:00
Russell King (Oracle)
ae089a8191 net: sparx5: populate supported_interfaces member
Populate the phy_interface_t bitmap for the Microchip Sparx5 driver
with interfaces modes supported by the MAC.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:21:42 +00:00
Russell King (Oracle)
75021cf02f net: enetc: use phylink_generic_validate()
enetc has no special behaviour in its validation implementation, so can
be switched to phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:19:28 +00:00
Russell King (Oracle)
5a94c1ba8e net: enetc: remove interface checks in enetc_pl_mac_validate()
As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode in the
validation function. Remove this to simplify it.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:19:28 +00:00
Russell King (Oracle)
4e5015df52 net: enetc: populate supported_interfaces member
Populate the phy_interface_t bitmap for the Freescale enetc driver with
interfaces modes supported by the MAC.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:19:28 +00:00
Russell King (Oracle)
72a47e1aaf net: axienet: use phylink_generic_validate()
axienet has no special behaviour in its validation implementation, so
can be switched to phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:17:44 +00:00
Russell King (Oracle)
5703a4b664 net: axienet: remove interface checks in axienet_validate()
As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode in the
validation function. Remove this to simplify it.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:17:44 +00:00
Russell King (Oracle)
136a3fa28a net: axienet: populate supported_interfaces member
Populate the phy_interface_t bitmap for the Xilinx axienet driver with
interfaces modes supported by the MAC.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 11:17:44 +00:00
David S. Miller
9311ccef27 mlx5-fixes-2021-11-16
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmGUEocACgkQSD+KveBX
 +j6ynwf/UKQ4I+ZexNfgDmINn4qNgGjnOOq8EfaPPBUzUODObmgPHV7fPT8ADX3b
 sLx7CgVGRHnk36XqDhptYrIDEtbfa26epRvr3PGU4FcUmz+PoJprQtWBK2B5PDiT
 nwSmxerEauiji7LxlXx9YmtfpnbeHDFanjJyYbVnuZkqREMP7S/B73KLFnW7WEyU
 EyEKu+oahGx3C4CiF7Fg20w9mP2fujC7pw+og+g6T9dDiF/ynTCf1KZMXq1EwA1M
 +/GS37Lm4VEoCrumiF6HGYmr9gRIFhMXBTaTRYzuz/d7FPxFrtF/88IF0xfmFmLr
 7SNBWjmNnjnCaXNIlY1IbyHh9/P83Q==
 =a7ES
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2021-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-fixes-2021-11-16

Please pull this mlx5 fixes series, or let me know in case of any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-17 10:50:53 +00:00
Dmytro Linkin
85c5f7c920 net/mlx5: E-switch, Create QoS on demand
Don't create eswitch QoS (root TSAR) on switch mode change. Create it on
first child TSAR object creation - vport or rate group. Keep track
root TSAR references and release root TSAR with last object deletion.
No need to check for QoS is enabled when installing tc matchall filter.
Remove related helper function due to no users of it.

Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:52 -08:00
Dmytro Linkin
d7df09f5e7 net/mlx5: E-switch, Enable vport QoS on demand
Vports' QoS is not commonly used but consume SW/HW resources, which
becomes an issue on BlueField SoC systems.
Don't enable QoS on vports by default on eswitch mode change and enable
when it's going to be used by one of the top level users:
- configuring TC matchall filter with police action;
- setting rate with legacy NDO API;
- calling devlink ops->rate_leaf_*() callbacks.

Disable vport QoS on vport cleanup.

Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:52 -08:00
Parav Pandit
e9d491a647 net/mlx5: E-switch, move offloads mode callbacks to offloads file
eswitch.c is mainly for common code between legacy and offloads mode.
MAC address get and set via devlink is applicable only in offloads mode.

Hence, move it to eswitch_offloads.c file.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:52 -08:00
Parav Pandit
b22fd4381d net/mlx5: E-switch, Reuse mlx5_eswitch_set_vport_mac
mlx5_eswitch_set_vport_mac() routine already does necessary checks which
are duplicated in implementation of
mlx5_devlink_port_function_hw_addr_set().

Hence, reuse mlx5_eswitch_set_vport_mac() and cut down the code.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:51 -08:00
Parav Pandit
fcf8ec54b0 net/mlx5: E-switch, Remove vport enabled check
An eswitch vport of the devlink port is always enabled before a
devlink port is registered. And a eswitch vport is always disabled
after a devlink port is unregistered.
Hence avoid the vport enabled check in the devlink callback routine.
Such check is only applicable in the legacy SR-IOV callbacks.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Sunil Sudhakar Rani <sunrani@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:51 -08:00
Chris Mi
819c319c8c net/mlx5e: Specify out ifindex when looking up decap route
There is a use case that the local and remote VTEPs are in the same
host. Currently, the out ifindex is not specified when looking up the
decap route for offloads. So in this case, a local route is returned
and the route dev is lo.

Actual tunnel interface can be created with a parameter "dev" [1],
which specifies the physical device to use for tunnel endpoint
communication. Pass this parameter to driver when looking up decap
route for offloads. So that a unicast route will be returned.

[1] ip link add name vxlan1 type vxlan id 100 dev enp4s0f0 remote 1.1.1.1 dstport 4789

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:51 -08:00
Roi Dayan
fc3a879aea net/mlx5e: TC, Move comment about mod header flag to correct place
Move the comment to the correct place where the driver actually
removes the flag and not in the check that maybe pedit actions exists.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:50 -08:00
Roi Dayan
88d9748604 net/mlx5e: TC, Move kfree() calls after destroying all resources
When deleting fdb/nic flow rules first release all resources
and then call the kfree() calls instead of sparse them around
the function.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:50 -08:00
Roi Dayan
972fe492e8 net/mlx5e: TC, Destroy nic flow counter if exists
Counter is only added if counter flag exists.
So check the counter fag exists for deleting the counter.
This is the same as in add/del fdb flow.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:49 -08:00
Yihao Han
0164a9bd9d net/mlx5: TC, using swap() instead of tmp variable
swap() was used instead of the tmp variable to swap values

Signed-off-by: Yihao Han <hanyihao@vivo.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:49 -08:00
Paul Blakey
1cfd3490f2 net/mlx5: CT: Allow static allocation of mod headers
As each CT rule uses at least 4 modify header actions, each rule
causes at least 3 reallocations by the mod header actions api.

Allow initial static allocation of the mod acts array, and use it for
CT rules. If the static allocation is exceeded go back to dynamic
allocation.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
2021-11-16 20:31:48 -08:00
Paul Blakey
2c0e5cf520 net/mlx5e: Refactor mod header management API
For all mod hdr related functions to reside in a single self contained
component (mod_hdr.c), refactor alloc() and add get_id() so that user
won't rely on internal implementation, and move both to mod_hdr
component.

Rename the prefix to mlx5e_mod_hdr_* as other mod hdr functions.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16 20:31:48 -08:00